Re: [Catalyst] UTF8 and content length

2016-07-22 Thread Marco Pessotto
John Napiorkowski  writes:

> So what it looks like to me is that the code that sets a content
> length if one is not set by the view is not dealing with Unicode
> correctly. I have another Unicode issue I need to look at soonish so
> I'll try to see if we can get a test case for this. -jnap
>
> sub index :Path :Args(0)
> {
> my ( $self, $c ) = @_;
>
> my $json_text = '{"id":1, "msg":"В Питере пить"}';
> $c->response->content_type('application/json');
> $c->response->body($json_text);
> }
>

Catalyst does not encode a body whose content type is "application/json",
because most serializers prefer to output bytes, not characters (on the
reasoning, right or wrong, that JSON is data, not text). In your example
you are storing a decoded (character) string, declaring the content type,
and serving the body, which of course is not going to work.

https://metacpan.org/pod/Catalyst#ENCODING

It says: If you are producing JSON response in an unconventional manner
(such as via a template or manual strings) you should perform the UTF8
encoding manually as well such as to conform to the JSON specification.

And setting the json manually *is* an unconventional manner.
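
To illustrate the bytes-versus-characters point, here is a minimal sketch using the core JSON::PP module. JSON::PP is only an assumption for the example (it ships with Perl); it is not what the original code used. With the utf8 option enabled the serializer returns UTF-8 encoded bytes; without it, it returns a character string that would still need encoding before going into the body.

```perl
use strict;
use warnings;
use utf8;                      # the string literal below is written in UTF-8
use JSON::PP ();

my $data = { id => 1, msg => "В Питере пить" };

# ->utf8 makes encode() return UTF-8 encoded bytes (what goes on the wire)
my $json_bytes = JSON::PP->new->utf8->canonical->encode($data);

# without ->utf8, encode() returns a decoded character string
my $json_chars = JSON::PP->new->canonical->encode($data);

printf "bytes: %d, characters: %d\n", length($json_bytes), length($json_chars);

# In a controller (assumed context), the byte string is what belongs in the body:
#   $c->response->content_type('application/json');
#   $c->response->body($json_bytes);
```

The two lengths differ because each Cyrillic letter takes two bytes in UTF-8.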

I hope this helps.

Best wishes


-- 
Marco

___
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/


Re: [Catalyst] UTF8 and content length

2016-07-22 Thread John Napiorkowski
So what it looks like to me is that the code that sets a content length if one
is not set by the view is not dealing with Unicode correctly. I have another
Unicode issue I need to look at soonish, so I'll try to see if we can get a test
case for this. -jnap

On Wednesday, July 20, 2016 8:18 AM, Kroshka Yenot  wrote:
 

  >>> Looks like a bug to me

  tl;dr: I'm not sure it's a Catalyst bug or problem. It may be MY
  configuration problem or a standards violation.

  Here are my investigation results.

  I created a test app to reproduce this situation:
  # catalyst.pl test
  # test/script/test_create.pl view HTML TT
  # [editor] test/lib/test/Controller/Root.pm
  sub index :Path :Args(0)
  {
      my ( $self, $c ) = @_;

      my $json_text = '{"id":1, "msg":"В Питере пить"}';
      $c->response->content_type('application/json');
      $c->response->body($json_text);
  }
  
  and found the following:

  # wget -S -O - http://domain.tld:3000
 --2016-07-20 13:56:18--  http://domain.tld:3000/
 Resolving cary.lv (cary.lv)... aaa.bbb.ccc.ddd
 Connecting to domain.tld (domain.tld)|aaa.bbb.ccc.ddd|:3000... connected.
 HTTP request sent, awaiting response...
   HTTP/1.0 200 OK
   Date: Wed, 20 Jul 2016 10:56:18 GMT
   Server: HTTP::Server::PSGI
   Content-Type: application/json
   X-Catalyst: 5.90106
   Content-Length: 42
 Length: 42 [application/json]
 Saving to: 'STDOUT'
  
  Content-Length is properly set. I see the same using Firefox Dev Tools,
  but in the log (built-in test server log) it is reported as unknown:
  [debug] Response Code: 200; Content-Type: application/json; Content-Length: unknown
  
  Exactly the same code, but the app runs as a FastCGI daemon and
  Apache/2.4.23 (FreeBSD) serves the HTTP requests:
  # wget -S -O - http://domain.tld/
 --2016-07-20 15:02:28--  http://domain.tld/
 Resolving domain.tld (domain.tld)... aaa.bbb.ccc.ddd
 Connecting to domain.tld (domain.tld)|aaa.bbb.ccc.ddd|:80... connected.
 HTTP request sent, awaiting response...
   HTTP/1.1 200 OK
   Date: Wed, 20 Jul 2016 12:02:28 GMT
   Server: Apache
   Set-Cookie: lang=ru; path=/; expires=Thu, 20-Jul-2017 12:02:28 GMT
   Set-Cookie: sid=3b2b88c4106b5e06c0c24a5c3a513ccbcb939299; domain=domain.tld; path=/; expires=Wed, 20-Jul-2016 12:52:28 GMT; HttpOnly
   X-Catalyst: 5.90106
   Content-Length: 31
   Keep-Alive: timeout=5, max=100
   Connection: Keep-Alive
   Content-Type: application/json
 Length: 31 [application/json]
  
  Content length here is in chars, not in bytes.
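
  The two header values above (42 and 31) can be reproduced outside Catalyst. The following is a minimal sketch, assuming the source file is saved as UTF-8 and `use utf8` is in effect; the variable names are only illustrative. It also shows that cutting the encoded body at the character count yields the truncated output reported at the start of this thread.

```perl
use strict;
use warnings;
use utf8;                         # the string literal below is written in UTF-8
use Encode qw(encode_utf8);

my $json_text  = '{"id":1, "msg":"В Питере пить"}';
my $json_bytes = encode_utf8($json_text);

my $chars  = length($json_text);    # counts characters
my $octets = length($json_bytes);   # counts bytes of the UTF-8 encoding
print "characters: $chars, bytes: $octets\n";   # characters: 31, bytes: 42

# A client that trusts a Content-Length equal to the character count
# reads only that many bytes of the encoded body:
my $truncated = substr($json_bytes, 0, $chars);
print "$truncated\n";               # {"id":1, "msg":"В Питере
```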
  A solution by Aristotle Pagaltzis,
    $c->response->body(Encode::encode_utf8 $json_text);
  gives the proper content length in this situation.
  I get the same proper content length if I change the content type to
  'text/html'.
  Finally, I discovered Catalyst::View::JSON, and it not only solved this
  problem for me but also gave me a much more comfortable way to work with
  JSON:
    $c->stash->{msg} = "В Питере пить";
    $c->stash->{id} = 1;
    $c->forward('View::JSON');
  Works like a charm.
  
  Taking this opportunity, thank you for this lovely framework! I'll be happy
  to provide any additional information if you still think there is something
  that should be fixed.
  
  
  
 
 
 
 On 19.07.2016 19:10, John Napiorkowski wrote:
  
  Looks like a bug to me. Although I'm not personally keen on the auto
  length setting in Catalyst, it should be corrected. I'm happy to get a
  patch, or at the very least give me a broken test case (check out
  https://github.com/perl-catalyst/catalyst-runtime/blob/master/t/utf_incoming.t
  and see if you can help me figure it out). -jnap
  (Created an issue for this:
  https://github.com/perl-catalyst/catalyst-runtime/issues/143)
  
  
  
 
  On Friday, July 15, 2016 6:07 AM, Kroshka Yenot  wrote:
  
 
  Hi!

  if content type is 'application/json' or 'application/json;
  charset=utf-8' Catalyst sets content length in chars, NOT IN BYTES and
  I'm getting

  {"id":1, "msg":"В Питере

  if content type is 'text/html' Catalyst sets content length in bytes
  (properly) and everything works fine

  Is there any workaround to configure this behaviour, except setting
  content length manually everytime ?

  my $json_text = '{"id":1, "msg":"В Питере пить"}';

  $c->response->content_type('application/json');
  $c->response->content_length(bytes::length $json_text);
  $c->response->body($json_text);

  Thanks in advance
  


Re: [Catalyst] UTF8 and content length

2016-07-19 Thread John Napiorkowski
Looks like a bug to me. Although I'm not personally keen on the auto length
setting in Catalyst, it should be corrected. I'm happy to get a patch, or at
the very least give me a broken test case (check out
https://github.com/perl-catalyst/catalyst-runtime/blob/master/t/utf_incoming.t
and see if you can help me figure it out). -jnap
(Created an issue for this:
https://github.com/perl-catalyst/catalyst-runtime/issues/143)


 

On Friday, July 15, 2016 6:07 AM, Kroshka Yenot  wrote:
 

  Hi!

  if content type is 'application/json' or 'application/json;
  charset=utf-8' Catalyst sets content length in chars, NOT IN BYTES and
  I'm getting

  {"id":1, "msg":"В Питере

  if content type is 'text/html' Catalyst sets content length in bytes
  (properly) and everything works fine

  Is there any workaround to configure this behaviour, except setting
  content length manually everytime ?

  my $json_text = '{"id":1, "msg":"В Питере пить"}';

  $c->response->content_type('application/json');
  $c->response->content_length(bytes::length $json_text);
  $c->response->body($json_text);

  Thanks in advance
  
  


Re: [Catalyst] UTF8 and content length

2016-07-15 Thread Kroshka Yenot
> The clean way to do this is to simply encode the data before you
> put it in the body:


I forgot or, most likely, didn't realise that I need to encode to UTF-8 a
string which is already UTF-8 in the source. I still need to think over this
tricky rocket science, but your solution works.


σας ευχαριστώ (thank you)




On 15.07.2016 15:12, Aristotle Pagaltzis wrote:

> * Kroshka Yenot [2016-07-15 13:12]:
> > Hi!
> >
> > if content type is 'application/json' or 'application/json;
> > charset=utf-8' Catalyst sets content length in chars, NOT IN BYTES and
> > I'm getting
> >
> > {"id":1, "msg":"В Питере
> >
> > if content type is 'text/html' Catalyst sets content length in bytes
> > (properly) and everything works fine
>
> I am guessing you have an encoding configured in Catalyst? If yes, then
> it encodes text/html bodies etc automatically for you, so the body comes
> out in bytes, and its length is then correct, so everything works.
>
> > Is there any workaround to configure this behaviour, except setting
> > content length manually everytime ?
> >
> >
> > my $json_text = '{"id":1, "msg":"В Питере пить"}';
> >
> > $c->response->content_type('application/json');
> > $c->response->content_length(bytes::length $json_text);
> > $c->response->body($json_text);
> >
> > Thanks in advance
>
> (Side note: if that code works, you must have `use utf8` in effect.
> Next time you ask about such a problem, please mention this and any
> other relevant parts of your configuration/setup. They are crucial.)
>
> Here you are using bytes::length, which is broken by design and is
> always the wrong thing to use (unless you are debugging perl itself or
> writing XS code maybe), after putting a character string in the body,
> and then relying on the fact that perl falls back to converting char
> strings to UTF-8 on output because it can’t do anything else.
>
> This ends up working, but it’s a terrible way to achieve what you need.
> It relies on multiple broken things and workarounds cancelling each
> other in just the right way to get the correct answer. The clean way to
> do this is to simply encode the data before you put it in the body:
>
> use utf8;
> my $json_text = '{"id":1, "msg":"В Питере пить"}';
>
> $c->response->content_type('application/json; charset=utf-8');
> $c->response->body(Encode::encode_utf8 $json_text);
>
> Regards,
Regards,





Re: [Catalyst] UTF8 and content length

2016-07-15 Thread Aristotle Pagaltzis
* Kroshka Yenot  [2016-07-15 13:12]:
> Hi!
>
> if content type is 'application/json' or 'application/json;
> charset=utf-8' Catalyst sets content length in chars, NOT IN BYTES and
> I'm getting
>
> {"id":1, "msg":"В Питере
>
> if content type is 'text/html' Catalyst sets content length in bytes
> (properly) and everything works fine

I am guessing you have an encoding configured in Catalyst? If yes, then
it encodes text/html bodies etc automatically for you, so the body comes
out in bytes, and its length is then correct, so everything works.

> Is there any workaround to configure this behaviour, except setting
> content length manually everytime ?
>
>
> my $json_text = '{"id":1, "msg":"В Питере пить"}';
>
> $c->response->content_type('application/json');
> $c->response->content_length(bytes::length $json_text);
> $c->response->body($json_text);
>
> Thanks in advance

(Side note: if that code works, you must have `use utf8` in effect.
Next time you ask about such a problem, please mention this and any
other relevant parts of your configuration/setup. They are crucial.)

Here you are using bytes::length, which is broken by design and is
always the wrong thing to use (unless you are debugging perl itself or
writing XS code maybe), after putting a character string in the body,
and then relying on the fact that perl falls back to converting char
strings to UTF-8 on output because it can’t do anything else.

This ends up working, but it’s a terrible way to achieve what you need.
It relies on multiple broken things and workarounds cancelling each
other in just the right way to get the correct answer. The clean way to
do this is to simply encode the data before you put it in the body:

use utf8;
my $json_text = '{"id":1, "msg":"В Питере пить"}';

$c->response->content_type('application/json; charset=utf-8');
$c->response->body(Encode::encode_utf8 $json_text);
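
For readers wondering why bytes::length is unreliable, here is a small sketch (not part of the original exchange; the sample string and variable names are illustrative). It shows that bytes::length reports the size of Perl's internal storage, which can differ for two strings that compare equal, whereas the length of the explicitly encoded string is always the number of bytes sent.

```perl
use strict;
use warnings;
use bytes ();                     # loads bytes::length without enabling the pragma
use Encode qw(encode_utf8);

# "café" built from an escape: 4 characters, stored internally as 4 bytes
my $text = "caf\x{e9}";

my $internal_before = bytes::length($text);       # size of the internal storage

# Same characters, but switched to Perl's internal UTF-8 representation
utf8::upgrade(my $same_text = $text);
my $internal_after = bytes::length($same_text);   # internal size has changed

# The number of bytes actually sent when the body is encoded as UTF-8
my $wire_length = length(encode_utf8($text));

print "equal strings: ", ($text eq $same_text ? "yes" : "no"), "\n";
print "bytes::length before/after upgrade: $internal_before/$internal_after\n";
print "UTF-8 encoded length: $wire_length\n";
```

The question's example only gave the right number because the string happened to be stored in the internal UTF-8 form.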

Regards,
-- 
Aristotle Pagaltzis
