Re: [Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
Thanks Ashley, that did indeed fix the JSON problem! (wish I'd remembered to write a test *before* I fixed it like this though ;-)

cheers

Daniel

On Jan 7, 2008 9:28 PM, Ashley Pond V [EMAIL PROTECTED] wrote:
> This may or may not be germane: try installing JSON::XS and updating your JSON and JSON::Any. JSON::XS is one of, if not the, fastest serializers in all data classes and its utf8 handling is better. JSON now, IIRC, calls it if it's present instead of its older Perl version.
>
> -Ashley
>
> On Jan 7, 2008, at 7:57 AM, Daniel McBrearty wrote:
> > Data is actually sent in URI encoded utf8 (looks like ab%C3%A7 ), which is fine. The string is then picked up, decoded and stored in the db just fine. The problem is that what gets sent back the other way (via Catalyst::View::JSON ) is not getting encoded. I don't know quite why just now (according to the docs it should do). Manually adding the Encode::encode_utf8( $result ) step fixes it for now. I may try peeking to see why it doesn't get handled by the View.

-- 
Daniel McBrearty
email : danielmcbrearty at gmail.com
http://www.engoi.com
http://danmcb.vox.com
http://danmcb.blogger.com
find me on linkedin and facebook
BTW : 0873928131

___
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/
Re: [Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
And thank you for the chance to not look like a dummy for a change. :)

On Jan 9, 2008, at 1:32 PM, Daniel McBrearty wrote:
> Thanks Ashley, that did indeed fix the JSON problem! (wish I'd remembered to write a test *before* I fixed it like this though ;-)
>
> cheers
>
> Daniel
[Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
well, the moral is probably never work on your code when recovering from flu. It was pretty much a self-induced problem.

I have a system where the user submits a value, and the value is copied back to them. If there is a network problem with corruption of the data, they will see it, as the returned string should show the problem.

I had a problem where the string being returned was in pure unicode code points (not utf8). Somehow I convinced myself that the problem was on the source side, and added a utf8 encoding in the sending js. Oops. The result was obvious ...

Data is actually sent in URI encoded utf8 (looks like ab%C3%A7 ), which is fine. The string is then picked up, decoded and stored in the db just fine. The problem is that what gets sent back the other way (via Catalyst::View::JSON ) is not getting encoded. I don't know quite why just now (according to the docs it should do). Manually adding the Encode::encode_utf8( $result ) step fixes it for now. I may try peeking to see why it doesn't get handled by the View.

there is still a slight oddity that now, the $edit parameter shows just fine in the debug screen, is confirmed as being yes its UTF8, but $c->log->debug( $edit ) does not print ok. Odd, but not really any worry right now. Some kind of argument between perl and the terminal? but why OK in the usual debug output, but not with $c->log->debug?

thanks to those who pointed in the right direction, anyhow.
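As a minimal sketch of the round trip described in this message (not the application's actual code), the following uses only core modules. The `percent_decode` helper and the variable names are hypothetical, and the database step is elided.

```perl
use strict;
use warnings;
use Encode qw(decode encode_utf8);

# Hypothetical helper: undo percent-encoding, yielding raw bytes.
sub percent_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

my $wire  = 'ab%C3%A7';                # value as it appears in the request
my $bytes = percent_decode($wire);     # 4 bytes: a, b, 0xC3, 0xA7
my $chars = decode('UTF-8', $bytes);   # 3 characters: a, b, U+00E7

# ... store $chars and read it back (elided) ...

my $out = encode_utf8($chars);         # 4 bytes again, ready for the response body

printf "bytes in: %d, chars: %d, bytes out: %d\n",
    length($bytes), length($chars), length($out);
```

The point of the sketch is that decoding happens once on the way in and encoding once on the way out. Adding a second encoding step on the sending side is what produces double-encoded data.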
Re: [Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
This may or may not be germane: try installing JSON::XS and updating your JSON and JSON::Any. JSON::XS is one of, if not the, fastest serializers in all data classes and its utf8 handling is better. JSON now, IIRC, calls it if it's present instead of its older Perl version.

-Ashley

On Jan 7, 2008, at 7:57 AM, Daniel McBrearty wrote:
> Data is actually sent in URI encoded utf8 (looks like ab%C3%A7 ), which is fine. The string is then picked up, decoded and stored in the db just fine. The problem is that what gets sent back the other way (via Catalyst::View::JSON ) is not getting encoded. I don't know quite why just now (according to the docs it should do). Manually adding the Encode::encode_utf8( $result ) step fixes it for now. I may try peeking to see why it doesn't get handled by the View.
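To illustrate what "utf8 handling" means at the serializer level, here is a small sketch. It uses the core module JSON::PP as a stand-in, on the assumption that JSON::XS is not installed; JSON::PP offers a JSON::XS-compatible interface. The hash and its `edit` key are made up for the example, and nothing here describes how Catalyst::View::JSON is wired internally.

```perl
use strict;
use warnings;
use JSON::PP;

my $data = { edit => "ab\x{e7}" };   # the value is 3 characters

# With ->utf8 the serializer returns UTF-8 encoded bytes.
my $as_bytes = JSON::PP->new->utf8->encode($data);

# Without ->utf8 it returns a character string that still needs encoding
# before it goes out on the wire.
my $as_chars = JSON::PP->new->encode($data);

printf "with ->utf8:    %d\n", length $as_bytes;
printf "without ->utf8: %d\n", length $as_chars;
```

The two results differ by one in length because U+00E7 takes two bytes in UTF-8. If the unencoded form is written straight to the response, the client receives code points rather than UTF-8, which matches the symptom described in the quoted message.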
[Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
because it is utf8? shouldn't it be?

On Jan 6, 2008 1:29 AM, Aristotle Pagaltzis [EMAIL PROTECTED] wrote:
> * Daniel McBrearty [EMAIL PROTECTED] [2008-01-06 00:00]:
> > [debug] abçöeü
> > [debug] $VAR1 = ab\x{c3}\x{a7}\x{c3}\x{b6}e\x{c3}\x{bc};
> > [debug] it's UTF8!
>
> Err, why doesn't Dumper say ab\x{e7}\x{f6}e\x{fc}? Strange that the first line looks correct, though.
>
> Regards,
> -- 
> Aristotle Pagaltzis // http://plasmasturm.org/
>
> ___
> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
> IRC: irc.perl.org#dbix-class
> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
> Searchable Archive: http://www.grokbase.com/group/[EMAIL PROTECTED]

-- 
Daniel McBrearty
[Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
so do you mean that Dumper should be seeing and outputting this as a char sequence? (what it actually shows is a mix of chars and hex bytes, in fact ...)

On Jan 6, 2008 2:08 PM, Aristotle Pagaltzis [EMAIL PROTECTED] wrote:
> * Daniel McBrearty [EMAIL PROTECTED] [2008-01-06 13:30]:
> > On Jan 6, 2008 1:29 AM, Aristotle Pagaltzis [EMAIL PROTECTED] wrote:
> > > * Daniel McBrearty [EMAIL PROTECTED] [2008-01-06 00:00]:
> > > > [debug] abçöeü
> > > > [debug] $VAR1 = ab\x{c3}\x{a7}\x{c3}\x{b6}e\x{c3}\x{bc};
> > > > [debug] it's UTF8!
> > >
> > > Err, why doesn't Dumper say ab\x{e7}\x{f6}e\x{fc}? Strange that the first line looks correct, though.
> >
> > because it is utf8? shouldn't it be?
>
> What Dumper outputs is the UTF-8 byte sequence; but the next line says that the Unicode flag is set, so this is a character string, not a byte string. So it's already double-encoded. I don't understand why the first line looks correct though.
>
> In any case the raw HTTP request that leads to all this would be interesting.
>
> Regards,
> -- 
> Aristotle Pagaltzis // http://plasmasturm.org/

-- 
Daniel McBrearty
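The distinction drawn in the quoted reply can be reproduced with a short sketch. This is an illustration under assumptions, not a reconstruction of the failing application: the `codepoints` helper is hypothetical, and `utf8::upgrade` is used here only as one way to end up with UTF-8 bytes held in a flagged character string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: list the code points of a string in hex,
# so the result does not depend on the terminal's encoding.
sub codepoints { return join ' ', map { sprintf '%02x', ord } split //, $_[0] }

my $text  = "ab\x{e7}\x{f6}e\x{fc}";   # the intended 6 characters
my $bytes = encode('UTF-8', $text);    # 9 bytes of UTF-8

# Treat each UTF-8 byte as a character: 9 characters, flag set.
my $mangled = $bytes;
utf8::upgrade($mangled);

print "intended: ", codepoints($text), "\n";      # 61 62 e7 f6 65 fc
print "mangled:  ", codepoints($mangled), "\n";   # 61 62 c3 a7 c3 b6 65 c3 bc
print "flag set: ", (utf8::is_utf8($mangled) ? "yes" : "no"), "\n";

# Repair: map the characters back to bytes, then decode them as UTF-8.
my $fixed = decode('UTF-8', encode('ISO-8859-1', $mangled));
print "repaired: ", ($fixed eq $text ? "yes" : "no"), "\n";
```

The "mangled" line matches the sequence in the quoted debug output: a character string whose characters are the individual UTF-8 bytes. Encoding that string to UTF-8 for output would encode it a second time.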
[Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
> In any case the raw HTTP request that leads to all this would be interesting.

I can tell you that the data in the raw request is just the 9 bytes of UTF8, exactly as shown by Dumper. I looked at it with wireshark to be sure.

To give some background, this is getting pulled out of a form by javascript (which sees unicode), converted to UTF8, and then submitted as a POST to the cat controller. The $edit param is just another CGI parameter as far as Cat is concerned though.
[Catalyst] Re: [Dbix-class] Re: utf8 / pg double encoding problem
On 06/01/2008, Daniel McBrearty [EMAIL PROTECTED] wrote:
> > In any case the raw HTTP request that leads to all this would be interesting.
>
> I can tell you that the data in the raw request is just the 9 bytes of UTF8, exactly as shown by Dumper. I looked at it with wireshark to be sure.
>
> To give some background, this is getting pulled out of a form by javascript (which sees unicode), converted to UTF8, and then submitted as a POST to the cat controller. The $edit param is just another CGI parameter as far as Cat is concerned though.

Just a note to be careful about your terminology. "which sees unicode" is meaningless really: utf8 is unicode, utf16le is unicode, utf32 is unicode, etc. They are just different encodings. I think you mean raw codepoints, but I doubt that JS operates on UTF32 (aka raw codepoints) at this time; it most likely operates on utf16le or utf16be (whichever one Windows uses internally).

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
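The terminology point can be made concrete with a small sketch using the core Encode module: the same six code points from the thread's sample string, serialized under several Unicode encodings. The choice of encodings to print is arbitrary and only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "ab\x{e7}\x{f6}e\x{fc}";   # 6 code points

# Same characters, different byte sequences per encoding.
for my $enc (qw(UTF-8 UTF-16LE UTF-16BE UTF-32LE)) {
    my $bytes = encode($enc, $text);
    printf "%-8s %2d bytes: %s\n", $enc, length($bytes), unpack('H*', $bytes);
}
```

For this string the byte counts are 9 for UTF-8, 12 for either UTF-16 variant, and 24 for UTF-32, which is why "the 9 bytes of UTF8" in the quoted message identifies the wire encoding.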