Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Zbigniew Lukasiak
On Fri, Feb 20, 2009 at 6:57 PM, Jonathan Rockway j...@jrock.us wrote: Braindump follows. snip snip One last thing, if this becomes core, it will definitely break people's apps. Many, many apps are blissfully unaware of characters and treat text as binary... and their apps kind-of appear

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Neo [GC]
Zbigniew Lukasiak wrote: Some more things to consider. - 'use utf8' in the code generated by the helpers? Reasonable, but only if documented. It took us weeks to learn that this changes _nothing_ but the behaviour of several perl-functions like regexp, sort and so on. - ENCODING:

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Zbigniew Lukasiak
On Mon, Feb 23, 2009 at 2:58 PM, Neo [GC] n...@gothic-chat.de wrote: Zbigniew Lukasiak wrote: Some more things to consider. - 'use utf8' in the code generated by the helpers? Reasonable, but only if documented. It took us weeks to learn that this changes _nothing_ but the

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Neo [GC]
Zbigniew Lukasiak wrote: Hmm - in my understanding it only changes literals in the code ( $var = 'ą' ). So I looked into the pod and it says: Bytes in the source text that have their high-bit set will be treated as being part of a literal UTF-8 character. This includes most
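The effect on literals described here can be illustrated with a minimal sketch. It assumes the source file is saved as UTF-8; the variable names are only for illustration. The same literal is measured once under `use utf8` and once under `no utf8`:

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.
my ($with, $without);
{
    use utf8;                 # the literal is parsed as characters
    $with = length('ą');      # one character, U+0105
}
{
    no utf8;                  # the literal is left as raw bytes
    $without = length('ą');   # two bytes, 0xC4 0x85
}
print "with use utf8: $with, without: $without\n";
```

The pragma is lexically scoped and only changes how the source text itself is read; it does nothing to data arriving from requests, files, or databases.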

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Neo [GC]
Oh I forgot something... or more precisely, my boss named it while having a smoke. Maybe somewhat OT, but definitely interesting (maybe could be used to simplify the problem of double-encoding): Does anyone know a _safe_ method to convert _any_ string-scalar to utf8? Something like

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Peter Karman
Neo [GC] wrote on 02/23/2009 09:41 AM: Does anyone know a _safe_ method to convert _any_ string-scalar to utf8? Something like anything_to_utf8($s) , regardless if $s contains ascii, latin1, utf8, tasty hodgepodge or hot fn0rd, utf8-flag is set or not and is neither affected by full moon nor

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Octavian Râşniţă
From: Peter Karman pe...@peknet.com Neo [GC] wrote on 02/23/2009 09:41 AM: Does anyone know a _safe_ method to convert _any_ string-scalar to utf8? Something like anything_to_utf8($s) , regardless if $s contains ascii, latin1, utf8, tasty hodgepodge or hot fn0rd, utf8-flag is set or not and is

Re: [Catalyst] Re: decoding in core

2009-02-23 Thread Bill Moseley
On Mon, Feb 23, 2009 at 06:45:40PM +0200, Octavian Râşniţă wrote: I understand that there are reasons for not transforming all the encodings to UTF-8 in core, even though it seems to be not very complicated, because maybe there are some tables that contain ISO-8859-2 chars and other tables

[Catalyst] Re: decoding in core

2009-02-23 Thread Aristotle Pagaltzis
* Neo [GC] n...@gothic-chat.de [2009-02-23 16:45]: Does anyone know a _safe_ method to convert _any_ string-scalar to utf8? There isn’t. Strings in Perl are untyped. They are simply sequences of arbitrarily large integers. If a string only contains values between 0 and 255, then it can be
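As a rough illustration of why no generic converter can exist (a sketch using the core Encode module, not code from the thread): the same two octets decode without error under both UTF-8 and ISO-8859-1, and nothing in the string records which one was intended.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Two octets whose intended encoding is not recorded anywhere in the string.
my $octets = "\xC3\xA9";

my $as_utf8   = decode('UTF-8',      $octets);  # U+00E9
my $as_latin1 = decode('ISO-8859-1', $octets);  # U+00C3 followed by U+00A9

printf "as UTF-8:   %d character(s)\n", length $as_utf8;
printf "as Latin-1: %d character(s)\n", length $as_latin1;
```

Both results are valid text, so a converter would have to guess; the encoding has to come from outside the string, for example from a charset declared by the sender.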

Re: [Catalyst] Re: decoding in core

2009-02-22 Thread Bill Moseley
On Fri, Feb 20, 2009 at 11:57:29AM -0600, Jonathan Rockway wrote: The problem with writing a plugin or making this core is that people really really want to misuse Unicode, and will whine when you try to force correctness upon them. I'm not sure what you mean by wanting to misuse Unicode.

Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)

2009-02-20 Thread Tomas Doran
On 6 Feb 2009, at 17:36, Bill Moseley wrote: Sure. IIRC, I think there's already been some patches and code posted so maybe I can dig that up again off the archives. Please do. But, sounds like it's not that important of an issue. The fact that nobody is working on it currently is not

Re: [Catalyst] Re: decoding in core

2009-02-20 Thread Jonathan Rockway
Braindump follows. * On Fri, Feb 20 2009, Tomas Doran wrote: On 6 Feb 2009, at 17:36, Bill Moseley wrote: Sure. IIRC, I think there's already been some patches and code posted so maybe I can dig that up again off the archives. Please do. But, sounds like it's not that important of an

Re: [Catalyst] Re: decoding in core

2009-02-20 Thread Jonathan Rockway
* On Fri, Feb 20 2009, Jonathan Rockway wrote: Braindump follows. Oh yeah, one other thing. IDNs will need to be decoded/encoded, probably. ($c->req->host should contain perl characters, but links should probably be punycoded. Fun!) -- print just => another => perl => hacker => if $,=$

Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)

2009-02-06 Thread Tomas Doran
On 6 Feb 2009, at 14:46, Bill Moseley wrote: Nobody responded to the main point of this email -- if Catalyst should handle encoding in core instead of with a plugin. Nobody has an opinion about that? Or was it just ignored -- which is often how people handle character encoding in

Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)

2009-02-06 Thread Bill Moseley
On Fri, Jan 30, 2009 at 11:44:57PM +0100, Aristotle Pagaltzis wrote: * Bill Moseley mose...@hank.org [2009-01-29 17:05]: Neither of the existing plugins do it correctly (IMO), as they only decode parameters leaving body_parameters as octets, and don't look at the request for the charset,

Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)

2009-02-06 Thread Bill Moseley
On Fri, Feb 06, 2009 at 03:16:14PM +, Tomas Doran wrote: On 6 Feb 2009, at 14:46, Bill Moseley wrote: Nobody responded to the main point of this email -- if Catalyst should handle encoding in core instead of with a plugin. Nobody has an opinion about that? Or was it just ignored --

[Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)

2009-01-30 Thread Aristotle Pagaltzis
* Bill Moseley mose...@hank.org [2009-01-29 17:05]: Neither of the existing plugins do it correctly (IMO), as they only decode parameters leaving body_parameters as octets, and don't look at the request for the charset, IIRC. […] uri_for() rightly encodes to octets before escaping, but it
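The "encode to octets before escaping" step mentioned for uri_for() can be sketched roughly as follows. This is not Catalyst's implementation; `escape_path_segment` is a hypothetical helper, and UTF-8 is assumed as the target encoding.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, not part of Catalyst: turn a character string into
# UTF-8 octets first, then percent-escape every octet that is not an
# unreserved URI character.
sub escape_path_segment {
    my ($chars) = @_;
    my $octets = encode('UTF-8', $chars);
    $octets =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $octets;
}

my $escaped = escape_path_segment("\x{105}");   # U+0105, the letter ą
print "$escaped\n";
```

Escaping the character string directly, without encoding it first, would operate on code points instead of octets and could not represent anything above 0xFF as a two-digit escape.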