Re: [Catalyst] Re: decoding in core
On Fri, Feb 20, 2009 at 6:57 PM, Jonathan Rockway j...@jrock.us wrote:
> Braindump follows.

[snip]

> One last thing, if this becomes core, it will definitely break people's apps. Many, many apps are blissfully unaware of characters and treat text as binary... and their apps kind-of appear to work. As soon as they get some real characters in their app, though, they will have double-encoded nonsense all over the place, and will blame you for this. ("I loaded Catalyst::Plugin::Unicode, and my app broke! It's all your fault." Yup, people mail that to me privately all the time. For some reason, they think I am going to personally fix their app, despite having written volumes of documentation about this. Wrong.)

Some more things to consider:
- 'use utf8' in the code generated by the helpers?
- ENCODING: UTF-8 for the TT view helper?

Maybe a global config option to choose byte or character semantics? But with the DB it becomes a bit more complex, because BLOB columns probably need to use byte semantics.

--
Zbigniew Lukasiak
http://brudnopis.blogspot.com/
http://perlalchemy.blogspot.com/

___
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/
Re: [Catalyst] Re: decoding in core
Zbigniew Lukasiak wrote:
> Some more things to consider.
> - 'use utf8' in the code generated by the helpers?

Reasonable, but only if documented. It took us weeks to learn that this changes _nothing_ but the behaviour of several Perl functions such as regexps, sort and so on.

> - ENCODING: UTF-8 for the TT view helper?
> Maybe a global config option to choose the byte or character semantics? But with the DB it becomes a bit more complex - because BLOB columns probably need to use byte semantics.

Uhm, of course, as BLOB is binary and CLOB is character. ;) This is even more complex, as the databases treat these datatypes differently and some of Perl's DBI drivers are somewhat broken when it comes to Unicode (according to our perl-saves-our-souls guru).

UTF-8 is OK in Perl itself (not easy, not coherent, but OK); but in combination with many modules (and as far as I have learned, Perl is all about reusing modules) it is _hell_. Try to read UTF-8 from an HTTP request, store it in a database, select it in the correct order, write it to XLS, convert it to CSV, reimport it into the DB and output it to the browser, all with different subs in the same controller... and you know what I mean. Even our most euphoric Perl gurus don't have any clue how to handle UTF-8 from beginning to end without hours of trial and error in their programs (and remember - we Germans only have those bloody umlauts - try to imagine this in China _).

Maybe the best thing for all average-and-below users would be a _really_ good tutorial about Catalyst+UTF-8. What to do, what not to do. How to read UTF-8 from an HTTP request / uploaded file / local file / database, how to write it to the client / a downloadable file / local file / database. Which catalystish variable is UTF-8-encoded when and why. How to determine what encoding a given scalar has and how to encode/decode/whatevercode it to a bloody nice scalar with shiny UTF-8 chars in it.

In short: -- Umlauts with Catalyst for dummies --

(Sorry for sounding so emotional; afaik our company burned man-weeks on solving minor encoding bugs :-/ Every tutorial we found was like "you can do it this way or that way or another way 'round the house, so it's perfect, and if you don't understand it, you're a retard and should use 7-bit ASCII"... and lately even a colleague sounds like this, as he has been enlightened by CPAN literature like "UTF-8 vs. utf8 vs. UTF8" ;)).

Greets and regards,
Tom Weber
Re: [Catalyst] Re: decoding in core
On Mon, Feb 23, 2009 at 2:58 PM, Neo [GC] n...@gothic-chat.de wrote:
> Zbigniew Lukasiak wrote:
>> Some more things to consider.
>> - 'use utf8' in the code generated by the helpers?
>
> Reasonable, but only if documented. It took us weeks to learn that this changes _nothing_ but the behaviour of several Perl functions such as regexps, sort and so on.

Hmm - in my understanding it only changes literals in the code ( $var = 'ą' ). So I looked into the pod and it says:

"Bytes in the source text that have their high-bit set will be treated as being part of a literal UTF-8 character. This includes most literals such as identifier names, string constants, and constant regular expression patterns."

>> - ENCODING: UTF-8 for the TT view helper?
>> Maybe a global config option to choose the byte or character semantics? But with the DB it becomes a bit more complex - because BLOB columns probably need to use byte semantics.
>
> Uhm, of course, as BLOB is binary and CLOB is character. ;) This is even more complex, as the databases treat these datatypes differently and some of Perl's DBI drivers are somewhat broken when it comes to Unicode (according to our perl-saves-our-souls guru).
>
> UTF-8 is OK in Perl itself (not easy, not coherent, but OK); but in combination with many modules (and as far as I have learned, Perl is all about reusing modules) it is _hell_. Try to read UTF-8 from an HTTP request, store it in a database, select it in the correct order, write it to XLS, convert it to CSV, reimport it into the DB and output it to the browser, all with different subs in the same controller... and you know what I mean. Even our most euphoric Perl gurus don't have any clue how to handle UTF-8 from beginning to end without hours of trial and error in their programs (and remember - we Germans only have those bloody umlauts - try to imagine this in China _).
>
> Maybe the best thing for all average-and-below users would be a _really_ good tutorial about Catalyst+UTF-8. What to do, what not to do. How to read UTF-8 from an HTTP request / uploaded file / local file / database, how to write it to the client / a downloadable file / local file / database. Which catalystish variable is UTF-8-encoded when and why. How to determine what encoding a given scalar has and how to encode/decode/whatevercode it to a bloody nice scalar with shiny UTF-8 chars in it.
>
> In short: -- Umlauts with Catalyst for dummies --

Hmm - maybe I'll add UTF-8 handling in InstantCRUD. I am waiting for good sentences showing off the national characters.

--
Zbigniew Lukasiak
http://brudnopis.blogspot.com/
http://perlalchemy.blogspot.com/
Re: [Catalyst] Re: decoding in core
Zbigniew Lukasiak wrote:
> Hmm - in my understanding it only changes literals in the code ( $var = 'ą' ). So I looked into the pod and it says:
>
> "Bytes in the source text that have their high-bit set will be treated as being part of a literal UTF-8 character. This includes most literals such as identifier names, string constants, and constant regular expression patterns."

Ah, SORRY! In my confusion I've confused it again... So if I get it right, use utf8 means you can do stuff like

    $s =~ s/a/ä/;

(as the plain ä in the source will be treated as one character and not two octets), while the magical utf8 flag on $s tells perl that the ä in the scalar really is an ä and not two strange octets. Am I right or am I completely lost again?

> Hmm - maybe I'll add UTF-8 handling in InstantCRUD. I am waiting for good sentences showing off the national characters.

Does it have to be a complete sentence? My favourite test string is something like

    äöüÄÖÜß"'+  (UTF-8)
    C3 A4 C3 B6 C3 BC C3 84 C3 96 C3 9C C3 9F 22 27 2B  (Hex)

If I can put this string into some HTML form, post/get it, process it, save it to and read it from the DB, output it to the browser _and_ still have exactly 10 characters, the application _might_ work as it should. The umlauts and the eszett are the pain of Unicode, the " and ' are fun-with-HTML and escaping, and the + ... well, URI encoding, you know...

For even more fun, one should do a regex in the application using utf8 (give me all those äÄs) and select it from the DB, first with blahfield LIKE 'ä', maybe upper(blahfield) LIKE upper('ä') and finally an ORDER BY blahfield, where blahfield should contain one row starting with a, one with ä and one with b, and the output should have exactly this order and _not_ a,b,ä (hint hint: utf8 treated as ascii or latin1).

Greets and regards,
Tom Weber
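The round trip Tom describes can be checked with a few lines of core Perl. This is only a sketch: the string is written with \x escapes so that the encoding of the source file does not matter (with "use utf8" the literal could be typed directly), and hex_dump() is a helper made up for this illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The ten-character test string as Perl characters:
# a-umlaut o-umlaut u-umlaut A-umlaut O-umlaut U-umlaut eszett " ' +
my $chars = "\x{E4}\x{F6}\x{FC}\x{C4}\x{D6}\x{DC}\x{DF}\"'+";

# Encoding turns characters into octets; each umlaut and the eszett
# needs two octets in UTF-8, the three ASCII characters need one each.
my $octets = encode('UTF-8', $chars);

# Hypothetical helper: render an octet string as space-separated hex pairs.
sub hex_dump { join ' ', map { sprintf '%02X', ord } split //, shift }

printf "characters: %d\n", length $chars;
printf "octets:     %d\n", length $octets;
print  hex_dump($octets), "\n";

# Decoding the octets must give back exactly the same ten characters.
my $round_trip = decode('UTF-8', $octets);
print "round trip ok\n" if $round_trip eq $chars;
```

If length() reports 17 somewhere inside the application, that value is still octets; if it reports more than 17, it has probably been encoded twice.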
Re: [Catalyst] Re: decoding in core
Oh, I forgot something... or more precisely, my boss named it while having a smoke. Maybe somewhat OT, but definitely interesting (maybe it could be used to simplify the problem of double-encoding):

Does anyone know a _safe_ method to convert _any_ string scalar to utf8? Something like anything_to_utf8($s), regardless of whether $s contains ascii, latin1, utf8, tasty hodgepodge or hot fn0rd, whether the utf8 flag is set or not, affected neither by the full moon nor by my horrorscope, and _without_ doing double-encoding (there MUST be some way to determine if it already is utf8... my silly java editor can do it and perl makes difficult things at least possible).

I would greatly appreciate this philosopher's stone and will send my hero a bottle of finest Bavarian (Munich!) beer called Edelstoff ("precious stuff" - tasty).

Greets and thanks!
Tom Weber
Re: [Catalyst] Re: decoding in core
Neo [GC] wrote on 02/23/2009 09:41 AM:
> Does anyone know a _safe_ method to convert _any_ string scalar to utf8? Something like anything_to_utf8($s), regardless of whether $s contains ascii, latin1, utf8, tasty hodgepodge or hot fn0rd, whether the utf8 flag is set or not, affected neither by the full moon nor by my horrorscope, and _without_ doing double-encoding (there MUST be some way to determine if it already is utf8... my silly java editor can do it and perl makes difficult things at least possible).
>
> I would greatly appreciate this philosopher's stone and will send my hero a bottle of finest Bavarian (Munich!) beer called Edelstoff ("precious stuff" - tasty).

Search::Tools::UTF8::to_utf8() comes close. It won't handle mixed encoding in a single string (which would be garbage anyway) but it does try to prevent double-encoding and uses the Encode goodness under the hood.

--
Peter Karman . pe...@peknet.com . http://peknet.com/
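For readers who want to see what such a guess can look like, here is a rough heuristic built only on the core Encode module. It is not the implementation of Search::Tools::UTF8, and the names guess_decode() and anything_to_utf8() are made up for this sketch. The assumption is that anything which is not valid UTF-8 is Latin-1; no heuristic can be fully safe, because a Latin-1 string that happens to look like valid UTF-8 octets will be misread.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn a scalar of unknown origin into Perl characters.
sub guess_decode {
    my ($in) = @_;

    # utf8 flag already set: assume the string was decoded earlier.
    return $in if utf8::is_utf8($in);

    # Try strict UTF-8 first; FB_CROAK makes decode() die on malformed input.
    # decode() may modify its argument when a CHECK value is given, so copy.
    my $copy  = $in;
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $chars if defined $chars;

    # Not valid UTF-8. ASSUMPTION: treat it as Latin-1, where any octet is valid.
    return decode('ISO-8859-1', $in);
}

# Hypothetical helper: always return UTF-8 octets, without double-encoding
# input that already consists of valid UTF-8 octets.
sub anything_to_utf8 { encode('UTF-8', guess_decode(shift)) }

printf "%vX\n", anything_to_utf8("\xE4");        # Latin-1 octet in
printf "%vX\n", anything_to_utf8("\xC3\xA4");    # UTF-8 octets in
```

Both calls print the same octets, which is the "no double-encoding" property asked for. Mixed encodings inside one string still come out as garbage, as Peter notes.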
Re: [Catalyst] Re: decoding in core
From: Peter Karman pe...@peknet.com
> Neo [GC] wrote on 02/23/2009 09:41 AM:
>> Does anyone know a _safe_ method to convert _any_ string scalar to utf8? Something like anything_to_utf8($s), regardless of whether $s contains ascii, latin1, utf8, tasty hodgepodge or hot fn0rd, whether the utf8 flag is set or not, affected neither by the full moon nor by my horrorscope, and _without_ doing double-encoding (there MUST be some way to determine if it already is utf8... my silly java editor can do it and perl makes difficult things at least possible).
>>
>> I would greatly appreciate this philosopher's stone and will send my hero a bottle of finest Bavarian (Munich!) beer called Edelstoff ("precious stuff" - tasty).
>
> Search::Tools::UTF8::to_utf8() comes close. It won't handle mixed encoding in a single string (which would be garbage anyway) but it does try to prevent double-encoding and uses the Encode goodness under the hood.
>
> --
> Peter Karman . pe...@peknet.com . http://peknet.com/

I understand that there are reasons for not transforming all the encodings to UTF-8 in core, even though it does not seem very complicated: maybe some tables contain ISO-8859-2 chars and other tables contain ISO-8859-1 chars, and when the data needs to be saved, it should keep its original encoding.

But if somebody wants to create a new Catalyst app, with a new database, new templates, controllers, etc., I think it would be very helpful if the programmer needed to specify only once that he wants to use UTF-8 everywhere - in the database, in the templates, in the configuration files of HTML::FormFu, in the controllers - instead of specifying it in several places in the configuration file, or specifying UTF8Columns in DBIC classes... It could be a kind of default.

Octavian
Re: [Catalyst] Re: decoding in core
On Mon, Feb 23, 2009 at 06:45:40PM +0200, Octavian Râşniţă wrote:
> I understand that there are reasons for not transforming all the encodings to UTF-8 in core, even though it seems to be not very complicated, because maybe there are some tables that contain ISO-8859-2 chars and other tables that contain ISO-8859-1 chars, and when the data need to be saved, it should keep its original encoding.

Don't think about transforming encodings to UTF-8. In the vast majority of cases people expect to work with characters, and that's what Perl works with internally. UTF-8 is an encoding, not characters.

The HTTP request is octets. The HTTP request specifies what encoding those octets represent and it's that encoding that is used to decode the octets into characters. The fact that Perl uses UTF-8 internally is best ignored -- it's just characters inside Perl once decoded.

Conceptually it's not that much different than a request with Content-Encoding: gzip -- before using the request body parameters the gzipped octets must obviously be decoded. Likewise, the body must be url-decoded into separate parameters. And again, the resulting octets must be decoded into characters if the parameters are to be used as characters. That last step has often been ignored.

Then when sending a response of (abstract) characters that are inside Perl they must first be encoded into octets.

Those things should be handled at the edge of the application, and that would be in the Engine (or the code the Engine uses).

Yes, the same thing has to happen with templates, the database, and all external data sources. Those are separate issues. HTTP provides a standard way to determine how to encode and decode.

--
Bill Moseley
mose...@hank.org
Sent from my iMutt
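The "decode at the edge, encode at the edge" idea can be sketched with the core Encode module alone. The helper names charset_of(), decode_param() and encode_body() are invented for this illustration and are not Catalyst or HTTP::Body API; falling back to UTF-8 when no usable charset is declared is likewise an assumption of the sketch.

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# Hypothetical helper: extract the charset from a Content-Type header value.
# Falls back to $default when none is declared or Encode does not know it.
sub charset_of {
    my ($content_type, $default) = @_;
    $content_type = '' unless defined $content_type;
    my ($charset) = $content_type =~ /;\s*charset\s*=\s*"?([^\s";]+)"?/i;
    return $charset if defined $charset && find_encoding($charset);
    return $default;
}

# Incoming edge: octets from the wire become Perl characters.
sub decode_param {
    my ($octets, $content_type) = @_;
    return decode(charset_of($content_type, 'UTF-8'), $octets);
}

# Outgoing edge: Perl characters become octets in the response charset.
sub encode_body {
    my ($chars, $content_type) = @_;
    return encode(charset_of($content_type, 'UTF-8'), $chars);
}

# The same character arrives as different octets depending on the charset.
my $from_latin1 = decode_param("\xE4",     'application/x-www-form-urlencoded; charset=ISO-8859-1');
my $from_utf8   = decode_param("\xC3\xA4", 'application/x-www-form-urlencoded; charset=UTF-8');
print "same character\n" if $from_latin1 eq $from_utf8;
```

Between the two edges the application only ever sees characters, so length(), regexes and sorting behave as expected.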
Re: [Catalyst] Re: decoding in core
On Fri, Feb 20, 2009 at 11:57:29AM -0600, Jonathan Rockway wrote:
> The problem with writing a plugin or making this core is that people really really want to misuse Unicode, and will whine when you try to force correctness upon them.

I'm not sure what you mean by wanting to misuse Unicode. You mean like decode using a different encoding than what the charset is in the HTTP headers?

> The only place where you are really allowed to use non-ASCII characters are in the request and response. (HTTP has a way of representing the character encoding of its payload -- URLs and Cookies don't.) C::P::Unicode handles this correct usage correctly.

I disagree there. First, it assumes utf8 instead of what the request states as the encoding. That is generally okay (where you set accept-charset in your forms), but why not decode as the request states?

Second, it only decodes the request parameters. The body_parameters and query_parameters are left undecoded. Is that by design? That is, is it expected that in a POST $c->req->parameters->{foo} would be characters where $c->req->body_parameters->{foo} is undecoded octets? I would not want or expect that.

> The problem is that people want Unicode to magically work where it's not allowed. This includes HTTP headers (WTF!?), and URLs. (BTW, when I say Unicode, I don't necessarily mean Unicode... I mean non-ASCII characters. The Japanese character sets contain non-Unicode characters, and some people want to put these characters in their URLs or HTTP headers. I wish I was making this up, but I am not. The Unicode process really fucked over the Asian languages.)

I'm not sure we want to go down that path. Maybe a plugin for doing crazy stuff with HTTP header encoding, but my initial email was really just about moving decoding of the body (when we have a charset in the request) and encoding on sending (again if there's a charset in the response headers) into core. Trying to do more than that is probably asking for headaches (and whining).

I think there's reasonable debate at what point in the request decoding should happen, though. Frankly, I'm not sure Catalyst should decode, rather HTTP::Body should. HTTP::Body looks at the content type header and if it's application/x-www-form-urlencoded it will decode the body into separate parameters. But why should it ignore decoding the charset also specified?

The query parameters are more troublesome, of course. Seems the common case is to use utf8 in URLs as the encoding, and in the end the encoding just has to be assumed (or specified as a separate parameter). uri_for()'s current behavior of encoding to utf8 is probably a good way to go, and to just always decode the query parameters as utf8 in Catalyst. I suppose uri_for() could add an additional _enc=utf8 parameter to allow for different encodings, but I can't imagine where just assuming utf8 would not be fine. Of course, someone will want to mix encodings in different query parameters.

> There are subtle issues, like knowing not to touch XML (it's binary), dealing with $c->res->body( filehandle ), and so on.

The layer can be set on the file handle. XML will be decoded as application/octet-stream by HTTP::Body, so that should be ok. Although, if there's a charset in the request I would still probably argue that decoding would be the correct thing to do. For custom processing I currently extend HTTP::Body. For example:

    $HTTP::Body::TYPES->{'text/xml'} = 'My::XML::Parser';

which does stream parsing of the XML and thus handles the XML charset decoding.

> One last thing, if this becomes core, it will definitely break people's apps. Many, many apps are blissfully unaware of characters and treat text as binary... and their apps kind-of appear to work. As soon as they get some real characters in their app, though, they will have double-encoded nonsense all over the place, and will blame you for this.

That may be true for some. Most have probably simply ignored encoding and don't realize they are working with octets instead of characters, and thanks to Perl it just all works. So working with real characters instead will likely be transparent for them.

Catalyst::Plugin::Unicode blindly decodes using utf8::decode() and I think that's a no-op if the content has already been decoded (utf8 flag is already set). Likewise, it only encodes if the utf8 flag is set. So, users of that plugin should be ok if character encoding was handled in core and they don't remove the plugin.

--
Bill Moseley
mose...@hank.org
Sent from my iMutt
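To make the parameters / body_parameters / query_parameters point concrete, here is a sketch that decodes every value of a parameter hash, whether it is a plain scalar or an array reference for a repeated parameter. decode_params() is a name invented for this illustration, and the two plain hashes only stand in for the hashes behind $c->req->query_parameters and $c->req->body_parameters; nothing here is existing Catalyst or plugin code.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode all values of a parameter hash in place.
# Values are either plain scalars or array references (repeated parameters).
sub decode_params {
    my ($params, $charset) = @_;
    for my $value (values %$params) {    # foreach aliases the hash values
        if (ref $value eq 'ARRAY') {
            $_ = decode($charset, $_) for @$value;
        }
        else {
            $value = decode($charset, $value);
        }
    }
    return $params;
}

# Stand-ins for the query and body parameter hashes (UTF-8 octets).
my %query = (q    => "\xC3\xA4");
my %body  = (name => "\xC3\xB6", tags => ["\xC3\xBC", 'plain']);

# Decoding both hashes the same way avoids the situation where one
# accessor returns characters and another returns octets.
decode_params($_, 'UTF-8') for \%query, \%body;

printf "q is %d character(s) long\n", length $query{q};
```

Running the same routine twice over the same hash would decode twice, so it has to be called exactly once per request, at the edge.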
Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)
On 6 Feb 2009, at 17:36, Bill Moseley wrote:
> Sure. IIRC, I think there's already been some patches and code posted so maybe I can dig that up again off the archives.

Please do.

> But, sounds like it's not that important of an issue.

The fact that nobody is working on it currently is not an indication that it isn't an important problem to try to solve.

Cheers
t0m
Re: [Catalyst] Re: decoding in core
Braindump follows.

* On Fri, Feb 20 2009, Tomas Doran wrote:
> On 6 Feb 2009, at 17:36, Bill Moseley wrote:
>> Sure. IIRC, I think there's already been some patches and code posted so maybe I can dig that up again off the archives.
>
> Please do.
>
>> But, sounds like it's not that important of an issue.
>
> The fact that nobody is working on it currently is not an indication that it isn't an important problem to try to solve.

I meant to write a plugin to do this a long time ago, but I guess I never cared enough.

The problem with writing a plugin or making this core is that people really really want to misuse Unicode, and will whine when you try to force correctness upon them.

The only place where you are really allowed to use non-ASCII characters are in the request and response. (HTTP has a way of representing the character encoding of its payload -- URLs and Cookies don't.) C::P::Unicode handles this correct usage correctly.

The problem is that people want Unicode to magically work where it's not allowed. This includes HTTP headers (WTF!?), and URLs. (BTW, when I say Unicode, I don't necessarily mean Unicode... I mean non-ASCII characters. The Japanese character sets contain non-Unicode characters, and some people want to put these characters in their URLs or HTTP headers. I wish I was making this up, but I am not. The Unicode process really fucked over the Asian languages.)

So anyway, the plugin basically needs to have the following config options, so users can specify what they want. Inside Catalyst, only Perl characters should be allowed, unless you mark the string as binary (there is a CPAN module that does this, Something::BLOB).

* Input HTTP header encoding (ASCII default)
  (this is for data in $c->req->headers, cookies, etc.)
  (perhaps cookies should be separately configured)

* Input URI encoding (probably UTF-8 default)
  (the dispatcher will dispatch on the decoded characters)
  (source code encoding is handled by Perl, hopefully)

* Input request body encoding (read HTTP headers and decide)

* Output HTTP headers encoding
  (maybe die if this happens, because it's totally illegal to have non-ascii in the headers)

* Output URI encoding
  ($c->uri_for and friends will use this to translate the names of actions that are named with wide characters)

* Output response body encoding
  (this needs to update the HTTP headers, namely the charset= part of Content-type)

I think that is everything. There are subtle issues, like knowing not to touch XML (it's binary), dealing with $c->res->body( filehandle ), and so on.

One last thing, if this becomes core, it will definitely break people's apps. Many, many apps are blissfully unaware of characters and treat text as binary... and their apps kind-of appear to work. As soon as they get some real characters in their app, though, they will have double-encoded nonsense all over the place, and will blame you for this. ("I loaded Catalyst::Plugin::Unicode, and my app broke! It's all your fault." Yup, people mail that to me privately all the time. For some reason, they think I am going to personally fix their app, despite having written volumes of documentation about this. Wrong.)

Anyway, I just wanted to get this out of my head and onto paper, for someone else to look at and perhaps implement. :)

Regards,
Jonathan Rockway

--
print just => another => perl => hacker => if $,=$"
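One way to picture the six options is as a plain configuration hash plus a guard for the "die on non-ASCII output headers" case. Every key name below is hypothetical, as is check_header_value(); no released plugin is known to use this layout, so treat it purely as a sketch of the proposal.

```perl
use strict;
use warnings;

# HYPOTHETICAL option names, one per bullet in the proposal above.
my %encoding_config = (
    input_header  => 'US-ASCII',   # $c->req->headers, cookies
    input_uri     => 'UTF-8',      # path and query string
    input_body    => undef,        # undef: decide from the request headers
    output_header => 'US-ASCII',   # anything else is an error
    output_uri    => 'UTF-8',      # $c->uri_for and friends
    output_body   => 'UTF-8',      # also sets charset= in Content-Type
);

# Hypothetical guard: refuse to emit a header value that contains
# anything other than printable ASCII or a tab.
sub check_header_value {
    my ($name, $value) = @_;
    die "Non-ASCII character in header '$name'\n"
        if $value =~ /[^\x20-\x7E\t]/;
    return $value;
}

print check_header_value('X-Greeting', 'hello'), "\n";
```

Dying early in development is arguably kinder than silently sending a header that some proxy or browser will mangle later.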
Re: [Catalyst] Re: decoding in core
* On Fri, Feb 20 2009, Jonathan Rockway wrote:
> Braindump follows.

Oh yeah, one other thing. IDNs will need to be decoded/encoded, probably. ($c->req->host should contain perl characters, but links should probably be punycoded. Fun!)

--
print just => another => perl => hacker => if $,=$"
Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)
On 6 Feb 2009, at 14:46, Bill Moseley wrote:
> Nobody responded to the main point of this email -- if Catalyst should handle encoding in core instead of with a plugin. Nobody has an opinion about that? Or was it just ignored -- which is often how people handle character encoding in applications. ;)

Does it make a difference if it's in core or in a plugin?

In your original email you said that the existing plugins don't do it right. Which is quite possibly fair criticism, however I don't see how moving the functionality into core would help the code be more correct.

Saying 'Plugin X is broken', 'Let's move Plugin X into core' doesn't sound very convincing from where I'm sat. :_)

Code speaks louder than words, so if you'd like to provide some failing tests for what you think encoding _should_ be doing, that'd probably be a better basis for further discussion.

Cheers
t0m
Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)
On Fri, Jan 30, 2009 at 11:44:57PM +0100, Aristotle Pagaltzis wrote:
> * Bill Moseley mose...@hank.org [2009-01-29 17:05]:
>> Neither of the existing plugins do it correctly (IMO), as they only decode parameters leaving body_parameters as octets, and don't look at the request for the charset, IIRC.
>> […]
>> uri_for() rightly encodes to octets before escaping, but it always encodes to utf-8. Is it assumed that query parameters are always utf-8 or should they be decoded with the charset specified in the request?
>
> The URI should always be assumed to be UTF-8 encoded octets. The body should be decoded according to the charset declared in the header by the browser.

Assume UTF-8 because that's how the application encoded the URL in the first place? Is UTF-8 specified in an RFC? I thought URIs were defined as characters with ASCII encoding for transmitting.

Nobody responded to the main point of this email -- if Catalyst should handle encoding in core instead of with a plugin. Nobody has an opinion about that? Or was it just ignored -- which is often how people handle character encoding in applications. ;)

--
Bill Moseley
mose...@hank.org
Sent from my iMutt
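"Encode to octets, then escape" and its reverse can be sketched in core Perl as follows. The function names are made up for this illustration and this is not the code behind uri_for(); UTF-8 is assumed in both directions, as suggested in the quoted reply. The sketch does not treat "+" as a space, which form-encoded query strings would also require.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: characters -> UTF-8 octets -> percent-escaped ASCII.
sub escape_component {
    my ($chars) = @_;
    my $octets = encode('UTF-8', $chars);
    # Escape everything except the "unreserved" URI characters.
    $octets =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $octets;
}

# Hypothetical helper: percent-escaped ASCII -> octets -> characters.
sub unescape_component {
    my ($escaped) = @_;
    (my $octets = $escaped) =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return decode('UTF-8', $octets);    # ASSUMPTION: the sender used UTF-8
}

# a-umlaut followed by a literal plus sign
my $escaped = escape_component("\x{E4}+");
print "$escaped\n";
print "round trip ok\n" if unescape_component($escaped) eq "\x{E4}+";
```

The escaped form is pure ASCII, which is all a URI may contain on the wire; which characters those octets stand for is a convention between sender and receiver, hence the need to assume an encoding.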
Re: [Catalyst] Re: decoding in core (Was: [Announce] Catalyst-Runtime-5.8000_05)
On Fri, Feb 06, 2009 at 03:16:14PM +, Tomas Doran wrote:
> On 6 Feb 2009, at 14:46, Bill Moseley wrote:
>> Nobody responded to the main point of this email -- if Catalyst should handle encoding in core instead of with a plugin. Nobody has an opinion about that? Or was it just ignored -- which is often how people handle character encoding in applications. ;)
>
> Does it make a difference if it's in core or in a plugin?
>
> In your original email you said that the existing plugins don't do it right. Which is quite possibly fair criticism, however I don't see how moving the functionality into core would help the code be more correct.
>
> Saying 'Plugin X is broken', 'Let's move Plugin X into core' doesn't sound very convincing from where I'm sat. :_)

Two different issues, although I would assume if you moved it into core there would be more careful consideration and discussion about how to do it. Which is why I posted -- for a discussion.

The question is whether encoding should be a core function. A plugin works, but not everyone uses it. My argument for doing it in core is that inside Perl is character data, so it must be decoded at some point, and it's Catalyst (and the engines) that load the parameters. And if it's character data on the inside, it has to be encoded when writing.

> Code speaks louder than words, so if you'd like to provide some failing tests for what you think encoding _should_ be doing, that'd probably be a better basis for further discussion.

Sure. IIRC, I think there's already been some patches and code posted so maybe I can dig that up again off the archives. But, sounds like it's not that important of an issue.

--
Bill Moseley
mose...@hank.org
Sent from my iMutt