Re: [Catalyst] Re: Patch for Catalyst::Plugin::Unicode::Encoding

Matt Lawrence Wed, 19 Mar 2008 03:51:59 -0700

Aristotle Pagaltzis wrote:

* Tatsuhiko Miyagawa <[EMAIL PROTECTED]> [2008-03-19 07:20]:

Some modules like XML::LibXML adds UTF-8 flags regardless of if
the characters to handle are composed of latin-1 range (like
Encode::decode_utf8 instead of utf8::decode), and that's pretty
much realistic and sane approach I think.


Yes. If the flag is to have any use at all, then it has to have
the semantic of distinguishing character vs byte strings.

I agree with Bill that the plugin trying to decode already
utf-8 flagged string doesn't make any sense, but furthermore, I
wonder under which circumstance the plugin tries to decode
already-utf8-flagged strings.

I'd say that's the root problem.


Yes; and that’s exactly what Jon said.

There are a number of ways that incoming data could already be decoded: environment, perl switches or pragmata. Ideally every application would do as Jon proposes and ensure that nothing decodes the string before the plugin sees it. But checking the flag before decoding is at worst harmless and at best prevents data corruption: it would prevent already-decoded strings becoming deformed, decode encoded UTF-8 (or whatever) strings and leave unflagged ASCII strings alone, whether or not decode had already been attempted.

Perhaps the best approach would be to warn and not decode when flagged data is seen. That way the data should never be deformed, and the author can see that something else is decoding too early and fix it.
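
To make the "warn and don't decode" idea concrete, here is a rough sketch. It is not the plugin's actual code: the helper name decode_unless_flagged and the default of UTF-8 are assumptions for illustration only. It relies on utf8::is_utf8 and Encode::decode, both of which ship with core perl.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (name and defaults are assumptions, not plugin code).
# Decodes a byte string, but leaves already-flagged data untouched and warns.
sub decode_unless_flagged {
    my ($value, $encoding) = @_;
    $encoding ||= 'UTF-8';    # assumed default
    return $value unless defined $value;

    if (utf8::is_utf8($value)) {
        # Something upstream has already decoded this value: report it
        # and hand the data back unchanged instead of decoding it again.
        warn "value is already flagged as characters; skipping decode\n";
        return $value;
    }

    # Unflagged data is treated as octets in $encoding.
    return Encode::decode($encoding, $value);
}

my $bytes = "caf\xc3\xa9";                  # UTF-8 encoded octets
my $chars = decode_unless_flagged($bytes);  # decoded once
my $again = decode_unless_flagged($chars);  # warns, returned as-is
```

Note that Encode::decode returns a flagged string even for pure ASCII input, so with this guard a second pass over any decoded value warns rather than decoding again.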


Matt


_______________________________________________
List: [email protected]
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/[email protected]/
Dev site: http://dev.catalyst.perl.org/
