Hello Marc

On 2010-05-12, at 15:40, Marc Mims wrote:
>> my $title = "\x{e4}\x{f6}\x{fc}"; # "äöü"
> 
> This isn't a UTF-8 string.
> 
>    utf8::is_utf8($title); # false
> 
>    utf8::upgrade($title); # now it is

It is a string consisting of the three characters \x{e4}, \x{f6} and \x{fc}. 
That's about all I need to know as a Perl user; reread [1] if in doubt. The 
important thing to know is that you cannot rely on Perl holding strings 
internally in UTF-8! Of course I could force Perl to hold this string 
internally in UTF-8 by using utf8::upgrade(), but the question is: where 
should I do that so as to cover all cases? As pointed out in [2], overriding 
get_columns and store_columns won't work reliably. That's why I suggested 
using the inflate/deflate subroutines, but will this work in all cases? Even 
then it would be a bad idea to use utf8::upgrade(), because that's not what 
it's meant for. As pointed out in [3], the flow should be as follows:

> 1. Receive and decode
> 2. Process
> 3. Encode and output
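
For illustration, the three steps can be sketched with the core Encode module 
alone. The variable names and the sample octets are my own assumptions, not 
anything taken from DBIx::Class or DBD::mysql:

```perl
use strict;
use warnings;
use Encode ();

# 1. Receive and decode: octets from the outside world become characters.
my $input_octets = "\xc3\xa4\xc3\xb6\xc3\xbc";    # UTF-8 octets for "äöü"
my $text = Encode::decode('UTF-8', $input_octets);

# 2. Process: operate on characters, never on the raw octets.
my $char_count = length $text;          # counts characters, not bytes
my $first_char = substr $text, 0, 1;    # the single character \x{e4}

# 3. Encode and output: turn characters back into octets at the boundary.
my $output_octets = Encode::encode('UTF-8', $first_char);
binmode STDOUT;                         # we are printing octets, no I/O layer
print $output_octets, "\n";
```

At no point does the code ask how Perl stores $text internally.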


As a matter of fact, neither DBIx::Class nor DBD::mysql does the 3rd step 
(encoding to UTF-8); if one of them did, the problem would not arise. Look at 
this:

my $title = "\x{e4}\x{f6}\x{fc}";
return Encode::encode('UTF-8', $title);

and

my $other_title = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($other_title); 
return Encode::encode('UTF-8', $other_title);

Both yield the same result. Using utf8::upgrade() here is useless, and again: 
as pointed out in [1] you shouldn't care about the internal format.
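
Here is a small self-contained check of that claim, using only core modules. 
The variable names $octets and $other_octets are mine, added for the 
comparison:

```perl
use strict;
use warnings;
use Encode ();

my $title       = "\x{e4}\x{f6}\x{fc}";
my $other_title = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($other_title);    # changes the internal storage only

# The internal representations differ ...
my $title_flag       = utf8::is_utf8($title)       ? 1 : 0;
my $other_title_flag = utf8::is_utf8($other_title) ? 1 : 0;

# ... but both are the same three characters,
my $same_characters = $title eq $other_title ? 1 : 0;

# and encoding yields the same six octets for both.
my $octets       = Encode::encode('UTF-8', $title);
my $other_octets = Encode::encode('UTF-8', $other_title);

print "flags: $title_flag / $other_title_flag\n";
print "same characters: $same_characters\n";
print "same octets: ", ($octets eq $other_octets ? 1 : 0), "\n";
```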

My question remains: is deflate/inflate a safe place to do encoding, or will it 
suffer the same flaws as DBIx::Class::UTF8Columns?

Regards
Matias E. Fernandez

[1] 
http://perldoc.perl.org/5.12.0/perlunifaq.html#I-lost-track%3b-what-encoding-is-the-internal-format-really%3f
[2] 
http://search.cpan.org/~frew/DBIx-Class-0.08121/lib/DBIx/Class/UTF8Columns.pm#Warning_-_Module_does_not_function_properly_on_create/insert
[3] 
http://perldoc.perl.org/5.12.0/perlunitut.html#I%2fO-flow-(the-actual-5-minute-tutorial)
