Hello Jean-Christophe,

On 12.06.2013 at 16:44, Jean-Christophe Boggio wrote:

> Hello,
> 
> Can someone help me understand what could cause this :
> 
> warn "\$content : ".(utf8::is_utf8($content) ? "utf8" : "not utf8");
> warn "\$ticketdata[0]->[0] : ".(utf8::is_utf8($ticketdata[0]->[0]) ? "utf8" : "not utf8");
> warn "content4=$content";
> if ($ticketdata[0]->[0] ne $content) {
>       warn "content5=$content";
>       #
>       warn "content6=$content stored=".$ticketdata[0]->[0];
>       warn "content7=$content";
> }
> 

[...]

> I guess the problem comes from the fact that on the same line I have one 
> utf-8 variable and one non-utf8 one.
> 
> $content comes from $fdat{content} (not marked as utf8 while the page 
> encoding is declared and recognized as utf-8).
> 
> What can I do to force embperl to always set the utf-8 flag on $fdat{...} ?
> 
> If you know a way of telling Apache/EmbPerl that no encoding other than UTF-8 
> exists in the world, I'll take it. And it's not a problem if I'm incompatible 
> with anything.



I think your guess is right: when a statement mixes a utf8-flagged string with 
an unflagged one, Perl upgrades the unflagged string to its internal UTF-8 
form, and it treats the bytes as ISO-8859-1 for that conversion! 
So a string that already holds UTF-8 bytes ends up double-encoded and is 
destroyed after that. The same thing happens when you use Freeze::Thaw or 
Data::Dumper, which is bad for serializing and storing something in a 
database :-(
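
To illustrate the effect, here is a minimal sketch in plain Perl (no Embperl 
involved). The variable names are made up for the example; $bytes stands in 
for an unflagged value such as your $content, $chars for the flagged value 
from @ticketdata.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "é" as raw UTF-8 bytes, the way an unflagged form value would look
my $bytes = "\xC3\xA9";
# The same character as a decoded, utf8-flagged character string
my $chars = decode('UTF-8', "\xC3\xA9");

# Concatenation upgrades $bytes as if it were ISO-8859-1: the bytes 0xC3
# and 0xA9 become the two characters U+00C3 and U+00A9, not one U+00E9.
my $mixed = $chars . $bytes;

printf "length(\$bytes) = %d\n", length $bytes;
printf "length(\$chars) = %d\n", length $chars;
printf "length(\$mixed) = %d\n", length $mixed;

# The comparison from the original code sees two different strings
print "strings differ\n" if $chars ne $bytes;

# Encoding the mixed string back to UTF-8 exposes the double encoding
my $out = encode('UTF-8', $mixed);
print join(' ', map { sprintf '%02X', ord } split //, $out), "\n";
```

The last line shows the second "é" stored as C3 83 C2 A9 instead of C3 A9, 
which is the typical double-encoded garbage.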

Embperl decides for itself whether the %fdat parameters are utf8 or not. I 
don't know how it does so (maybe Gerald could say something about that), but 
we have had a lot of "funny" things in the past regarding this problem. Our 
website uses several encodings (neither UTF-8 nor ISO-8859-1), so we ran into 
this trouble. We implemented our own "thaw" method, which tries to thaw the 
data and, if that fails, converts the data to utf8 and thaws it again...

A solution for you could be: use "$content=decode('UTF-8',$content)" (decode 
comes from the Encode module) to flag your variable, or walk over %fdat and do 
the same for every value that is not already utf8-flagged. After that, you 
should have only utf8-flagged variables and everything should work as expected.
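
A rough sketch of the "walk over %fdat" idea could look like this. The helper 
name decode_fdat is made up, and the %fdat below is only a stand-in so the 
snippet runs on its own; in a real page the hash is provided by Embperl. I am 
assuming here that all values are plain scalars containing UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for Embperl's %fdat (assumption: only plain scalar values)
my %fdat = (
    content => "caf\xC3\xA9",                    # raw UTF-8 bytes, unflagged
    title   => decode('UTF-8', "na\xC3\xAFve"),  # already decoded and flagged
);

# Hypothetical helper: decode every value that is not yet utf8-flagged
sub decode_fdat {
    my ($fdat) = @_;
    for my $key (keys %$fdat) {
        my $value = $fdat->{$key};
        next unless defined $value && !ref $value;  # skip undef and references
        next if utf8::is_utf8($value);              # already flagged, leave alone
        $fdat->{$key} = decode('UTF-8', $value);
    }
    return $fdat;
}

decode_fdat(\%fdat);

printf "%s: %d characters\n", $_, length $fdat{$_} for sort keys %fdat;
```

Called once at the top of the page, before anything else touches %fdat, this 
leaves already flagged values untouched. Note that decode() in its default 
mode replaces invalid byte sequences with U+FFFD instead of dying, so input 
that is not really UTF-8 will not be detected this way.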

One little additional comment: using non utf8-flagged variables with 
UTF-8 content (like your $content variable) breaks a lot of Perl stuff, 
because these functions then work on bytes instead of characters: lc, uc, 
cmp, le, gt, length, sort, ....
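
A small sketch of what goes wrong, using length and uc as examples (the 
variable names are again only for illustration):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $bytes = "\xC3\xA9t\xC3\xA9";        # "été" as raw UTF-8 bytes, unflagged
my $chars = decode('UTF-8', $bytes);    # the same text as characters

# length counts bytes for the unflagged string, characters for the decoded one
printf "length(\$bytes) = %d\n", length $bytes;
printf "length(\$chars) = %d\n", length $chars;

# uc only touches the ASCII "t" in the byte string; the é bytes stay as is
my $uc_bytes = uc $bytes;
# On the decoded string uc works per character, so é becomes É
my $uc_chars = encode('UTF-8', uc $chars);

# Hypothetical helper for the example: show a byte string as hex
sub hex_dump { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

print "uc on bytes: ", hex_dump($uc_bytes), "\n";
print "uc on chars: ", hex_dump($uc_chars), "\n";
```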


With best regards,

Dirk Melchers
/// IT/Software-Development ///

NUREG GmbH ///
Dorfäckerstraße 31 | 90427 Nürnberg | Germany
Tel. +49-911-32002-256 | Fax +49-911-32002-299
Mobil +49-172-9354670 | www.nureg.de
Nürnberg HRB 22653 | USt.ID DE 814 685 653
Geschäftsführer: Michael Schmidt, Stefan Boas


