Ben Hiebert wrote :
Perl usually tries to guess at the best encoding when it takes in the
data and then encodes it internally as best it can. You may have a
problem where the data comes in as ISO-8859-1 but Perl thinks it is
UTF-8 data, encodes it internally as UTF-8 and then prints out the
UTF-8-as-ISO-8859-1 to give you the bad results.
Yes, that is my guess too.
It may be worth checking to see what format Perl thinks your incoming
data is by using
$flag = utf8::is_utf8(STRING);
Good idea. I modified the code to this :
    while (read($fdat{efilename},$buffer,32768)) {
        if (utf8::is_utf8($buffer)) {
            print OUT "u";
        }
        print FILE $buffer;
    }
...but in both cases (working and not) I never get the "uuuuu" lines.
BUT when the $buffer is written to disk it is transformed! I tried
binmode FILE just after opening the file for output, but the same
thing happens.
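A minimal sketch of what the is_utf8 test above actually checks,
assuming in-memory handles stand in for the upload handle and the
output file ($source, $copy and $flagged are illustrative names, not
part of the real code). utf8::is_utf8 reports Perl's internal flag on
the string, not the encoding of the file, so a buffer filled by read()
from a handle without an encoding layer should never be flagged, even
when the bytes happen to be valid UTF-8:

    ```perl
    use strict;
    use warnings;

    # Hypothetical stand-in for the upload: "é" as UTF-8 (0xC3 0xA9)
    # followed by "é" as ISO-8859-1 (0xE9).
    my $source = "\xC3\xA9 \xE9\n";

    open(my $in, '<', \$source) or die "open in: $!";
    my $copy = '';
    open(my $out, '>', \$copy) or die "open out: $!";

    # No encoding layers on either side: bytes in, bytes out.
    binmode($in);
    binmode($out);

    my $flagged = 0;
    my $buffer;
    while (read($in, $buffer, 32768)) {
        # is_utf8 reports Perl's internal flag, not the file's encoding.
        $flagged++ if utf8::is_utf8($buffer);
        print {$out} $buffer;
    }
    close($out) or die "close out: $!";
    close($in);

    print "flagged buffers: $flagged\n";
    print "copy identical: ", ($copy eq $source ? "yes" : "no"), "\n";
    ```

If a copy done this way is byte-identical and the real one is not, the
transformation is probably happening somewhere other than this loop,
for example in a layer already set on the input handle.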
If Perl thinks it is UTF-8, then it is misinterpreting your incoming
data and you'll need to decode it, either with decode or with one of
the other UTF-8 utilities. This may work:
$GoodInternalString = decode("iso-8859-1", $IncomingData);
That's what I use when the file *is* iso-8859-1.
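For reference, a small self-contained sketch of that decode step,
assuming the Encode module and a made-up sample string ("café" in
ISO-8859-1; $utf8_bytes is an illustrative name). decode turns bytes
into characters, and encode turns them back into bytes for output:

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode encode);

    # Hypothetical incoming data: "café" as ISO-8859-1 bytes (é = 0xE9).
    my $IncomingData = "caf\xE9";

    # Bytes -> characters: one character per byte for ISO-8859-1.
    my $GoodInternalString = decode('iso-8859-1', $IncomingData);

    # Characters -> bytes for output; here UTF-8, so é becomes 0xC3 0xA9.
    my $utf8_bytes = encode('UTF-8', $GoodInternalString);

    printf "chars: %d, utf8 bytes: %d\n",
        length($GoodInternalString), length($utf8_bytes);
    ```

This only helps when the data is text in a known encoding. For a file
that should be stored exactly as uploaded, no decode or encode step
should be needed at all.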
These are the pages I read over and over and over again until my pages
magically work:
:-) I see *exactly* what you mean. I've read these pages over and over too.
I don't get the reason for that random behaviour.
Thanks,
JC