Hi Dan,
[Silence “Wide character” warning globally one time]
Dan Muey wrote on 29.07.2010 at 16:59 (-0500):
>
> I've a situation where a large code base will be outputting "byte
> strings" and "unicode strings" from a number of sources.
All lumped together? This will likely mean encoding issues.
> I essentially need to do
> no warnings "utf8";
That's hiding the problem.
> [ -- Problem: Unicode string gives warning -- ]
Sorry, but no, the warning is not the problem: it alerts you to the
actual problem, which is printing wide characters to a single-byte
(narrow) output handle.
> perl -le 'print "Think before you code™ (bytes string)";print "Hello
> \x{201C}World\x{201D} (Unicode String)";'
perl -CO -le 'print "Hello \x{201C}World\x{201D}";'
See: perldoc perlrun
You need the equivalent of -CO in your script:
binmode STDOUT, ':utf8';
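To see that it is the handle, not the string, that triggers the warning, here is a rough sketch that prints the same string twice to an in-memory handle, once without a layer and once with ':utf8'. The helper name print_through is made up for illustration, and the in-memory handle only stands in for STDOUT so that the bytes and warnings can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle, optionally
# with an I/O layer, and return the bytes written plus any warnings.
sub print_through {
    my ($text, $layer) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my $buffer = '';
    open my $out, '>', \$buffer
        or die "Cannot open in-memory handle: $!";
    binmode $out, $layer if defined $layer;   # same call as for STDOUT
    print {$out} $text;
    close $out or die "Cannot close in-memory handle: $!";

    return ($buffer, @warnings);
}

my $text = "Hello \x{201C}World\x{201D}";

# Narrow handle: Perl warns about the wide characters.
my ($narrow_bytes, @narrow_warnings) = print_through($text);

# Handle with a UTF-8 layer: the characters are encoded, no warning.
my ($utf8_bytes, @utf8_warnings) = print_through($text, ':utf8');

printf "no layer:    %d warning(s)\n", scalar @narrow_warnings;
printf ":utf8 layer: %d warning(s), %d bytes for %d characters\n",
    scalar @utf8_warnings, length $utf8_bytes, length $text;
```

In a real script the binmode call would go once near the top, before anything is printed to STDOUT.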
> perl -le 'binmode STDOUT, "utf8";print "Think before you code™ (bytes
> string)";print "Hello \x{201C}World\x{201D} (Unicode String)";'
You're mingling bytes and Unicode characters together. The result is, of
course, wrong. Pick either bytes or Unicode characters as your internal
standard, and convert input accordingly.
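A minimal sketch of what "convert input accordingly" could look like if you settle on characters internally: decode byte strings as they come in, work with characters, and encode once on the way out. It assumes the incoming bytes are UTF-8, which may not hold for all of your sources, and the helper name to_characters is invented for this example. decode and encode come from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: byte strings arriving from outside are UTF-8 encoded.
my $byte_string = "Think before you code\xE2\x84\xA2";   # raw UTF-8 bytes
my $char_string = "Hello \x{201C}World\x{201D}";          # Perl characters

# Hypothetical helper: turn incoming UTF-8 bytes into characters and
# die on malformed input instead of letting it pass silently.
# decode() with a CHECK value may modify its argument, so it works
# on the local copy in $bytes, not on the caller's variable.
sub to_characters {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes, Encode::FB_CROAK);
}

# Everything is characters from here on, so joining is safe.
my $line = join ' / ', to_characters($byte_string), $char_string;

# Encode once, at the output boundary. A ':utf8' layer on the handle
# would do the same job; use one or the other, not both.
my $output = encode('UTF-8', $line);

printf "%d characters, %d bytes on output\n",
    length $line, length $output;
```

Decoding at the edges keeps the byte-versus-character decision in one place instead of spreading it over every print statement.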
> Is there any super voo doo that can be done?
The best voodoo is understanding Perl's Unicode handling by reading
Juerd's pages [1], which have also been included in the docs of current
Perl versions (perlunitut and perlunifaq), so read those as well. But
don't read the old docs; they contain some documentation bugs.
You might also want to read the archives of this list to see how I
managed to make some progress in understanding thanks to the good
answers I got here.
[1] http://juerd.nl/perl
--
Michael Ludwig