Hi,

It appears that HTML::Parser modifies some Unicode characters while parsing. The following program gives an example:
#########
#!/usr/bin/perl
use HTML::Parser;
use utf8;
open TEST, '>:utf8', 'word.txt';
my $p = new HTML::Parser text_h => [sub {print TEST shift}, 'text'];
$p->parse("zespołów\n");
close TEST;
#########

After running it, 'word.txt' contains "zespoÅÃw", with the funny l and the funny o following it transformed to something else. What am I doing wrong?

I'm running perl 5.8.5 and HTML::Parser version 3.36 on Linux.

Thanks,
Moshe