On Mon, Nov 10, 2003 at 05:20:59PM +0000, Edmund GRIMLEY EVANS wrote:
> I have a problem here with Perl v5.8.0 on Red Hat 9. Simplified, my
> script looks like this:
>
> while (<>) {
> s/Ä/cx/g;
> print;
> }
>
> This works with older versions of Perl, and it works in the C locale,
> but it doesn't work here in a UTF-8 locale. I tried putting stuff like
> "use bytes" or "no utf8" or "no locale", but it didn't help.
As long as the Perl script and the input are in the same encoding, it
works for me. (Debian unstable)
This is perl, v5.8.0 built for i386-linux-thread-multi
10:14am [EMAIL PROTECTED]/2 [~] cat testing.txt; file testing.txt
abÄd
testing.txt: UTF-8 Unicode text
10:17am [EMAIL PROTECTED]/2 [~] LANG=en_US.UTF-8 ./xxx.pl < testing.txt
abcxd
10:14am [EMAIL PROTECTED]/2 [~] LANG=C ./xxx.pl < testing.txt
abcxd
10:14am [EMAIL PROTECTED]/2 [~] LANG=en_US.ISO-8859-3 ./xxx.pl < testing.txt
abcxd
ISO-8859-3:
10:17am [EMAIL PROTECTED]/2 [~] LANG=en_US.UTF-8 ./xxx3.pl < testing-3.txt
abcxd
10:18am [EMAIL PROTECTED]/2 [~] LANG=C ./xxx3.pl < testing-3.txt
abcxd
10:18am [EMAIL PROTECTED]/2 [~] LANG=en_US.ISO-8859-3 ./xxx3.pl < testing-3.txt
abcxd
(Of course, no locale works if I mix encodings.)
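A rough sketch of why mixing fails, assuming the character is U+00C4 as it appears above: the same letter is a different byte sequence in each encoding, so a pattern saved in one can't match input in the other. The variable names are just for illustration, and the bytes are written as octal escapes so the encoding of the snippet itself doesn't matter.

```shell
#!/bin/sh
# "Ä" (U+00C4) as raw bytes in each encoding, dumped as hex.
utf8_bytes=$(printf '\303\204' | od -An -tx1 | tr -d ' \n')   # UTF-8: two bytes
latin3_bytes=$(printf '\304' | od -An -tx1 | tr -d ' \n')     # ISO-8859-3: one byte

echo "UTF-8:      $utf8_bytes"
echo "ISO-8859-3: $latin3_bytes"

# A script stored as UTF-8 carries the two-byte form in its s/// pattern,
# so it never matches the one-byte form in ISO-8859-3 input (and vice versa).
if [ "$utf8_bytes" = "$latin3_bytes" ]; then
    echo "same bytes"
else
    echo "different bytes"
fi
```

With matching encodings the substitution is plain byte-for-byte matching, which would explain why LANG made no difference in the runs above.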
> exec("/path/to/this/script", @ARGV);
> }
> .)??D??-|??Ë{??v??W?z[
Hmm. What's this garbage at the end of the message? Oh. Poking at the
raw message body, it's the stupid footer that the mailing list blindly
spams on every message (despite this being a base64 message).
--
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/