At 10:46 am +0100 20/12/05, [EMAIL PROTECTED] wrote:
...Let's say I have a txt file which contains a list of strings.
Some of these strings contain characters encoded in this fashion:
R\xC3\xA9union (\xC3\xA9 is one character - e with an accent).
...Now, this fails, even though when I look at the file name it is
Reunion (with accented e). This fails because my $in =~ s/// didn't
produce an accented e, although I've checked that \xC3\xA9 is the
correct encoding for that character. Can you please tell me what I
am doing wrong and, more generally, how to correctly make these
kinds of string comparisons with strange characters?
If I run this, which I think is reproducing your situation, first
with a string in the script and then with text read from a file:
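#!/usr/bin/perl
# Case 1: the escaped string is in the script itself.
$in = 'R\xC3\xA9union' . $/;
# Replace each literal \xHH with the byte it names.
$in =~ s~\\x(..)~chr(hex($1))~eg;
print $in;
# Case 2: write the same text to a file, then read it back.
$testtext = 'R\xC3\xA9union' . $/;
$testfile = "$ENV{HOME}/test.txt";
open TEST, ">", $testfile or die "Cannot write $testfile: $!";
print TEST $testtext;
close TEST;
open TEST, "<:encoding(us-ascii)", $testfile or die "Cannot read $testfile: $!";
while (<TEST>) {
    s~\\x(..)~chr(hex($1))~eg;
    print;
}
close TEST;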
I get
Réunion
Réunion
Do you get a different result?
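If the output looks right but a later string comparison still fails, one possible cause is that the substitution leaves two separate bytes (C3 and A9) rather than one accented character. Here is a minimal sketch of decoding those bytes as UTF-8 before comparing. It assumes the escapes in your file really are UTF-8 byte values, and unescape_utf8 is a name I made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn literal \xHH escapes into bytes, then decode
# those bytes as UTF-8, so that \xC3\xA9 becomes one character (U+00E9).
sub unescape_utf8 {
    my ($text) = @_;
    (my $bytes = $text) =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return decode('UTF-8', $bytes);
}

my $in = unescape_utf8('R\xC3\xA9union');
print length($in), "\n";    # character count after decoding
print $in eq "R\x{E9}union" ? "match\n" : "no match\n";
```

Whatever you compare against (a file name, for instance) would need to be decoded the same way, or else both sides left as raw bytes. Mixing the two is what usually makes an eq test fail.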
JD