Re: Matching encoded strings and file names

John Delacour Tue, 20 Dec 2005 16:04:23 -0800

At 10:46 am +0100 20/12/05, [EMAIL PROTECTED] wrote:

...Let's say I have a txt file which contains a list of strings. Some of these strings contain characters encoded in this fashion:
R\xC3\xA9union (\xC3\xA9 is one character - e with an accent).
...Now, this fails, even though when I look at the file name it is Reunion (with accented e). This fails because my $in =~ s/// didn't produce an accented e, although I've checked that \xC3\xA9 is the correct encoding for that character. Can you please tell me what I am doing wrong and, more generally, how to correctly make these kinds of string comparisons with strange characters?

If I run this, which I think is reproducing your situation, first with a string in the script and then with text read from a file:


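        #!/usr/bin/perl
        # Case 1: the escaped string is in the script itself.
        $in  = 'R\xC3\xA9union' . $/;
        $in =~ s~\\x(..)~chr(hex($1))~eg;   # replace each \xHH with the byte it names
        print $in;#####
        # Case 2: the same text is written to a file and read back.
        $testtext = 'R\xC3\xA9union' . $/;
        $testfile = "$ENV{HOME}/test.txt";
        open TEST, ">$testfile" or die "Cannot write $testfile: $!";
        print TEST $testtext;
        close TEST;
        open TEST, "<:encoding(us-ascii)", $testfile or die "Cannot read $testfile: $!";
        while (<TEST>) {
          s~\\x(..)~chr(hex($1))~eg;
          print; #####
        }
        close TEST;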

I get

        Réunion
        Réunion

Do you get a different result?
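
If the output looks right but a comparison still fails, one possibility is that the substitution leaves you with two separate bytes (C3 and A9) while the string you compare against holds a single character. A minimal sketch of one way round that, assuming the escapes really are UTF-8 and using the core Encode module (the sub name unescape_utf8 is just something I made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Turn literal \xHH escapes into raw bytes, then decode those bytes
# as UTF-8 so that C3 A9 becomes the single character U+00E9.
sub unescape_utf8 {
    my ($text) = @_;
    (my $bytes = $text) =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return decode('UTF-8', $bytes);
}

my $name = unescape_utf8('R\xC3\xA9union');

# After decoding, the string is 7 characters long and compares equal
# to one written with a single accented e.
print length($name), "\n";
print $name eq "R\x{E9}union" ? "match\n" : "no match\n";
```

A file name returned by readdir is a string of bytes, so it would presumably need the same decode() call (assuming the file system stores names as UTF-8) before being compared with eq.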

JD
