Thank you. That seems to work. However, can you explain why every line in
the file that contains "Testing" gets printed twice?

Kevin


my $string = "Testing";
# expand ANSI to UTF-16LE (not the most efficient, but...)
$string =~ s/(.)/$1\x00/g;
while (<FILE>)
{
        if (/$string/o)
        {
                print;
        }
}

-----Original Message-----
From: Ned Konz [mailto:[EMAIL PROTECTED]]
Sent: Saturday, May 06, 2000 12:14 AM
To: [EMAIL PROTECTED]
Cc: Perl-Win32-Users Mailing List
Subject: Re: Finding and replacing a UNICODE string.


[EMAIL PROTECTED] wrote:
> 
> I read in the docs that Unicode was limited in Perl. I would like to
> search for and replace a Unicode (2 byte) string in a file. If I want to
> simply print out each line in a Unicode file the following works just fine.
> 
> $filename = "file.unicode";
> open(FILE, $filename) or die "Can't open '$filename': $!";
> while(<FILE>) {
>         print $_;
> }
> close FILE;
> 
> Yet if I try to match a particular string the string is not matched.
> 
> $filename = "file.unicode";
> open(FILE, $filename) or die "Can't open '$filename': $!";
> while(<FILE>) {
>         if(/Testing/) {
>                 print $_;
>         }
> }
> close FILE;
> 
> I know that "Testing" is in the file (of course there are two bytes per
> character) but it seems that Perl does not find it or does not properly
> convert the characters. Any suggestions as to how I might proceed?

The Unicode::String module provides for mapping between Unicode and
non-Unicode character sets.

As far as regular expressions go, if you write the pattern so that it
matches the bytes actually stored in the file, they can work on Unicode data.

For instance, if you're looking for a constant string in a two-byte
(UTF-16, little-endian) file, you can do this:

my $string = "Testing";
# expand ANSI to UTF-16LE (not the most efficient, but...)
$string =~ s/(.)/$1\x00/g;
while (<FILE>)
{
        if (/$string/o)
        {
                print;
        }
}
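
A self-contained variant of the same idea, as a rough sketch only: the
sample lines below are made up to stand in for the file contents, and the
\Q...\E quoting is an addition so that any regex metacharacters in the
search string are treated literally. It assumes the file is little-endian
and that the search string is plain ASCII. On Win32 it is probably also
worth calling binmode(FILE) before reading, so the bytes arrive unchanged.

```perl
use strict;
use warnings;

# Build the byte pattern: each ASCII character followed by a NUL byte,
# which is how the text is stored in a little-endian two-byte file.
my $string = "Testing";
(my $pattern = $string) =~ s/(.)/$1\x00/gs;

# \Q...\E quotes the pattern so it is matched as literal bytes.
my $re = qr/\Q$pattern\E/;

# Hypothetical sample data standing in for lines read from the file.
my @lines = (
    "T\x00e\x00s\x00t\x00i\x00n\x00g\x00 \x001\x00\r\x00\n\x00",
    "o\x00t\x00h\x00e\x00r\x00\r\x00\n\x00",
);

# Keep only the lines whose raw bytes contain the expanded string.
my @matches = grep { $_ =~ $re } @lines;
print scalar(@matches), " matching line(s)\n";
```

Compiling the pattern once with qr// serves the same purpose as the /o
modifier in the loop above.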

Perl 5.6 supports UTF-8 directly (even in regular expressions, as I
understand).

Maybe you should look at the perldelta documentation for this version.
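
For what it's worth, here is a sketch of the decode-first approach. It is
an assumption on my part rather than something Perl 5.6 provides: it relies
on the Encode module, which ships with later Perl releases (5.8 and up),
and the sample data is invented for illustration. The idea is to turn the
two-byte data into a Perl character string first, so that an ordinary
/Testing/ pattern matches.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: two lines of text encoded as UTF-16LE bytes,
# standing in for the raw contents of the file.
my $sample = "Testing 1\r\nother\r\n";
my $bytes  = encode('UTF-16LE', $sample);

# Decode the raw bytes into a character string.
my $text = decode('UTF-16LE', $bytes);

# Split into lines and keep those that match the plain pattern.
my @hits = grep { /Testing/ } split /\r?\n/, $text;
print "$_\n" for @hits;
```

With a real file you would read the raw bytes (after binmode) and pass
them to decode() in the same way.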


-- 
Ned Konz
currently: Stanwood, WA
email:     [EMAIL PROTECTED]
homepage:  http://www.bike-nomad.com
