"use utf8;" tells perl that the current source file is encoded in UTF-8, so string literals in that file are decoded as UTF-8 (as opposed to latin1). It only affects the source code itself, not data read at run time.
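A minimal sketch of the effect, assuming the script itself is saved as UTF-8. The variable names are made up for illustration, and the raw-bytes case stands in for data that arrives undecoded from outside the program; Encode::decode is used there to turn the bytes into characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                       # this source file is UTF-8 encoded
use Encode qw(decode);

# Encode characters back to UTF-8 when printing.
binmode(STDOUT, ':encoding(UTF-8)');

# Case 1: a literal in a "use utf8" file is already a character string.
my $literal = 'N-Größe';
my @literal_words = ( $literal =~ /(\w+)/g );
print join( " ++ ", @literal_words ), "\n";    # prints: N ++ Größe

# Case 2: the same text as undecoded UTF-8 bytes, as it might arrive
# from a file or socket opened without an encoding layer.
my $bytes = "N-Gr\xC3\xB6\xC3\x9Fe";
my @raw_words = ( $bytes =~ /(\w+)/g );        # \w only sees ASCII here

# Decoding the bytes gives a character string that \w handles fully.
my $decoded = decode( 'UTF-8', $bytes );
my @decoded_words = ( $decoded =~ /(\w+)/g );
print join( " ++ ", @decoded_words ), "\n";    # prints: N ++ Größe
```

The undecoded case splits into "N", "Gr" and "e", which is the symptom from the original question.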
Since your string is likely coming from elsewhere, look into binmode($fh, ":utf8") and open($fh, "<:utf8", $file), and also Encode::decode. These are the common ways to get a string decoded and marked as Unicode in memory, at which point the regex engine treats \w+ as matching all alphanumeric characters, not only [a-zA-Z0-9_].

There is a tutorial by Juerd somewhere; it's supposed to be pretty good. Try Google, perhaps.

On Mon, Aug 20, 2007 at 15:39:58 +0300, Pinkhas Nisanov wrote:
> Hi,
>
> I need catch string that may include 'utf8' characters:
> e.g.:
>
> my $str_utf8 = 'N-Größe';
> my @res = ( $str_utf8 =~ /(\w+)/g );
> print join( " ++ ", @res ), "\n";
>
>
> it prints:
>
> N ++ Gr ++ e
>
> but I need:
>
> N ++ Größe
>
>
> thanks
> Pinkhas Nisanov
> _______________________________________________
> Perl mailing list
> [email protected]
> http://perl.org.il/mailman/listinfo/perl

-- 
Yuval Kogman <[EMAIL PROTECTED]>
http://nothingmuch.woobling.org  0xEBD27418
