* Benjamin Goldberg adds a clever (and, I think, easier to understand) bit of code to do this.
Index: perlfaq4.pod
===================================================================
RCS file: /cvs/public/perlfaq/perlfaq4.pod,v
retrieving revision 1.37
diff -u -d -r1.37 perlfaq4.pod
--- perlfaq4.pod	13 Nov 2002 06:04:00 -0000	1.37
+++ perlfaq4.pod	13 Nov 2002 06:32:46 -0000
@@ -593,6 +593,28 @@
     @$ = (eval{/$re/},$@!~/unmatched/i);
     print join("\n",@$[0..$#$]) if( $$[-1] );
 
+Benjamin Goldberg offers a Unicode-aware version that replaces
+the markers with wide, private use characters.
+
+    my $str = $_;
+    my @parts;
+    1 while $str =~ s/BEGIN((?:(?!BEGIN)(?!END).)*)END/
+        push @parts, $1;
+        chr( 0xE000 + $#parts );
+    }ge;
+    my $re = @parts ?
+        sprintf "[%c-%c]", 0xE000, 0xE000 + $#parts :
+        '(?!)';
+    s/($re)/$parts[ord($1) - 0xE000]/g for @parts;
+
+The private use part of Unicode is from 0xE000 to
+0xF8FF---try not to have more than 0x18FF nested parts with
+this code. This is unlikely to work on 5.6.0, due to it's
+limited utf8 handling, unless you change 0xE000 to something
+smaller, like 0x80, which won't require an upgrade from
+bytes to utf8. But this means that it won't work on 8-bit
+data, only on 7-bit data.
+
 =head2 How do I reverse a string?
 
 Use reverse() in scalar context, as documented in
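
For anyone who wants to try the technique from the patch outside the FAQ, here is a minimal sketch, not the committed text. It wraps the same steps in a sub named nested_parts (a hypothetical name, not in the patch) and writes the substitution with balanced s{...}{...}ge delimiters, since the snippet in the diff opens with "/" and closes with "}". As in the patch, "." does not match newlines, and the returned parts have their BEGIN/END markers stripped.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not part of the patch).
# Returns the body of every BEGIN...END block, innermost first, with
# nested blocks expanded back to their text.
sub nested_parts {
    my ($str) = @_;
    my @parts;

    # Repeatedly swap each innermost BEGIN...END pair for one private-use
    # character whose code point encodes the block's index in @parts.
    1 while $str =~ s{BEGIN((?:(?!BEGIN)(?!END).)*)END}{
        push @parts, $1;
        chr( 0xE000 + $#parts );
    }ge;

    return unless @parts;

    # Character class covering exactly the placeholders handed out above.
    my $re = sprintf "[%c-%c]", 0xE000, 0xE000 + $#parts;

    # A part can only hold placeholders for earlier parts, which are
    # already expanded by the time the loop reaches it.
    s/($re)/$parts[ord($1) - 0xE000]/g for @parts;

    return @parts;
}

print "[$_]\n" for nested_parts("a BEGIN b BEGIN c END d END e");
# prints:
# [ c ]
# [ b  c  d ]
```

The inner block comes out first because each pass can only match a pair with no BEGIN or END between its markers.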