perlfaq4: How do I find matching/nesting anything?

_brian_d_foy Tue, 12 Nov 2002 22:38:32 -0800

* Benjamin Goldberg adds a clever (and I think easier to understand) bit of
code to do this.



Index: perlfaq4.pod
===================================================================
RCS file: /cvs/public/perlfaq/perlfaq4.pod,v
retrieving revision 1.37
diff -u -d -r1.37 perlfaq4.pod
--- perlfaq4.pod        13 Nov 2002 06:04:00 -0000      1.37
+++ perlfaq4.pod        13 Nov 2002 06:32:46 -0000
@@ -593,6 +593,28 @@
     @$ = (eval{/$re/},$@!~/unmatched/i);
     print join("\n",@$[0..$#$]) if( $$[-1] );
 
+Benjamin Goldberg offers a Unicode-aware version that replaces
+the markers with wide, private use characters.
+
+       my $str = $_;
+       my @parts;
+       1 while $str =~ s{BEGIN((?:(?!BEGIN)(?!END).)*)END}{
+         push @parts, $1;
+         chr( 0xE000 + $#parts );
+       }ge;
+       my $re = @parts ?
+         sprintf "[%c-%c]", 0xE000, 0xE000 + $#parts :
+         '(?!)';
+       s/($re)/$parts[ord($1) - 0xE000]/g for @parts;
+
+The private use area of Unicode runs from 0xE000 to 0xF8FF,
+so try not to have more than 0x1900 (6400) nested parts with
+this code. This is unlikely to work on 5.6.0, due to its
+limited utf8 handling, unless you change 0xE000 to something
+smaller, like 0x80, which won't require an upgrade from
+bytes to utf8.  But this means that it won't work on 8-bit
+data, only on 7-bit data.
+
 =head2 How do I reverse a string?
 
 Use reverse() in scalar context, as documented in

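To try the idea outside the FAQ, here is a minimal standalone sketch of the snippet in the patch above. The sub name nested_parts and the sample string are my own assumptions for illustration, not part of the patch. It writes the substitution with s{...}{...} delimiters so the code block in the replacement has a matching close. The patch works on $_ directly, while this version takes the string as an argument.

```perl
use strict;
use warnings;

# Hypothetical helper name; the patch operates on $_ inline instead.
sub nested_parts {
    my ($str) = @_;
    my @parts;

    # Repeatedly replace each innermost BEGIN...END pair with a
    # private-use placeholder character, saving the enclosed text.
    1 while $str =~ s{BEGIN((?:(?!BEGIN)(?!END).)*)END}{
        push @parts, $1;
        chr( 0xE000 + $#parts );
    }ge;

    return unless @parts;

    # Character class covering every placeholder handed out above.
    my $re = sprintf "[%c-%c]", 0xE000, 0xE000 + $#parts;

    # Expand placeholders inside each saved part. Inner parts are
    # pushed before the parts that contain them, so they are already
    # fully expanded by the time they are substituted in.
    s/($re)/$parts[ord($1) - 0xE000]/g for @parts;

    return @parts;
}

# Sample input (an assumption for illustration only).
my @found = nested_parts("a BEGIN b BEGIN c END d END e");
print "[$_]\n" for @found;
```

Each returned element is the text between one BEGIN/END pair, innermost pairs first, with the markers of any inner pairs dropped from the outer text.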