I am coming into this late, but I would use \G to walk the string one group at a time:
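my $str = '02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4';
# /c keeps pos() in place when a match fails; without it a failed match
# resets pos() and the outer loop would start over from the beginning.
while ($str =~ /\G\s*(\d+)/gc) {
    my $cnt = $1;
    # Build one capture group per expected item.
    my $re = '\s*(s\dn\d)' x $cnt;
    if ($str =~ /\G$re/gc) {
        # Only groups 1..$cnt exist; read them through the @- and @+ offsets.
        my @d = map { substr($str, $-[$_], $+[$_] - $-[$_]) } 1 .. $cnt;
        print "Got $cnt: @d\n";
    } else {
        print "ERROR\n";
        last;    # stop at the first bad group
    }
}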
Of course, it's a little kludgy, but assigning the match to an
array unfortunately puts the /g match in list context, where it runs
until it fails and resets pos(), which screws up the whole thing.
Other code would have to be added to check for bad trailing data.
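One way the trailing-data check could look, as a minimal sketch only: parse_counted is a hypothetical helper (the name, the return shape and the die messages are invented for illustration, nothing here comes from Any2XML). It keeps both matches in scalar context so \G and pos() stay usable, and treats anything but whitespace after the last group as an error.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [count, item, item, ...]
# array refs, or dies on a short group or on trailing data.
sub parse_counted {
    my ($str) = @_;
    my @groups;
    pos($str) = 0;
    while ($str =~ /\G\s*(\d+)/gc) {
        my $cnt = $1;
        my $re  = '\s*(s\dn\d)' x $cnt;
        # Scalar context keeps \G stepping forward; /c preserves pos() on failure.
        $str =~ /\G$re/gc
            or die "expected $cnt items at offset " . pos($str) . "\n";
        # @- and @+ hold the start and end offsets of each capture group.
        my @items = map { substr($str, $-[$_], $+[$_] - $-[$_]) } 1 .. $cnt;
        push @groups, [ $cnt, @items ];
    }
    # Whatever is left after the last good group must be whitespace only.
    my $rest = substr($str, pos($str));
    die "trailing data: '$rest'\n" if $rest =~ /\S/;
    return @groups;
}
```

Something like `print "Got @$_\n" for parse_counted($str);` would then list the groups, and wrapping the call in eval lets the caller decide what to do with a malformed string.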
There is also, of course, s///e:
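# /e calls check() for each count-plus-items group. Note that this
# overwrites each group in $str with whatever check() returns.
$str =~ s/\s*(\d+)((?:\s+s\d+n\d+)*)/check($1,$2)/ge;
sub check {
    my ($cnt, $data) = @_;
    my @d = split /\s+/, $data;
    shift @d;    # drop the empty field produced by the leading space
    if ($cnt == @d) {
        print "Got $cnt: @d\n";
    } else {
        print "Error, expected $cnt, got ", scalar(@d), "\n";
    }
}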
This is probably easier to deal with...
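If the string has to survive intact, the same pattern can be run with m//g in a loop instead of s///e. This is a rough sketch, not a drop-in: scan_groups is a hypothetical name, and the \b anchors are an added assumption so that a digit inside a token like s1n2 is never taken for a count.

```perl
use strict;
use warnings;

# Hypothetical helper: scans without modifying the input and returns
# [expected_count, \@items] pairs for the caller to compare.
sub scan_groups {
    my ($str) = @_;
    my @out;
    while ($str =~ /\b(\d+)\b((?:\s+s\d+n\d+)*)/g) {
        my ($cnt, $data) = ($1, $2);
        # split ' ' (a literal single space) skips leading whitespace,
        # so there is no empty first field to shift off.
        my @items = split ' ', $data;
        push @out, [ $cnt + 0, \@items ];
    }
    return @out;
}
```

The caller can then compare `$_->[0]` with `scalar @{ $_->[1] }` for each pair, which is the same test check() does. Like the s///e version, it silently skips stray tokens between groups.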
On Fri, Oct 11, 2002 at 02:20:13PM -0400, Ranga Nathan wrote:
> Thanks guys, that was a quick response.
> In the context of Any2XML, which forces me to use a regex for top-down
> parsing, I need to use regex. Even if the regex is slow, that is fine.
> If it is impossible to do this in regex, of course, I would resort to
> split().
>
> Can this be done using regex? Why does \1 not work? These are my
> questions.
>
> Thanks again...
>
> ---Closing information gaps-----
> Ranga Nathan, Reliance Technology
> >>SEVIS solution now! http://goreliance.com
> >>Live demo at http://any2xml.com/docs/timesheet_demo.shtml<<
> >>Get free COBOLExplorer at http://goreliance.com/download-products <<
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED]] On Behalf Of Mark Aisenberg
> > Sent: Friday, October 11, 2002 2:15 PM
> > To: [EMAIL PROTECTED]
> > Subject: RE: [Boston.pm] Calling regex gurus ..A regex question..
> >
> >
> > Your '\1' question aside:
> > Your solution requires you to know the number of patterns in
> > the string. For long strings, a regex could be slow. Why not
> > just 'split' on whitespace into an array and then use array
> > indices to easily extract the items you want?
> >
> >
> >
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > On Behalf Of Ranga Nathan
> > Sent: Friday, October 11, 2002 1:42 PM
> > To: [EMAIL PROTECTED]
> > Subject: [Boston.pm] Calling regex gurus ..A regex question..
> >
> >
> > I need to parse a string that has multiple occurrences of a
> > pattern that is determined by an embedded count. For example:
> >
> > 02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4
> >
> > 02 is the count and I need to extract s1n1 and s1n2
> >
> > 3 is the count and I need to extract s2n1, s2n2 and s2n3
> >
> > And so on.
> >
> > So I tried to do:
> > $var = /(.*?)\s+((?:.*?\s+){\1})(.*?)\s+((?:.*?\s+){\3})(.*?)\s+((?:.*?\s+){\5})/;
> >
> > And was expecting "02" "s1n1 s1n2 " "3" "s2n1 s2n2 s2n3 " "1"
> > "s3n1 " "4" "s4n1 s4n2 s4n3 s4n4 " as matches.
> >
> > This does not work.
> > The \1, \2 etc are not evaluated as 'iterators'. I tried the
> > experimental ?{} too.
> >
> >
> >
> >
> > _______________________________________________
> > Boston-pm mailing list
> > [EMAIL PROTECTED] http://mail.pm.org/mailman/listinfo/boston-pm
--
___ __ __ __ _ _ ____ _ _ ____ ____
/ __)( )( ) /__\( \/ )( ___) ( \( )( ___)(_ _)
\__ \ )(__)( /(__)\\ / )__) ) ( )__) )(
(___/(______)(__)(__)\/ (____)()(_)\_)(____) (__)
"Say, this a little bit of all right!" - Die Fladermaus