I am coming into this late, but I would use \G, which anchors each match at the point where the previous //g match on the string left off:

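my $str = '02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4';

# scalar-context //g: each pass picks up the next count where we left off
while($str =~ /\G\s*(\d+)/g) {
    my $cnt = $1;

    # one capture group per expected item
    my $re = '\s*(s\dn\d)' x $cnt;

    if($str =~ /\G$re/g) {
        # pull out exactly $cnt captures via @- and @+ (no limit of 9, no undefs)
        my @d = map { substr($str, $-[$_], $+[$_] - $-[$_]) } 1 .. $cnt;
        print "Got $cnt: @d\n";
    } else {
        # a failed //g match resets pos(), so bail out rather than rescan from the start
        print "ERROR\n";
        last;
    }

}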

Of course, it's a little kludgy, but assigning the regexp
match to an array unfortunately puts the g modifier in list
context. There //g keeps matching until it fails, and that final
failure resets pos(), which screws up the whole thing.
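
To make the context issue concrete, here is a small sketch (the shortened test string and the variable names are mine, just for illustration) comparing what pos() holds after a scalar-context //g match and after a list-context one:

```perl
use strict;
use warnings;

my $str = '02 s1n1 s1n2 3 s2n1';

# Scalar context: a single match, and pos() remembers where it ended.
$str =~ /\G\s*(\d+)/g;
my $after_scalar = pos($str);

# List context: //g repeats until it fails, and that last failure
# resets pos() because there is no /c modifier.
my @d = $str =~ /\G\s*(s\dn\d)/g;
my $after_list = pos($str);

print "captured: @d\n";
print "pos after scalar match: $after_scalar\n";
print "pos after list match: ",
      (defined $after_list ? $after_list : 'undef'), "\n";
```

Adding /c to the list-context match should keep pos() at the end of the last successful match instead, if you would rather go that way.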

Other code would have to be implemented to check for bad trailing data...
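
One way that trailing-data check might look, as a rough sketch only: parse_counted is a made-up helper name, and returning a (groups, error) pair is just my choice here. It relies on /gc so that a failed match leaves pos() where it was, and then requires that only whitespace remains.

```perl
use strict;
use warnings;

# parse_counted is a hypothetical helper, not part of the code above.
# Returns (\@groups, $error); $error is undef if the whole string parsed.
sub parse_counted {
    my ($str) = @_;
    my @groups;
    pos($str) = 0;

    # /c keeps pos() in place when a match fails
    while ($str =~ /\G\s*(\d+)/gc) {
        my $cnt = $1;
        my $re  = '\s*(s\dn\d)' x $cnt;

        if ($str =~ /\G$re/gc) {
            push @groups,
                [ map { substr($str, $-[$_], $+[$_] - $-[$_]) } 1 .. $cnt ];
        } else {
            return (\@groups, "expected $cnt items at offset " . pos($str));
        }
    }

    # anything left other than whitespace is bad trailing data
    if ($str !~ /\G\s*\z/gc) {
        return (\@groups, "bad trailing data at offset " . pos($str));
    }
    return (\@groups, undef);
}

my ($groups, $err) = parse_counted(
    '02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4');
if (defined $err) {
    print "ERROR: $err\n";
} else {
    print "Got ", scalar(@$_), ": @$_\n" for @$groups;
}
```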

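There is also, of course, a single substitution with /e:

# note: s///e replaces each record in $str with whatever check() returns,
# so run this on a copy if you still need the original string
$str =~ s/\s*(\d+)((?:\s+s\d+n\d+)*)/check($1,$2)/ge;

sub check {
    my($cnt, $data) = @_;
    my @d = split ' ', $data; # split ' ' skips the leading space
    if($cnt == @d) {
        print "Got $cnt: @d\n";
    } else {
        print "Error, expected $cnt, got ",scalar(@d),"\n";
    }
}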


This is probably easier to deal with...




On Fri, Oct 11, 2002 at 02:20:13PM -0400, Ranga Nathan wrote:
> Thanks guys, that was a quick response.
> In the context of Any2XML, which forces me to use a regex for top-down
> parsing, I need to use regex. Even if the regex is slow, that is fine.
> If it is impossible to do this in regex, of course, I would resort to
> split().
> 
> Can this be done using regex? Why does \1 not work? These are my
> questions.
> 
> Thanks again...
> 
> ---Closing information gaps-----
> Ranga Nathan, Reliance Technology
> >>SEVIS solution now! http://goreliance.com
> >>Live demo at http://any2xml.com/docs/timesheet_demo.shtml<<
> >>Get free COBOLExplorer at http://goreliance.com/download-products <<
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED]] On Behalf Of Mark Aisenberg
> > Sent: Friday, October 11, 2002 2:15 PM
> > To: [EMAIL PROTECTED]
> > Subject: RE: [Boston.pm] Calling regex gurus ..A regex question..
> > 
> > 
> > Your '\1' question aside:
> > Your solution requires you to know the number of patterns in 
> > the string. For long strings, a regex could be slow. Why not 
> > just 'split' on whitespace into an array and then use array 
> > indices to easily extract the items you want?
> > 
> > 
> > 
> > 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > On Behalf Of Ranga Nathan
> > Sent: Friday, October 11, 2002 1:42 PM
> > To: [EMAIL PROTECTED]
> > Subject: [Boston.pm] Calling regex gurus ..A regex question..
> > 
> > 
> > I need to parse a string that has multiple occurrences of a 
> > pattern that is determined by an embedded count. For example:
> > 
> > 02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4
> > 
> > 02 is the count and I need to extract s1n1 and s1n2
> > 
> > 3 is the count and I need to extract s2n1, s2n2 and s2n3
> > 
> > And so on.
> > 
> > So I tried to do:
> > $var = /(.*?)\s+((?:.*?\s+){\1})(.*?)\s+((?:.*?\s+){\3})(.*?)\s+((?:.*?\s+){\5})/;
> > 
> > And was expecting "02" "s1n1 s1n2 " "3" "s2n1 s2n2 s2n3 " "1" 
> > "s3n1 " "4" "s4n1 s4n2 s4n3 s4n4 " as matches.
> > 
> > This does not work.
> > The \1, \2 etc are not evaluated as 'iterators'. I tried the 
> > experimental ?{} too.
> > 
> > 
> > 
> > _______________________________________________
> > Boston-pm mailing list
> > [EMAIL PROTECTED] http://mail.pm.org/mailman/listinfo/boston-pm

-- 
 ___  __  __    __  _  _  ____    _  _  ____  ____ 
/ __)(  )(  )  /__\( \/ )( ___)  ( \( )( ___)(_  _)
\__ \ )(__)(  /(__)\\  /  )__)    )  (  )__)   )(  
(___/(______)(__)(__)\/  (____)()(_)\_)(____) (__) 
"Say, this a little bit of all right!" - Die Fladermaus
