RE: [Boston.pm] Calling regex gurus ..A regex question..

Ranga Nathan Fri, 11 Oct 2002 13:40:33 -0700

Sick? It should!!! But it gives me the result I want. Thanks a bunch for
that.
Yes, the perldoc for perlre says that it is an 'experimental feature' and
'don't try this at home!'.
Honestly, is this going to go away?
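
In case it does go away, here is a minimal fallback sketch that avoids
(??{ }) entirely. It walks the string with \G and the /gc modifiers, reads a
count, then insists on exactly that many items, and finally checks that
nothing but whitespace is left over. The helper name parse_counted and the
item shape s\d+n\d+ are my assumptions for illustration, not something from
Any2XML or from Anthony's post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [count, @items] array refs,
# or an empty list if the string does not fit the count-prefixed layout.
sub parse_counted {
    my ($str) = @_;
    my @groups;
    pos($str) = 0;
    while ($str =~ /\G\s*(\d+)/gc) {
        my $cnt = $1 + 0;               # "02" becomes 2
        my @items;
        for (1 .. $cnt) {
            # /c keeps pos() in place on failure; a missing item is an error
            $str =~ /\G\s+(s\d+n\d+)/gc or return;
            push @items, $1;
        }
        push @groups, [ $cnt, @items ];
    }
    # Anything left other than trailing whitespace is bad data
    return unless $str =~ /\G\s*\z/gc;
    return @groups;
}

my $str = '02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4';
for my $g (parse_counted($str)) {
    my ($cnt, @items) = @$g;
    print "Got $cnt: @items\n";
}
```

This is plain procedural code around small regexes, so it does not satisfy a
"single regex only" constraint, but it should keep working whatever happens
to the experimental feature.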


I was going to escalate this to MJD / Jeffrey Friedl. Wonder what they
would suggest?

For those of you who think I am nuts: this kind of data structure is
very common on legacy computing platforms. The internal data structures of
many file systems allow multiple occurrences based on a count, although
the items are fixed-length strings rather than delimited by something. With
regex, I can add the delimiter magic to it!
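
To make the fixed-length case concrete, here is a minimal sketch. The layout
is purely an assumption for illustration (a 2-character count followed by
that many 4-character items, repeated to the end of the record), and
parse_fixed is a made-up name; real legacy layouts will differ.

```perl
use strict;
use warnings;

# Assumed layout (illustrative only): 2-character count, then that many
# 4-character items, repeated until the record is exhausted.
# Returns a list of [count, @items] array refs, or an empty list on bad data.
sub parse_fixed {
    my ($rec) = @_;
    my @groups;
    my $off = 0;
    while ($off < length $rec) {
        my $cnt = substr($rec, $off, 2);
        return unless $cnt =~ /\A\d{2}\z/;           # count must be two digits
        $off += 2;
        return if $off + 4 * $cnt > length $rec;     # record is truncated
        # Cut the item area into 4-character fields
        my @items = unpack('A4' x $cnt, substr($rec, $off, 4 * $cnt));
        $off += 4 * $cnt;
        push @groups, [ $cnt + 0, @items ];
    }
    return @groups;
}

for my $g (parse_fixed('02s1n1s1n203s2n1s2n2s2n3')) {
    my ($cnt, @items) = @$g;
    print "Got $cnt: @items\n";
}
```

With fixed widths the count alone tells you where the next group starts; the
delimited form in my question is the same idea, except that the regex has to
find the item boundaries.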

Thanks Anthony for the contribution..

---Closing information gaps-----
Ranga Nathan, Reliance Technology
>>SEVIS solution now! http://goreliance.com
>>Live demo at http://any2xml.com/docs/timesheet_demo.shtml<<
>>Get free COBOLExplorer at http://goreliance.com/download-products <<

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:boston-pm-admin@;mail.pm.org] On Behalf Of Anthony R. J. Ball
> Sent: Friday, October 11, 2002 4:17 PM
> To: Ranga Nathan
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Boston.pm] Calling regex gurus ..A regex question..
> 
> 
> 
>  HA! Here is a pure regexp that will verify that string for you...
> 
>  This oughtta make some people sick...
> 
> $str = '02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4';
> 
> if($str =~ /^((\d+)\s+(??{ my $re = 's\d+n\d+\s*'x$+; 
> qr($re); }))+$/) {
>       print "MATCH\n";
> }
> 
> 
> seemed to work with my limited testing...
> 
> On Fri, Oct 11, 2002 at 03:40:31PM -0400, Anthony R. J. Ball wrote:
> > 
> > 
> > I am coming into this late... but I would use \G
> > 
> > $str = '02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4';
> > 
> > while($str =~ /\G\s*(\d+)/g) {
> >     my $cnt = $1;
> >     my @d;
> > 
> >     my $re = '\s*(s\dn\d)' x $cnt;
> > 
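> >     if($str =~ /\G$re/g) {
> >         # keep only the groups that matched (handles counts up to 9)
> >         @d = grep { defined } ($1,$2,$3,$4,$5,$6,$7,$8,$9);
> >         print "Got $cnt: @d\n";
> >     } else {
> >         print "ERROR\n";
> >         last; # a failed //g match resets pos(), so stop instead of looping forever
> >     }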
> > 
> > }
> > 
> > of course... it's a little kludgy... but assigning the regexp match to an
> > array unfortunately puts the g modifier in list context, which
> > screws up the whole thing...
> > 
> > Other code would have to be implemented to check for bad trailing 
> > data...
> > 
> > there is also, of course...
> > 
> > $str =~ s/\s*(\d+)((?:\s+s\d+n\d+)*)/check($1,$2)/ge;
> > 
> > sub check {
> >     my($cnt, $data) = @_;
> >     my @d = split /\s+/, $data;
> >     shift @d; #leading space
> >     if($cnt == @d) {
> >         print "Got $cnt: @d\n";
> >     } else {
> >         print "Error, expected $cnt, got ",scalar(@d),"\n";
> >     }
> > }
> > 
> > 
> > This is probably easier to deal with...
> > 
> > 
> > 
> > 
> > On Fri, Oct 11, 2002 at 02:20:13PM -0400, Ranga Nathan wrote:
> > > Thanks guys, that was a quick response.
> > > In the context of Any2XML, which forces me to use a regex for
> > > top-down parsing, I need to use regex. Even if the regex is slow,
> > > that is fine. If it is impossible to do this in regex, of course, I
> > > would resort to split().
> > > 
> > > Can this be done using regex? Why does \1 not work? These are my 
> > > questions.
> > > 
> > > Thanks again...
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED]
> > > > [mailto:boston-pm-admin@;mail.pm.org] On Behalf Of Mark Aisenberg
> > > > Sent: Friday, October 11, 2002 2:15 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: RE: [Boston.pm] Calling regex gurus ..A regex question..
> > > > 
> > > > 
> > > > Your '\1' question aside:
> > > > Your solution requires you to know the number of patterns in
> > > > the string. For long strings, a regex could be slow. Why not 
> > > > just 'split' on whitespace into an array and then use array 
> > > > indices to easily extract the items you want?
> > > > 
> > > > 
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED] 
> > > > [mailto:boston-pm-admin@;mail.pm.org]
> > > > On Behalf Of Ranga Nathan
> > > > Sent: Friday, October 11, 2002 1:42 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: [Boston.pm] Calling regex gurus ..A regex question..
> > > > 
> > > > 
> > > > I need to parse a string that has multiple occurrences of a
> > > > pattern that is determined by an embedded count. For example:
> > > > 
> > > > 02 s1n1 s1n2 3 s2n1 s2n2 s2n3 1 s3n1 4 s4n1 s4n2 s4n3 s4n4
> > > > 
> > > > 02 is the count and I need to extract s1n1 and s1n2
> > > > 
> > > > 3 is the count and I need to extract s2n1, s2n2 and s2n3
> > > > 
> > > > And so on.
> > > > 
> > > > So I tried to do:
> > > > $var =
> > > > /(.*?)\s+((?:.*?\s+){\1})(.*?)\s+((?:.*?\s+){\3})(.*?)\s+((?:.*?\s+){\5})/;
> > > > 
> > > > And was expecting "02" "s1n1 s1n2 " "3" "s2n1 s2n2 s2n3 " "1"
> > > > "s3n1 " "4" "s4n1 s4n2 s4n3 s4n4 " as matches.
> > > > 
> > > > This does not work.
> > > > The \1, \2 etc are not evaluated as 'iterators'. I tried the
> > > > experimental ?{} too.
> > > > 
> > > > 
> > > > 
> > > > 
> > > > _______________________________________________
> > > > Boston-pm mailing list
> > > > [EMAIL PROTECTED] 
> > > > http://mail.pm.org/mailman/listinfo/boston-pm
> > > > 
> > > 
> > 
> > -- 
> >  ___  __  __    __  _  _  ____    _  _  ____  ____ 
> > / __)(  )(  )  /__\( \/ )( ___)  ( \( )( ___)(_  _)
> > \__ \ )(__)(  /(__)\\  /  )__)    )  (  )__)   )(  
> > (___/(______)(__)(__)\/  (____)()(_)\_)(____) (__)
> > "Say, this a little bit of all right!" - Die Fladermaus
> > 
> 
> -- 
>  ___  __  __    __  _  _  ____    _  _  ____  ____ 
> / __)(  )(  )  /__\( \/ )( ___)  ( \( )( ___)(_  _)
> \__ \ )(__)(  /(__)\\  /  )__)    )  (  )__)   )(  
> (___/(______)(__)(__)\/  (____)()(_)\_)(____) (__) 
> "Smoke me a kipper & I'll be back for breakfast" Ace, Red Dwarf
> 
> 

