Re: Regexp code feature

2005-08-21 Thread David Landgren

Nicholas Clark wrote:

On Mon, Jul 25, 2005 at 12:07:27PM -0500, Chris Dolan wrote:


This is exactly the sort of feedback I was hoping to see, however.  May 
I recommend that you add a note about that $^R problem to the docs at 
annocpan.org?

 http://annocpan.org/~NWCLARK/perl-5.8.7/pod/perlre.pod



Annotations are designed for real users.

*This* is a bug in the implementation that needs to be documented
(preferably fixed)

annocpan is not the *master* copy of the documentation of anything.
As ever, please report the bug to the maintainer of the module (via the
correct mechanism for that module), so that the master can be updated.
Otherwise the bug will never be fixed, and the incorrect documentation
will continue to propagate elsewhere.


For the archives, a bug report *was* filed against this behaviour in
December 2004, as #32840, "$^R value lost in (?:...)? constructs".


I could mail a patch to p5p to document it in perlre.pod if you think 
it's important enough.


David



Re: Regexp code feature

2005-07-26 Thread Nicholas Clark
On Mon, Jul 25, 2005 at 12:07:27PM -0500, Chris Dolan wrote:

 This is exactly the sort of feedback I was hoping to see, however.  May 
 I recommend that you add a note about that $^R problem to the docs at 
 annocpan.org?
   http://annocpan.org/~NWCLARK/perl-5.8.7/pod/perlre.pod

Annotations are designed for real users.

*This* is a bug in the implementation that needs to be documented
(preferably fixed)

annocpan is not the *master* copy of the documentation of anything.
As ever, please report the bug to the maintainer of the module (via the
correct mechanism for that module), so that the master can be updated.
Otherwise the bug will never be fixed, and the incorrect documentation
will continue to propagate elsewhere.

Nicholas Clark


Re: Regexp code feature

2005-07-25 Thread David Landgren

Chris Dolan wrote:
What's the minimum version of Perl needed to reliably exploit the (?{  
code }) feature in regexps?


I'm working on a module that I call Net::IP::Match::Regexp which builds  
and runs regexps that contain simple code blocks.  It works great under  
5.8.1 on my Mac, but I haven't tested older Perl versions yet.


My code turns these IP ranges (randomly generated)
  109.27.190.54/28    = 1
  109.61.26.198/24    = 2
  180.203.154.195/28  = 3
  5.98.198.68/19      = 4
  68.238.145.35/29    = 5

into regexps that look like this:

my $re =
^(?:0(?:10101100010110(?{'2'})|1(?:000100111011101001000100100(?{'3'})|10110100(?:01101110100011(?{'5'})|0100011010(?{'4'}|1011010011001011100110101100(?{'1'}));



It's used in this context, pulling the code values out via $^R:


Note that $^R is broken. It won't be set correctly in cases like

'ab' =~ /^a(?{1})(?:b(?{2}))?$/

In this case $^R will be set to 1 instead of 2. I have filed a bug
report on this, but I doubt it will ever get fixed unless Dave Mitchell
finds a pile of tuits. That said, looking at your problem domain, I'm not
sure you'll encounter optional tails.
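
A quick way to see whether a given perl is affected is to run that
match and look at $^R afterwards. The following is only a minimal
sketch: the helper name last_code_value is made up for illustration,
and the output is deliberately not predicted here, since it depends on
whether the perl in use has the bug.

```perl
use strict;
use warnings;

# Hypothetical helper: return whatever $^R holds after a successful
# match, or undef if the string does not match at all.
sub last_code_value {
    my ($string, $re) = @_;
    return $string =~ $re ? $^R : undef;
}

# The pattern from the example above: the code block in the optional
# tail should leave 2 in $^R when the tail takes part in the match.
my $re  = qr/^a(?{1})(?:b(?{2}))?$/;
my $got = last_code_value('ab', $re);

print "\$^R after matching 'ab': ", (defined $got ? $got : 'undef'), "\n";
print $got == 2
    ? "the optional tail's value was kept\n"
    : "the optional tail's value was lost\n";
```

The code blocks are literal here, so no "use re 'eval'" is needed.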


Plugging my own module here, you might find that Regexp::Assemble, in 
its tracked pattern mode, will do all that you need, and works around 
this bug. This should help save you a certain amount of development time.


David


Re: Regexp code feature

2005-07-25 Thread Chris Dolan

On Jul 25, 2005, at 10:59 AM, David Landgren wrote:


Note that $^R is broken. It won't be set correctly in cases like

'ab' =~ /^a(?{1})(?:b(?{2}))?$/

In this case $^R will be set to 1 instead of 2. I have filed a bug
report on this, but I doubt it will ever get fixed unless Dave
Mitchell finds a pile of tuits. That said, looking at your problem
domain, I'm not sure you'll encounter optional tails.


Plugging my own module here, you might find that Regexp::Assemble, in 
its tracked pattern mode, will do all that you need, and works 
around this bug. This should help save you a certain amount of 
development time.


Thanks for the info, David.  No, it should not affect my code, 
fortunately.  My regexps are designed so that at most one code block is 
hit, right at the end of the match (they are all at the end of a series 
of literals).


This is exactly the sort of feedback I was hoping to see, however.  May 
I recommend that you add a note about that $^R problem to the docs at 
annocpan.org?

  http://annocpan.org/~NWCLARK/perl-5.8.7/pod/perlre.pod

I did look at Regexp::Assemble at Sébastien's suggestion.  It looks 
very powerful.  My problem space is much simpler than that module is 
trying to address, however, so constructing a binary tree to represent 
the IP ranges and building my regexps from that tree was 
straightforward (and performant) in about 30 lines of code.
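
For readers wondering what such a tree-based construction might look
like, here is a rough sketch. It is not the code from
Net::IP::Match::Regexp: all sub names (ip_to_bits, build_tree,
tree_to_re, compile_matcher) are hypothetical, and it assumes IPv4
only, a non-empty set of ranges, and values without quote characters.
When one range contains another, the shorter prefix wins.

```perl
use strict;
use warnings;

# Convert a dotted-quad IPv4 address into a 32-character string of bits.
sub ip_to_bits {
    my ($ip) = @_;
    return unpack('B32', pack('C4', split(/\./, $ip)));
}

# Insert each network prefix into a binary tree of nested hashes.
# The node that ends a prefix stores its value under the key 'value'.
sub build_tree {
    my (%ranges) = @_;
    my $tree = {};
    for my $cidr (keys %ranges) {
        my ($ip, $mask) = split m{/}, $cidr;
        my $node = $tree;
        for my $bit (split //, substr(ip_to_bits($ip), 0, $mask)) {
            $node = $node->{$bit} ||= {};
        }
        $node->{value} = $ranges{$cidr};
    }
    return $tree;
}

# Emit regexp source for a node: literal bits, then a single code block
# at the end of each branch, so at most one block runs per match.
sub tree_to_re {
    my ($node) = @_;
    return "(?{'$node->{value}'})" if exists $node->{value};
    my @alts;
    for my $bit (0, 1) {
        push @alts, $bit . tree_to_re($node->{$bit}) if exists $node->{$bit};
    }
    return @alts > 1 ? '(?:' . join('|', @alts) . ')' : $alts[0];
}

# Compile the regexp once and return a closure that maps an IP address
# to the value of the range containing it, or undef if none does.
sub compile_matcher {
    my (%ranges) = @_;
    my $source = '^' . tree_to_re(build_tree(%ranges));
    use re 'eval';    # the code blocks arrive via interpolation
    my $re = qr/$source/;
    return sub {
        my ($ip) = @_;
        return ip_to_bits($ip) =~ $re ? $^R : undef;
    };
}

my $match = compile_matcher(
    '109.27.190.54/28'   => 1,
    '109.61.26.198/24'   => 2,
    '180.203.154.195/28' => 3,
    '5.98.198.68/19'     => 4,
    '68.238.145.35/29'   => 5,
);

my $hit = $match->('109.61.26.7');
print defined $hit ? "matched range $hit\n" : "no match\n";
```

Because the generated source is interpolated at run time, "use re
'eval'" is required, so the range values should only ever come from
trusted input.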


Chris
--
Chris Dolan, Software Developer, Clotho Advanced Media Inc.
608-294-7900, fax 294-7025, 1435 E Main St, Madison WI 53703

Clotho Advanced Media, Inc. - Creators of MediaLandscape Software 
(http://www.media-landscape.com/) and partners in the revolutionary 
Croquet project (http://www.opencroquet.org/)




Re: Regexp code feature

2005-07-21 Thread Chris Dolan

On Jul 21, 2005, at 3:17 PM, A. Pagaltzis wrote:

* Chris Dolan [2005-07-21 22:05]:

What's the minimum version of Perl needed to reliably exploit the (?{
code }) feature in regexps?


You can check at http://search.cpan.org/dist/perl/. The first
version where perlre documents this feature is 5.005.

Though as far as “reliably” is concerned: it has never lost its
“experimental” status, even if I doubt that it would be removed
from Perl 5 at this point.


Thanks, Aristotle.

After checking that URL, I've confirmed that 5.005 probably *could*
work, but I've successfully tested my module on 5.6.0 and called that
good enough (since I use "our", etc.).  I've uploaded
Net::IP::Match::Regexp to CPAN.


Yes, I see your point about the experimental status.  Well, it's the 
only way I could think of to implement varying return values via the 
regexp.  I'm using only the most basic aspect of the (?{code}) feature, 
so I'm safe unless someone takes $^R away from me.  :-)


Chris




Re: Regexp code feature

2005-07-21 Thread A. Pagaltzis
* Chris Dolan [2005-07-21 23:30]:
 Yes, I see your point about the experimental status.  Well,
 it's the only way I could think of to implement varying return
 values via the regexp.  I'm using only the most basic aspect of
 the (?{code}) feature, so I'm safe unless someone takes $^R
 away from me.  :-)

Well, not to worry. I rely on the thing a lot, myself. I’m just
nitpicking about your use of “reliably.” ;-)

Regards,
-- 
#Aristotle
*AUTOLOAD=*_=sub{s/(.*)::(.*)/print$2,(,$\/, )[defined wantarray]/e;$1};
Just-another-Perl-hacker;


Re: Regexp code feature

2005-07-21 Thread Sébastien Aperghis-Tramoni

Chris Dolan wrote:

I'm working on a module that I call Net::IP::Match::Regexp which  
builds and runs regexps that contain simple code blocks.  It works  
great under 5.8.1 on my Mac, but I haven't tested older Perl versions  
yet.


My code turns these IP ranges (randomly generated)
  109.27.190.54/28    = 1
  109.61.26.198/24    = 2
  180.203.154.195/28  = 3
  5.98.198.68/19      = 4
  68.238.145.35/29    = 5

into regexps that look like this:

my $re =
^(?:0(?:10101100010110(?{'2'})|1(?:000100111011101001000100100(?{'3'})|10110100(?:01101110100011(?{'5'})|0100011010(?{'4'}|1011010011001011100110101100(?{'1'}));


I may be wrong, but this looks similar to what Regexp::Assemble does,  
or that you could use it to ease your task.

http://search.cpan.org/dist/Regexp-Assemble/


Sébastien Aperghis-Tramoni
 -- - --- -- - -- - --- -- - --- -- - --[ http://maddingue.org ]
Close the world, txEn eht nepO