Re: [Perl-unix-users] regex performance

$Bill Luebkert Wed, 13 Jun 2001 14:57:39 -0700
Dan Jablonsky wrote:
> 
> Hi all,
> I remember reading (probably in the Camel book) that
> the more $1, $2 and so on you have in a regex the
> slower the regex will be executed. It seems any
> backreference is taxing performance considerably.
> 
> Is there an alternative? What I am trying to do is
> isolate some patterns within each line of a text file
> and then make small changes to those pieces and/or
> switch the position of some of those pieces. Is it
> possible to do that without back referencing?
> For instance I start with:
> 
> ABc Sun May 20 19:45:30, 2001 XYZ
> 
> (tabs between the date and both fields to the right
> and left, all other spaces are spaces) and I need to
> get something like:
> 
> ABcD Sun May 20 19:45:30 XY Z
> 
> (tab between XY and Z, a new field).
> 
> The way I do it now is:
> 
> $row=~s/^([A-Z][A-Z][a-z])\t([A-Z][a-z][a-z]\s[A-Z][a-z][a-z]\s{1,2}\d{1,2}\s\d{2}:\d{2}:\d{2}).*?\t([A-Z])([A-Z])([A-Z]).*?/$1D\t$2\t$3$4\t$5/
> 
> Ok, I could have captured XY in a single group and reduced
> the back references by one. Is there any other way to do it?

I don't think the number of groups captured affects the overhead that much.
Once you've captured one, you've already paid most of the cost (I think).

Try timing the variants with the Benchmark module and see if it's really all that bad.
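
A minimal sketch of such a timing run, assuming the tab-separated sample line
from the question. The sub names (five_groups, four_groups, split_fields) and
the split-based variant are illustrative only and not from the original post;
cmpthese comes from the core Benchmark module.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample row as described in the question: tabs separate the three fields.
my $line = "ABc\tSun May 20 19:45:30, 2001\tXYZ";

# Date/time part of the pattern, precompiled so both regex variants share it.
my $date = qr/[A-Z][a-z]{2}\s[A-Z][a-z]{2}\s{1,2}\d{1,2}\s\d{2}:\d{2}:\d{2}/;

# Variant 1: five capture groups, as in the original substitution.
sub five_groups {
    my $row = shift;
    $row =~ s/^([A-Z][A-Z][a-z])\t($date).*?\t([A-Z])([A-Z])([A-Z])/${1}D\t$2\t$3$4\t$5/;
    return $row;
}

# Variant 2: four capture groups (XY captured together).
sub four_groups {
    my $row = shift;
    $row =~ s/^([A-Z][A-Z][a-z])\t($date).*?\t([A-Z]{2})([A-Z])/${1}D\t$2\t$3\t$4/;
    return $row;
}

# Variant 3: no capture groups; split on tabs and rebuild the row.
# Note: unlike the regexes, this does not validate the field contents.
sub split_fields {
    my ($id, $stamp, $code) = split /\t/, shift;
    $stamp =~ s/,.*//;    # drop the ", 2001" suffix
    return join "\t", $id . 'D', $stamp, substr($code, 0, 2), substr($code, 2);
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    five_groups  => sub { five_groups($line) },
    four_groups  => sub { four_groups($line) },
    split_fields => sub { split_fields($line) },
});
```

The relative rates will depend on the Perl version and the real data, so it is
worth running this against a representative sample of the actual file.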

-- 
  ,-/-  __      _  _         $Bill Luebkert   ICQ=14439852
 (_/   /  )    // //       DBE Collectibles   Mailto:[EMAIL PROTECTED] 
  / ) /--<  o // //      http://dbecoll.webjump.com/ (Free Perl site)
-/-' /___/_<_</_</_     Castle of Medieval Myth & Magic http://www.todbe.com/
_______________________________________________
Perl-Unix-Users mailing list. To unsubscribe go to 
http://listserv.ActiveState.com/mailman/subscribe/perl-unix-users