I recently had a similar problem. A regex that worked fine in sample code was a dog in the web-server code, and it only happened with really long strings. I tracked the problem down to this warning in the 'perlre' manpage:

    WARNING: Once Perl sees that you need one of "$&", "$`", or "$'"
    anywhere in the program, it has to provide them for every pattern
    match. This may substantially slow your program. Perl uses the same
    mechanism to produce $1, $2, etc, so you also pay a price for each
    pattern that contains capturing parentheses. (To avoid this cost
    while retaining the grouping behaviour, use the extended regular
    expression "(?: ... )" instead.) But if you never use "$&", "$`" or
    "$'", then patterns without capturing parentheses will not be
    penalized. So avoid "$&", "$'", and "$`" if you can, but if you
    can't (and some algorithms really appreciate them), once you've used
    them once, use them at will, because you've already paid the price.
    As of 5.005, "$&" is not so costly as the other two.

Basically, one of the modules I was 'use'ing in the web-app needed $',
but my test code didn't 'use' that module. The result was pretty
dramatic in this case: something that took approx 1 second in the test
code was timing out after 2 minutes in the web-server.

What I did in the end was something like this. Somewhere in the code,
add this so it's run when a request hits:

    # write the file path of every loaded module, one per line
    open(F, '>/tmp/modulelist');
    print F join("\n", values %INC), "\n";
    close(F);

This creates a file which lists all the loaded modules. Then, after
sticking a request through the browser, do something like:

    grep \$\' `cat /tmp/modulelist`
    grep \$\& `cat /tmp/modulelist`
    grep \$\` `cat /tmp/modulelist`

to try and track down the offending module. You'll get quite a few
false hits (comments, etc), but you might find an offending module.
The main ones I found were:

    Parse::RecDescent
    Net::DNS

and a couple of others I can't remember now. I fixed Net::DNS myself
and sent a patch to the maintainer, but haven't heard anything. If you
find this happens to be your problem as well, ask me for the patched
version. Parse::RecDescent makes heavy use of the above vars, so
there's no chance of fixing that in a hurry.
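
The same hunt can also be sketched in Perl itself, which makes it easy
to drop whole-line comments and POD before matching and so trims some
of the false hits. This is only an illustration, not the code used
above: the helper names scan_source and scan_loaded_modules are made up
for the example, and it assumes the modules are plain files readable at
the paths stored in %INC.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Matches a literal $&, $` or $'. The backtick and the quote are written
# as hex escapes so this scanner never mentions the variables itself.
my $MATCH_VAR = qr/\$[&\x60\x27]/;

# scan_source($text): hypothetical helper. Returns [line_number, line]
# pairs for each source line that appears to use one of the three
# variables. Whole-line comments, POD blocks and anything after
# __END__ / __DATA__ are skipped.
sub scan_source {
    my ($text) = @_;
    my (@hits, $in_pod);
    my $n = 0;
    for my $line (split /\n/, $text) {
        $n++;
        if ($line =~ /^=cut\b/)    { $in_pod = 0; next }
        if ($line =~ /^=[a-zA-Z]/) { $in_pod = 1; next }
        next if $in_pod;
        next if $line =~ /^\s*#/;
        last if $line =~ /^__(?:END|DATA)__\s*$/;
        push @hits, [$n, $line] if $line =~ $MATCH_VAR;
    }
    return @hits;
}

# scan_loaded_modules(\%INC): hypothetical helper. Reads every file
# named in the hash values and returns
# { path => [ [line_number, line], ... ] } for the files with hits.
sub scan_loaded_modules {
    my ($inc) = @_;
    my %report;
    for my $path (sort grep { defined($_) && !ref($_) } values %$inc) {
        open my $fh, '<', $path or next;    # skip unreadable files
        my $text = do { local $/; <$fh> };  # slurp the whole file
        close $fh;
        next unless defined $text;
        my @hits = scan_source($text);
        $report{$path} = \@hits if @hits;
    }
    return \%report;
}

# When run as a script, report on whatever this process has loaded.
unless (caller) {
    my $report = scan_loaded_modules(\%INC);
    for my $path (sort keys %$report) {
        printf "%s:%d: %s\n", $path, @$_ for @{ $report->{$path} };
    }
}
```

Under mod_perl the call to scan_loaded_modules(\%INC) would have to be
made from inside the server, after a request has pulled in all the
modules, with the report sent to the error log or a file. Hits still
need checking by eye, since quoted strings and trailing comments are
not filtered out.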

Rob

----- Original Message -----
From: "Paul Mineiro" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, January 24, 2002 11:01 AM
Subject: Re: slow regex [BENCHMARK]

> Paul Mineiro wrote:
>
> i've cleaned up the example to tighten the case:
>
> the mod perl code snippet is:
>
> ---
>
> my @cg;
>
> open DIL, '>', "/tmp/seqdata";
> print DIL $seq;
> close DIL;
>
> warn "length seq = @{[length ($seq)]}";
>
> my $t = timeit (1, sub {
>     while ($seq =~ /CG/g)
>       {
>         push @cg, pos ($seq);
>       }
> });
>
> print STDERR timestr ($t), "\n";
>
> ---
>
> which yields
> length seq = 200001 at
> /home/aerives/genegrokker-interface/mod_perl/genomic_img.pm line 634,
> <GEN1> line 102
> 16 wallclock secs (15.56 usr + 0.01 sys = 15.57 CPU) @ 0.06/s (n=1)
>
> and the perl script (command line) version is:
>
> ---
>
> #!/usr/bin/perl
>
> use Benchmark;
> use strict;
>
> open DIL, '<', "/tmp/seqdata";
> my $seq = <DIL>;
> close DIL;
>
> warn "length seq is @{[length $seq]}";
>
> my @cg;
>
> my $t = timeit (1, sub {
>     while ($seq =~ /CG/g)
>       {
>         push @cg, pos ($seq);
>       }
> });
>
> print STDERR timestr ($t), "\n";
>
> ---
> which yields:
>
> length seq is 200001 at ./t.pl line 10.
> 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
>
> the data is pretty big, so i didn't attach it, but feel free to contact
> me directly for it.
>
> -- p
>
> >hi. i'm running mod_perl 1.26 + apache 1.3.14 + perl 5.6.1
> >
> >i have a loop in a mod_perl handler like so:
> >----
> >  my $stime = time ();
> >
> >  while ($seq =~ /CG/og)
> >    {
> >      push @cg, pos ($seq);
> >    }
> >
> >  my $etime = time ();
> >
> >  warn "time was: ", scalar localtime ($stime), " ",
> >       scalar localtime ($etime), " ", $etime - $stime;
> >----
> >
> >under mod_perl this takes 23 seconds. running the perl "by hand" (via
> >extracting this piece into a seperate perl script) on the same data takes
> >less than 1 second.
> >
> >has anyone seen this kind of extreme slowdown before?
> >
> >-- p
> >
> >info:
> >
> >apache build options:
> >
> >CFLAGS="-g -g -O3 -funroll-loops" \
> >LDFLAGS="-L/home/aerives/lib -L/home/aerives/lib/mysql" \
> >LIBS="-L/home/aerives/genegrokker-interface/lib
> >-L/home/aerives/genegrokker-interface/ext/lib -L/home/aerives/lib
> >-L/home/aerives/lib/mysql" \
> >./configure \
> >"--prefix=/home/aerives/genegrokker-interface/ext" \
> >"--enable-rule=EAPI" \
> >"--enable-module=most" \
> >"--enable-shared=max" \
> >"--with-layout=GNU" \
> >"--disable-rule=EXPAT" \
> >"$@"
> >
> >mod_perl build options:
> >
> >configure_options="PERL_USELARGEFILES=0 USE_APXS=1
> >WITH_APXS=$PLAYPEN_ROOT/ext/sbin/apxs EVERYTHING=1
> >INC=$PLAYPEN_ROOT/ext/include -DEAPI"
> >
> >perl -V:
> >Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
> >  Platform:
> >    osname=linux, osvers=2.4.13, archname=i386-linux
> >    uname='linux duende 2.4.13 #1 wed oct 31 19:18:07 est 2001 i686 unknown '
> >    config_args='-Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux
> >-Dprefix=/usr -Dprivlib=/usr/share/perl/5.6.1 -Darchlib=/usr/lib/perl/5.6.1
> >-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5
> >-Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.6.1
> >-Dsitearch=/usr/local/lib/perl/5.6.1 -Dman1dir=/usr/share/man/man1
> >-Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl
> >-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Duseshrplib
> >-Dlibperl=libperl.so.5.6.1 -Dd_dosuid -des'
> >    hint=recommended, useposix=true, d_sigaction=define
> >    usethreads=undef use5005threads=undef useithreads=undef
> >usemultiplicity=undef
> >    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
> >    use64bitint=undef use64bitall=undef uselongdouble=undef
> >  Compiler:
> >    cc='cc', ccflags ='-DDEBIAN -fno-strict-aliasing -I/usr/local/include
> >-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
> >    optimize='-O2',
> >    cppflags='-DDEBIAN -fno-strict-aliasing -I/usr/local/include'
> >    ccversion='', gccversion='2.95.4 (Debian prerelease)', gccosandvers=''
> >    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
> >    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
> >    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
> >lseeksize=8
> >    alignbytes=4, usemymalloc=n, prototype=define
> >  Linker and Libraries:
> >    ld='cc', ldflags =' -L/usr/local/lib'
> >    libpth=/usr/local/lib /lib /usr/lib
> >    libs=-lgdbm -ldb -ldl -lm -lc -lcrypt
> >    perllibs=-ldl -lm -lc -lcrypt
> >    libc=/lib/libc-2.2.4.so, so=so, useshrplib=true, libperl=libperl.so.5.6.1
> >  Dynamic Linking:
> >    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
> >    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
> >
> >
> >Characteristics of this binary (from libperl):
> >  Compile-time options: USE_LARGE_FILES
> >  Built under linux
> >  Compiled at Jan 11 2002 04:09:18
> >  %ENV:
> >
> >PERL5LIB="/home/aerives/genegrokker-interface/lib/perl5:/home/aerives/genegrokker-interface/ext/lib/perl5:/home/aerives/lib/perl5"
> >  @INC:
> >    /home/aerives/genegrokker-interface/lib/perl5
> >    /home/aerives/genegrokker-interface/ext/lib/perl5
> >    /home/aerives/lib/perl5
> >    /usr/local/lib/perl/5.6.1
> >    /usr/local/share/perl/5.6.1
> >    /usr/lib/perl5
> >    /usr/share/perl5
> >    /usr/lib/perl/5.6.1
> >    /usr/share/perl/5.6.1
> >    /usr/local/lib/site_perl