Hi,



   I have spent about an hour diagnosing the following:

   I have a passwd file, about 50,000 lines in length. I had a problem
with a script grepping something from it, and when debugging I found out
that:

Pentium4 Xeon, RedHat 9, latest RH kernel, fully updated system:
time grep : passwdfile > /dev/null
~1 minute, 12 seconds

Pentium4 Xeon, RedHat 7.2, latest RH kernel, fully updated system:
time grep : passwdfile > /dev/null
~0.04 seconds

Pentium3, RedHat 9, latest RH kernel, fully updated system:
time grep : passwdfile > /dev/null
~0.08 seconds


However, using `pcregrep' on the same systems yielded:

Pentium4 Xeon, RedHat 9, latest RH kernel, fully updated system:
time pcregrep : passwdfile > /dev/null
~0.05 seconds

Pentium4 Xeon, RedHat 7.2, latest RH kernel, fully updated system:
time pcregrep : passwdfile > /dev/null
~0.08 seconds

Pentium3, RedHat 9, latest RH kernel, fully updated system:
time pcregrep : passwdfile > /dev/null
~0.12 seconds
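
For anyone who wants to repeat the measurement on their own machine, here
is a minimal sketch. It assumes a synthetic 50,000-line passwd-style file
(the entries are made up, not the real data) and uses `date' with
one-second resolution, which is coarse but enough to tell a run of about
a minute from a sub-second one. pcregrep is only timed if it is installed.

```shell
#!/bin/sh
# Build a synthetic passwd-style file of 50,000 lines.
# The user names and paths are invented for illustration only.
passwdfile=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 50000; i++)
        printf "user%d:x:%d:%d:Test User %d:/home/user%d:/bin/sh\n", i, i, i, i, i
}' > "$passwdfile"

# Every line contains a colon, so this should report all 50,000 lines.
matched=$(grep -c : "$passwdfile")
echo "lines matched: $matched"

# Time each tool with wall-clock seconds.
for cmd in grep pcregrep; do
    if command -v "$cmd" > /dev/null 2>&1; then
        start=$(date +%s)
        "$cmd" : "$passwdfile" > /dev/null
        end=$(date +%s)
        echo "$cmd: $((end - start)) s"
    else
        echo "$cmd: not installed, skipped"
    fi
done

rm -f "$passwdfile"
```

Running the same script on each machine gives directly comparable numbers,
since the input file is identical everywhere.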


  As you can see, there is a HUGE discrepancy between all the results
above and the Pentium4 Xeon, RedHat 9 `grep' case: it is roughly 600 to
1,800 times slower (about 72 seconds versus 0.04 to 0.12 seconds).
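
The size of the gap can be checked with a quick calculation. This is only
a sketch: it takes the slow run as 72 seconds (1 minute, 12 seconds) and
divides by each of the fast timings listed above; the helper name
`slowdown' is made up for this example.

```shell
#!/bin/sh
# slowdown SLOW FAST: print how many times slower SLOW is than FAST.
# (Hypothetical helper, defined here only for the calculation.)
slowdown() {
    awk -v s="$1" -v f="$2" 'BEGIN { printf "%.0f\n", s / f }'
}

slow=72   # 1 minute, 12 seconds
for fast in 0.04 0.05 0.08 0.12; do
    echo "$fast s -> $(slowdown "$slow" "$fast") times slower"
done
```

Measured against the 0.08-second runs the factor is 900; against the
fastest run (0.04 seconds) it is 1800.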

  I tried recompiling the .src.rpm of the RedHat 9 grep locally on the
Xeon, but it yielded the same result.


  As such, this appears (to me) to be some kind of grep problem specific
to the combination of 2 Gigabytes of RAM, a P4 Xeon CPU (with HT on), and
RedHat 9.


  While I am researching this further, does anyone have any idea?

  My hunch is that the problem lies in write(), since I also tested with
grep --mmap (which uses mmap() instead of read() for reading) and it
yielded the same results.
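
One way to test that hunch is to look at a per-syscall summary from
strace. The sketch below is an assumption about how one might check it,
not something that was run for the numbers above: it uses a small made-up
stand-in file (`sample_passwd') and only runs strace if it is installed.

```shell
#!/bin/sh
# Small made-up stand-in for the real passwd file.
sample_passwd=$(mktemp)
summary=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 1000; i++)
        printf "user%d:x:%d:%d::/home/user%d:/bin/sh\n", i, i, i, i
}' > "$sample_passwd"

if command -v strace > /dev/null 2>&1; then
    # -c prints a summary table (calls, errors, time) per system call;
    # -o writes that table to a file instead of stderr.
    strace -c -o "$summary" grep : "$sample_passwd" > /dev/null
    cat "$summary"
else
    echo "strace not installed; skipping syscall summary"
fi

# Sanity check that grep saw the whole file.
lines=$(grep -c : "$sample_passwd")
echo "lines matched: $lines"

rm -f "$sample_passwd" "$summary"
```

If the summary shows very little time in write() (and in read() or
mmap()), the time is being spent somewhere other than in those system
calls, for example in user-space code.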


--Ariel


--
Ariel Biener
e-mail: [EMAIL PROTECTED]
PGP(6.5.8) public key http://www.tau.ac.il/~ariel/pgp.html


