On Tue, 6 Jan 2004, Ariel Biener wrote:
Ok, problem located.

grep version 2.5.x includes UTF-8 support. If the system's default LANG
variable is a UTF-8 one, like the following:

# echo $LANG
en_US.UTF-8

then grep is dog slow. Change it to en_US, and it flies like an eagle.

RedHat....

--Ariel

> Hi,
>
>   I have spent about an hour diagnosing the following:
>
>   I have a passwd file, about 50,000 lines in length. I had a problem
> with a script grepping something from it, and when debugging I found out
> that:
>
> Pentium4 Xeon, RedHat 9, latest RH kernel, fully updated system:
>   time grep : passwdfile > /dev/null
>   ~1 minute, 12 seconds
>
> Pentium4 Xeon, RedHat 7.2, latest RH kernel, fully updated system:
>   time grep : passwdfile > /dev/null
>   ~0.04 seconds
>
> Pentium3, RedHat 9, latest RH kernel, fully updated system:
>   time grep : passwdfile > /dev/null
>   ~0.08 seconds
>
>
> However, using `pcregrep' on the same systems yielded:
>
> Pentium4 Xeon, RedHat 9, latest RH kernel, fully updated system:
>   time pcregrep : passwdfile > /dev/null
>   ~0.05 seconds
>
> Pentium4 Xeon, RedHat 7.2, latest RH kernel, fully updated system:
>   time grep : passwdfile > /dev/null
>   ~0.08 seconds
>
> Pentium3, RedHat 9, latest RH kernel, fully updated system:
>   time grep : passwdfile > /dev/null
>   ~0.12 seconds
>
>
>   As you can see, there is a HUGE discrepancy between all the results
> above and the Pentium4 Xeon, RedHat 9 `grep' case, about 900 times slower.
>
>   I tried recompiling the .src.rpm of the RedHat 9 grep locally on the
> Xeon, but it yielded the same result.
>
>   As such, this appears (to me) to be some kind of a grep problem when
> coupled with 2 Gigabytes of RAM, Xeon P4 CPU (with the HT on) on RedHat 9.
>
>   While I am researching some more on this, does any of you have any idea?
>
>   My hunch is towards write(), since I also tested it with grep --mmap
> (which uses mmap() instead of read() for reading) and it yielded the same
> results.
>
> --Ariel
>
> --
> Ariel Biener
> e-mail: [EMAIL PROTECTED]
> PGP(6.5.8) public key http://www.tau.ac.il/~ariel/pgp.html
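
For anyone who wants to check the locale effect on their own machine, here is
a rough sketch. It is not the script from the original report. The generated
sample file stands in for the real passwd file, the helper name count_matches
is made up for illustration, and LC_ALL=C is used as an assumed alternative to
setting LANG=en_US (LC_ALL overrides LANG for that one command). Whether any
slowdown shows up depends on the grep version installed.

```shell
#!/bin/sh
# Sketch: run the same grep under a UTF-8 locale and under the C locale.
# Assumption: a generated 50,000-line sample stands in for the real passwd file.
sample=$(mktemp)
awk 'BEGIN {
    for (i = 0; i < 50000; i++)
        printf "user%d:x:%d:%d::/home/user%d:/bin/sh\n", i, i, i, i
}' > "$sample"

# Hypothetical helper: count the lines containing ":" under the given locale.
# "env" sets LC_ALL for the grep process only; the shell itself is unchanged.
count_matches() {
    env LC_ALL="$1" grep -c : "$2"
}

# If en_US.UTF-8 is not installed, grep falls back to the C locale,
# so the two runs would then behave identically.
utf8_count=$(count_matches en_US.UTF-8 "$sample")
c_count=$(count_matches C "$sample")

echo "UTF-8 locale matches: $utf8_count"
echo "C locale matches:     $c_count"

rm -f "$sample"
```

Both runs should report the same number of matches; only the elapsed time is
expected to differ on an affected system. To see the timing itself under bash,
prefix the command with time, as in the original report:
time env LC_ALL=C grep : passwdfile > /dev/null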
