Re: grep help

Matthew Gregan Sun, 07 Nov 2004 13:46:14 -0800

At 2004-11-08T10:03:23+1300, Douglas Royds wrote:

> grep -o a source.txt | wc -l


Yup, Timothy already suggested that.  The '-o' option works around the
fact that grep is line-oriented by causing each match to be printed on a
new line.

It's also a hideously inefficient way to use grep(1).  For a file with
50k characters and 30k matches, grep uses over 430MB of memory, and is
about 2.3 times slower than using sed(1), assuming that the machine
doesn't start swapping while grep is running.

For kicks, I wrote a very simple program in C to count the characters
too.  The results were:

$ /usr/bin/time ./a.out A < data
30043
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92minor)pagefaults 0swaps

$ /usr/bin/time sed -e "s/[^A]//g" < data | wc -c
22.65user 0.00system 0:22.66elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+230minor)pagefaults 0swaps
30043

$ /usr/bin/time grep -o A < data | wc -l
51.57user 0.52system 0:52.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+110899minor)pagefaults 0swaps
30043

Cheers,
-mjg
-- 
Matthew Gregan                     |/
                                  /|                [EMAIL PROTECTED]
