On 23/02/2009 8:14 PM, Kim Boulton wrote:
> Hehe, probably a combination of rubbish grep (I used the regex function
> in a text editor) and vacuuming a 4GB table at the same time.

google("scientific method") :-)

> @echo off
> setlocal
> set starttime=%time%
> egrep --count ^
>     "(....W[CEF][SZ]|..W[CEF]S|...W[CEF]S|W[3CEF]S[25]..|W3S..|.11[CEF]S.)," ^
>     my-30-million-rows-of-data.txt
> set stoptime=%time%
> echo Started: %starttime%
> echo Ended: %stoptime%
>
> results in:
> 24561
> Started: 9:00:58.82
> Ended: 9:01:34.29
>
> 36-ish seconds. Obviously the regex needs a bit of work, as there are
> supposed to be around 200,000 matches.

Probably a big contributing factor is that my regex is based on you
getting rid of the commas in the part number. If the above input file is
still in your original format, you need to sprinkle commas about madly;
the first subpattern would become:

.,.,.,.,W,[CEF],[SZ]

and likewise for the other five (see the sketch below).

Note that your average record size is 16 to 17 bytes. If you lose the 6
commas, it will be 10 to 11 bytes per record, i.e. the file can shrink
from about 500MB to about 320MB (30 million records x 16-17 bytes each,
versus 30 million x 10-11 bytes) ... quite a useful saving in processing
time as well as disk space.
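For what it's worth, here's an untested sketch of the whole command with
all six subpatterns comma-fied the same way; it assumes the trailing
field-separator comma stays where it is:

egrep --count ^
    "(.,.,.,.,W,[CEF],[SZ]|.,.,W,[CEF],S|.,.,.,W,[CEF],S|W,[3CEF],S,[25],.,.|W,3,S,.,.|.,1,1,[CEF],S,.)," ^
    my-30-million-rows-of-data.txt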
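Going the other way, if you want to strip the commas out of the data
itself to get that size saving, then (untested, and assuming the part
number is the first seven comma-separated characters on each line, and
that you have sed alongside egrep) something like this would delete the
first six commas on each line:

sed "s/,//;s/,//;s/,//;s/,//;s/,//;s/,//" my-30-million-rows-of-data.txt > stripped.txt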
> interesting nonetheless, never used grep before ... useful.

Sure is.

Cheers,
John
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users