On 22/02/2009 7:48 PM, Kim Boulton wrote: > Thanks for the hints so far. > > Here is my existing Mysql setup and what I've tried with Sqlite.... > > *Mysql Table structure:* > CREATE TABLE `mydb`.`mytable` ( > `c1` > enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') > > NOT NULL, [snip FFS] > `c7` > enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') > > NULL, > `price` smallint(4) NOT NULL, KEY `c1` (`c1`), KEY `c2` (`c2`), KEY > `c3` (`c3`), KEY `c4` (`c4`), KEY `c5` (`c5`), KEY `c6` (`c6`), KEY > `c7` (`c7`), KEY `price` (`price`) ) ENGINE=MyISAM DEFAULT > CHARSET=latin1 PACK_KEYS=1 ROW_FORMAT=COMPRESSED; > > *CSV Import Data e.g.:* > 1,A,P,0,0,X,X,300 > 1,A,P,0,0,X,P,9999 > A,A,P,0,0,,,2000 > B,3,Y,0,1,X,,300 > ........ approx 30 million row, 500MB csv, text file > > If I compress the table it is approximately 900MB in size with an index > of approximately 550MB, which i can load into memory via LOAD INDEX INTO > CACHE > > *Mysql Query example:* > SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN > ('S','Z')) UNION ALL > SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 = > 'S' AND c6 IS NULL) UNION ALL > SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 = > 'S' AND c7 IS NULL) UNION ALL > SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3 > IN ('2','5') ) UNION ALL > SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6 > IS NULL ) UNION ALL > SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN > ('C','E','F') AND c5 = 'S' AND c7 IS NULL) > ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC; > * > The above query takes 37 seconds and produces approx 200,000 results and > uses 550MB RAM*
Try this: (1) ditch the redundant commas in your csv file (2) run this (suitably adjusted for my errors and typos, and for my ignorance of the minutiae of command-line and grep syntax on your [unspecified IIRC] OS): grep "(....W[CEF][SZ]|..W[CEF]S|...W[CEF]S|W[3CEF]S[25]..|W3S..|.11[CEF]S.)," yourfile.csv How long does that take? Another suggestion: search for clues on whether it might be better instead of doing select * from mytable where <critera1> union all etc etc select * from mytable where <criteria7> to do select * from mytable where <criteria1> or <criteria2> or etc etc and if you don't find a strong weight of opinion for one way or the other, give it a try. HTH, John _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users