Woo, that's a bit better. Setting the page_size = 4096 gets the query 
time down to 17 seconds, so it's now faster than Mysql. Result!
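
In case it helps anyone else trying this: as far as I know page_size only
takes effect on a brand new, empty database, or when the file is rebuilt
with VACUUM, so roughly (this is just a sketch, the filename is only an example):

sqlite3 mydb.db
PRAGMA page_size = 4096;   /* must run before any tables are created */
/* ...then CREATE TABLE, .import and CREATE INDEX as before... */

/* or, to convert an existing database in place: */
PRAGMA page_size = 4096;
VACUUM;                    /* rebuilds the whole file with the new page size */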

I also tried changing the query to:
SELECT * FROM mytable WHERE
 (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN ('S','Z')) OR
 (c3 = 'W' AND c4 IN ('C','E','F') AND c5 = 'S' AND c6 IS NULL) OR
 (c4 = 'W' AND c5 IN ('C','E','F') AND c6 = 'S' AND c7 IS NULL) OR
 (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3 IN ('2','5')) OR
 (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6 IS NULL) OR
 (c2 = '1' AND c3 = '1' AND c4 IN ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;

But OR is a lot slower than using UNION ALL on both Sqlite and Mysql.
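
If anyone wants to see why, sticking EXPLAIN QUERY PLAN in front of each
version shows whether the branches get to use the indexes or end up
scanning the whole table, e.g. for the first branch:

EXPLAIN QUERY PLAN
SELECT * FROM mytable WHERE c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN ('S','Z');

That's presumably where the difference between OR and UNION ALL comes from.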

I'm going to try a page_size of 8192 and see if that works even better.

Thanks to everyone for the help!

kim

Kim Boulton wrote:
> Thanks for the hints so far.
>
> Here is my existing Mysql setup and what I've tried with Sqlite....
>
> *Mysql Table structure:*
> CREATE TABLE `mydb`.`mytable` (
>   `c1` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NOT NULL,
>   `c2` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NOT NULL,
>   `c3` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NOT NULL,
>   `c4` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NOT NULL,
>   `c5` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NOT NULL,
>   `c6` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NULL,
>   `c7` enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z') NULL,
>   `price` smallint(4) NOT NULL,
>   KEY `c1` (`c1`), KEY `c2` (`c2`), KEY `c3` (`c3`), KEY `c4` (`c4`),
>   KEY `c5` (`c5`), KEY `c6` (`c6`), KEY `c7` (`c7`), KEY `price` (`price`)
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=1 ROW_FORMAT=COMPRESSED;
>
> *CSV Import Data e.g.:*
> 1,A,P,0,0,X,X,300
> 1,A,P,0,0,X,P,9999
> A,A,P,0,0,,,2000
> B,3,Y,0,1,X,,300
> ........ approx. 30 million rows, 500MB CSV text file
>
> If I compress the table it is approximately 900MB in size with an
> index of approximately 550MB, which I can load into memory via LOAD
> INDEX INTO CACHE.
>
> *Mysql Query example:*
> SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN ('S','Z')) UNION ALL
> SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 = 'S' AND c6 IS NULL) UNION ALL
> SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 = 'S' AND c7 IS NULL) UNION ALL
> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3 IN ('2','5')) UNION ALL
> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6 IS NULL) UNION ALL
> SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
> ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
>
> *The above query takes 37 seconds and produces approx 200,000 results and uses 550MB RAM*
>
> *So, in Sqlite3 I created a similar table structure like this:*
> CREATE TABLE [mytable] ([c1] TEXT NOT NULL, [c2] TEXT NOT NULL, [c3] TEXT NOT NULL,
>   [c4] TEXT NOT NULL, [c5] TEXT NOT NULL, [c6] TEXT NULL, [c7] TEXT NULL,
>   [price] INTEGER NOT NULL)
> CREATE INDEX [c1] ON [mytable]([c1]  ASC)
> CREATE INDEX [c2] ON [mytable]([c2]  ASC)
> CREATE INDEX [c3] ON [mytable]([c3]  ASC)
> CREATE INDEX [c4] ON [mytable]([c4]  ASC)
> CREATE INDEX [c5] ON [mytable]([c5]  ASC)
> CREATE INDEX [c6] ON [mytable]([c6]  ASC)
> CREATE INDEX [c7] ON [mytable]([c7]  ASC)
> CREATE INDEX [price] ON [mytable]([price]  ASC)
>
> *Then I imported the same csv data using:*
> .separator ","
> .import mycsv.csv mytable
>
> *Then fixed the NULL values in the last two columns with:*
> UPDATE mytable SET c6 = NULL where c6 = '';
> UPDATE mytable SET c7 = NULL where c7 = '';
>
> Then Vacuumed - took 6 hours!
>
> This leaves me with a 4GB Sqlite table
>
> *Then queried the Sqlite3 table with:*
> PRAGMA cache_size = 20000000; /* uses up 1.5GB RAM regardless */
> PRAGMA page_size = 20000000; /* this doesn't make any difference */
> SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN ('S','Z')) UNION ALL
> SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 = 'S' AND c6 IS NULL) UNION ALL
> SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 = 'S' AND c7 IS NULL) UNION ALL
> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3 IN ('2','5')) UNION ALL
> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6 IS NULL) UNION ALL
> SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
> ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
>
> *The above query takes 57 seconds, 20 seconds slower than Mysql, but
> produces the same results - and uses up 1.5GB RAM!*
>
> Also if I try some more complex queries it always runs out of steam at
> 2GB RAM and crashes. Maybe that is a limitation of my PC or XP though.
>
> I have also tried...
> 1. Sqlite table with no indexes - very slow!
> 2. Sqlite table with one unique index using all columns - still slower
> than separate indexes.
>
> So I've had a fair crack of the whip with Sqlite. I was hoping it 
> would be faster and use less memory, no luck though.
>
> Unless anyone has some good ideas I might have to give up on Sqlite.
>
> Cheers.
>
> Kim
>
> Kim Boulton wrote:
>> Hello,
>>
>> I'm trying out Sqlite3 with an eye to improving the performance of 
>> queries on an existing MySQL database.
>>
>> I've imported the data into sqlite which is approx. 30 million rows of 
>> part numbers each with a price.
>>
>> So far, it's approx. four times slower than the MySQL version, and the 
>> size of the sqlite database is too big to fit in memory (several GB) 
>> whereas I can get the MySQL data down to 900MB if it's compressed and 
>> read only.
>>
>> I would appreciate some tips or pointers on getting sqlite3 performance 
>> up and the data size down. I googled but couldn't find much.
>>
>> I don't need concurrency or inserts, it's single user, read only.
>>
>> TIA
>>
>> kimb
>>
>>
>>
>>
