Sorry, I forgot to mention that I was running the query twice, first to
cache it, second to measure the speed.

Yeah it's a horrible query but produces the fastest results in Mysql.
maybe using OR instead works better on Sqlite, working on that one.

Problem I have with indexes in sqlite is that the table balloons 400MB
in size with every index i add. At the moment it's 4GB with one on each
column, if i add more indexes across other columns i'm thinking it'll
get too big to cache up.

thanks

Thomas Briggs wrote:
>    For starters, I think that loading the index into the cache in
> MySQL is biasing your performance measures.  SQLite will automatically
> load pages of any necessary indexes into memory as part of executing
> the query, but doing so takes time.  By preloading the index on MySQL,
> you're removing that time from the amount of time MySQL spends
> executing that query, so you're not really comparing apples to apples.
>  Said another way: how long does the query take in MySQL if you don't
> pre-load the index?
>
>    Secondly, the page size needs to be set before the database is
> created, i.e. before you execute the CREATE TABLE statement.
>
>    Thirdly, that's a pretty horrible query. :)  Collapsing the
> different queries into fewer queries - even if it isn't a single query
> - will speed things up.  Each SELECT is a separate query that requires
> that the table data be read; reducing the number of SELECTs will thus
> reduce query time.
>
>    Lastly, your indexes need to reflect how the data is accessed.  For
> example, if you're going to do "WHERE c5 = x AND c6 = y AND c7 = 7"
> then you need one index on (c5, c6, c7) - not three individual indexes
> on c5, c6, and c7.
>
>    Performance tuning rule #1 - problems are almost always in your use
> of the database, not the database itself. :)
>
>    -Tom
>
> On Sun, Feb 22, 2009 at 3:48 AM, Kim Boulton <k...@jesk.co.uk> wrote:
>   
>> Thanks for the hints so far.
>>
>> Here is my existing Mysql setup and what I've tried with Sqlite....
>>
>> *Mysql Table structure:*
>> CREATE TABLE  `mydb`.`mytable` (
>>  `c1`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NOT NULL,
>>  `c2`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NOT NULL,
>>  `c3`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NOT NULL,
>>  `c4`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NOT NULL,
>>  `c5`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NOT NULL,
>>  `c6`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NULL,
>>  `c7`
>> enum('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z')
>> NULL,
>>  `price` smallint(4) NOT NULL,  KEY `c1` (`c1`),  KEY `c2` (`c2`),  KEY
>> `c3` (`c3`),  KEY `c4` (`c4`),  KEY `c5` (`c5`),  KEY `c6` (`c6`),  KEY
>> `c7` (`c7`),  KEY `price` (`price`) ) ENGINE=MyISAM DEFAULT
>> CHARSET=latin1 PACK_KEYS=1 ROW_FORMAT=COMPRESSED;
>>
>> *CSV Import Data e.g.:*
>> 1,A,P,0,0,X,X,300
>> 1,A,P,0,0,X,P,9999
>> A,A,P,0,0,,,2000
>> B,3,Y,0,1,X,,300
>> ........ approx 30 million row, 500MB csv, text file
>>
>> If I compress the table it is approximately 900MB in size with an index
>> of approximately 550MB, which i can load into memory via LOAD INDEX INTO
>> CACHE
>>
>> *Mysql Query example:*
>> SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN
>> ('S','Z')) UNION ALL
>> SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 =
>> 'S' AND c6 IS NULL) UNION ALL
>> SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 =
>> 'S' AND c7 IS NULL) UNION ALL
>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3
>> IN ('2','5') ) UNION ALL
>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6
>> IS NULL ) UNION ALL
>> SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN
>> ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
>> ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
>> *
>> The above query takes 37 seconds and produces approx 200,000 results and
>> uses 550MB RAM*
>>
>> *So, in Sqlite3 I created a similar table structure like this:*
>> CREATE TABLE [mytable] ([c1] TEXT  NOT NULL,[c2] TEXT  NOT NULL,[c3]
>> TEXT  NOT NULL,[c4] TEXT  NOT NULL,[c5] TEXT  NOT NULL,[c6] TEXT
>> NULL,[c7] TEXT  NULL,[price] INTEGER  NOT NULL)
>> CREATE INDEX [c1] ON [mytable]([c1]  ASC)
>> CREATE INDEX [c2] ON [mytable]([c2]  ASC)
>> CREATE INDEX [c3] ON [mytable]([c3]  ASC)
>> CREATE INDEX [c4] ON [mytable]([c4]  ASC)
>> CREATE INDEX [c5] ON [mytable]([c5]  ASC)
>> CREATE INDEX [c6] ON [mytable]([c6]  ASC)
>> CREATE INDEX [c7] ON [mytable]([c7]  ASC)
>> CREATE INDEX [price] ON [mytable]([price]  ASC)
>>
>> *Then I imported the same csv data using:*
>> .separator ","
>> .import mycsv.csv mytable
>>
>> *Then fixed the NULL values in the last two columns with:*
>> UPDATE mytable SET c6 = NULL where c6 = '';
>> UPDATE mytable SET c7 = NULL where c7 = '';
>>
>> Then Vacuumed - took 6 hours!
>>
>> This leaves me with a 4GB Sqlite table
>>
>> *Then queried the Sqlite3 table with:*
>> PRAGMA cache_size = 20000000; */uses up 1.5GB RAM regardless*/
>> PRAGMA page_size = 20000000; /*this doesn't make any difference*/
>> SELECT * FROM mytable WHERE (c5 = 'W' AND c6 IN ('C','E','F') AND c7 IN
>> ('S','Z')) UNION ALL
>> SELECT * FROM mytable WHERE (c3 = 'W' AND c4 IN ('C','E','F') AND c5 =
>> 'S' AND c6 IS NULL) UNION ALL
>> SELECT * FROM mytable WHERE (c4 = 'W' AND c5 IN ('C','E','F') AND c6 =
>> 'S' AND c7 IS NULL) UNION ALL
>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 IN ('3','C','E','F') AND c3
>> IN ('2','5') ) UNION ALL
>> SELECT * FROM mytable WHERE (c1 = 'W' AND c2 = '3' AND c3 = 'S' AND c6
>> IS NULL ) UNION ALL
>> SELECT * FROM mytable WHERE (c2 = '1' AND c3 = '1' AND c4 IN
>> ('C','E','F') AND c5 = 'S' AND c7 IS NULL)
>> ORDER BY c1,c2,c3,c4,c5,c6,c7 ASC;
>>
>> *The above query takes 57 seconds, 20 seconds slower than Mysql but
>> produces the same results - but uses up 1.5GB RAM!*
>>
>> Also if I try some more complex queries it always runs out steam at 2GB
>> RAM and crashes. Maybe that is a limitation of my PC or XP though.
>>
>> I have also tried...
>> 1. Sqlite table with no indexes - very slow!
>> 2. Sqlite table with one unique index using all columns - still slower
>> than seperate indexes.
>>
>> So I've had a fair crack of the whip with Sqlite. I was hoping it would
>> be faster and use less memory, no luck though.
>>
>> Unless anyone has some good ideas I might have to give up on Sqlite.
>>
>> Cheers.
>>
>> Kim
>>
>> Kim Boulton wrote:
>>     
>>> Hello,
>>>
>>> I'm trying out Sqlite3 with an eye to improving the performance of
>>> queries on an existing MySQL database.
>>>
>>> I've imported the data into sqlite which is approx. 30 million rows of
>>> part numbers each with a price.
>>>
>>> So far, it's approx. four times slower than the MySQL version, and the
>>> size of the sqlite database is too big to fit in memory (several GB)
>>> whereas I can get the MySQL data down to 900MB if it's compressed and
>>> read only.
>>>
>>> I would appreciate some tips or pointers on getting sqlite3 performance
>>> up and the data size down. I googled but couldn't find much.
>>>
>>> I don't need concurrency or inserts, it's single user, read only.
>>>
>>> TIA
>>>
>>> kimb
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>>
>>>       
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>>     
>
>   

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to