On 2004-06-16, at 06.04, Richard Kuo wrote:
Empirically speaking, we display our data in a scrolling 2 dimensional grid format. With MS access, this grid responds instantaneously when moving through the grid. With SQLite, there is very noticable stalling and lag and the disk i/o is higher than MS Access by roughly a factor of 10.
Hello Richard,
I too hit a rather puzzling SELECT performance problem last year. Apparently, if you have a table with many records, and you perform a SELECT on an indexed column where most of the values are the same, performance goes way down.
The mail correspondence from January 2003 does not seem to be in the archive, so below I have pasted a very enlightening response from DRH to my question.
Hope this helps.
/jak
On 2003-01-27, at 00.00, D. Richard Hipp wrote:
Example:
CREATE TABLE t1(a,b,c); CREATE INDEX i1a ON t1(a); CREATE INDEX i1b ON t1(b);
Next you insert lots of data where the value for T1.B is usually the same, say 'xyzzy'. Then you do a query:
SELECT * FROM t1 WHERE a=5 AND b='xyzzy';
When it is compiling this query, the optimizer can choose to use either index I1A and the "a=5" expression or it can choose to use index I1B and the "b='xyzzy'" expression. The choice it makes is arbitrary.
But I1B would be the wrong choice because almost every entry in T1 is a match for "b='xyzzy'". So the query must read in every one of these entries and check each one to see if "a=5". This can take even longer than doing a full table scan.
I1A is the right index to use here because only a few entries of T1 will match "a=5". So only a few entries have to be read in and checked for "b='xyzzy'" and the query goes MUCH faster.
Enterprise scale database engines do a more sophisticated job of selecting indices (when there is a choice like this) by collecting lots of statistics on the indices and using complex algorithms to make the choice. SQLite takes the easy way out and makes an arbitrary choice. With SQLite, it is up to you, the query author, to select an appropriate index when the choice of indices might make a difference. You can disable the other index by modifying the WHERE clause. To disable the I1A index, rewrite the WHERE clause like this:
SELECT * FROM t1 WHERE a+0=5 AND b='xyzzy';
To disable the I1B index you could write
SELECT * FROM t1 WHERE a=5 AND b LIKE 'xyzzy';
I should probably write some documentation talking about this and put it on the website....
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]