On 2004-06-16, at 06.04, Richard Kuo wrote:

    Empirically speaking, we display our data in a scrolling 2
dimensional grid format.  With MS access, this grid responds
instantaneously when moving through the grid.  With SQLite, there is
very noticable stalling and lag and the disk i/o is higher than MS
Access by roughly a factor of 10.



Hello Richard,

I too hit a rather puzzling SELECT performance problem last year. Apparently, if you have a table with many records, and you perform a SELECT on an indexed column where most of the values are the same, performance goes way down.

The mail correspondence from January 2003 does not seem to be in the archive, so below I have pasted a very enlightening response from DRH to my question.

Hope this helps.

/jak





On 2003-01-27, at 00.00, D. Richard Hipp wrote:


Example:

     CREATE TABLE t1(a,b,c);
     CREATE INDEX i1a ON t1(a);
     CREATE INDEX i1b ON t1(b);

Next you insert lots of data where the value for T1.B is
usually the same, say 'xyzzy'.  Then you do a query:

     SELECT * FROM t1 WHERE a=5 AND b='xyzzy';

When it is compiling this query, the optimizer can choose to
use either index I1A and the "a=5" expression or it can choose
to use index I1B and the "b='xyzzy'" expression.  The choice
it makes is arbitrary.

But I1B would be the wrong choice because almost every entry
in T1 is a match for "b='xyzzy'".  So the query must read in
every one of these entries and check each one to see if "a=5".
This can take even longer than doing a full table scan.

I1A is the right index to use here because only a few entries
of T1 will match "a=5".  So only a few entries have to be
read in and checked for "b='xyzzy'" and the query goes MUCH
faster.

Enterprise scale database engines do a more sophisticated job
of selecting indices (when there is a choice like this) by
collecting lots of statistics on the indices and using complex
algorithms to make the choice.  SQLite takes the easy way out
and makes an arbitrary choice.  With SQLite, it is up to you,
the query author, to select an appropriate index when the choice
of indices might make a difference.  You can disable the other
index by modifying the WHERE clause.  To disable the I1A index,
rewrite the WHERE clause like this:

     SELECT * FROM t1 WHERE a+0=5 AND b='xyzzy';

To disable the I1B index you could write

     SELECT * FROM t1 WHERE a=5 AND b LIKE 'xyzzy';

I should probably write some documentation talking about this
and put it on the website....


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to