On Jul 1, 2008, at 2:17 PM, Jeff Gibson wrote:

> I'm including a copy of Alexey's relevant message below.  Unless I
> misunderstand, he has a test case that demonstrates that for the  
> table:
>
> CREATE TABLE events (eid INTEGER PRIMARY KEY,type INTEGER)
>
> the query:
>
> SELECT events.* FROM events WHERE eid<=32619760 and type=22 ORDER BY
> eid DESC LIMIT 1;
>
> runs much faster if there is no index on type.  The culprit seems to  
> be
> the <= in conjunction with the descending ordering.  If you change
> either, the query gets much faster.  He tried the using indices
> events(type,eid) and events(type,eid desc).  I also tried your  
> original
> suggestion of just events(type) and got the same result.

That would be a case of SQLite choosing a suboptimal index, which is  
very different from ignoring an index all together, which is what your  
original statement said.  I see that if there is an index on  
events(type) that index is used rather than the primary key.  This is  
because the query optimizer is assuming that type=22 is highly  
selective.  Running ANALYZE might help.  But a sure-fire solution is  
to change the query as follows:

    SELECT * FROM events
     WHERE eid<=32619750
           AND +type=22
     ORDER BY eid DESC
      LIMIT  1;

Note the "+" operator in front of the "type" field in the WHERE  
clause.  This + size makes that term of the WHERE clause an  
expression, rather than a constraint on a column, and this  
disqualifies it from use by an index.  That forces SQLite to use the  
other query strategy, which is to use the integer primary key.

Note that in this case, the correct index choice depends on the kind  
of data contained in the table.  If there is only a single row out of  
20 million for which type=22, but there are hundreds of thousands of  
rows with eid<=32619750, then the use of the index on event(type) is  
clearly the better strategy.  Only when type=22 is a common occurrence  
does it become better to use the integer primary key.  SQLite does not  
attempt to keep statistics on table contents, so it has no way of  
knowing which approach is really better.  It makes its best guess.  In  
this case, it happened to guess wrong.  But, as I pointed out, a  
programmer with higher-level knowledge of the table content can steer  
SQLite toward the better choice with the judicious use of a "+" symbol.


D. Richard Hipp
[EMAIL PROTECTED]



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to