[sqlite] Wrong Index Select with Large Table Joined to Small Table

Richard Hipp Thu, 21 Jan 2016 16:43:06 -0500

On 1/21/16, Richard Hipp <drh at sqlite.org> wrote:
> On 1/21/16, Denis Burke <burkeden at gmail.com> wrote:
>>
>> That query takes 4 minutes to run...
>
> Can you please run ".fullschema" on your database from the sqlite3.exe
> command line shell and post the output
>


[Mr. Burke sent the .fullschema output to me off-list]

I've studied your schema, and as far as I can tell, SQLite is making
the right algorithm choice based on the information it has available
to it.

The "fast" query (the one where you specify an index using the INDEXED
BY clause) actually create two "automatic indexes" that are discarded
after the query completes.  There is a lot of performance risk in
choosing to create an automatic index, because creating an index is
expensive and SQLite has no way to know in advance how well that index
might work out.  And because of this risk, the estimated cost of using
an automatic index is artificially inflated.  Apparently in your case
the automatic indexes work out quite well, but as I said, the query
planner has no way of knowing this in advance.  So I think it is
making the right choice.

My recommendation is that you use CREATE INDEX to create real indexes
that cover the same columns as the two automatic indexes identified in
the EXPLAIN QUERY PLAN output of your fast query.  To wit:

   CREATE INDEX x1 ON PatientLocGroups(PatLocGrpID, Hospital,
ClientID, PatientLocID);
   CREATE INDEX x2 ON PriorityGroups(PriorityGroupID, ClientID, PriorityID);

I don't have any idea what data you are storing, but I wonder if
PatLogGrpID ought not be the PRIMARY KEY of the PatientLocGroups
table, and if PriorityGroupID ought not be the PRIMARY KEY of the
PriorityGroups table.  If that works, it would obviate the need for
these indexes, most likely.  But I don't know if that would be
appropriate for your data - only you know that.

Further to the previous paragraph, please note the difference between
INT PRIMARY KEY and INTEGER PRIMARY KEY.  You want the latter most
likely.

I suspect that if you make some of the changes above, your problem
will go away, and many of your other queries will be faster as well.

The other thing I saw was in the sqlite_stat1 data.  There are many
entries like this:

INSERT INTO sqlite_stat1 VALUES('ResultData','IndxRDRsltDate','7020427
7020427');

Whenever you see the first two numbers in the final column of
sqlite_stat1 are the same, that indicates bad index.  You should drop
it.  It is never being used (because the query planner knows that it
is an awful index) and is merely taking up space.

In your case, the indexed column appears to always be NULL.  Perhaps
you are thinking that you occasionally have a non-NULL entry that you
want to look up quick?  The way to do that is with a partial index.
(https://www.sqlite.org/partialindex.html)  For the specific example
above, you want:

   CREATE INDEX [IndxRDRsltDate] ON [ResultData] ([RsltDate] COLLATE
NOCASE ASC) WHERE RsltDate IS NOT NULL;

I think that "WHERE column IS NOT NULL" clause at the end will help you a lot.
-- 
D. Richard Hipp
drh at sqlite.org

[sqlite] Wrong Index Select with Large Table Joined to Small Table

Reply via email to