We have an application that has been using SQLite successfully for several
years.  We use SQLite for various purposes within our product, with a few
dozen tables overall.

Recently, I started to upgrade from version 3.6.3 to 3.7.5.  During that
time, I noticed that several previously fast indexed queries turned into
slow full table scans.  For example, on a simple (hypothetical) example:

CREATE TABLE tbl (
  id INTEGER,
  value INTEGER
);
CREATE INDEX idx_id ON tbl (id);

The query "SELECT * FROM tbl WHERE id = 1" previously made use of idx_id.
However, in 3.7.5, we noticed that it was doing a full table scan, causing
significant performance issues.

After some debugging, found that the issue is related to our (mis-)use of
the "analyze" command:  Historically, our software runs analyze any time the
db schema changes.  However, if our customer had not yet made use of a
particular feature, the corresponding table(s) might be _empty_ when analyze
is run.  On previous versions, this did not cause any obvious problems.  In
the new version, the query planner reasonably assumes that a full table scan
is faster than an index on a small/empty table.  However, when the customer
later makes use of those features (populating the tables), the queries
become quite slow, as it still does a full table scan.

I'm trying to determine the best way to resolve this issue:

- Re-running analyze after adding data to the table is an obvious
suggestion.  However, as we have lots of tables in use for various purposes,
I'd need to sprinkle lots of "if (first time data added) analyze()" code
around.
- I could add a user triggered "maintenance" feature to manually re-analyze,
but that takes away some of the "zero configuration" benefits of our
product.
- Even if I stop calling analyze at all going forward, there may be some
existing (empty) tables that may have already been analyzed by previous
code.
- I could make use of the "INDEXED BY" clause, but this goes against the
documented intent of this feature ("*not* intended for use in tuning the
performance of a query"), and requires re-writing of many queries.

I'm sure there are other good ideas-- I'm flexible, and open to suggestion.
I'd appreciate any suggestions the group might have.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to