> it thinks that scanning the whole ENTRY_ATTRIBUTES table will read > about 54855 rows. And you say that it has much more rows.
This particular database has less rows, the millions I mentioned are in a different database. I think the figures are right, but will check. > I think running ANALYZE on your database Analyze was done, but will double-check. RBS On Tue, Jul 12, 2011 at 8:59 PM, Pavel Ivanov <paiva...@gmail.com> wrote: >> Now what surprises me is that this optimization is not done >> automatically by SQLite. >> I suppose I just over estimate the capabilities of the SQLite plan generator. >> Or, would this be something that could be improved? > > It's very non-obvious optimization and I think other type of > optimization will give much better result in your case. > To understand the optimizer's behavior look at numbers it shows you: > it thinks that scanning the whole ENTRY_ATTRIBUTES table will read > about 54855 rows. And you say that it has much more rows. That's why > optimizer selects sub-optimal plan. > I think running ANALYZE on your database should fix selected plans and > even first query will run much faster. > > > Pavel > > > On Tue, Jul 12, 2011 at 3:39 PM, Bart Smissaert > <bart.smissa...@gmail.com> wrote: >> Joining a large table (ENTRY_ATTRIBUTES) and a small table >> (BPNewENTRY) and putting the resulting records >> in a third table, BP3. Large table may have a few million records and >> small table a few hundred records. >> The join field is called ENTRY_ID in both tables and this has a >> non-unique index in the large table and is the integer primary key >> in the small table. Data type is integer in both these fields. >> >> This is the SQL: >> >> INSERT OR REPLACE INTO BP3 >> (ENTRY_ID, NUMERIC_VALUE, UNIT) >> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO >> FROM >> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON >> (EA.ENTRY_ID = E.ENTRY_ID) >> >> and this is the generated query plan for that: >> >> 0--0--0--SCAN TABLE ENTRY_ATTRIBUTES AS EA (~54855 rows) <--> >> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY >> (rowid=?) (~1 rows) >> >> Now I can speed up the query a lot by putting a where clause in with >> the min(ENTRY_ID) of the small table: >> >> INSERT OR REPLACE INTO BP3 >> (ENTRY_ID, NUMERIC_VALUE, UNIT) >> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO >> FROM >> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON >> (EA.ENTRY_ID = E.ENTRY_ID) >> WHERE >> EA.ENTRY_ID >= 4262936 >> >> and that will have this query plan: >> >> 0--0--0--SEARCH TABLE ENTRY_ATTRIBUTES AS EA USING INDEX >> IDX_ENTRY_ATTRIBUTES_ENTRY_ID (ENTRY_ID>?) (~18102 rows) <--> >> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY >> (rowid=?) (~1 rows) >> >> Getting this min(ENTRY_ID) is done in a separate query and because it >> is done on the small table it is very fast. >> >> Now what surprises me is that this optimization is not done >> automatically by SQLite. >> I suppose I just over estimate the capabilities of the SQLite plan generator. >> Or, would this be something that could be improved? >> >> I am using SQLite version 3.7.5. >> >> >> RBS >> _______________________________________________ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users