Re: [sqlite] surprising missing query optimization

Bart Smissaert Tue, 12 Jul 2011 13:07:07 -0700

> it thinks that scanning the whole ENTRY_ATTRIBUTES table will read
> about 54855 rows. And you say that it has much more rows.


This particular database has less rows, the millions I mentioned are
in a different
database. I think the figures are right, but will check.

> I think running ANALYZE on your database

Analyze was done, but will double-check.

RBS


On Tue, Jul 12, 2011 at 8:59 PM, Pavel Ivanov <paiva...@gmail.com> wrote:
>> Now what surprises me is that this optimization is not done
>> automatically by SQLite.
>> I suppose I just over estimate the capabilities of the SQLite plan generator.
>> Or, would this be something that could be improved?
>
> It's very non-obvious optimization and I think other type of
> optimization will give much better result in your case.
> To understand the optimizer's behavior look at numbers it shows you:
> it thinks that scanning the whole ENTRY_ATTRIBUTES table will read
> about 54855 rows. And you say that it has much more rows. That's why
> optimizer selects sub-optimal plan.
> I think running ANALYZE on your database should fix selected plans and
> even first query will run much faster.
>
>
> Pavel
>
>
> On Tue, Jul 12, 2011 at 3:39 PM, Bart Smissaert
> <bart.smissa...@gmail.com> wrote:
>> Joining a large table (ENTRY_ATTRIBUTES) and a small table
>> (BPNewENTRY) and putting the resulting records
>> in a third table, BP3. Large table may have a few million records and
>> small table a few hundred records.
>> The join field is called ENTRY_ID in both tables and this has a
>> non-unique index in the large table and is the integer primary key
>> in the small table. Data type is integer in both these fields.
>>
>> This is the SQL:
>>
>> INSERT OR REPLACE INTO BP3
>> (ENTRY_ID, NUMERIC_VALUE, UNIT)
>> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
>> FROM
>> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON
>> (EA.ENTRY_ID = E.ENTRY_ID)
>>
>> and this is the generated query plan for that:
>>
>> 0--0--0--SCAN TABLE ENTRY_ATTRIBUTES AS EA (~54855 rows) <-->
>> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY
>> (rowid=?) (~1 rows)
>>
>> Now I can speed up the query a lot by putting a where clause in with
>> the min(ENTRY_ID) of the small table:
>>
>> INSERT OR REPLACE INTO BP3
>> (ENTRY_ID, NUMERIC_VALUE, UNIT)
>> SELECT EA.ENTRY_ID, EA.NUMERIC_VALUE, EA.TYPE_SPECIFIC_INFO
>> FROM
>> ENTRY_ATTRIBUTES EA INNER JOIN BPNewENTRY E ON
>> (EA.ENTRY_ID = E.ENTRY_ID)
>> WHERE
>> EA.ENTRY_ID >= 4262936
>>
>> and that will have this query plan:
>>
>> 0--0--0--SEARCH TABLE ENTRY_ATTRIBUTES AS EA USING INDEX
>> IDX_ENTRY_ATTRIBUTES_ENTRY_ID (ENTRY_ID>?) (~18102 rows) <-->
>> 0--1--1--SEARCH TABLE BPNewENTRY AS E USING INTEGER PRIMARY KEY
>> (rowid=?) (~1 rows)
>>
>> Getting this min(ENTRY_ID) is done in a separate query and because it
>> is done on the small table it is very fast.
>>
>> Now what surprises me is that this optimization is not done
>> automatically by SQLite.
>> I suppose I just over estimate the capabilities of the SQLite plan generator.
>> Or, would this be something that could be improved?
>>
>> I am using SQLite version 3.7.5.
>>
>>
>> RBS
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] surprising missing query optimization

Reply via email to