Re: [sqlite] Fwd: Best Query And Index Structure

Pavel Ivanov Fri, 16 Oct 2009 05:49:14 -0700

Yes, pretty interesting results. I didn't expect that. :)
Query plan seems to suggest that SQLite executes query not in the way
you said but first takes tit table, joins epgdata to it and then joins
tit1 and tit2 to it. So it should be executed faster than you
thought...


I've played with your queries a bit and found the only way to force
SQLite to execute query the way I've intended - to change table
epgdata so that id is not "integer primary key" but has a non-unique
(!) index on it. :) But of course that will not mean that query
execution would be the fastest in this case.
I'm surprised and impressed with SQLite's optimizer. :)

BTW, to make your query fastest you need index on (lang, epgdata_id,
tittext) instead of (lang, tittext, epgdata_id). Even for this
particular query tittext shouldn't be in the index at all.


Pavel

On Fri, Oct 16, 2009 at 8:08 AM, Brad Phelan <bradphe...@xtargets.com> wrote:
> On Fri, Oct 16, 2009 at 1:39 PM, Pavel Ivanov <paiva...@gmail.com> wrote:
>>> So if
>>> x has a very large range and a small probability of a match then
>>> we still have to do a full scan of 10,000 rows of A.
>>>
>>> Is there a better way to construct the query and or indexes so
>>> the result is faster.
>>
>> If your x has a small selectivity in B disregarding of A, i.e. for
>> each x you have pretty small amount of rows in B, then I'd suggest
>> instead of your index create these two:
>>
>> CREATE INDEX index_B on B (x, A_id);
>> CREATE INDEX index_A on A (id);
>>
>> And write your select in this way:
>>
>> select distinct *
>> from A join
>> (select B0.A_id as A_id
>>  from B B0, B B1
>>  where B0.x = 10
>>  and B1.x = 20
>>  and B0.A_id = B1.A_id) B2 on B2.A_id = A.id
>>
>>
>
> I've tried your pattern on my production code with some interesting
> results. The original pattern is below followed by your suggestion.
> However in this case I have used three terms. There is a LIKE "%FOO%"
> term in there which I really should replace with FTS3.
>
> select count(*) from epgdata
>     JOIN tit AS tit0
>         ON tit0.epgdata_id = epgdata.id
>         AND ( (tit0.lang = "deu") AND ((tit0.tittext) LIKE ("%die%")) )
>     JOIN tit AS tit1
>         ON tit1.epgdata_id = tit0.epgdata_id
>         AND ( (tit1.lang = "deu") AND ((tit1.tittext) LIKE ("%der%")) )
>     JOIN tit AS tit2
>         ON tit2.epgdata_id = tit1.epgdata_id
>         AND ( (tit2.lang = "deu") AND ((tit2.tittext) LIKE ("%zu%")) )
>
> -----------------------
>
> select count(*) from epgdata join
>    ( select tit0.epgdata_id as epgdata_id
>      from tit as tit0, tit as tit1, tit as tit2
>      where tit0.lang="deu" and tit0.tittext LIKE "%die%"
>        and tit1.lang="deu" and tit1.tittext LIKE "%der%"
>        and tit2.lang="deu" and tit2.tittext LIKE "%zu%"
>        and tit0.epgdata_id = tit1.epgdata_id
>        and tit0.epgdata_id = tit2.epgdata_id
>    ) as foo on foo.epgdata_id = epgdata.id
>
>
> generates almost identical sqlite bytecode using the EXPLAIN keyword. Some
> of the register numbers are different but the code structure is word for
> word the same. Unfortunately I can't make head or tail of the codes.
>
> The query plan for both of them is
>
>    0|1|TABLE tit AS tit0 WITH INDEX tit__lang__tittext__epgdata_id__
>    1|0|TABLE epgdata USING PRIMARY KEY
>    2|2|TABLE tit AS tit1 WITH INDEX tit__epgdata_id__
>    3|3|TABLE tit AS tit2 WITH INDEX tit__epgdata_id__
>
> I have indices
>
>    (epgdata_id) -> tit__epgdata_id__
>
> and
>
>    (lang, tittext, epgdata_id) -> tit__lang__tittext__epgdata_id__
>
> It seems that SQLite maps both queries to the same internal
> representation. Curious!!
>
> B
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Fwd: Best Query And Index Structure

Reply via email to