> So if
> x has a very large range and a small probability of a match then
> we still have to do a full scan of 10,000 rows of A.
>
> Is there a better way to construct the query and or indexes so
> the result is faster.

If your x has a small selectivity in B disregarding of A, i.e. for
each x you have pretty small amount of rows in B, then I'd suggest
instead of your index create these two:

CREATE INDEX index_B on B (x, A_id);
CREATE INDEX index_A on A (id);

And write your select in this way:

select distinct *
from A join
(select B0.A_id as A_id
 from B B0, B B1
 where B0.x = 10
 and B1.x = 20
 and B0.A_id = B1.A_id) B2 on B2.A_id = A.id


Pavel

On Fri, Oct 16, 2009 at 6:09 AM, Brad Phelan <bradphe...@xtargets.com> wrote:
> Hi all,
>
> I am curious on how to design a schema and indexes to best fit the
> following pattern. My
> data is tree like and stored normalized in the database.
>
> CREATE TABLE A
>    ( id INTEGER PRIMARY
>    )
>
> CREATE TABLE B
>    ( id INTEGER PRIMARY
>    , A_ID  INTEGER       # Foreign key to A
>    , x     INTEGER
>    )
>
>
> Now I wish to make queries such as.
>
>    All A where
>    any A/B.x = 10
>    and
>    any A/B.x = 20
>
> This can be coded trivially in SQL as
>
>    select distinct * from A
>    join B as B0 on A.id = B0.A_id and B0.x = 10
>    join B as B1 on A.id = B1.A_id and B0.x = 20
>
> My guess is that the suitable index to create is
>
>    CREATE INDEX index on B
>        ( A_id
>        , x
>        )
>
> However my limited understanding of how SQLite works suggests
> that this will be implemented as
>
> for a in A:
>    for b1 in B where b1.A_id = a.id and b1.x = 10:
>        for b2 in B where b2.A_id = a.id and b2.x = 20:
>            yield a
>
>
> Here the branching factor is quite small. There will be no more than
> 20 or so B's for every A but there may be about 10,000 A's. So if
> x has a very large range and a small probability of a match then
> we still have to do a full scan of 10,000 rows of A.
>
> In this case the index helps the joining but the search is still
> O(N)
>
> Is there a better way to construct the query and or indexes so
> the result is faster.
>
> Regards
>
> Brad Phelan
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to