Re: [sqlite] Streaming group by in union of already sorted tables

Emmanouil Karvounis Fri, 23 Jan 2015 09:00:41 -0800

Dear Simon,

Thank you for your answer and I'm sorry if I have used inappropriate
wording and confused you.


The issue is actually very simple:

tableA and tableB have both primary key on (c1, c2)

explain query plan
select c1, c2, count(*) from (
select c1, c2 from tableA
union all
select c1, c2 from tableB
)
group by c1,c2

2|0|0|SCAN TABLE tableA USING COVERING INDEX sqlite_autoindex_tableA_1
3|0|0|SCAN TABLE tableB USING COVERING INDEX sqlite_autoindex_tableB_1
1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
0|0|0|SCAN SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR GROUP BY

There is no reason to create a new temp B-tree when you can sequentially
and in-synch scan the B-tree of tableA and of tableB and get the groups in
one pass. Think of how you would execute the sort step of mergesort on two
already sorted subarrays.


Manos

On 23 January 2015 at 18:42, Simon Slavin <[email protected]> wrote:

>
> On 23 Jan 2015, at 4:15pm, Emmanouil Karvounis <[email protected]> wrote:
>
> > In short, we have two tables that are already sorted on a combination of
> > two fields
>
> There is no such thing as a 'sorted table' in SQL.  Each table is a set of
> rows and the rows have no order.
>
> If you want to make it easy for SQL to access a table's rows in a
> particular order, create an index or make that order the table's primary
> key (which is another way of making an index).
>
> > select c1, c2, myagg(*) from (
> > select * from tableA
> > union all
> > select * from tableB
> > )
> > group by c1, c2;
>
> This command tells SQL that you want to construct a list of every row of
> tableA and every row of tableB.  In other words, if you have 300 rows in
> tableA and 500 rows in tableB, you are telling SQL to construct a new table
> of 800 rows.  And because this table doesn't yet exist, it doesn't have any
> indexes so it can't be searched quickly.  Is that what you wanted ?
>
> Is there a good reason for needing this data in two separate tables rather
> than one for which you can create an index on (c1, c2) ?
>
> Do the groups occur entirely within one table or do you have to add the
> tables together before SQL can figure out the groups.
>
> > where tableA, tableB have primary key (c1, c2) and their schema
> comprises 3
> > integers: c1, c2, and prop.
>
> It might be worth testing with something like 'total(*)' just to make sure
> it isn't your own function which is causing the problems.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Streaming group by in union of already sorted tables

Reply via email to