Hi,
I am trying to find the best way to write a query that has two levels of group
by where the outer group by columns are a subset of the inner group by columns.
In my example below I want to do an aggregation grouping by per, prod, and mar,
then I want aggregate the results of this aggregation, grouping by just prod,
and per. From the results of explain query plan, I can see a B-Tree is not used
if I only do the first group by - this is mentioned in the query optimisation
page. However, a B-Tree is used for the second group by when both group bys are
present, and I don't understand why, as I think the rows from the subquery come
out in an order already suitable for the second group by?
My actual data is a bit more complex, but I am seeing a 10x-20x speed
difference between the query with the single group by and the query with both.
If the B-Tree is necessary, it would be good to be able to understand why, and
if not, it would be great if there were some way to communicate this to the
query planner.
This below was run on a freshly downloaded and compiled 3.28.
Thanks,
Kev
---
QUERY PLAN
`--SCAN TABLE data USING INDEX sqlite_autoindex_data_1
QUERY PLAN
|--CO-ROUTINE 1
| `--SCAN TABLE data USING INDEX sqlite_autoindex_data_1
|--SCAN SUBQUERY 1
`--USE TEMP B-TREE FOR GROUP BY
------------
create table data(
prod integer not null,
per integer not null,
mar integer not null,
off integer not null,
val real not null,
primary key(prod, per, mar, off)
);
explain query plan select
prod,
per,
mar,
sum(val) as val
from
data
group by
prod,
per,
mar
;
explain query plan select
prod,
per,
min(val)
from
(select
prod,
per,
mar,
sum(val) as val
from
data
group by
prod,
per,
mar)
group by
prod,
per
;
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users