[sqlite] Improving performance of GROUP BY

Udi Karni Fri, 25 May 2012 11:39:04 -0700

I am running the following query -

CREATE TABLE XYZ AS
SELECT ID, MIN (DATE)
FROM SOURCE-TABLE
WHERE CRITERIA
GROUP BY ID ;


SOURCE-TABLE has 600 million rows, 2 million meet the WHERE criteria, and
get grouped to 100,000 distinct IDs. There are no indexes as there is too
much variety in the WHERE clauses and index creation on this table takes
many hours. Temp_store=memory. When executed - the source table is read
very slowly, and runtime is 1/2 hour.

I split this SQL into 2 steps - the first just extracting - the second just
grouping. The extract read the table extremely fast and finished in 3
minutes. The GROUP BY ran in a few seconds. So a 10X improvement - which is
pretty incredible speed.

Why such a large disparity? Why wouldn't the single-step SQL read fast -
write the accepted values to TEMP - then sort and group - same as the
2-step?

Any way to influence the SQlite SQL processor to chose a more optimal
single-step approach?

Thanks !
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Improving performance of GROUP BY

Reply via email to