Re: [sqlite] Improving performance of GROUP BY

2012-06-17 Thread Udi Karni
Thanks. There are no indexes of any kind because I would need many and they take very long to create. I am experimenting with various GROUP-BY scenarios like the following where all the work is done in TEMP. When TEMP is on disk - it finishes but is slower - when it's in memory - it's faster but

Re: [sqlite] Improving performance of GROUP BY

2012-05-29 Thread Simon Slavin
On 29 May 2012, at 11:04pm, Nico Williams wrote: > Can you post EXPLAIN QUERY PLAN output for your statements? And/or your > schema. Also perform the SQL command 'ANALYZE' on that database, then try your timings again. See if the timings improve (or perhaps even get

Re: [sqlite] Improving performance of GROUP BY

2012-05-29 Thread Nico Williams
On Fri, May 25, 2012 at 1:38 PM, Udi Karni wrote: > I am running the following query - > > CREATE TABLE XYZ AS > SELECT ID, MIN (DATE) > FROM SOURCE-TABLE > WHERE CRITERIA > GROUP BY ID ; > > SOURCE-TABLE has 600 million rows, 2 million meet the WHERE criteria, and > get grouped

Re: [sqlite] Improving performance of GROUP BY

2012-05-26 Thread Simon Slavin
On 26 May 2012, at 6:25pm, Udi Karni wrote: > In any event - breaking the SQL up in 2 is still at least twice as better. I do see huge SQL commands on this list sometimes: great big commands of hundreds of characters which could be done simpler and faster if split into two

Re: [sqlite] Improving performance of GROUP BY

2012-05-26 Thread Udi Karni
It certainly seems to improve things. I ran multiple tests because over time seems like portions of the data get cached and there is a difference between an initial run and subsequent runs. I almost wish simple table-scans would always go exclusively to disk - I am not sure the caching always

Re: [sqlite] Improving performance of GROUP BY

2012-05-25 Thread Simon Slavin
On 25 May 2012, at 7:38pm, Udi Karni wrote: > I am running the following query - > > CREATE TABLE XYZ AS > SELECT ID, MIN (DATE) > FROM SOURCE-TABLE > WHERE CRITERIA > GROUP BY ID ; > > SOURCE-TABLE has 600 million rows, 2 million meet the WHERE criteria, and > get grouped

[sqlite] Improving performance of GROUP BY

2012-05-25 Thread Udi Karni
I am running the following query - CREATE TABLE XYZ AS SELECT ID, MIN (DATE) FROM SOURCE-TABLE WHERE CRITERIA GROUP BY ID ; SOURCE-TABLE has 600 million rows, 2 million meet the WHERE criteria, and get grouped to 100,000 distinct IDs. There are no indexes as there is too much variety in the