Re: [sqlite] Selecting data from WITH clause is very slow
On Mon Sep 11, 2017 at 11:15:22AM +0200, no...@null.net wrote: > SELECT > acc_no, > acc_name, > SUM(i_90.invoice_bal) AS 90_days, > SUM(i_current.invoice_bal) AS current > FROM >debtors_table > LEFT JOIN > invoices i_90 > ON > i_90.debtor_id = debtors_table.id AND > i_90.invoice_date > date_5 AND > i_90.invoice_date <= date_4 > LEFT JOIN > invoices i_current > ON > i_current.debtor_id = debtors_table.id AND > i_current.invoice_date > date_2 AND > i_current.invoice_date <= date_1 > > I should also mention for completeness the need for: > > GROUP BY > acc_no, > acc_name > > ORDER BY > ... Very interesting ideas. I have a lot of experimenting to do! Thanks very much, Mark. Frank ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Selecting data from WITH clause is very slow
On Mon Sep 11, 2017 at 11:15:22AM +0200, no...@null.net wrote: > SELECT > acc_no, > acc_name, > SUM(i_90.invoice_bal) AS 90_days, > SUM(i_current.invoice_bal) AS current > FROM >debtors_table > LEFT JOIN > invoices i_90 > ON > i_90.debtor_id = debtors_table.id AND > i_90.invoice_date > date_5 AND > i_90.invoice_date <= date_4 > LEFT JOIN > invoices i_current > ON > i_current.debtor_id = debtors_table.id AND > i_current.invoice_date > date_2 AND > i_current.invoice_date <= date_1 I should also mention for completeness the need for: GROUP BY acc_no, acc_name > ORDER BY > ... -- Mark Lawrence ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Selecting data from WITH clause is very slow
On Mon Sep 11, 2017 at 08:40:41AM +0200, Frank Millman wrote: > I could structure it like this (pseudo code) - > > SELECT acc_no, acc_name, > (SELECT SUM(invoice_bal) FROM invoices > WHERE invoice_date <= date_5) AS 120_days, > (SELECT SUM(invoice_bal) FROM invoices > WHERE invoice_date > date_5 AND invoice_date <= date_4) AS 90_days, > (SELECT SUM(invoice_bal) FROM invoices > WHERE invoice_date > date_4 AND invoice_date <= date_3) AS 60_days, > (SELECT SUM(invoice_bal) FROM invoices > WHERE invoice_date > date_3 AND invoice_date <= date_2) AS 30_days, > (SELECT SUM(invoice_bal) FROM invoices > WHERE invoice_date > date_2 AND invoice_date <= date_1) AS current > FROM debtors_table > > This works, but it requires 5 separate scans of the invoice table, > which is inefficient. You can achieve the same result with a single table scan using CASE statements: SELECT acc_no, acc_name, SUM( CASE WHEN invoice_date <= date_5 THEN invoice_bal ELSE 0 END ) AS 120_days, SUM( CASE WHEN invoice_date > date_5 AND invoice_date <= date_4 THEN invoice_bal ELSE 0 END ) AS 90_days, -- ... SUM( CASE WHEN invoice_date > date_2 AND invoice_date <= date_1 THEN invoice_bal ELSE 0 END ) AS current FROM debtors_table LEFT JOIN invoices ON invoices.debtor_id = debtors_table.id If your time periods are not exhaustive (covering all time) then you can perhaps be more efficient again by moving the filtering into the join clause: SELECT acc_no, acc_name, SUM(i_90.invoice_bal) AS 90_days, SUM(i_current.invoice_bal) AS current FROM debtors_table LEFT JOIN invoices i_90 ON i_90.debtor_id = debtors_table.id AND i_90.invoice_date > date_5 AND i_90.invoice_date <= date_4 LEFT JOIN invoices i_current ON i_current.debtor_id = debtors_table.id AND i_current.invoice_date > date_2 AND i_current.invoice_date <= date_1 ORDER BY ... For the above an index on invoices(debtor_id,invoice_date) would be appropriate. Also in case you weren't aware, the SUM() function in my second example could produce NULL values. You may be better served in that instance by CAST(TOTAL(i_current.invoice_bal) AS INTEGER). -- Mark Lawrence ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Selecting data from WITH clause is very slow
Clemens Ladisch wrote: > Frank Millman wrote: > > I changed it to use a WITH clause to make one scan of the invoice table and > > store the results. > The WITH clause itself does not store anything; it's syntactic sugar [...] To > force the ageing results to be stored, you'd have to create a temporary table. Thanks – I did not know that. It explains everything! Frank ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Selecting data from WITH clause is very slow
Frank Millman wrote: > SELECT acc_no, acc_name, > (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date <= date_5) > AS 120_days, > (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_5 > AND invoice_date <= date_4) AS 90_days, > (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_4 > AND invoice_date <= date_3) AS 60_days, > (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_3 > AND invoice_date <= date_2) AS 30_days, > (SELECT SUM(invoice_bal) FROM invoices WHERE invoice_date > date_2 > AND invoice_date <= date_1) AS current > FROM debtors_table This query does not make sense without a GROUP BY and a correlation, which would affect how to optimize it. > This works, but it requires 5 separate scans of the invoice table, > which is inefficient. If there is an index on the invoice_date, it's efficient. Assuming that there is an account_numer filter you did not show, the index should look like this: CREATE INDEX invoices_accno_date_idx ON invoices(account_number, invoice_date); Alternatively, to create a covering index (http://www.sqlite.org/queryplanner.html#covidx) for this query, add the invoice_bal column. > I changed it to use a WITH clause to make one scan of the invoice > table and store the results. The WITH clause itself does not store anything; it's syntactic sugar. > WITH ageing AS (...) > SELECT acc_no, acc_name, > (SELECT balance FROM ageing WHERE account_number = acc_no AND > bucket_number = 5) AS 120_days, > (SELECT balance FROM ageing WHERE account_number = acc_no AND > bucket_number = 4) AS 90_days, > (SELECT balance FROM ageing WHERE account_number = acc_no AND > bucket_number = 3) AS 60_days, > (SELECT balance FROM ageing WHERE account_number = acc_no AND > bucket_number = 2) AS 30_days, > (SELECT balance FROM ageing WHERE account_number = acc_no AND > bucket_number = 1) AS current > FROM debtors_table > > It works, but it takes about 3.5 seconds to execute. The ageing query is likely to be executed five times for each row in the outer query. (This might be efficient with the above the index.) To force the ageing results to be stored, you'd have to create a temporary table. Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users