On Nov 12, 2017 11:06 AM, "Balaji Ramanathan" <balaji.ramanat...@gmail.com> wrote:
Hi, I have a table (mytable) with several numerical fields. Let us call them F1, F2, etc., etc. The table also has a row number field (RowNumber) that I make sure has incremental values from 1 through the number of rows in the table. In addition to printing out the values of F1 through Fn, I also want to print out, for each row of the table, the running sums and averages of these fields. There are two averages: one that uses only the count of non-null values of each field as the denominator and another that uses the count of both null and non-null values of each field. So, I wrote a query as below to calculate and print those values: select RowNumber, F1, (select sum(F1) from mytable where mytable.RowNumber <= T.RowNumber) as sumF1, (select avg(F1) from mytable where mytable.RowNumber <= T.RowNumber) as avgF1, --average excluding null values (select sum(F1) from mytable where mytable.RowNumber <= T.RowNumber)/T.RowNumber as OverallAvgF1, --average including null values F2, (select sum(F2) from mytable where mytable.RowNumber <= T.RowNumber) as sumF2, (select avg(F2) from mytable where mytable.RowNumber <= T.RowNumber) as avgF2, (select sum(F2) from mytable where mytable.RowNumber <= T.RowNumber)/T.RowNumber as OverallAvgF2, --Other fields, their running sums and averages go here from mytable as T order by RowNumber I have about 3000 rows in my table and about 10 fields I am doing this for. When I select F1 through F10 alone, the query runs in negligible time. But when I run the above query with running sums and averages included (total of 40 columns instead of 10 columns), the time increases to well over 5 minutes. When I run the query at the SQLite command line, the first 1000 or so records are produced very fast, but as the row number increases, I can see SQLite struggling to produce outputs. By the time I get to the end of the query, the records are being produced at about a rate of only 1 or 2 per second. Is there anything I can do to reduce the time taken? Is this the most efficient way to query for running sums and averages? Any other ideas to get what I need any quicker? The added wrinkle in all this is that mytable is actually not a table, but a view. F1 through F10 are computed from various tables and brought together in this view. So, I don't think I can create indexes on any of these fields. I have thought about creating a temp table out of this view so that I can index RowNumber (I don't know whether it will help) but it goes against my minimalist instincts. I am *FAR* from a SQLite (or SQL) guru, but it seems obvious to me that your calculations for sumF1, avgF1, OverallAvgF1, etc as written are running in quadratic time based on the number of rows in table T. Worse yet, you're recalculating all these values entirely from scratch every time you're recalculating them, rather than use the previously calculated value. Is there any way you can calculate table T with just RowNumber, F1, F2, ..., F10, and then as you subsequently step through T row by row calculate all the sums and averages as you step, saving the calculations for each row as you calculate them in temporary variables for use in calculating the values for the next row? This might be doable in pure SQL code, or you might have to do it in whatever language you're making calls to SQLite from. That should be linear time, at worst 2 times the number of rows in T (once to construct T, once to step through it). Be well. Joseph _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users