On Thu, 19 Dec 2013 12:03:48 +0100
Clemens Ladisch <clem...@ladisch.de> wrote:

> > Is there a way to force rows to be dispatched (using the group by)
> > and aggregated on the fly instead of being stored, sorted and then
> > aggregated?
> 
> SQLite can optimize GROUP BY this way only if it can prove that the
> table is already sorted on the grouping column.  Otherwise, it creates
> a temporary index for all records, and then groups on those.
> 
> In theory, it would be possible to create a temporary index that
> stores the current aggregation _result_ for each group seen so far,
> and that is updated dynamically.  However, this is not implemented in
> SQLite.

That answer sent me back to the documentation. I've been toying with
virtual table functions that parse files because it seems to me there
are way too many file formats out there, and very few tools nearly as
good as SQLite for dealing with them.  

IIUC "prove that the table is already sorted" means
xBestIndex sets orderByConsumed to 1.  One of the orderings proposed
to xBestIndex is chosen by the query planner and given to xFilter as
idxNum and idxStr.  The function is then obliged to provide the rows in
that order, allowing SQLite to skip the step of writing the index to a
temporary table.  

Do I have that right?  

> In theory, it would be possible to create a temporary index that
> stores the current aggregation _result_ for each group seen so far,
> and that is updated dynamically.  

It's possible in theory, but it would complicate user-defined
aggregation functions greatly.  As it stands, a UDF is called
with xStep N times, and xFinal once per group.  The groups of rows are
implicitly provided to the function in order, allowing the function to
operate only on the values, oblivious of the key.  

Not all functions are amenable to incremental update.  It's quite
difficult, for example, to compute a median without access to the
entire dataset.  

--jkl
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to