James K. Lowden wrote: > Clemens Ladisch <[email protected]> wrote: >>> Is there a way to force rows to be dispatched (using the group by) >>> and aggregated on the fly instead of being stored, sorted and then >>> aggregated? >> >> SQLite can optimize GROUP BY this way only if it can prove that the >> table is already sorted on the grouping column. Otherwise, it creates >> a temporary index for all records, and then groups on those. > > IIUC "prove that the table is already sorted" means > xBestIndex sets orderByConsumed to 1. One of the orderings proposed > to xBestIndex is chosen by the query planner and given to xFilter as > idxNum and idxStr. The function is then obliged to provide the rows in > that order, allowing SQLite to skip the step of writing the index to a > temporary table. > > Do I have that right?
Yes: > create virtual table t using fts3([...]); > explain query plan select * from t group by docid; 0|0|0|SCAN TABLE t VIRTUAL TABLE INDEX 0:ASC >> In theory, it would be possible to create a temporary index that >> stores the current aggregation _result_ for each group seen so far, >> and that is updated dynamically. > > It's possible in theory, but it would complicate user-defined > aggregation functions greatly. As it stands, a UDF is called > with xStep N times, and xFinal once per group. Sorry, what I meant to say was "stores the current accumulator". And you're right, this would require that the aggregation function can store its entire state in a sqlite3_value. Having two different aggregation APIs is probably not worth the effort. Regards, Clemens _______________________________________________ sqlite-users mailing list [email protected] http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

