James K. Lowden wrote:
> Clemens Ladisch <[email protected]> wrote:
>>> Is there a way to force rows to be dispatched (using the group by)
>>> and aggregated on the fly instead of being stored, sorted and then
>>> aggregated?
>>
>> SQLite can optimize GROUP BY this way only if it can prove that the
>> table is already sorted on the grouping column.  Otherwise, it creates
>> a temporary index for all records, and then groups on those.
>
> IIUC "prove that the table is already sorted" means
> xBestIndex sets orderByConsumed to 1.  One of the orderings proposed
> to xBestIndex is chosen by the query planner and given to xFilter as
> idxNum and idxStr.  The function is then obliged to provide the rows in
> that order, allowing SQLite to skip the step of writing the index to a
> temporary table.
>
> Do I have that right?

Yes:
  > create virtual table t using fts3([...]);
  > explain query plan select * from t group by docid;
  0|0|0|SCAN TABLE t VIRTUAL TABLE INDEX 0:ASC

>> In theory, it would be possible to create a temporary index that
>> stores the current aggregation _result_ for each group seen so far,
>> and that is updated dynamically.
>
> It's possible in theory, but it would complicate user-defined
> aggregation functions greatly.  As it stands, a UDF is called
> with xStep N times, and xFinal once per group.

Sorry, what I meant to say was "stores the current accumulator".

And you're right, this would require that the aggregation function
can store its entire state in a sqlite3_value.

Having two different aggregation APIs is probably not worth the
effort.


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to