On Mon, Dec 16, 2013 at 9:39 PM, Heikki Linnakangas <hlinnakan...@vmware.com
> wrote:

>
> There's another technique we could use which doesn't need a negative
> transition function, assuming the order you feed the values to the aggreate
> function doesn't matter: keep subtotals. For example, if the window first
> contains values 1, 2, 3, 4, you calculate 3 + 4 = 7, and then 1 + 2 + 7 =
> 10. Next, 1 leaves the window, and 5 enters it. Now you calculate  2 + 7 +
> 5 = 14. By keeping the subtotal (3 + 4 = 7) around, you saved one addition
> compared to calculating 2 + 3 + 4 + 5 from scratch.
>
> The negative transition function is a lot simpler and faster for count(*)
> and integer operations, so we probably should implement that anyway. But
> the subtotals technique could be very useful for other data types.
>
> - Heikki
>

That's quite interesting. I guess we would need another flag in
pg_aggregate to mark if the order of the tuples matters, string_agg would
be an example of one that would have to skip this.

At least for ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING if the
aggregate order did not matter than it would likely be quite efficient just
to aggregate from the bottom up and materialise the results at each tuple
and store it in the tuple store, then just use that materialised value when
that tuple is processed. This method won't work when the frame base is not
fixed.

I had been thinking ahead on how to improve MIN and MAX cases too. I came
up with something called "tuple store indexes" that could be build as
binary search trees with a composite index on the tuple position and the
aggregate's sort operator... Something similar to how the following query
could use an index on (id,value) to calculate max()
select max(value) from test where id between 1 and 100;
It's certainly not something for this patch, but it was an idea I came up
with which I think would be possible without adding any more columns to
pg_aggregate.

Regards

David Rowley

Reply via email to