Re: Re: parallel distinct union and aggregate support patch

[email protected] Thu, 22 Oct 2020 23:29:41 -0700

> Interesting idea.  So IIUC, whenever a worker is scanning the tuple it
> will directly put it into the respective batch(shared tuple store),
> based on the hash on grouping column and once all the workers are
> done preparing the batches then each worker will pick those batches one
> by one, perform sort and finish the aggregation.  I think there is a
> scope of improvement that instead of directly putting the tuple to the
> batch what if the worker does the partial aggregations and then it
> places the partially aggregated rows in the shared tuple store based
> on the hash value and then the worker can pick the batch by batch.  By
> doing this way, we can avoid doing large sorts.  And then this
> approach can also be used with the hash aggregate, I mean the
> partially aggregated data by the hash aggregate can be put into the
> respective batch.


Good idea. Batch sort is suitable when the aggregate produces a large
number of result rows. In that case partial aggregation may run out of
memory, and it also requires every aggregate function to support partial
aggregation (with batch sort this is unnecessary).
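
To make the batch sort idea concrete, here is a minimal sketch in Python
(not PostgreSQL C, and not the patch itself). All names are hypothetical,
plain lists stand in for the shared tuple store, and the workers are run
one after another rather than in parallel. Raw tuples are routed to a
batch by the hash of the grouping column, then each batch is sorted and
aggregated on its own. Because every row of a group lands in one batch, an
aggregate with no partial form (median here) still works.

```python
from itertools import groupby
from statistics import median

def route_to_batches(rows, nbatches):
    """Place each (key, value) row in the batch chosen by hash(key)."""
    batches = [[] for _ in range(nbatches)]
    for key, value in rows:
        batches[hash(key) % nbatches].append((key, value))
    return batches

def aggregate_batch(batch, agg):
    """Sort one batch by grouping key, then aggregate each run of equal keys."""
    batch.sort(key=lambda row: row[0])
    return {
        key: agg([value for _, value in group])
        for key, group in groupby(batch, key=lambda row: row[0])
    }

def batch_sort_aggregate(worker_inputs, nbatches, agg):
    """Assumed flow: all workers fill the shared batches, then batches are
    processed independently (here sequentially, for simplicity)."""
    shared = [[] for _ in range(nbatches)]
    for rows in worker_inputs:
        for i, part in enumerate(route_to_batches(rows, nbatches)):
            shared[i].extend(part)
    result = {}
    for batch in shared:
        result.update(aggregate_batch(batch, agg))
    return result

worker_inputs = [
    [("a", 1), ("b", 10), ("a", 3)],  # rows scanned by worker 1
    [("a", 5), ("b", 20)],            # rows scanned by worker 2
]
result = batch_sort_aggregate(worker_inputs, nbatches=4, agg=median)
```

Each sort only covers one batch, so no single large sort is needed, at the
cost of writing every input row to the store.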

Actually, I wrote a batch hash store for hash aggregate (for PG11) along
the lines of this idea. It does not write partial aggregations to the
shared tuple store, though; it writes the original tuple and its hash
value to the shared tuple store. However, it does not support parallel
grouping sets.
I am trying to write parallel hash aggregate support using the batch
shared tuple store for PG14, and it needs to support parallel grouping
sets for hash aggregate.
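
As a rough illustration of storing the original tuple together with its
hash value, here is a small Python sketch. It is only a model under my own
assumptions, not the PG11 code: the function names, the list-based store
and the use of Python's built-in hash() are all hypothetical. The hash is
computed once when the tuple is stored, and is reused both to pick the
batch and to find the bucket when the batch is hash-aggregated.

```python
def group_hash(key):
    # Stand-in hash function (an assumption); called once per stored tuple.
    return hash(key)

def store_tuple(shared_batches, key, row):
    """Write (hash value, key, original row) to the batch chosen by the hash."""
    h = group_hash(key)
    shared_batches[h % len(shared_batches)].append((h, key, row))

def hash_aggregate_batch(batch, init, transition):
    """Aggregate one batch with a hash table keyed by the stored hash value."""
    table = {}  # stored hash -> list of [key, state] entries
    for h, key, row in batch:
        bucket = table.setdefault(h, [])
        for entry in bucket:
            if entry[0] == key:  # same group: advance its state
                entry[1] = transition(entry[1], row)
                break
        else:                    # first row of this group
            bucket.append([key, transition(init(), row)])
    return {key: state for bucket in table.values() for key, state in bucket}

batches = [[] for _ in range(3)]
for key, row in [("x", 2), ("y", 7), ("x", 5)]:
    store_tuple(batches, key, row)

totals = {}
for batch in batches:
    totals.update(hash_aggregate_batch(batch, lambda: 0, lambda s, r: s + r))
```

Since the original rows are kept, no combine step is needed, which is the
same property batch sort has.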
