On Tue, May 12, 2009 at 2:21 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>> Moreover, I guess you don't even need to buffer tuples to aggregate by
>> different keys. What you have to do is only to prepare more than one
>> hash table (or set up a sort order, if the planner detects that the hash
>> table is too large to fit in memory), and a single seq scan will do. Only
>> the trans values need to be stored in memory, not the outer plan's
>> results. It will be a big win in performance.
>
> it was my first solution. But I wanted to prepare one non-hash method
> as well. Now I am thinking about a special executor node that fills all
> the necessary hash tables in parallel. It's a special variant of hash
> aggregate.
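For what it's worth, the one-scan, multiple-hash-table idea could be sketched roughly like this (just illustrative Python, nothing like the actual executor code; the row layout, column names, and the sum/count transition state are all made up for the example):

```python
# Sketch: a single pass over the input feeds one hash table per
# grouping key set, keeping only transition states (here: running
# sum and count) in memory -- not the outer plan's result tuples.
from collections import defaultdict

def multi_key_hashagg(rows, key_sets, value_col):
    # one hash table of transition states per grouping key set
    tables = {ks: defaultdict(lambda: [0, 0]) for ks in key_sets}
    for row in rows:                      # single sequential scan
        for ks in key_sets:
            group = tuple(row[c] for c in ks)
            state = tables[ks][group]
            state[0] += row[value_col]    # sum transition value
            state[1] += 1                 # count transition value
    return tables

rows = [
    {"dept": "a", "city": "x", "pay": 10},
    {"dept": "a", "city": "y", "pay": 20},
    {"dept": "b", "city": "x", "pay": 30},
]
out = multi_key_hashagg(rows, [("dept",), ("city",)], "pay")
# e.g. out[("dept",)][("a",)] == [30, 2] and out[("city",)][("x",)] == [40, 2]
```

The memory cost is one transition state per distinct group per key set, which is exactly why this wins over rescanning or re-sorting the input once per grouping key.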
I think HashAggregate will often be the fastest method of executing this
kind of operation, but it would be nice to have an alternative (such as
repeatedly sorting a tuplestore) to handle non-hashable datatypes or
cases where the HashAggregate would eat too much memory.

But that leads me to a question: does the existing HashAggregate code
make any attempt to obey work_mem? I know that the infrastructure is
present for HashJoin/Hash, but on a quick pass I didn't notice anything
similar in HashAggregate.

And on a slightly off-topic note for this thread, is there any
compelling reason why we have at least three different hash
implementations in the executor? HashJoin/Hash uses one for the regular
batches and another for the skew batch, and I believe HashAggregate
does something else entirely. It seems like it might improve code
maintainability, if nothing else, to unify these to the extent possible.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers