Hi all,

while working on a prototype of memory-bounded hash aggregate (alternative
to Jeff's patch discussed in [1]), I ran into difficulties when dealing
with dynahash. So I'm asking for help ...

Some of the difficulties stem from my limited knowledge of dynahash, and
complexity in execGrouping.c, but it seems to me some are actually due to
a mismatch with the proposed hashjoin-like approach to batching.

The hashjoin-like batching essentially does this:

1) put data (in this case aggregate states) into a hash table, until a
memory limit is reached
2) double the number of batches and move half the entries from the hash
table (based on batch number, computed from the hash value)
3) repeat

This however requires two things, that I think seem quite difficult to do
with dynahash:

(a) detecting that a memory limit was reached

   This is usually done with a condition like this:

   if (consumed_memory > work_mem) {
      ... do the batching / remove ~50% of data from hash table
      ... it's expected counsumed_memory drops by 50%

   Where consumed memory is the size of a memory context (cheap to
compute, thanks to another patch from Jeff [2]).

   This however does not work with dynahash, because dynahash does not
release memory for removed tuples - it just moves it to a freelist, so
consumed_memory only grows.

   For a while I thought I could do this:

   if (consumed_memory > consumed_memory_prev) {
      consumed_memory_prev = consumed_memory

   But then I found out dynahash does not grow continuously, but in (quite
large) steps. Exceeding the limit a bit is not a big deal, but the
growth is quite fast and quickly leads to allocating much more than the

(b) removing tuples while batching

    Whenever the number of batches is increased (doubled), I need to walk
through the hash table and remove entries not belonging to the current
batch (should be ~50% of them). The only way to do this with dynahash
sems to be iterating over the entries, and then doing another search
with HASH_REMOVE. Is there a better way?

I've been considering switching this to a custom hash table (similar to
what's used in hashjoin
https://commitfest.postgresql.org/action/patch_view?id=1494), which seems
like a better solution for this use-case, but I'm not sure about this. I'm
not a big fan of replacing large amounts of code for no good reason.


I'd be grateful if someone more knowledgeable of dynahash / the way it's
used in execGrouping could review the prototype I have so far. There's
only a handful of functions related to dynahash, and most of the issues I
have is about passing the values properly (slots, dummy values, tuples).




Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to