Hi,

On 2026-02-19 20:24:17 +0200, Ants Aasma wrote:
> On Thu, 19 Feb 2026 at 20:07, Ants Aasma <[email protected]> wrote:
> > I was thinking more along the lines of hashing together the pointer
> > value and worker number. But something more deterministic would indeed
> > be better. How about this?
> >
> > --- a/src/backend/executor/execGrouping.c
> > +++ b/src/backend/executor/execGrouping.c
> > @@ -201,3 +201,3 @@ BuildTupleHashTable(PlanState *parent,
> >         MemoryContext oldcontext;
> > -       uint32          hash_iv = 0;
> > +       uint32          hash_iv = parent->plan->plan_node_id;
> 
> I can confirm that this fixes the issue. A standalone reproducer is here:

I think this needs should use something that smears the bits from the plan_id
more widely. The hash functions of some types are ... peculiar (partially due
to cross-type hashjoin support).

I'd also combine it with use_variable_hash_iv, rather than just have
use_variable_hash_iv overwrite it.


> create table data as select random(1,1000000) from 
> generate_series(1,10000000);
> vacuum analyze data;
> set enable_gathermerge = off;
> explain analyze select distinct random from data;

Took me a moment to reproduce with that. I also needed parallel_tuple_cost=0
and work_mem=4MB (something fairly small at least) to reproduce it.

With those I can reproduce that the query only terminates in a reasonable time
with the hash_iv = parent->plan->plan_node_id thing.

Greetings,

Andres Freund


Reply via email to