On Wed, Apr 12, 2023 at 2:59 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > > On Wed, Apr 12, 2023 at 2:14 PM Andres Freund <and...@anarazel.de> wrote: > > > > Hi, > > > > On 2023-04-12 10:57:17 -0400, Melanie Plageman wrote: > > > HeapTupleHeaderHasMatch() checks if HEAP_TUPLE_HAS_MATCH is set. > > > > > > In htup_details.h, you will see that HEAP_TUPLE_HAS_MATCH is defined as > > > HEAP_ONLY_TUPLE > > > /* > > > * HEAP_TUPLE_HAS_MATCH is a temporary flag used during hash joins. It is > > > * only used in tuples that are in the hash table, and those don't need > > > * any visibility information, so we can overlay it on a visibility flag > > > * instead of using up a dedicated bit. > > > */ > > > #define HEAP_TUPLE_HAS_MATCH HEAP_ONLY_TUPLE /* tuple has a join match > > > */ > > > > > > If you redefine HEAP_TUPLE_HAS_MATCH as something that isn't already > > > used, say 0x1800, the query returns correct results. > > > [...] > > > The question is, why does this only happen for a parallel full hash join? > > > > I'd guess that PHJ code is missing a HeapTupleHeaderClearMatch() somewhere, > > but the non-parallel case isn't. > > Indeed. Thanks! This diff fixes the case Richard provided. > > diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c > index a45bd3a315..54c06c5eb3 100644 > --- a/src/backend/executor/nodeHash.c > +++ b/src/backend/executor/nodeHash.c > @@ -1724,6 +1724,7 @@ retry: > /* Store the hash value in the HashJoinTuple header. */ > hashTuple->hashvalue = hashvalue; > memcpy(HJTUPLE_MINTUPLE(hashTuple), tuple, tuple->t_len); > + HeapTupleHeaderClearMatch(HJTUPLE_MINTUPLE(hashTuple)); > > /* Push it onto the front of the bucket's list */ > > ExecParallelHashPushTuple(&hashtable->buckets.shared[bucketno], > > I will propose a patch that includes this change and a test. > > I just want to convince myself that ExecParallelHashTableInsertCurrentBatch() > covers the non-batch 0 cases and we don't need to add something to > sts_puttuple().
So, indeed, tuples in batches after batch 0 already had their match bit cleared by ExecParallelHashTableInsertCurrentBatch(). Attached patch includes the fix for ExecParallelHashTableInsert() as well as a test. I toyed with adapting one of the existing parallel full hash join tests to cover this case, however, I think Richard's repro is much more clear. Maybe it is worth throwing in a few updates to the tables in the existing queries to provide coverage for the other HeapTupleHeaderClearMatch() calls in the code, though. - Melanie
From 6aa53dae62e8a8a34166ed6f28ee621cf727c004 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Wed, 12 Apr 2023 17:16:28 -0400 Subject: [PATCH v1] Reset PHJ tuple match bit upon hashtable insert We reuse a single bit to indicate that a hash join tuple is heap-only and that it has a match during hash join execution. Serial hash join and parallel multi-batch hash join cleared this bit upon inserting the tuple into the hashtable. Single batch parallel hash join and batch 0 of unexpected multi-batch hash joins forgot to do this. Fix that. We didn't notice before because parallel hash join wasn't using the match bits for tuples in the hashtable since right and full parallel hash join were unsupported. Reported-by: Richard Guo Discussion: https://postgr.es/m/flat/CAMbWs48Nde1Mv%3DBJv6_vXmRKHMuHZm2Q_g4F6Z3_pn%2B3EV6BGQ%40mail.gmail.com --- src/backend/executor/nodeHash.c | 1 + src/test/regress/expected/join_hash.out | 20 ++++++++++++++++++++ src/test/regress/sql/join_hash.sql | 17 ++++++++++++++++- 3 files changed, 37 insertions(+), 1 deletion(-) diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c index a45bd3a315..54c06c5eb3 100644 --- a/src/backend/executor/nodeHash.c +++ b/src/backend/executor/nodeHash.c @@ -1724,6 +1724,7 @@ retry: /* Store the hash value in the HashJoinTuple header. */ hashTuple->hashvalue = hashvalue; memcpy(HJTUPLE_MINTUPLE(hashTuple), tuple, tuple->t_len); + HeapTupleHeaderClearMatch(HJTUPLE_MINTUPLE(hashTuple)); /* Push it onto the front of the bucket's list */ ExecParallelHashPushTuple(&hashtable->buckets.shared[bucketno], diff --git a/src/test/regress/expected/join_hash.out b/src/test/regress/expected/join_hash.out index 09376514bb..c4f4e2ee54 100644 --- a/src/test/regress/expected/join_hash.out +++ b/src/test/regress/expected/join_hash.out @@ -955,6 +955,26 @@ $$); (1 row) rollback to settings; +-- Ensure that hash join tuple match bits have been cleared before putting them +-- into the hashtable. +SAVEPOINT settings; +SET enable_parallel_hash = on; +SET min_parallel_table_scan_size = 0; +SET parallel_setup_cost = 0; +SET parallel_tuple_cost = 0; +CREATE TABLE hjtest_matchbits_t1(id int); +CREATE TABLE hjtest_matchbits_t2(id int); +INSERT INTO hjtest_matchbits_t1 VALUES (1); +INSERT INTO hjtest_matchbits_t2 VALUES (2); +UPDATE hjtest_matchbits_t2 set id = 2; +SELECT * FROM hjtest_matchbits_t1 t1 FULL JOIN hjtest_matchbits_t2 t2 ON t1.id = t2.id; + id | id +----+---- + 1 | + | 2 +(2 rows) + +ROLLBACK TO settings; rollback; -- Verify that hash key expressions reference the correct -- nodes. Hashjoin's hashkeys need to reference its outer plan, Hash's diff --git a/src/test/regress/sql/join_hash.sql b/src/test/regress/sql/join_hash.sql index 179e94941c..86917ee13b 100644 --- a/src/test/regress/sql/join_hash.sql +++ b/src/test/regress/sql/join_hash.sql @@ -506,8 +506,23 @@ $$ $$); rollback to settings; -rollback; +-- Ensure that hash join tuple match bits have been cleared before putting them +-- into the hashtable. +SAVEPOINT settings; +SET enable_parallel_hash = on; +SET min_parallel_table_scan_size = 0; +SET parallel_setup_cost = 0; +SET parallel_tuple_cost = 0; +CREATE TABLE hjtest_matchbits_t1(id int); +CREATE TABLE hjtest_matchbits_t2(id int); +INSERT INTO hjtest_matchbits_t1 VALUES (1); +INSERT INTO hjtest_matchbits_t2 VALUES (2); +UPDATE hjtest_matchbits_t2 set id = 2; +SELECT * FROM hjtest_matchbits_t1 t1 FULL JOIN hjtest_matchbits_t2 t2 ON t1.id = t2.id; +ROLLBACK TO settings; + +rollback; -- Verify that hash key expressions reference the correct -- nodes. Hashjoin's hashkeys need to reference its outer plan, Hash's -- 2.37.2