On 25.03.2020 13:36, Richard Guo wrote:
On Tue, Mar 24, 2020 at 3:36 PM Richard Guo <guofengli...@gmail.com
<mailto:guofengli...@gmail.com>> wrote:
On Tue, Mar 24, 2020 at 11:05 AM Thomas Munro
<thomas.mu...@gmail.com <mailto:thomas.mu...@gmail.com>> wrote:
I think there might be a case like this:
* ExecRescanHashJoin() decides it can't reuse the hash table for a
rescan, so it calls ExecHashTableDestroy(), clears HashJoinState's
hj_HashTable and sets hj_JoinState to HJ_BUILD_HASHTABLE
* the HashState node still has a reference to the pfree'd
HashJoinTable!
* HJ_BUILD_HASHTABLE case reaches the empty-outer optimisation
case so
it doesn't bother to build a new hash table
* EXPLAIN examines the HashState's pointer to a freed
HashJoinTable struct
Yes, debugging with gdb shows this is exactly what happens.
According to the scenario above, here is a recipe that reproduces this
issue.
-- recipe start
create table a(i int, j int);
create table b(i int, j int);
create table c(i int, j int);
insert into a select 3,3;
insert into a select 2,2;
insert into a select 1,1;
insert into b select 3,3;
insert into c select 0,0;
analyze a;
analyze b;
analyze c;
set enable_nestloop to off;
set enable_mergejoin to off;
explain analyze
select exists(select * from b join c on a.i > c.i and a.i = b.i and
b.j = c.j) from a;
-- recipe end
I tried this recipe on different PostgreSQL versions, starting from
current master and going backwards. I was able to reproduce this issue
on all versions above 8.4. In 8.4 version, we do not output information
on hash buckets/batches. But manual inspection with gdb shows in 8.4 we
also have the dangling pointer for HashState->hashtable. I didn't check
versions below 8.4 though.
Thanks
Richard
I can propose the following patch for the problem.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ff2f45c..470073b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2685,7 +2685,8 @@ show_hash_info(HashState *hashstate, ExplainState *es)
*/
if (hashstate->hashtable)
ExecHashGetInstrumentation(&hinstrument, hashstate->hashtable);
-
+ else
+ hinstrument = hashstate->saved_instrument;
/*
* Merge results from workers. In the parallel-oblivious case, the
* results from all participants should be identical, except where
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index c901a80..db082be 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -1336,6 +1336,13 @@ ExecReScanHashJoin(HashJoinState *node)
else
{
/* must destroy and rebuild hash table */
+ HashState* state = (HashState*)innerPlanState(node);
+ Assert(state->hashtable == node->hj_HashTable);
+ if (state->ps.instrument)
+ {
+ ExecHashGetInstrumentation(&state->saved_instrument, state->hashtable);
+ }
+ state->hashtable = NULL;
ExecHashTableDestroy(node->hj_HashTable);
node->hj_HashTable = NULL;
node->hj_JoinState = HJ_BUILD_HASHTABLE;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 3d27d50..94afc18 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2302,6 +2302,8 @@ typedef struct HashState
SharedHashInfo *shared_info; /* one entry per worker */
HashInstrumentation *hinstrument; /* this worker's entry */
+ HashInstrumentation saved_instrument; /* save here instrumentation for dropped hash */
+
/* Parallel hash state. */
struct ParallelHashJoinState *parallel_state;
} HashState;