On Sat, May 11, 2024 at 3:18 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > > On 5/10/24 21:48, Melanie Plageman wrote: > > Attached is v3. I didn't use your exact language because the test > > wouldn't actually verify that we properly discard the tuples. Whether > > or not the empty tuples are all emitted, it just resets the counter to > > 0. I decided to go with "exercise" instead. > > > > I did go over the v3 patch, did a bunch of tests, and I think it's fine > and ready to go. The one thing that might need some minor tweaks is the > commit message. > > 1) Isn't the subject "Remove incorrect assert" a bit misleading, as the > patch does not simply remove an assert, but replaces it with a reset of > the field the assert used to check? (The commit message does not mention > this either, at least not explicitly.)
I've updated the commit message. > 2) The "heap AM-specific bitmap heap scan code" sounds a bit strange to > me, isn't the first "heap" unnecessary? bitmap heap scan has been used to refer to bitmap table scans, as the name wasn't changed from heap when the table AM API was added (e.g. BitmapHeapNext() is used by all table AMs doing bitmap table scans). 04e72ed617be specifically pushed the skip fetch optimization into heap implementations of bitmap table scan callbacks, so it was important to make this distinction. I've changed the commit message to say heap AM implementations of bitmap table scan callbacks. While looking at the patch again, I wondered if I should set enable_material=false in the test. It doesn't matter from the perspective of exercising the correct code; however, I wasn't sure if disabling materialization would make the test more resilient against future planner changes which could cause it to incorrectly fail. - Melanie
From d3628d8d36b2be54b64c18402bdda80e1c8a436f Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Fri, 10 May 2024 14:52:34 -0400 Subject: [PATCH v4] BitmapHeapScan: Replace incorrect assert with reinitialization 04e72ed617be pushed the skip fetch optimization (allowing bitmap heap scans to operate like index-only scans if none of the underlying data is needed) into heap AM implementations of bitmap table scan callbacks. 04e72ed617be added an assert that all tuples in blocks eligible for the optimization had been NULL-filled and emitted by the end of the scan. This assert is incorrect when not all tuples need be scanned to execute the query; for example: a join in which not all inner tuples need to be scanned before skipping to the next outer tuple. Remove the assert and reset the field on which it previously asserted to avoid incorrectly emitting NULL-filled tuples from a previous scan on rescan. Author: Melanie Plageman Reviewed-by: Tomas Vondra Reported-by: Melanie Plageman Reproduced-by: Tomas Vondra, Richard Guo Discussion: https://postgr.es/m/CAMbWs48orzZVXa7-vP9Nt7vQWLTE04Qy4PePaLQYsVNQgo6qRg%40mail.gmail.com --- src/backend/access/heap/heapam.c | 4 ++-- src/test/regress/expected/join.out | 38 ++++++++++++++++++++++++++++++ src/test/regress/sql/join.sql | 26 ++++++++++++++++++++ 3 files changed, 66 insertions(+), 2 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 4be0dee4de..8600c22515 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -1184,7 +1184,7 @@ heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params, scan->rs_vmbuffer = InvalidBuffer; } - Assert(scan->rs_empty_tuples_pending == 0); + scan->rs_empty_tuples_pending = 0; /* * The read stream is reset on rescan. This must be done before @@ -1216,7 +1216,7 @@ heap_endscan(TableScanDesc sscan) if (BufferIsValid(scan->rs_vmbuffer)) ReleaseBuffer(scan->rs_vmbuffer); - Assert(scan->rs_empty_tuples_pending == 0); + scan->rs_empty_tuples_pending = 0; /* * Must free the read stream before freeing the BufferAccessStrategy. diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index 0246d56aea..8829bd76e7 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -7924,3 +7924,41 @@ where exists (select 1 from j3 (13 rows) drop table j3; +-- Exercise the "skip fetch" Bitmap Heap Scan optimization when candidate +-- tuples are discarded. This may occur when: +-- 1. A join doesn't require all inner tuples to be scanned for each outer +-- tuple, and +-- 2. The inner side is scanned using a bitmap heap scan, and +-- 3. The bitmap heap scan is eligible for the "skip fetch" optimization. +-- This optimization is usable when no data from the underlying table is +-- needed. Use a temp table so it is only visible to this backend and +-- vacuum may reliably mark all blocks in the table all visible in the +-- visibility map. +CREATE TEMP TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10); +INSERT INTO skip_fetch SELECT i % 3, i FROM generate_series(0,30) i; +CREATE INDEX ON skip_fetch(a); +VACUUM (ANALYZE) skip_fetch; +SET enable_indexonlyscan = off; +set enable_indexscan = off; +SET enable_seqscan = off; +EXPLAIN (COSTS OFF) +SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL; + QUERY PLAN +--------------------------------------------------------- + Nested Loop Anti Join + -> Seq Scan on skip_fetch t1 + -> Materialize + -> Bitmap Heap Scan on skip_fetch t2 + Recheck Cond: (a = 1) + -> Bitmap Index Scan on skip_fetch_a_idx + Index Cond: (a = 1) +(7 rows) + +SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL; + a +--- +(0 rows) + +RESET enable_indexonlyscan; +RESET enable_indexscan; +RESET enable_seqscan; diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql index 923e7c5549..b73ce67bd2 100644 --- a/src/test/regress/sql/join.sql +++ b/src/test/regress/sql/join.sql @@ -2904,3 +2904,29 @@ where exists (select 1 from j3 and t1.unique1 < 1; drop table j3; + +-- Exercise the "skip fetch" Bitmap Heap Scan optimization when candidate +-- tuples are discarded. This may occur when: +-- 1. A join doesn't require all inner tuples to be scanned for each outer +-- tuple, and +-- 2. The inner side is scanned using a bitmap heap scan, and +-- 3. The bitmap heap scan is eligible for the "skip fetch" optimization. +-- This optimization is usable when no data from the underlying table is +-- needed. Use a temp table so it is only visible to this backend and +-- vacuum may reliably mark all blocks in the table all visible in the +-- visibility map. +CREATE TEMP TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10); +INSERT INTO skip_fetch SELECT i % 3, i FROM generate_series(0,30) i; +CREATE INDEX ON skip_fetch(a); +VACUUM (ANALYZE) skip_fetch; + +SET enable_indexonlyscan = off; +set enable_indexscan = off; +SET enable_seqscan = off; +EXPLAIN (COSTS OFF) +SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL; +SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL; + +RESET enable_indexonlyscan; +RESET enable_indexscan; +RESET enable_seqscan; -- 2.34.1