On Thu, May 2, 2024 at 5:37 PM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:
>
>
>
> On 4/24/24 22:46, Melanie Plageman wrote:
> > On Tue, Apr 23, 2024 at 6:43 PM Tomas Vondra
> > <tomas.von...@enterprisedb.com> wrote:
> >>
> >> On 4/23/24 18:05, Melanie Plageman wrote:
> >>> The patch with a fix is attached. I put the test in
> >>> src/test/regress/sql/join.sql. It isn't the perfect location because
> >>> it is testing something exercisable with a join but not directly
> >>> related to the fact that it is a join. I also considered
> >>> src/test/regress/sql/select.sql, but it also isn't directly related to
> >>> the query being a SELECT query. If there is a better place for a test
> >>> of a bitmap heap scan edge case, let me know.
> >>
> >> I don't see a problem with adding this to join.sql - why wouldn't this
> >> count as something related to a join? Sure, it's not like this code
> >> matters only for joins, but if you look at join.sql that applies to a
> >> number of other tests (e.g. there are a couple btree tests).
> >
> > I suppose it's true that other tests in this file use joins to test
> > other code. I guess if we limited join.sql to containing tests of join
> > implementation, it would be rather small. I just imagined it would be
> > nice if tests were grouped by what they were testing -- not how they
> > were testing it.
> >
> >> That being said, it'd be good to explain in the comment why we're
> >> testing this particular plan, not just what the plan looks like.
> >
> > You mean I should explain in the test comment why I included the
> > EXPLAIN plan output? (because it needs to contain a bitmapheapscan to
> > actually be testing anything)
> >
>
> No, I meant that the comment before the test describes a couple
> requirements the plan needs to meet (no need to process all inner
> tuples, bitmapscan eligible for skip_fetch on outer side, ...), but it
> does not explain why we're testing that plan.
>
> I could get to that by doing git-blame to see what commit added this
> code, and then read the linked discussion. Perhaps that's enough, but
> maybe the comment could say something like "verify we properly discard
> tuples on rescans" or something like that?

Attached is v3. I didn't use your exact language because the test
wouldn't actually verify that we properly discard the tuples. Whether
or not the empty tuples are all emitted, it just resets the counter to
0. I decided to go with "exercise" instead.

- Melanie
From 0f10d304fc7dcce99622f348183f36a8062e70c6 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Fri, 10 May 2024 14:52:34 -0400
Subject: [PATCH v3] BitmapHeapScan: Remove incorrect assert

04e72ed617be pushed the skip fetch optimization (allowing bitmap heap
scans to operate like index-only scans if none of the underlying data is
needed) into heap AM-specific bitmap heap scan code.

04e72ed617be added an assert that all tuples in blocks eligible for the
optimization had been NULL-filled and emitted by the end of the scan.
This assert is incorrect when not all tuples need be scanned to execute
the query; for example: a join in which not all inner tuples need to be
scanned before skipping to the next outer tuple.

Author: Melanie Plageman
Reviewed-by: Richard Guo, Tomas Vondra
Discussion: https://postgr.es/m/CAMbWs48orzZVXa7-vP9Nt7vQWLTE04Qy4PePaLQYsVNQgo6qRg%40mail.gmail.com
---
 src/backend/access/heap/heapam.c   |  4 ++--
 src/test/regress/expected/join.out | 38 ++++++++++++++++++++++++++++++
 src/test/regress/sql/join.sql      | 26 ++++++++++++++++++++
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 4be0dee4de..8600c22515 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1184,7 +1184,7 @@ heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params,
 		scan->rs_vmbuffer = InvalidBuffer;
 	}
 
-	Assert(scan->rs_empty_tuples_pending == 0);
+	scan->rs_empty_tuples_pending = 0;
 
 	/*
 	 * The read stream is reset on rescan. This must be done before
@@ -1216,7 +1216,7 @@ heap_endscan(TableScanDesc sscan)
 	if (BufferIsValid(scan->rs_vmbuffer))
 		ReleaseBuffer(scan->rs_vmbuffer);
 
-	Assert(scan->rs_empty_tuples_pending == 0);
+	scan->rs_empty_tuples_pending = 0;
 
 	/*
 	 * Must free the read stream before freeing the BufferAccessStrategy.
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 0246d56aea..8829bd76e7 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -7924,3 +7924,41 @@ where exists (select 1 from j3
 (13 rows)
 
 drop table j3;
+-- Exercise the "skip fetch" Bitmap Heap Scan optimization when candidate
+-- tuples are discarded. This may occur when:
+--   1. A join doesn't require all inner tuples to be scanned for each outer
+--      tuple, and
+--   2. The inner side is scanned using a bitmap heap scan, and
+--   3. The bitmap heap scan is eligible for the "skip fetch" optimization.
+--      This optimization is usable when no data from the underlying table is
+--      needed. Use a temp table so it is only visible to this backend and
+--      vacuum may reliably mark all blocks in the table all visible in the
+--      visibility map.
+CREATE TEMP TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10);
+INSERT INTO skip_fetch SELECT i % 3, i FROM generate_series(0,30) i;
+CREATE INDEX ON skip_fetch(a);
+VACUUM (ANALYZE) skip_fetch;
+SET enable_indexonlyscan = off;
+set enable_indexscan = off;
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF)
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on skip_fetch t1
+   ->  Materialize
+         ->  Bitmap Heap Scan on skip_fetch t2
+               Recheck Cond: (a = 1)
+               ->  Bitmap Index Scan on skip_fetch_a_idx
+                     Index Cond: (a = 1)
+(7 rows)
+
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+ a 
+---
+(0 rows)
+
+RESET enable_indexonlyscan;
+RESET enable_indexscan;
+RESET enable_seqscan;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 923e7c5549..b73ce67bd2 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2904,3 +2904,29 @@ where exists (select 1 from j3
       and t1.unique1 < 1;
 
 drop table j3;
+
+-- Exercise the "skip fetch" Bitmap Heap Scan optimization when candidate
+-- tuples are discarded. This may occur when:
+--   1. A join doesn't require all inner tuples to be scanned for each outer
+--      tuple, and
+--   2. The inner side is scanned using a bitmap heap scan, and
+--   3. The bitmap heap scan is eligible for the "skip fetch" optimization.
+--      This optimization is usable when no data from the underlying table is
+--      needed. Use a temp table so it is only visible to this backend and
+--      vacuum may reliably mark all blocks in the table all visible in the
+--      visibility map.
+CREATE TEMP TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10);
+INSERT INTO skip_fetch SELECT i % 3, i FROM generate_series(0,30) i;
+CREATE INDEX ON skip_fetch(a);
+VACUUM (ANALYZE) skip_fetch;
+
+SET enable_indexonlyscan = off;
+set enable_indexscan = off;
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF)
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+
+RESET enable_indexonlyscan;
+RESET enable_indexscan;
+RESET enable_seqscan;
-- 
2.34.1

Reply via email to