On 2024-11-09 03:40, Peter Geoghegan wrote:
On Fri, Nov 8, 2024 at 8:25 AM Masahiro Ikeda
<ikeda...@oss.nttdata.com> wrote:
Since I couldn't understand the reason, I have a question: is the
following deletion related to this change?
@@ -770,7 +785,7 @@ _bt_parallel_done(IndexScanDesc scan)
/*
* Should not mark parallel scan done when there's still a
pending
- * primitive index scan (defensive)
+ * primitive index scan
*/
That's a good question.
Prior to this bugfix, the check of so->needPrimScan from within
_bt_parallel_done() was defensive; it wasn't strictly necessary. It
could have been "Assert(!so->needPrimScan)" instead (I guess that I
didn't make it into an assert like this because _bt_parallel_done
worked a little like this, prior to Postgres 17, when we had a
primscan counter instead of the current so->needPrimScan flag). But
that's no longer the case with the bugfix in place; the
so->needPrimScan check really is strictly necessary now.
It's hard to see why this is. Notice that _bt_parallel_seize() will
just return false when another primitive index scan is required. Prior
to this bugfix, we'd seize the parallel scan within _bt_steppage,
which could only succeed when "!so->needPrimScan" (which was actually
verified by an assertion that just got removed). With this bug fix,
nothing stops the so->needPrimScan flag from still being set from back
when we called _bt_readpage for the so->currPos we're using. And so,
and as I said, the check of so->needPrimScan from within
_bt_parallel_done() became strictly necessary (not just defensive) --
since so->needPrimScan definitely can be set when we call
_bt_parallel_done, and we definitely don't want to *really* end the
whole top-level scan when it is set (we must never confuse the end of
one primitive index scan with the end of the whole top-level parallel
index scan).
I understand, thanks to your explanation.
Now, there is a case where _bt_readnextpage() calls
_bt_parallel_seize(),
_bt_readpage() sets so->needPrimScan=true, and _bt_parallel_done() is
called
with so->needPrimScan=true. Prior to this bugfix, _bt_parallel_seize()
was
called after _bt_readpage() sets so->needPrimScan=true, and it just
returned
false without calling _bt_parallel_done().
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION