On 26.11.2025 09:15, David Geier wrote:
> What we can do is use a global variable. That also makes checking the
> flag a lot easier because we don't need to pass it around through
> multiple abstraction layers.
> 
> What needs to be taken care of though is to only bail from scans that
> are actually initiated from plan nodes. There are many places in the
> code that use e.g. the table AM API directly. We don't want to bail from
> these scans. Without flagging if a scan should bail or not, e.g.
> ScanPgRelation() will return no tuple and therefore relcache code starts
> failing.
> 
> The new patch accounts for that by introducing a new TableScanDescData
> flag SO_OBEY_PARALLEL_WORKER_STOP, which indicates if the scan should
> adhere to the parallel worker stop or not.
> 
> Stopping is broadcasted to all parallel workers via SendProcSignal().
> The stop variable is checked with the new
> CHECK_FOR_PARALLEL_WORKER_STOP() macro.
> 
> In the PoC implementation I've for now only changed nodeSeqScan.c. If
> there's agreement to pursue this approach, I'll change the other places
> as well. Naming can also likely be still improved.

I missed attaching the example I used for testing.

CREATE TABLE bar (col INT);
INSERT INTO bar SELECT generate_series(1, 50000000);
ANALYZE bar;
SET parallel_leader_participation = OFF;
SET synchronize_seqscans = OFF;
EXPLAIN ANALYZE VERBOSE SELECT col FROM bar WHERE col = 1 LIMIT 1;

I disabled synchronize_seqscans to make the test deterministic. I
disabled parallel_leader_participation to make sure one of the workers
finds the first row and quits.

Running with parallel_leader_participation enabled revealed one more
issue: if the leader finds the row before the parallel workers have
started up, the stop signal is lost and the workers don't stop early.

Instead of SendProcSignal() we can store a flag in shared memory that
indicates that the leader already has enough rows. I'll give this
approach a try.
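
A rough sketch of why a shared flag avoids the lost-signal problem,
not the patch itself: the flag is persistent state, so a worker that
starts after the leader has set it still sees it on its first check.
Everything below is hypothetical and standalone (SharedStopState,
shared_stop_request(), worker_scan() are made-up names, and it uses
C11 atomics rather than PostgreSQL's own atomics API and DSM setup).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical shared state. In a real patch this would presumably live
 * in the shared memory segment of the parallel query.
 */
typedef struct SharedStopState
{
    atomic_bool leader_has_enough_rows;
} SharedStopState;

static void
shared_stop_init(SharedStopState *state)
{
    atomic_init(&state->leader_has_enough_rows, false);
}

/* Called by whoever notices that enough rows have been produced. */
static void
shared_stop_request(SharedStopState *state)
{
    atomic_store(&state->leader_has_enough_rows, true);
}

static bool
shared_stop_requested(SharedStopState *state)
{
    return atomic_load(&state->leader_has_enough_rows);
}

/*
 * Simulated worker scan loop. Returns the number of tuples scanned
 * before the stop flag was observed. The flag is checked before every
 * tuple, so a worker starting after the request scans nothing.
 */
static size_t
worker_scan(SharedStopState *state, size_t ntuples)
{
    size_t scanned = 0;

    for (size_t i = 0; i < ntuples; i++)
    {
        if (shared_stop_requested(state))
            break;
        scanned++;
    }
    return scanned;
}
```

With the flag unset, worker_scan() runs through all tuples. Once
shared_stop_request() has been called, any later worker_scan() call
returns immediately, regardless of when that worker was launched.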

--
David Geier


