I was looking into supporting synchronized scans for VACUUM, and I noticed that we currently don't remove the reported scan location as this post suggests:
http://archives.postgresql.org/pgsql-patches/2007-06/msg00047.php There was some debate about whether it should be done, but I thought that the solution here seemed to satisfy most people's concerns: http://archives.postgresql.org/pgsql-patches/2007-06/msg00052.php I attached a patch that implements the above idea. The benefit is that if you have a singular scan, it will start at the beginning of the heap and not at some arbitrary place. The cost is that it's not 100% guaranteed that the location entry will be removed. The backend that started the scan could abort, die, etc. Also, in rare situations there is a small window created where a new scan might not be synchronized with existing concurrent scans. This is really only an issue when issuing queries with limits or issuing two scans that progress at different rates. I think it's somewhat reasonable to say that if you're doing either of those things, you shouldn't be too surprised if it messes with synchronized scanning. I have more information in the comments in the attached patch. I do not have a strong opinion about whether this patch is applied or not. I am submitting this just for the sake of completeness. Regards, Jeff Davis
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index c1afff3..b5bf780 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -1234,6 +1234,9 @@ heap_endscan(HeapScanDesc scan) if (scan->rs_strategy != NULL) FreeAccessStrategy(scan->rs_strategy); + + if (scan->rs_syncscan) + ss_reset_location(scan->rs_rd); pfree(scan); } diff --git a/src/backend/access/heap/syncscan.c b/src/backend/access/heap/syncscan.c index dfc7265..5b2aa66 100644 --- a/src/backend/access/heap/syncscan.c +++ b/src/backend/access/heap/syncscan.c @@ -34,6 +34,8 @@ * INTERFACE ROUTINES * ss_get_location - return current scan location of a relation * ss_report_location - update current scan location + * ss_reset_location - reset location to zero if started by this + * process * * * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group @@ -91,6 +93,7 @@ bool trace_syncscan = false; typedef struct ss_scan_location_t { RelFileNode relfilenode; /* identity of a relation */ + pid_t pid; /* PID of the process that set the scan location */ BlockNumber location; /* last-reported location in the relation */ } ss_scan_location_t; @@ -161,6 +164,7 @@ SyncScanShmemInit(void) item->location.relfilenode.spcNode = InvalidOid; item->location.relfilenode.dbNode = InvalidOid; item->location.relfilenode.relNode = InvalidOid; + item->location.pid = 0; item->location.location = InvalidBlockNumber; item->prev = (i > 0) ? @@ -212,6 +216,13 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set) else if (set) item->location.location = location; + /* + * If we are starting a new scan, set the pid to that of the + * current process. + */ + if(!set) + item->location.pid = MyProcPid; + /* Move the entry to the front of the LRU list */ if (item != scan_locations->head) { @@ -319,3 +330,53 @@ ss_report_location(Relation rel, BlockNumber location) #endif } } + +/* + * ss_reset_location --- reset location to zero if started by this process + * + * When a scan finishes, it can remove itself from the list of + * scan_locations. This means that when scans are no longer being run + * concurrently, new scans will again be started at the beginning of the + * heap. This is not required for correctness. + * + * The scan_location entry holds the pid of the most recently started scan, + * and when a scan finishes, it resets the entry to zero if and only if the + * pid in the entry matches that of the current process. + * + * When concurrent scans are active, it is unlikely that the most + * recently started scan will finish first, so the hint will usually not + * be removed unless this is the only scan on that relation. If the scans + * are merely started at nearly the same time, and the last one to start + * happens to finish first, there would be little benefit from + * synchronizing with a nearly-complete scan, anyway. + * + * In the rare case that the most recently started scan does finish + * significantly before older concurrent scans (such as in the case of a + * LIMIT clause), the hint will most likely be quickly filled by a location + * report from one of those older scans. If another scan begins during that + * narrow window, it will not have the benefit of being synchronized with + * the older concurrent scans. + * + * If we can't get the lock without waiting, then we do nothing. + */ +void ss_reset_location(Relation rel) +{ + ss_lru_item_t *item; + + if(LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE)) { + item = scan_locations->head; + + while(item != NULL) { + if(item->location.pid == MyProcPid && + RelFileNodeEquals(item->location.relfilenode, rel->rd_node)) { + /* Concurrent scans may still be active on this relation, + * we only know that this scan has finished. So, we just + * set the location back to zero rather than remove it. + */ + item->location.location = 0; + } + item = item->next; + } + LWLockRelease(SyncScanLock); + } +} diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h index f29f835..7950c33 100644 --- a/src/include/access/heapam.h +++ b/src/include/access/heapam.h @@ -133,6 +133,7 @@ extern void heap_get_root_tuples(Page page, OffsetNumber *root_offsets); /* in heap/syncscan.c */ extern void ss_report_location(Relation rel, BlockNumber location); extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks); +extern void ss_reset_location(Relation rel); extern void SyncScanShmemInit(void); extern Size SyncScanShmemSize(void);
-- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches