[PATCHES] synchronized scan: reset state at end of scan

Jeff Davis Sat, 31 May 2008 17:49:47 -0700

I was looking into supporting synchronized scans for VACUUM, and I
noticed that we currently don't remove the reported scan location as
this post suggests:


http://archives.postgresql.org/pgsql-patches/2007-06/msg00047.php

There was some debate about whether it should be done, but I thought
that the solution here seemed to satisfy most people's concerns:

http://archives.postgresql.org/pgsql-patches/2007-06/msg00052.php

I attached a patch that implements the above idea.

The benefit is that if you have a singular scan, it will start at the
beginning of the heap and not at some arbitrary place.

The cost is that it's not 100% guaranteed that the location entry will
be removed. The backend that started the scan could abort, die, etc.
Also, in rare situations there is a small window created where a new
scan might not be synchronized with existing concurrent scans. This is
really only an issue when issuing queries with limits or issuing two
scans that progress at different rates. I think it's somewhat reasonable
to say that if you're doing either of those things, you shouldn't be too
surprised if it messes with synchronized scanning. I have more
information in the comments in the attached patch.

I do not have a strong opinion about whether this patch is applied or
not. I am submitting this just for the sake of completeness.

Regards,
        Jeff Davis

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c1afff3..b5bf780 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1234,6 +1234,9 @@ heap_endscan(HeapScanDesc scan)
 
 	if (scan->rs_strategy != NULL)
 		FreeAccessStrategy(scan->rs_strategy);
+	
+	if (scan->rs_syncscan)
+		ss_reset_location(scan->rs_rd);
 
 	pfree(scan);
 }
diff --git a/src/backend/access/heap/syncscan.c b/src/backend/access/heap/syncscan.c
index dfc7265..5b2aa66 100644
--- a/src/backend/access/heap/syncscan.c
+++ b/src/backend/access/heap/syncscan.c
@@ -34,6 +34,8 @@
  * INTERFACE ROUTINES
  *		ss_get_location		- return current scan location of a relation
  *		ss_report_location	- update current scan location
+ *		ss_reset_location	- reset location to zero if started by this 
+ *							  process
  *
  *
  * Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
@@ -91,6 +93,7 @@ bool		trace_syncscan = false;
 typedef struct ss_scan_location_t
 {
 	RelFileNode relfilenode;	/* identity of a relation */
+	pid_t pid; 					/* PID of the process that set the scan location */
 	BlockNumber location;		/* last-reported location in the relation */
 } ss_scan_location_t;
 
@@ -161,6 +164,7 @@ SyncScanShmemInit(void)
 			item->location.relfilenode.spcNode = InvalidOid;
 			item->location.relfilenode.dbNode = InvalidOid;
 			item->location.relfilenode.relNode = InvalidOid;
+			item->location.pid = 0;
 			item->location.location = InvalidBlockNumber;
 
 			item->prev = (i > 0) ?
@@ -212,6 +216,13 @@ ss_search(RelFileNode relfilenode, BlockNumber location, bool set)
 			else if (set)
 				item->location.location = location;
 
+			/*
+			 * If we are starting a new scan, set the pid to that of the 
+			 * current process.
+			 */
+			if(!set)
+				item->location.pid = MyProcPid;
+				
 			/* Move the entry to the front of the LRU list */
 			if (item != scan_locations->head)
 			{
@@ -319,3 +330,53 @@ ss_report_location(Relation rel, BlockNumber location)
 #endif
 	}
 }
+
+/*
+ * ss_reset_location --- reset location to zero if started by this process
+ *
+ * When a scan finishes, it can remove itself from the list of 
+ * scan_locations. This means that when scans are no longer being run 
+ * concurrently, new scans will again be started at the beginning of the 
+ * heap. This is not required for correctness.
+ *
+ * The scan_location entry holds the pid of the most recently started scan,
+ * and when a scan finishes, it resets the entry to zero if and only if the
+ * pid in the entry matches that of the current process.
+ *
+ * When concurrent scans are active, it is unlikely that the most
+ * recently started scan will finish first, so the hint will usually not 
+ * be removed unless this is the only scan on that relation. If the scans 
+ * are merely started at nearly the same time, and the last one to start
+ * happens to finish first, there would be little benefit from
+ * synchronizing with a nearly-complete scan, anyway.
+ *
+ * In the rare case that the most recently started scan does finish 
+ * significantly before older concurrent scans (such as in the case of a 
+ * LIMIT clause), the hint will most likely be quickly filled by a location
+ * report from one of those older scans. If another scan begins during that
+ * narrow window, it will not have the benefit of being synchronized with 
+ * the older concurrent scans.
+ * 
+ * If we can't get the lock without waiting, then we do nothing.
+ */
+void ss_reset_location(Relation rel)
+{
+	ss_lru_item_t *item;
+
+	if(LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE)) {
+		item = scan_locations->head;
+
+		while(item != NULL) {
+			if(item->location.pid == MyProcPid &&
+				RelFileNodeEquals(item->location.relfilenode, rel->rd_node)) {
+				/* Concurrent scans may still be active on this relation, 
+				 * we only know that this scan has finished. So, we just 
+				 * set the location back to zero rather than remove it.
+				 */
+				item->location.location = 0;
+			}
+			item = item->next;
+		}
+		LWLockRelease(SyncScanLock);
+	}
+}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index f29f835..7950c33 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -133,6 +133,7 @@ extern void heap_get_root_tuples(Page page, OffsetNumber *root_offsets);
 /* in heap/syncscan.c */
 extern void ss_report_location(Relation rel, BlockNumber location);
 extern BlockNumber ss_get_location(Relation rel, BlockNumber relnblocks);
+extern void ss_reset_location(Relation rel);
 extern void SyncScanShmemInit(void);
 extern Size SyncScanShmemSize(void);

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

[PATCHES] synchronized scan: reset state at end of scan

Reply via email to