I bumped into a bug in the new buffering GiST build code. I did this:

create table gisttest (t text);
insert into gisttest select a||'fooooooooooooooooooooooooooooooooooooooooooooooo' from generate_series(1,10000000) a;

create index i_gisttest on gisttest using gist (t collate "C") WITH (fillfactor=10);

After a while, this segfaults.

I debugged this, and traced this into a bug in the gistRelocateBuildBuffersOnSplit() function. It splits a node buffer into two (or more) node buffers, when the corresponding index page is split. It first makes a copy of the old GISTNodeBuffer struct, and then repurposes the struct to hold the new buffer for the new leftmost page of the split. The temporary copy of the old page is only needed while the function moves all the tuples from the old buffer into the new buffers, after that it can be discarded. The temporary copy of the struct is kept in the stack. However, the temporary copy can find its way into the list of "loaded buffers". After the function exits, and it's time to unload all the currently loaded buffers, you get a segfault because the pointer now points to garbage. I think the reason this doesn't always crash is that the copy in the stack usually still happens to be valid enough that gistUnloadNodeBuffer() just skips it.

I'll commit the attached fix for that, marking the temporary copy explicitly, so that we can avoid leaking it into any long-lived data structures.

After fixing that, however, I'm now getting another error, much later in the build process:

ERROR:  failed to re-find parent for block 123002
STATEMENT: create index i_gisttest on gisttest using gist (t collate "C") WITH (fillfactor=10);

I'll continue debugging that, but it seems to be another, unrelated, bug.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/src/backend/access/gist/gistbuildbuffers.c b/src/backend/access/gist/gistbuildbuffers.c
index 34a12bc..a40b838 100644
--- a/src/backend/access/gist/gistbuildbuffers.c
+++ b/src/backend/access/gist/gistbuildbuffers.c
@@ -148,8 +148,10 @@ gistGetNodeBuffer(GISTBuildBuffers *gfbb, GISTSTATE *giststate,
 		int			level;
 		MemoryContext oldcxt = MemoryContextSwitchTo(gfbb->context);
 
-		nodeBuffer->pageBuffer = NULL;
+		/* nodeBuffer->nodeBlocknum is the hash key and was filled in already */
 		nodeBuffer->blocksCount = 0;
+		nodeBuffer->pageBlocknum = InvalidBlockNumber;
+		nodeBuffer->pageBuffer = NULL;
 		nodeBuffer->queuedForEmptying = false;
 
 		/*
@@ -244,11 +246,15 @@ gistAllocateNewPageBuffer(GISTBuildBuffers *gfbb)
 }
 
 /*
- * Add specified block number into loadedBuffers array.
+ * Add specified buffer into loadedBuffers array.
  */
 static void
 gistAddLoadedBuffer(GISTBuildBuffers *gfbb, GISTNodeBuffer *nodeBuffer)
 {
+	/* Never add a temporary buffer to the array */
+	if (nodeBuffer->isTemp)
+		return;
+
 	/* Enlarge the array if needed */
 	if (gfbb->loadedBuffersCount >= gfbb->loadedBuffersLen)
 	{
@@ -591,7 +597,7 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers *gfbb, GISTSTATE *giststate,
 				i;
 	GISTENTRY	entry[INDEX_MAX_KEYS];
 	bool		isnull[INDEX_MAX_KEYS];
-	GISTNodeBuffer nodebuf;
+	GISTNodeBuffer oldBuf;
 	ListCell   *lc;
 
 	/* If the splitted page doesn't have buffers, we have nothing to do. */
@@ -619,16 +625,14 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers *gfbb, GISTSTATE *giststate,
 	 * read the tuples straight from the heap instead of the root buffer.
 	 */
 	Assert(blocknum != GIST_ROOT_BLKNO);
-	memcpy(&nodebuf, nodeBuffer, sizeof(GISTNodeBuffer));
+	memcpy(&oldBuf, nodeBuffer, sizeof(GISTNodeBuffer));
+	oldBuf.isTemp = true;
 
 	/* Reset the old buffer, used for the new left page from now on */
 	nodeBuffer->blocksCount = 0;
 	nodeBuffer->pageBuffer = NULL;
 	nodeBuffer->pageBlocknum = InvalidBlockNumber;
 
-	/* Reassign pointer to the saved copy. */
-	nodeBuffer = &nodebuf;
-
 	/*
 	 * Allocate memory for information about relocation buffers.
 	 */
@@ -675,7 +679,7 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers *gfbb, GISTSTATE *giststate,
 	 * Loop through all index tuples on the buffer on the splitted page,
 	 * moving them to buffers on the new pages.
 	 */
-	while (gistPopItupFromNodeBuffer(gfbb, nodeBuffer, &itup))
+	while (gistPopItupFromNodeBuffer(gfbb, &oldBuf, &itup))
 	{
 		float		sum_grow,
 					which_grow[INDEX_MAX_KEYS];
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 0d6b625..481ef5f 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -326,6 +326,9 @@ typedef struct
 	/* is this buffer queued for emptying? */
 	bool		queuedForEmptying;
 
+	/* is this a temporary copy, not in the hash table? */
+	bool		isTemp;
+
 	struct GISTBufferingInsertStack *path;
 } GISTNodeBuffer;
 
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to