Hi,

On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote:
> Hi,
> 
> Problem 1: concurrency:
> 
> Testcase:
> 
> Session 1:
> CREATE TABLE test_drop_concurrently(id serial primary key, data int);
> INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
> 100000);
> CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
> BEGIN;
> EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
> SELECT * FROM test_drop_concurrently WHERE data = 34343;
> (1 row)
> 
> Session 2:
> BEGIN;
> SELECT * FROM test_drop_concurrently WHERE data = 34343;
> 
> Session 3:
> DROP INDEX CONCURRENTLY test_drop_concurrently_data;
> (in-progress)
> 
> Session 2:
> INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
> 100000);
> COMMIT;
> 
> Session 1:
> SELECT * FROM test_drop_concurrently WHERE data = 34343;
> (1 row)
> SET enable_bitmapscan = false;
> SET enable_indexscan = false;
> SELECT * FROM test_drop_concurrently WHERE data = 34343;
> (2 rows)
> 
> Explanation:
> index_drop does:
>               indexForm->indisvalid = false;  /* make unusable for queries */
>               indexForm->indisready = false;  /* make invisible to changes */
> 
> Setting indisready = false is problematic because that prevents index
> updates which in turn breaks READ COMMITTED semantics. I think there need
> to be one more phase that waits for concurrent users of the index to
> finish before setting indisready = false.
The attached patch fixes this issue. Haven't looked at the other one in detail 
yet.

Greetings,

Andres
-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 8891fcd59496483793aecc21a096fc0119369e73 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 25 Sep 2012 01:41:29 +0200
Subject: [PATCH] Fix concurrency issues in concurrent index drops

Previously a DROP INDEX CONCURRENTLY started with unsetting indisvalid *and*
indisready. Thats problematic if some transaction is still looking at the index
and another transction makes changes. See the example below.

Now we do the drop in three stages, just as a concurrent index build. First
unset indivalid, wait, unset indisready, wait, drop index.

Example:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
100000);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 100000);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)
---
 src/backend/catalog/index.c | 99 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 84 insertions(+), 15 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a61b424..3e1794f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1318,6 +1318,10 @@ index_drop(Oid indexId, bool concurrent)
 	 * table lock strong enough to prevent all queries on the table from
 	 * proceeding until we commit and send out a shared-cache-inval notice
 	 * that will make them update their index lists.
+	 *
+	 * In the concurrent case we make sure that nobody can be looking at the
+	 * indexes by dropping the index in multiple steps, so we don't need a full
+	 * fledged AccessExlusiveLock yet.
 	 */
 	heapId = IndexGetRelation(indexId, false);
 	if (concurrent)
@@ -1338,7 +1342,19 @@ index_drop(Oid indexId, bool concurrent)
 
 	/*
 	 * Drop Index concurrently is similar in many ways to creating an index
-	 * concurrently, so some actions are similar to DefineIndex()
+	 * concurrently, so some actions are similar to DefineIndex() just in the
+	 * reverse order.
+	 *
+	 * First we unset indisvalid so queries starting afterwards don't use the
+	 * index to answer queries anymore. We have to keep indisready = true
+	 * though so transactions that are still using the index can continue to
+	 * see valid index contents. E.g. when they are using READ COMMITTED mode,
+	 * and another transactions that started later commits makes changes and
+	 * commits, they need to see those new tuples in the index.
+	 *
+	 * After all transactions that could possibly have used it for queries
+	 * ended we can unset indisready and wait till nobody could be updating it
+	 * anymore.
 	 */
 	if (concurrent)
 	{
@@ -1357,21 +1373,14 @@ index_drop(Oid indexId, bool concurrent)
 			elog(ERROR, "cache lookup failed for index %u", indexId);
 		indexForm = (Form_pg_index) GETSTRUCT(tuple);
 
-		indexForm->indisvalid = false;	/* make unusable for queries */
-		indexForm->indisready = false;	/* make invisible to changes */
+		indexForm->indisvalid = false;	/* make unusable for new queries */
+		/* we keep indisready == true so it still gets updated */
 
 		simple_heap_update(indexRelation, &tuple->t_self, tuple);
 		CatalogUpdateIndexes(indexRelation, tuple);
 
 		heap_close(indexRelation, RowExclusiveLock);
 
-		/*
-		 * Invalidate the relcache for the table, so that after this
-		 * transaction we will refresh the index list. Forgetting just the
-		 * index is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
 		/* save lockrelid and locktag for below, then close but keep locks */
 		heaprelid = userHeapRelation->rd_lockInfo.lockRelId;
 		SET_LOCKTAG_RELATION(heaplocktag, heaprelid.dbId, heaprelid.relId);
@@ -1383,8 +1392,8 @@ index_drop(Oid indexId, bool concurrent)
 		/*
 		 * For a concurrent drop, it's important to make the catalog entries
 		 * visible to other transactions before we drop the index. The index
-		 * will be marked not indisvalid, so that no one else tries to either
-		 * insert into it or use it for queries.
+		 * will be marked not indisvalid, so that no one else tries to use it
+		 * for queries.
 		 *
 		 * We must commit our current transaction so that the index update
 		 * becomes visible; then start another.  Note that all the data
@@ -1399,6 +1408,7 @@ index_drop(Oid indexId, bool concurrent)
 		LockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
 		LockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);
 
+		/* XXX: Why do we have to do that here, who pushed which snap? */
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 		StartTransactionCommand();
@@ -1431,11 +1441,70 @@ index_drop(Oid indexId, bool concurrent)
 		}
 
 		/*
+		 * now we are sure that nobody uses the index for queries, they just
+		 * might have it opened for updating it. So now we can unset
+		 * ->indisready and wait till nobody could update the index anymore.
+		 */
+		indexRelation = heap_open(IndexRelationId, RowExclusiveLock);
+
+		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
+		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
+
+		tuple = SearchSysCacheCopy1(INDEXRELID,
+									ObjectIdGetDatum(indexId));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for index %u", indexId);
+		indexForm = (Form_pg_index) GETSTRUCT(tuple);
+
+		/* we hold a lock, but make sure we haven't screwed anything up */
+		Assert(indexForm->indisvalid == false);
+		indexForm->indisready = false;	/* don't update index anymore */
+
+		simple_heap_update(indexRelation, &tuple->t_self, tuple);
+		CatalogUpdateIndexes(indexRelation, tuple);
+
+		heap_close(indexRelation, RowExclusiveLock);
+
+		/*
+		 * we close the relations again, were still holding the session lock
+		 * from above though!
+		 */
+		heap_close(userHeapRelation, NoLock);
+		index_close(userIndexRelation, NoLock);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this
+		 * transaction we will refresh the index list. Forgetting just the
+		 * index is not enough.
+		 */
+		CacheInvalidateRelcache(userHeapRelation);
+
+		/*
+		 * just as with indisvalid = false we need to make sure indisready =
+		 * false is visible for everyone.
+		 */
+		CommitTransactionCommand();
+		StartTransactionCommand();
+
+		/*
+		 * Wait till everyone that saw indisready = true finished so we can
+		 * finally really remove the index. The logic here is the same as
+		 * above.
+		 */
+		old_lockholders = GetLockConflicts(&heaplocktag, AccessExclusiveLock);
+
+		while (VirtualTransactionIdIsValid(*old_lockholders))
+		{
+			VirtualXactLock(*old_lockholders, true);
+			old_lockholders++;
+		}
+
+		/*
 		 * Re-open relations to allow us to complete our actions.
 		 *
-		 * At this point, nothing should be accessing the index, but lets
-		 * leave nothing to chance and grab AccessExclusiveLock on the index
-		 * before the physical deletion.
+		 * At this point, nothing should be accessing the index, but lets leave
+		 * nothing to chance and grab AccessExclusiveLock on the index before
+		 * the physical deletion.
 		 */
 		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
 		userIndexRelation = index_open(indexId, AccessExclusiveLock);
-- 
1.7.12.289.g0ce9864.dirty

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to