On Sun, 2009-12-27 at 20:11 +0100, Andres Freund wrote:
> On Tuesday 22 December 2009 11:42:30 Simon Riggs wrote:
> > On Tue, 2009-12-22 at 03:19 +0100, Andres Freund wrote:
> > > On Monday 21 December 2009 16:48:52 Simon Riggs wrote:
> > > > Giving the drop database a snapshot is not the answer. I expect Andres
> > > > to be able to fix this with a simple patch that would not effect the
> > > > case of normal running.
> > >
> > > Actually its less simply than I had thought at first - I don't think the
> > > code ever handled that correctly.
> > > I might be wrong there, my knowledge of the involved code is a bit
> > > sparse... The whole conflict resolution builds on the concept of waiting
> > > for an VXid, but an idle backend does not have a valid vxid. Thats
> > > correct, right?
> > I don't see any mileage in making Startup process wait for an idle
> > session, so no real reason to wait for others either.
> So here is a small patch implementing that behaviour.

On further testing, I received a re-connection from an automatic session
retry. That shouldn't have happened, but it indicates the need for
locking around the conflict handler. I had understood that to be placed
elsewhere but that seems wrong now.

This is a low priority item, so looking for a quick fix to allow time on
other areas.

Any objections?

-- 
 Simon Riggs           www.2ndQuadrant.com
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
***************
*** 1944,1950 **** dbase_redo(XLogRecPtr lsn, XLogRecord *record)
--- 1944,1958 ----
  		dst_path = GetDatabasePath(xlrec->db_id, xlrec->tablespace_id);
  
  		if (InHotStandby)
+ 		{
+ 			/*
+ 			 * Lock database while we resolve conflicts to ensure that InitPostgres()
+ 			 * cannot fully re-execute concurrently. This avoids backends re-connecting
+ 			 * automatically to same database, which can happen in some cases.
+ 			 */
+ 			LockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock);
  			ResolveRecoveryConflictWithDatabase(xlrec->db_id);
+ 		}
  
  		/* Drop pages for this database that are in the shared buffer cache */
  		DropDatabaseBuffers(xlrec->db_id);
***************
*** 1960,1965 **** dbase_redo(XLogRecPtr lsn, XLogRecord *record)
--- 1968,1984 ----
  			ereport(WARNING,
  					(errmsg("some useless files may be left behind in old database directory \"%s\"",
  							dst_path)));
+ 
+ 		if (InHotStandby)
+ 		{
+ 			/*
+ 			 * Release locks prior to commit. XX There is a race condition here that may allow
+ 			 * backends to reconnect, but the window for this is small because the gap between
+ 			 * here and commit is mostly fairly small and it is unlikely that people will be
+ 			 * dropping databases that we are trying to connect to anyway.
+ 			 */
+ 			UnlockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock);
+ 		}
  	}
  	else
  		elog(PANIC, "dbase_redo: unknown op code %u", info);
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to