On Sun, 2009-12-27 at 20:11 +0100, Andres Freund wrote: > On Tuesday 22 December 2009 11:42:30 Simon Riggs wrote: > > On Tue, 2009-12-22 at 03:19 +0100, Andres Freund wrote: > > > On Monday 21 December 2009 16:48:52 Simon Riggs wrote: > > > > Giving the drop database a snapshot is not the answer. I expect Andres > > > > to be able to fix this with a simple patch that would not effect the > > > > case of normal running. > > > > > > Actually its less simply than I had thought at first - I don't think the > > > code ever handled that correctly. > > > I might be wrong there, my knowledge of the involved code is a bit > > > sparse... The whole conflict resolution builds on the concept of waiting > > > for an VXid, but an idle backend does not have a valid vxid. Thats > > > correct, right? > > I don't see any mileage in making Startup process wait for an idle > > session, so no real reason to wait for others either. > So here is a small patch implementing that behaviour.
On further testing, I received a re-connection from an automatic session retry. That shouldn't have happened, but it indicates the need for locking around the conflict handler. I had understood that to be placed elsewhere but that seems wrong now. This is a low priority item, so looking for a quick fix to allow time on other areas. Any objections? -- Simon Riggs www.2ndQuadrant.com
*** a/src/backend/commands/dbcommands.c --- b/src/backend/commands/dbcommands.c *************** *** 1944,1950 **** dbase_redo(XLogRecPtr lsn, XLogRecord *record) --- 1944,1958 ---- dst_path = GetDatabasePath(xlrec->db_id, xlrec->tablespace_id); if (InHotStandby) + { + /* + * Lock database while we resolve conflicts to ensure that InitPostgres() + * cannot fully re-execute concurrently. This avoids backends re-connecting + * automatically to same database, which can happen in some cases. + */ + LockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock); ResolveRecoveryConflictWithDatabase(xlrec->db_id); + } /* Drop pages for this database that are in the shared buffer cache */ DropDatabaseBuffers(xlrec->db_id); *************** *** 1960,1965 **** dbase_redo(XLogRecPtr lsn, XLogRecord *record) --- 1968,1984 ---- ereport(WARNING, (errmsg("some useless files may be left behind in old database directory \"%s\"", dst_path))); + + if (InHotStandby) + { + /* + * Release locks prior to commit. XX There is a race condition here that may allow + * backends to reconnect, but the window for this is small because the gap between + * here and commit is mostly fairly small and it is unlikely that people will be + * dropping databases that we are trying to connect to anyway. + */ + UnlockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock); + } } else elog(PANIC, "dbase_redo: unknown op code %u", info);
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers