On Thu, May 19, 2011 at 01:52:46PM +0100, Leonardo Francalanci wrote:
> > I'd guess some WAL record arising from the post-crash master restart makes the
> > standby do so.  When a crash isn't involved, the commit or abort record is that
> > signal.  You could test and find out how it happens after a master crash with a
> > procedure like this:
> >
> > 1. Start a master and standby on the same machine.
> > 2. Connect to master; CREATE TABLE t(); BEGIN; ALTER TABLE t ADD c int;
> > 3. kill -9 -`head -n1 $master_PGDATA/postmaster.pid`
> > 4. Connect to standby and confirm that t is still locked.
> > 5. Attach debugger to standby startup process and set breakpoints on
> >    StandbyReleaseLocks and StandbyReleaseLocksMany.
> > 6. Restart master.
>
> Well yes, based on the test the stack is something like:
>
> StandbyReleaseLocksMany
> StandbyReleaseOldLocks
> ProcArrayApplyRecoveryInfo
> xlog_redo
>
> It's not very clear to me what ProcArrayApplyRecoveryInfo does (not too
> familiar with the standby part I guess) but I see it's called by xlog_redo in
> the "info == XLOG_CHECKPOINT_SHUTDOWN" case and by StartupXLOG.
>
> But I don't know if calling ResetUnloggedRelations before
> the call to ProcArrayApplyRecoveryInfo in xlog_redo makes sense...
> if it makes sense, it would solve the problem of the stray files in
> the master crashing case I guess?

It would solve the problem, but it would mean resetting unlogged relations on
the standby at every shutdown checkpoint.  That's probably not a performance
problem, but it is a hack.  Offhand, I'd add a new smgr WAL record issued by
ResetUnloggedRelations() when called with UNLOGGED_RELATION_CLEANUP.  Another,
simpler, idea is to split XLOG_CHECKPOINT_SHUTDOWN into XLOG_CHECKPOINT_SHUTDOWN
and XLOG_CHECKPOINT_END_OF_RECOVERY, mirroring the distinction CreateCheckPoint()
already draws between CHECKPOINT_IS_SHUTDOWN and CHECKPOINT_END_OF_RECOVERY.
(Given that I regularly lack good taste, you might want to wait for other people
to weigh in before spending too much time on that.)

> > When you promote the standby, though, ShutdownRecoveryTransactionEnvironment()
> > releases the locks.
>
> If I understand the code, ResetUnloggedRelations is called before
> ShutdownRecoveryTransactionEnvironment, so that part shouldn't be
> an issue

Seems correct.

nm
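
For what it's worth, here is a toy sketch of the second idea (splitting the
shutdown checkpoint record).  It is not PostgreSQL code: the Toy* types, the
enum values, and toy_checkpoint_redo() are all made up for illustration, and
XLOG_CHECKPOINT_END_OF_RECOVERY is only the record proposed above.  It also
assumes that both record kinds would keep releasing stale standby locks the way
the shutdown case does today, which I have not checked against the real redo
path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for checkpoint record kinds; not the real xl_info values. */
typedef enum
{
    TOY_CHECKPOINT_ONLINE,
    TOY_CHECKPOINT_SHUTDOWN,        /* master shut down cleanly */
    TOY_CHECKPOINT_END_OF_RECOVERY  /* hypothetical: master finished crash recovery */
} ToyCheckpointKind;

/* Counters standing in for the side effects on the standby. */
typedef struct
{
    int unlogged_resets;    /* times unlogged relations were reset */
    int lock_releases;      /* times stale standby locks were released */
} ToyStandby;

/*
 * Toy redo routine.  Only the end-of-recovery record resets unlogged
 * relations, so an ordinary shutdown checkpoint no longer pays for it.
 * The reset happens before the lock release, as suggested in the thread.
 */
void
toy_checkpoint_redo(ToyStandby *standby, ToyCheckpointKind kind)
{
    assert(standby != NULL);

    switch (kind)
    {
        case TOY_CHECKPOINT_END_OF_RECOVERY:
            standby->unlogged_resets++;
            /* fall through: assumed to release locks like a shutdown record */
        case TOY_CHECKPOINT_SHUTDOWN:
            standby->lock_releases++;
            break;
        case TOY_CHECKPOINT_ONLINE:
            /* an online checkpoint does neither in this toy model */
            break;
    }
}
```

The point of the split is just that the standby can tell "master restarted
after a crash" apart from "master shut down cleanly" and do the unlogged-relation
cleanup only in the former case.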