On Thu, May 19, 2011 at 01:52:46PM +0100, Leonardo Francalanci wrote:
> > I'd guess some WAL record arising from the post-crash master restart makes the
> > standby do so.  When a crash isn't involved, the commit or abort record is that
> > signal.  You could test and find out how it happens after a master crash with a
> > procedure like this:
> > 
> > 1. Start a master and standby on the same machine.
> > 2. Connect to master; CREATE TABLE t(); BEGIN; ALTER TABLE t ADD c int;
> > 3. kill -9 -`head -n1 $master_PGDATA/postmaster.pid`
> > 4. Connect to standby and confirm that t is still locked.
> > 5. Attach debugger to standby startup process and set breakpoints on
> > StandbyReleaseLocks and StandbyReleaseLocksMany.
> > 6. Restart master.
> 
> 
> Well yes, based on the test the stack is something like:
> 
> StandbyReleaseLocksMany
> StandbyReleaseOldLocks 
> ProcArrayApplyRecoveryInfo  
> xlog_redo
> 
> It's not very clear to me what ProcArrayApplyRecoveryInfo does (not too
> familiar with the standby part I guess) but I see it's called by xlog_redo in
> the "info == XLOG_CHECKPOINT_SHUTDOWN" case and by StartupXLOG.
> 
> But I don't know if calling ResetUnloggedRelations before
> the call to ProcArrayApplyRecoveryInfo in xlog_redo makes sense...
> if it makes sense, it would solve the problem of the stray files in
> the master crashing case I guess?

It would solve the problem, but it would mean resetting unlogged relations on
the standby at every shutdown checkpoint.  That's probably not a performance
problem, but it is a hack.  Offhand, I'd add a new smgr WAL record issued by
ResetUnloggedRelations() when called with UNLOGGED_RELATION_CLEANUP.  Another,
simpler, idea is to split XLOG_CHECKPOINT_SHUTDOWN into XLOG_CHECKPOINT_SHUTDOWN
and XLOG_CHECKPOINT_END_OF_RECOVERY, mirroring CreateCheckPoint()'s distinction.
(Given that I regularly lack good taste, you might want to wait for other people
to weigh in before spending too much time on that.)
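
To make the second idea concrete, here is a rough, self-contained toy model of
the redo dispatch after such a split.  Everything in it is hypothetical: the
toy_* names, the enum values and the counters are stand-ins for illustration,
not PostgreSQL's actual definitions, and XLOG_CHECKPOINT_END_OF_RECOVERY does
not exist today.  It only shows the intended behavior: reset unlogged relations
when replaying an end-of-recovery checkpoint, and release stale locks for both
record types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record types; the values are placeholders, not PostgreSQL's. */
enum toy_checkpoint_info {
    TOY_CHECKPOINT_SHUTDOWN = 1,        /* clean master shutdown */
    TOY_CHECKPOINT_END_OF_RECOVERY = 2, /* master finished crash recovery */
    TOY_CHECKPOINT_ONLINE = 3           /* ordinary online checkpoint */
};

/* Toy standby state: counts how often each cleanup action ran. */
struct toy_standby {
    int locks_released;  /* stand-in for StandbyReleaseOldLocks() */
    int unlogged_resets; /* stand-in for ResetUnloggedRelations() */
};

static void toy_release_old_locks(struct toy_standby *s) { s->locks_released++; }
static void toy_reset_unlogged(struct toy_standby *s) { s->unlogged_resets++; }

/* Toy analogue of the checkpoint cases in xlog_redo(). */
static void toy_xlog_redo(struct toy_standby *s, enum toy_checkpoint_info info)
{
    assert(s != NULL);

    switch (info) {
    case TOY_CHECKPOINT_END_OF_RECOVERY:
        /* Only the post-crash case pays for the unlogged-relation reset. */
        toy_reset_unlogged(s);
        /* fall through: stale locks must go in both cases */
    case TOY_CHECKPOINT_SHUTDOWN:
        toy_release_old_locks(s);
        break;
    case TOY_CHECKPOINT_ONLINE:
        /* Transactions may still be running; nothing to clean up here. */
        break;
    }
}
```

With that shape, a clean shutdown checkpoint leaves unlogged relations alone on
the standby, which avoids the reset-at-every-shutdown-checkpoint hack above.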

> > > > When you promote the standby, though, ShutdownRecoveryTransactionEnvironment()
> > > > releases the locks.
> 
> 
> If I understand the code, ResetUnloggedRelations is called before 
> ShutdownRecoveryTransactionEnvironment, so that part shouldn't be
> an issue.

Seems correct.

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers