Re: [HACKERS] snapbuild woes

Andres Freund Mon, 12 Dec 2016 14:34:24 -0800

On 2016-12-12 23:27:30 +0100, Petr Jelinek wrote:
> On 12/12/16 22:42, Andres Freund wrote:
> > Hi,
> > 
> > On 2016-12-10 23:10:19 +0100, Petr Jelinek wrote:
> >> Hi,
> >> The first one is an outright bug, which has to do with how we track running
> >> transactions. What snapbuild basically does while doing initial snapshot
> >> is read the xl_running_xacts record, store the list of running txes and
> >> then wait until they all finish. The problem with this is that
> >> xl_running_xacts does not ensure that it only logs transactions that are
> >> actually still running (to avoid locking PGPROC) so there might be xids
> >> in xl_running_xacts that already committed before it was logged.
> > 
> > I don't think that's actually true?  Notice how LogStandbySnapshot()
> > only releases the lock *after* the LogCurrentRunningXacts() iff
> > wal_level >= WAL_LEVEL_LOGICAL.  So the explanation for the problem you
> > observed must actually be a bit more complex :(
> > 
> 
> Hmm, interesting, I did see the transaction commit in the WAL before the
> xl_running_xacts that contained the xid as running. I have only seen it on
> a production system, though; I didn't really manage to reproduce it
> locally.


I suspect the reason for that is that RecordTransactionCommit() doesn't
conflict with ProcArrayLock in the first place - only
ProcArrayEndTransaction() does.  So such transactions are still running in
the PGPROC sense, just not in the crash-recovery sense...
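
A toy model of that ordering, as a sketch only: this is not PostgreSQL
code. The Python names below merely mirror the functions mentioned above,
and the list-based "WAL" and set-based "proc array" are assumptions made
for illustration. It shows how a commit record can precede a running-xacts
record that still lists the same xid, even though the snapshot is logged
while the lock is held.

```python
import threading

# Hypothetical stand-ins, not PostgreSQL data structures.
wal = []                             # ordered (record_type, payload) entries
proc_array = set()                   # xids still advertised as running
proc_array_lock = threading.Lock()   # plays the role of ProcArrayLock


def record_transaction_commit(xid):
    # Writes the commit record without taking proc_array_lock.
    wal.append(("commit", xid))


def proc_array_end_transaction(xid):
    # Only this step takes the lock and removes the xid.
    with proc_array_lock:
        proc_array.discard(xid)


def log_standby_snapshot():
    # Collects and logs the running xids while holding the lock.
    with proc_array_lock:
        wal.append(("running_xacts", sorted(proc_array)))


proc_array.add(100)
record_transaction_commit(100)    # commit record reaches the WAL first
log_standby_snapshot()            # xid 100 is still in the proc array
proc_array_end_transaction(100)   # xid is removed only now
```

In this model the lock does its job (the logged list matches the proc
array exactly), yet the WAL ends up with the commit of xid 100 ahead of a
running-xacts record that still contains it.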

Andres


