On Sun, 2009-09-27 at 14:59 +0300, Heikki Linnakangas wrote:
> The locking in smgr_redo_commit and smgr_redo_abort doesn't look right.
> First of all, smgr_redo_abort is not holding XidGenLock and
> ProcArrayLock while modifying ShmemVariableCache->nextXid and
> ShmemVariableCache->latestCompletedXid, respectively, like
> smgr_redo_commit is. Attached patch fixes that.
> 
> But I think there's a little race condition in there anyway, as they
> remove the finished xids from known-assigned-xids list by calling
> ExpireTreeKnownAssignedTransactionIds, and
> ShmemVariableCache->latestCompletedXid is updated only after that. We're
> not holding ProcArrayLock between those two steps, like we do during
> normal transaction commit. I'm not sure what kind of issues that can
> lead to, but it seems like it can lead to broken snapshots if someone
> calls GetSnapshotData() between those steps.

I confess I didn't know what you were talking about when you wrote this
and was expecting you to have a coffee and retract. I realise now you
meant xact_redo_commit() rather than smgr_ and it makes sense at last.

I've just committed about half of your patch exactly, but not all.

I've avoided making the changes to
ShmemVariableCache->latestCompletedXid directly and moved the update to
ExpireTreeKnownAssignedTransactionIds() which already had access to
max_xid while holding ProcArrayLock. Hopefully that resolves your
comment about race condition.

Also, I noticed that you removed the line 
        TransactionIdAdvance(ShmemVariableCache->nextXid);
in xact_redo_abort(). That looks like an error to me, since this follows
neither the comment nor how it is coded in xact_redo_commit(). Let me
know if there was some other reason for that line removal and I'll make
the change again.

-- 
 Simon Riggs           www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to