On Fri, Aug 9, 2013 at 11:17 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-08-09 14:11:46 -0400, Tom Lane wrote:
>> Andres Freund <and...@2ndquadrant.com> writes:
>> > On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
>> >> How can it be safe to try to read catalogs if the transaction is aborted?
>>
>> > Well. It isn't. At least not in general. The specific case triggered
>> > here, though, is cache invalidations being processed, which can lead to
>> > the catalog being read (pretty crummy, but not easy to get rid
>> > of). That's actually safe, since before we process the invalidations we
>> > have done:
>> > 1) CurrentTransactionState->state = TRANS_ABORT
>> > 2) RecordTransactionAbort(), marking the transaction as aborted in the
>> >    clog
>> > 3) marked subxacts as aborted
>> > 4) ProcArrayEndTransaction() (for toplevel ones)
>>
>> > Due to these, any tqual stuff will treat the current (sub-)xact and its
>> > children as aborted. So the catalog lookups will see the catalog in a
>> > sensible state.
>>
>> I don't have any faith in this argument. You might be right that we'll
>> correctly see our own output rows as aborted, but that's barely the tip
>> of the iceberg of risk here. Is it safe to take new locks in an aborted
>> transaction? (What if we're already past the lock-release point in
>> the abort sequence?)
>
> Don't get me wrong. I find the idea of doing catalog lookups during abort
> horrid. But it's been that way for at least 10 years (I checked 7.4), so
> it has at least some resemblance of working.
> Today we do a good bit less than back then; for one, we don't do a full
> cache reload during abort anymore, just for the index support
> infrastructure. Also, you've reduced the number of lookups a bit with the
> relmapper introduction.
>
>> For that matter, given that we don't know what
>> exactly caused the transaction abort, how safe is it to do anything at
>> all --- we might for instance be nearly out of memory.
>> If the catalog reading attempt itself fails, won't we be in an infinite
>> loop of transaction aborts?
>
> Looks like that's possible, yes. There seem to be quite a few other
> opportunities for this to happen if you look at the amount of work done
> in AbortSubTransaction(). I guess it rarely happens because we
> previously release some memory...
>
>> I could probably think of ten more risks if I spent a few more minutes
>> at it.
>
> No need to convince me here. I could neither believe we were doing this,
> nor figure out why it even "works", for the first hour of looking at it.
>
>> Cache invalidation during abort should *not* lead to any attempt to
>> immediately revalidate the cache. No amount of excuses will make that
>> okay. I have not looked to see just what the path of control is in this
>> particular case, but we need to fix it, not paper over it.
>
> I agree, although that's easier said than done in the case of
> subtransactions. The problem we have there is that it's perfectly valid
> to still have references to a relation from the outer, not-aborted
> transaction. Those need to be valid for anybody looking at the relcache
> entry after we've processed the ROLLBACK TO/...
>
> I guess the fix is something like what we do in the commit case, where we
> transfer invalidations to the parent transaction. If we then process
> local invalidations *after* we've cleaned up the subtransaction
> completely, we should be fine. We already do an implicit
> CommandCounterIncrement() in CommitSubTransaction()...
Andres, are you (or is anyone) going to try to fix this assertion failure?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company