On Fri, 2007-08-10 at 13:47 -0400, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > On Fri, 2007-08-10 at 11:11 -0400, Tom Lane wrote: > >> If we do that, we don't actually need XLogAsyncCommitFlush() at the start. > >> I'm inclined to remove it just because it's ugly. Comments? > > > I'm not clear why XLogAsyncCommitFlush() is ugly. It's one line of code > > that doesn't make anything else any harder. Apart from system catalogs, > > doing it that way would be error free for all normal VFs. > > Please recall that the failure that started this thread was on a regular > user table. To do what you want will introduce what I'm now thinking > is an unacceptable amount of fragility. In particular your patch of > yesterday to force hint-bit setting in VF scares the heck out of me. > > The reason why XLogAsyncCommitFlush() is ugly is that it doesn't > actually accomplish a darn thing, as we now see from this failure: > it does not guarantee that hint bits will get set, because of the > inexact bookkeeping in clog.c. It might marginally improve the > probability that they get set, but that's all. The reason I want > to take it out is mainly that its very existence tempts people to make > the same mistake that was made here.
I think this *can* work, but I accept you don't like it, even if I'm not sure why. The bug was in the assumption that all flushed async commits can be confirmed to be flushed, which isn't true, not in the flush itself. If VACUUM FULL has problems with catalog tables, then that is an existing bug, not one introduced by the async patch. Catalog tables might be unlocked and yet have uncommitted transactions in them, which violates the assumption in the current VF code that all hint bits will be set prior to repair_frag(). But the comments in vacuum.c line 1821 say "A tuple is either: (a) a tuple in a system catalog, inserted or deleted by a not yet committed transaction... in case (a) we would not be in repair_frag() at all" (it doesn't give a reason). If the current comments are correct, then we're OK to fix this by having a special case HeapTupleSatisfies, maybe coded slightly differently. If the current comments are false, in that (a) is possible, then VF has a pre-existing bug, that *must* be fixed in the way you suggest, but that has nothing to do with async. ...but... > Another argument is that VACUUM FULL is a dinosaur that should probably > go away entirely someday (a view I believe you share); it should > therefore not be allowed to drive the design of other parts of the > system. You're right. Let's make that day today. I vote to completely replace VF now with a cluster-style rebuild of the table. That way we won't have to keep fixing something we're gonna kill anyway. It would be better to release 8.3 with a new, clean, fast implementation of VF, than to release it with code that we *think* is right, but has so far proved a source of some truly obscure bugs. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings