Tom Lane wrote:
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
One comment is that at the time we make an entry into smgr's
pending-deletes list, I think we might not have acquired an XID yet
Hm.. I was just going to implement this, but I'm now wondering if
that's really worth it.
Basically what you'd give up is the ability to Assert() that there are
no deletable files if there's no XID, which seems to me to be an
important cross-check ... although maybe making smgr do that turns
this "cross-check" into a tautology ... hmm. I guess the case that's
bothering me is where we reach commit with deletable files and no XID.
But that should probably be an error condition anyway, ie, we should
error out and turn it into an abort. On the abort side we'd consider
it OK to have files and no XID. Seems reasonable to me.
I've done that now, and it turned out nicely. There is an Assertion
on "(nrels == 0) || xid assigned" in the COMMIT path, but
not in the ABORT path. Seems reasonable and safe.
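
To make the check concrete, here is a minimal sketch of the two preconditions. All names below (commit_precondition_holds, abort_precondition_holds, xid_is_assigned) are made up for illustration and are not the actual xact.c code; the only thing taken from the discussion above is the condition itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the backend's types; illustration only. */
typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

static bool xid_is_assigned(TransactionId xid)
{
    return xid != InvalidTransactionId;
}

/* COMMIT path: files pending deletion are only acceptable with an XID. */
static bool commit_precondition_holds(int nrels, TransactionId xid)
{
    return nrels == 0 || xid_is_assigned(xid);
}

/* ABORT path: files without an XID are considered OK, so nothing to check. */
static bool abort_precondition_holds(int nrels, TransactionId xid)
{
    (void) nrels;
    (void) xid;
    return true;
}

/* Sketch of where the assertion would sit at commit time. */
static void sketch_commit(int nrels, TransactionId xid)
{
    assert(commit_precondition_holds(nrels, xid));
    /* ... write the commit record, then delete the nrels files ... */
}
```

A commit with two pending deletes and no XID would trip the assertion, while the same state on the abort side passes through untouched.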
And I'm quite tempted to not flush the XLOG at all during ABORT, and to
only force synchronous commits if one of the to-be-deleted files is
non-temporary. The last idea widens the leakage window quite a bit
though, so maybe I should rather resist that temptation...
OTOH, it'd allow asynchronous commits for transactions that created
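
A rough sketch of the "only force a synchronous commit for non-temporary files" idea follows. The PendingDelete struct and commit_needs_sync_flush are hypothetical names, and the sketch covers only the file-deletion criterion discussed here, not any other reason a commit might have to be synchronous.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one file scheduled for deletion. */
typedef struct
{
    unsigned int relfilenode;   /* illustrative identifier only */
    bool         is_temp;       /* true if the file belongs to a temp relation */
} PendingDelete;

/*
 * Return true if the commit should flush XLOG synchronously because at
 * least one of the to-be-deleted files is non-temporary.
 */
static bool commit_needs_sync_flush(const PendingDelete *rels, int nrels)
{
    assert(rels != NULL || nrels == 0);

    for (int i = 0; i < nrels; i++)
    {
        if (!rels[i].is_temp)
            return true;
    }
    return false;
}
```

With this rule, a transaction that only drops temporary files could still commit asynchronously, at the cost of the wider leakage window mentioned above.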
The only way we could make this more robust is if we could have
a WAL-before-data rule for file *creation*, but I think that's not
possible given that we don't know what relfilenode number we will use
until we've successfully created a file. So there will always be
windows where a crash leaks unreferenced files. There's been some
debate about having crash recovery search for and delete such files, but
so far I've resisted it on the grounds that it sounds like data loss
waiting to happen --- someday it'll delete a file you wished it'd kept.
It seems doable, but it's not pretty. One possible scheme would be to
emit a record *after* choosing a name but *before* creating the file,
and then a second record when the file is actually created successfully.
Then, during replay we could remember a list of xids and filenames,
and remove those files for which we either haven't seen a "created
successfully" record, or no COMMIT record for the creating xid.
With this scheme, it'd be natural to force XID assignment in smgrcreate,
because we'd actually depend on logging the xid there.
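
The replay-side bookkeeping could be sketched roughly as below. Everything here is an assumption made for illustration: the record kinds, the ReplayState structure, and the replay_* function names do not exist anywhere, and a real implementation would live in the WAL replay code and use proper memory management instead of a fixed-size array.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TRACKED 64
#define MAX_NAME    64

typedef unsigned int TransactionId;

/* One file we have seen a "name chosen" record for during replay. */
typedef struct
{
    TransactionId xid;
    char          name[MAX_NAME];
    bool          created;     /* saw the "created successfully" record */
    bool          committed;   /* saw a COMMIT record for the creating xid */
} TrackedFile;

typedef struct
{
    TrackedFile files[MAX_TRACKED];
    int         nfiles;
} ReplayState;

static void replay_init(ReplayState *st)
{
    st->nfiles = 0;
}

/* First record: emitted after choosing a name, before creating the file. */
static bool replay_intent(ReplayState *st, TransactionId xid, const char *name)
{
    assert(name != NULL);
    if (st->nfiles >= MAX_TRACKED)
        return false;

    TrackedFile *f = &st->files[st->nfiles++];
    f->xid = xid;
    strncpy(f->name, name, MAX_NAME - 1);
    f->name[MAX_NAME - 1] = '\0';
    f->created = false;
    f->committed = false;
    return true;
}

/* Second record: the file was actually created successfully. */
static void replay_created(ReplayState *st, TransactionId xid, const char *name)
{
    for (int i = 0; i < st->nfiles; i++)
    {
        TrackedFile *f = &st->files[i];
        if (f->xid == xid && strcmp(f->name, name) == 0)
            f->created = true;
    }
}

/* COMMIT record for the creating transaction. */
static void replay_commit(ReplayState *st, TransactionId xid)
{
    for (int i = 0; i < st->nfiles; i++)
    {
        if (st->files[i].xid == xid)
            st->files[i].committed = true;
    }
}

/*
 * At end of replay: collect the files to remove, i.e. those lacking either
 * a "created successfully" record or a COMMIT for the creating xid.
 * Returns the number of names stored in out[].
 */
static int replay_collect_orphans(const ReplayState *st,
                                  const char *out[], int max_out)
{
    int n = 0;

    for (int i = 0; i < st->nfiles; i++)
    {
        const TrackedFile *f = &st->files[i];
        if ((!f->created || !f->committed) && n < max_out)
            out[n++] = f->name;
    }
    return n;
}
```

The sketch keys everything on the xid, which is why the first record would have to carry one.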
greetings, Florian Pflug