Andres Freund <and...@2ndquadrant.com> wrote:
>On 2013-07-24 12:59:43 +0200, Andres Freund wrote:
>> > <Approach 2>
>> > Like the DROP TABLE/INDEX case, piggyback the directory deletion record on
>> > the transaction commit record, and eliminate the directory deletion record
>> > altogether.
>>
>> I don't think burdening commit records with that makes sense. It's just
>> not a common enough case.
>>
>> What we imo could do would be to drop the tablespaces in a *separate*
>> transaction *after* the transaction that removed the pg_tablespace
>> entry. Then an "incomplete actions" logic similar to btree and gin could
>> be used to remove the database directory if we crashed between the two
>> transactions.
>>
>> SO:
>> TXN1 does:
>> * remove catalog entries
>> * drop buffers
>> * XLogInsert(XLOG_DBASE_DROP_BEGIN)
>>
>> TXN2:
>> * remove_dbtablespaces
>> * XLogInsert(XLOG_DBASE_DROP_FINISH)
>>
>> The RM_DBASE_ID resource manager would then grow a rm_cleanup callback
>> (which would perform TXN2 if we failed in between) and a
>> rm_safe_restartpoint which would prevent restartpoints from occurring on
>> standby between both.
>>
>> The same should probably be done for CREATE DATABASE because that currently
>> can result in partially copied databases lying around.
>
>And CREATE/DROP TABLESPACE.
>
>Not really related, but CREATE DATABASE's implementation makes me itch
>every time I read parts of it...

I've been hoping that we could get rid of the rm_cleanup mechanism
entirely. I eliminated it for gist a while back, and I've been thinking
of doing the same for gin and btree. The way it works currently is
buggy: while we have rm_safe_restartpoint to avoid creating a
restartpoint at a bad moment, there is nothing to stop you from running
a checkpoint while incomplete actions are pending. It's possible that
there are page locks or something that prevent it in practice, but it
feels shaky.

So I'd prefer a solution that doesn't rely on rm_cleanup. Piggybacking
on the commit record seems ok to me, though if we're going to have a lot
of different things to attach there, maybe we need to generalize it
somehow. Like, allow resource managers to attach arbitrary payload to
the commit record, and provide a new rm_redo_commit function to replay
them.

- Heikki
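
As a rough illustration of the generalization suggested above, here is a
self-contained toy model in C. It is not PostgreSQL code: every type and
function name in it (ToyCommitRecord, ToyCommitPayload, toy_commit_attach,
toy_redo_commit, and the two example callbacks) is hypothetical, and the
per-rmgr callback table only mimics what an rm_redo_commit hook might look
like. The payloads are plain strings and the "redo" actions merely print
what they would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical resource manager ids for this sketch (not the real rmgr list). */
enum { TOY_RM_DBASE = 0, TOY_RM_SMGR = 1, TOY_RM_COUNT = 2 };

#define TOY_MAX_PAYLOADS 8

/* One rmgr-specific chunk attached to a commit record (assumed layout). */
typedef struct ToyCommitPayload {
    int         rmid;   /* which resource manager owns this chunk */
    const char *data;   /* opaque to the commit code; here just a path */
} ToyCommitPayload;

/* Toy commit record carrying zero or more rmgr payloads. */
typedef struct ToyCommitRecord {
    unsigned         xid;
    int              npayloads;
    ToyCommitPayload payloads[TOY_MAX_PAYLOADS];
} ToyCommitRecord;

/* Shape of the proposed per-rmgr commit replay hook (an assumption). */
typedef void (*toy_rm_redo_commit_fn)(unsigned xid, const ToyCommitPayload *p);

/* Counters so the effect of replay can be observed. */
static int dirs_removed = 0;
static int files_unlinked = 0;

static void toy_dbase_redo_commit(unsigned xid, const ToyCommitPayload *p)
{
    printf("xid %u: would remove database directory %s\n", xid, p->data);
    dirs_removed++;
}

static void toy_smgr_redo_commit(unsigned xid, const ToyCommitPayload *p)
{
    printf("xid %u: would unlink relation file %s\n", xid, p->data);
    files_unlinked++;
}

/* Callback table indexed by rmgr id; order must match the enum above. */
static const toy_rm_redo_commit_fn toy_rm_redo_commit[TOY_RM_COUNT] = {
    toy_dbase_redo_commit,  /* TOY_RM_DBASE */
    toy_smgr_redo_commit    /* TOY_RM_SMGR */
};

static void toy_commit_init(ToyCommitRecord *rec, unsigned xid)
{
    rec->xid = xid;
    rec->npayloads = 0;
}

/* Attach a payload before "writing" the commit record. Returns 0 or -1. */
static int toy_commit_attach(ToyCommitRecord *rec, int rmid, const char *data)
{
    if (rmid < 0 || rmid >= TOY_RM_COUNT || rec->npayloads >= TOY_MAX_PAYLOADS)
        return -1;
    rec->payloads[rec->npayloads].rmid = rmid;
    rec->payloads[rec->npayloads].data = data;
    rec->npayloads++;
    return 0;
}

/* Replay a commit record: hand each payload to its owning rmgr's hook. */
static int toy_redo_commit(const ToyCommitRecord *rec)
{
    int replayed = 0;

    for (int i = 0; i < rec->npayloads; i++) {
        const ToyCommitPayload *p = &rec->payloads[i];

        assert(p->rmid >= 0 && p->rmid < TOY_RM_COUNT);
        toy_rm_redo_commit[p->rmid](rec->xid, p);
        replayed++;
    }
    return replayed;
}
```

In this sketch the deletions are described by the commit record itself, so
replay either sees the commit and performs them or never sees it at all.
There is no window between two records that a cleanup callback would have
to cover. A caller would initialize a record with toy_commit_init(), attach
payloads with toy_commit_attach(), and replay it with toy_redo_commit().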