On Wed, Jul 8, 2015 at 1:27 AM, Fabrízio de Royes Mello < fabriziome...@gmail.com> wrote:
> > On Fri, Jul 3, 2015 at 9:29 AM, Ashutosh Bapat < > ashutosh.ba...@enterprisedb.com> wrote: > > > > > > On Fri, Jul 3, 2015 at 4:48 PM, Fabrízio de Royes Mello < > fabriziome...@gmail.com> wrote: > >> > >> > >> On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmh...@gmail.com> > wrote: > >> > > >> > On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello > >> > <fabriziome...@gmail.com> wrote: > >> > > > http://www.postgresql.org/message-id/ca+tgmozm+-0r7h0edpzzjbokvvq+gavkchmno4fypveccw-...@mail.gmail.com > >> > > >> > I like the idea of the feature a lot, but the proposal to which you > >> > refer here mentions some problems, which I'm curious how you think you > >> > might solve. (I don't have any good ideas myself, beyond what I > >> > mentioned there.) > >> > > >> > >> You're right, and we have another design (steps 1 and 2 was implemented > last year): > >> > >> > >> *** ALTER TABLE changes > >> > >> 1) ATController > >> 1.1) Acquire an AccessExclusiveLock > (src/backend/commands/tablecmds.c - AlterTableGetLockLevel:3023) > >> > >> > >> 2) Prepare to change relpersistence (src/backend/commands/tablecmds.c - > ATPrepCmd:3249-3270) > >> • check temp table (src/backend/commands/tablecmds.c - > ATPrepChangePersistence:11074) > >> • check foreign key constraints (src/backend/commands/tablecmds.c - > ATPrepChangePersistence:11102) > >> > >> > >> 3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync > (MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if > exists) > >> > >> > >> 4) Create a new fork called "TRANSIENT INIT FORK": > >> > >> • from Unlogged to Logged (create _initl fork) (INIT_TO_LOGGED_FORKNUM) > >> ∘ new forkName (src/common/relpath.c) called "_initl" > >> ∘ insert XLog record to drop it if transaction abort > >> > >> • from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM) > >> ∘ new forkName (src/common/relpath.c) called "_initu" > >> ∘ insert XLog record to drop it if transaction abort > > > > > > AFAIU, while reading WAL, the server doesn't know whether the > transaction that produced that WAL record aborted or committed. It's only > when it sees a COMMIT/ABORT record down the line, it can confirm whether > the transaction committed or aborted. So, one can only "redo" the things > that WAL tells have been done. We can not "undo" things based on the WAL. > We might record this fact "somewhere" while reading the WAL and act on it > once we know the status of the transaction, but I do not see that as part > of this idea. This comment applies to all the steps inserting WALs for > undoing things. > > > > > > Even if I "xlog" the create/drop of the transient fork? > > Yes. > > > >> 5) Change the relpersistence in catalog (pg_class->relpersistence) > (heap, toast, indexes) > >> > >> > >> 6) Remove/Create INIT_FORK > >> > >> • from Unlogged to Logged > >> ∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the > pendingDeletes queue > >> • from Logged to Unlogged > >> ∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue > >> ∘ create the INIT_FORK using "heap_create_init_fork" > >> ∘ insert XLog record to drop init fork if the transaction abort > >> > >> > >> > >> *** CRASH RECOVERY changes > >> > >> 1) During crash recovery > (src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations) > >> > > > > This operation is carried out in two phases: one before replaying WAL > records and second after that. Are you sure that the WALs generated for the > unlogged or logged forks, as described above, have been taken care of? > > > > Hummm... actually no... > > > > >> > >> • if the transient fork "_initl" exists then > >> ∘ drop the transient fork "_initl" > >> ∘ if the init fork doesn't exist then create it > >> ∘ reset relation > >> • if the transient fork "_initu" exists then > >> ∘ drop the transient fork "_initl" > >> ∘ if the init fork exists then drop it > >> ∘ don't reset the relation > >> > > > > Consider case of converting unlogged to logged. The server crashes after > 6th step when both the forks are removed. During crash recovery, it will > not see any of the fork and won't reset the unlogged table. Remember the > table is still unlogged since the transaction was aborted due to the crash. > So, it has to have an init fork to be reset on crash recovery. > > > > Similarly while converting from logged to unlogged. The server crashes > in the 6th step after creating the INIT_FORK, during crash recovery the > init fork will be seen and a supposedly logged table will be trashed. > > > > > My intention for the 6th step is all recorded to wal, so if a crash occurs > the recovery process clean the mess. > > AFAIU, PostgreSQL recovery is based on "redo"ing WAL. What you described earlier, "undo"ing based on the WAL does not fit in the current framework. > > > The ideas in 1 and 2 might be better than having yet another init fork. > > > > The link for idea 2 is missing... > Sorry for that. 2nd idea is what Robert described using control file. > > > > > > 1. http://www.postgresql.org/message-id/533d457a.4030...@vmware.com > > > > Unfortunately I completely missed this idea, but is also interesting. But > I'll need to "stamp" in both ways (from unlogged to logged and vice-versa) > > During the recovery to check the status of a transaction can I use > src/backend/access/transam/clog.c:TransactionIdGetStatus ? I'm asking > because I don't know this part of code to much. > AFAIU, not unless all WALs (or atleast WALs till the commit/abort record of the transaction in question) are replayed. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company