Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Thu, 2008-09-18 at 10:09 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Thu, 2008-09-18 at 09:06 -0400, Tom Lane wrote:
> >> Do we really need a checkpoint there at all?
>
> > "Timelines only change at shutdown checkpoints".
>
> Hmm.  I *think* that that is just a debugging crosscheck rather than a
> critical property.  But yeah, it would take some close investigation,
> which maybe isn't warranted if you have a less-invasive solution.

The important thing is that everybody uses the new timelineid. AFAICS we
actually write new timelineids into the WAL file of the previous
timelineid, so it can't be that risky.

Ignore v5 then and look for v6 on Monday. Hopefully the final one :-)

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Thu, 2008-09-18 at 09:06 -0400, Tom Lane wrote:
>> Do we really need a checkpoint there at all?

> "Timelines only change at shutdown checkpoints".

Hmm.  I *think* that that is just a debugging crosscheck rather than a
critical property.  But yeah, it would take some close investigation,
which maybe isn't warranted if you have a less-invasive solution.

			regards, tom lane
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Thu, 2008-09-18 at 09:06 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Having some trouble trying to get a clean state change from recovery to
> > normal mode.
>
> > Startup needs to be able to write WAL at the end of recovery so it can
> > write a ShutdownCheckpoint, yet must not be allowed to write WAL before
> > that. Other services are still not fully up yet. So every other process
> > apart from Startup mustn't write WAL until Startup has fully completed
> > its actions, which is sometime later.
> > ...
> > We *might* solve this by making the final checkpoint that Startup writes
> > into an online checkpoint.
>
> Do we really need a checkpoint there at all?  I can't recall right now
> if there was a good reason why Vadim made it do that.  I have a feeling
> that it might have had to do with an assumption that the recovery
> environment was completely distinct from live environment; which is a
> statement that's certainly not going to be true in a query-answering
> slave.

Hmm, a question I hadn't considered. We definitely don't want to do
PreallocXlogFiles(). "Timelines only change at shutdown checkpoints."
But it's just as easy to write a short timeline-change record rather
than the checkpoint itself, so we can sort that out.

It should be sufficient to request that bgwriter perform an immediate
checkpoint, rather than have the startup process perform it. That way
startup can exit and we can change over to normal processing faster.

If we crash before the next checkpoint then we would revert to archive
recovery or crash recovery. The last restartpoint might well be
somewhere else. In streaming mode we would presumably have that data
locally, so it's not really a problem. But we mustn't mark the control
file as in production yet; we can do so following the next checkpoint.
--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
Simon Riggs <[EMAIL PROTECTED]> writes:
> Having some trouble trying to get a clean state change from recovery to
> normal mode.

> Startup needs to be able to write WAL at the end of recovery so it can
> write a ShutdownCheckpoint, yet must not be allowed to write WAL before
> that. Other services are still not fully up yet. So every other process
> apart from Startup mustn't write WAL until Startup has fully completed
> its actions, which is sometime later.
> ...
> We *might* solve this by making the final checkpoint that Startup writes
> into an online checkpoint.

Do we really need a checkpoint there at all?  I can't recall right now
if there was a good reason why Vadim made it do that.  I have a feeling
that it might have had to do with an assumption that the recovery
environment was completely distinct from live environment; which is a
statement that's certainly not going to be true in a query-answering
slave.

			regards, tom lane
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Thu, 2008-09-18 at 09:05 +0100, Simon Riggs wrote:
> Feels like I should shutdown the bgwriter after recovery and then
> allow it to be cranked up again after normal processing starts, and do
> all of this through postmaster state changes. That way bgwriter
> doesn't need to do a dynamic state change.

This approach appears to be working nicely so far. Some ugly bits of
the former patch removed.

Patch passes basic tests and changes state cleanly. Restarting test
cycle on this patch now; will confirm tomorrow.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.317
diff -c -r1.317 xlog.c
*** src/backend/access/transam/xlog.c	11 Aug 2008 11:05:10 -0000	1.317
--- src/backend/access/transam/xlog.c	18 Sep 2008 11:39:07 -0000
***************
*** 119,124 ****
--- 119,126 ----
  /* Are we doing recovery from XLOG? */
  bool		InRecovery = false;
+ bool		reachedSafeStopPoint = false;
+ bool		StartupCanWriteWAL = false;
  
  /* Are we recovering using offline XLOG archives? */
  static bool InArchiveRecovery = false;
***************
*** 131,137 ****
  static bool recoveryTarget = false;
  static bool recoveryTargetExact = false;
  static bool recoveryTargetInclusive = true;
- static bool recoveryLogRestartpoints = false;
  static TransactionId recoveryTargetXid;
  static TimestampTz recoveryTargetTime;
  static TimestampTz recoveryLastXTime = 0;
--- 133,138 ----
***************
*** 286,295 ****
--- 287,298 ----
  /*
   * Total shared-memory state for XLOG.
   */
+ #define XLOGCTL_BUFFER_SPACING 128
  typedef struct XLogCtlData
  {
  	/* Protected by WALInsertLock: */
  	XLogCtlInsert Insert;
+ 	char		InsertPadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogCtlInsert)];
  
  	/* Protected by info_lck: */
  	XLogwrtRqst LogwrtRqst;
***************
*** 297,305 ****
--- 300,315 ----
  	uint32		ckptXidEpoch;	/* nextXID & epoch of latest checkpoint */
  	TransactionId ckptXid;
  	XLogRecPtr	asyncCommitLSN; /* LSN of newest async commit */
+ 	/* add data structure padding for above info_lck declarations */
+ 	char		InfoPadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogwrtRqst)
+ 										- sizeof(XLogwrtResult)
+ 										- sizeof(uint32)
+ 										- sizeof(TransactionId)
+ 										- sizeof(XLogRecPtr)];
  
  	/* Protected by WALWriteLock: */
  	XLogCtlWrite Write;
+ 	char		WritePadding[XLOGCTL_BUFFER_SPACING - sizeof(XLogCtlWrite)];
  
  	/*
  	 * These values do not change after startup, although the pointed-to pages
***************
*** 311,316 ****
--- 321,337 ----
  	int			XLogCacheBlck;	/* highest allocated xlog buffer index */
  	TimeLineID	ThisTimeLineID;
  
+ 	/*
+ 	 * canWriteWAL changes at the end of recovery only and is only ever set
+ 	 * by the Startup process. We assume that changes to it are atomic, so
+ 	 * accesses to it are never locked. When it does change, bgwriter must
+ 	 * immediately begin using it, since this helps it decide whether to
+ 	 * flush WAL or not when it writes dirty blocks. If bgwriter does it
+ 	 * too soon, we will write invalid WAL records; if it reflects the
+ 	 * change too late it could skip flushing WAL for a data block change.
+ 	 */
+ 	bool		canWriteWAL;
+ 
  	slock_t		info_lck;		/* locks shared variables shown above */
  } XLogCtlData;
***************
*** 480,485 ****
--- 501,510 ----
  	bool		doPageWrites;
  	bool		isLogSwitch = (rmid == RM_XLOG_ID && info == XLOG_SWITCH);
  
+ 	/* cross-check on whether we should be here or not */
+ 	if (!(canWriteWAL() || (InRecovery && StartupCanWriteWAL)))
+ 		elog(FATAL, "cannot make new WAL entries during recovery");
+ 
  	/* info's high bits are reserved for use by me */
  	if (info & XLR_INFO_MASK)
  		elog(PANIC, "invalid xlog info mask %02X", info);
***************
*** 1677,1684 ****
  	XLogRecPtr	WriteRqstPtr;
  	XLogwrtRqst WriteRqst;
  
! 	/* Disabled during REDO */
! 	if (InRedo)
  		return;
  
  	/* Quick exit if already known flushed */
--- 1702,1709 ----
  	XLogRecPtr	WriteRqstPtr;
  	XLogwrtRqst WriteRqst;
  
! 	/* Disabled during StartupXLog */
! 	if (!canWriteWAL())
  		return;
  
  	/* Quick exit if already known flushed */
***************
*** 1766,1774 ****
   * the bad page is encountered again during recovery then we would be
   * unable to restart the database at all!  (This scenario has actually
   * happened in the field several times with 7.1 releases.  Note that we
!  * cannot get here while InRedo is true, but if the bad page is brought in
!  * and marked dirty during recovery then CreateCheckPoint will try to
!  * flush it at the end of recovery.)
   *
   * The current approach is to ERROR under normal conditions, but only
   * WARNING during recovery, so that the system can be brought up even if
--- 1791,1799 ----
   * the bad page is encountered again
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Tue, 2008-09-16 at 15:45 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> wrote:
> > Testing takes a while on this, I probably won't complete it until
> > Friday. So enclosed patch is for eyeballs only at this stage.
>
> What's the status on that patch?

Having some trouble trying to get a clean state change from recovery to
normal mode.

Startup needs to be able to write WAL at the end of recovery so it can
write a ShutdownCheckpoint, yet must not be allowed to write WAL before
that. Other services are still not fully up yet. So every other process
apart from Startup mustn't write WAL until Startup has fully completed
its actions, which is sometime later.

bgwriter needs to know about the impending state change so it can
finish off any checkpoint it's currently working on. But it can't start
doing normal processing yet either, so it must have a third state that
sits between recovery and normal processing. When normal processing is
reached, it *must* write WAL immediately from that point onwards, yet
using the new timeline that startup decides upon.

So the idea of a single database-wide boolean state seems impossible.
We need a local canInsertWAL flag that is set at different times in
different processes. Global state changes are now:

not started - postmaster process, then startup process
safestoppingpoint - bgwriter starts
startup_does_shutdown_checkpoint - bgwriter finishes up, then just
    waits; startup is now allowed to insert WAL for the checkpoint
    record
startup completes StartupXLog - bgwriter begins normal processing
startup exits - postmaster in normal state

We *might* solve this by making the final checkpoint that Startup
writes into an online checkpoint. That would then allow a straight and
safe transition for bgwriter from just one state to the other. But that
leaves some other ugliness that I don't want to deal with, 'cos there's
other fish frying.
Feels like I should shut down the bgwriter after recovery and then
allow it to be cranked up again after normal processing starts, and do
all of this through postmaster state changes. That way bgwriter doesn't
need to do a dynamic state change.

Comments anyone?

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support