On Mon, Oct 23, 2017 at 11:20 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> Remove the code that maintained two checkpoint's WAL files and all
> associated stuff.
>
> Try to avoid breaking anything else
>
> This
> * reduces disk space requirements on master
> * removes a minor bug in fast failover
> * simplifies code

In short, yeah! I had a close look at the patch and noticed a couple of issues.

+        else
             /*
-             * The last valid checkpoint record required for a streaming
-             * recovery exists in neither standby nor the primary.
+             * We used to attempt to go back to a secondary checkpoint
+             * record here, but only when not in standby_mode. We now
+             * just fail if we can't read the last checkpoint because
+             * this allows us to simplify processing around checkpoints.
              */
             ereport(PANIC,
                     (errmsg("could not locate a valid checkpoint record")));
-        }
Using brackets in this case is more elegant style IMO. OK, there is
one line, but the comment is long so the block becomes
confusingly-shaped.

 /* Version identifier for this pg_control format */
-#define PG_CONTROL_VERSION    1003
+#define PG_CONTROL_VERSION    1100
Yes, this had to happen anyway in this release cycle.

backup.sgml describes the following:
    to higher segment numbers.  It's assumed that segment files whose
    contents precede the checkpoint-before-last are no longer of
    interest and can be recycled.
However this is not true anymore with this patch.

The documentation of pg_control_checkpoint() is not updated. func.sgml
lists the tuple fields returned for this function.

You forgot to update pg_control_checkpoint in pg_proc.h. prior_lsn is
still listed in the list of arguments returned. And you need to update
as well the output list of types.

  * whichChkpt identifies the checkpoint (merely for reporting purposes).
- * 1 for "primary", 2 for "secondary", 0 for "other" (backup_label)
+ * 1 for "primary", 0 for "other" (backup_label)
+ * 2 for "secondary" is no longer supported
  */
I would suggest to just remove the reference to "secondary".

-    * Delete old log files (those no longer needed even for previous
-    * checkpoint or the standbys in XLOG streaming).
+    * Delete old log files and recycle them
     */
Here that's more "Delete or recycle old log files". Recycling of a WAL
segment is its renaming into a newer segment.

You forgot to update this formula in xlog.c:
    distance = (2.0 + CheckPointCompletionTarget) * CheckPointDistanceEstimate;
    /* add 10% for good measure. */
    distance *= 1.10;
You can guess easily where the mistake is.

-               checkPointLoc = ControlFile->prevCheckPoint;
+               /*
+                * It appears to be a bug that we used to use
prevCheckpoint here
+                */
+               checkPointLoc = ControlFile->checkPoint;
Er, no. This is correct because we expect the prior checkpoint to
still be present in the event of a failure detecting the latest
checkpoint.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to