On Wed, Feb 13, 2019 at 11:26:23PM -0800, Noah Misch wrote:
> On further study, I was able to reproduce data loss

> Fixes are available:
> 
> a. Fix the rounding in SimpleLruTruncate().  (The patch I posted upthread is
>    wrong; I will correct it in a separate message.)

Here's a corrected version.  I now delete a segment only if both its first
page and its last page are considered to precede the cutoff; see the new
comment at SlruMayDeleteSegment().
commit 09393a1 (HEAD)
Author:     Noah Misch <n...@leadboat.com>
AuthorDate: Sat Feb 16 20:02:51 2019 -0800
Commit:     Noah Misch <n...@leadboat.com>
CommitDate: Sat Feb 16 20:02:51 2019 -0800

    Don't round SimpleLruTruncate() cutoffPage values.
    
    Every core SLRU wraps around.  The rounding did not account for that; in
    rare cases, it permitted deletion of the most recently-populated page of
    SLRU data.  This closes a rare opportunity for data loss, which
    manifested as "could not access status of transaction" errors.  If a
    user's physical replication primary logged ": apparent wraparound"
    messages, the user should rebuild that primary's standbys regardless of
    symptoms.  At less risk is a cluster having emitted "not accepting
    commands" errors or "must be vacuumed" warnings at some point.  One can
    test a cluster for this data loss by running VACUUM FREEZE in every
    database.  Back-patch to 9.4 (all supported versions).
    
    Discussion: https://postgr.es/m/20190202083822.gc32...@gust.leadboat.com

diff --git a/src/backend/access/transam/slru.c 
b/src/backend/access/transam/slru.c
index 3623352..71e29b9 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -1171,11 +1171,6 @@ SimpleLruTruncate(SlruCtl ctl, int cutoffPage)
        int                     slotno;
 
        /*
-        * The cutoff point is the start of the segment containing cutoffPage.
-        */
-       cutoffPage -= cutoffPage % SLRU_PAGES_PER_SEGMENT;
-
-       /*
         * Scan shared memory and remove any pages preceding the cutoff page, to
         * ensure we won't rewrite them later.  (Since this is normally called 
in
         * or just after a checkpoint, any dirty pages should have been flushed
@@ -1221,8 +1216,11 @@ restart:;
                 * Hmm, we have (or may have) I/O operations acting on the 
page, so
                 * we've got to wait for them to finish and then start again. 
This is
                 * the same logic as in SlruSelectLRUPage.  (XXX if page is 
dirty,
-                * wouldn't it be OK to just discard it without writing it?  
For now,
-                * keep the logic the same as it was.)
+                * wouldn't it be OK to just discard it without writing it?
+                * SlruMayDeleteSegment() uses a stricter qualification, so we 
might
+                * not delete this page in the end; even if we don't delete it, 
we
+                * won't have cause to read its data again.  For now, keep the 
logic
+                * the same as it was.)
                 */
                if (shared->page_status[slotno] == SLRU_PAGE_VALID)
                        SlruInternalWritePage(ctl, slotno, NULL);
@@ -1313,18 +1311,40 @@ restart:
 }
 
 /*
+ * Determine whether a segment is okay to delete.
+ *
+ * segpage is the first page of the segment, and cutoffPage is the oldest (in
+ * PagePrecedes order) page in the SLRU containing still-useful data.  Since
+ * every core PagePrecedes callback implements "wrap around", check the
+ * segment's first and last pages:
+ *
+ * first<cutoff  && last<cutoff:  yes
+ * first<cutoff  && last>=cutoff: no; cutoff falls inside this segment
+ * first>=cutoff && last<cutoff:  no; wrap point falls inside this segment
+ * first>=cutoff && last>=cutoff: no; every page of this segment is too young
+ */
+static bool
+SlruMayDeleteSegment(SlruCtl ctl, int segpage, int cutoffPage)
+{
+       int                     seg_last_page = segpage + 
SLRU_PAGES_PER_SEGMENT - 1;
+
+       Assert(segpage % SLRU_PAGES_PER_SEGMENT == 0);
+
+       return (ctl->PagePrecedes(segpage, cutoffPage) &&
+                       ctl->PagePrecedes(seg_last_page, cutoffPage));
+}
+
+/*
  * SlruScanDirectory callback
- *             This callback reports true if there's any segment prior to the 
one
- *             containing the page passed as "data".
+ *             This callback reports true if there's any segment wholly prior 
to the
+ *             one containing the page passed as "data".
  */
 bool
 SlruScanDirCbReportPresence(SlruCtl ctl, char *filename, int segpage, void 
*data)
 {
        int                     cutoffPage = *(int *) data;
 
-       cutoffPage -= cutoffPage % SLRU_PAGES_PER_SEGMENT;
-
-       if (ctl->PagePrecedes(segpage, cutoffPage))
+       if (SlruMayDeleteSegment(ctl, segpage, cutoffPage))
                return true;                    /* found one; don't iterate any 
more */
 
        return false;                           /* keep going */
@@ -1339,7 +1359,7 @@ SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, 
int segpage, void *data)
 {
        int                     cutoffPage = *(int *) data;
 
-       if (ctl->PagePrecedes(segpage, cutoffPage))
+       if (SlruMayDeleteSegment(ctl, segpage, cutoffPage))
                SlruInternalDeleteSegment(ctl, filename);
 
        return false;                           /* keep going */

Reply via email to