Hello. I came from the following mail. https://www.postgresql.org/message-id/flat/0F97FA9ABBDBE54F91744A9B37151A5116E4DD@g01jpexmbkw24#8be5cc9b7c3cf454a82e561d84f4118f
At Mon, 26 Sep 2016 09:12:04 +0530, Amit Kapila <amit.kapil...@gmail.com> wrote in <caa4ek1k5yydmndko0zzw6wncn_dgfvha6dcycyuvkbwth5+...@mail.gmail.com> > On Mon, Sep 26, 2016 at 8:54 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Sun, Sep 25, 2016 at 9:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > >> Amit Kapila <amit.kapil...@gmail.com> writes: > >>> On Fri, Sep 23, 2016 at 12:21 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > >>>> This is clearly an oversight in Simon's patch fafa374f2, which introduced > >>>> this code without any consideration for the possibility that the page > >>>> doesn't have a valid special area. ... > >>>> but I'm not very clear on whether this is a safe fix, mainly because > >>>> I don't understand what the reuse WAL record really accomplishes. > >>>> Maybe we need to instead generate a reuse record with some special > >>>> transaction ID indicating worst-case assumptions? > >> > >>> I think it is basically to ensure that the queries on standby should > >>> not be considering the page being recycled as valid and if there is > >>> any pending query to which this page is visible, cancel it. If this > >>> understanding is correct, then I think the fix you are proposing is > >>> correct. > >> > >> After thinking about it some more, I do not understand why we are emitting > >> WAL here at all in *any* case, either for a new page or for a deleted one > >> that we're about to recycle. Why wouldn't the appropriate actions have > >> been taken when the page was deleted, or even before that when its last > >> tuple was deleted? > >> > > > > It seems to me that we do take actions for conflict resolution during > > the page deletion (that looks to be covered by XLOG_HEAP2_CLEANUP_INFO > > which we emit in vacuum), but not sure if that is sufficient. > > Consider a case where the new transaction is started on standby after > > > > Here by new transaction, I intend to say some newer snapshot with > valid MyPgXact->xmin. I agree to the diagnosis. So the WAL record is not necessary if it is a new page since no one cannot be grabbing it. _bt_page_recyclable() has two callers. _bt_getbuf and btvacuumpage. A comment in btvacuumpage is saying that. > * _bt_checkpage(), which will barf on an all-zero page. We want to > * recycle all-zero pages, not fail. Also, we want to use a nondefault So it is right thing for _bt_page_recyclable() to return true for zeroed pages and it is consistent with the comment in _bt_page_recyclable(). At the problematic point, a new page is safe to recycle but there's no need to issue WAL record since it is not recycled with the meaning that _bt_log_resuse_page() thinking. > _bt_log_reuse_page(Relation rel, BlockNumber blkno, TransactionId > latestRemovedXid) > { > xl_btree_reuse_page xlrec_reuse; > > /* > * Note that we don't register the buffer with the record, because this > * operation doesn't modify the page. This record only exists to provide a > * conflict point for Hot Standby. > */ I think we have all requred testimony from exiting code and comments. FWIW my conclustion is that Tom's solution upthread is right thing to do and needs addition in comment. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c index a24e64156a..b39634cafd 100644 --- a/src/backend/access/nbtree/nbtpage.c +++ b/src/backend/access/nbtree/nbtpage.c @@ -782,9 +782,12 @@ _bt_getbuf(Relation rel, BlockNumber blkno, int access) /* * If we are generating WAL for Hot Standby then create a * WAL record that will allow us to conflict with queries - * running on standby. + * running on standby. No need to do that if the page is + * all-zoeroed since no one cannot be interested in the + * page on standbys. See _bt_page_recyclable(). */ - if (XLogStandbyInfoActive() && RelationNeedsWAL(rel)) + if (XLogStandbyInfoActive() && RelationNeedsWAL(rel) && + !PageIsNew()) { BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);