On Mon, Mar 27, 2017 at 2:38 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Sat, 25 Mar 2017 19:53:47 -0700, Jeff Janes <jeff.ja...@gmail.com> wrote
> in <CAMkU=1x3+dpsfsu+af7wazavugmehua2+jnf7sual-mskq+...@mail.gmail.com>
>> On Thu, Mar 23, 2017 at 7:01 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>
>> > At Wed, 22 Mar 2017 02:15:26 +0900, Masahiko Sawada <sawada.m...@gmail.com>
>> > wrote in <CAD21AoAq2YHs3MvSky6TxX1oKqyiPwUphdSa2sJCab_V4ci4VQ@mail.gmail.com>
>> > > On Mon, Mar 20, 2017 at 11:28 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> > > > On Sat, Mar 18, 2017 at 5:42 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> > > >> Isn't HEAP2_CLEAN only issued before an intended HOT update? (Which then
>> > > >> can't leave the block as all visible or all frozen). I think the issue is
>> > > >> here is HEAP2_VISIBLE or HEAP2_FREEZE_PAGE. Am I reading this correctly,
>> > > >> that neither of those ever update the FSM, regardless of FPI?
>> > > >
>> > > > Yes, updates to the FSM are never logged. Forcing replay of
>> > > > HEAP2_FREEZE_PAGE to update the FSM might be a good idea.
>> > > >
>> > >
>> > > I think I was missing something. I imaged your situation is that FPI
>> > > is replayed during crash recovery after the crashed server vacuums the
>> > > page and marked it as all-frozen. But this situation is also resolved
>> > > by that solution.
>> >
>> > # HEAP2_CLEAN is issued in lazy_vacuum_page
>> >
>> > It will work but I'm not sure it is right direction for
>> > HEAP2_FREEZE_PAGE to touch FSM.
>> >
>> > As Masahiko said, the situation must be created by HEAP2_VISIBLE
>> > without preceding HEAP2_CLEAN, or with HEAP2_CLEAN with FPI. I
>> > think only the latter can happen. The comment in heap_xlog_clean
>> > below is right generally but if a page filled with tuples becomes
>> > almost empty and freezable by this cleanup, a problematic
>> > situation like this occurs.
>> >
>>
>> I now think this is not the cause of the problem I am seeing. I made the
>> replay of FREEZE_PAGE update the FSM (both with and without FPI), but that
>> did not fix it. With frequent crashes, it still accumulated a lot of
>> frozen and empty (but full according to FSM) pages. I also set up replica
>> streaming and turned off crashing on the master, and the FSM of the replica
>> stays accurate, so the WAL stream and replay logic is doing the right thing
>> on the replica.
>>
>> I now think the dirtied FSM pages are somehow not getting marked as dirty,
>> or are getting marked as dirty but somehow the checkpoint is skipping
>> them. It looks like MarkBufferDirtyHint does do some operations unlocked
>> which could explain lost update, but it seems unlikely that that would
>> happen often enough to see the amount of lost updates I am seeing.
>
> Hmm.. clearing dirty hint seems already protected by exclusive
> lock. And I think it can occur without lock failure.
>
> Other than by FPI, FSM update is omitted when record LSN is older
> than page LSN. If heap page is evicted but FSM page is not after
> vacuuming and before power cut, replaying HEAP2_CLEAN skips
> update of FSM even though FPI is not attached. Of course this
> cannot occur on standby. One FSM page covers as many heap pages
> as about 4k, so FSM can stay far longer than heap pages.
>
> ALL_FROZEN is set with other than HEAP2_FREEZE_PAGE. When a page
> is already empty when entering lazy_scan_heap, or a page of
> non-indexed heap is emptied in lazy_scan_heap, HEAP2_VISIBLE is
> issued to set ALL_FROZEN.
>
> Perhaps the problem will be fixed by forcing heap_xlog_visible to
> update FSM (addition to FREEZE_PAGE), or the same in
> heap_xlog_clean. (As mentioned in the previous mail, I prefer the
> latter.)

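To check that I follow the LSN case described above, here is a toy model
in C. It is only a sketch and not PostgreSQL code: ToyBlock,
toy_redo_action, toy_replay_clean and toy_fsm_after_crash are names made
up for illustration, and the numbers are arbitrary. It assumes replay
takes one of three paths for a heap block (apply the record, restore a
full page image, or skip because the page is already newer) and that the
FSM is refreshed only on the first path unless we force it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef enum { REDO_NEEDED, REDO_RESTORED_FPI, REDO_PAGE_NEWER } RedoAction;

typedef struct
{
    uint64_t page_lsn;   /* LSN stamped on the on-disk heap page */
    int      heap_free;  /* free bytes actually available in the heap page */
    int      fsm_free;   /* free bytes the on-disk FSM records for the page */
} ToyBlock;

/* Decide which path replay takes for a record at rec_lsn. */
RedoAction
toy_redo_action(const ToyBlock *b, uint64_t rec_lsn, bool has_fpi)
{
    if (has_fpi)
        return REDO_RESTORED_FPI;
    if (rec_lsn <= b->page_lsn)
        return REDO_PAGE_NEWER;     /* page already contains this change */
    return REDO_NEEDED;
}

/* Replay a cleanup record that leaves free_after bytes free in the page. */
void
toy_replay_clean(ToyBlock *b, uint64_t rec_lsn, bool has_fpi,
                 int free_after, bool force_fsm)
{
    RedoAction action = toy_redo_action(b, rec_lsn, has_fpi);

    assert(free_after >= 0);

    if (action != REDO_PAGE_NEWER)
    {
        /* Either the record or the full page image updates the heap page. */
        b->heap_free = free_after;
        b->page_lsn = rec_lsn;
    }

    /* Without force_fsm, only the plain redo path refreshes the FSM. */
    if (action == REDO_NEEDED || force_fsm)
        b->fsm_free = b->heap_free;
}

/*
 * Scenario from the mail: the vacuumed heap page reached disk (LSN 100,
 * nearly empty) but the FSM page did not, so the FSM still says "full".
 * Returns what the FSM records after crash recovery replays the record.
 */
int
toy_fsm_after_crash(bool force_fsm)
{
    ToyBlock b = { .page_lsn = 100, .heap_free = 8000, .fsm_free = 0 };

    toy_replay_clean(&b, 100, false, 8000, force_fsm);
    return b.fsm_free;
}
```

In this model toy_fsm_after_crash(false) leaves the FSM at 0 although the
page has 8000 bytes free, and toy_fsm_after_crash(true) brings it back in
sync, which is the behaviour a forced FSM update during replay is meant to
give.
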
Maybe it's enough to make both heap_xlog_visible and
heap_xlog_freeze_page forcibly update the FSM (heap_xlog_freeze_page
might be unnecessary), because the problem happens on a page that is
full according to the FSM but is actually empty and marked as
all-visible or all-frozen. Although heap_xlog_clean loads the heap page
into memory for the redo operation, forcing heap_xlog_clean to update
the FSM might be overkill for this problem, because the forced update
would then also run for every page that is not marked as all-visible or
all-frozen. Basically, 100% accuracy of the FSM is not required.

On the other hand, if we make heap_xlog_visible update the FSM, it has
to load both the heap page and the FSM page, which can also be overhead.
Another idea is to make the WAL record replayed by heap_xlog_visible
carry the free space of the corresponding heap page, and then update the
FSM from that value during recovery.

>
>> > > /*
>> > >  * Update the FSM as well.
>> > >  *
>> > >  * XXX: Don't do this if the page was restored from full page image. We
>> > >  * don't bother to update the FSM in that case, it doesn't need to be
>> > >  * totally accurate anyway.
>> > >  */
>> >
>>
>> What does that save us? If we restored from FPI, we already have the block
>> in memory (we don't need to see the old version, just the new one), so it
>> doesn't save us a random read IO.
>
> Updates on random pages can cause visits to many unloaded FSM
> pages. It may be intending to avoid that. Or, especially for
> INSERT, successive operations tends to occur on the same heap
> page, the complexity of calculating FSM wouldn't be so small
> relatively. FSM tells a lie that the page has spare space after
> that but it doesn't harm. But I think that the things are
> different for operations that increments free space.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers