I accidentally sent this off-list, sending to the list now:

On Sun, Mar 26, 2017 at 10:38 PM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Sat, 25 Mar 2017 19:53:47 -0700, Jeff Janes <jeff.ja...@gmail.com> wrote
> in <CAMkU=1x3+DPsfSU+AF7WAzAVugmEhUA2+jNf7SuAL-MSKQ+_KA@mail.gmail.com>
> > On Thu, Mar 23, 2017 at 7:01 PM, Kyotaro HORIGUCHI
> > <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> > > At Wed, 22 Mar 2017 02:15:26 +0900, Masahiko Sawada <sawada.m...@gmail.com>
> > > wrote in <CAD21AoAq2YHs3MvSky6TxX1oKqyiPwUphdSa2sJCab_V4ci4VQ@mail.gmail.com>
> > > > On Mon, Mar 20, 2017 at 11:28 PM, Robert Haas <robertmh...@gmail.com> wrote:
> > > > > On Sat, Mar 18, 2017 at 5:42 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> > > > >> Isn't HEAP2_CLEAN only issued before an intended HOT update?  (Which
> > > > >> then can't leave the block as all visible or all frozen.)  I think the
> > > > >> issue here is HEAP2_VISIBLE or HEAP2_FREEZE_PAGE.  Am I reading this
> > > > >> correctly, that neither of those ever updates the FSM, regardless of FPI?
> > > > >
> > > > > Yes, updates to the FSM are never logged.  Forcing replay of
> > > > > HEAP2_FREEZE_PAGE to update the FSM might be a good idea.
> > > >
> > > > I think I was missing something.  I imagined your situation is that an
> > > > FPI is replayed during crash recovery after the crashed server vacuumed
> > > > the page and marked it as all-frozen.  But that situation is also
> > > > resolved by that solution.
> > >
> > > # HEAP2_CLEAN is issued in lazy_vacuum_page
> > >
> > > It will work, but I'm not sure it is the right direction for
> > > HEAP2_FREEZE_PAGE to touch the FSM.
> > >
> > > As Masahiko said, the situation must be created by HEAP2_VISIBLE without
> > > a preceding HEAP2_CLEAN, or by HEAP2_CLEAN with an FPI.  I think only the
> > > latter can happen.  The comment in heap_xlog_clean below is generally
> > > right, but if a page filled with tuples becomes almost empty and
> > > freezable by this cleanup, a problematic situation like this occurs.
> >
> > I now think this is not the cause of the problem I am seeing.  I made the
> > replay of FREEZE_PAGE update the FSM (both with and without FPI), but that
> > did not fix it.  With frequent crashes, it still accumulated a lot of
> > frozen and empty (but full according to the FSM) pages.  I also set up
> > replica streaming and turned off crashing on the master, and the FSM of
> > the replica stays accurate, so the WAL stream and replay logic are doing
> > the right thing on the replica.
> >
> > I now think the dirtied FSM pages are somehow not getting marked as dirty,
> > or are getting marked as dirty but the checkpoint is somehow skipping
> > them.  It looks like MarkBufferDirtyHint does do some operations unlocked,
> > which could explain a lost update, but it seems unlikely that that would
> > happen often enough to account for the amount of lost updates I am seeing.
>
> Hmm.. clearing the dirty hint already seems to be protected by an exclusive
> lock.  And I think it can occur without a lock failure.
>
> Other than by FPI, the FSM update is also omitted when the record LSN is
> older than the page LSN.  If the heap page is evicted but the FSM page is
> not, after vacuuming and before the power cut, replaying HEAP2_CLEAN skips
> the FSM update even though no FPI is attached.  Of course this cannot occur
> on a standby.  One FSM page covers about 4k heap pages, so an FSM page can
> stay in memory far longer than the heap pages it covers.  This corresponds
> to the action == BLK_DONE case, right?
>
> ALL_FROZEN can also be set by records other than HEAP2_FREEZE_PAGE.  When a
> page is already empty on entering lazy_scan_heap, or a page of a
> non-indexed heap is emptied in lazy_scan_heap, HEAP2_VISIBLE is issued to
> set ALL_FROZEN.
>
> Perhaps the problem will be fixed by forcing heap_xlog_visible to update
> the FSM (in addition to FREEZE_PAGE), or by doing the same in
> heap_xlog_clean.  (As mentioned in the previous mail, I prefer the latter.)

When I make heap_xlog_clean update the FSM even on BLK_RESTORED (but not on
BLK_DONE), it solves the problem I was seeing.  Which still leaves me
wondering why the problem doesn't show up on the standby, because, unlike
BLK_DONE, BLK_RESTORED should have the same issue on a standby as it does on
a recovering master, shouldn't it?  Maybe the difference is that the
existence of a replication slot delays the cleanup in a way that causes a
different pattern of WAL records.
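To be concrete, the change I tried is roughly of the following shape.  This
is only a sketch of the tail of heap_xlog_clean() written from memory of the
current code (the locals action, buffer, rnode, and blkno already exist in
that function, and heapam.c already has the needed includes); the attached
fsm_clean.patch is what I actually ran, so where the sketch and the patch
disagree, trust the patch:

    Size        freespace = 0;

    /*
     * ... the BLK_NEEDS_REDO branch is unchanged: heap_page_prune_execute(),
     * PageSetLSN(), MarkBufferDirty() ...
     */

    /* Read the free space off the page while we still hold the buffer. */
    if (BufferIsValid(buffer))
    {
        freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
        UnlockReleaseBuffer(buffer);
    }

    /*
     * Update the FSM for BLK_NEEDS_REDO and BLK_RESTORED alike, rather than
     * only for BLK_NEEDS_REDO as now; BLK_DONE (and BLK_NOTFOUND) are left
     * alone.
     */
    if (action == BLK_NEEDS_REDO || action == BLK_RESTORED)
        XLogRecordPageWithFreeSpace(rnode, blkno, freespace);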
> > > > /*
> > > >  * Update the FSM as well.
> > > >  *
> > > >  * XXX: Don't do this if the page was restored from full page image. We
> > > >  * don't bother to update the FSM in that case, it doesn't need to be
> > > >  * totally accurate anyway.
> > > >  */
> >
> > What does that save us?  If we restored from an FPI, we already have the
> > block in memory (we don't need to see the old version, just the new one),
> > so it doesn't save us a random read IO.
>
> Updates on random pages can cause visits to many unloaded FSM pages.  It
> may be intended to avoid that.

But I think that would be no worse for BLK_RESTORED than it is for
BLK_NEEDS_REDO.  Why optimize only one of the cases, if it is worth
optimizing either one?

Cheers,

Jeff
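P.S. If anyone wants to try the other option Kyotaro mentions, doing the
same in heap_xlog_visible, the analogous change would look roughly like the
sketch below.  This is untested and from memory (I believe the heap page is
registered as block 1 of the XLOG_HEAP2_VISIBLE record), so treat it as an
illustration only:

    RelFileNode rnode;
    BlockNumber blkno;
    Buffer      buffer;
    Size        freespace = 0;
    XLogRedoAction action;

    /* the heap page; the visibility-map page (block 0) is handled later */
    XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
    action = XLogReadBufferForRedo(record, 1, &buffer);

    if (action == BLK_NEEDS_REDO)
    {
        /* existing redo work: PageSetAllVisible() + MarkBufferDirty() */
    }

    if (BufferIsValid(buffer))
    {
        freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
        UnlockReleaseBuffer(buffer);
    }

    /* new: refresh the FSM whenever we actually looked at the page */
    if (action == BLK_NEEDS_REDO || action == BLK_RESTORED)
        XLogRecordPageWithFreeSpace(rnode, blkno, freespace);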
fsm_clean.patch