On Wed, Jun 4, 2014 at 7:26 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2014-06-04 09:51:36 -0400, Robert Haas wrote:
> > On Wed, Jun 4, 2014 at 2:08 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> > > On 2014-06-04 10:24:13 +0530, Amit Kapila wrote:
> > >> Incase of recovery, the shared buffers saved by this utility are
> > >> from previous shutdown which doesn't seem to be of more use
> > >> than buffers loaded by recovery.
> > >
> > > Why? The server might have been queried if it's a hot standby one?
Consider the case where a crash (force kill or some other way) occurs
while BGSaver is saving the buffers. I think it is possible that it has
saved only partial information (the information about some buffers is
correct and for others it is missing), and it is also possible that the
checkpoint record has not been written by that time (which means recovery
will start from the previous restart point). So what will happen is that
pg_hibernate might load some less-used buffers/blocks (which have a lower
usage count) and the WAL-replayed blocks will be sacrificed. So the WAL
data from the previous restart point, and some more due to the delay in
the start of the standby (changes occurred on the master during that time), will be
Another case is that of a standalone server, where there is always a
high chance that the blocks read in by recovery are the active ones.
I agree that standalone servers are the less common case, but some
small applications might still be using them. I think the same is also
true if the crashed server is the master.
> > I think that's essentially the same point Amit is making. Gurjeet is
> > arguing for reloading the buffers from the previous shutdown at end of
> > recovery; IIUC, Amit, you, and I all think this isn't a good idea.
> I think I am actually arguing for Gurjeet's position. If the server is
> actively being queried (i.e. hot_standby=on and actually used for
> queries) it's quite reasonable to expect that shared_buffers has lots of
> content that is *not* determined by WAL replay.
Yes, that's quite possible; however, there can be situations where it
is not true, as explained above.
> There's not that much read IO going on during WAL replay anyway - after
> a crash/start from a restartpoint most of it is loaded via full page
> So it's only disadvantageous to fault in pages via pg_hibernate
> if that causes pages that already have been read in via FPIs to be
> thrown out.
So in such cases, the pages loaded by pg_hibernate turn out to be a loss.
Overall, I think there are both kinds of cases: ones where it is
beneficial to load the buffers after recovery and ones where it is
beneficial to load them before recovery. That's why I mentioned above
that either this can be a user-settable parameter, or maybe we can
have a new API which lets the BGworker load buffers without evicting
any existing buffer (using buffers from the free list only).
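
To make the second option a bit more concrete, here is a rough sketch of
the behaviour I have in mind. It is a toy model only: ToyBufferPool and
all the toy_pool_* functions are made-up names for illustration, not
anything that exists in the buffer manager or in pg_hibernate. The restore
path takes a slot only if one is free and stops as soon as the pool is
full, so blocks already brought in by recovery are never thrown out.

```c
#include <stdbool.h>

#define POOL_SIZE     4
#define INVALID_BLOCK (-1)

/* Toy buffer pool for illustration; NOT PostgreSQL's buffer manager. */
typedef struct ToyBufferPool
{
    int block[POOL_SIZE];        /* block held by each slot, or INVALID_BLOCK */
    int usage_count[POOL_SIZE];  /* simplified usage counter per slot */
} ToyBufferPool;

static void
toy_pool_init(ToyBufferPool *pool)
{
    for (int i = 0; i < POOL_SIZE; i++)
    {
        pool->block[i] = INVALID_BLOCK;
        pool->usage_count[i] = 0;
    }
}

/* Return the slot holding blkno, or -1. Pass INVALID_BLOCK to find a free slot. */
static int
toy_pool_find(const ToyBufferPool *pool, int blkno)
{
    for (int i = 0; i < POOL_SIZE; i++)
        if (pool->block[i] == blkno)
            return i;
    return -1;
}

/* Ordinary read path (e.g. WAL replay): evicts the least-used slot if full. */
static void
toy_pool_read_evicting(ToyBufferPool *pool, int blkno)
{
    int slot = toy_pool_find(pool, blkno);

    if (slot >= 0)
    {
        pool->usage_count[slot]++;
        return;
    }
    slot = toy_pool_find(pool, INVALID_BLOCK);
    if (slot < 0)
    {
        /* No free slot: pick the victim with the lowest usage count. */
        slot = 0;
        for (int i = 1; i < POOL_SIZE; i++)
            if (pool->usage_count[i] < pool->usage_count[slot])
                slot = i;
    }
    pool->block[slot] = blkno;
    pool->usage_count[slot] = 1;
}

/* Proposed restore path: use a free slot only, never evict. */
static bool
toy_pool_load_if_free(ToyBufferPool *pool, int blkno)
{
    int slot;

    if (toy_pool_find(pool, blkno) >= 0)
        return true;             /* already resident, nothing to do */
    slot = toy_pool_find(pool, INVALID_BLOCK);
    if (slot < 0)
        return false;            /* pool is full; leave existing buffers alone */
    pool->block[slot] = blkno;
    pool->usage_count[slot] = 1;
    return true;
}

/* Restore a saved block list; stop at the first block that does not fit. */
static int
toy_pool_restore(ToyBufferPool *pool, const int *saved, int nsaved)
{
    int resident = 0;

    for (int i = 0; i < nsaved; i++)
    {
        if (!toy_pool_load_if_free(pool, saved[i]))
            break;
        resident++;
    }
    return resident;
}
```

With a four-slot pool in which recovery has already read three blocks,
restoring a saved list of three blocks brings in only the first one and
leaves the three recovered blocks in place.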