Hi,

(I wrote the first part of this email before Robert and I chatted on a call;
I left it in for posterity.)
On 2023-09-26 13:49:32 -0400, Robert Haas wrote:
> On Tue, Sep 26, 2023 at 11:11 AM Andres Freund <and...@anarazel.de> wrote:
> > As long as the most extreme cases are prevented, unnecessarily freezing is imo
> > far less harmful than freezing too little.
> >
> > I'm worried that using something as long as 100-200%
> > time-between-recent-checkpoints won't handle insert-mostly workloads well,
> > which IME are also workloads suffering quite badly under our current scheme -
> > and which are quite common.
>
> I wrote about this problem in my other reply and I'm curious as to
> your thoughts about it. Basically, suppose we forget about all of
> Melanie's tests except for three cases: (1) an insert-only table, (2)
> an update-heavy workload with uniform distribution, and (3) an
> update-heavy workload with skew. In case (1), freezing is good. In
> case (2), freezing is bad. In case (3), freezing is good for cooler
> pages and bad for hotter ones. I postulate that any
> recency-of-modification threshold that handles (1) well will handle
> (2) poorly, and that the only way to get both right is to take some
> other factor into account. You seem to be arguing that we can just
> freeze aggressively in case (2) and it won't cost much, but it doesn't
> sound to me like Melanie believes that and I don't think I do either.

I don't believe we can freeze aggressively in all cases of (2) without causing
problems. A small-ish table that's vacuumed constantly, where all rows are
constantly frozen and then updated, will suffer a lot from the WAL overhead.
Whereas superfluously freezing a row in a table with many millions of rows,
where each row is only occasionally updated - the update rate being much
smaller than the number of rows - will have acceptable overhead.

What I *do* believe is that for all but the most extreme cases, it's safer to
freeze too much than to freeze too little.
There definitely are negative consequences, but they're more bounded and less
surprising than not freezing for ages and then suddenly freezing everything at
once.

Whether (2) really exists in the real world for huge tables is of course
somewhat debatable...

> > > This doesn't seem completely stupid, but I fear it would behave
> > > dramatically differently on a workload a little smaller than s_b vs.
> > > one a little larger than s_b, and that doesn't seem good.
> >
> > Hm. I'm not sure that that's a real problem. In the case of a workload bigger
> > than s_b, having to actually read the page again increases the cost of
> > freezing later, even if the workload is just a bit bigger than s_b.
>
> That is true, but I don't think it means that there is no problem. It
> could lead to a situation where, for a while, a table never needs any
> significant freezing, because we always freeze aggressively.

What do you mean by "always freeze aggressively" - do you mean 'aggressive'
autovacuums? Or opportunistic freezing being aggressive? I don't see why the
former would be the case.

> When it grows large enough, we suddenly stop freezing it aggressively, and
> now it starts experiencing vacuums that do a whole bunch of work all at
> once. A user who notices that is likely to be pretty confused about what
> happened, and maybe not too happy when they find out.

Hm - isn't my proposal exactly the other way round? I'm proposing that a page
is frozen more aggressively if it is *not* already in shared buffers - which
will become more common once the table has grown "large enough"?


(the remainder was written after that call)

I think there were three main ideas that we discussed:

1) We don't need to be accurate in the freezing decisions for individual
   pages, we "just" need to avoid the situation that over time we commonly
   freeze pages that will be updated again "soon".

2) It might be valuable to adjust the "should freeze page opportunistically"
   decision based on feedback.
3) We might need to classify the workload for a table and use different
   heuristics for different workloads.

For 2), one of the mechanisms we discussed was to collect information about
the "age" of a page when we "unfreeze" it. If we frequently unfreeze pages
that were recently frozen, we need to be less aggressive with opportunistic
freezing going forward. If that never happens, we can be more aggressive.

The basic idea for classifying the age of a page when unfreezing it is to use
"insert_lsn - page_lsn" - pretty simple. We can convert that into time using
the averaged WAL generation rate.

What's a bit harder is figuring out how to usefully aggregate the age across
multiple "unfreezes". I was initially thinking of just using the mean, but
Robert was rightly concerned that the mean would be moved a lot by
occasionally freezing very old pages, potentially leading to opportunistically
freezing young pages too aggressively. The median would be a better choice,
but at least with naive algorithms we can't maintain it incrementally in the
stats.

One way to deal with that would be to not track the average age in
LSN-difference-bytes, but to convert the value to some coarser age metric at
that point. If we e.g. were to convert the byte-age into an approximate age in
checkpoints, with quadratic bucketing (e.g. 0 -> current checkpoint, 1 -> 1
checkpoint ago, 2 -> 2 checkpoints ago, 3 -> 4 checkpoints ago, ...), using a
mean of that age would probably be fine.

Once we have a metric like that, we can increase the aggressiveness of
opportunistic freezing if recently frozen pages aren't frequently unfrozen,
and decrease the aggressiveness when they are.

For 3), we could do something similar. If we made updates/deletes track how
long ago (again in LSN difference) the page was last modified, we should be
able to quite effectively differentiate between a workload that only modifies
recent data and one that doesn't have such locality.

Greetings,

Andres Freund