Gregory Stark wrote:
> I'm also a bit concerned that *how many hint bits* isn't enough information to
> determine how important it is to write out the page.
Agreed, that doesn't seem like a very good metric to me either.
> Or how many *unhinted* xmin/xmax
> values were found? If HTSV can hint xmin for a tuple but finds xmax still in
> progress perhaps that's a good sign it's not worth dirtying the page?
I like that thought.
Overall, I feel that we should never dirty when setting a hint bit, just
set the separate buffer flag to indicate that hint bits have been set.
The decision to dirty and write out, or not, should be delayed until
we're about to write/replace the buffer. That is, in bgwriter.
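To make the flag separation concrete, here is a minimal sketch in Python (not PostgreSQL's C, and not its actual buffer manager API). The flag names `DIRTY` and `HINT_DIRTY` and the helper functions are made up for illustration; the only point is that setting a hint bit records the fact without forcing a write.

```python
from dataclasses import dataclass

# Hypothetical flag names for illustration; not PostgreSQL's actual buffer flags.
DIRTY = 0x1        # page has real changes that must reach disk
HINT_DIRTY = 0x2   # only hint bits changed; writing is optional


@dataclass
class Buffer:
    flags: int = 0


def mark_dirty(buf: Buffer) -> None:
    """A real modification (insert/update/delete) always dirties the page."""
    buf.flags |= DIRTY


def set_hint_bit(buf: Buffer) -> None:
    """Setting a hint bit only records that fact; it never sets DIRTY."""
    buf.flags |= HINT_DIRTY


def must_write(buf: Buffer) -> bool:
    """Checked when the buffer is about to be written or replaced."""
    return bool(buf.flags & DIRTY)


def may_write(buf: Buffer) -> bool:
    """Hint-only pages are merely candidates; the write policy decides later."""
    return bool(buf.flags & HINT_DIRTY) and not must_write(buf)
```

With this split, whatever policy bgwriter ends up using only has to look at `may_write()` pages; pages with real changes are unaffected.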
How about this strategy:
1. First of all, before writing a dirty buffer, scan all tuples on the
page and set all hint bits that can be set. This will hopefully save us
from having to dirty the page again in the future, when another tuple on
the page is accessed. This has been proposed before, and IIRC Tom has
argued that it's a modularity violation for bgwriter to access the
contents of pages like that, but I'm sure we can find a way to do it safely.
2. When bgwriter encounters a page that's marked as "hint bits dirty",
write it only if *all* hint bits on the page have been, or can be, set.
Dirtying a page before that point doesn't seem worthwhile, as the next
access to the tuple that doesn't have all the hint bits set will have to
dirty the page again.
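The two steps above could be sketched roughly as follows. This is a toy Python model, not bgwriter code: `Page`, `HeapTuple`, `set_possible_hints` and `should_write` are hypothetical names, the clog is a plain dict, and each tuple has a single hint bit for xmin only (real tuples also carry xmax hints).

```python
from dataclasses import dataclass
from enum import Enum


class XactStatus(Enum):
    IN_PROGRESS = 0
    COMMITTED = 1
    ABORTED = 2


@dataclass
class HeapTuple:
    xmin: int
    hinted: bool = False   # simplification: one hint bit per tuple (xmin only)


@dataclass
class Page:
    tuples: list
    dirty: bool = False        # real changes pending
    hint_dirty: bool = False   # only hint bits changed


def set_possible_hints(page, clog):
    """Step 1: set every hint bit whose transaction has finished.
    Returns True if all tuples on the page end up hinted."""
    for tup in page.tuples:
        if not tup.hinted and clog[tup.xmin] is not XactStatus.IN_PROGRESS:
            tup.hinted = True
    return all(tup.hinted for tup in page.tuples)


def should_write(page, clog):
    """Decide at write/replace time whether the page goes to disk."""
    if not (page.dirty or page.hint_dirty):
        return False
    fully_hinted = set_possible_hints(page, clog)
    if page.dirty:
        return True        # real changes are always written
    return fully_hinted    # step 2: hint-only pages wait until complete
```

For example, a hint-dirty page holding one committed tuple and one tuple whose inserting transaction is still in progress is skipped; once that transaction finishes, the same call sets the remaining hint and returns True.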
Actually, I'd like to see some benchmarks on an even simpler strategy:
never dirty a page merely because a hint bit has been set. It might
work surprisingly well in practice: If a database is I/O bound, we don't
care about the extra CPU work or lock congestion of checking the clog.
If it's CPU bound, the active pages that matter are in the buffer cache,
and so are the hint bits for those pages.
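To illustrate the trade-off in that simpler strategy, here is a toy Python model (the class `ToyCache` and its fields are invented for this sketch, and the "clog" is just a set of committed xids). Hint bits live only in the cached copy of a page: repeated visibility checks on a cached page skip the clog, and the cost of never writing them out is an extra clog lookup after the page has been evicted and read back.

```python
class ToyCache:
    """Hint bits live only in memory; hint-only pages are never written."""

    def __init__(self, committed_xids):
        self.committed = set(committed_xids)  # stand-in for the clog
        self.hints = {}        # page_id -> set of xids already hinted in cache
        self.clog_lookups = 0
        self.page_writes = 0   # stays 0: hint bits alone never cause a write

    def is_committed(self, page_id, xid):
        hinted = self.hints.setdefault(page_id, set())
        if xid in hinted:
            return True              # hint bit answers without the clog
        self.clog_lookups += 1
        if xid in self.committed:
            hinted.add(xid)          # set the hint, but do not dirty the page
            return True
        return False

    def evict(self, page_id):
        # The page is dropped without a write, so its hints are lost.
        self.hints.pop(page_id, None)
```

Whether the repeated clog lookups after eviction matter in practice is exactly what the benchmarks would have to show; the sketch only makes the bookkeeping explicit.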
Sent via pgsql-patches mailing list (firstname.lastname@example.org)