Re: documentation on HOT

Peter Geoghegan Fri, 22 Jul 2022 10:05:46 -0700

On Fri, Jul 22, 2022 at 8:09 AM Jonathan S. Katz <jk...@postgresql.org> wrote:
> Thanks! This is great. Probably the most concise and clear explanation I
> have seen for HOT, which is exactly what we need for the docs :)


I'm delighted that this is finally happening!

> I think we should expand on this and explain how adjusting "fillfactor"
> will affect this. I think that may change the final sentence too.

Definitely -- this is by far the most important reason to tune heap
fill factor, which can make a big difference. There should be a link
from the fill factor docs to the new HOT section, at a minimum.

Another thing: would be good to point out how to monitor the number of
HOT updates vs non-HOT updates using views like pg_stat_user_tables.

One minor correction: Opportunistic pruning isn't limited to heap-only
tuples -- "HOT pruning" is actually a bit of a misnomer, that somehow
caught on in the Postgres community. Opportunistic pruning can and
will happen even with non-HOT updates -- the only difference on the
heap page level is that 4 byte LP_DEAD line pointers will accumulate
over time, which only VACUUM is able to clean up (we need LP_DEAD
items to stick around until then as tombstones, so that index scans
don't ever get confused). But pruning will still be able to free most
of the space on the heap page, in almost the same way.

Of course the indexes on the table are another matter -- each and
every index will accumulate versions when there are many non-HOT
updates, which can be a huge problem. My point is that *from the point
of view of an affected heap page*, pruning (whether opportunistic or
performed by VACUUM) can go ahead without HOT tuples/updates, and
isn't really that different in terms of the amount of space freed each
time.

Bottom-up index deletion is effective is no small part because pruning
was always pretty effective even with non-HOT updates, except within
indexes -- it pretty much compensates for that remaining problem. The
heuristics that it uses in the nbtree code work quite well because of
the natural tendency of those 4 byte LP_DEAD item pointers to
concentrate on individual heap pages. We can fit so many more 4 byte
LP_DEAD items than full sized live heap tuples that it becomes pretty
obvious (from the point of view of the nbtree code that starts out by
looking at an index leaf page) which pointed-to heap blocks are likely
to enable it to free space via index tuple deletes -- it often visits
the heap pages with the most TIDs.

Before bottom-up index deletion was added, I would notice that pgbench
variant workloads with non-HOT updates would result in a
pgbench_accounts table that is maybe 1%-2% larger after a few hours,
while the indexes on the same table would grow by 100% - 200%.
Opportunistic pruning would work pretty well for the heap, but
wouldn't do anything at all to control bloat in indexes. So
opportunistic pruning was always very effective with non-HOT updates
in one important way, but totally ineffective in another way.

There is a section about bottom-up index deletion in the docs (in the
B-Tree internals chapter) that already references HOT. It should link
to this new chapter now, I think.

-- 
Peter Geoghegan

Re: documentation on HOT

Reply via email to