Jeff Davis wrote:
> On Mon, 2007-11-05 at 22:45 +0000, Heikki Linnakangas wrote:
>> 1) Do as you say above. What are some of the cost trade-offs here? It
>> seems that frequent VACUUM FREEZE runs would keep the visibility map
>> mostly full, but will also cause more writing. I suppose the worst case
>> is that every tuple write results in two data page writes, one normal
>> write and another to freeze it later, which sounds bad. Maybe there's a
>> way to try to freeze the tuples on a page before it's written out?
> It would also create more WAL traffic, because freezing tuples needs to
> be WAL-logged.

The thought crossed my mind, but I couldn't think of any reason why it
would need to be logged. Of course you're right, and the comments
explain it well.
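
To spell out the ordering those comments describe, here's a rough sketch;
the types and function names below are made up for illustration, not the
real PostgreSQL freezing/WAL API:

/* Illustrative sketch only -- made-up names, not the real API. */

typedef unsigned int TransactionId;
#define FrozenTransactionId ((TransactionId) 2)

typedef struct
{
    TransactionId xmin;         /* inserting transaction */
} TupleHeader;

/* stand-ins for the real WAL-insert and buffer-dirty calls */
extern void wal_log_freeze(TupleHeader *tup);
extern void mark_buffer_dirty(void *page);

/*
 * Replace a tuple's xmin with FrozenTransactionId.  The WAL record has to
 * come first: if we changed only the in-memory page, then advanced
 * relfrozenxid and truncated clog, a crash before the page reached disk
 * would leave a tuple whose original xmin's commit status no longer
 * exists.  With the record, recovery simply re-applies the freeze.
 */
static void
freeze_tuple(void *page, TupleHeader *tup)
{
    wal_log_freeze(tup);            /* redo will refreeze after a crash */
    tup->xmin = FrozenTransactionId;
    mark_buffer_dirty(page);
}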

>> 5) Have a more fine-grain equivalent of relfrozenxid. For example one
>> frozenxid per visibility map page, so that whenever you update the
>> visibility map, you also update the frozenxid. To advance the
>> relfrozenxid in pg_class, you scan the visibility map and set
>> relfrozenxid to the smallest frozenxid. Unlike relfrozenxid, it could be
>> set to FrozenXid if the group of pages is totally frozen.
> Wouldn't that still require WAL traffic? Otherwise how can you guarantee
> that the FrozenXid hits disk before TruncateCLOG truncates the old xmin
> away?

Updating the fine-grain frozenxid would still need to be WAL-logged. But
it would be a lot less frequent than aggressively freezing tuples.
Compared to the idea of having a separate bitmap or two bits per tuple
in one data structure, you wouldn't necessarily have to freeze tuples to
advance it; you could just observe what the smallest xid on a group of
pages is. Like regular lazy vacuum does right now for relfrozenxid, just
more fine-grained.
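
Roughly what I have in mind, as a sketch -- the types, sizes, and function
names here are made up just to show the bookkeeping, not actual code:

/* Illustrative sketch only -- not real visibility map code. */

typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define FrozenTransactionId  ((TransactionId) 2)

#define MAP_GROUPS 1024     /* one entry per visibility map page, say */

/*
 * One frozenxid per group of heap pages (the pages covered by one
 * visibility map page).  FrozenTransactionId means every tuple in the
 * group is frozen; InvalidTransactionId means "unknown, must scan".
 */
static TransactionId group_frozenxid[MAP_GROUPS];

/*
 * Vacuum would call this after looking at all tuples in a group:
 * oldest_seen is the smallest unfrozen xid it saw, or FrozenTransactionId
 * if it froze (or found frozen) everything.  This is the update that
 * would be WAL-logged, which is much rarer than logging every frozen
 * tuple.
 */
static void
vacuum_set_group_frozenxid(int group, TransactionId oldest_seen)
{
    group_frozenxid[group] = oldest_seen;
    /* WAL-log this map change before clog can be truncated past it */
}

/*
 * To advance relfrozenxid, take the minimum over all groups.  Nothing has
 * to be frozen for this to move forward; it's the same bookkeeping plain
 * lazy vacuum does for relfrozenxid today, just per group of pages.
 * (The "<" comparison ignores xid wraparound; real code would need
 * TransactionIdPrecedes().)
 */
static TransactionId
compute_new_relfrozenxid(void)
{
    TransactionId oldest = FrozenTransactionId;
    int     i;

    for (i = 0; i < MAP_GROUPS; i++)
    {
        TransactionId x = group_frozenxid[i];

        if (x == InvalidTransactionId)
            return InvalidTransactionId;    /* some group never scanned */
        if (x != FrozenTransactionId &&
            (oldest == FrozenTransactionId || x < oldest))
            oldest = x;
    }
    return oldest;      /* FrozenXid if the whole table is frozen */
}

The array would of course have to live on disk next to the map itself, not
in memory, but it shows how relfrozenxid can keep moving without a full
freezing pass.
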
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com