On Fri, 2007-09-07 at 22:08 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> > > We could begin pruning only when the chain is N long. Currently N=2, but
> > > we could set N=3+ easily enough. There's no code yet to actually count
> > > that, but we can do that easily as we do each lookup. We should also be
> > > able to remember the visibility result for each tuple in the chain to
> > > decide whether pruning will be effective or not.
> > >
> > > I would say that if the buffer is already dirty and the chain is
> > > prunable, we should prune it at the first possible chance.
> > If we defer pruning until the page is full, in the worst case we could end
> > up with a chain ~240 tuples long, which might need to be scanned
> > repeatedly. That won't happen often, but it would be better to prune
> > whenever we hit one of these conditions:
> > - when the block is full
> > - when we reach the 16th tuple in a chain
Thanks for defining terms.
Just to answer a few of your points:
> I don't see how following a HOT chain is any slower than following an
> UPDATE chain like we do now.
The speed/cost is the same. The issue is *when* we do this. Normal
SELECTs will follow the chain each time we do an index lookup.
So if our read/write ratio is 1000:1, every one of those reads re-walks
the same chain and we waste many cycles, yet the chain is never pruned on
SELECT. So there really must be some point at which we prune on a SELECT.
Perhaps we could say if Xid % 16 == 0 then we prune, i.e. every 16th
transaction walking the chain will prune it.
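
To make the Xid % 16 idea concrete, here is a rough sketch in Python rather
than backend C. The function names and the chain_is_prunable flag are made up
for illustration and are not anything in the patch; it only shows that, under
this rule, roughly one reader in sixteen would pay the pruning cost.

```python
# Hypothetical sketch of the "Xid % 16 == 0" rule; not actual backend code.
PRUNE_XID_MODULUS = 16  # the modulus suggested above


def should_prune_on_select(xid: int, chain_is_prunable: bool) -> bool:
    """Return True when this reader should prune the chain it just walked."""
    # Only prune if there is something to prune and the xid hits the modulus.
    return chain_is_prunable and xid % PRUNE_XID_MODULUS == 0


def count_pruning_readers(first_xid: int, n_readers: int) -> int:
    """Count how many of n_readers consecutive xids would trigger a prune."""
    return sum(
        should_prune_on_select(xid, True)
        for xid in range(first_xid, first_xid + n_readers)
    )
```

With 1000 consecutive readers starting at xid 1, count_pruning_readers(1, 1000)
counts the multiples of 16 in that range, so the chain is pruned early on
rather than being walked at full length by every reader.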
> Also, why all the talk of index lookups doing pruning? Can't a
> sequential scan do pruning?
The SeqScan doesn't follow the chains, so it can't easily determine whether
there are any long chains that need pruning. It's only when we come in
via an index that we need to walk the chain to the latest tuple version,
and in that case we learn how long the chain is.
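
As a toy model of that last point (Python, not backend code; TupleVersion,
walk_chain and needs_pruning are hypothetical names, and the threshold of 16
is just the figure from the list quoted above), the walk to the latest version
yields the chain length as a by-product:

```python
# Toy model of an index lookup walking an update chain; not backend code.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TupleVersion:
    value: str
    next: Optional["TupleVersion"] = None  # newer version, None at chain end


def walk_chain(root: TupleVersion) -> Tuple[TupleVersion, int]:
    """Follow the chain from root to the latest version.

    Returns the latest version and the number of tuples visited.
    """
    length = 1
    current = root
    while current.next is not None:
        current = current.next
        length += 1
    return current, length


def needs_pruning(length: int, threshold: int = 16) -> bool:
    """Assumed rule: prune once the chain reaches the threshold length."""
    return length >= threshold
```

A sequential scan in this model would visit each TupleVersion independently
and never call walk_chain, which is why it has no cheap way to learn the
length.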