On 9/11/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> You could mark such tuples with LP_DELETE. That would also let other
> transactions quickly tot up how much space would be available if they were
> to run PageRepairFragmentation.
IMHO we are going around in circles here. We have already tried LP_DELETE
and moved away from it to simplify things. We also tried reusing dead space
without running PageRepairFragmentation. Each of these techniques worked just
fine, with slightly different performance characteristics. What we now have is
a simplified algorithm which is much easier to follow and is safer, yet still
gives us a very good performance boost. I am not sure this is the right time
to throw in new ideas, because we can never be sure that what we are doing
is the optimal solution. Would it help if we go with some
solution right now, get the rest of the review process done, and then use
the feedback during beta testing to tune things? We would have far more
data points at that time to choose one technique over another. And we
would also know what areas to focus on.
I am also worried that by focusing too much on this issue we may
overlook some other correctness issue in the patch.
From what we have discussed so far, IMHO we should do the following
things and let the rest of the review process proceed:
- Defragment a page only when the free space left in the page is not
enough to accommodate even a single tuple (use the average tuple length
for this decision). This would mean we might defragment even though
there is no immediate UPDATE to the page. But we can treat this like a
fillfactor, which lets us provision for the next UPDATE coming to
the page. Since we defragment only when the page is almost full,
hopefully we would reclaim a good amount of space in the page and
won't need to defragment again for the next few UPDATEs.
We already have a mechanism to track the average tuple request size in
the relcache. Maybe we can use relcache invalidation to keep the
information in sync (send an invalidation when the average request size
changes by, say, 5%).
- Avoid pruning chains on every index or seq lookup. But if a chain
becomes longer than X tuples, mark the page to be pruned on the
next lookup. We could choose to separate pruning from defragmentation
and only do pruning in this case. But I would prefer to keep them
together for now.
- Track the minimum xmin in the page header to avoid repeated
(wasted) attempts to prune a Prunable page in the presence of long running
transactions.
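
To make the three points above concrete, here is a rough sketch of the
checks in C. None of these names come from the patch: PageSketch,
should_defragment, avg_needs_invalidation, note_chain_length and
prune_worthwhile are all made up for illustration, MAX_CHAIN_LEN is just a
placeholder for "X", and the xmin test assumes the value stored in the page
header is compared against the oldest xmin still needed by any running
transaction. Treat it as a way to read the proposal, not as the
implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page state; this is NOT the real PageHeaderData. */
typedef struct PageSketch
{
    size_t   free_space;   /* free bytes left in the page */
    uint32_t min_xmin;     /* minimum xmin tracked in the page header */
    bool     prunable;     /* page has been marked for pruning */
} PageSketch;

#define MAX_CHAIN_LEN     4   /* the "X" from the second point; placeholder */
#define AVG_DRIFT_PERCENT 5   /* the "say 5%" from the first point */

/* Wraparound-aware "a is older than b"; special XIDs are ignored here. */
static bool
xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Point 1: defragment only when not even one average-sized tuple fits. */
static bool
should_defragment(const PageSketch *page, size_t avg_tuple_len)
{
    return page->free_space < avg_tuple_len;
}

/*
 * Point 1, second half: send an invalidation once the average request size
 * has drifted by at least 5% from the value other backends last saw.
 * A published average of zero always triggers, i.e. the first publish.
 */
static bool
avg_needs_invalidation(size_t published_avg, size_t current_avg)
{
    size_t diff = (published_avg > current_avg)
        ? published_avg - current_avg
        : current_avg - published_avg;

    return diff * 100 >= published_avg * AVG_DRIFT_PERCENT;
}

/* Point 2: a lookup never prunes; it only flags an over-long chain. */
static void
note_chain_length(PageSketch *page, int chain_len)
{
    if (chain_len > MAX_CHAIN_LEN)
        page->prunable = true;
}

/*
 * Point 3: a flagged page is worth pruning only if its minimum xmin is
 * older than the oldest xmin any running transaction still needs.
 * With a long running transaction around, this stays false and the
 * repeated prune attempts are skipped.
 */
static bool
prune_worthwhile(const PageSketch *page, uint32_t oldest_running_xmin)
{
    return page->prunable && xid_precedes(page->min_xmin, oldest_running_xmin);
}
```

The point of keeping these as cheap predicates is that the lookup path only
pays for a comparison or two; the expensive work happens when
prune_worthwhile and should_defragment say it is likely to pay off.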
We can save the rest of the techniques for the beta testing period or 8.4.