Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

Robert Haas Fri, 15 Jun 2018 14:36:40 -0700

On Thu, Jun 14, 2018 at 2:44 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> I've been thinking about using heap TID as a tie-breaker when
> comparing B-Tree index tuples for a while now [1]. I'd like to make
> all tuples at the leaf level unique, as assumed by L&Y. This can
> enable "retail index tuple deletion", which I think we'll probably end
> up implementing in some form or another, possibly as part of the zheap
> project. It's also possible that this work will facilitate GIN-style
> deduplication based on run length encoding of TIDs, or storing
> versioned heap TIDs in an out-of-line nbtree-versioning structure
> (unique indexes only). I can see many possibilities, but we have to
> start somewhere.
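
For readers following along, the tie-breaker idea quoted above can be sketched roughly as follows. This is only an illustration, not the patch's code: the SketchTid and SketchIndexTuple types and both comparison functions are hypothetical, and a single integer key stands in for real index attributes. The point is that two entries with equal keys still compare as distinct, because the heap TID is consulted last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a heap TID: (block number, offset). */
typedef struct {
    uint32_t block;
    uint16_t offset;
} SketchTid;

/* Hypothetical leaf entry: one integer key plus the heap TID it points to. */
typedef struct {
    int64_t   key;
    SketchTid tid;
} SketchIndexTuple;

/* Order TIDs by block number first, then by offset within the block. */
static int sketch_tid_cmp(SketchTid a, SketchTid b)
{
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

/*
 * Compare by user-visible key first; fall back to the heap TID only when
 * the keys are equal. Returns 0 only when both entries reference the same
 * heap tuple, so every leaf entry has a unique position in the key space.
 */
int sketch_tuple_cmp(const SketchIndexTuple *a, const SketchIndexTuple *b)
{
    assert(a != NULL && b != NULL);
    if (a->key != b->key)
        return a->key < b->key ? -1 : 1;
    return sketch_tid_cmp(a->tid, b->tid);
}
```

With an ordering like this, a search for (key, TID) can descend to the one leaf page that can hold the entry, which is what makes deleting a single index tuple feasible.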


Yes, retail index deletion is essential for the delete-marking
approach that is proposed for zheap.

It could also be extremely useful in some workloads with the regular
heap.  If the indexes are large -- say, 100GB -- and the number of
tuples that vacuum needs to kill is small -- say, 5 -- scanning each
index in its entirety to remove the references to those tuples is
really inefficient.
If we had retail index deletion, then we could make a cost-based
decision about which approach to use in a particular case.
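
A minimal sketch of what such a cost-based choice might look like, under a deliberately crude model. Everything here is an assumption made for illustration: the SketchIndexCostInput struct, the sketch_prefer_retail_deletion() function, and the cost formulas are hypothetical and do not correspond to any existing PostgreSQL code. The model charges a full scan one sequential page read per index page, and charges retail deletion one root-to-leaf descent of random page reads per dead tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs to the decision; none of these names exist in PostgreSQL. */
typedef struct {
    double index_pages;      /* total pages in the index */
    double tree_height;      /* pages read per root-to-leaf descent */
    double seq_page_cost;    /* assumed cost of one sequential page read */
    double random_page_cost; /* assumed cost of one random page read */
} SketchIndexCostInput;

/*
 * Return true when deleting the dead tuples one at a time is estimated to
 * be cheaper than scanning the whole index once.
 */
bool sketch_prefer_retail_deletion(const SketchIndexCostInput *in,
                                   double dead_tuples)
{
    assert(in != NULL && in->index_pages > 0 && in->tree_height > 0);

    /* Bulk deletion: read every page of the index once, sequentially. */
    double bulk_cost = in->index_pages * in->seq_page_cost;

    /* Retail deletion: one descent of random reads per dead tuple. */
    double retail_cost = dead_tuples * in->tree_height * in->random_page_cost;

    return retail_cost < bulk_cost;
}
```

For the example above, a 100GB index with 8kB pages has 13,107,200 pages. Assuming a tree height of 4 and page costs of 1.0 (sequential) and 4.0 (random), 5 dead tuples give a retail estimate of 80 against a bulk estimate of 13,107,200, so retail deletion wins easily; with enough dead tuples the comparison flips back in favor of the full scan.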

> mind now, while it's still swapped into my head. I won't do any
> serious work on this project unless and until I see a way to implement
> retail index tuple deletion, which seems like a multi-year project
> that requires the buy-in of multiple senior community members.

Can you enumerate some of the technical obstacles that you see?

> On its
> own, my patch regresses performance unacceptably in some workloads,
> probably due to interactions with kill_prior_tuple()/LP_DEAD hint
> setting, and interactions with page space management when there are
> many "duplicates" (it can still help performance in some pgbench
> workloads with non-unique indexes, though).

I think it would be helpful if you could talk more about these
regressions (and the wins).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
