On Thu, Jun 14, 2018 at 11:44 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> I attach an unfinished prototype of suffix truncation, that also
> sometimes *adds* a new attribute in pivot tuples. It adds an extra
> heap TID from the leaf level when truncating away non-distinguishing
> attributes during a leaf page split, though only when it must. The
> patch also has nbtree treat heap TID as a first class part of the key
> space of the index. Claudio wrote a patch that did something similar,
> though without the suffix truncation part [2] (I haven't studied his
> patch, to be honest). My patch is actually a very indirect spin-off of
> Anastasia's covering index patch, and I want to show what I have in
> mind now, while it's still swapped into my head. I won't do any
> serious work on this project unless and until I see a way to implement
> retail index tuple deletion, which seems like a multi-year project
> that requires the buy-in of multiple senior community members. On its
> own, my patch regresses performance unacceptably in some workloads,
> probably due to interactions with kill_prior_tuple()/LP_DEAD hint
> setting, and interactions with page space management when there are
> many "duplicates" (it can still help performance in some pgbench
> workloads with non-unique indexes, though).

I attach a revised version, which is still very much of prototype
quality, but manages to solve a few of the problems that v1 had.
Andrey Lepikhov (CC'd) asked me to post any improved version I might
have for use with his retail index tuple deletion patch, so I thought
I'd post what I have.

The main development for v2 is that the sort order of the implicit
heap TID attribute is flipped. In v1, it was in "ascending" order. In
v2, comparisons of heap TIDs are inverted to make the attribute order
"descending". This has a number of advantages:

* It's almost consistent with the current behavior when there are
repeated insertions of duplicates. Currently, this tends to result in
page splits of the leftmost leaf page among pages that mostly consist
of the same duplicated value. This means that the destabilizing impact
on DROP SCHEMA ... CASCADE regression test output noted before [1] is
totally eliminated. There is now only a single trivial change to
regression test "expected" files, whereas in v1 dozens of "expected"
files had to be changed, often resulting in less useful reports for
the user.

* The performance regression I observed with various pgbench workloads
seems to have gone away, or is now within the noise range. A patch
like this one requires a lot of validation and testing, so this should
be taken with a grain of salt.

I may have been too quick to give up on my original ambition of
writing a stand-alone patch that can be justified entirely on its own
merits, without being tied to some much more ambitious project like
retail index tuple deletion by VACUUM, or zheap's deletion marking. I
still haven't tried to replace the kludgey handling of unique index
enforcement, even though that would probably have a measurable
additional performance benefit. I think that this patch could become
an unambiguous win.

[1] 
https://postgr.es/m/CAH2-Wz=wAKwhv0PqEBFuK2_s8E60kZRMzDdyLi=-mvcm_ph...@mail.gmail.com
-- 
Peter Geoghegan
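
To make the v2 ordering concrete, here is a toy model in Python. It is
not code from the patch, and nothing in it is an nbtree API: HeapTid,
IndexTuple, sort_key_v2 and insert_position are hypothetical names
invented for this sketch. It assumes a single integer key column and
treats a heap TID as a (block, offset) pair. The point is only to show
what "user key ascending, heap TID descending" does to a run of
duplicates.

```python
from bisect import bisect_left
from typing import List, NamedTuple, Tuple


class HeapTid(NamedTuple):
    """Toy stand-in for a heap TID: (block number, offset number)."""
    block: int
    offset: int


class IndexTuple(NamedTuple):
    """Toy leaf tuple: one integer key column plus the heap TID."""
    key: int
    tid: HeapTid


def sort_key_v2(t: IndexTuple) -> Tuple[int, int, int]:
    # Assumption for this sketch: the user key sorts ascending and the
    # heap TID acts as a trailing tiebreaker that sorts descending.
    return (t.key, -t.tid.block, -t.tid.offset)


def insert_position(leaf: List[IndexTuple], new: IndexTuple) -> int:
    # Binary search for the slot where `new` belongs on a sorted "page".
    keys = [sort_key_v2(t) for t in leaf]
    return bisect_left(keys, sort_key_v2(new))


# A "leaf page" holding three duplicates of key 7 and one tuple for key 5.
leaf = sorted(
    [
        IndexTuple(7, HeapTid(0, 1)),
        IndexTuple(7, HeapTid(0, 2)),
        IndexTuple(5, HeapTid(3, 4)),
        IndexTuple(7, HeapTid(1, 1)),
    ],
    key=sort_key_v2,
)

# A new duplicate of 7 whose heap TID is higher than any existing one.
new_tuple = IndexTuple(7, HeapTid(2, 1))
pos = insert_position(leaf, new_tuple)
```

In this model every (key, TID) pair is distinct, so no two leaf tuples
compare as equal. A duplicate with a higher heap TID than the existing
ones lands at the left end of its run of duplicates. New heap tuples do
not always get higher TIDs (free space can be reused), so this is only
the common case, not a guarantee.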
v2-0001-Ensure-nbtree-leaf-tuple-keys-are-always-unique.patch