On Thu, Jul 20, 2017 at 7:45 AM, Claudio Freire <klaussfre...@gmail.com> wrote:
>> For the purposes of this discussion, I'm mostly talking about
>> duplicates within a page on a unique index. If the keyspace owned by
>> an int4 unique index page only covers 20 distinct values, it will only
>> ever cover 20 distinct values, now and forever, despite the fact that
>> there is room for about 400 (a 90/10 split leaves you with 366 items +
>> 1 high key).
>
> Microvacuum could also help.
>
> If during a scan you find pointers that point to dead (in vacuum terms)
> tuples, the pointers in the index could be deleted. That could be done
> during insert into unique indexes before a split, to avoid the split.
>
> Chances are, if there are duplicates, at least a few of them will be dead.

My whole point is that that could easily fail to happen early enough to prevent a page split that is only needed because there is a short-term surge in the number of duplicate versions that need to be available for one old snapshot. A page split can be a permanent solution to a temporary problem. Page deletion can only occur under tight conditions that are unlikely to *ever* be met in many cases.

Imagine if it was impossible to insert physical duplicates into unique indexes. In that world, you'd end up bloating some overflow data structure in UPDATE-heavy cases (where HOT doesn't work out). The bloat wouldn't go on leaf pages, and so you wouldn't get page splits, and so you wouldn't end up with leaf pages that can only store 20 distinct values now and forever, because that's the range of values represented by downlinks and the leaf's high key. That's a situation we actually saw for the leftmost leaf page in Alik's Zipfian distribution test.

The way that the keyspace is broken up is supposed to be balanced, and to have long-term utility. Working against that to absorb a short-term bloat problem is penny wise, pound foolish.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
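
As a rough check on the capacity figures in the quoted text ("room for about 400", "366 items + 1 high key"), here is a back-of-the-envelope sketch in Python. Every constant in it is an assumption about a default build and is not taken from this thread: an 8 kB block, a 24-byte page header, 16 bytes of B-tree special space, a 16-byte index tuple for an int4 key after alignment, and a 4-byte line pointer per item. The real split code picks its split point differently, so treat the result as an approximation only.

```python
# Back-of-the-envelope leaf page capacity for an int4 B-tree index.
# All constants are assumptions about a default build, not values
# stated in this thread.
BLOCK_SIZE = 8192      # assumed default block size, in bytes
PAGE_HEADER = 24       # assumed page header size
BTREE_SPECIAL = 16     # assumed B-tree special space at the end of the page
INDEX_TUPLE = 16       # assumed: 8-byte tuple header + 4-byte key, padded to 8 bytes
LINE_POINTER = 4       # assumed per-item line pointer size


def usable_bytes():
    """Bytes on the page available for line pointers and index tuples."""
    return BLOCK_SIZE - PAGE_HEADER - BTREE_SPECIAL


def max_items():
    """How many int4 index tuples fit on a completely full page."""
    return usable_bytes() // (INDEX_TUPLE + LINE_POINTER)


def items_at_fill(fill_percent):
    """How many tuples fit in fill_percent of the usable space."""
    budget = usable_bytes() * fill_percent // 100  # integer math, no float rounding
    return budget // (INDEX_TUPLE + LINE_POINTER)


if __name__ == "__main__":
    print("full page:", max_items())
    print("90% full: ", items_at_fill(90))
    print("share used by 20 distinct values: %.1f%%" % (100.0 * 20 / max_items()))
```

Under those assumptions a full page holds a little over 400 tuples and a 90% full page holds roughly 366, which is close to the quoted numbers. A page whose keyspace only covers 20 distinct values can therefore use about 5% of its capacity once the extra duplicate versions are gone.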