On Mon, Jan 4, 2021 at 8:28 AM Victor Yegorov <vyego...@gmail.com> wrote:
> I've looked through the v12 patch.
>
> I like the new outline:
>
> - _bt_delete_or_dedup_one_page() is the main entry for the new code
> - first we try _bt_simpledel_pass() does improved cleanup of LP_DEAD entries
> - then (if necessary) _bt_bottomupdel_pass() for bottomup deletion
> - finally, we perform _bt_dedup_pass() to deduplication
>
> We split the leaf page only if all the actions above failed to provide
> enough space.
I'm thinking of repeating the LP_DEAD enhancement within GiST and hash shortly
after this, as follow-up work. Their implementations of LP_DEAD deletion were
always based on the nbtree original, and I think that it makes sense to keep
everything in sync. The simple deletion enhancement probably makes just as
much sense in those other places.

> + * Only do this for key columns. A change to a non-key column within an
> + * INCLUDE index should not be considered because that's just payload to
> + * the index (they're not unlike table TIDs to the index AM).
>
> The last part of it (in the parenthesis) is difficult to grasp due to
> the double negation (not unlike). I think it's better to rephrase it.

Okay, will do.

> 2. After reading the patch, I also think, that fact, that
> index_unchanged_by_update() and index_unchanged_by_update_var_walker()
> return different bool states (i.e. when the latter returns true, the
> first one returns false) is a bit misleading. Although logic as it is
> looks fine, maybe index_unchanged_by_update_var_walker() can be renamed
> to avoid this confusion, to smth like index_expression_changed_walker() ?

I agree. I'll use the name index_expression_changed_walker() in the next
revision.

> 2. I'm not sure the bottomup_delete_items parameter is very helpful.
> In order to disable bottom-up deletion, DBA needs somehow to measure
> it's impact on a particular index. Currently I do not see how to
> achieve this. Not sure if this is overly important, though, as you
> have a similar parameter for the deduplication.

I'll take bottomup_delete_items out, since I think that you may be right
about that. It can be another "decision to recheck mid-beta" thing (on the
PostgreSQL 14 open items list), so we can delay making a final decision on
it until after it gets tested by the broadest possible group of people.

> 3. It feels like indexUnchanged is better to make indexChanged and negate
> its usage in the code.
> As !indexChanged reads more natural than !indexUnchanged, at least to me.

I don't want to do that, because then we're negating "indexChanged" as part
of our gating condition to the bottom-up deletion pass code. To me this feels
wrong, since the hint only exists to be used to trigger index tuple deletion.
This is unlikely to ever change.

> First, I run a 2-hour long case with the same setup as I used in my e-mail
> from 15 of November. I found no difference between patch and master
> whatsoever. Which makes me think, that current master is quite good at
> keeping better bloat control (not sure if this is an effect of
> 4228817449 commit or deduplication).

Commit 4228817449 can't have improved the performance of the master branch --
that commit was just about providing a *correct* latestRemovedXid value. It
cannot affect how many index tuples get deleted per pass, or anything like
that.

I'm not surprised that you didn't see many problems with the master branch on
your first attempt. It's normal for there to be non-linear effects with this
stuff, and these are very hard (maybe even impossible) to model. For example,
even with artificial test data you often see distinct "waves" of page splits,
which is a phenomenon pretty well described by the "Waves of misery after
index creation" paper [1]. It's likely that your original stress test case
(the 15 November one) would have shown significant bloat if you had just kept
it up for long enough.

One goal for this project is to make the performance characteristics more
predictable. The performance characteristics are unpredictable precisely
because they're pathological. B-Trees will always have non-linear performance
characteristics, but they can be made a lot less chaotic in practice.

To be honest, I was slightly surprised that your more recent stress test (the
queue thing) worked out so well for the patch.
But not too surprised, because I don't necessarily expect to understand how
the patch can help for any given workload -- this is dynamic behavior that's
part of a complex system. I first thought that maybe the presence of the
INSERTs and DELETEs would mess things up. But I was wrong -- everything still
worked well, probably because bottom-up index deletion "cooperated" with
deduplication. The INSERTs could not trigger a bottom-up deletion (which might
have been useful for these particular INSERTs), but that didn't actually
matter, because deduplication was triggered instead, which "held the line"
for long enough for bottom-up deletion to eventually get triggered (by
non-HOT UPDATEs).

As I've said before, I believe that the workload should "figure out the best
strategy for handling bloat on its own". A diversity of strategies is
available, and they can adapt to many different kinds of workloads
organically. That kind of approach is sometimes necessary with a complex
system, IMV.

[1] https://btw.informatik.uni-rostock.de/download/tagungsband/B2-2.pdf

--
Peter Geoghegan