On Mon, Jan 4, 2021 at 4:08 AM Heikki Linnakangas <hlinn...@iki.fi> wrote: > You said above that heap_tid_shellsort() is very performance critical, > and that's why you use the two arrays approach. If it's so performance > critical that swapping 8 bytes vs 12 byte array elements makes a > difference, I would guess that comparing TID vs just the block numbers > would also make a difference. > > If you really want to shave cycles, you could also store BlockNumber and > OffsetNumber in the TM_IndexDelete array, instead of ItemPointerData. > What's the difference, you might ask? ItemPointerData stores the block > number as two 16 bit integers that need to be reassembled into a 32 bit > integer in the ItemPointerGetBlockNumber() macro. > > My argument here is two-pronged: If this is performance critical, you > could do these additional optimizations. If it's not, then you don't > need the two-arrays trick; one array would be simpler.
That's reasonable. The reason why I haven't been more decisive here is because the question of whether or not it matters is actually very complicated, for reasons that have little to do with sorting. The more effective the mechanism is each time (the more bytes it allows nbtree to free from each leaf page), the less often it is called, and the less performance critical the overhead per operation is. On the other hand there are a couple of other questions about what we do in heapam.c that aren't quite resolved yet (e.g. exact "favorable blocks"/prefetching behavior in bottom-up case), that probably affect how important the heap_tid_shellsort() microoptimisations are. I think that it makes sense to make a final decision on this at the last minute, once everything else is resolved, since the implicit dependencies make any other approach much too complicated. I agree that this kind of microoptimization is best avoided, but I really don't want to have to worry about regressions in workloads that I now understand fully. I think that the sort became less important on perhaps 2 or 3 occasions during development, even though that was never really the goal that I had in mind in each case. I'll do my best to avoid it. > Hmm, maybe: "... is the best information that we have to go on, it is > just a guess based on simple heuristics about duplicates in indexes". I'll add something like that to the next revision. > I see. Would be good to explain that pattern with multiple indexes in > the comment. Will do -- it is the single best example of how heap block locality can matter with a real workload, so it makes sense to go with it in explanatory comments. -- Peter Geoghegan