Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

Andres Freund Tue, 12 Mar 2019 14:22:43 -0700

On 2019-03-12 14:15:06 -0700, Peter Geoghegan wrote:
> On Tue, Mar 12, 2019 at 12:40 PM Andres Freund <[email protected]> wrote:
> > Have you looked at an offwake or lwlock wait graph (bcc tools) or
> > something in that vein? Would be interesting to see what is waiting for
> > what most often...
> 
> Not recently, though I did use your BCC script for this very purpose
> quite a few months ago. I don't remember it helping that much at the
> time, but then that was with a version of the patch that lacked a
> couple of important optimizations that we have now. We're now very
> careful to not descend to the left with an equal pivot tuple. We
> descend right instead when that's definitely the only place we'll find
> matches (a high key doesn't count as a match in almost all cases!).
> Edge-cases where we unnecessarily move left then right, or
> unnecessarily move right a second time once on the leaf level have
> been fixed. I fixed the regression I was worried about at the time,
> without getting much benefit from the BCC script, and moved on.
> 
> This kind of minutiae is more important than it sounds. I have used
> EXPLAIN(ANALYZE, BUFFERS) instrumentation to make sure that I
> understand where every single block access comes from with these
> edge-cases, paying close attention to the structure of the index, and
> how the key space is broken up (the values of pivot tuples in internal
> pages). It is one thing to make the index smaller, and another thing
> to take full advantage of that -- I have both. This is one of the
> reasons why I believe that this minor regression cannot be avoided,
> short of simply allowing the index to get bloated: I'm simply not
> doing things that differently outside of the page split code, and what
> I am doing differently is clearly superior. Both in general, and for
> the NEW_ORDER transaction in particular.
> 
> I'll make that another TODO item -- this regression will be revisited
> using BCC instrumentation. I am currently performing a multi-day
> benchmark on a very large TPC-C/BenchmarkSQL database, and it will
> have to wait for that. (I would like to use the same environment as
> before.)


I'm basically just curious which buffers have most of the additional
contention. Is it the lower number of leaf pages, the inner pages, or
(somewhat inexplicably) the meta page, or ...?  I was thinking that the
callstacks that e.g. my lwlock tool gives should be able to explain
where most of the waits are occurring.

(I should work a bit on that script. I locally had a version that showed
both the waiters and the waking-up callstack, but I can't find it anymore.)

Greetings,

Andres Freund
