On 2019-03-12 14:15:06 -0700, Peter Geoghegan wrote:
> On Tue, Mar 12, 2019 at 12:40 PM Andres Freund <and...@anarazel.de> wrote:
> > Have you looked at an offwake or lwlock wait graph (bcc tools) or
> > something in that vein? Would be interesting to see what is waiting for
> > what most often...
>
> Not recently, though I did use your BCC script for this very purpose
> quite a few months ago. I don't remember it helping that much at the
> time, but then that was with a version of the patch that lacked a
> couple of important optimizations that we have now. We're now very
> careful to not descend to the left with an equal pivot tuple. We
> descend right instead when that's definitely the only place we'll find
> matches (a high key doesn't count as a match in almost all cases!).
> Edge-cases where we unnecessarily move left then right, or
> unnecessarily move right a second time once on the leaf level have
> been fixed. I fixed the regression I was worried about at the time,
> without getting much benefit from the BCC script, and moved on.
>
> This kind of minutiae is more important than it sounds. I have used
> EXPLAIN(ANALYZE, BUFFERS) instrumentation to make sure that I
> understand where every single block access comes from with these
> edge-cases, paying close attention to the structure of the index, and
> how the key space is broken up (the values of pivot tuples in internal
> pages). It is one thing to make the index smaller, and another thing
> to take full advantage of that -- I have both. This is one of the
> reasons why I believe that this minor regression cannot be avoided,
> short of simply allowing the index to get bloated: I'm simply not
> doing things that differently outside of the page split code, and what
> I am doing differently is clearly superior. Both in general, and for
> the NEW_ORDER transaction in particular.
>
> I'll make that another TODO item -- this regression will be revisited
> using BCC instrumentation. I am currently performing a multi-day
> benchmark on a very large TPC-C/BenchmarkSQL database, and it will
> have to wait for that. (I would like to use the same environment as
> before.)

I'm basically just curious which buffers have most of the additional
contention. Is it the lower number of leaf pages, the inner pages, or
(somewhat inexplicably) the meta page, or ...? I was thinking that the
callstacks that e.g. my lwlock tool reports should be able to explain
which callstack most of the waits are occurring on.

(I should work a bit on that script; I locally had a version that showed
both the waiters and the waking-up callstack, but I can't find it anymore.)

Greetings,

Andres Freund