On 2014-04-24 15:56:45 +0300, Heikki Linnakangas wrote:
> On 04/17/2014 12:06 PM, Andres Freund wrote:
> >On 2014-04-16 19:33:52 -0400, Bruce Momjian wrote:
> >>On Tue, Feb 4, 2014 at 12:58:49AM +0100, Andres Freund wrote:
> >>>On 2014-02-03 11:22:45 -0500, Tom Lane wrote:
> >>>>Andres Freund <and...@2ndquadrant.com> writes:
> >>>>>On larger, multi-socket, machines, startup takes a fair bit of time. As
> >>>>>I was profiling anyway I looked into it and noticed that just about all
> >>>>>of it is spent in LWLockAssign() called by InitBufferPool(). Starting
> >>>>>with shared_buffers=48GB on the server Nate Boley provided, takes about
> >>>>>12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
> >>>>>Simply modifying LWLockAssign() to not take the spinlock when
> >>>>>!IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
> >>>>>LWLockAssign() prettier it seems enough of a speedup to be worthwhile
> >>>>>nonetheless.
> >>>>
> >>>>Hm. This patch only works if the postmaster itself never assigns any
> >>>>LWLocks except during startup. That's *probably* all right, but it
> >>>>seems a bit scary. Is there any cheap way to make the logic actually
> >>>>be what your comment claims, namely "Interlocking is not necessary during
> >>>>postmaster startup"? I guess we could invent a ShmemInitInProgress global
> >>>>flag ...
> >>>
> >>>So, here's a flag implementing things with that flag. I kept your name,
> >>>as it's more in line with ipci.c's naming, but it looks kinda odd
> >>>besides proc_exit_inprogress.
> >>
> >>Uh, where are we on this?
> >
> >I guess it's waiting for the next CF :(.
>
> Now that we have LWLock tranches in 9.4, it might be cleanest to have the
> buffer manager allocate a separate tranche for the buffer locks.
> We could also save some memory if we got rid of the LWLock pointers in
> BufferDesc altogether, and just used the buffer id as an index into the
> LWLock array (we could do that without tranches too, but would have to
> assume that the lock ids returned by LWLockAssign() are a contiguous range).

I tried that, and it's nontrivial from a performance POV because it
influences how a buffer descriptor fits into cacheline(s). I think this
needs significant experimentation. My experimentation hinted that it'd be a
good idea to put the content lwlock inline, but not the io one, since it's
accessed much less frequently. IIRC I could fit the remainder of the buffer
descriptor into one cacheline after putting the io locks into a separate
array. I wonder if we can't somehow get rid of the io locks entirely...

> Another idea is to add an LWLockAssignBatch(int) function that assigns a
> range of locks in one call. That would be very simple, and I think it would
> be less likely to break things than a new global flag. I would be OK with
> sneaking that into 9.4 still.

I don't really see the advantage tbh. Assuming we always can avoid the
spinlock initially seems simple enough - and I have significant doubts that
anything but buffer locks will need enough locks that it matters for other
users.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
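
For illustration only, here is a minimal C sketch of the batch-assignment idea from the thread: reserving a contiguous range of lock ids under a single spinlock acquisition, and then deriving a buffer's lock id from its buffer id instead of storing a pointer. It is a toy model, not PostgreSQL code. Every name in it (toy_assign_batch, toy_buffer_lock_id, TOY_MAX_LOCKS, and so on) is hypothetical, and the C11 atomic_flag merely stands in for the ShmemLock spinlock.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_MAX_LOCKS 1024      /* hypothetical size of the lock pool */

/* Toy stand-ins for the shared lock counter and the spinlock guarding it. */
static atomic_flag toy_shmem_lock = ATOMIC_FLAG_INIT;
static int  toy_next_lock_id = 0;
static long toy_spin_acquisitions = 0;  /* how many times the lock was taken */

static void toy_spin_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&toy_shmem_lock,
                                             memory_order_acquire))
        ;                       /* busy-wait until the flag is released */
    toy_spin_acquisitions++;    /* safe: we hold the lock here */
}

static void toy_spin_release(void)
{
    atomic_flag_clear_explicit(&toy_shmem_lock, memory_order_release);
}

/*
 * Reserve n consecutive lock ids with one spinlock acquisition and
 * return the first id of the range.
 */
static int toy_assign_batch(int n)
{
    int first;

    assert(n > 0);
    toy_spin_acquire();
    assert(toy_next_lock_id + n <= TOY_MAX_LOCKS);
    first = toy_next_lock_id;
    toy_next_lock_id += n;
    toy_spin_release();
    return first;
}

/* Assigning one lock at a time costs one acquisition per lock. */
static int toy_assign_one(void)
{
    return toy_assign_batch(1);
}

/*
 * Because a batch is contiguous, a buffer's lock id can be computed
 * from the range start and the buffer id; no per-buffer pointer needed.
 */
static int toy_buffer_lock_id(int base, int buf_id)
{
    return base + buf_id;
}
```

In this model, assigning N locks one by one takes the spinlock N times, while a single batch call takes it once regardless of N. Whether that matters in practice outside the buffer locks is exactly the question raised in the reply above, and the sketch says nothing about the cacheline-layout concerns.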