Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-25 Thread Andres Freund
On 2014-04-24 23:28:14 +0200, Andres Freund wrote:
 On 2014-04-24 12:43:13 -0400, Tom Lane wrote:
  Andres Freund <and...@2ndquadrant.com> writes:
   On 2014-04-24 11:02:44 -0400, Tom Lane wrote:
   FWIW, I like the LWLockAssignBatch idea a lot better than the currently
   proposed patch.  LWLockAssign is a low-level function that has no business
   making risky assumptions about the context it's invoked in.
  
   I don't think LWLockAssignBatch() is that easy without introducing
   layering violations. It can't just return a pointer out of the main
   lwlock array that then can be ++ed clientside because MainLWLockArray's
   stride isn't sizeof(LWLock).
  
  Meh.  I knew this business of using pointers instead of indexes would
  have some downsides.
  
  We could return the array stride ... kinda ugly, but since there's
  probably only one consumer for this API, it's not *that* bad.  Could
  wrap the stride-increment in a macro, perhaps.
 
 I think I am just going to wait for 9.5 where I sure hope we can
 allocate the buffer lwlocks outside the main array...

For reference (and backup), here's my current patch for that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 24cd57b86a5ad4dc625daa61b62663bb796a5e38 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 28 Jan 2014 17:06:01 +0100
Subject: [PATCH] lwlock-inline

---
 src/backend/storage/buffer/buf_init.c | 26 +--
 src/backend/storage/buffer/bufmgr.c   | 48 +--
 src/backend/storage/lmgr/lwlock.c | 11 +---
 src/include/storage/buf_internals.h   | 10 +---
 src/include/storage/lwlock.h  | 10 
 5 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index e187242..27689d9 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -21,6 +21,7 @@
 BufferDesc *BufferDescriptors;
 char	   *BufferBlocks;
 int32	   *PrivateRefCount;
+LWLock		*BufferIOLocks;
 
 
 /*
@@ -62,6 +63,8 @@ int32	   *PrivateRefCount;
  *		backend.
  */
 
+static int			content_tranche_id, progress_tranche_id;
+static LWLockTranche content_tranche, progress_tranche;
 
 /*
  * Initialize shared buffer pool
@@ -79,10 +82,26 @@ InitBufferPool(void)
 		ShmemInitStruct("Buffer Descriptors",
 		NBuffers * sizeof(BufferDesc), &foundDescs);
 
+	BufferIOLocks = (LWLock *)
+		ShmemInitStruct("Buffer IO Locks",
+		NBuffers * sizeof(LWLock), &foundBufs);
+
 	BufferBlocks = (char *)
 		ShmemInitStruct("Buffer Blocks",
 		NBuffers * (Size) BLCKSZ, &foundBufs);
 
+	content_tranche_id = 1;
+	content_tranche.name = "Buffer Content Locks";
+	content_tranche.array_base = &BufferDescriptors->content_lock;
+	content_tranche.array_stride = sizeof(BufferDesc);
+	LWLockRegisterTranche(content_tranche_id, &content_tranche);
+
+	progress_tranche_id = 2;
+	progress_tranche.name = "Buffer IO Locks";
+	progress_tranche.array_base = BufferIOLocks;
+	progress_tranche.array_stride = sizeof(LWLock);
+	LWLockRegisterTranche(progress_tranche_id, &progress_tranche);
+
 	if (foundDescs || foundBufs)
 	{
 		/* both should be present or neither */
@@ -117,8 +136,8 @@ InitBufferPool(void)
 			 */
 			buf->freeNext = i + 1;
 
-			buf->io_in_progress_lock = LWLockAssign();
-			buf->content_lock = LWLockAssign();
+			LWLockInitialize(&buf->content_lock, content_tranche_id);
+			LWLockInitialize(&BufferIOLocks[i], progress_tranche_id);
 		}
 
 		/* Correct last entry of linked list */
@@ -168,6 +187,9 @@ BufferShmemSize(void)
 	/* size of buffer descriptors */
 	size = add_size(size, mul_size(NBuffers, sizeof(BufferDesc)));
 
+	/* size of io progress locks */
+	size = add_size(size, mul_size(NBuffers, sizeof(LWLock)));
+
 	/* size of data pages */
 	size = add_size(size, mul_size(NBuffers, BLCKSZ));
 
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4e46ddb..28357c9 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -651,7 +651,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			 * happens to be trying to split the page the first one got from
 			 * StrategyGetBuffer.)
 			 */
-			if (LWLockConditionalAcquire(buf->content_lock, LW_SHARED))
+			if (LWLockConditionalAcquire((LWLock *) &buf->content_lock, LW_SHARED))
 			{
 /*
  * If using a nondefault strategy, and writing the buffer
@@ -673,7 +673,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		StrategyRejectBuffer(strategy, buf))
 	{
 		/* Drop lock/pin and loop around for another buffer */
-		LWLockRelease(buf->content_lock);
+		LWLockRelease((LWLock *) &buf->content_lock);
 		UnpinBuffer(buf, true);
 		continue;
 	}
@@ -686,7 +686,7 @@ BufferAlloc(SMgrRelation smgr, char 

Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Heikki Linnakangas

On 04/17/2014 12:06 PM, Andres Freund wrote:

On 2014-04-16 19:33:52 -0400, Bruce Momjian wrote:

On Tue, Feb  4, 2014 at 12:58:49AM +0100, Andres Freund wrote:

On 2014-02-03 11:22:45 -0500, Tom Lane wrote:

Andres Freund <and...@2ndquadrant.com> writes:

On larger, multi-socket, machines, startup takes a fair bit of time. As
I was profiling anyway I looked into it and noticed that just about all
of it is spent in LWLockAssign() called by InitBufferPool(). Starting
with shared_buffers=48GB on the server Nate Boley provided, takes about
12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
Simply modifying LWLockAssign() to not take the spinlock when
!IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
LWLockAssign() prettier it seems enough of a speedup to be worthwhile
nonetheless.
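
[A quick back-of-the-envelope check of those figures, assuming the default
8 kB block size:

    48 GB / 8 kB                = 6,291,456 buffers
    x 2 LWLocks per buffer        (content + io_in_progress)
    => ~12.6 million LWLockAssign() calls at startup, each one
       acquiring and releasing the ShmemLock spinlock.]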


Hm.  This patch only works if the postmaster itself never assigns any
LWLocks except during startup.  That's *probably* all right, but it
seems a bit scary.  Is there any cheap way to make the logic actually
be what your comment claims, namely Interlocking is not necessary during
postmaster startup?  I guess we could invent a ShmemInitInProgress global
flag ...


So, here's a patch implementing things with that flag. I kept your name,
as it's more in line with ipci.c's naming, but it looks kinda odd
besides proc_exit_inprogress.


Uh, where are we on this?


I guess it's waiting for the next CF :(.


Now that we have LWLock tranches in 9.4, it might be cleanest to have 
the buffer manager allocate a separate tranche for the buffer locks. We 
could also save some memory if we got rid of the LWLock pointers in 
BufferDesc altogether, and just used the buffer id as an index into the 
LWLock array (we could do that without tranches too, but would have to 
assume that the lock ids returned by LWLockAssign() are a contiguous range).
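
[A minimal sketch of the buffer-id-as-index idea, in C; BufferContentLWLocks
is an assumed name for the new array, not an existing symbol:]

    /* hypothetical: one contiguous LWLock per buffer, in its own tranche */
    extern LWLock *BufferContentLWLocks;

    static inline LWLock *
    BufferContentLock(int buf_id)
    {
        /* the buffer id doubles as the lock index, so BufferDesc no
         * longer needs to store an LWLock pointer */
        return &BufferContentLWLocks[buf_id];
    }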


Another idea is to add an LWLockAssignBatch(int) function that assigns a 
range of locks in one call. That would be very simple, and I think it 
would be less likely to break things than a new global flag. I would be 
OK with sneaking that into 9.4 still.


- Heikki




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Andres Freund
On 2014-04-24 15:56:45 +0300, Heikki Linnakangas wrote:
 On 04/17/2014 12:06 PM, Andres Freund wrote:
 On 2014-04-16 19:33:52 -0400, Bruce Momjian wrote:
 On Tue, Feb  4, 2014 at 12:58:49AM +0100, Andres Freund wrote:
 On 2014-02-03 11:22:45 -0500, Tom Lane wrote:
 Andres Freund <and...@2ndquadrant.com> writes:
 On larger, multi-socket, machines, startup takes a fair bit of time. As
 I was profiling anyway I looked into it and noticed that just about all
 of it is spent in LWLockAssign() called by InitBufferPool(). Starting
 with shared_buffers=48GB on the server Nate Boley provided, takes about
 12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
 Simply modifying LWLockAssign() to not take the spinlock when
 !IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
 LWLockAssign() prettier it seems enough of a speedup to be worthwhile
 nonetheless.
 
 Hm.  This patch only works if the postmaster itself never assigns any
 LWLocks except during startup.  That's *probably* all right, but it
 seems a bit scary.  Is there any cheap way to make the logic actually
 be what your comment claims, namely "Interlocking is not necessary during
 postmaster startup"?  I guess we could invent a ShmemInitInProgress global
 flag ...
 
 So, here's a patch implementing things with that flag. I kept your name,
 as it's more in line with ipci.c's naming, but it looks kinda odd
 besides proc_exit_inprogress.
 
 Uh, where are we on this?
 
 I guess it's waiting for the next CF :(.
 
 Now that we have LWLock tranches in 9.4, it might be cleanest to have the
 buffer manager allocate a separate tranche for the buffer locks. We could
 also save some memory if we got rid of the LWLock pointers in BufferDesc
 altogether, and just used the buffer id as an index into the LWLock array
 (we could do that without tranches too, but would have to assume that the
 lock ids returned by LWLockAssign() are a contiguous range).

I tried that, and it's nontrivial from a performance POV because it
influences how a buffer descriptor fits into cacheline(s). I think this
needs significant experimentation.
My experimentation hinted that it'd be a good idea to put the content
lwlock inline, but not the io lock, since it's accessed much less
frequently. IIRC I could fit the remainder of the buffer descriptor into
one cacheline after putting the io locks into a separate array. I wonder
if we can't somehow get rid of the io locks entirely...
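
[An illustration of that layout experiment -- standalone C with stand-in
field sizes, assuming 64-byte cache lines; not the real BufferDesc:]

    #include <assert.h>
    #include <stdint.h>

    typedef struct SketchLWLock { uint32_t state, waiters; } SketchLWLock;

    typedef struct SketchBufferDesc
    {
        uint32_t     tag[5];            /* stand-in for BufferTag */
        uint16_t     flags;
        uint16_t     usage_count;
        uint32_t     refcount;
        int32_t      wait_backend_pid;
        int32_t      buf_id;
        int32_t      freeNext;
        SketchLWLock content_lock;      /* hot lock kept inline */
    } SketchBufferDesc;

    /* with the io lock out of line, the descriptor fits one cache line */
    static_assert(sizeof(SketchBufferDesc) <= 64, "one cache line");

    /* rarely-taken io-in-progress locks, indexed by buf_id */
    SketchLWLock *SketchBufferIOLocks;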

 Another idea is to add an LWLockAssignBatch(int) function that assigns a
 range of locks in one call. That would be very simple, and I think it would
 be less likely to break things than a new global flag. I would be OK with
 sneaking that into 9.4 still.

I don't really see the advantage tbh. Assuming we always can avoid the
spinlock initially seems simple enough - and I have significant doubts
that anything but buffer locks will need enough locks that it matters
for other users.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Tom Lane
Andres Freund <and...@2ndquadrant.com> writes:
 On 2014-04-24 15:56:45 +0300, Heikki Linnakangas wrote:
 Another idea is to add an LWLockAssignBatch(int) function that assigns a
 range of locks in one call. That would be very simple, and I think it would
 be less likely to break things than a new global flag. I would be OK with
 sneaking that into 9.4 still.

 I don't really see the advantage tbh. Assuming we always can avoid the
 spinlock initially seems simple enough - and I have significant doubts
 that anything but buffer locks will need enough locks that it matters
 for other users.

FWIW, I like the LWLockAssignBatch idea a lot better than the currently
proposed patch.  LWLockAssign is a low-level function that has no business
making risky assumptions about the context it's invoked in.

The other ideas are 9.5 material at this point, since they involve
research --- but I agree with Heikki that LWLockAssignBatch could be
snuck into 9.4 still.

regards, tom lane




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Andres Freund
On 2014-04-24 11:02:44 -0400, Tom Lane wrote:
 Andres Freund <and...@2ndquadrant.com> writes:
  On 2014-04-24 15:56:45 +0300, Heikki Linnakangas wrote:
  Another idea is to add an LWLockAssignBatch(int) function that assigns a
  range of locks in one call. That would be very simple, and I think it would
  be less likely to break things than a new global flag. I would be OK with
  sneaking that into 9.4 still.
 
  I don't really see the advantage tbh. Assuming we always can avoid the
  spinlock initially seems simple enough - and I have significant doubts
  that anything but buffer locks will need enough locks that it matters
  for other users.
 
 FWIW, I like the LWLockAssignBatch idea a lot better than the currently
 proposed patch.  LWLockAssign is a low-level function that has no business
 making risky assumptions about the context it's invoked in.

I don't think LWLockAssignBatch() is that easy without introducing
layering violations. It can't just return a pointer out of the main
lwlock array that then can be ++ed clientside because MainLWLockArray's
stride isn't sizeof(LWLock).
We could just add a LWLockAssignStartup(), that'd be pretty
trivial. Whoever uses it later gets to keep the pieces...
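
[To make the stride problem concrete -- a standalone C sketch; the union
mirrors the real LWLockPadded, but the pad size here is made up:]

    typedef struct LWLock { int placeholder; } LWLock;  /* stand-in */

    typedef union LWLockPadded
    {
        LWLock      lock;
        char        pad[32];        /* LWLOCK_PADDED_SIZE in the real code */
    } LWLockPadded;

    LWLockPadded *MainLWLockArray;

    void
    stride_example(void)
    {
        LWLock *lock = &MainLWLockArray[0].lock;

        lock++;     /* WRONG: advances by sizeof(LWLock) and lands inside
                     * element 0's padding, not at element 1's lock */
        lock = &MainLWLockArray[1].lock;    /* correct: full array stride */
    }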

I guess if it's not that, the whole thing is 9.5 material. Once those
locks are in a separate tranche the whole thing is moot anyway.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Heikki Linnakangas

On 04/24/2014 07:24 PM, Andres Freund wrote:

On 2014-04-24 11:02:44 -0400, Tom Lane wrote:

Andres Freund <and...@2ndquadrant.com> writes:

On 2014-04-24 15:56:45 +0300, Heikki Linnakangas wrote:

Another idea is to add an LWLockAssignBatch(int) function that assigns a
range of locks in one call. That would be very simple, and I think it would
be less likely to break things than a new global flag. I would be OK with
sneaking that into 9.4 still.



I don't really see the advantage tbh. Assuming we always can avoid the
spinlock initially seems simple enough - and I have significant doubts
that anything but buffer locks will need enough locks that it matters
for other users.


FWIW, I like the LWLockAssignBatch idea a lot better than the currently
proposed patch.  LWLockAssign is a low-level function that has no business
making risky assumptions about the context it's invoked in.


I don't think LWLockAssignBatch() is that easy without introducing
layering violations. It can't just return a pointer out of the main
lwlock array that then can be ++ed clientside because MainLWLockArray's
stride isn't sizeof(LWLock).


Well, it could copy the pointers to an array of pointers that the caller 
provides. Or palloc an array and return that. Allocating a large enough 
array to hold NBuffers locks might not be nice, but if you do it in 
batches of, say, 1k locks, that would make it fast enough. Makes the 
caller a bit more complicated, but still might be worth it.
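
[A sketch of that variant -- LWLockAssignBatch() is hypothetical; the
internals mirror LWLockAssign() as quoted later in this thread:]

    /* fill a caller-provided array; one spinlock cycle covers the batch */
    void
    LWLockAssignBatch(LWLock **locks, int nlocks)
    {
        volatile int *LWLockCounter;
        int           i;

        LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
        SpinLockAcquire(ShmemLock);
        if (LWLockCounter[0] + nlocks > LWLockCounter[1])
        {
            SpinLockRelease(ShmemLock);
            elog(ERROR, "no more LWLocks available");
        }
        for (i = 0; i < nlocks; i++)
            locks[i] = &MainLWLockArray[LWLockCounter[0]++].lock;
        SpinLockRelease(ShmemLock);
    }

[InitBufferPool() could then call this on a stack array of, say, 1024
pointers per iteration, as suggested above.]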


- Heikki




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Tom Lane
Andres Freund <and...@2ndquadrant.com> writes:
 On 2014-04-24 11:02:44 -0400, Tom Lane wrote:
 FWIW, I like the LWLockAssignBatch idea a lot better than the currently
 proposed patch.  LWLockAssign is a low-level function that has no business
 making risky assumptions about the context it's invoked in.

 I don't think LWLockAssignBatch() is that easy without introducing
 layering violations. It can't just return a pointer out of the main
 lwlock array that then can be ++ed clientside because MainLWLockArray's
 stride isn't sizeof(LWLock).

Meh.  I knew this business of using pointers instead of indexes would
have some downsides.

We could return the array stride ... kinda ugly, but since there's
probably only one consumer for this API, it's not *that* bad.  Could
wrap the stride-increment in a macro, perhaps.
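
[A sketch of the stride-returning flavor -- both the function signature and
the macro are hypothetical:]

    /* return the base of a freshly assigned range and its stride */
    extern LWLock *LWLockAssignBatch(int nlocks, Size *stride);

    #define LWLockBatchNext(lock, stride) \
        ((LWLock *) (((char *) (lock)) + (stride)))

    /* caller side, mirroring InitBufferPool()'s two locks per buffer */
    static void
    assign_buffer_locks(void)
    {
        Size        stride;
        LWLock     *lock = LWLockAssignBatch(2 * NBuffers, &stride);
        int         i;

        for (i = 0; i < NBuffers; i++)
        {
            BufferDesc *buf = &BufferDescriptors[i];

            buf->io_in_progress_lock = lock;
            lock = LWLockBatchNext(lock, stride);
            buf->content_lock = lock;
            lock = LWLockBatchNext(lock, stride);
        }
    }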

regards, tom lane




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-24 Thread Andres Freund
On 2014-04-24 12:43:13 -0400, Tom Lane wrote:
 Andres Freund <and...@2ndquadrant.com> writes:
  On 2014-04-24 11:02:44 -0400, Tom Lane wrote:
  FWIW, I like the LWLockAssignBatch idea a lot better than the currently
  proposed patch.  LWLockAssign is a low-level function that has no business
  making risky assumptions about the context it's invoked in.
 
  I don't think LWLockAssignBatch() is that easy without introducing
  layering violations. It can't just return a pointer out of the main
  lwlock array that then can be ++ed clientside because MainLWLockArray's
  stride isn't sizeof(LWLock).
 
 Meh.  I knew this business of using pointers instead of indexes would
 have some downsides.
 
 We could return the array stride ... kinda ugly, but since there's
 probably only one consumer for this API, it's not *that* bad.  Could
 wrap the stride-increment in a macro, perhaps.

I think I am just going to wait for 9.5 where I sure hope we can
allocate the buffer lwlocks outside the main array...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-17 Thread Andres Freund
On 2014-04-16 19:33:52 -0400, Bruce Momjian wrote:
 On Tue, Feb  4, 2014 at 12:58:49AM +0100, Andres Freund wrote:
  On 2014-02-03 11:22:45 -0500, Tom Lane wrote:
   Andres Freund <and...@2ndquadrant.com> writes:
On larger, multi-socket, machines, startup takes a fair bit of time. As
I was profiling anyway I looked into it and noticed that just about all
of it is spent in LWLockAssign() called by InitBufferPool(). Starting
with shared_buffers=48GB on the server Nate Boley provided, takes about
12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
Simply modifying LWLockAssign() to not take the spinlock when
!IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
LWLockAssign() prettier it seems enough of a speedup to be worthwhile
nonetheless.
   
   Hm.  This patch only works if the postmaster itself never assigns any
   LWLocks except during startup.  That's *probably* all right, but it
   seems a bit scary.  Is there any cheap way to make the logic actually
   be what your comment claims, namely "Interlocking is not necessary during
   postmaster startup"?  I guess we could invent a ShmemInitInProgress global
   flag ...
  
  So, here's a patch implementing things with that flag. I kept your name,
  as it's more in line with ipci.c's naming, but it looks kinda odd
  besides proc_exit_inprogress.
 
 Uh, where are we on this?

I guess it's waiting for the next CF :(.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-16 Thread Bruce Momjian
On Tue, Feb  4, 2014 at 12:58:49AM +0100, Andres Freund wrote:
 On 2014-02-03 11:22:45 -0500, Tom Lane wrote:
  Andres Freund <and...@2ndquadrant.com> writes:
   On larger, multi-socket, machines, startup takes a fair bit of time. As
   I was profiling anyway I looked into it and noticed that just about all
   of it is spent in LWLockAssign() called by InitBufferPool(). Starting
   with shared_buffers=48GB on the server Nate Boley provided, takes about
   12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
   Simply modifying LWLockAssign() to not take the spinlock when
   !IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
   LWLockAssign() prettier it seems enough of a speedup to be worthwhile
   nonetheless.
  
  Hm.  This patch only works if the postmaster itself never assigns any
  LWLocks except during startup.  That's *probably* all right, but it
  seems a bit scary.  Is there any cheap way to make the logic actually
  be what your comment claims, namely "Interlocking is not necessary during
  postmaster startup"?  I guess we could invent a ShmemInitInProgress global
  flag ...
 
 So, here's a patch implementing things with that flag. I kept your name,
 as it's more in line with ipci.c's naming, but it looks kinda odd
 besides proc_exit_inprogress.

Uh, where are we on this?

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-02-03 Thread Tom Lane
Andres Freund <and...@2ndquadrant.com> writes:
 On larger, multi-socket, machines, startup takes a fair bit of time. As
 I was profiling anyway I looked into it and noticed that just about all
 of it is spent in LWLockAssign() called by InitBufferPool(). Starting
 with shared_buffers=48GB on the server Nate Boley provided, takes about
 12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
 Simply modifying LWLockAssign() to not take the spinlock when
 !IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
 LWLockAssign() prettier it seems enough of a speedup to be worthwhile
 nonetheless.

Hm.  This patch only works if the postmaster itself never assigns any
LWLocks except during startup.  That's *probably* all right, but it
seems a bit scary.  Is there any cheap way to make the logic actually
be what your comment claims, namely "Interlocking is not necessary during
postmaster startup"?  I guess we could invent a ShmemInitInProgress global
flag ...

regards, tom lane




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-02-03 Thread Andres Freund
On 2014-02-03 11:22:45 -0500, Tom Lane wrote:
 Andres Freund <and...@2ndquadrant.com> writes:
  On larger, multi-socket, machines, startup takes a fair bit of time. As
  I was profiling anyway I looked into it and noticed that just about all
  of it is spent in LWLockAssign() called by InitBufferPool(). Starting
  with shared_buffers=48GB on the server Nate Boley provided, takes about
  12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
  Simply modifying LWLockAssign() to not take the spinlock when
  !IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
  LWLockAssign() prettier it seems enough of a speedup to be worthwhile
  nonetheless.
 
 Hm.  This patch only works if the postmaster itself never assigns any
 LWLocks except during startup.  That's *probably* all right, but it
 seems a bit scary.  Is there any cheap way to make the logic actually
 be what your comment claims, namely "Interlocking is not necessary during
 postmaster startup"?  I guess we could invent a ShmemInitInProgress global
 flag ...

I'd be fine with inventing such a flag; I couldn't find one and decided
that this alone didn't merit it, since it seems to be really unlikely
that we will start to allocate such resources after startup in the
postmaster. Unless we're talking about single user mode obviously, but
the spinlock isn't necessary there anyway.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-02-03 Thread Andres Freund
On 2014-02-03 11:22:45 -0500, Tom Lane wrote:
 Andres Freund <and...@2ndquadrant.com> writes:
  On larger, multi-socket, machines, startup takes a fair bit of time. As
  I was profiling anyway I looked into it and noticed that just about all
  of it is spent in LWLockAssign() called by InitBufferPool(). Starting
  with shared_buffers=48GB on the server Nate Boley provided, takes about
  12 seconds. Nearly all of it spent taking the ShmemLock spinlock.
  Simply modifying LWLockAssign() to not take the spinlock when
  !IsUnderPostmaster speeds it up to 2 seconds. While certainly not making
  LWLockAssign() prettier it seems enough of a speedup to be worthwhile
  nonetheless.
 
 Hm.  This patch only works if the postmaster itself never assigns any
 LWLocks except during startup.  That's *probably* all right, but it
 seems a bit scary.  Is there any cheap way to make the logic actually
 be what your comment claims, namely "Interlocking is not necessary during
 postmaster startup"?  I guess we could invent a ShmemInitInProgress global
 flag ...

So, here's a patch implementing things with that flag. I kept your name,
as it's more in line with ipci.c's naming, but it looks kinda odd
besides proc_exit_inprogress.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index c392d4f..ece9674 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -48,6 +48,11 @@ shmem_startup_hook_type shmem_startup_hook = NULL;
 static Size total_addin_request = 0;
 static bool addin_request_allowed = true;
 
+/*
+ * Signals whether shared memory is currently being initialized, while the
+ * postmaster hasn't forked any children yet.
+ */
+bool ShmemInitInProgress = false;
 
 /*
  * RequestAddinShmemSpace
@@ -97,6 +102,12 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
 		int			numSemas;
 
 		/*
+		 * We're now initializing shared memory, and we won't have forked
+		 * children yet.
+		 */
+		ShmemInitInProgress = true;
+
+		/*
 		 * Size of the Postgres shared-memory block is estimated via
 		 * moderately-accurate estimates for the big hogs, plus 100K for the
 		 * stuff that's too small to bother with estimating.
@@ -261,4 +272,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
 	 */
 	if (shmem_startup_hook)
 		shmem_startup_hook();
+
+	if(!IsUnderPostmaster)
+		ShmemInitInProgress = false;
 }
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 82ef440..0365f9b 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -355,9 +355,11 @@ CreateLWLocks(void)
  * LWLockAssign - assign a dynamically-allocated LWLock number
  *
  * We interlock this using the same spinlock that is used to protect
- * ShmemAlloc().  Interlocking is not really necessary during postmaster
- * startup, but it is needed if any user-defined code tries to allocate
- * LWLocks after startup.
+ * ShmemAlloc().  Interlocking is not necessary while we're initializing
+ * shared memory, but it is needed for any code allocating LWLocks after
+ * startup.  Since we allocate large amounts of LWLocks for the buffer pool
+ * during startup, we avoid taking the spinlock when not needed, as it has
+ * been shown to slow down startup considerably.
  */
 LWLock *
 LWLockAssign(void)
@@ -368,14 +370,17 @@ LWLockAssign(void)
 	volatile int *LWLockCounter;
 
 	LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
-	SpinLockAcquire(ShmemLock);
+	if (!ShmemInitInProgress)
+		SpinLockAcquire(ShmemLock);
 	if (LWLockCounter[0] >= LWLockCounter[1])
 	{
-		SpinLockRelease(ShmemLock);
+		if (!ShmemInitInProgress)
+			SpinLockRelease(ShmemLock);
 		elog(ERROR, "no more LWLocks available");
 	}
 	result = &MainLWLockArray[LWLockCounter[0]++].lock;
-	SpinLockRelease(ShmemLock);
+	if (!ShmemInitInProgress)
+		SpinLockRelease(ShmemLock);
 	return result;
 }
 
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index 37eca7a..0f1e00a 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -63,6 +63,7 @@ typedef void (*shmem_startup_hook_type) (void);
 
 /* ipc.c */
 extern bool proc_exit_inprogress;
+extern bool ShmemInitInProgress;
 
 extern void proc_exit(int code) __attribute__((noreturn));
 extern void shmem_exit(int code);
