Hi,

I think I may have found one of the problems PostgreSQL has on machines with many NUMA nodes. I am not yet sure what exactly happens on the NUMA bus, but there seems to be a tipping point at which spinlock contention wreaks havoc and the performance of the database collapses.

On a machine with 8 sockets and 64 cores (128 hardware threads with Hyper-Threading), pgbench -S peaks at around 85,000 TPS with 50-60 clients. Throughput then drops sharply, reaching around 20,000 TPS at 120 clients, and it never recovers from there.

The attached patch demonstrates that spinning less aggressively and delaying (much) more often improves performance on this type of machine. With the patch, the 8-socket machine in question scales to over 350,000 TPS.
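
To make the changed tuning concrete, here is a minimal sketch of the spins_per_delay adjustment with the stock and the patched constants side by side. The SpinTuning struct and adapt_spins_per_delay() are hypothetical names for illustration only, not code from s_lock.c. The step down after a delay is assumed to be 1, since that branch is not visible in the patch hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the constants the patch touches. */
typedef struct SpinTuning
{
	int			min_spins_per_delay;
	int			max_spins_per_delay;
	int			default_spins_per_delay;
	int			increment;		/* step up when no delay was needed */
	int			decrement;		/* step down after a delay; ASSUMED to be 1 */
} SpinTuning;

/* Values from the "-" lines of the patch (stock behavior). */
const SpinTuning STOCK_TUNING = {10, 1000, 100, 100, 1};

/* Values from the "+" lines of the patch. */
const SpinTuning PATCHED_TUNING = {4, 10, 10, 2, 1};

/*
 * Return the next spins_per_delay after one lock acquisition.
 * If the backend never had to sleep, allow more spinning next time;
 * otherwise spin a little less, staying within [min, max].
 */
int
adapt_spins_per_delay(const SpinTuning *t, int spins_per_delay,
					  bool had_to_delay)
{
	int			next;

	assert(t->min_spins_per_delay <= t->max_spins_per_delay);

	if (!had_to_delay)
	{
		next = spins_per_delay + t->increment;
		if (next > t->max_spins_per_delay)
			next = t->max_spins_per_delay;
	}
	else
	{
		next = spins_per_delay - t->decrement;
		if (next < t->min_spins_per_delay)
			next = t->min_spins_per_delay;
	}
	return next;
}
```

With the patched values a backend probes the lock at most 10 times before it sleeps, where the stock values allow up to 1000 probes, so under heavy contention far fewer test-and-set attempts should hit the lock's cache line.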

The patch is meant only to demonstrate this effect. It has a negative performance impact on smaller machines and at client counts below the number of cores, so the real solution will probably look quite different. But I thought it would be good to share this and start the discussion about reevaluating the spinlock code before PGCon.


Regards, Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index 0afcba1..edaa8fd 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -83,8 +83,8 @@ s_lock(volatile slock_t *lock, const char *file, int line)
 	 * the probability of unintended failure) than to fix the total time
 	 * spent.
 	 */
-#define MIN_SPINS_PER_DELAY 10
-#define MAX_SPINS_PER_DELAY 1000
+#define MIN_SPINS_PER_DELAY 4
+#define MAX_SPINS_PER_DELAY 10
 #define NUM_DELAYS			1000
 #define MIN_DELAY_USEC		1000L
 #define MAX_DELAY_USEC		1000000L
@@ -144,7 +144,7 @@ s_lock(volatile slock_t *lock, const char *file, int line)
 	{
 		/* we never had to delay */
 		if (spins_per_delay < MAX_SPINS_PER_DELAY)
-			spins_per_delay = Min(spins_per_delay + 100, MAX_SPINS_PER_DELAY);
+			spins_per_delay = Min(spins_per_delay + 2, MAX_SPINS_PER_DELAY);
 	}
 	else
 	{
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index c63cf54..42eb353 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -973,7 +973,7 @@ extern slock_t dummy_spinlock;
 extern int s_lock(volatile slock_t *lock, const char *file, int line);
 
 /* Support for dynamic adjustment of spins_per_delay */
-#define DEFAULT_SPINS_PER_DELAY  100
+#define DEFAULT_SPINS_PER_DELAY  10
 
 extern void set_spins_per_delay(int shared_spins_per_delay);
 extern int	update_spins_per_delay(int shared_spins_per_delay);
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
