I added some padding between those two, per attached patch, and got a big improvement. So we should definitely do that. I just added a char[128] field between them, which is enough to put them on different cache lines on the server I'm testing on. I don't think there is any platform-independent way to get the current cache line size, unfortunately. Since this doesn't affect correctness, I'm inclined to just add the 128-byte padding field there.
Attached is a graph generated by pgbench-tools. Full results are available here: http://hlinnaka.iki.fi/xloginsert-scaling/padding/. The test query used was:
insert into foo:client_id select generate_series(1, 100);That is, each client inserts a lot of rows into a different table. The table has no indexes. This is pretty much the worst-case scenario for stressing XLogInsert.
The "master-b03d196" is unpatched git master, "xlog-padding-fb741c0" is with the padding. The -16 plots are the same, but with xloginsert_slots=16 (the default is 8). The server has 16 cores, 32 with hyperthreading.
It's interesting that although the peak performance is better with 8 slots than with 16, it doesn't scale as gracefully with 16 slots. I believe that's because with more insertion slots, you have more backends fighting over the insertpos_lck. With fewer slots, the extra work is queued behind the insertion slots instead, and sleeping is better than spinning.
In either case, the performance with the padding is better than without. - Heikki
<<attachment: clients-sets-padding.png>>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 39c58d0..28e62ea 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -428,8 +428,14 @@ typedef struct XLogCtlInsert uint64 CurrBytePos; uint64 PrevBytePos; - /* insertion slots, see above for details */ - XLogInsertSlotPadded *insertSlots; + /* + * Make sure the above heavily-contended spinlock and byte positions are + * on their own cache line. In particular, the RedoRecPtr and full page + * write variables below should be on a different cache line. They are + * read on every WAL insertion, but updated rarely, and we don't want + * those reads to steal the cache line containing Curr/PrevBytePos. + */ + char pad[128]; /* * fullPageWrites is the master copy used by all backends to determine @@ -455,6 +461,9 @@ typedef struct XLogCtlInsert bool exclusiveBackup; int nonExclusiveBackups; XLogRecPtr lastBackupStart; + + /* insertion slots, see XLogInsertSlot struct above for details */ + XLogInsertSlotPadded *insertSlots; } XLogCtlInsert; /*
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers