Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
Pushed a version (hopefully) fixing Tom's complaints. On 2015-01-28 13:52:30 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2015-01-28 13:38:51 -0500, Tom Lane wrote: #define BUFFERDESC_PADDED_SIZE (SIZEOF_VOID_P == 8 ? 64 : 32) Hm, did you intentionally put a 32in there or was that just the logical continuation of 64? Because there's no way it'll ever fit into 32 bytes in the near future. That's why I had put the sizeof(BufferDesc) there. We could just make it 1 as well, to indicate that we don't want any padding... Yeah, 1 would be fine too. Maybe better to call it BUFFERDESC_MIN_SIZE, because as this stands it's enforcing a min size not exact size. It's _PAD_TO_SIZE now. not going to whinge about that aspect of it; the main point here is to put in the union and fix the ensuing notational fallout. We can worry about exactly what size to pad to as a separate discussion.) Yea. The details can be changed in a concise way now. I hope ;) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-26 21:13:31 -0500, Robert Haas wrote: On Mon, Jan 26, 2015 at 9:08 PM, Robert Haas robertmh...@gmail.com wrote: Contrary opinions? Robert? I'm totally OK with further aligning just that one allocation. Of course, now that I think about it, aligning it probably works mostly because the size is almost exactly one cache line. Not just almost - it's precisely 64 byte on x86-64 linux. And I think it'll also be that on other 64bit platforms. The only architecture dependant thing in there is the spinlock and that already can be between 1 and 4 bytes without changing the size of the struct. If it were any bigger or smaller, aligning it more wouldn't help. Yea. So maybe we should also do something like what LWLocks do, and make a union between the actual structure and an appropriate array of padding bytes - say either 64 or 128 of them. Hm. That's a bit bigger patch. I'm inclined to just let it slide for the moment. I still have plans to downsize some of sbufdesc's content (move the io lock out) and move the content lwlock inline. Then we're not going to have much choice but do this... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Wed, Jan 28, 2015 at 10:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2015-01-26 21:13:31 -0500, Robert Haas wrote: So maybe we should also do something like what LWLocks do, and make a union between the actual structure and an appropriate array of padding bytes - say either 64 or 128 of them. Hm. That's a bit bigger patch. I'm inclined to just let it slide for the moment. I still have plans to downsize some of sbufdesc's content (move the io lock out) and move the content lwlock inline. Then we're not going to have much choice but do this... Even if you didn't have plans like that, it would be entire folly to imagine that buffer headers will be exactly 64 bytes without some forcing function for that. Accordingly, I vote against applying any patch that pretends to improve their alignment unless it also does something to ensure that the size is a power of 2. Any notational changes that are forced by that would be much better done in a patch that does only that than in a patch that also makes functional changes to the header contents. I entirely agree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-28 10:35:28 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2015-01-26 21:13:31 -0500, Robert Haas wrote: So maybe we should also do something like what LWLocks do, and make a union between the actual structure and an appropriate array of padding bytes - say either 64 or 128 of them. Hm. That's a bit bigger patch. I'm inclined to just let it slide for the moment. I still have plans to downsize some of sbufdesc's content (move the io lock out) and move the content lwlock inline. Then we're not going to have much choice but do this... Even if you didn't have plans like that, it would be entire folly to imagine that buffer headers will be exactly 64 bytes without some forcing function for that. Meh. The 128 byte additionally used by the alignment don't hurt in any case. But forcing all buffer descriptors to 64bit on a 32bit platform isn't guaranteed to be performance neutral. So, no I don't think it's a folly to do so. I'd rather make actual progress that improves the absurd situation today (a factor of 2-3 by changing max_connections by one...) than argue whether the impact on 32bit platforms is acceptable before doing so. If we additionally decide to pad the struct, fine. But I don't see why this has to be done at the same time. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
Andres Freund and...@2ndquadrant.com writes: On 2015-01-26 21:13:31 -0500, Robert Haas wrote: So maybe we should also do something like what LWLocks do, and make a union between the actual structure and an appropriate array of padding bytes - say either 64 or 128 of them. Hm. That's a bit bigger patch. I'm inclined to just let it slide for the moment. I still have plans to downsize some of sbufdesc's content (move the io lock out) and move the content lwlock inline. Then we're not going to have much choice but do this... Even if you didn't have plans like that, it would be entire folly to imagine that buffer headers will be exactly 64 bytes without some forcing function for that. Accordingly, I vote against applying any patch that pretends to improve their alignment unless it also does something to ensure that the size is a power of 2. Any notational changes that are forced by that would be much better done in a patch that does only that than in a patch that also makes functional changes to the header contents. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-28 13:38:51 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I personally still think that a comment above sbufdesc's definition would be sufficient for now. But whatever. I'll enforce 64byte padding on 64bit platforms, and do nothing on 32bit platforms. Patch doing that attached. Surely the sizeof() in BufferShmemSize needs to be sizeof(BufferDescPadded) now? Good catch. It'd surely fail nastily later, once somebody actually made the size change away from 64 bytes. Also, the macro is unnecessarily unsafe for use in #if tests; why not something like Out of curiosity: What's the danger you're seing? I'm perfectly fine with using SIZEOF_VOID_P, I actually looked for a macro like it and didn't find anything. #define BUFFERDESC_PADDED_SIZE(SIZEOF_VOID_P == 8 ? 64 : 32) Hm, did you intentionally put a 32in there or was that just the logical continuation of 64? Because there's no way it'll ever fit into 32 bytes in the near future. That's why I had put the sizeof(BufferDesc) there. We could just make it 1 as well, to indicate that we don't want any padding... Otherwise it looks reasonably sane to me, just in a quick once-over; I didn't test. I tested it on both 32/64 bit linux and it indeed fixes the issue of changing performance due to max_connections+-1. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-28 17:08:46 +0100, Andres Freund wrote: I just have no idea whether it'd be beneficial to use more space on 32bit to pad the individual entries. Since this mostly is beneficial on multi-socket, highly concurrent workloads, I doubt it really matter. I personally still think that a comment above sbufdesc's definition would be sufficient for now. But whatever. I'll enforce 64byte padding on 64bit platforms, and do nothing on 32bit platforms. Patch doing that attached. I've for now dealt with the 32bit issue by: #define BUFFERDESC_PADDED_SIZE (sizeof(void*) == 8 ? 64 : sizeof(BufferDesc)) and * XXX: As this is primarily matters in highly concurrent workloads which * probably all are 64bit these days, and the space wastage would be a bit * more noticeable on 32bit systems, we don't force the stride to be cache * line sized. If somebody does actual performance testing, we can reevaluate. I've replaced direct accesses to the (Local)BufferDescriptors array by a wrapper macros to hide the union indirect and allow potential future changes to be made more easily. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From caeb98e4ab48748556f4983fdaa52098a190aa57 Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Wed, 28 Jan 2015 18:51:50 +0100 Subject: [PATCH 1/3] Fix #ifdefed'ed out code to compile again. --- src/backend/storage/buffer/bufmgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 7eb2d22..f92d0ef 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -2764,7 +2764,7 @@ PrintPinnedBufs(void) [%02d] (freeNext=%d, rel=%s, blockNum=%u, flags=0x%x, refcount=%u %d), i, buf-freeNext, - relpath(buf-tag.rnode, buf-tag.forkNum), + relpathperm(buf-tag.rnode, buf-tag.forkNum), buf-tag.blockNum, buf-flags, buf-refcount, GetPrivateRefCount(i + 1)); } -- 2.0.0.rc2.4.g1dc51c6.dirty From 2bb5f85d8c46cea30b3d1d80b781574c2f2a3903 Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Wed, 28 Jan 2015 13:55:36 +0100 Subject: [PATCH 2/3] Align buffer descriptors to cache line boundaries. Benchmarks has shown that aligning the buffer descriptor array to cache lines is important for scalability, especially on bigger, multi-socket, machines. Currently the array sometimes already happens to be aligned by happenstance, depending how large previous shared memory allocations were. That can lead to wildly varying performance results, after minor configuration changes. In addition to aligning the start of array array, also force the size of individual descriptors to be of common cache line size (64 bytes). That happens to already be the case on 64bit platforms, but this way we can change the definition more easily. As the alignment primarily matters in highly concurrent workloads which probably all are 64bit these days, and the space wastage of element alignment would be a bit more noticeable on 32bit systems, we don't force the stride to be cacheline sized for now. If somebody does actual performance testing, we can reevaluate that decision by changing the definition of BUFFERDESC_PADDED_SIZE. Discussion: 20140202151319.gd32...@awork2.anarazel.de Per discussion with Bruce Momjan, Robert Haas, Tom Lane and Peter Geoghegan. --- contrib/pg_buffercache/pg_buffercache_pages.c | 6 ++- src/backend/storage/buffer/buf_init.c | 19 src/backend/storage/buffer/bufmgr.c | 66 ++- src/backend/storage/buffer/freelist.c | 6 +-- src/backend/storage/buffer/localbuf.c | 12 ++--- src/include/c.h | 1 + src/include/storage/buf_internals.h | 34 +- 7 files changed, 91 insertions(+), 53 deletions(-) diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c index d00f985..98016fc 100644 --- a/contrib/pg_buffercache/pg_buffercache_pages.c +++ b/contrib/pg_buffercache/pg_buffercache_pages.c @@ -73,7 +73,6 @@ pg_buffercache_pages(PG_FUNCTION_ARGS) if (SRF_IS_FIRSTCALL()) { int i; - volatile BufferDesc *bufHdr; funcctx = SRF_FIRSTCALL_INIT(); @@ -146,8 +145,11 @@ pg_buffercache_pages(PG_FUNCTION_ARGS) * Scan though all the buffers, saving the relevant fields in the * fctx-record structure. */ - for (i = 0, bufHdr = BufferDescriptors; i NBuffers; i++, bufHdr++) + for (i = 0; i NBuffers; i++) { + volatile BufferDesc *bufHdr; + + bufHdr = GetBufferDescriptor(i); /* Lock each buffer header before inspecting. */ LockBufHdr(bufHdr); diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c index 4434bc3..c013e10 100644 --- a/src/backend/storage/buffer/buf_init.c +++
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
Andres Freund and...@2ndquadrant.com writes: I personally still think that a comment above sbufdesc's definition would be sufficient for now. But whatever. I'll enforce 64byte padding on 64bit platforms, and do nothing on 32bit platforms. Patch doing that attached. Surely the sizeof() in BufferShmemSize needs to be sizeof(BufferDescPadded) now? Also, the macro is unnecessarily unsafe for use in #if tests; why not something like #define BUFFERDESC_PADDED_SIZE (SIZEOF_VOID_P == 8 ? 64 : 32) Otherwise it looks reasonably sane to me, just in a quick once-over; I didn't test. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
Andres Freund and...@2ndquadrant.com writes: On 2015-01-28 13:38:51 -0500, Tom Lane wrote: #define BUFFERDESC_PADDED_SIZE (SIZEOF_VOID_P == 8 ? 64 : 32) Hm, did you intentionally put a 32in there or was that just the logical continuation of 64? Because there's no way it'll ever fit into 32 bytes in the near future. That's why I had put the sizeof(BufferDesc) there. We could just make it 1 as well, to indicate that we don't want any padding... Yeah, 1 would be fine too. Maybe better to call it BUFFERDESC_MIN_SIZE, because as this stands it's enforcing a min size not exact size. (I'm not going to whinge about that aspect of it; the main point here is to put in the union and fix the ensuing notational fallout. We can worry about exactly what size to pad to as a separate discussion.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Mon, Jan 26, 2015 at 8:04 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-01-26 19:58:25 -0500, Bruce Momjian wrote: On Tue, Jan 27, 2015 at 01:43:41AM +0100, Andres Freund wrote: master + 32align.patch: -c max_connections=400 tps = 183791.872359 (high run vs. run variability) -c max_connections=401 tps = 185494.98244 (high run vs. run variability) master + 64align.patch: -c max_connections=400 tps = 489257.195570 -c max_connections=401 tps = 490496.520632 Pretty much as expected, rigth? Yes, I am convinced. Let's work on a patch now. Since I can reproduce some minor (1-3%) performance *regressions* at low client counts when aligning every shmem allocation, I'm inclined to just add special case code to BufferShmemSize()/InitBufferPool() to align descriptors to PG_CACHE_LINE_SIZE. That's really unlikely to regress anythign as it basically can't be a bad idea to align buffer descriptors. Contrary opinions? Robert? I'm totally OK with further aligning just that one allocation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Mon, Jan 26, 2015 at 9:08 PM, Robert Haas robertmh...@gmail.com wrote: Contrary opinions? Robert? I'm totally OK with further aligning just that one allocation. Of course, now that I think about it, aligning it probably works mostly because the size is almost exactly one cache line. If it were any bigger or smaller, aligning it more wouldn't help. So maybe we should also do something like what LWLocks do, and make a union between the actual structure and an appropriate array of padding bytes - say either 64 or 128 of them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-26 19:58:25 -0500, Bruce Momjian wrote: On Tue, Jan 27, 2015 at 01:43:41AM +0100, Andres Freund wrote: master + 32align.patch: -c max_connections=400 tps = 183791.872359 (high run vs. run variability) -c max_connections=401 tps = 185494.98244 (high run vs. run variability) master + 64align.patch: -c max_connections=400 tps = 489257.195570 -c max_connections=401 tps = 490496.520632 Pretty much as expected, rigth? Yes, I am convinced. Let's work on a patch now. Since I can reproduce some minor (1-3%) performance *regressions* at low client counts when aligning every shmem allocation, I'm inclined to just add special case code to BufferShmemSize()/InitBufferPool() to align descriptors to PG_CACHE_LINE_SIZE. That's really unlikely to regress anythign as it basically can't be a bad idea to align buffer descriptors. Contrary opinions? Robert? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-05 21:43:04 -0500, Bruce Momjian wrote: On Fri, Jan 2, 2015 at 06:25:52PM +0100, Andres Freund wrote: I can't run tests right now... What exactly do you want to see with these tests? that's essentially what I've already benchmarked + some fprintfs? I want to test two patches that both allocate an extra 64 bytes but shift the alignment of just the buffer descriptor memory allocation. pgbench scale 300, with postgres configured: -c shared_buffers=48GB -c log_line_prefix=[%m %p] -c log_min_messages=debug1 -p 5440 -c checkpoint_segments=600 -c fsync=off -c synchronous_commit=off -c maintenance_work_mem=4GB -c huge_pages=off -c log_autovacuum_min_duration='10s' test: pgbench -n -M prepared -c 96 -j 96 -T 30 -S master as of 4b2a254793be50e31d43d4bfd813da8d141494b8: -c max_connections=400 tps = 193748.454044 (high run vs. run variability) -c max_connections=401 tps = 484238.551349 master + 32align.patch: -c max_connections=400 tps = 183791.872359 (high run vs. run variability) -c max_connections=401 tps = 185494.98244 (high run vs. run variability) master + 64align.patch: -c max_connections=400 tps = 489257.195570 -c max_connections=401 tps = 490496.520632 Pretty much as expected, rigth? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Tue, Jan 27, 2015 at 01:43:41AM +0100, Andres Freund wrote: master + 32align.patch: -c max_connections=400 tps = 183791.872359 (high run vs. run variability) -c max_connections=401 tps = 185494.98244 (high run vs. run variability) master + 64align.patch: -c max_connections=400 tps = 489257.195570 -c max_connections=401 tps = 490496.520632 Pretty much as expected, rigth? Yes, I am convinced. Let's work on a patch now. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Jan 1, 2015 at 3:04 PM, Andres Freund and...@2ndquadrant.com wrote: That's true, but if you don't align the beginnings of the allocations, then it's a lot more complicated for the code to properly align stuff within the allocation. It's got to insert a variable amount of padding based on the alignment it happens to get. Hm? Allocate +PG_CACHELINE_SIZE and do var = CACHELINEALIGN(var). Meh. I guess that will work, but I see little advantage in it. Cache-line aligning the allocations is simple and, not of no value, of long precedent. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Fri, Jan 2, 2015 at 06:25:52PM +0100, Andres Freund wrote: I can't run tests right now... What exactly do you want to see with these tests? that's essentially what I've already benchmarked + some fprintfs? I want to test two patches that both allocate an extra 64 bytes but shift the alignment of just the buffer descriptor memory allocation. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2014-12-29 16:59:05 -0500, Bruce Momjian wrote: diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c new file mode 100644 index ff6c713..c4dce5b *** a/src/backend/storage/buffer/buf_init.c --- b/src/backend/storage/buffer/buf_init.c *** InitBufferPool(void) *** 67,72 --- 67,73 boolfoundBufs, foundDescs; + fprintf(stderr, Buffer Descriptors size = %ld\n, sizeof(BufferDesc)); BufferDescriptors = (BufferDesc *) ShmemInitStruct(Buffer Descriptors, NBuffers * sizeof(BufferDesc), foundDescs); diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c new file mode 100644 index 2ea2216..669c07f *** a/src/backend/storage/ipc/shmem.c --- b/src/backend/storage/ipc/shmem.c *** ShmemInitStruct(const char *name, Size s *** 327,332 --- 327,335 ShmemIndexEnt *result; void *structPtr; + if (strcmp(name, Buffer Descriptors) == 0) + size = BUFFERALIGN(size) + 64; + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE); if (!ShmemIndex) *** ShmemInitStruct(const char *name, Size s *** 413,418 --- 416,432 \%s\ (%zu bytes requested), name, size))); } + if (strcmp(name, Buffer Descriptors) == 0) + { + /* align on 32 */ + if ((int64)structPtr % 32 != 0) + structPtr = (void *)((int64)structPtr + 32 - (int64)structPtr % 32); + /* align on 32 but not 64 */ + if ((int64)structPtr % 64 == 0) + structPtr = (void *)((int64)structPtr + 32); + } + fprintf(stderr, shared memory alignment of %s: %ld-byte\n, name, + (int64)structPtr % 64 == 0 ? 64 : (int64)structPtr % 64); result-size = size; result-location = structPtr; } diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c new file mode 100644 index ff6c713..c4dce5b *** a/src/backend/storage/buffer/buf_init.c --- b/src/backend/storage/buffer/buf_init.c *** InitBufferPool(void) *** 67,72 --- 67,73 boolfoundBufs, foundDescs; + fprintf(stderr, Buffer Descriptors size = %ld\n, sizeof(BufferDesc)); BufferDescriptors = (BufferDesc *) ShmemInitStruct(Buffer Descriptors, NBuffers * sizeof(BufferDesc), foundDescs); diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c new file mode 100644 index 2ea2216..50f836e *** a/src/backend/storage/ipc/shmem.c --- b/src/backend/storage/ipc/shmem.c *** ShmemInitStruct(const char *name, Size s *** 327,332 --- 327,335 ShmemIndexEnt *result; void *structPtr; + if (strcmp(name, Buffer Descriptors) == 0) + size = BUFFERALIGN(size) + 64; + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE); if (!ShmemIndex) *** ShmemInitStruct(const char *name, Size s *** 413,418 --- 416,429 \%s\ (%zu bytes requested), name, size))); } + if (strcmp(name, Buffer Descriptors) == 0) + { + /* align on 64 */ + if ((int64)structPtr % 64 != 0) + structPtr = (void *)((int64)structPtr + 64 - (int64)structPtr % 64); + } + fprintf(stderr, shared memory alignment of %s: %ld-byte\n, name, + (int64)structPtr % 64 == 0 ? 64 : (int64)structPtr % 64); result-size = size; result-location = structPtr; } I can't run tests right now... What exactly do you want to see with these tests? that's essentially what I've already benchmarked + some fprintfs? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2015-01-01 11:55:03 -0500, Robert Haas wrote: On Mon, Dec 29, 2014 at 6:48 PM, Andres Freund and...@2ndquadrant.com wrote: Andres reported the above 2x pgbench difference in February, but no action was taken as everyone felt there needed to be more performance testing, but it never happened: FWIW, I have no idea what exactly should be tested there. Right now I think what we should do is to remove the BUFFERALIGN from shmem.c and instead add explicit alignment code in a couple callers (BufferDescriptors/Blocks, proc.c stuff). That seems like a strange approach. I think it's pretty sensible to try to ensure that allocated blocks of shared memory have decent alignment, and we don't have enough of them for aligning on 64-byte boundaries (or even 128-byte boundaries, perhaps) to eat up any meaningful amount of memory. The BUFFERALIGN() stuff, like much else about the way we manage shared memory, has also made its way into the dynamic-shared-memory code. So if we do adjust the alignment that we guarantee for the main shared memory segment, we should perhaps adjust DSM to match. But I guess I don't understand why you'd want to do it that way. The problem is that just aligning the main allocation to some boundary doesn't mean the hot part of the allocation is properly aligned. shmem.c in fact can't really do much about that - so fully moving the responsibility seems more likely to ensure that future code thinks about alignment. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Mon, Dec 29, 2014 at 6:48 PM, Andres Freund and...@2ndquadrant.com wrote: Andres reported the above 2x pgbench difference in February, but no action was taken as everyone felt there needed to be more performance testing, but it never happened: FWIW, I have no idea what exactly should be tested there. Right now I think what we should do is to remove the BUFFERALIGN from shmem.c and instead add explicit alignment code in a couple callers (BufferDescriptors/Blocks, proc.c stuff). That seems like a strange approach. I think it's pretty sensible to try to ensure that allocated blocks of shared memory have decent alignment, and we don't have enough of them for aligning on 64-byte boundaries (or even 128-byte boundaries, perhaps) to eat up any meaningful amount of memory. The BUFFERALIGN() stuff, like much else about the way we manage shared memory, has also made its way into the dynamic-shared-memory code. So if we do adjust the alignment that we guarantee for the main shared memory segment, we should perhaps adjust DSM to match. But I guess I don't understand why you'd want to do it that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Jan 1, 2015 at 05:59:25PM +0100, Andres Freund wrote: That seems like a strange approach. I think it's pretty sensible to try to ensure that allocated blocks of shared memory have decent alignment, and we don't have enough of them for aligning on 64-byte boundaries (or even 128-byte boundaries, perhaps) to eat up any meaningful amount of memory. The BUFFERALIGN() stuff, like much else about the way we manage shared memory, has also made its way into the dynamic-shared-memory code. So if we do adjust the alignment that we guarantee for the main shared memory segment, we should perhaps adjust DSM to match. But I guess I don't understand why you'd want to do it that way. The problem is that just aligning the main allocation to some boundary doesn't mean the hot part of the allocation is properly aligned. shmem.c in fact can't really do much about that - so fully moving the responsibility seems more likely to ensure that future code thinks about alignment. Yes, there is shared memory allocation alignment and object alignment. Since there are only about 50 cases of these, a worst-case change to force 64-byte alignment would only cost 3.2k of shared memory. It might make sense to make them all 64-byte aligned to reduce CPU cache contention, but we have to have actual performance numbers to prove that. My two patches allow individual object alignment to be tested. I have not been able to see any performance difference (1%) with: $ pgbench --initialize --scale 100 pgbench $ pgbench --protocol prepared --client 32 --jobs 16 --time=100 --select-only pgbench on my dual-socket 16 vcore server. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Jan 1, 2015 at 09:04:48PM +0100, Andres Freund wrote: On January 1, 2015 8:49:06 PM CET, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 1, 2015 at 11:59 AM, Andres Freund and...@2ndquadrant.com wrote: The problem is that just aligning the main allocation to some boundary doesn't mean the hot part of the allocation is properly aligned. shmem.c in fact can't really do much about that - so fully moving the responsibility seems more likely to ensure that future code thinks about alignment. That's true, but if you don't align the beginnings of the allocations, then it's a lot more complicated for the code to properly align stuff within the allocation. It's got to insert a variable amount of padding based on the alignment it happens to get. Hm? Allocate +PG_CACHELINE_SIZE and do var = CACHELINEALIGN(var). Yeah, I am afraid we have to do that anyway --- you can see it in my patch too. I guess if you had the shmem allocation aligned on 64-byte boundries and required all allocations to be a multiple of 64, that would work too. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On January 1, 2015 8:49:06 PM CET, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 1, 2015 at 11:59 AM, Andres Freund and...@2ndquadrant.com wrote: The problem is that just aligning the main allocation to some boundary doesn't mean the hot part of the allocation is properly aligned. shmem.c in fact can't really do much about that - so fully moving the responsibility seems more likely to ensure that future code thinks about alignment. That's true, but if you don't align the beginnings of the allocations, then it's a lot more complicated for the code to properly align stuff within the allocation. It's got to insert a variable amount of padding based on the alignment it happens to get. Hm? Allocate +PG_CACHELINE_SIZE and do var = CACHELINEALIGN(var). -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Jan 1, 2015 at 11:59 AM, Andres Freund and...@2ndquadrant.com wrote: The problem is that just aligning the main allocation to some boundary doesn't mean the hot part of the allocation is properly aligned. shmem.c in fact can't really do much about that - so fully moving the responsibility seems more likely to ensure that future code thinks about alignment. That's true, but if you don't align the beginnings of the allocations, then it's a lot more complicated for the code to properly align stuff within the allocation. It's got to insert a variable amount of padding based on the alignment it happens to get. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Sat, Dec 27, 2014 at 08:05:42PM -0500, Robert Haas wrote: On Wed, Dec 24, 2014 at 11:20 AM, Andres Freund and...@2ndquadrant.com wrote: I just verified that I can still reproduce the problem: # aligned case (max_connections=401) afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S progress: 1.0 s, 405170.2 tps, lat 0.195 ms stddev 0.928 progress: 2.0 s, 467011.1 tps, lat 0.204 ms stddev 0.140 progress: 3.0 s, 462832.1 tps, lat 0.205 ms stddev 0.154 progress: 4.0 s, 471035.5 tps, lat 0.202 ms stddev 0.154 progress: 5.0 s, 500329.0 tps, lat 0.190 ms stddev 0.132 BufferDescriptors is at 0x7f63610a6960 (which is 32byte aligned) # unaligned case (max_connections=400) afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S progress: 1.0 s, 202271.1 tps, lat 0.448 ms stddev 1.232 progress: 2.0 s, 223823.4 tps, lat 0.427 ms stddev 3.007 progress: 3.0 s, 227584.5 tps, lat 0.414 ms stddev 4.760 progress: 4.0 s, 221095.6 tps, lat 0.410 ms stddev 4.390 progress: 5.0 s, 217430.6 tps, lat 0.454 ms stddev 7.913 progress: 6.0 s, 210275.9 tps, lat 0.411 ms stddev 0.606 BufferDescriptors is at 0x7f1718aeb980 (which is 64byte aligned) So, should we increase ALIGNOF_BUFFER from 32 to 64? Seems like that's what these results are telling us. I am glad someone else considers this important. Andres reported the above 2x pgbench difference in February, but no action was taken as everyone felt there needed to be more performance testing, but it never happened: http://www.postgresql.org/message-id/20140202151319.gd32...@awork2.anarazel.de I have now performance tested this by developing the attached two patches which both increase the Buffer Descriptors allocation by 64 bytes. The first patch causes each 64-byte Buffer Descriptor struct to align on a 32-byte boundary but not a 64-byte boundary, while the second patch aligns it with a 64-byte boundary. I tried many tests, including this: $ pgbench --initialize --scale 1 pgbench $ pgbench --protocol prepared --client 16 --jobs 16 --transactions 10 --select-only pgbench I cannot measure any difference on my dual-CPU-socket, 16-vcore server (http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012). I thought this test would cause the most Buffer Descriptor contention between the two CPUs. Can anyone else see a difference when testing these two patches? (The patch reports alignment in the server logs.) -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c new file mode 100644 index ff6c713..c4dce5b *** a/src/backend/storage/buffer/buf_init.c --- b/src/backend/storage/buffer/buf_init.c *** InitBufferPool(void) *** 67,72 --- 67,73 bool foundBufs, foundDescs; + fprintf(stderr, Buffer Descriptors size = %ld\n, sizeof(BufferDesc)); BufferDescriptors = (BufferDesc *) ShmemInitStruct(Buffer Descriptors, NBuffers * sizeof(BufferDesc), foundDescs); diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c new file mode 100644 index 2ea2216..669c07f *** a/src/backend/storage/ipc/shmem.c --- b/src/backend/storage/ipc/shmem.c *** ShmemInitStruct(const char *name, Size s *** 327,332 --- 327,335 ShmemIndexEnt *result; void *structPtr; + if (strcmp(name, Buffer Descriptors) == 0) + size = BUFFERALIGN(size) + 64; + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE); if (!ShmemIndex) *** ShmemInitStruct(const char *name, Size s *** 413,418 --- 416,432 \%s\ (%zu bytes requested), name, size))); } + if (strcmp(name, Buffer Descriptors) == 0) + { + /* align on 32 */ + if ((int64)structPtr % 32 != 0) + structPtr = (void *)((int64)structPtr + 32 - (int64)structPtr % 32); + /* align on 32 but not 64 */ + if ((int64)structPtr % 64 == 0) + structPtr = (void *)((int64)structPtr + 32); + } + fprintf(stderr, shared memory alignment of %s: %ld-byte\n, name, + (int64)structPtr % 64 == 0 ? 64 : (int64)structPtr % 64); result-size = size; result-location = structPtr; } diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c new file mode 100644 index ff6c713..c4dce5b *** a/src/backend/storage/buffer/buf_init.c --- b/src/backend/storage/buffer/buf_init.c *** InitBufferPool(void) *** 67,72 --- 67,73 bool foundBufs, foundDescs; + fprintf(stderr, Buffer Descriptors size = %ld\n, sizeof(BufferDesc)); BufferDescriptors = (BufferDesc *) ShmemInitStruct(Buffer Descriptors, NBuffers * sizeof(BufferDesc), foundDescs); diff --git a/src/backend/storage/ipc/shmem.c
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2014-12-29 16:59:05 -0500, Bruce Momjian wrote: I am glad someone else considers this important. I do consider it important. I just considered the lwlock scalability more important... Andres reported the above 2x pgbench difference in February, but no action was taken as everyone felt there needed to be more performance testing, but it never happened: FWIW, I have no idea what exactly should be tested there. Right now I think what we should do is to remove the BUFFERALIGN from shmem.c and instead add explicit alignment code in a couple callers (BufferDescriptors/Blocks, proc.c stuff). $ pgbench --initialize --scale 1 pgbench Scale 1 isn't particularly helpful in benchmarks, not even read only ones. $ pgbench --protocol prepared --client 16 --jobs 16 --transactions 10 --select-only pgbench I'd suspect you're more likely to see differences with a higher client count. Also, I seriously doubt 100k xacts is enough to get stable results - even on my laptop I get 100k+ TPS. I'd suggest using something like -P 1 -T 100 or so. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Wed, Dec 24, 2014 at 11:20 AM, Andres Freund and...@2ndquadrant.com wrote: I just verified that I can still reproduce the problem: # aligned case (max_connections=401) afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S progress: 1.0 s, 405170.2 tps, lat 0.195 ms stddev 0.928 progress: 2.0 s, 467011.1 tps, lat 0.204 ms stddev 0.140 progress: 3.0 s, 462832.1 tps, lat 0.205 ms stddev 0.154 progress: 4.0 s, 471035.5 tps, lat 0.202 ms stddev 0.154 progress: 5.0 s, 500329.0 tps, lat 0.190 ms stddev 0.132 BufferDescriptors is at 0x7f63610a6960 (which is 32byte aligned) # unaligned case (max_connections=400) afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S progress: 1.0 s, 202271.1 tps, lat 0.448 ms stddev 1.232 progress: 2.0 s, 223823.4 tps, lat 0.427 ms stddev 3.007 progress: 3.0 s, 227584.5 tps, lat 0.414 ms stddev 4.760 progress: 4.0 s, 221095.6 tps, lat 0.410 ms stddev 4.390 progress: 5.0 s, 217430.6 tps, lat 0.454 ms stddev 7.913 progress: 6.0 s, 210275.9 tps, lat 0.411 ms stddev 0.606 BufferDescriptors is at 0x7f1718aeb980 (which is 64byte aligned) So, should we increase ALIGNOF_BUFFER from 32 to 64? Seems like that's what these results are telling us. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Wed, Dec 24, 2014 at 10:30:19AM +0100, Andres Freund wrote: On 2014-12-23 22:51:22 -0500, Bruce Momjian wrote: Many of these are 64-byte aligned, including Buffer Descriptors. In that case you need to change max_connections, some settings will lead to unaligned BufferDescriptors. Well, isn't my second patch that misaligns the buffers sufficient for testing? I tested pgbench with these commands: $ pgbench -i -s 95 pgbench $ pgbench -S -c 95 -j 95 -t 10 pgbench on a 16-core Xeon server and got 84k tps. I then applied another patch, attached, which causes all the structures to be non-64-byte aligned, but got the same tps number. 'Xeon' itself doesn't say much. It's been applied to widly different CPUs over the years. I guess that was a single socket server? You're much more likely to see significant problems on a multi node NUMA servers where the penalties for cache misses/false sharing are a magnitude or three higher. Sorry, the server has 2 x Intel Xeon E5620 2.4GHz Quad-Core Processors; the full details are here: http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012 Can someone test these patches on an AMD CPU and see if you see a difference? Thanks. I don't think you'll see a bigger difference there. Uh, I thought AMD showed a huge difference for misalignment: http://www.postgresql.org/message-id/20140202151319.gd32...@awork2.anarazel.de and that email is from you. I ended up running pgbench using 16-scale and got 90k tps: pgbench -S -c 16 -j 16 -t 10 pgbench but again could not see any difference between aligned and misaligned. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2014-12-24 10:00:05 -0500, Bruce Momjian wrote: On Wed, Dec 24, 2014 at 10:30:19AM +0100, Andres Freund wrote: On 2014-12-23 22:51:22 -0500, Bruce Momjian wrote: Many of these are 64-byte aligned, including Buffer Descriptors. In that case you need to change max_connections, some settings will lead to unaligned BufferDescriptors. Well, isn't my second patch that misaligns the buffers sufficient for testing? I hadn't looked at it. Note that you're quite likely to overlap the allocated region into the next one with the trivial method you're using. I just verified that I can still reproduce the problem: # aligned case (max_connections=401) afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S progress: 1.0 s, 405170.2 tps, lat 0.195 ms stddev 0.928 progress: 2.0 s, 467011.1 tps, lat 0.204 ms stddev 0.140 progress: 3.0 s, 462832.1 tps, lat 0.205 ms stddev 0.154 progress: 4.0 s, 471035.5 tps, lat 0.202 ms stddev 0.154 progress: 5.0 s, 500329.0 tps, lat 0.190 ms stddev 0.132 BufferDescriptors is at 0x7f63610a6960 (which is 32byte aligned) # unaligned case (max_connections=400) afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S progress: 1.0 s, 202271.1 tps, lat 0.448 ms stddev 1.232 progress: 2.0 s, 223823.4 tps, lat 0.427 ms stddev 3.007 progress: 3.0 s, 227584.5 tps, lat 0.414 ms stddev 4.760 progress: 4.0 s, 221095.6 tps, lat 0.410 ms stddev 4.390 progress: 5.0 s, 217430.6 tps, lat 0.454 ms stddev 7.913 progress: 6.0 s, 210275.9 tps, lat 0.411 ms stddev 0.606 BufferDescriptors is at 0x7f1718aeb980 (which is 64byte aligned) This is on a quad E5-4620 with 256GB RAM on a scale 100 pgbench instance. Can someone test these patches on an AMD CPU and see if you see a difference? Thanks. I don't think you'll see a bigger difference there. Uh, I thought AMD showed a huge difference for misalignment: http://www.postgresql.org/message-id/20140202151319.gd32...@awork2.anarazel.de Ugh, yes. Forgot that... There was another patch that wasn't showing big differences on AMD, and I mixed it up. I ended up running pgbench using 16-scale and got 90k tps: pgbench -S -c 16 -j 16 -t 10 pgbench but again could not see any difference between aligned and misaligned. At the very least you should use -M prepared, otherwise you'll be bottlenecked by the parser. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Apr 17, 2014 at 11:23:24AM +0200, Andres Freund wrote: On 2014-04-16 19:18:02 -0400, Bruce Momjian wrote: On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote: On 2014-02-05 12:36:42 -0500, Robert Haas wrote: It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. I thought your previous idea of increasing BUFFERALIGN to 64 bytes had a lot to recommend it. Good. I wonder if we shouldn't move that bit of logic: if (size = BUFSIZ) newStart = BUFFERALIGN(newStart); out of ShmemAlloc() and instead have a ShmemAllocAligned() and ShmemInitStructAligned() that does it. So we can sensibly can control it per struct. But that doesn't mean it doesn't need testing. I feel the need here, to say that I never said it doesn't need testing and never thought it didn't... Where are we on this? It needs somebody with time to evaluate possible performance regressions - I personally won't have time to look into this in detail before pgcon. I am doing performance testing to try to complete this item. I used the first attached patch to report which structures are 64-byte aligned: 64-byte shared memory alignment of Control File: 0 64-byte shared memory alignment of XLOG Ctl: 1 64-byte shared memory alignment of CLOG Ctl: 0 64-byte shared memory alignment of CommitTs Ctl: 0 64-byte shared memory alignment of CommitTs shared: 0 64-byte shared memory alignment of SUBTRANS Ctl: 1 64-byte shared memory alignment of MultiXactOffset Ctl: 1 64-byte shared memory alignment of MultiXactMember Ctl: 1 64-byte shared memory alignment of Shared MultiXact State: 1 64-byte shared memory alignment of Buffer Descriptors: 1 64-byte shared memory alignment of Buffer Blocks: 1 64-byte shared memory alignment of Shared Buffer Lookup Table: 1 64-byte shared memory alignment of Buffer Strategy Status: 1 64-byte shared memory alignment of LOCK hash: 0 64-byte shared memory alignment of PROCLOCK hash: 0 64-byte shared memory alignment of Fast Path Strong Relation Lock Data: 0 64-byte shared memory alignment of PREDICATELOCKTARGET hash: 0 64-byte shared memory alignment of PREDICATELOCK hash: 0 64-byte shared memory alignment of PredXactList: 0 64-byte shared memory alignment of SERIALIZABLEXID hash: 1 64-byte shared memory alignment of RWConflictPool: 1 64-byte shared memory alignment of FinishedSerializableTransactions: 0 64-byte shared memory alignment of OldSerXid SLRU Ctl: 1 64-byte shared memory alignment of OldSerXidControlData: 1 64-byte shared memory alignment of Proc Header: 0 64-byte shared memory alignment of Proc Array: 0 64-byte shared memory alignment of Backend Status Array: 0 64-byte shared memory alignment of Backend Application Name Buffer: 0 64-byte shared memory alignment of Backend Client Host Name Buffer: 0 64-byte shared memory alignment of Backend Activity Buffer: 0 64-byte shared memory alignment of Prepared Transaction Table: 0 64-byte shared memory alignment of Background Worker Data: 0 64-byte shared memory alignment of shmInvalBuffer: 1 64-byte shared memory alignment of PMSignalState: 0 64-byte shared memory alignment of ProcSignalSlots: 0 64-byte shared memory alignment of Checkpointer Data: 0 64-byte shared memory alignment of AutoVacuum Data: 0 64-byte shared memory alignment of Wal Sender Ctl: 0 64-byte shared memory alignment of Wal Receiver Ctl: 0 64-byte shared memory alignment of BTree Vacuum State: 0 64-byte shared memory alignment of Sync Scan Locations List: 0 64-byte shared memory alignment of Async Queue Control: 0 64-byte shared memory alignment of Async Ctl: 0 Many of these are 64-byte aligned, including Buffer Descriptors. I tested pgbench with these commands: $ pgbench -i -s 95 pgbench $ pgbench -S -c 95 -j 95 -t 10 pgbench on a 16-core Xeon server and got 84k tps. I then applied another patch, attached, which causes all the structures to be non-64-byte aligned, but got the same tps number. Can someone test these patches on an AMD CPU and see if you see a difference? Thanks. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c new file mode 100644
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Apr 17, 2014 at 11:23:24AM +0200, Andres Freund wrote: On 2014-04-16 19:18:02 -0400, Bruce Momjian wrote: On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote: On 2014-02-05 12:36:42 -0500, Robert Haas wrote: It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. I thought your previous idea of increasing BUFFERALIGN to 64 bytes had a lot to recommend it. Good. I wonder if we shouldn't move that bit of logic: if (size = BUFSIZ) newStart = BUFFERALIGN(newStart); out of ShmemAlloc() and instead have a ShmemAllocAligned() and ShmemInitStructAligned() that does it. So we can sensibly can control it per struct. But that doesn't mean it doesn't need testing. I feel the need here, to say that I never said it doesn't need testing and never thought it didn't... Where are we on this? It needs somebody with time to evaluate possible performance regressions - I personally won't have time to look into this in detail before pgcon. Again, has anyone made any headway on this? Is it a TODO item? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2014-04-16 19:18:02 -0400, Bruce Momjian wrote: On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote: On 2014-02-05 12:36:42 -0500, Robert Haas wrote: It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. I thought your previous idea of increasing BUFFERALIGN to 64 bytes had a lot to recommend it. Good. I wonder if we shouldn't move that bit of logic: if (size = BUFSIZ) newStart = BUFFERALIGN(newStart); out of ShmemAlloc() and instead have a ShmemAllocAligned() and ShmemInitStructAligned() that does it. So we can sensibly can control it per struct. But that doesn't mean it doesn't need testing. I feel the need here, to say that I never said it doesn't need testing and never thought it didn't... Where are we on this? It needs somebody with time to evaluate possible performance regressions - I personally won't have time to look into this in detail before pgcon. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote: On 2014-02-05 12:36:42 -0500, Robert Haas wrote: It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. I thought your previous idea of increasing BUFFERALIGN to 64 bytes had a lot to recommend it. Good. I wonder if we shouldn't move that bit of logic: if (size = BUFSIZ) newStart = BUFFERALIGN(newStart); out of ShmemAlloc() and instead have a ShmemAllocAligned() and ShmemInitStructAligned() that does it. So we can sensibly can control it per struct. But that doesn't mean it doesn't need testing. I feel the need here, to say that I never said it doesn't need testing and never thought it didn't... Where are we on this? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2014-02-05 12:36:42 -0500, Robert Haas wrote: It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. I thought your previous idea of increasing BUFFERALIGN to 64 bytes had a lot to recommend it. Good. I wonder if we shouldn't move that bit of logic: if (size = BUFSIZ) newStart = BUFFERALIGN(newStart); out of ShmemAlloc() and instead have a ShmemAllocAligned() and ShmemInitStructAligned() that does it. So we can sensibly can control it per struct. But that doesn't mean it doesn't need testing. I feel the need here, to say that I never said it doesn't need testing and never thought it didn't... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
Andres Freund and...@2ndquadrant.com writes: And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it just wasn't updated in the last 10 years. No, ALIGNOF_BUFFER is there because we read something that said that I/O transfers between userspace and kernel disk cache would be faster with aligned buffers. There's been no particular thought given to alignment of other data structures, AFAIR. It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On 2014-02-05 11:23:29 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it just wasn't updated in the last 10 years. No, ALIGNOF_BUFFER is there because we read something that said that I/O transfers between userspace and kernel disk cache would be faster with aligned buffers. There's been no particular thought given to alignment of other data structures, AFAIR. But it's not aligned anymore on at last half a decade's hardware, and it's what we already align *all* bigger ShmemAlloc() values with. And BufferDescriptors surely counts as larger in its entirety. It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD
On Wed, Feb 5, 2014 at 11:37 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-05 11:23:29 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it just wasn't updated in the last 10 years. No, ALIGNOF_BUFFER is there because we read something that said that I/O transfers between userspace and kernel disk cache would be faster with aligned buffers. There's been no particular thought given to alignment of other data structures, AFAIR. But it's not aligned anymore on at last half a decade's hardware, and it's what we already align *all* bigger ShmemAlloc() values with. And BufferDescriptors surely counts as larger in its entirety. It may well be that your proposal is spot on. But I'd like to see some data-structure-by-data-structure measurements, rather than assuming that alignment must be a good thing. I am fine with just aligning BufferDescriptors properly. That has clearly shown massive improvements. I thought your previous idea of increasing BUFFERALIGN to 64 bytes had a lot to recommend it. But that doesn't mean it doesn't need testing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers