Re: [HACKERS] Spinlocks and compiler/memory barriers
On Wed, Sep 24, 2014 at 2:45 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-09 17:54:03 -0400, Robert Haas wrote: So, that's committed, then. I think we should pick something that uses spinlocks and is likely to fail spectacularly if we haven't got this totally right yet, and de-volatilize it. And then watch to see what turns red in the buildfarm and/or which users start screaming. I'm inclined to propose lwlock.c as a candidate, since that's very widely used and a place where we know there's significant contention. Did you consider removing the volatiles from bufmgr.c? There's lots of volatiles in there and most of them don't seem to have been added in a principled way. I'm looking at my old patch for lockless pin/unpin of buffers and it'd look a lot cleaner without. I hadn't thought of it, but it sounds like a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Sep 9, 2014 at 6:00 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-09 17:54:03 -0400, Robert Haas wrote: So, that's committed, then. Yay, finally. I think we should pick something that uses spinlocks and is likely to fail spectacularly if we haven't got this totally right yet, and de-volatilize it. And then watch to see what turns red in the buildfarm and/or which users start screaming. Good plan. I'm inclined to propose lwlock.c as a candidate, since that's very widely used and a place where we know there's significant contention. I suggest, additionally possibly, GetSnapshotData(). It's surely one of the hottest functions in postgres, and I've seen some performance increases from de-volatilizing it. IIRC it requires one volatile cast in one place to enforce that a variable is accessed only once. Not sure if we want to add such volatile casts or use something like linux' ACCESS_ONCE macros for some common types. Helps to grep for places worthy of inspection. GetSnapshotData() isn't quite to the point, because it's using a volatile pointer, but not because of anything about spinlock manipulation. That would, perhaps, be a good test for barriers, but not for this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-09-04 08:18:37 -0400, Robert Haas wrote: On Tue, Aug 5, 2014 at 11:55 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com wrote: If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. I've gone ahead and done the second of these things. Thanks. Andres, do you want to go take a stab at fixing the SPARC stuff? Will do, will probably take me till thursday to come up with the brain cycles. Ping? This has been pending for almost two months now and, at your request, my patch to make spinlocks act as compiler barriers is waiting behind it. Can we please get this moving again soon, or can I commit that patch and you can fix this when you get around to it? I finally pushed this. And once more I seriously got pissed at the poor overall worldwide state of documentation and continously changing terminology around this. Sorry for taking this long :( Do you have a current version of your patch to make them compiler barriers? Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Sep 8, 2014 at 7:10 PM, Andres Freund and...@2ndquadrant.com wrote: This has been pending for almost two months now and, at your request, my patch to make spinlocks act as compiler barriers is waiting behind it. Can we please get this moving again soon, or can I commit that patch and you can fix this when you get around to it? I finally pushed this. And once more I seriously got pissed at the poor overall worldwide state of documentation and continously changing terminology around this. There does seem to be a deficit in that area. Sorry for taking this long :( Do you have a current version of your patch to make them compiler barriers? I had forgotten that it needed an update. Thanks for the reminder. Here's v2. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c index efe1b43..38dc34d 100644 --- a/src/backend/storage/lmgr/s_lock.c +++ b/src/backend/storage/lmgr/s_lock.c @@ -154,6 +154,17 @@ s_lock(volatile slock_t *lock, const char *file, int line) return delays; } +#ifdef USE_DEFAULT_S_UNLOCK +void +s_unlock(slock_t *lock) +{ +#ifdef TAS_ACTIVE_WORD + *TAS_ACTIVE_WORD(lock) = -1; +#else + *lock = 0; +#endif +} +#endif /* * Set local copy of spins_per_delay during backend startup. diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 06dc963..8e0c4c3 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -55,14 +55,16 @@ * on Alpha TAS() will fail if interrupted. Therefore a retry loop must * always be used, even if you are certain the lock is free. * - * Another caution for users of these macros is that it is the caller's - * responsibility to ensure that the compiler doesn't re-order accesses - * to shared memory to precede the actual lock acquisition, or follow the - * lock release. Typically we handle this by using volatile-qualified - * pointers to refer to both the spinlock itself and the shared data - * structure being accessed within the spinlocked critical section. - * That fixes it because compilers are not allowed to re-order accesses - * to volatile objects relative to other such accesses. + * It is the responsibility of these macros to make sure that the compiler + * does not re-order accesses to shared memory to precede the actual lock + * acquisition, or follow the lock release. Prior to PostgreSQL 9.5, this + * was the caller's responsibility, which meant that callers had to use + * volatile-qualified pointers to refer to both the spinlock itself and the + * shared data being accessed within the spinlocked critical section. This + * was notationally awkward, easy to forget (and thus error-prone), and + * prevented some useful compiler optimizations. For these reasons, we + * now require that the macros themselves prevent compiler re-ordering, + * so that the caller doesn't need to take special precautions. * * On platforms with weak memory ordering, the TAS(), TAS_SPIN(), and * S_UNLOCK() macros must further include hardware-level memory fence @@ -484,14 +486,14 @@ tas(volatile slock_t *lock) #define S_UNLOCK(lock) \ do \ { \ - __asm__ __volatile__ ( lwsync \n); \ + __asm__ __volatile__ ( lwsync \n ::: memory); \ *((volatile slock_t *) (lock)) = 0; \ } while (0) #else #define S_UNLOCK(lock) \ do \ { \ - __asm__ __volatile__ ( sync \n); \ + __asm__ __volatile__ ( sync \n ::: memory); \ *((volatile slock_t *) (lock)) = 0; \ } while (0) #endif /* USE_PPC_LWSYNC */ @@ -599,7 +601,9 @@ do \ .set noreorder \n \ .set nomacro\n \ sync\n \ - .set pop ); \ + .set pop +: +: memory); *((volatile slock_t *) (lock)) = 0; \ } while (0) @@ -657,6 +661,23 @@ tas(volatile slock_t *lock) typedef unsigned char slock_t; #endif +/* + * Note that this implementation is unsafe for any platform that can speculate + * a memory access (either load or store) after a following store. That + * happens not to be possible x86 and most legacy architectures (some are + * single-processor!), but many modern systems have weaker memory ordering. + * Those that do must define their own version S_UNLOCK() rather than relying + * on this one. + */ +#if !defined(S_UNLOCK) +#if defined(__INTEL_COMPILER) +#define S_UNLOCK(lock) \ + do { __memory_barrier(); *(lock) = 0; } while (0) +#else +#define S_UNLOCK(lock) \ + do { __asm__ __volatile__( : : : memory); *(lock) = 0; } while (0) +#endif +#endif #endif /* defined(__GNUC__) || defined(__INTEL_COMPILER) */ @@ -730,9 +751,13 @@ tas(volatile slock_t *lock) return (lockval == 0); } -#endif /* __GNUC__ */ +#define S_UNLOCK(lock) \ + do { \ + __asm__ __volatile__( : : : memory); \ + *TAS_ACTIVE_WORD(lock) = -1; \ + } while (0) -#define S_UNLOCK(lock) (*TAS_ACTIVE_WORD(lock) = -1) +#endif /* __GNUC__ */ #define
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-09-09 13:52:40 -0400, Robert Haas wrote: I had forgotten that it needed an update. Thanks for the reminder. Here's v2. I've attached a incremental patch fixing minor gripes. Other than that I think you can go ahead with this once the buildfarm accepts the sparc fixes (man, those machines are slow. spoonbill apparently takes ~5h for one run). I've done a read through s_lock.h and the only remaining potential issue that I see is that I have no idea if unixware's tas() is actually a safe compiler barrier (it is a memory barrier). And I really, really can't make myself care. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c index 38dc34d..e8d3502 100644 --- a/src/backend/storage/lmgr/s_lock.c +++ b/src/backend/storage/lmgr/s_lock.c @@ -159,6 +159,7 @@ void s_unlock(slock_t *lock) { #ifdef TAS_ACTIVE_WORD + /* HP's PA-RISC */ *TAS_ACTIVE_WORD(lock) = -1; #else *lock = 0; diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 8e0c4c3..5f62a2c 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -57,8 +57,8 @@ * * It is the responsibility of these macros to make sure that the compiler * does not re-order accesses to shared memory to precede the actual lock - * acquisition, or follow the lock release. Prior to PostgreSQL 9.5, this - * was the caller's responsibility, which meant that callers had to use + * acquisition, or following the lock release. Prior to PostgreSQL 9.5, + * this was the caller's responsibility, which meant that callers had to use * volatile-qualified pointers to refer to both the spinlock itself and the * shared data being accessed within the spinlocked critical section. This * was notationally awkward, easy to forget (and thus error-prone), and @@ -403,7 +403,7 @@ tas(volatile slock_t *lock) * No stbar or membar available, luckily no actually produced hardware * requires a barrier. */ -#define S_UNLOCK(lock) (*((volatile slock_t *) (lock)) = 0) +/* use the default S_UNLOCK definition for gcc */ #elif __sparcv8 /* stbar is available (and required for both PSO, RMO), membar isn't */ #define S_UNLOCK(lock) \ @@ -921,14 +921,14 @@ extern int tas_sema(volatile slock_t *lock); * store) after a following store; platforms where this is possible must * define their own S_UNLOCK. But CPU reordering is not the only concern: * if we simply defined S_UNLOCK() as an inline macro, the compiler might - * reorder instructions from the critical section to occur after the lock - * release. Since the compiler probably can't know what the external + * reorder instructions from inside the critical section to occur after the + * lock release. Since the compiler probably can't know what the external * function s_unlock is doing, putting the same logic there should be adequate. * A sufficiently-smart globally optimizing compiler could break that * assumption, though, and the cost of a function call for every spinlock * release may hurt performance significantly, so we use this implementation * only for platforms where we don't know of a suitable intrinsic. For the - * most part, those are relatively obscure platform/compiler platforms to + * most part, those are relatively obscure platform/compiler platforms to * which the PostgreSQL project does not have access. */ #define USE_DEFAULT_S_UNLOCK -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Sep 9, 2014 at 5:09 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-09 13:52:40 -0400, Robert Haas wrote: I had forgotten that it needed an update. Thanks for the reminder. Here's v2. I've attached a incremental patch fixing minor gripes. Other than that I think you can go ahead with this once the buildfarm accepts the sparc fixes (man, those machines are slow. spoonbill apparently takes ~5h for one run). I've done a read through s_lock.h and the only remaining potential issue that I see is that I have no idea if unixware's tas() is actually a safe compiler barrier (it is a memory barrier). And I really, really can't make myself care. *It is the responsibility of these macros to make sure that the compiler *does not re-order accesses to shared memory to precede the actual lock - *acquisition, or follow the lock release. Prior to PostgreSQL 9.5, this - *was the caller's responsibility, which meant that callers had to use + *acquisition, or following the lock release. Prior to PostgreSQL 9.5, + * this was the caller's responsibility, which meant that callers had to use AFAICS my version is right and your version is grammatically incorrect. re-order to proceed or follow uses the same verb tense in both branches of the or; re-order to proceed or following does not. I agree that if there are problems on UnixWare, we can let anyone who cares about UnixWare submit a patch to fix them. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-09-09 17:30:44 -0400, Robert Haas wrote: On Tue, Sep 9, 2014 at 5:09 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-09 13:52:40 -0400, Robert Haas wrote: I had forgotten that it needed an update. Thanks for the reminder. Here's v2. I've attached a incremental patch fixing minor gripes. Other than that I think you can go ahead with this once the buildfarm accepts the sparc fixes (man, those machines are slow. spoonbill apparently takes ~5h for one run). I've done a read through s_lock.h and the only remaining potential issue that I see is that I have no idea if unixware's tas() is actually a safe compiler barrier (it is a memory barrier). And I really, really can't make myself care. *It is the responsibility of these macros to make sure that the compiler *does not re-order accesses to shared memory to precede the actual lock - *acquisition, or follow the lock release. Prior to PostgreSQL 9.5, this - *was the caller's responsibility, which meant that callers had to use + *acquisition, or following the lock release. Prior to PostgreSQL 9.5, + * this was the caller's responsibility, which meant that callers had to use AFAICS my version is right and your version is grammatically incorrect. re-order to proceed or follow uses the same verb tense in both branches of the or; re-order to proceed or following does not. Wasn't sure about that one. It read oddly to me, but then I'm not a native speaker. And won't read the sentence often ;) I agree that if there are problems on UnixWare, we can let anyone who cares about UnixWare submit a patch to fix them. Ok. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Sep 9, 2014 at 5:32 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-09 17:30:44 -0400, Robert Haas wrote: On Tue, Sep 9, 2014 at 5:09 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-09 13:52:40 -0400, Robert Haas wrote: I had forgotten that it needed an update. Thanks for the reminder. Here's v2. I've attached a incremental patch fixing minor gripes. Other than that I think you can go ahead with this once the buildfarm accepts the sparc fixes (man, those machines are slow. spoonbill apparently takes ~5h for one run). I've done a read through s_lock.h and the only remaining potential issue that I see is that I have no idea if unixware's tas() is actually a safe compiler barrier (it is a memory barrier). And I really, really can't make myself care. *It is the responsibility of these macros to make sure that the compiler *does not re-order accesses to shared memory to precede the actual lock - *acquisition, or follow the lock release. Prior to PostgreSQL 9.5, this - *was the caller's responsibility, which meant that callers had to use + *acquisition, or following the lock release. Prior to PostgreSQL 9.5, + * this was the caller's responsibility, which meant that callers had to use AFAICS my version is right and your version is grammatically incorrect. re-order to proceed or follow uses the same verb tense in both branches of the or; re-order to proceed or following does not. Wasn't sure about that one. It read oddly to me, but then I'm not a native speaker. And won't read the sentence often ;) I agree that if there are problems on UnixWare, we can let anyone who cares about UnixWare submit a patch to fix them. Ok. So, that's committed, then. I think we should pick something that uses spinlocks and is likely to fail spectacularly if we haven't got this totally right yet, and de-volatilize it. And then watch to see what turns red in the buildfarm and/or which users start screaming. I'm inclined to propose lwlock.c as a candidate, since that's very widely used and a place where we know there's significant contention. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-09-09 17:54:03 -0400, Robert Haas wrote: So, that's committed, then. Yay, finally. I think we should pick something that uses spinlocks and is likely to fail spectacularly if we haven't got this totally right yet, and de-volatilize it. And then watch to see what turns red in the buildfarm and/or which users start screaming. Good plan. I'm inclined to propose lwlock.c as a candidate, since that's very widely used and a place where we know there's significant contention. I suggest, additionally possibly, GetSnapshotData(). It's surely one of the hottest functions in postgres, and I've seen some performance increases from de-volatilizing it. IIRC it requires one volatile cast in one place to enforce that a variable is accessed only once. Not sure if we want to add such volatile casts or use something like linux' ACCESS_ONCE macros for some common types. Helps to grep for places worthy of inspection. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-09-04 14:19:47 +0200, Andres Freund wrote: Yes. I plan to push the patch this weekend. Sorry for the delay. I'm about to push this. Is it ok to first push it to master and only backpatch once a couple buildfarm cycles haven't complained? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Sep 8, 2014 at 8:07 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-04 14:19:47 +0200, Andres Freund wrote: Yes. I plan to push the patch this weekend. Sorry for the delay. I'm about to push this. Is it ok to first push it to master and only backpatch once a couple buildfarm cycles haven't complained? That will have the disadvantage that src/tools/git_changelog will show the commits separately instead of grouping them together; so it's probably best not to make a practice of it. But I think it's up to your discretion how to handle it in any particular case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Andres Freund and...@2ndquadrant.com writes: On 2014-09-04 14:19:47 +0200, Andres Freund wrote: Yes. I plan to push the patch this weekend. Sorry for the delay. I'm about to push this. Is it ok to first push it to master and only backpatch once a couple buildfarm cycles haven't complained? It makes for a cleaner commit history if you push concurrently into all the branches you intend to patch. That also gives more buildfarm runs, which seems like a good thing for this sort of patch. That is, assuming that we ought to backpatch at all, which to my mind is debatable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Sep 8, 2014 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: It makes for a cleaner commit history if you push concurrently into all the branches you intend to patch. That also gives more buildfarm runs, which seems like a good thing for this sort of patch. That is, assuming that we ought to backpatch at all, which to my mind is debatable. We're not going to backpatch the main patch to make spinlock primitives act as compiler barriers - or at least, I will object loudly. But what we're talking about here is a bug fix for Sparc. And surely we ought to back-patch that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Sep 8, 2014 at 10:08:04AM -0400, Robert Haas wrote: On Mon, Sep 8, 2014 at 8:07 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-04 14:19:47 +0200, Andres Freund wrote: Yes. I plan to push the patch this weekend. Sorry for the delay. I'm about to push this. Is it ok to first push it to master and only backpatch once a couple buildfarm cycles haven't complained? That will have the disadvantage that src/tools/git_changelog will show the commits separately instead of grouping them together; so it's probably best not to make a practice of it. But I think it's up to your discretion how to handle it in any particular case. Uh, git_changelog timespan check is 24 hours, so if the delay is less then 24 hours, I think we are ok, e.g.: # Might want to make this parameter user-settable. my $timestamp_slop = 24 * 60 * 60; -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Robert Haas robertmh...@gmail.com writes: But what we're talking about here is a bug fix for Sparc. And surely we ought to back-patch that. Ah. OK, no objection. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Aug 5, 2014 at 11:55 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com wrote: If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. I've gone ahead and done the second of these things. Thanks. Andres, do you want to go take a stab at fixing the SPARC stuff? Will do, will probably take me till thursday to come up with the brain cycles. Ping? This has been pending for almost two months now and, at your request, my patch to make spinlocks act as compiler barriers is waiting behind it. Can we please get this moving again soon, or can I commit that patch and you can fix this when you get around to it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On September 4, 2014 2:18:37 PM CEST, Robert Haas robertmh...@gmail.com wrote: On Tue, Aug 5, 2014 at 11:55 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com wrote: If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. I've gone ahead and done the second of these things. Thanks. Andres, do you want to go take a stab at fixing the SPARC stuff? Will do, will probably take me till thursday to come up with the brain cycles. Ping? This has been pending for almost two months now and, at your request, my patch to make spinlocks act as compiler barriers is waiting behind it. Can we please get this moving again soon, or can I commit that patch and you can fix this when you get around to it? Yes. I plan to push the patch this weekend. Sorry for the delay. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com wrote: If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. I've gone ahead and done the second of these things. Thanks. Andres, do you want to go take a stab at fixing the SPARC stuff? Will do, will probably take me till thursday to come up with the brain cycles. Ping? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 12:22 PM, Robert Haas robertmh...@gmail.com wrote: Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. I've gone ahead and done the second of these things. Andres, do you want to go take a stab at fixing the SPARC stuff? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-07-06 15:02:21 -0400, Robert Haas wrote: On Tue, Jul 1, 2014 at 12:22 PM, Robert Haas robertmh...@gmail.com wrote: Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. I've gone ahead and done the second of these things. Thanks. Andres, do you want to go take a stab at fixing the SPARC stuff? Will do, will probably take me till thursday to come up with the brain cycles. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-07-01 20:21:37 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote: Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. Well. I have to admit I'm really not interested in investing that much time in something I've no stake in. If postgres developers have to put emulated machines to develop features something imo went seriously wrong. That's more effort than at least I'm willing to spend. Perhaps more to the point, I have no faith at all that an emulator will mimic multiprocessor timing behavior to the level of detail needed to tell whether memory-barrier-related logic works. See the VAX discussion just a couple days ago. Well, it would allow us to see wether fixed stuff actually compiles and runs - that's not nothing. The biggest problem with fixing stuff like armv5, sparc8, whatever is that it requires adding stuff to our inline assembly. It's easy to accidentally make it not compile, but comparatively harder to make the behaviour even worse than before. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 02/07/14 08:36, Andres Freund wrote: On 2014-07-01 20:21:37 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote: Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. Well. I have to admit I'm really not interested in investing that much time in something I've no stake in. If postgres developers have to put emulated machines to develop features something imo went seriously wrong. That's more effort than at least I'm willing to spend. Perhaps more to the point, I have no faith at all that an emulator will mimic multiprocessor timing behavior to the level of detail needed to tell whether memory-barrier-related logic works. See the VAX discussion just a couple days ago. Well, it would allow us to see wether fixed stuff actually compiles and runs - that's not nothing. The biggest problem with fixing stuff like armv5, sparc8, whatever is that it requires adding stuff to our inline assembly. It's easy to accidentally make it not compile, but comparatively harder to make the behaviour even worse than before. The point I wanted to make was that there are certain applications for which SPARCv8 is still certified for particular military/aerospace use. While I don't use it myself, some people are still using it enough to want to contribute QEMU patches. In terms of QEMU, the main reason for mentioning it was that if the consensus were to keep SPARCv8 support then this could be one possible way to provide basic buildfarm support (although as Tom rightly points out, there would need to be testing to build up confidence that the emulator does the right thing in a multiprocessor environment). ATB, Mark. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Wed, Jul 2, 2014 at 4:06 AM, Mark Cave-Ayland mark.cave-ayl...@ilande.co.uk wrote: The point I wanted to make was that there are certain applications for which SPARCv8 is still certified for particular military/aerospace use. While I don't use it myself, some people are still using it enough to want to contribute QEMU patches. In terms of QEMU, the main reason for mentioning it was that if the consensus were to keep SPARCv8 support then this could be one possible way to provide basic buildfarm support (although as Tom rightly points out, there would need to be testing to build up confidence that the emulator does the right thing in a multiprocessor environment). I think we're getting off-track here. The PostgreSQL project is always willing to accept new buildfarm machines. In the absence of buildfarm machines, we try not to go around randomly breaking things that were previously working. But the current situation is that we have good reason to suspect that a couple of very old platforms are subtly broken. In that situation, I think removing the thought-to-be-broken-code is the only sensible approach. Now, we're not crossing any sort of Rubicon here. If buildfarm machines show up - or, hell, if ONE person with access to the hardware OR a simulator shows up and can compile test EVEN ONCE that a patch to re-add support compiles and that the regression tests pass - then we can add support back in two shakes of a lamb's tail. In the meantime, leaving broken code in our tree is not a service to anyone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-30 22:44:58 -0400, Robert Haas wrote: On Mon, Jun 30, 2014 at 6:28 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-30 19:22:59 +0200, Andres Freund wrote: On 2014-06-30 12:46:29 -0400, Robert Haas wrote: , which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I've to revise my statement on MIPS, it actually looks safe. I seem to have missed that it has its own S_UNLOCK. So, here's my first blind attempt at fixing these. Too tired to think much more about it. Sparc's configurable cache coherency drives me nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed memory order) to TSO (total store order), solaris always used TSO, but the BSDs don't. Man. Accordingly there's a somewhat ugly thingy attached. I don't think this is really ready, but it's what I can come up today. You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. On the other hand, the swpb release thing isn't too complicated and just the inverse of existing code. I'm quite in favor of doing what we can to support obscure architectures, but I think this might be beyond what's reasonable. Yea, I felt like I was going mad doing it. Perhaps I should have stopped. Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote: You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. Yeah, I didn't realize at the time that you were making that argument that the existing code was thought to be broken on its own terms. Removing probably-working code that we just can't test easily is, in my mind, quite different from removing probably-broken code for which we can't test a fix. By the time PostgreSQL 9.5 is released, GCC 4.4 will be six years old, and telling people on an obscure platform that few operating system distributions support that they can't use a brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a serious problem, especially since the only alternative we can offer is compiling against completely-untested code. Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken. If anyone cares, let them complain... - Nothing else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote: You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. Yeah, I didn't realize at the time that you were making that argument that the existing code was thought to be broken on its own terms. Removing probably-working code that we just can't test easily is, in my mind, quite different from removing probably-broken code for which we can't test a fix. By the time PostgreSQL 9.5 is released, GCC 4.4 will be six years old, and telling people on an obscure platform that few operating system distributions support that they can't use a brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a serious problem, especially since the only alternative we can offer is compiling against completely-untested code. A few years back I ported the postresql client libraries and a few other pieces of software (in particular subversion) to a lot of obscure platforms (old sparc, hpux, irix, older aix, etc etc). Getting a modern gcc working on those platforms (with the possible exception of aix) is in many cases difficult or impossible. So requiring new gcc is exactly equivalent to desupporting. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-07-01 11:46:19 -0500, Merlin Moncure wrote: On Tue, Jul 1, 2014 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote: You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. Yeah, I didn't realize at the time that you were making that argument that the existing code was thought to be broken on its own terms. Removing probably-working code that we just can't test easily is, in my mind, quite different from removing probably-broken code for which we can't test a fix. By the time PostgreSQL 9.5 is released, GCC 4.4 will be six years old, and telling people on an obscure platform that few operating system distributions support that they can't use a brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a serious problem, especially since the only alternative we can offer is compiling against completely-untested code. A few years back I ported the postresql client libraries and a few other pieces of software (in particular subversion) to a lot of obscure platforms (old sparc, hpux, irix, older aix, etc etc). Getting a modern gcc working on those platforms (with the possible exception of aix) is in many cases difficult or impossible. Well, we're talking about a case where the current code is *broken*. Subtly so. Where we have no way of testing potential fixes. If you/somebody steps up to fix test that, cool. But I don't see it as a service to anyone to ship broken stuff. So requiring new gcc is exactly equivalent to desupporting. Yea, right. The case we're talking about here is armv5 and some armv6's (the current fallback inline assembly doesn't work on all v6s). If postgres were indeed in use and updated on such a platform it'd be cross compiled with near certainty. The other case is sparcv7 and sparcv8 (not v8+). The former has been dropped from solaris8, the latter from solaris9. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 12:46 PM, Merlin Moncure mmonc...@gmail.com wrote: A few years back I ported the postresql client libraries and a few other pieces of software (in particular subversion) to a lot of obscure platforms (old sparc, hpux, irix, older aix, etc etc). Getting a modern gcc working on those platforms (with the possible exception of aix) is in many cases difficult or impossible. So requiring new gcc is exactly equivalent to desupporting. I took a look around to see which operating systems support which hardware platforms. It's a little hard to tell who is supporting which operating systems, because the terminology is different on different project web sites, and it's hard to tell which things are actually ARMv5, as opposed to something else. But it seems that NetBSD is has by far the largest list of supported platforms, and it appears that they compile their current releases with either gcc 4.5 or gcc 4.8, depending on the particular port: http://www.netbsd.org/developers/features/ According to http://www.netbsd.org/ports/ the ports that run on some form of ARM are acorn26, acorn32, cats, epoc32, evbarm, hpcarm, iyonix, netwinder, shark, and zaurus. epoc32 is not listed at all on the features link above, but the others are all listed as using gcc 4.8.x. Even if they were using gcc 4.5.x, though, that would be good enough, and most platforms that are now on 4.8.x were on 4.5.x before the 4.8.x import got done: http://mail-index.netbsd.org/tech-userlevel/2014/02/18/msg008484.html Apparently, the last holdout preventing removal of gcc 4.1.x from NetBSD was the vax port, which has happily now been upgraded to 4.8.x. Debian also supports ARMv5; actually, they support ARMv4t and higher: https://wiki.debian.org/ArmEabiPort Debian has shipped a sufficiently-new compiler for our purposes since 6.0 (squeeze), released in 2009: https://packages.debian.org/squeeze/gcc So it's clearly *possible* to get newer gcc versions running on ARMv5. I will grant you that it may not be the easiest thing to do on an existing installation. But I don't think having us continue to ship either known-broken code or completely-untested code is any better. If someone needs to get PostgreSQL 9.5 running on ARMv5 using an older gcc, they can test Andres's already-posted patch and, if it works, we can commit it. What I think doesn't make sense is to ship it untested and claim we have support when that really might or might not be true. Also, none of this would affect the PostgreSQL client libraries that you are talking about. s_lock.h is only used by the backend. The bottom line is that I love supporting obscure platforms as much as anyone here, and several other committers are already telling me that I love it too much. We've got to draw the line somewhere, and I think refusing to ship newly-written code that we have exactly zero means of testing is a pretty good place to draw it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Robert Haas robertmh...@gmail.com writes: The bottom line is that I love supporting obscure platforms as much as anyone here, and several other committers are already telling me that I love it too much. We've got to draw the line somewhere, and I think refusing to ship newly-written code that we have exactly zero means of testing is a pretty good place to draw it. I'm good with the concept of expecting anyone who complains about lack of support for $platform to provide resources for testing/debugging PG on that platform; and that pending arrival of such resources it's okay to consider the platform desupported. What concerns me here is what level of support we could provide even with adequate resources. That is, are we going to be buying into a scenario where platforms with poor atomics support take a significant performance hit compared to the current state of affairs? I don't think that would be good. Another way of framing the problem is in response to Andres' upthread comment that relying on emulated atomics makes things much easier to reason about. That may be true as far as correctness is concerned but it's patently false for reasoning about performance. This ties into Robert's worry about how many different hardware performance profiles we're going to have to concern ourselves with. Basically the future that concerns me is that we perform well on x86_64 hardware (which I believe is pretty much all that any active developers are using) and poorly on other hardware. I don't want to end up there, but I think the current direction of this patch pretty much guarantees that outcome. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 2:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The bottom line is that I love supporting obscure platforms as much as anyone here, and several other committers are already telling me that I love it too much. We've got to draw the line somewhere, and I think refusing to ship newly-written code that we have exactly zero means of testing is a pretty good place to draw it. I'm good with the concept of expecting anyone who complains about lack of support for $platform to provide resources for testing/debugging PG on that platform; and that pending arrival of such resources it's okay to consider the platform desupported. What concerns me here is what level of support we could provide even with adequate resources. That is, are we going to be buying into a scenario where platforms with poor atomics support take a significant performance hit compared to the current state of affairs? I don't think that would be good. Another way of framing the problem is in response to Andres' upthread comment that relying on emulated atomics makes things much easier to reason about. That may be true as far as correctness is concerned but it's patently false for reasoning about performance. This ties into Robert's worry about how many different hardware performance profiles we're going to have to concern ourselves with. I have to admit that my concerns in that area were ameliorated to a significant degree by the performance results that Andres posted yesterday. On top of the atomics patch (which is not this thread), he's got a rewritten lwlock patch that uses those atomics. His testing showed that, even when that patch uses the atomics emulation layer instead of real atomics, it's still better than the status quo. Maybe that won't be true for every patch that uses atomics, but it's certainly good as far as it goes. But I think the issue on this sub-thread is a different one altogether: in studying s_lock.h, Andres found clear evidence that the ARMv5 and Sun Studio spinlock implementations are in fact *buggy*. Even if the programmer uses volatile pointers until they turn blue in the face, nothing in that patch prevents the CPU from changing the apparent order of execution such that instructions within the spinlock-protected critical section appear to execute after lock release. That is no good. It means that if somebody runs PostgreSQL on those platforms and tries to do non-trivial things with it, it's probably going to break. Now, if we want, we can simply ignore that problem. No actual users have complained about it, and it's not entirely outside the bounds of plausibility that Andres is wrong, and those implementations are not buggy after all. But I don't think he's wrong. If we assume he's right, then we've got two choices: we can either blindly install fixes for those platforms that we can't test, or we can drop support. Basically the future that concerns me is that we perform well on x86_64 hardware (which I believe is pretty much all that any active developers are using) and poorly on other hardware. I don't want to end up there, but I think the current direction of this patch pretty much guarantees that outcome. I think this concern is more germane to the atomics patch than what we're talking about here, because what we're talking about here, at the moment, is either attempting to repair, or just rip out completely, S_UNLOCK() implementations that appear to be buggy. I have access to several PPC64 systems, use them regularly for benchmarking and performance-testing, and can provide access to one of those to others who may need it for testing. For that reason, I think at least x64/amd64 and PPC64 will get tested regularly, at least as long as IBM keeps that hardware on loan to us. I previously had access to an Itanium server and used that for a lot of testing during the 9.2 cycle, but that eventually went away. I think it would be awfully nice if some hardware vendor (or interested community member) could provide us with access to some top-quality SPARC, ARM, and maybe zSeries machines for testing, but nobody's offered that I know of. We can't test hardware we don't have. Despite my concerns about keeping the list of supported atomics short, and I do have concerns in that area, I'm not really sure that we have much choice but to go in that direction. We can't accept a 5x performance hit in the name of portability, and that's literally what we're talking about in some cases. I definitely want to think carefully about how we proceed in this area but doing nothing doesn't seem like an option. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Robert Haas robertmh...@gmail.com writes: Despite my concerns about keeping the list of supported atomics short, and I do have concerns in that area, I'm not really sure that we have much choice but to go in that direction. We can't accept a 5x performance hit in the name of portability, and that's literally what we're talking about in some cases. I definitely want to think carefully about how we proceed in this area but doing nothing doesn't seem like an option. To be clear, I'm not advocating doing nothing (and I don't think anyone else is). It's obvious based on Andres' results that we want to use atomics on platforms where they're well-supported. The argument is around what we're going to do for other platforms. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 4:54 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Despite my concerns about keeping the list of supported atomics short, and I do have concerns in that area, I'm not really sure that we have much choice but to go in that direction. We can't accept a 5x performance hit in the name of portability, and that's literally what we're talking about in some cases. I definitely want to think carefully about how we proceed in this area but doing nothing doesn't seem like an option. To be clear, I'm not advocating doing nothing (and I don't think anyone else is). It's obvious based on Andres' results that we want to use atomics on platforms where they're well-supported. The argument is around what we're going to do for other platforms. OK, but that still seems like the issue on the other thread, not what's being discussed here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 01/07/14 11:04, Andres Freund wrote: Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. Being involved in QEMU SPARC development, I can tell you that patches are still actively being received for SPARCv8. The last set of CPU patches were related to fixing bugs in the LEON3, a 32-bit SPARC CPU which is available in hardened versions certified for extreme environments such as military and space. I'd hate to find out that they switched to another database because they couldn't upgrade PostgreSQL on the ISS ;) Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. ATB, Mark. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote: On 01/07/14 11:04, Andres Freund wrote: Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. Being involved in QEMU SPARC development, I can tell you that patches are still actively being received for SPARCv8. The last set of CPU patches were related to fixing bugs in the LEON3, a 32-bit SPARC CPU which is available in hardened versions certified for extreme environments such as military and space. I'd hate to find out that they switched to another database because they couldn't upgrade PostgreSQL on the ISS ;) LEON3 isn't really sparcv8. It's more like sparcv8 with v9 stuff cherry picked... It e.g. has CAS :) Your point stands though - we should probably backpatch the v8 version of my patch as well, since apparently LEON3 does *not* have membar. But I don't think this is a rat race we can win. We can't keep up with all the variants of sparc and even worse arm. We should add a intrinsics based version for sparc as well. That'll work just fine on any recent gcc. Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. Well. I have to admit I'm really not interested in investing that much time in something I've no stake in. If postgres developers have to put emulated machines to develop features something imo went seriously wrong. That's more effort than at least I'm willing to spend. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Andres Freund and...@2ndquadrant.com writes: On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote: Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. Well. I have to admit I'm really not interested in investing that much time in something I've no stake in. If postgres developers have to put emulated machines to develop features something imo went seriously wrong. That's more effort than at least I'm willing to spend. Perhaps more to the point, I have no faith at all that an emulator will mimic multiprocessor timing behavior to the level of detail needed to tell whether memory-barrier-related logic works. See the VAX discussion just a couple days ago. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Sat, Jun 28, 2014 at 9:41 AM, Andres Freund and...@2ndquadrant.com wrote: No, I think it's going to be *much* simpler than that. How about I take a crack at this next week and then either (a) I'll see why it's a bad idea and we can go from there or (b) you can review what I come up with and tell me why it sucks? Ok. I think that's going in the wrong direction (duplication of nontrivial knowledge), but maybe I'm taking a to 'purist' approach here. Prove me wrong :) I'm not sure what you'll think of this, but see attached. I think newer releases of Sun Studio support that GCC way of doing a barrier, but I don't know how to test for that, so at the moment that uses the fallback put it in a function approach, along with HPPA non-GCC and Univel CC. Those things are obscure enough that I don't care about making them less performance. I think AIX is actually OK as-is; if _check_lock() is a compiler barrier but _clear_lock() is not, that would be pretty odd. How are you suggesting we deal with the generic S_UNLOCK case without having a huge ifdef? Why do we build an abstraction layer (barrier.h) and then not use it? Because (1) the abstraction doesn't fit very well unless we do a lot of additional work to build acquire and release barriers for every platform we support and Meh. Something like the (untested): #if !defined(pg_release_barrier) defined(pg_read_barrier) defined(pg_write_barrier) #define pg_release_barrier() do { pg_read_barrier(); pg_write_barrier();} while (0) #elif !defined(pg_release_barrier) #define pg_release_barrier() pg_memory_barrier() #endif before the fallback definition of pg_read/write_barrier should suffice? That doesn't sound like a good idea. For example, on PPC, a read barrier is lwsync, and so is a write barrier. That would also suck on any architecture where either a read or write barrier gets implemented as a full memory barrier, though I guess it might be smart enough to optimize away most of the cost of the second of two barriers in immediate succession. I think if we want to use barrier.h as the foundation for this, we're going to need to define a new set of things in there that have acquire and release semantics, or just use full barriers for everything - which would be wasteful on, e.g., x86. And I don't particularly see the point in going to a lot of work to invent those new abstractions everywhere when we can just tweak s_lock.h in place a bit and be done with it. Making those files interdependent doesn't strike me as a win. (2) I don't have much confidence that we can depend on the spinlock fallback for barriers without completely breaking obscure platforms, and I'd rather make a more minimal set of changes. Well, it's the beginning of the cycle. And we're already depending on barriers for correctness (and it's not getting less), so I don't really see what avoiding barrier usage buys us except harder to find breakage? If we use barriers to implement any facility other than spinlocks, I have reasonable confidence that we won't break things more than they already are broken today, because I think the fallback to acquire-and-release a spinlock probably works, even though it's likely terrible for performance. I have significantly less confidence that trying to implement spinlocks in terms of barriers is going to be reliable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c index efe1b43..38dc34d 100644 --- a/src/backend/storage/lmgr/s_lock.c +++ b/src/backend/storage/lmgr/s_lock.c @@ -154,6 +154,17 @@ s_lock(volatile slock_t *lock, const char *file, int line) return delays; } +#ifdef USE_DEFAULT_S_UNLOCK +void +s_unlock(slock_t *lock) +{ +#ifdef TAS_ACTIVE_WORD + *TAS_ACTIVE_WORD(lock) = -1; +#else + *lock = 0; +#endif +} +#endif /* * Set local copy of spins_per_delay during backend startup. diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 895abe6..f1a89dc 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -55,14 +55,16 @@ * on Alpha TAS() will fail if interrupted. Therefore a retry loop must * always be used, even if you are certain the lock is free. * - * Another caution for users of these macros is that it is the caller's - * responsibility to ensure that the compiler doesn't re-order accesses - * to shared memory to precede the actual lock acquisition, or follow the - * lock release. Typically we handle this by using volatile-qualified - * pointers to refer to both the spinlock itself and the shared data - * structure being accessed within the spinlocked critical section. - * That fixes it because compilers are not allowed to re-order accesses - * to volatile objects relative to other such accesses. + * It is the responsibility of these macros to make sure that the compiler + * does not re-order accesses to shared
Re: [HACKERS] Spinlocks and compiler/memory barriers
Hi, On 2014-06-30 08:03:40 -0400, Robert Haas wrote: On Sat, Jun 28, 2014 at 9:41 AM, Andres Freund and...@2ndquadrant.com wrote: No, I think it's going to be *much* simpler than that. How about I take a crack at this next week and then either (a) I'll see why it's a bad idea and we can go from there or (b) you can review what I come up with and tell me why it sucks? Ok. I think that's going in the wrong direction (duplication of nontrivial knowledge), but maybe I'm taking a to 'purist' approach here. Prove me wrong :) I'm not sure what you'll think of this, but see attached. I think it continues in the tradition of making s_lock.h ever harder to follow. But it's still better than what we have now from a correctness perspective. I think newer releases of Sun Studio support that GCC way of doing a barrier, but I don't know how to test for that, so at the moment that uses the fallback put it in a function approach, Sun studio's support for the gcc way is new (some update to sun studio 12), so that's probably not sufficient. #include mbarrier.h and __compiler_barrier()/__machine_rel_barrier() should do the trick for spinlocks. That seems to have existed for longer. Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO, relaxed memory ordering), so it's probably fine to just use the compiler barrier. along with HPPA non-GCC and Univel CC. Those things are obscure enough that I don't care about making them less performance. Fine with me. I think AIX is actually OK as-is; if _check_lock() is a compiler barrier but _clear_lock() is not, that would be pretty odd. Agreed. How are you suggesting we deal with the generic S_UNLOCK case without having a huge ifdef? Why do we build an abstraction layer (barrier.h) and then not use it? Because (1) the abstraction doesn't fit very well unless we do a lot of additional work to build acquire and release barriers for every platform we support and Meh. Something like the (untested): #if !defined(pg_release_barrier) defined(pg_read_barrier) defined(pg_write_barrier) #define pg_release_barrier() do { pg_read_barrier(); pg_write_barrier();} while (0) #elif !defined(pg_release_barrier) #define pg_release_barrier() pg_memory_barrier() #endif before the fallback definition of pg_read/write_barrier should suffice? That doesn't sound like a good idea. For example, on PPC, a read barrier is lwsync, and so is a write barrier. That would also suck on any architecture where either a read or write barrier gets implemented as a full memory barrier, though I guess it might be smart enough to optimize away most of the cost of the second of two barriers in immediate succession. Well, that's why I suggested only doing it if we haven't got a pg_release_barrier() defined. And fallback to memory_barrier() directly if read/write barriers are implemented using it so we don't have two memory barriers in a row. I think if we want to use barrier.h as the foundation for this, we're going to need to define a new set of things in there that have acquire and release semantics, or just use full barriers for everything - which would be wasteful on, e.g., x86. And I don't particularly see the point in going to a lot of work to invent those new abstractions everywhere when we can just tweak s_lock.h in place a bit and be done with it. Making those files interdependent doesn't strike me as a win. We'll need release semantics in more places than just s_lock.h. Anything that acts like a lock without using spinlocks actually needs acquire/release semantics. The lwlock patch only gets away with it because the atomic operations implementation happen to act as acquire or full memory barriers. (2) I don't have much confidence that we can depend on the spinlock fallback for barriers without completely breaking obscure platforms, and I'd rather make a more minimal set of changes. Well, it's the beginning of the cycle. And we're already depending on barriers for correctness (and it's not getting less), so I don't really see what avoiding barrier usage buys us except harder to find breakage? If we use barriers to implement any facility other than spinlocks, I have reasonable confidence that we won't break things more than they already are broken today, because I think the fallback to acquire-and-release a spinlock probably works, even though it's likely terrible for performance. I have significantly less confidence that trying to implement spinlocks in terms of barriers is going to be reliable. So you believe we have a reliable barrier implementation - but you don't believe it's reliable enough for spinlocks? If we'd *not* use the barrier fallback for spinlock release if, and only if, it's implemented via spinlocks, I don't see why we'd be worse off? +#if !defined(S_UNLOCK) +#if defined(__INTEL_COMPILER) +#define S_UNLOCK(lock) \ + do {
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Jun 30, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: I think it continues in the tradition of making s_lock.h ever harder to follow. But it's still better than what we have now from a correctness perspective. Well, as you and I have discussed before, someday we probably need to get rid of s_lock.h or at least refactor it heavily, but let's do one thing at a time. I think we're eventually going to require every platform that wants to run PostgreSQL to have working barriers and atomics, and then we can rewrite s_lock.h into something much shorter and cleaner, but I am opposed to doing that today, because even if we don't care about people running obscure proprietary compilers on weird platforms, there are still lots of people running older GCC versions. For right now, I think the prudent thing to do is keep s_lock.h on life support. I think newer releases of Sun Studio support that GCC way of doing a barrier, but I don't know how to test for that, so at the moment that uses the fallback put it in a function approach, Sun studio's support for the gcc way is new (some update to sun studio 12), so that's probably not sufficient. #include mbarrier.h and __compiler_barrier()/__machine_rel_barrier() should do the trick for spinlocks. That seems to have existed for longer. Can you either link to relevant documentation or be more specific about what needs to be done here? Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO, relaxed memory ordering), so it's probably fine to just use the compiler barrier. If it isn't, that's a change that has nothing to do with making spinlock operations compiler barriers and everything to do with fixing a bug in the existing implementation. So you believe we have a reliable barrier implementation - but you don't believe it's reliable enough for spinlocks? If we'd *not* use the barrier fallback for spinlock release if, and only if, it's implemented via spinlocks, I don't see why we'd be worse off? I can't parse this sentence because it's not clearly to me exactly which part the not applies to, and I think we're talking past each other a bit, too. Basically, every platform we support today has a spinlock implementation that is supposed to prevent CPU reordering across the acquire and release operations but might not prevent compiler reordering. IOW, S_LOCK() should be a CPU acquire fence, and S_UNLOCK() should be a CPU release fence. Now, we want to make these operations compiler fences as well, and AIUI your proposal for that is to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy) + S_UNLOCK(dummy) + S_UNLOCK(lock). My proposal is to make NEW_S_UNLOCK(lock) = out of line function call + S_UNLOCK(lock), which I think is strictly better. There's zip to guarantee that adding S_LOCK(dummy) + S_UNLOCK(dummy) is doing anything we want in there, and it's definitely going to be worse for performance. The other thing that I don't like about your proposal has to do with the fact that the support matrix for barrier.h and s_lock.h are not identical. If s_lock.h is confusing to you today, making it further intertwined with barrier.h is not going to make things better; at least, that confuses the crap out of me. Perhaps the only good thing to be said about this mess is that, right now, the dependency is in just one direction: barrier.h depends on s_lock.h, and not the other way around. At some future point we may well want to reverse the direction of that dependency, but to do that we need barrier.h to have a working barrier implementation for every platform that s_lock.h supports. Maybe we'll accomplish that by adding to barrier.h and and maybe we'll accomplish that by subtracting from s_lock.h but the short path to getting this issue fixed is to be agnostic to that question. 1) Afaics ARM will fall back to this for older gccs - and ARM is weakly ordered. I think I'd just also use swpb to release the lock. Something like #define S_INIT_LOCK(lock) (*(lock)) = 0); #define S_UNLOCK s_unlock static __inline__ void s_unlock(volatile slock_t *lock) { register slock_t _res = 0; __asm__ __volatile__( swpb%0, %0, [%2]\n : +r(_res), +m(*lock) : r(lock) : memory); Assert(_res == 1); // lock should be held by us } 2) Afaics it's also wrong for sparc on linux (and probably the BSDs) where relaxed memory ordering is used. 3) Also looks wrong on mips which needs a full memory barrier. You're mixing apples and oranges again. If the existing definitions aren't CPU barriers, they're already broken, and we should fix that and back-patch. On the other hand, the API contract change making these into compiler barriers is a master-only change. I think for purposes of this patch we should assume that the existing code is sufficient to prevent CPU reordering and just focus on fixing
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-30 10:05:44 -0400, Robert Haas wrote: On Mon, Jun 30, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: I think it continues in the tradition of making s_lock.h ever harder to follow. But it's still better than what we have now from a correctness perspective. Well, as you and I have discussed before, someday we probably need to get rid of s_lock.h or at least refactor it heavily, but let's do one thing at a time. Well, that task gets harder by making it more complex... (will answer separately about sun studio) Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO, relaxed memory ordering), so it's probably fine to just use the compiler barrier. If it isn't, that's a change that has nothing to do with making spinlock operations compiler barriers and everything to do with fixing a bug in the existing implementation. Well, it actually has. If we start to remove volatiles from critical sections the compiler will schedule stores closer to the spinlock release and reads more freely - making externally visible ordering violations more likely. Especially on itanic. So, I agree that we need to fix unlocks that aren't sufficiently strong memory barriers, but it does get more urgent by not relying on volatile inside criticial sections anymore. Basically, every platform we support today has a spinlock implementation that is supposed to prevent CPU reordering across the acquire and release operations but might not prevent compiler reordering. IOW, S_LOCK() should be a CPU acquire fence, and S_UNLOCK() should be a CPU release fence. Well, I think s_lock.h comes woefully short of that goal on several platforms. Scary. Now, we want to make these operations compiler fences as well, and AIUI your proposal for that is to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy) + S_UNLOCK(dummy) + S_UNLOCK(lock). My proposal was to use barrier.h provided barriers as long as it provides a native implementation. If neither a compiler nor a memory barrier is provided my proposal was to fall back to the memory barrier emulation we already have, but move it out of line (to avoid reordering issues). The reason for doing so is that the release *has* to be a release barrier. To avoid issues with recursion the S_UNLOCK() used in the out of line memory barrier implementation used a modified S_UNLOCK that doesn't include a barrier. My proposal is to make NEW_S_UNLOCK(lock) = out of line function call + S_UNLOCK(lock), which I think is strictly better. There's zip to guarantee that adding S_LOCK(dummy) + S_UNLOCK(dummy) is doing anything we want in there, and it's definitely going to be worse for performance. Uh? At the very least doing a S_LOCK() guarantees that we're doing some sort of memory barrier, which your's doesn't at all. That'd actually fix the majority of platforms with borked release semantics because pretty much all tas() implementations are full barriers. The other thing that I don't like about your proposal has to do with the fact that the support matrix for barrier.h and s_lock.h are not identical. If s_lock.h is confusing to you today, making it further intertwined with barrier.h is not going to make things better; at least, that confuses the crap out of me. Uh. It actually *removed* the confusing edge of the dependency. It's rather confusing that memory barriers rely spinlocks and not the other way round. I think we actually should do that unconditionally, independent of any other changes. The only reasons barrier.h includes s_lock.h is that dummy_spinlock is declared and that the memory barrier is declared inline. 1) Afaics ARM will fall back to this for older gccs - and ARM is weakly ordered. I think I'd just also use swpb to release the lock. Something like #define S_INIT_LOCK(lock) (*(lock)) = 0); #define S_UNLOCK s_unlock static __inline__ void s_unlock(volatile slock_t *lock) { register slock_t _res = 0; __asm__ __volatile__( swpb%0, %0, [%2]\n : +r(_res), +m(*lock) : r(lock) : memory); Assert(_res == 1); // lock should be held by us } 2) Afaics it's also wrong for sparc on linux (and probably the BSDs) where relaxed memory ordering is used. 3) Also looks wrong on mips which needs a full memory barrier. You're mixing apples and oranges again. No, I'm not. If the existing definitions aren't CPU barriers, they're already broken, and we should fix that and back-patch. Yea, and that gets harder if we do it after master has changed incompatibly. And, as explained above, we need to fix S_UNLOCK() to be a memory barrier before we can remove volatiles - which is the goal of your patch, no? On the other hand, the API contract change making these into compiler barriers is a master-only change. I think for purposes of this patch we should assume
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund and...@2ndquadrant.com wrote: Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO, relaxed memory ordering), so it's probably fine to just use the compiler barrier. If it isn't, that's a change that has nothing to do with making spinlock operations compiler barriers and everything to do with fixing a bug in the existing implementation. Well, it actually has. If we start to remove volatiles from critical sections the compiler will schedule stores closer to the spinlock release and reads more freely - making externally visible ordering violations more likely. Especially on itanic. Granted. Now, we want to make these operations compiler fences as well, and AIUI your proposal for that is to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy) + S_UNLOCK(dummy) + S_UNLOCK(lock). My proposal was to use barrier.h provided barriers as long as it provides a native implementation. If neither a compiler nor a memory barrier is provided my proposal was to fall back to the memory barrier emulation we already have, but move it out of line (to avoid reordering issues). The reason for doing so is that the release *has* to be a release barrier. What do you mean by the memory barrier emulation we already have? The only memory barrier emulation we have, to my knowledge, is a spinlock acquire-release cycle. But trying to use a spinlock acquire-release to shore up problems with the spinlock release implementation makes my head explode. On the other hand, the API contract change making these into compiler barriers is a master-only change. I think for purposes of this patch we should assume that the existing code is sufficient to prevent CPU reordering and just focus on fixing compiler ordering problems. Whatever needs to change on the CPU-reordering end of things is a separate commit. I'm not arguing against that it should be a separate commit. Just that the proper memory barrier bit has to come first. I feel like you're speaking somewhat imprecisely in an area where very precise speech is needed to avoid confusion. If you're saying that you think we should fix the S_UNLOCK() implementations that fail to prevent CPU-ordering before we change all the S_UNLOCK() implementations to prevent compiler-reordering, then that is fine with me; otherwise, I don't understand what you're getting at here. Do you want to propose a patch that does *only* that first part before we go further here? Do you want me to try to put together such a patch based on the details mentioned on this thread? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-30 11:38:31 -0400, Robert Haas wrote: On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund and...@2ndquadrant.com wrote: Now, we want to make these operations compiler fences as well, and AIUI your proposal for that is to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy) + S_UNLOCK(dummy) + S_UNLOCK(lock). My proposal was to use barrier.h provided barriers as long as it provides a native implementation. If neither a compiler nor a memory barrier is provided my proposal was to fall back to the memory barrier emulation we already have, but move it out of line (to avoid reordering issues). The reason for doing so is that the release *has* to be a release barrier. What do you mean by the memory barrier emulation we already have? The only memory barrier emulation we have, to my knowledge, is a spinlock acquire-release cycle. Yes. Unrelated, but why haven't we defined it as if (TAS(dummy)) S_UNLOCK(dummy);? That's just about as strong and less of a performance drain? Hm, I guess platforms that do an unlocked test first would be problematic :/ But trying to use a spinlock acquire-release to shore up problems with the spinlock release implementation makes my head explode. Well, it actually makes some sense. Nearly any TAS() implementation is going to have some memory barrier semantics - so using a TAS() as fallback makes sense. That's why we're relying on it for use in memory barrier emulation after all. Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a compiler barrier - which really isn't guaranteed by the current contract of s_lock.h. Although the tas() implementations look safe. On the other hand, the API contract change making these into compiler barriers is a master-only change. I think for purposes of this patch we should assume that the existing code is sufficient to prevent CPU reordering and just focus on fixing compiler ordering problems. Whatever needs to change on the CPU-reordering end of things is a separate commit. I'm not arguing against that it should be a separate commit. Just that the proper memory barrier bit has to come first. I feel like you're speaking somewhat imprecisely in an area where very precise speech is needed to avoid confusion. If you're saying that you think we should fix the S_UNLOCK() implementations that fail to prevent CPU-ordering before we change all the S_UNLOCK() implementations to prevent compiler-reordering, then that is fine with me; Yes, that's what I think is needed. Do you want to propose a patch that does *only* that first part before we go further here? Do you want me to try to put together such a patch based on the details mentioned on this thread? I'm fine with either - we're both going to flying pretty blind :/. I think the way S_UNLOCK's release memory barrier semantics are fixed might influence the way the compiler barrier issue can be solved. Fixing the release semantics will involve going through all tas() implementations and see whether the default release semantics are ok. The ones with broken semantics will need to grow their own S_UNLOCK. I am wondering if that commit shouldn't just remove the default S_UNLOCK and move it to the tas() implementation sites. Please don't forget that I started this thread because I found the current implementation lacking because s_lock neither has sane memory release nor compiler release semantics... So it's not surprising that I want to solve both :) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Jun 30, 2014 at 12:20 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-30 11:38:31 -0400, Robert Haas wrote: On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund and...@2ndquadrant.com wrote: Now, we want to make these operations compiler fences as well, and AIUI your proposal for that is to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy) + S_UNLOCK(dummy) + S_UNLOCK(lock). My proposal was to use barrier.h provided barriers as long as it provides a native implementation. If neither a compiler nor a memory barrier is provided my proposal was to fall back to the memory barrier emulation we already have, but move it out of line (to avoid reordering issues). The reason for doing so is that the release *has* to be a release barrier. What do you mean by the memory barrier emulation we already have? The only memory barrier emulation we have, to my knowledge, is a spinlock acquire-release cycle. Yes. Unrelated, but why haven't we defined it as if (TAS(dummy)) S_UNLOCK(dummy);? That's just about as strong and less of a performance drain? Hm, I guess platforms that do an unlocked test first would be problematic :/ Yes; also, there's no requirement for S_LOCK() to be defined in terms of TAS(), and thus no requirement for TAS() to exist at all. But trying to use a spinlock acquire-release to shore up problems with the spinlock release implementation makes my head explode. Well, it actually makes some sense. Nearly any TAS() implementation is going to have some memory barrier semantics - so using a TAS() as fallback makes sense. That's why we're relying on it for use in memory barrier emulation after all. As far as I know, all of our TAS() implementations prevent CPU reordering in the acquire direction. It is not clear that they provide CPU-reordering guarantees adequate for the release direction, even when paired with whatever S_UNLOCK() implementation they're mated with. And it's quite clear that many of them aren't adequate to prevent compiler-reordering in any case. Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a compiler barrier - which really isn't guaranteed by the current contract of s_lock.h. Although the tas() implementations look safe. Ugh, you're right. That should really be moved out-of-line. On the other hand, the API contract change making these into compiler barriers is a master-only change. I think for purposes of this patch we should assume that the existing code is sufficient to prevent CPU reordering and just focus on fixing compiler ordering problems. Whatever needs to change on the CPU-reordering end of things is a separate commit. I'm not arguing against that it should be a separate commit. Just that the proper memory barrier bit has to come first. I feel like you're speaking somewhat imprecisely in an area where very precise speech is needed to avoid confusion. If you're saying that you think we should fix the S_UNLOCK() implementations that fail to prevent CPU-ordering before we change all the S_UNLOCK() implementations to prevent compiler-reordering, then that is fine with me; Yes, that's what I think is needed. OK, let's do it that way then. Do you want to propose a patch that does *only* that first part before we go further here? Do you want me to try to put together such a patch based on the details mentioned on this thread? I'm fine with either - we're both going to flying pretty blind :/. I think the way S_UNLOCK's release memory barrier semantics are fixed might influence the way the compiler barrier issue can be solved. I think I'm starting to understand the terminological confusion: to me, a memory barrier means a fence against both the compiler and the CPU. I now think you're using it to mean a fence against the CPU, as distinct from a fence against the compiler. That had me really confused in some previous replies. Fixing the release semantics will involve going through all tas() implementations and see whether the default release semantics are ok. The ones with broken semantics will need to grow their own S_UNLOCK. I am wondering if that commit shouldn't just remove the default S_UNLOCK and move it to the tas() implementation sites. So, I think that here you are talking about CPU behavior rather than compiler behavior. Next paragraph is on that basis. The implementations that don't currently have their own S_UNLOCK() are i386, x86_64, Itanium, ARM without GCC atomics, S/390 zSeries, Sparc, Linux m68k, VAX, m32r, SuperH, non-Linux m68k, Univel CC, HP/UX non-GCC, Sun Studio, and WIN32_ONLY_COMPILER. Because most of those are older platforms, I'm betting that more of them than not are OK; I think we should confine ourselves to trying to fix the ones we're sure are wrong, which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I think it'd be better to just add copies of S_UNLOCK to those three
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-30 12:46:29 -0400, Robert Haas wrote: But trying to use a spinlock acquire-release to shore up problems with the spinlock release implementation makes my head explode. Well, it actually makes some sense. Nearly any TAS() implementation is going to have some memory barrier semantics - so using a TAS() as fallback makes sense. That's why we're relying on it for use in memory barrier emulation after all. As far as I know, all of our TAS() implementations prevent CPU reordering in the acquire direction. It is not clear that they provide CPU-reordering guarantees adequate for the release direction, even when paired with whatever S_UNLOCK() implementation they're mated with. And it's quite clear that many of them aren't adequate to prevent compiler-reordering in any case. I actually don't think there currently are tas() implementations that aren't compiler barriers? Maybe UNIVEL/unixware. A bit of luck. Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a compiler barrier - which really isn't guaranteed by the current contract of s_lock.h. Although the tas() implementations look safe. Ugh, you're right. That should really be moved out-of-line. Good. Then we already loose the compile time interdependency from barrier.h to s_lock.h - although the fallback will have a runtime dependency. Do you want to propose a patch that does *only* that first part before we go further here? Do you want me to try to put together such a patch based on the details mentioned on this thread? I'm fine with either - we're both going to flying pretty blind :/. I think the way S_UNLOCK's release memory barrier semantics are fixed might influence the way the compiler barrier issue can be solved. I think I'm starting to understand the terminological confusion: to me, a memory barrier means a fence against both the compiler and the CPU. I now think you're using it to mean a fence against the CPU, as distinct from a fence against the compiler. That had me really confused in some previous replies. Oh. Lets henceforth define 'fence' as the cache coherency thingy and read/write/release/acquire/memory barrier as the combination? Fixing the release semantics will involve going through all tas() implementations and see whether the default release semantics are ok. The ones with broken semantics will need to grow their own S_UNLOCK. I am wondering if that commit shouldn't just remove the default S_UNLOCK and move it to the tas() implementation sites. So, I think that here you are talking about CPU behavior rather than compiler behavior. Next paragraph is on that basis. Yes, I am. The implementations that don't currently have their own S_UNLOCK() are i386 x86_64 TSO, thus fine. Itanium As a special case volatile stores emit release/acquire fences unless specific compiler flags are used. Thus safe. ARM without GCC atomics Borked. S/390 zSeries Strongly ordered. Sparc Sun Studio Borked. At least in some setups. Linux m68k At least linux doesn't support SMP for m68k, so cache coherency shouldn't be a problem. VAX I don't really know, but I don't care. The NetBSD people statements about VAX SMP support didn't increase my concern for VAX SMP. m32r No idea. SuperH Not offially supported (as it's not in installation.sgml), don't care :) non-Linux m68k Couldn't figure out if anybody supports SMP here. Doesn't look like it. Univel CC Don't care. HP/UX non-GCC Itanics volatile semantics should work. and WIN32_ONLY_COMPILER Should be fine due to TSO on x86 and itanic volatiles. Because most of those are older platforms, I'm betting that more of them than not are OK; I think we should confine ourselves to trying to fix the ones we're sure are wrong Sounds sane. , which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I've to revise my statement on MIPS, it actually looks safe. I seem to have missed that it has its own S_UNLOCK. Do I see it correctly that !(defined(__mips__) !defined(__sgi)) isn't supported? I think it'd be better to just add copies of S_UNLOCK to those three rather and NOT duplicate the definition in the other 12 places. Ok. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-30 19:22:59 +0200, Andres Freund wrote: On 2014-06-30 12:46:29 -0400, Robert Haas wrote: , which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I've to revise my statement on MIPS, it actually looks safe. I seem to have missed that it has its own S_UNLOCK. So, here's my first blind attempt at fixing these. Too tired to think much more about it. Sparc's configurable cache coherency drives me nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed memory order) to TSO (total store order), solaris always used TSO, but the BSDs don't. Man. Accordingly there's a somewhat ugly thingy attached. I don't think this is really ready, but it's what I can come up today. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From 691839dc8fbb78134b530096d33d8a1307231eef Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Tue, 1 Jul 2014 00:11:17 +0200 Subject: [PATCH] Fix spinlock implementations for some sparc and arm platforms. When compiling postgres on arm using a older gcc, that doesn't yet understand __sync_lock_test_and_set(), the default S_UNLOCK routine was used. Unfortunately that's not correct for ARM's memory model. The support for older gccs is using the swbp instruction (available for both arvm5 and most armv6) - unfortunately there's not a common barrier instruction for those architecture versions. So instead implement unlock using swbp again, that's possibly a bit slower but correct. Some Sparc CPUs can be run in various coherence models, ranging from RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always run CPUs in TSO mode, but linux didn't use to and the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK were only correct under TSO. Fix that by adding the necessary memory barrier instructions. On sparcv8+, which should be all relavant CPUs, these are treated as NOPs if the current consistency model doesn't require the barriers. Blindly written, so possibly not working. They didn't use to use correct acquire/ --- src/backend/port/tas/sunstudio_sparc.s | 2 + src/include/storage/s_lock.h | 74 +- 2 files changed, 75 insertions(+), 1 deletion(-) diff --git a/src/backend/port/tas/sunstudio_sparc.s b/src/backend/port/tas/sunstudio_sparc.s index 486b7be..dcf54e2 100644 --- a/src/backend/port/tas/sunstudio_sparc.s +++ b/src/backend/port/tas/sunstudio_sparc.s @@ -37,6 +37,8 @@ pg_atomic_cas: ! ! http://cvs.opensolaris.org/source/xref/on/usr/src/lib/libc/sparc/threads/sparc.il ! + ! NB: We're assuming we're running on a TSO system here - solaris + ! userland luckily always has done so. #if defined(__sparcv9) || defined(__sparcv8plus) cas [%o0],%o2,%o1 diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 895abe6..9aa9e8e 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -341,6 +341,30 @@ tas(volatile slock_t *lock) return (int) _res; } +/* + * Implement unlocking using swpb as well, that guarantees release + * semantics. Only some armv5 models have the data synchronization barrier + * instructions - but those need it. Luckily swpb always works. + */ +#define S_UNLOCK(lock) s_unlock(lock) + +static __inline__ void +s_unlock(volatile slock_t *lock) +{ + register slock_t _res = 0; + + __asm__ __volatile__( + swpb %0, %0, [%2] \n +: +r(_res), +m(*lock) +: r(lock) +: memory); + + /* lock should have been held */ + Assert(res == 1); +} + +#define S_INIT_LOCK(lock) (*(lock) == 0) + #endif /* HAVE_GCC_INT_ATOMICS */ #endif /* __arm__ */ @@ -393,6 +417,12 @@ tas(volatile slock_t *lock) #if defined(__sparc__) /* Sparc */ +/* + * Solaris has always run sparc processors in TSO (total store) mode, but + * linux didn't use to and the *BSDs still don't. So, be careful about + * acquire/release semantics. The CPU will treat superflous membars as NOPs, + * so it's just code space. + */ #define HAS_TEST_AND_SET typedef unsigned char slock_t; @@ -414,9 +444,51 @@ tas(volatile slock_t *lock) : =r(_res), +m(*lock) : r(lock) : memory); +#if defined(__sparcv7) + /* + * No stbar or membar available, luckily no actually produced hardware + * requires a barrier + */ +#elif defined(__sparcv8) + /* stbar is available (and required for both PSO, RMO), membar isn't */ + __asm__ __volatile__ (stbar \n:::memory); +#else + /* + * #StoreLoad (RMO) | #StoreStore (PSO, RMO)are the approppriate release + * barrier for sparcv8 upwards. + */ + __asm__ __volatile__ (membar #StoreLoad | #StoreStore \n:::memory); +#endif return (int) _res; } +#if defined(__sparcv7) +/* + * No stbar or membar available, luckily no actually produced hardware + * requires a barrier + */ +#define S_UNLOCK(lock) (*((volatile slock_t *) (lock)) = 0) +#elif __sparcv8 +/* stbar is available (and required
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Mon, Jun 30, 2014 at 6:28 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-30 19:22:59 +0200, Andres Freund wrote: On 2014-06-30 12:46:29 -0400, Robert Haas wrote: , which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I've to revise my statement on MIPS, it actually looks safe. I seem to have missed that it has its own S_UNLOCK. So, here's my first blind attempt at fixing these. Too tired to think much more about it. Sparc's configurable cache coherency drives me nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed memory order) to TSO (total store order), solaris always used TSO, but the BSDs don't. Man. Accordingly there's a somewhat ugly thingy attached. I don't think this is really ready, but it's what I can come up today. You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. I'm quite in favor of doing what we can to support obscure architectures, but I think this might be beyond what's reasonable. First of all, committing untested assembler code to our tree seems like an iffy idea at best. Secondly, even if someone is running on ARMv5, they'll be fine as long as they're running a sufficiently new GCC, so we're not really giving up much by dumping the hand-rolled code for that platform. The relevant GCC versions are several years old at this point, and if we rely on them to get it right, that's really probably a better bet than trying to do this blind. Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-27 22:34:19 -0400, Robert Haas wrote: On Fri, Jun 27, 2014 at 2:04 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-27 13:04:02 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h Note that that definition of a write barrier is *not* sufficient for the release of a lock... As I said elsewhere I think all the barrier definitions, except maybe alpha, luckily seem to be strong enough for that anyway. Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Well, unless we're prepared to dump support for an awful lot of platfomrs, trying to support acquire and release barriers on every platform we support is a doomed effort. Hm? Just declare them to be as heavy as we need them? Already several platforms fall back to more heavyweight operations than necessary? The definitions of the barriers implemented by barrier.h are the same as the ones that Linux has (minus read-barrier-depends) Linux has smb_load_acquire()/smp_store_release() for locks on all platforms. If we were going to use any of those in s_lock.h, it'd have to be pg_memory_barrier(), but I don't think making s_lock.h dependent on barrier.h is the way to go. I think we should just adjust s_lock.h in a minimal way, using inline assembler or tweaking the existing assembler or whatever. Isn't that just going to be repeating the contents of barrier.h pretty much again? How are you suggesting we deal with the generic S_UNLOCK case without having a huge ifdef? Why do we build an abstraction layer (barrier.h) and then not use it? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Sat, Jun 28, 2014 at 4:31 AM, Andres Freund and...@2ndquadrant.com wrote: Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Well, unless we're prepared to dump support for an awful lot of platfomrs, trying to support acquire and release barriers on every platform we support is a doomed effort. Hm? Just declare them to be as heavy as we need them? Already several platforms fall back to more heavyweight operations than necessary? Can't we keep this simple for starters? Strength-reducing the existing operations is yet a third problem, on top of the already-existing problems of (1) making spinlock operations compiler barriers and (2) fixing any buggy implementations. I'm explicitly trying to avoid defining this in a way that means we need a Gigantic Patch that Changes Everything. The definitions of the barriers implemented by barrier.h are the same as the ones that Linux has (minus read-barrier-depends) Linux has smb_load_acquire()/smp_store_release() for locks on all platforms. You mean smp. If we were going to use any of those in s_lock.h, it'd have to be pg_memory_barrier(), but I don't think making s_lock.h dependent on barrier.h is the way to go. I think we should just adjust s_lock.h in a minimal way, using inline assembler or tweaking the existing assembler or whatever. Isn't that just going to be repeating the contents of barrier.h pretty much again? No, I think it's going to be *much* simpler than that. How about I take a crack at this next week and then either (a) I'll see why it's a bad idea and we can go from there or (b) you can review what I come up with and tell me why it sucks? How are you suggesting we deal with the generic S_UNLOCK case without having a huge ifdef? Why do we build an abstraction layer (barrier.h) and then not use it? Because (1) the abstraction doesn't fit very well unless we do a lot of additional work to build acquire and release barriers for every platform we support and (2) I don't have much confidence that we can depend on the spinlock fallback for barriers without completely breaking obscure platforms, and I'd rather make a more minimal set of changes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-28 09:25:32 -0400, Robert Haas wrote: On Sat, Jun 28, 2014 at 4:31 AM, Andres Freund and...@2ndquadrant.com wrote: Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Well, unless we're prepared to dump support for an awful lot of platfomrs, trying to support acquire and release barriers on every platform we support is a doomed effort. Hm? Just declare them to be as heavy as we need them? Already several platforms fall back to more heavyweight operations than necessary? Can't we keep this simple for starters? Strength-reducing the existing operations is yet a third problem, on top of the already-existing problems of (1) making spinlock operations compiler barriers and (2) fixing any buggy implementations. I'm explicitly trying to avoid defining this in a way that means we need a Gigantic Patch that Changes Everything. I actually mean that we can just define release barriers to be full memory barriers for platforms where we don't want to think about it. Not that we should weaken barriers. No, I think it's going to be *much* simpler than that. How about I take a crack at this next week and then either (a) I'll see why it's a bad idea and we can go from there or (b) you can review what I come up with and tell me why it sucks? Ok. I think that's going in the wrong direction (duplication of nontrivial knowledge), but maybe I'm taking a to 'purist' approach here. Prove me wrong :) You'll pick up the clobber changes from my patch? How are you suggesting we deal with the generic S_UNLOCK case without having a huge ifdef? Why do we build an abstraction layer (barrier.h) and then not use it? Because (1) the abstraction doesn't fit very well unless we do a lot of additional work to build acquire and release barriers for every platform we support and Meh. Something like the (untested): #if !defined(pg_release_barrier) defined(pg_read_barrier) defined(pg_write_barrier) #define pg_release_barrier() do { pg_read_barrier(); pg_write_barrier();} while (0) #elif !defined(pg_release_barrier) #define pg_release_barrier() pg_memory_barrier() #endif before the fallback definition of pg_read/write_barrier should suffice? (2) I don't have much confidence that we can depend on the spinlock fallback for barriers without completely breaking obscure platforms, and I'd rather make a more minimal set of changes. Well, it's the beginning of the cycle. And we're already depending on barriers for correctness (and it's not getting less), so I don't really see what avoiding barrier usage buys us except harder to find breakage? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-28 15:41:46 +0200, Andres Freund wrote: On 2014-06-28 09:25:32 -0400, Robert Haas wrote: No, I think it's going to be *much* simpler than that. How about I take a crack at this next week and then either (a) I'll see why it's a bad idea and we can go from there or (b) you can review what I come up with and tell me why it sucks? Ok. I think that's going in the wrong direction (duplication of nontrivial knowledge), but maybe I'm taking a to 'purist' approach here. Prove me wrong :) What I forgot: I'm also pretty sure that the more lockless stuff we introduce the more places are going to need acquire/release semantics... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Thu, Jun 26, 2014 at 5:01 PM, Andres Freund and...@2ndquadrant.com wrote: But a) isn't really avoidable because it'll otherwise generate compiler warnings and b) is done that way all over the tree. E.g. lwlock.c has several copies of (note the nonvolatility of proc): volatile LWLock *lock = l; PGPROC *proc = MyProc; ... proc-lwWaiting = true; proc-lwWaitMode = LW_WAIT_UNTIL_FREE; proc-lwWaitLink = NULL; /* waiters are added to the front of the queue */ proc-lwWaitLink = lock-head; if (lock-head == NULL) lock-tail = proc; lock-head = proc; /* Can release the mutex now */ SpinLockRelease(lock-mutex); There's nothing forcing the compiler to not move any of the proc-* assignments past the SpinLockRelease(). And indeed in my case it was actually the store to lwWaitLink that was moved across the lock. I think it's just pure luck that there's no active bug (that we know of) caused by this. I wouldn't be surprised if some dubious behaviour we've seen would be caused by similar issues. Now, we can fix this and similar cases by more gratuitous use of volatile. But for one we're never going to find all cases. For another it won't help *at all* for architectures with looser CPU level memory ordering guarantees. I think we finally need to bite the bullet and make all S_UNLOCKs compiler/write barriers. There are two separate issues here: 1. SpinLockAcquire and SpinLockRelease are not guaranteed to be compiler barriers, so all relevant memory accesses in the critical section need to be done using volatile pointers. Failing to do this is an easy mistake to make, and we've fixed numerous bugs of this type over the years (most recently, to my knowledge, in 4bc15a8bfbc7856bc3426dc9ab99567eebbb64d3). Forcing SpinLockAcquire() and SpinLockRelease() to serve as compiler barriers would let us dispense with a whole lot of volatile calls and make writing future code correctly a lot easier. 2. Some of our implementations of SpinLockAcquire and/or SpinLockRelease, but in particular SpinLockRelease, may not actually provide the memory-ordering semantics which they are required to provide. In particular, ... I'd previously, in http://www.postgresql.org/message-id/20130920151110.ga8...@awork2.anarazel.de, gone through the list of S_UNLOCKs and found several that were lacking. Most prominently the default S_UNLOCK is just #define S_UNLOCK(lock) (*((volatile slock_t *) (lock)) = 0) which allows the compiler to move non volatile access across and does nothing for CPU level cache coherency. ...this default implementation of S_UNLOCK() is pretty sketchy. Even on a platform that enforces reads in program order and writes in program order, this is still unsafe because a read within the critical section might get postponed until after this write. Now, x86 happens to have an additional constraint, which is that it can reorder loads before stores but not stores before loads; so that coding happens to provide release semantics. But that need not be true on every architecture, and the trend seems to be toward weaker memory ordering. As you pointed out to me on chat, the non-intrinsics based ARM implementation brokenly relies on the default S_UNLOCK(), which clearly isn't adequate. Now, in terms of solving these problems: I tend to think that we should try to think about these two problems somewhat separately. As to #1, in the back-branches, I think further volatile-izing the LWLock* routines is probably the only realistic solution. In master, I fully support moving the goalposts such that we require SpinLockAcquire() and SpinLockRelease() are compiler barriers. Once we do this, I think we should go back and rip out all the places where we've used volatile-ized pointers to provide compiler ordering. That way, if we haven't actually managed to provide compiler ordering everywhere, it's more likely that something will fall over and warn us about the problem; plus, that avoids keeping around a coding pattern which isn't actually the one we want people to copy. However, I think your proposed S_UNLOCKED_UNLOCK() hack is plain ugly, probably cripplingly slow, and there's no guarantee that's even correct; see for example the comments about Itanium's tas implementation possibly being only an acquire barrier (blech). On gcc and icc, which account for lines 99 through 700 of spin.h, it should be simple and mechanical to use compiler intrinsics to make sure that every S_UNLOCK implementation includes a compiler barrier. However, lines 710 through 895 support non-gcc, non-icc compilers, and some of those we may not know how to implement a compiler barrier - in particular, Univel CC, the Tru64 Alpha compiler, HPPA, AIX, or Sun's compilers. Except for Sun, we have no buildfarm support for those platforms, so we could consider just dropping support entirely, but I'd be inclined to do
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-27 13:00:34 -0400, Robert Haas wrote: There are two separate issues here: 1. SpinLockAcquire and SpinLockRelease are not guaranteed to be compiler barriers, so all relevant memory accesses in the critical section need to be done using volatile pointers. Failing to do this is an easy mistake to make, and we've fixed numerous bugs of this type over the years (most recently, to my knowledge, in 4bc15a8bfbc7856bc3426dc9ab99567eebbb64d3). Forcing SpinLockAcquire() and SpinLockRelease() to serve as compiler barriers would let us dispense with a whole lot of volatile calls and make writing future code correctly a lot easier. And actually faster in some cases. I'm just playing around with some bigger POWER machine and removing the volatiles in GetSnapshotData() + some access forcing macro for accessing xmin is worth 1% or so. 2. Some of our implementations of SpinLockAcquire and/or SpinLockRelease, but in particular SpinLockRelease, may not actually provide the memory-ordering semantics which they are required to provide. I tend to think that we should try to think about these two problems somewhat separately. As to #1, in the back-branches, I think further volatile-izing the LWLock* routines is probably the only realistic solution. We could also decide not to do anything :/. In master, I fully support moving the goalposts such that we require SpinLockAcquire() and SpinLockRelease() are compiler barriers. Once we do this, I think we should go back and rip out all the places where we've used volatile-ized pointers to provide compiler ordering. That way, if we haven't actually managed to provide compiler ordering everywhere, it's more likely that something will fall over and warn us about the problem; plus, that avoids keeping around a coding pattern which isn't actually the one we want people to copy. +1 However, I think your proposed S_UNLOCKED_UNLOCK() hack is plain ugly, probably cripplingly slow, and there's no guarantee that's even correct; see for example the comments about Itanium's tas implementation possibly being only an acquire barrier (blech). Heh. I don't think it's worse than the current fallback barrier implementation. The S_UNLOCKED_UNLOCK() thing was just to avoid recursion when using a barrier in the spinlock used to implement barriers... On gcc and icc, which account for lines 99 through 700 of spin.h, it should be simple and mechanical to use compiler intrinsics to make sure that every S_UNLOCK implementation includes a compiler barrier. However, lines 710 through 895 support non-gcc, non-icc compilers, and some of those we may not know how to implement a compiler barrier - in particular, Univel CC, the Tru64 Alpha compiler, HPPA, AIX, or Sun's compilers. Except for Sun, we have no buildfarm support for those platforms, so we could consider just dropping support entirely Both sun's and AIX's compilers can relatively easily be handled: * Solaris has atomic.h with membar_enter() et al. Apparently since at least solaris 9. * XLC has __fence and __isync intrinsics. There's been recent talk about AIX, including about AIX animal, so I'd be hesitant to drop it. It's also still developed. I'm obviously in favor of dropping Alpha. And I'm, unsurprisingly, all for removing unixware support (which is what univel CC seems to be used for after you dropped the univel port proper). I think the only person that has used postgres on hppa in the last 5 years is Tom, so I guess he'll have to speak up about it. Tom? void fake_compiler_barrier(void) { } void (*fake_compiler_barrier_hook) = fake_compiler_barrier; #define pg_compiler_barrier() ((*fake_compiler_barrier_hook)()) But we can do that as a fallback. It's what HPPA's example spinlock implementation does after all. Now, this doesn't remove the circular dependency between s_lock.h and barrier.h, because we still don't have a fallback method, other than acquiring and releasing a spinlock, of implementing a barrier that blocks both compiler reordering and CPU reordering. But it is enough to solve problem #1, and doesn't require that we drop support for anything that works now. I think we can move the fallback into a C function. Compared to the cost of a tas/unlock that shouldn't be significant. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-27 13:04:02 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h Note that that definition of a write barrier is *not* sufficient for the release of a lock... As I said elsewhere I think all the barrier definitions, except maybe alpha, luckily seem to be strong enough for that anyway. Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Fri, Jun 27, 2014 at 2:04 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-27 13:04:02 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h Note that that definition of a write barrier is *not* sufficient for the release of a lock... As I said elsewhere I think all the barrier definitions, except maybe alpha, luckily seem to be strong enough for that anyway. Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Well, unless we're prepared to dump support for an awful lot of platfomrs, trying to support acquire and release barriers on every platform we support is a doomed effort. The definitions of the barriers implemented by barrier.h are the same as the ones that Linux has (minus read-barrier-depends), which I think is probably good evidence that they are generally useful definitions. If we were going to use any of those in s_lock.h, it'd have to be pg_memory_barrier(), but I don't think making s_lock.h dependent on barrier.h is the way to go. I think we should just adjust s_lock.h in a minimal way, using inline assembler or tweaking the existing assembler or whatever. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Andres Freund and...@2ndquadrant.com writes: I think we should rework things so that we fall back to pg_write_barrier(), (*((volatile slock_t *) (lock)) = 0) instead of what we have right now. Surely it had better be a read barrier as well? And S_LOCK the same? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I think we should rework things so that we fall back to pg_write_barrier(), (*((volatile slock_t *) (lock)) = 0) instead of what we have right now. Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? Note that 'bad' reads can only happen for variables which aren't protected by the spinlock since the S_LOCK needs to have acquire semantics and no other process can modify protected variables concurrently. The important thing is that all modifications that have been done inside the spinlock are visible to other backends and that no writes are moved outside the protected are. And S_LOCK the same? It better be a read barrier, yes. I haven't checked yet, but I assume that pretty much all TAS/tas implementation already guarantee that. I think if not we'd seen problems. Well, at least on platforms that receive testing under concurrent circumstances :/ Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Hi, On 2014-06-26 23:01:10 +0200, Andres Freund wrote: I think we should rework things so that we fall back to pg_write_barrier(), (*((volatile slock_t *) (lock)) = 0) instead of what we have right now. That'd require to make barrier.h independent from s_lock.h, but I think that'd be a pretty clear improvement. One open question is what happens with the SpinlockRelease() when barriers are implemented using spinlocks and we need a barrier for the SpinlockRelease(). Too tired to think about this any further today. Here's my current state of this: * barrier.h's spinlock implementation moved to s_lock.c, loosing the s_lock.h include. That requires a S_UNLOCK definition which doesn't again use a barrier. I've coined it S_UNLOCKED_UNLOCK * s_lock.h now includes barrier.h to implement the generic S_UNLOCK safely. * gcc x86-64 had a superflous cc clobber. Likely copied from the 32bit version which does additional operations. * PPC was missing a compiler barrier * alpha was missing a cc clobber. * mips was missing a compiler barrier and a cc clobber * I have no idea how to fix pa-risc's S_UNLOCK for !gcc. The referenced spinlock paper calls a external function to avoid reordering. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c index efe1b43..5052718 100644 --- a/src/backend/storage/lmgr/s_lock.c +++ b/src/backend/storage/lmgr/s_lock.c @@ -187,6 +187,23 @@ update_spins_per_delay(int shared_spins_per_delay) return (shared_spins_per_delay * 15 + spins_per_delay) / 16; } +/* + * Memory barrier implementation based on lock/unlock to a spinlock. Check + * barrier.h for details. + */ +#ifdef PG_SIMULATE_MEMORY_BARRIER +void +pg_memory_barrier_impl(void) +{ + S_LOCK(dummy_spinlock); +#ifdef S_UNLOCKED_UNLOCK + S_UNLOCKED_UNLOCK(dummy_spinlock); +#else + S_UNLOCK(dummy_spinlock); +#endif +} +#endif /* PG_SIMULATE_MEMORY_BARRIER */ + /* * Various TAS implementations that cannot live in s_lock.h as no inline diff --git a/src/backend/storage/lmgr/spin.c b/src/backend/storage/lmgr/spin.c index 9b71744..03a4fe8 100644 --- a/src/backend/storage/lmgr/spin.c +++ b/src/backend/storage/lmgr/spin.c @@ -25,6 +25,7 @@ #include access/xlog.h #include miscadmin.h #include replication/walsender.h +#include storage/barrier.h #include storage/lwlock.h #include storage/pg_sema.h #include storage/spin.h diff --git a/src/include/storage/barrier.h b/src/include/storage/barrier.h index bc61de0..e5692ac 100644 --- a/src/include/storage/barrier.h +++ b/src/include/storage/barrier.h @@ -13,10 +13,6 @@ #ifndef BARRIER_H #define BARRIER_H -#include storage/s_lock.h - -extern slock_t dummy_spinlock; - /* * A compiler barrier need not (and preferably should not) emit any actual * machine code, but must act as an optimization fence: the compiler must not @@ -155,10 +151,14 @@ extern slock_t dummy_spinlock; * spinlock acquire-and-release would be equivalent to a full memory barrier. * For example, I'm not sure that Itanium's acq and rel add up to a full * fence. But all of our actual implementations seem OK in this regard. + * + * The actual implementation is in s_lock.c so we can include barrier.h from + * s_lock.h. Yes, this will make things even slower, but who cares. */ #if !defined(pg_memory_barrier) -#define pg_memory_barrier() \ - do { S_LOCK(dummy_spinlock); S_UNLOCK(dummy_spinlock); } while (0) +#define PG_SIMULATE_MEMORY_BARRIER +extern void pg_memory_barrier_impl(void); +#define pg_memory_barrier() pg_memory_barrier_impl() #endif /* diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index ba4dfe1..2732054 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -19,7 +19,8 @@ * Should return number of delays; see s_lock.c * * void S_UNLOCK(slock_t *lock) - * Unlock a previously acquired lock. + * Unlock a previously acquired lock. Needs to imply a compiler and + * write memory barrier. * * bool S_LOCK_FREE(slock_t *lock) * Tests if the lock is free. Returns TRUE if free, FALSE if locked. @@ -39,6 +40,7 @@ * Atomic test-and-set instruction. Attempt to acquire the lock, * but do *not* wait. Returns 0 if successful, nonzero if unable * to acquire the lock. + * Needs to imply a compiler and read memory barrier. * * int TAS_SPIN(slock_t *lock) * Like TAS(), but this version is used when waiting for a lock @@ -94,6 +96,8 @@ #ifndef S_LOCK_H #define S_LOCK_H +#include storage/barrier.h + #ifdef HAVE_SPINLOCKS /* skip spinlocks if requested */ #if defined(__GNUC__) || defined(__INTEL_COMPILER) @@ -224,7 +228,7 @@ tas(volatile slock_t *lock) xchgb %0,%1 \n : +q(_res), +m(*lock) : -: memory, cc); +: memory); return (int) _res; } @@ -478,14 +482,14 @@ tas(volatile slock_t *lock) #define S_UNLOCK(lock) \ do \
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-26 15:40:11 -0700, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. Right. What we actually want for locking is what several systems (e.g. C11, linux) coin acquire/release barriers. Not sure whether we should introduce a separate set of acquire/release macros, or promote our barriers to have stronger guarantees than the name implies. The definition as I understand it is: A acquire barrier implies that: * memory operations from after the barrier cannot appear to have happened before the barrier * but: memory operations from *before* the barrier are *not* guaranteed to be finished A finished release barrier implies: * stores from before the barrier cannot be moved past * loads from before the barrier cannot be moved past * but: reads from *after* the barrier might occur *before* the barrier. I believe that all our current barrier definitions (except maybe alpha which I haven't bothered to check thoroughly) satisfy those constraints. That's primarily because we don't have support for all that many platforms and use full memory barriers for read/write barriers in several places. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers