Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-25 Thread Robert Haas
On Wed, Sep 24, 2014 at 2:45 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-09 17:54:03 -0400, Robert Haas wrote:
 So, that's committed, then. I think we should pick something that uses
 spinlocks and is likely to fail spectacularly if we haven't got this
 totally right yet, and de-volatilize it.  And then watch to see what
 turns red in the buildfarm and/or which users start screaming.  I'm
 inclined to propose lwlock.c as a candidate, since that's very widely
 used and a place where we know there's significant contention.

 Did you consider removing the volatiles from bufmgr.c? There's lots of
 volatiles in there and most of them don't seem to have been added in a
 principled way. I'm looking at my old patch for lockless pin/unpin of
 buffers and it'd look a lot cleaner without.

I hadn't thought of it, but it sounds like a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-10 Thread Robert Haas
On Tue, Sep 9, 2014 at 6:00 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-09 17:54:03 -0400, Robert Haas wrote:
 So, that's committed, then.

 Yay, finally.

 I think we should pick something that uses
 spinlocks and is likely to fail spectacularly if we haven't got this
 totally right yet, and de-volatilize it.  And then watch to see what
 turns red in the buildfarm and/or which users start screaming.

 Good plan.

 I'm inclined to propose lwlock.c as a candidate, since that's very
 widely used and a place where we know there's significant contention.

 I suggest, additionally possibly, GetSnapshotData(). It's surely one of
 the hottest functions in postgres, and I've seen some performance
 increases from de-volatilizing it. IIRC it requires one volatile cast in
 one place to enforce that a variable is accessed only once. Not sure if
 we want to add such volatile casts or use something like linux'
 ACCESS_ONCE macros for some common types. Helps to grep for places
 worthy of inspection.

GetSnapshotData() isn't quite to the point, because it's using a
volatile pointer, but not because of anything about spinlock
manipulation.  That would, perhaps, be a good test for barriers, but
not for this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Andres Freund
On 2014-09-04 08:18:37 -0400, Robert Haas wrote:
 On Tue, Aug 5, 2014 at 11:55 AM, Robert Haas robertmh...@gmail.com wrote:
  On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   If you want to do that, it's fine with me.  What I would do is:
  
   - Back-patch the addition of the sparcv8+ stuff all the way.  If
   anyone's running anything older, let them complain...
   - Remove the special case for MIPS without gcc intrinsics only in
   master, leaving the back-branches broken.  If anyone cares, let them
   complain...
   - Nothing else.
 
  I've gone ahead and done the second of these things.
 
  Thanks.
 
  Andres, do you want to go take a stab at fixing the SPARC stuff?
 
  Will do, will probably take me till thursday to come up with the brain
  cycles.
 
  Ping?
 
 This has been pending for almost two months now and, at your request,
 my patch to make spinlocks act as compiler barriers is waiting behind
 it.  Can we please get this moving again soon, or can I commit that
 patch and you can fix this when you get around to it?

I finally pushed this. And once more I seriously got pissed at the poor
overall worldwide state of documentation and continously changing
terminology around this.

Sorry for taking this long :(

Do you have a current version of your patch to make them compiler
barriers?

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Robert Haas
On Mon, Sep 8, 2014 at 7:10 PM, Andres Freund and...@2ndquadrant.com wrote:
 This has been pending for almost two months now and, at your request,
 my patch to make spinlocks act as compiler barriers is waiting behind
 it.  Can we please get this moving again soon, or can I commit that
 patch and you can fix this when you get around to it?

 I finally pushed this. And once more I seriously got pissed at the poor
 overall worldwide state of documentation and continously changing
 terminology around this.

There does seem to be a deficit in that area.

 Sorry for taking this long :(

 Do you have a current version of your patch to make them compiler
 barriers?

I had forgotten that it needed an update.  Thanks for the reminder.  Here's v2.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index efe1b43..38dc34d 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -154,6 +154,17 @@ s_lock(volatile slock_t *lock, const char *file, int line)
 	return delays;
 }
 
+#ifdef USE_DEFAULT_S_UNLOCK
+void
+s_unlock(slock_t *lock)
+{
+#ifdef TAS_ACTIVE_WORD
+	*TAS_ACTIVE_WORD(lock) = -1;
+#else
+	*lock = 0;
+#endif
+}
+#endif
 
 /*
  * Set local copy of spins_per_delay during backend startup.
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 06dc963..8e0c4c3 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -55,14 +55,16 @@
  *	on Alpha TAS() will fail if interrupted.  Therefore a retry loop must
  *	always be used, even if you are certain the lock is free.
  *
- *	Another caution for users of these macros is that it is the caller's
- *	responsibility to ensure that the compiler doesn't re-order accesses
- *	to shared memory to precede the actual lock acquisition, or follow the
- *	lock release.  Typically we handle this by using volatile-qualified
- *	pointers to refer to both the spinlock itself and the shared data
- *	structure being accessed within the spinlocked critical section.
- *	That fixes it because compilers are not allowed to re-order accesses
- *	to volatile objects relative to other such accesses.
+ *	It is the responsibility of these macros to make sure that the compiler
+ *	does not re-order accesses to shared memory to precede the actual lock
+ *	acquisition, or follow the lock release.  Prior to PostgreSQL 9.5, this
+ *	was the caller's responsibility, which meant that callers had to use
+ *	volatile-qualified pointers to refer to both the spinlock itself and the
+ *	shared data being accessed within the spinlocked critical section.  This
+ *	was notationally awkward, easy to forget (and thus error-prone), and
+ *	prevented some useful compiler optimizations.  For these reasons, we
+ *	now require that the macros themselves prevent compiler re-ordering,
+ *	so that the caller doesn't need to take special precautions.
  *
  *	On platforms with weak memory ordering, the TAS(), TAS_SPIN(), and
  *	S_UNLOCK() macros must further include hardware-level memory fence
@@ -484,14 +486,14 @@ tas(volatile slock_t *lock)
 #define S_UNLOCK(lock)	\
 do \
 { \
-	__asm__ __volatile__ (	lwsync \n); \
+	__asm__ __volatile__ (	lwsync \n ::: memory); \
 	*((volatile slock_t *) (lock)) = 0; \
 } while (0)
 #else
 #define S_UNLOCK(lock)	\
 do \
 { \
-	__asm__ __volatile__ (	sync \n); \
+	__asm__ __volatile__ (	sync \n ::: memory); \
 	*((volatile slock_t *) (lock)) = 0; \
 } while (0)
 #endif /* USE_PPC_LWSYNC */
@@ -599,7 +601,9 @@ do \
 		   .set noreorder  \n \
 		   .set nomacro\n \
 		   sync\n \
-		   .set pop  ); \
+		   .set pop  
+:
+:		memory);
 	*((volatile slock_t *) (lock)) = 0; \
 } while (0)
 
@@ -657,6 +661,23 @@ tas(volatile slock_t *lock)
 typedef unsigned char slock_t;
 #endif
 
+/*
+ * Note that this implementation is unsafe for any platform that can speculate
+ * a memory access (either load or store) after a following store.  That
+ * happens not to be possible x86 and most legacy architectures (some are
+ * single-processor!), but many modern systems have weaker memory ordering.
+ * Those that do must define their own version S_UNLOCK() rather than relying
+ * on this one.
+ */
+#if !defined(S_UNLOCK)
+#if defined(__INTEL_COMPILER)
+#define S_UNLOCK(lock)	\
+	do { __memory_barrier(); *(lock) = 0; } while (0)
+#else
+#define S_UNLOCK(lock)	\
+	do { __asm__ __volatile__( : : : memory);  *(lock) = 0; } while (0)
+#endif
+#endif
 
 #endif	/* defined(__GNUC__) || defined(__INTEL_COMPILER) */
 
@@ -730,9 +751,13 @@ tas(volatile slock_t *lock)
 	return (lockval == 0);
 }
 
-#endif /* __GNUC__ */
+#define S_UNLOCK(lock)	\
+	do { \
+		__asm__ __volatile__( : : : memory); \
+		*TAS_ACTIVE_WORD(lock) = -1; \
+	} while (0)
 
-#define S_UNLOCK(lock)	(*TAS_ACTIVE_WORD(lock) = -1)
+#endif /* __GNUC__ */
 
 #define 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Andres Freund
On 2014-09-09 13:52:40 -0400, Robert Haas wrote:
 I had forgotten that it needed an update.  Thanks for the reminder.  Here's 
 v2.

I've attached a incremental patch fixing minor gripes. Other than that I
think you can go ahead with this once the buildfarm accepts the sparc
fixes (man, those machines are slow. spoonbill apparently takes ~5h for
one run).

I've done a read through s_lock.h and the only remaining potential issue
that I see is that I have no idea if unixware's tas() is actually a safe
compiler barrier (it is a memory barrier). And I really, really can't
make myself care.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index 38dc34d..e8d3502 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -159,6 +159,7 @@ void
 s_unlock(slock_t *lock)
 {
 #ifdef TAS_ACTIVE_WORD
+	/* HP's PA-RISC */
 	*TAS_ACTIVE_WORD(lock) = -1;
 #else
 	*lock = 0;
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 8e0c4c3..5f62a2c 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -57,8 +57,8 @@
  *
  *	It is the responsibility of these macros to make sure that the compiler
  *	does not re-order accesses to shared memory to precede the actual lock
- *	acquisition, or follow the lock release.  Prior to PostgreSQL 9.5, this
- *	was the caller's responsibility, which meant that callers had to use
+ *	acquisition, or following the lock release.  Prior to PostgreSQL 9.5,
+ *  this was the caller's responsibility, which meant that callers had to use
  *	volatile-qualified pointers to refer to both the spinlock itself and the
  *	shared data being accessed within the spinlocked critical section.  This
  *	was notationally awkward, easy to forget (and thus error-prone), and
@@ -403,7 +403,7 @@ tas(volatile slock_t *lock)
  * No stbar or membar available, luckily no actually produced hardware
  * requires a barrier.
  */
-#define S_UNLOCK(lock)		(*((volatile slock_t *) (lock)) = 0)
+/* use the default S_UNLOCK definition for gcc */
 #elif  __sparcv8
 /* stbar is available (and required for both PSO, RMO), membar isn't */
 #define S_UNLOCK(lock)	\
@@ -921,14 +921,14 @@ extern int	tas_sema(volatile slock_t *lock);
  * store) after a following store; platforms where this is possible must
  * define their own S_UNLOCK.  But CPU reordering is not the only concern:
  * if we simply defined S_UNLOCK() as an inline macro, the compiler might
- * reorder instructions from the critical section to occur after the lock
- * release.  Since the compiler probably can't know what the external
+ * reorder instructions from inside the critical section to occur after the
+ * lock release.  Since the compiler probably can't know what the external
  * function s_unlock is doing, putting the same logic there should be adequate.
  * A sufficiently-smart globally optimizing compiler could break that
  * assumption, though, and the cost of a function call for every spinlock
  * release may hurt performance significantly, so we use this implementation
  * only for platforms where we don't know of a suitable intrinsic.  For the
- * most part, those are relatively obscure platform/compiler platforms to 
+ * most part, those are relatively obscure platform/compiler platforms to
  * which the PostgreSQL project does not have access.
  */
 #define USE_DEFAULT_S_UNLOCK

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Robert Haas
On Tue, Sep 9, 2014 at 5:09 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-09 13:52:40 -0400, Robert Haas wrote:
 I had forgotten that it needed an update.  Thanks for the reminder.  Here's 
 v2.

 I've attached a incremental patch fixing minor gripes. Other than that I
 think you can go ahead with this once the buildfarm accepts the sparc
 fixes (man, those machines are slow. spoonbill apparently takes ~5h for
 one run).

 I've done a read through s_lock.h and the only remaining potential issue
 that I see is that I have no idea if unixware's tas() is actually a safe
 compiler barrier (it is a memory barrier). And I really, really can't
 make myself care.

  *It is the responsibility of these macros to make sure that the compiler
  *does not re-order accesses to shared memory to precede the actual lock
- *acquisition, or follow the lock release.  Prior to PostgreSQL 9.5, this
- *was the caller's responsibility, which meant that callers had to use
+ *acquisition, or following the lock release.  Prior to PostgreSQL 9.5,
+ *  this was the caller's responsibility, which meant that callers had to use

AFAICS my version is right and your version is grammatically
incorrect. re-order to proceed or follow uses the same verb tense in
both branches of the or; re-order to proceed or following does not.

I agree that if there are problems on UnixWare, we can let anyone who
cares about UnixWare submit a patch to fix them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Andres Freund
On 2014-09-09 17:30:44 -0400, Robert Haas wrote:
 On Tue, Sep 9, 2014 at 5:09 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-09-09 13:52:40 -0400, Robert Haas wrote:
  I had forgotten that it needed an update.  Thanks for the reminder.  
  Here's v2.
 
  I've attached a incremental patch fixing minor gripes. Other than that I
  think you can go ahead with this once the buildfarm accepts the sparc
  fixes (man, those machines are slow. spoonbill apparently takes ~5h for
  one run).
 
  I've done a read through s_lock.h and the only remaining potential issue
  that I see is that I have no idea if unixware's tas() is actually a safe
  compiler barrier (it is a memory barrier). And I really, really can't
  make myself care.
 
   *It is the responsibility of these macros to make sure that the compiler
   *does not re-order accesses to shared memory to precede the actual lock
 - *acquisition, or follow the lock release.  Prior to PostgreSQL 9.5, this
 - *was the caller's responsibility, which meant that callers had to use
 + *acquisition, or following the lock release.  Prior to PostgreSQL 9.5,
 + *  this was the caller's responsibility, which meant that callers had to use
 
 AFAICS my version is right and your version is grammatically
 incorrect. re-order to proceed or follow uses the same verb tense in
 both branches of the or; re-order to proceed or following does not.

Wasn't sure about that one. It read oddly to me, but then I'm not a
native speaker. And won't read the sentence often ;)

 I agree that if there are problems on UnixWare, we can let anyone who
 cares about UnixWare submit a patch to fix them.

Ok.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Robert Haas
On Tue, Sep 9, 2014 at 5:32 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-09 17:30:44 -0400, Robert Haas wrote:
 On Tue, Sep 9, 2014 at 5:09 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-09-09 13:52:40 -0400, Robert Haas wrote:
  I had forgotten that it needed an update.  Thanks for the reminder.  
  Here's v2.
 
  I've attached a incremental patch fixing minor gripes. Other than that I
  think you can go ahead with this once the buildfarm accepts the sparc
  fixes (man, those machines are slow. spoonbill apparently takes ~5h for
  one run).
 
  I've done a read through s_lock.h and the only remaining potential issue
  that I see is that I have no idea if unixware's tas() is actually a safe
  compiler barrier (it is a memory barrier). And I really, really can't
  make myself care.

   *It is the responsibility of these macros to make sure that the 
 compiler
   *does not re-order accesses to shared memory to precede the actual lock
 - *acquisition, or follow the lock release.  Prior to PostgreSQL 9.5, 
 this
 - *was the caller's responsibility, which meant that callers had to use
 + *acquisition, or following the lock release.  Prior to PostgreSQL 9.5,
 + *  this was the caller's responsibility, which meant that callers had to 
 use

 AFAICS my version is right and your version is grammatically
 incorrect. re-order to proceed or follow uses the same verb tense in
 both branches of the or; re-order to proceed or following does not.

 Wasn't sure about that one. It read oddly to me, but then I'm not a
 native speaker. And won't read the sentence often ;)

 I agree that if there are problems on UnixWare, we can let anyone who
 cares about UnixWare submit a patch to fix them.

 Ok.

So, that's committed, then. I think we should pick something that uses
spinlocks and is likely to fail spectacularly if we haven't got this
totally right yet, and de-volatilize it.  And then watch to see what
turns red in the buildfarm and/or which users start screaming.  I'm
inclined to propose lwlock.c as a candidate, since that's very widely
used and a place where we know there's significant contention.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-09 Thread Andres Freund
On 2014-09-09 17:54:03 -0400, Robert Haas wrote:
 So, that's committed, then.

Yay, finally.

 I think we should pick something that uses
 spinlocks and is likely to fail spectacularly if we haven't got this
 totally right yet, and de-volatilize it.  And then watch to see what
 turns red in the buildfarm and/or which users start screaming.

Good plan.

 I'm inclined to propose lwlock.c as a candidate, since that's very
 widely used and a place where we know there's significant contention.

I suggest, additionally possibly, GetSnapshotData(). It's surely one of
the hottest functions in postgres, and I've seen some performance
increases from de-volatilizing it. IIRC it requires one volatile cast in
one place to enforce that a variable is accessed only once. Not sure if
we want to add such volatile casts or use something like linux'
ACCESS_ONCE macros for some common types. Helps to grep for places
worthy of inspection.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-08 Thread Andres Freund
On 2014-09-04 14:19:47 +0200, Andres Freund wrote:
 Yes. I plan to push the patch this weekend. Sorry for the delay.

I'm about to push this. Is it ok to first push it to master and only
backpatch once a couple buildfarm cycles haven't complained?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-08 Thread Robert Haas
On Mon, Sep 8, 2014 at 8:07 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-04 14:19:47 +0200, Andres Freund wrote:
 Yes. I plan to push the patch this weekend. Sorry for the delay.

 I'm about to push this. Is it ok to first push it to master and only
 backpatch once a couple buildfarm cycles haven't complained?

That will have the disadvantage that src/tools/git_changelog will show
the commits separately instead of grouping them together; so it's
probably best not to make a practice of it.  But I think it's up to
your discretion how to handle it in any particular case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-09-04 14:19:47 +0200, Andres Freund wrote:
 Yes. I plan to push the patch this weekend. Sorry for the delay.

 I'm about to push this. Is it ok to first push it to master and only
 backpatch once a couple buildfarm cycles haven't complained?

It makes for a cleaner commit history if you push concurrently into
all the branches you intend to patch.  That also gives more buildfarm
runs, which seems like a good thing for this sort of patch.

That is, assuming that we ought to backpatch at all, which to my mind
is debatable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-08 Thread Robert Haas
On Mon, Sep 8, 2014 at 10:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 It makes for a cleaner commit history if you push concurrently into
 all the branches you intend to patch.  That also gives more buildfarm
 runs, which seems like a good thing for this sort of patch.

 That is, assuming that we ought to backpatch at all, which to my mind
 is debatable.

We're not going to backpatch the main patch to make spinlock
primitives act as compiler barriers - or at least, I will object
loudly.

But what we're talking about here is a bug fix for Sparc.  And surely
we ought to back-patch that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-08 Thread Bruce Momjian
On Mon, Sep  8, 2014 at 10:08:04AM -0400, Robert Haas wrote:
 On Mon, Sep 8, 2014 at 8:07 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-09-04 14:19:47 +0200, Andres Freund wrote:
  Yes. I plan to push the patch this weekend. Sorry for the delay.
 
  I'm about to push this. Is it ok to first push it to master and only
  backpatch once a couple buildfarm cycles haven't complained?
 
 That will have the disadvantage that src/tools/git_changelog will show
 the commits separately instead of grouping them together; so it's
 probably best not to make a practice of it.  But I think it's up to
 your discretion how to handle it in any particular case.

Uh, git_changelog timespan check is 24 hours, so if the delay is less
then 24 hours, I think we are ok, e.g.:

# Might want to make this parameter user-settable.
my $timestamp_slop = 24 * 60 * 60;


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 But what we're talking about here is a bug fix for Sparc.  And surely
 we ought to back-patch that.

Ah.  OK, no objection.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-04 Thread Robert Haas
On Tue, Aug 5, 2014 at 11:55 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com wrote:
  If you want to do that, it's fine with me.  What I would do is:
 
  - Back-patch the addition of the sparcv8+ stuff all the way.  If
  anyone's running anything older, let them complain...
  - Remove the special case for MIPS without gcc intrinsics only in
  master, leaving the back-branches broken.  If anyone cares, let them
  complain...
  - Nothing else.

 I've gone ahead and done the second of these things.

 Thanks.

 Andres, do you want to go take a stab at fixing the SPARC stuff?

 Will do, will probably take me till thursday to come up with the brain
 cycles.

 Ping?

This has been pending for almost two months now and, at your request,
my patch to make spinlocks act as compiler barriers is waiting behind
it.  Can we please get this moving again soon, or can I commit that
patch and you can fix this when you get around to it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-09-04 Thread Andres Freund
On September 4, 2014 2:18:37 PM CEST, Robert Haas robertmh...@gmail.com wrote:
On Tue, Aug 5, 2014 at 11:55 AM, Robert Haas robertmh...@gmail.com
wrote:
 On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund
and...@2ndquadrant.com wrote:
  If you want to do that, it's fine with me.  What I would do is:
 
  - Back-patch the addition of the sparcv8+ stuff all the way.  If
  anyone's running anything older, let them complain...
  - Remove the special case for MIPS without gcc intrinsics only in
  master, leaving the back-branches broken.  If anyone cares, let
them
  complain...
  - Nothing else.

 I've gone ahead and done the second of these things.

 Thanks.

 Andres, do you want to go take a stab at fixing the SPARC stuff?

 Will do, will probably take me till thursday to come up with the
brain
 cycles.

 Ping?

This has been pending for almost two months now and, at your request,
my patch to make spinlocks act as compiler barriers is waiting behind
it.  Can we please get this moving again soon, or can I commit that
patch and you can fix this when you get around to it?

Yes. I plan to push the patch this weekend. Sorry for the delay.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-08-05 Thread Robert Haas
On Sun, Jul 6, 2014 at 3:12 PM, Andres Freund and...@2ndquadrant.com wrote:
  If you want to do that, it's fine with me.  What I would do is:
 
  - Back-patch the addition of the sparcv8+ stuff all the way.  If
  anyone's running anything older, let them complain...
  - Remove the special case for MIPS without gcc intrinsics only in
  master, leaving the back-branches broken.  If anyone cares, let them
  complain...
  - Nothing else.

 I've gone ahead and done the second of these things.

 Thanks.

 Andres, do you want to go take a stab at fixing the SPARC stuff?

 Will do, will probably take me till thursday to come up with the brain
 cycles.

Ping?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-06 Thread Robert Haas
On Tue, Jul 1, 2014 at 12:22 PM, Robert Haas robertmh...@gmail.com wrote:
 Since we have a Sun Studio machine in the buildfarm, we shouldn't give
 up on SPARC completely, but maybe we should only add the cases for
 sparcv8+ and above?  That at least has some chance of getting tested.

 That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is
 from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly.

 I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to
 go as well. I'd personally vote for backpatching a note to
 installation.sgml saying that it's probably not working, and not do
 anything else there. That means we also should replace the ldstub by cas
 in the the gcc inline assembly - but we have buildfarm members for that,
 so it's not too bad.

 If you want to do that, it's fine with me.  What I would do is:

 - Back-patch the addition of the sparcv8+ stuff all the way.  If
 anyone's running anything older, let them complain...
 - Remove the special case for MIPS without gcc intrinsics only in
 master, leaving the back-branches broken.  If anyone cares, let them
 complain...
 - Nothing else.

I've gone ahead and done the second of these things.

Andres, do you want to go take a stab at fixing the SPARC stuff?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-06 Thread Andres Freund
On 2014-07-06 15:02:21 -0400, Robert Haas wrote:
 On Tue, Jul 1, 2014 at 12:22 PM, Robert Haas robertmh...@gmail.com wrote:
  Since we have a Sun Studio machine in the buildfarm, we shouldn't give
  up on SPARC completely, but maybe we should only add the cases for
  sparcv8+ and above?  That at least has some chance of getting tested.
 
  That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is
  from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly.
 
  I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to
  go as well. I'd personally vote for backpatching a note to
  installation.sgml saying that it's probably not working, and not do
  anything else there. That means we also should replace the ldstub by cas
  in the the gcc inline assembly - but we have buildfarm members for that,
  so it's not too bad.
 
  If you want to do that, it's fine with me.  What I would do is:
 
  - Back-patch the addition of the sparcv8+ stuff all the way.  If
  anyone's running anything older, let them complain...
  - Remove the special case for MIPS without gcc intrinsics only in
  master, leaving the back-branches broken.  If anyone cares, let them
  complain...
  - Nothing else.
 
 I've gone ahead and done the second of these things.

Thanks.

 Andres, do you want to go take a stab at fixing the SPARC stuff?

Will do, will probably take me till thursday to come up with the brain
cycles.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-02 Thread Andres Freund
On 2014-07-01 20:21:37 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote:
  Also if you're struggling for Sun buildfarm animals, recent versions of 
  QEMU
  will quite happily install and run later versions of 32-bit Solaris over
  serial, and 2.0 even manages to give you a cgthree framebuffer for the full
  experience.
 
  Well. I have to admit I'm really not interested in investing that much
  time in something I've no stake in. If postgres developers have to put
  emulated machines to develop features something imo went seriously
  wrong. That's more effort than at least I'm willing to spend.
 
 Perhaps more to the point, I have no faith at all that an emulator will
 mimic multiprocessor timing behavior to the level of detail needed to
 tell whether memory-barrier-related logic works.  See the VAX discussion
 just a couple days ago.

Well, it would allow us to see wether fixed stuff actually compiles and
runs - that's not nothing. The biggest problem with fixing stuff like
armv5, sparc8, whatever is that it requires adding stuff to our inline
assembly. It's easy to accidentally make it not compile, but
comparatively harder to make the behaviour even worse than before.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-02 Thread Mark Cave-Ayland

On 02/07/14 08:36, Andres Freund wrote:


On 2014-07-01 20:21:37 -0400, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote:

Also if you're struggling for Sun buildfarm animals, recent versions of QEMU
will quite happily install and run later versions of 32-bit Solaris over
serial, and 2.0 even manages to give you a cgthree framebuffer for the full
experience.



Well. I have to admit I'm really not interested in investing that much
time in something I've no stake in. If postgres developers have to put
emulated machines to develop features something imo went seriously
wrong. That's more effort than at least I'm willing to spend.


Perhaps more to the point, I have no faith at all that an emulator will
mimic multiprocessor timing behavior to the level of detail needed to
tell whether memory-barrier-related logic works.  See the VAX discussion
just a couple days ago.


Well, it would allow us to see wether fixed stuff actually compiles and
runs - that's not nothing. The biggest problem with fixing stuff like
armv5, sparc8, whatever is that it requires adding stuff to our inline
assembly. It's easy to accidentally make it not compile, but
comparatively harder to make the behaviour even worse than before.


The point I wanted to make was that there are certain applications for 
which SPARCv8 is still certified for particular military/aerospace use. 
While I don't use it myself, some people are still using it enough to 
want to contribute QEMU patches.


In terms of QEMU, the main reason for mentioning it was that if the 
consensus were to keep SPARCv8 support then this could be one possible 
way to provide basic buildfarm support (although as Tom rightly points 
out, there would need to be testing to build up confidence that the 
emulator does the right thing in a multiprocessor environment).



ATB,

Mark.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-02 Thread Robert Haas
On Wed, Jul 2, 2014 at 4:06 AM, Mark Cave-Ayland
mark.cave-ayl...@ilande.co.uk wrote:
 The point I wanted to make was that there are certain applications for which
 SPARCv8 is still certified for particular military/aerospace use. While I
 don't use it myself, some people are still using it enough to want to
 contribute QEMU patches.

 In terms of QEMU, the main reason for mentioning it was that if the
 consensus were to keep SPARCv8 support then this could be one possible way
 to provide basic buildfarm support (although as Tom rightly points out,
 there would need to be testing to build up confidence that the emulator does
 the right thing in a multiprocessor environment).

I think we're getting off-track here.  The PostgreSQL project is
always willing to accept new buildfarm machines.  In the absence of
buildfarm machines, we try not to go around randomly breaking things
that were previously working.  But the current situation is that we
have good reason to suspect that a couple of very old platforms are
subtly broken.  In that situation, I think removing the
thought-to-be-broken-code is the only sensible approach.

Now, we're not crossing any sort of Rubicon here.  If buildfarm
machines show up - or, hell, if ONE person with access to the hardware
OR a simulator shows up and can compile test EVEN ONCE that a patch to
re-add support compiles and that the regression tests pass - then we
can add support back in two shakes of a lamb's tail.  In the meantime,
leaving broken code in our tree is not a service to anyone.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Andres Freund
On 2014-06-30 22:44:58 -0400, Robert Haas wrote:
 On Mon, Jun 30, 2014 at 6:28 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-06-30 19:22:59 +0200, Andres Freund wrote:
  On 2014-06-30 12:46:29 -0400, Robert Haas wrote:
  , which if I understand you correctly are ARM without GCC
   atomics, Sparc, and MIPS.
 
  I've to revise my statement on MIPS, it actually looks safe. I seem to
  have missed that it has its own S_UNLOCK.
 
  So, here's my first blind attempt at fixing these. Too tired to think
  much more about it. Sparc's configurable cache coherency drives me
  nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed
  memory order) to TSO (total store order), solaris always used TSO, but
  the BSDs don't. Man.
 
  Accordingly there's a somewhat ugly thingy attached. I don't think this
  is really ready, but it's what I can come up today.
 
 You know, looking at this, I wonder if we shouldn't just remove
 support for ARMv5 instead of making a blind stab at a fix.

Well, I argued that way for a while ;). We don't even need to really
desupport it, but just say it's not supported for gcc  4.4.

On the other hand, the swpb release thing isn't too complicated and just
the inverse of existing code.

 I'm quite
 in favor of doing what we can to support obscure architectures, but I
 think this might be beyond what's reasonable.

Yea, I felt like I was going mad doing it. Perhaps I should have
stopped.

 Since we have a Sun Studio machine in the buildfarm, we shouldn't give
 up on SPARC completely, but maybe we should only add the cases for
 sparcv8+ and above?  That at least has some chance of getting tested.

That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is
from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly.

I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to
go as well. I'd personally vote for backpatching a note to
installation.sgml saying that it's probably not working, and not do
anything else there. That means we also should replace the ldstub by cas
in the the gcc inline assembly - but we have buildfarm members for that,
so it's not too bad.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Robert Haas
On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote:
 You know, looking at this, I wonder if we shouldn't just remove
 support for ARMv5 instead of making a blind stab at a fix.

 Well, I argued that way for a while ;). We don't even need to really
 desupport it, but just say it's not supported for gcc  4.4.

Yeah, I didn't realize at the time that you were making that argument
that the existing code was thought to be broken on its own terms.
Removing probably-working code that we just can't test easily is, in
my mind, quite different from removing probably-broken code for which
we can't test a fix.  By the time PostgreSQL 9.5 is released, GCC 4.4
will be six years old, and telling people on an obscure platform that
few operating system distributions support that they can't use a
brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a
serious problem, especially since the only alternative we can offer is
compiling against completely-untested code.

 Since we have a Sun Studio machine in the buildfarm, we shouldn't give
 up on SPARC completely, but maybe we should only add the cases for
 sparcv8+ and above?  That at least has some chance of getting tested.

 That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is
 from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly.

 I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to
 go as well. I'd personally vote for backpatching a note to
 installation.sgml saying that it's probably not working, and not do
 anything else there. That means we also should replace the ldstub by cas
 in the the gcc inline assembly - but we have buildfarm members for that,
 so it's not too bad.

If you want to do that, it's fine with me.  What I would do is:

- Back-patch the addition of the sparcv8+ stuff all the way.  If
anyone's running anything older, let them complain...
- Remove the special case for MIPS without gcc intrinsics only in
master, leaving the back-branches broken.  If anyone cares, let them
complain...
- Nothing else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Merlin Moncure
On Tue, Jul 1, 2014 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote:
 You know, looking at this, I wonder if we shouldn't just remove
 support for ARMv5 instead of making a blind stab at a fix.

 Well, I argued that way for a while ;). We don't even need to really
 desupport it, but just say it's not supported for gcc  4.4.

 Yeah, I didn't realize at the time that you were making that argument
 that the existing code was thought to be broken on its own terms.
 Removing probably-working code that we just can't test easily is, in
 my mind, quite different from removing probably-broken code for which
 we can't test a fix.  By the time PostgreSQL 9.5 is released, GCC 4.4
 will be six years old, and telling people on an obscure platform that
 few operating system distributions support that they can't use a
 brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a
 serious problem, especially since the only alternative we can offer is
 compiling against completely-untested code.

A few years back I ported the postresql client libraries and a few
other pieces of software (in particular subversion) to a lot of
obscure platforms (old sparc, hpux, irix, older aix, etc etc).
Getting a modern gcc working on those platforms (with the possible
exception of aix) is in many cases difficult or impossible.   So
requiring new gcc is exactly equivalent to desupporting.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Andres Freund
On 2014-07-01 11:46:19 -0500, Merlin Moncure wrote:
 On Tue, Jul 1, 2014 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote:
  On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com 
  wrote:
  You know, looking at this, I wonder if we shouldn't just remove
  support for ARMv5 instead of making a blind stab at a fix.
 
  Well, I argued that way for a while ;). We don't even need to really
  desupport it, but just say it's not supported for gcc  4.4.
 
  Yeah, I didn't realize at the time that you were making that argument
  that the existing code was thought to be broken on its own terms.
  Removing probably-working code that we just can't test easily is, in
  my mind, quite different from removing probably-broken code for which
  we can't test a fix.  By the time PostgreSQL 9.5 is released, GCC 4.4
  will be six years old, and telling people on an obscure platform that
  few operating system distributions support that they can't use a
  brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a
  serious problem, especially since the only alternative we can offer is
  compiling against completely-untested code.
 
 A few years back I ported the postresql client libraries and a few
 other pieces of software (in particular subversion) to a lot of
 obscure platforms (old sparc, hpux, irix, older aix, etc etc).
 Getting a modern gcc working on those platforms (with the possible
 exception of aix) is in many cases difficult or impossible.

Well, we're talking about a case where the current code is
*broken*. Subtly so. Where we have no way of testing potential fixes. If
you/somebody steps up to fix  test that, cool. But I don't see it as a
service to anyone to ship broken stuff.

 So requiring new gcc is exactly equivalent to desupporting.

Yea, right. The case we're talking about here is armv5 and some armv6's
(the current fallback inline assembly doesn't work on all v6s). If
postgres were indeed in use and updated on such a platform it'd be cross
compiled with near certainty.

The other case is sparcv7 and sparcv8 (not v8+). The former has been
dropped from solaris8, the latter from solaris9.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Robert Haas
On Tue, Jul 1, 2014 at 12:46 PM, Merlin Moncure mmonc...@gmail.com wrote:
 A few years back I ported the postresql client libraries and a few
 other pieces of software (in particular subversion) to a lot of
 obscure platforms (old sparc, hpux, irix, older aix, etc etc).
 Getting a modern gcc working on those platforms (with the possible
 exception of aix) is in many cases difficult or impossible.   So
 requiring new gcc is exactly equivalent to desupporting.

I took a look around to see which operating systems support which
hardware platforms.  It's a little hard to tell who is supporting
which operating systems, because the terminology is different on
different project web sites, and it's hard to tell which things are
actually ARMv5, as opposed to something else.  But it seems that
NetBSD is has by far the largest list of supported platforms, and it
appears that they compile their current releases with either gcc 4.5
or gcc 4.8, depending on the particular port:

http://www.netbsd.org/developers/features/

According to http://www.netbsd.org/ports/ the ports that run on some
form of ARM are acorn26, acorn32, cats, epoc32, evbarm, hpcarm,
iyonix, netwinder, shark, and zaurus.  epoc32 is not listed at all on
the features link above, but the others are all listed as using gcc
4.8.x.  Even if they were using gcc 4.5.x, though, that would be good
enough, and most platforms that are now on 4.8.x were on 4.5.x before
the 4.8.x import got done:

http://mail-index.netbsd.org/tech-userlevel/2014/02/18/msg008484.html

Apparently, the last holdout preventing removal of gcc 4.1.x from
NetBSD was the vax port, which has happily now been upgraded to 4.8.x.

Debian also supports ARMv5; actually, they support ARMv4t and higher:

https://wiki.debian.org/ArmEabiPort

Debian has shipped a sufficiently-new compiler for our purposes since
6.0 (squeeze), released in 2009:

https://packages.debian.org/squeeze/gcc

So it's clearly *possible* to get newer gcc versions running on ARMv5.
I will grant you that it may not be the easiest thing to do on an
existing installation.  But I don't think having us continue to ship
either known-broken code or completely-untested code is any better.
If someone needs to get PostgreSQL 9.5 running on ARMv5 using an older
gcc, they can test Andres's already-posted patch and, if it works, we
can commit it.  What I think doesn't make sense is to ship it untested
and claim we have support when that really might or might not be true.

Also, none of this would affect the PostgreSQL client libraries that
you are talking about.  s_lock.h is only used by the backend.

The bottom line is that I love supporting obscure platforms as much as
anyone here, and several other committers are already telling me that
I love it too much.  We've got to draw the line somewhere, and I think
refusing to ship newly-written code that we have exactly zero means of
testing is a pretty good place to draw it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 The bottom line is that I love supporting obscure platforms as much as
 anyone here, and several other committers are already telling me that
 I love it too much.  We've got to draw the line somewhere, and I think
 refusing to ship newly-written code that we have exactly zero means of
 testing is a pretty good place to draw it.

I'm good with the concept of expecting anyone who complains about lack
of support for $platform to provide resources for testing/debugging PG
on that platform; and that pending arrival of such resources it's okay
to consider the platform desupported.  What concerns me here is what
level of support we could provide even with adequate resources.  That
is, are we going to be buying into a scenario where platforms with poor
atomics support take a significant performance hit compared to the
current state of affairs?  I don't think that would be good.

Another way of framing the problem is in response to Andres' upthread
comment that relying on emulated atomics makes things much easier to
reason about.  That may be true as far as correctness is concerned but
it's patently false for reasoning about performance.  This ties into
Robert's worry about how many different hardware performance profiles
we're going to have to concern ourselves with.

Basically the future that concerns me is that we perform well on x86_64
hardware (which I believe is pretty much all that any active developers
are using) and poorly on other hardware.  I don't want to end up there,
but I think the current direction of this patch pretty much guarantees
that outcome.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Robert Haas
On Tue, Jul 1, 2014 at 2:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The bottom line is that I love supporting obscure platforms as much as
 anyone here, and several other committers are already telling me that
 I love it too much.  We've got to draw the line somewhere, and I think
 refusing to ship newly-written code that we have exactly zero means of
 testing is a pretty good place to draw it.

 I'm good with the concept of expecting anyone who complains about lack
 of support for $platform to provide resources for testing/debugging PG
 on that platform; and that pending arrival of such resources it's okay
 to consider the platform desupported.  What concerns me here is what
 level of support we could provide even with adequate resources.  That
 is, are we going to be buying into a scenario where platforms with poor
 atomics support take a significant performance hit compared to the
 current state of affairs?  I don't think that would be good.

 Another way of framing the problem is in response to Andres' upthread
 comment that relying on emulated atomics makes things much easier to
 reason about.  That may be true as far as correctness is concerned but
 it's patently false for reasoning about performance.  This ties into
 Robert's worry about how many different hardware performance profiles
 we're going to have to concern ourselves with.

I have to admit that my concerns in that area were ameliorated to a
significant degree by the performance results that Andres posted
yesterday.  On top of the atomics patch (which is not this thread),
he's got a rewritten lwlock patch that uses those atomics.  His
testing showed that, even when that patch uses the atomics emulation
layer instead of real atomics, it's still better than the status quo.
Maybe that won't be true for every patch that uses atomics, but it's
certainly good as far as it goes.

But I think the issue on this sub-thread is a different one
altogether: in studying s_lock.h, Andres found clear evidence that the
ARMv5 and Sun Studio spinlock implementations are in fact *buggy*.
Even if the programmer uses volatile pointers until they turn blue in
the face, nothing in that patch prevents the CPU from changing the
apparent order of execution such that instructions within the
spinlock-protected critical section appear to execute after lock
release.  That is no good.  It means that if somebody runs PostgreSQL
on those platforms and tries to do non-trivial things with it, it's
probably going to break.

Now, if we want, we can simply ignore that problem.  No actual users
have complained about it, and it's not entirely outside the bounds of
plausibility that Andres is wrong, and those implementations are not
buggy after all.  But I don't think he's wrong.  If we assume he's
right, then we've got two choices: we can either blindly install fixes
for those platforms that we can't test, or we can drop support.

 Basically the future that concerns me is that we perform well on x86_64
 hardware (which I believe is pretty much all that any active developers
 are using) and poorly on other hardware.  I don't want to end up there,
 but I think the current direction of this patch pretty much guarantees
 that outcome.

I think this concern is more germane to the atomics patch than what
we're talking about here, because what we're talking about here, at
the moment, is either attempting to repair, or just rip out
completely, S_UNLOCK() implementations that appear to be buggy.

I have access to several PPC64 systems, use them regularly for
benchmarking and performance-testing, and can provide access to one of
those to others who may need it for testing.  For that reason, I think
at least x64/amd64 and PPC64 will get tested regularly, at least as
long as IBM keeps that hardware on loan to us.  I previously had
access to an Itanium server and used that for a lot of testing during
the 9.2 cycle, but that eventually went away.  I think it would be
awfully nice if some hardware vendor (or interested community member)
could provide us with access to some top-quality SPARC, ARM, and maybe
zSeries machines for testing, but nobody's offered that I know of.  We
can't test hardware we don't have.

Despite my concerns about keeping the list of supported atomics short,
and I do have concerns in that area, I'm not really sure that we have
much choice but to go in that direction.  We can't accept a 5x
performance hit in the name of portability, and that's literally what
we're talking about in some cases.  I definitely want to think
carefully about how we proceed in this area but doing nothing doesn't
seem like an option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Despite my concerns about keeping the list of supported atomics short,
 and I do have concerns in that area, I'm not really sure that we have
 much choice but to go in that direction.  We can't accept a 5x
 performance hit in the name of portability, and that's literally what
 we're talking about in some cases.  I definitely want to think
 carefully about how we proceed in this area but doing nothing doesn't
 seem like an option.

To be clear, I'm not advocating doing nothing (and I don't think anyone
else is).  It's obvious based on Andres' results that we want to use
atomics on platforms where they're well-supported.  The argument is
around what we're going to do for other platforms.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Robert Haas
On Tue, Jul 1, 2014 at 4:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Despite my concerns about keeping the list of supported atomics short,
 and I do have concerns in that area, I'm not really sure that we have
 much choice but to go in that direction.  We can't accept a 5x
 performance hit in the name of portability, and that's literally what
 we're talking about in some cases.  I definitely want to think
 carefully about how we proceed in this area but doing nothing doesn't
 seem like an option.

 To be clear, I'm not advocating doing nothing (and I don't think anyone
 else is).  It's obvious based on Andres' results that we want to use
 atomics on platforms where they're well-supported.  The argument is
 around what we're going to do for other platforms.

OK, but that still seems like the issue on the other thread, not
what's being discussed here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Mark Cave-Ayland

On 01/07/14 11:04, Andres Freund wrote:


Since we have a Sun Studio machine in the buildfarm, we shouldn't give
up on SPARC completely, but maybe we should only add the cases for
sparcv8+ and above?  That at least has some chance of getting tested.


That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is
from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly.

I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to
go as well. I'd personally vote for backpatching a note to
installation.sgml saying that it's probably not working, and not do
anything else there. That means we also should replace the ldstub by cas
in the the gcc inline assembly - but we have buildfarm members for that,
so it's not too bad.


Being involved in QEMU SPARC development, I can tell you that patches 
are still actively being received for SPARCv8. The last set of CPU 
patches were related to fixing bugs in the LEON3, a 32-bit SPARC CPU 
which is available in hardened versions certified for extreme 
environments such as military and space. I'd hate to find out that they 
switched to another database because they couldn't upgrade PostgreSQL on 
the ISS ;)


Also if you're struggling for Sun buildfarm animals, recent versions of 
QEMU will quite happily install and run later versions of 32-bit Solaris 
over serial, and 2.0 even manages to give you a cgthree framebuffer for 
the full experience.



ATB,

Mark.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Andres Freund
On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote:
 On 01/07/14 11:04, Andres Freund wrote:
 
 Since we have a Sun Studio machine in the buildfarm, we shouldn't give
 up on SPARC completely, but maybe we should only add the cases for
 sparcv8+ and above?  That at least has some chance of getting tested.
 
 That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is
 from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly.
 
 I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to
 go as well. I'd personally vote for backpatching a note to
 installation.sgml saying that it's probably not working, and not do
 anything else there. That means we also should replace the ldstub by cas
 in the the gcc inline assembly - but we have buildfarm members for that,
 so it's not too bad.
 
 Being involved in QEMU SPARC development, I can tell you that patches are
 still actively being received for SPARCv8. The last set of CPU patches were
 related to fixing bugs in the LEON3, a 32-bit SPARC CPU which is available
 in hardened versions certified for extreme environments such as military and
 space. I'd hate to find out that they switched to another database because
 they couldn't upgrade PostgreSQL on the ISS ;)

LEON3 isn't really sparcv8. It's more like sparcv8 with v9 stuff cherry
picked... It e.g. has CAS :)
Your point stands though - we should probably backpatch the v8 version
of my patch as well, since apparently LEON3 does *not* have membar.

But I don't think this is a rat race we can win. We can't keep up with
all the variants of sparc and even worse arm. We should add a intrinsics
based version for sparc as well. That'll work just fine on any recent
gcc.

 Also if you're struggling for Sun buildfarm animals, recent versions of QEMU
 will quite happily install and run later versions of 32-bit Solaris over
 serial, and 2.0 even manages to give you a cgthree framebuffer for the full
 experience.

Well. I have to admit I'm really not interested in investing that much
time in something I've no stake in. If postgres developers have to put
emulated machines to develop features something imo went seriously
wrong. That's more effort than at least I'm willing to spend.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote:
 Also if you're struggling for Sun buildfarm animals, recent versions of QEMU
 will quite happily install and run later versions of 32-bit Solaris over
 serial, and 2.0 even manages to give you a cgthree framebuffer for the full
 experience.

 Well. I have to admit I'm really not interested in investing that much
 time in something I've no stake in. If postgres developers have to put
 emulated machines to develop features something imo went seriously
 wrong. That's more effort than at least I'm willing to spend.

Perhaps more to the point, I have no faith at all that an emulator will
mimic multiprocessor timing behavior to the level of detail needed to
tell whether memory-barrier-related logic works.  See the VAX discussion
just a couple days ago.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Robert Haas
On Sat, Jun 28, 2014 at 9:41 AM, Andres Freund and...@2ndquadrant.com wrote:
 No, I think it's going to be *much* simpler than that.  How about I
 take a crack at this next week and then either (a) I'll see why it's a
 bad idea and we can go from there or (b) you can review what I come up
 with and tell me why it sucks?

 Ok. I think that's going in the wrong direction (duplication of
 nontrivial knowledge), but maybe I'm taking a to 'purist' approach
 here. Prove me wrong :)

I'm not sure what you'll think of this, but see attached.  I think
newer releases of Sun Studio support that GCC way of doing a barrier,
but I don't know how to test for that, so at the moment that uses the
fallback put it in a function approach, along with HPPA non-GCC and
Univel CC.  Those things are obscure enough that I don't care about
making them less performance.  I think AIX is actually OK as-is; if
_check_lock() is a compiler barrier but _clear_lock() is not, that
would be pretty odd.

  How are you suggesting we deal with the generic S_UNLOCK
  case without having a huge ifdef?
  Why do we build an abstraction layer (barrier.h) and then not use it?

 Because (1) the abstraction doesn't fit very well unless we do a lot
 of additional work to build acquire and release barriers for every
 platform we support and

 Meh. Something like the (untested):
 #if !defined(pg_release_barrier)  defined(pg_read_barrier)  
 defined(pg_write_barrier)
 #define pg_release_barrier() do { pg_read_barrier(); pg_write_barrier();} 
 while (0)
 #elif !defined(pg_release_barrier)
 #define pg_release_barrier() pg_memory_barrier()
 #endif

 before the fallback definition of pg_read/write_barrier should suffice?

That doesn't sound like a good idea.  For example, on PPC, a read
barrier is lwsync, and so is a write barrier.  That would also suck on
any architecture where either a read or write barrier gets implemented
as a full memory barrier, though I guess it might be smart enough to
optimize away most of the cost of the second of two barriers in
immediate succession.

I think if we want to use barrier.h as the foundation for this, we're
going to need to define a new set of things in there that have acquire
and release semantics, or just use full barriers for everything -
which would be wasteful on, e.g., x86.  And I don't particularly see
the point in going to a lot of work to invent those new abstractions
everywhere when we can just tweak s_lock.h in place a bit and be done
with it.  Making those files interdependent doesn't strike me as a
win.

 (2) I don't have much confidence that we can
 depend on the spinlock fallback for barriers without completely
 breaking obscure platforms, and I'd rather make a more minimal set of
 changes.

 Well, it's the beginning of the cycle. And we're already depending on
 barriers for correctness (and it's not getting less), so I don't really
 see what avoiding barrier usage buys us except harder to find breakage?

If we use barriers to implement any facility other than spinlocks, I
have reasonable confidence that we won't break things more than they
already are broken today, because I think the fallback to
acquire-and-release a spinlock probably works, even though it's likely
terrible for performance.  I have significantly less confidence that
trying to implement spinlocks in terms of barriers is going to be
reliable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index efe1b43..38dc34d 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -154,6 +154,17 @@ s_lock(volatile slock_t *lock, const char *file, int line)
 	return delays;
 }
 
+#ifdef USE_DEFAULT_S_UNLOCK
+void
+s_unlock(slock_t *lock)
+{
+#ifdef TAS_ACTIVE_WORD
+	*TAS_ACTIVE_WORD(lock) = -1;
+#else
+	*lock = 0;
+#endif
+}
+#endif
 
 /*
  * Set local copy of spins_per_delay during backend startup.
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 895abe6..f1a89dc 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -55,14 +55,16 @@
  *	on Alpha TAS() will fail if interrupted.  Therefore a retry loop must
  *	always be used, even if you are certain the lock is free.
  *
- *	Another caution for users of these macros is that it is the caller's
- *	responsibility to ensure that the compiler doesn't re-order accesses
- *	to shared memory to precede the actual lock acquisition, or follow the
- *	lock release.  Typically we handle this by using volatile-qualified
- *	pointers to refer to both the spinlock itself and the shared data
- *	structure being accessed within the spinlocked critical section.
- *	That fixes it because compilers are not allowed to re-order accesses
- *	to volatile objects relative to other such accesses.
+ *	It is the responsibility of these macros to make sure that the compiler
+ *	does not re-order accesses to shared 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Andres Freund
Hi,

On 2014-06-30 08:03:40 -0400, Robert Haas wrote:
 On Sat, Jun 28, 2014 at 9:41 AM, Andres Freund and...@2ndquadrant.com wrote:
  No, I think it's going to be *much* simpler than that.  How about I
  take a crack at this next week and then either (a) I'll see why it's a
  bad idea and we can go from there or (b) you can review what I come up
  with and tell me why it sucks?
 
  Ok. I think that's going in the wrong direction (duplication of
  nontrivial knowledge), but maybe I'm taking a to 'purist' approach
  here. Prove me wrong :)
 
 I'm not sure what you'll think of this, but see attached.

I think it continues in the tradition of making s_lock.h ever harder to
follow. But it's still better than what we have now from a correctness
perspective.

  I think
 newer releases of Sun Studio support that GCC way of doing a barrier,
 but I don't know how to test for that, so at the moment that uses the
 fallback put it in a function approach,

Sun studio's support for the gcc way is new (some update to sun studio 12), so 
that's
probably not sufficient.
#include mbarrier.h and __compiler_barrier()/__machine_rel_barrier()
should do the trick for spinlocks. That seems to have existed for
longer.

Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO,
relaxed memory ordering), so it's probably fine to just use the compiler
barrier.

 along with HPPA non-GCC and
 Univel CC.  Those things are obscure enough that I don't care about
 making them less performance.

Fine with me.

  I think AIX is actually OK as-is; if _check_lock() is a compiler
 barrier but _clear_lock() is not, that would be pretty odd.

Agreed.

   How are you suggesting we deal with the generic S_UNLOCK
   case without having a huge ifdef?
   Why do we build an abstraction layer (barrier.h) and then not use it?
 
  Because (1) the abstraction doesn't fit very well unless we do a lot
  of additional work to build acquire and release barriers for every
  platform we support and
 
  Meh. Something like the (untested):
  #if !defined(pg_release_barrier)  defined(pg_read_barrier)  
  defined(pg_write_barrier)
  #define pg_release_barrier() do { pg_read_barrier(); pg_write_barrier();} 
  while (0)
  #elif !defined(pg_release_barrier)
  #define pg_release_barrier() pg_memory_barrier()
  #endif
 
  before the fallback definition of pg_read/write_barrier should suffice?
 
 That doesn't sound like a good idea.  For example, on PPC, a read
 barrier is lwsync, and so is a write barrier.  That would also suck on
 any architecture where either a read or write barrier gets implemented
 as a full memory barrier, though I guess it might be smart enough to
 optimize away most of the cost of the second of two barriers in
 immediate succession.

Well, that's why I suggested only doing it if we haven't got a
pg_release_barrier() defined. And fallback to memory_barrier() directly
if read/write barriers are implemented using it so we don't have two
memory barriers in a row.

 I think if we want to use barrier.h as the foundation for this, we're
 going to need to define a new set of things in there that have acquire
 and release semantics, or just use full barriers for everything -
 which would be wasteful on, e.g., x86.  And I don't particularly see
 the point in going to a lot of work to invent those new abstractions
 everywhere when we can just tweak s_lock.h in place a bit and be done
 with it.  Making those files interdependent doesn't strike me as a
 win.

We'll need release semantics in more places than just s_lock.h. Anything
that acts like a lock without using spinlocks actually needs
acquire/release semantics. The lwlock patch only gets away with it
because the atomic operations implementation happen to act as acquire or
full memory barriers.

  (2) I don't have much confidence that we can
  depend on the spinlock fallback for barriers without completely
  breaking obscure platforms, and I'd rather make a more minimal set of
  changes.
 
  Well, it's the beginning of the cycle. And we're already depending on
  barriers for correctness (and it's not getting less), so I don't really
  see what avoiding barrier usage buys us except harder to find breakage?
 
 If we use barriers to implement any facility other than spinlocks, I
 have reasonable confidence that we won't break things more than they
 already are broken today, because I think the fallback to
 acquire-and-release a spinlock probably works, even though it's likely
 terrible for performance.  I have significantly less confidence that
 trying to implement spinlocks in terms of barriers is going to be
 reliable.

So you believe we have a reliable barrier implementation - but you don't
believe it's reliable enough for spinlocks? If we'd *not* use the
barrier fallback for spinlock release if, and only if, it's implemented
via spinlocks, I don't see why we'd be worse off?

 +#if !defined(S_UNLOCK)
 +#if defined(__INTEL_COMPILER)
 +#define S_UNLOCK(lock)   \
 + do { 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Robert Haas
On Mon, Jun 30, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote:
 I think it continues in the tradition of making s_lock.h ever harder to
 follow. But it's still better than what we have now from a correctness
 perspective.

Well, as you and I have discussed before, someday we probably need to
get rid of s_lock.h or at least refactor it heavily, but let's do one
thing at a time.  I think we're eventually going to require every
platform that wants to run PostgreSQL to have working barriers and
atomics, and then we can rewrite s_lock.h into something much shorter
and cleaner, but I am opposed to doing that today, because even if we
don't care about people running obscure proprietary compilers on weird
platforms, there are still lots of people running older GCC versions.
For right now, I think the prudent thing to do is keep s_lock.h on
life support.

  I think
 newer releases of Sun Studio support that GCC way of doing a barrier,
 but I don't know how to test for that, so at the moment that uses the
 fallback put it in a function approach,

 Sun studio's support for the gcc way is new (some update to sun studio 12), 
 so that's
 probably not sufficient.
 #include mbarrier.h and __compiler_barrier()/__machine_rel_barrier()
 should do the trick for spinlocks. That seems to have existed for
 longer.

Can you either link to relevant documentation or be more specific
about what needs to be done here?

 Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO,
 relaxed memory ordering), so it's probably fine to just use the compiler
 barrier.

If it isn't, that's a change that has nothing to do with making
spinlock operations compiler barriers and everything to do with fixing
a bug in the existing implementation.

 So you believe we have a reliable barrier implementation - but you don't
 believe it's reliable enough for spinlocks? If we'd *not* use the
 barrier fallback for spinlock release if, and only if, it's implemented
 via spinlocks, I don't see why we'd be worse off?

I can't parse this sentence because it's not clearly to me exactly
which part the not applies to, and I think we're talking past each
other a bit, too.  Basically, every platform we support today has a
spinlock implementation that is supposed to prevent CPU reordering
across the acquire and release operations but might not prevent
compiler reordering.  IOW, S_LOCK() should be a CPU acquire fence, and
S_UNLOCK() should be a CPU release fence.  Now, we want to make these
operations compiler fences as well, and AIUI your proposal for that is
to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy)
+ S_UNLOCK(dummy) + S_UNLOCK(lock).  My proposal is to make
NEW_S_UNLOCK(lock) = out of line function call + S_UNLOCK(lock), which
I think is strictly better.  There's zip to guarantee that adding
S_LOCK(dummy) + S_UNLOCK(dummy) is doing anything we want in there,
and it's definitely going to be worse for performance.

The other thing that I don't like about your proposal has to do with
the fact that the support matrix for barrier.h and s_lock.h are not
identical.  If s_lock.h is confusing to you today, making it further
intertwined with barrier.h is not going to make things better; at
least, that confuses the crap out of me.  Perhaps the only good thing
to be said about this mess is that, right now, the dependency is in
just one direction: barrier.h depends on s_lock.h, and not the other
way around.  At some future point we may well want to reverse the
direction of that dependency, but to do that we need barrier.h to have
a working barrier implementation for every platform that s_lock.h
supports.  Maybe we'll accomplish that by adding to barrier.h and and
maybe we'll accomplish that by subtracting from s_lock.h but the short
path to getting this issue fixed is to be agnostic to that question.

 1)
 Afaics ARM will fall back to this for older gccs - and ARM is weakly
 ordered. I think I'd just also use swpb to release the lock. Something
 like
 #define S_INIT_LOCK(lock)   (*(lock)) = 0);

 #define S_UNLOCK s_unlock
 static __inline__ void
 s_unlock(volatile slock_t *lock)
 {
 register slock_t _res = 0;

 __asm__ __volatile__(
swpb%0, %0, [%2]\n
 :   +r(_res), +m(*lock)
 :   r(lock)
 :   memory);
 Assert(_res == 1); // lock should be held by us
 }

 2)
 Afaics it's also wrong for sparc on linux (and probably the BSDs) where
 relaxed memory ordering is used.

 3)
 Also looks wrong on mips which needs a full memory barrier.

You're mixing apples and oranges again.  If the existing definitions
aren't CPU barriers, they're already broken, and we should fix that
and back-patch.  On the other hand, the API contract change making
these into compiler barriers is a master-only change.  I think for
purposes of this patch we should assume that the existing code is
sufficient to prevent CPU reordering and just focus on fixing 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Andres Freund
On 2014-06-30 10:05:44 -0400, Robert Haas wrote:
 On Mon, Jun 30, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote:
  I think it continues in the tradition of making s_lock.h ever harder to
  follow. But it's still better than what we have now from a correctness
  perspective.
 
 Well, as you and I have discussed before, someday we probably need to
 get rid of s_lock.h or at least refactor it heavily, but let's do one
 thing at a time.

Well, that task gets harder by making it more complex...

(will answer separately about sun studio)

  Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO,
  relaxed memory ordering), so it's probably fine to just use the compiler
  barrier.
 
 If it isn't, that's a change that has nothing to do with making
 spinlock operations compiler barriers and everything to do with fixing
 a bug in the existing implementation.

Well, it actually has. If we start to remove volatiles from critical
sections the compiler will schedule stores closer to the spinlock
release and reads more freely - making externally visible ordering
violations more likely. Especially on itanic.

So, I agree that we need to fix unlocks that aren't sufficiently strong
memory barriers, but it does get more urgent by not relying on volatile
inside criticial sections anymore.

  Basically, every platform we support today has a
 spinlock implementation that is supposed to prevent CPU reordering
 across the acquire and release operations but might not prevent
 compiler reordering.  IOW, S_LOCK() should be a CPU acquire fence, and
 S_UNLOCK() should be a CPU release fence.

Well, I think s_lock.h comes woefully short of that goal on several
platforms. Scary.

 Now, we want to make these
 operations compiler fences as well, and AIUI your proposal for that is
 to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy)
 + S_UNLOCK(dummy) + S_UNLOCK(lock).

My proposal was to use barrier.h provided barriers as long as it
provides a native implementation. If neither a compiler nor a memory
barrier is provided my proposal was to fall back to the memory barrier
emulation we already have, but move it out of line (to avoid reordering
issues). The reason for doing so is that the release *has* to be a
release barrier.

To avoid issues with recursion the S_UNLOCK() used in the out of line
memory barrier implementation used a modified S_UNLOCK that doesn't
include a barrier.

 My proposal is to make
 NEW_S_UNLOCK(lock) = out of line function call + S_UNLOCK(lock), which
 I think is strictly better.  There's zip to guarantee that adding
 S_LOCK(dummy) + S_UNLOCK(dummy) is doing anything we want in there,
 and it's definitely going to be worse for performance.

Uh? At the very least doing a S_LOCK() guarantees that we're doing some
sort of memory barrier, which your's doesn't at all. That'd actually fix
the majority of platforms with borked release semantics because pretty
much all tas() implementations are full barriers.

 The other thing that I don't like about your proposal has to do with
 the fact that the support matrix for barrier.h and s_lock.h are not
 identical.  If s_lock.h is confusing to you today, making it further
 intertwined with barrier.h is not going to make things better; at
 least, that confuses the crap out of me.

Uh. It actually *removed* the confusing edge of the dependency. It's
rather confusing that memory barriers rely spinlocks and not the other
way round. I think we actually should do that unconditionally,
independent of any other changes. The only reasons barrier.h includes
s_lock.h is that dummy_spinlock is declared and that the memory barrier
is declared inline.

  1)
  Afaics ARM will fall back to this for older gccs - and ARM is weakly
  ordered. I think I'd just also use swpb to release the lock. Something
  like
  #define S_INIT_LOCK(lock)   (*(lock)) = 0);
 
  #define S_UNLOCK s_unlock
  static __inline__ void
  s_unlock(volatile slock_t *lock)
  {
  register slock_t _res = 0;
 
  __asm__ __volatile__(
 swpb%0, %0, [%2]\n
  :   +r(_res), +m(*lock)
  :   r(lock)
  :   memory);
  Assert(_res == 1); // lock should be held by us
  }
 
  2)
  Afaics it's also wrong for sparc on linux (and probably the BSDs) where
  relaxed memory ordering is used.
 
  3)
  Also looks wrong on mips which needs a full memory barrier.
 
 You're mixing apples and oranges again.

No, I'm not.

 If the existing definitions
 aren't CPU barriers, they're already broken, and we should fix that
 and back-patch.

Yea, and that gets harder if we do it after master has changed
incompatibly. And, as explained above, we need to fix S_UNLOCK() to be a
memory barrier before we can remove volatiles - which is the goal of
your patch, no?

 On the other hand, the API contract change making
 these into compiler barriers is a master-only change.  I think for
 purposes of this patch we should assume 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Robert Haas
On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund and...@2ndquadrant.com wrote:
  Solaris seems to run with TSO enabled for SPARC (whereas linux uses RMO,
  relaxed memory ordering), so it's probably fine to just use the compiler
  barrier.

 If it isn't, that's a change that has nothing to do with making
 spinlock operations compiler barriers and everything to do with fixing
 a bug in the existing implementation.

 Well, it actually has. If we start to remove volatiles from critical
 sections the compiler will schedule stores closer to the spinlock
 release and reads more freely - making externally visible ordering
 violations more likely. Especially on itanic.

Granted.

 Now, we want to make these
 operations compiler fences as well, and AIUI your proposal for that is
 to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy)
 + S_UNLOCK(dummy) + S_UNLOCK(lock).

 My proposal was to use barrier.h provided barriers as long as it
 provides a native implementation. If neither a compiler nor a memory
 barrier is provided my proposal was to fall back to the memory barrier
 emulation we already have, but move it out of line (to avoid reordering
 issues). The reason for doing so is that the release *has* to be a
 release barrier.

What do you mean by the memory barrier emulation we already have?
The only memory barrier emulation we have, to my knowledge, is a
spinlock acquire-release cycle.  But trying to use a spinlock
acquire-release to shore up problems with the spinlock release
implementation makes my head explode.

 On the other hand, the API contract change making
 these into compiler barriers is a master-only change.  I think for
 purposes of this patch we should assume that the existing code is
 sufficient to prevent CPU reordering and just focus on fixing compiler
 ordering problems.  Whatever needs to change on the CPU-reordering end
 of things is a separate commit.

 I'm not arguing against that it should be a separate commit. Just that
 the proper memory barrier bit has to come first.

I feel like you're speaking somewhat imprecisely in an area where very
precise speech is needed to avoid confusion.   If you're saying that
you think we should fix the S_UNLOCK() implementations that fail to
prevent CPU-ordering before we change all the S_UNLOCK()
implementations to prevent compiler-reordering, then that is fine with
me; otherwise, I don't understand what you're getting at here.  Do you
want to propose a patch that does *only* that first part before we go
further here?  Do you want me to try to put together such a patch
based on the details mentioned on this thread?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Andres Freund
On 2014-06-30 11:38:31 -0400, Robert Haas wrote:
 On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Now, we want to make these
  operations compiler fences as well, and AIUI your proposal for that is
  to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy)
  + S_UNLOCK(dummy) + S_UNLOCK(lock).
 
  My proposal was to use barrier.h provided barriers as long as it
  provides a native implementation. If neither a compiler nor a memory
  barrier is provided my proposal was to fall back to the memory barrier
  emulation we already have, but move it out of line (to avoid reordering
  issues). The reason for doing so is that the release *has* to be a
  release barrier.
 
 What do you mean by the memory barrier emulation we already have?
 The only memory barrier emulation we have, to my knowledge, is a
 spinlock acquire-release cycle.

Yes.

Unrelated, but why haven't we defined it as if (TAS(dummy))
S_UNLOCK(dummy);? That's just about as strong and less of a performance
drain?
Hm, I guess platforms that do an unlocked test first would be
problematic :/

 But trying to use a spinlock
 acquire-release to shore up problems with the spinlock release
 implementation makes my head explode.

Well, it actually makes some sense. Nearly any TAS() implementation is
going to have some memory barrier semantics - so using a TAS() as
fallback makes sense. That's why we're relying on it for use in memory
barrier emulation after all.

Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a
compiler barrier - which really isn't guaranteed by the current contract
of s_lock.h. Although the tas() implementations look safe.

  On the other hand, the API contract change making
  these into compiler barriers is a master-only change.  I think for
  purposes of this patch we should assume that the existing code is
  sufficient to prevent CPU reordering and just focus on fixing compiler
  ordering problems.  Whatever needs to change on the CPU-reordering end
  of things is a separate commit.
 
  I'm not arguing against that it should be a separate commit. Just that
  the proper memory barrier bit has to come first.
 
 I feel like you're speaking somewhat imprecisely in an area where very
 precise speech is needed to avoid confusion.   If you're saying that
 you think we should fix the S_UNLOCK() implementations that fail to
 prevent CPU-ordering before we change all the S_UNLOCK()
 implementations to prevent compiler-reordering, then that is fine with
 me;

Yes, that's what I think is needed.

 Do you want to propose a patch that does *only* that first part before
 we go further here?  Do you want me to try to put together such a
 patch based on the details mentioned on this thread?

I'm fine with either - we're both going to flying pretty blind :/.

I think the way S_UNLOCK's release memory barrier semantics are fixed
might influence the way the compiler barrier issue can be solved. Fixing
the release semantics will involve going through all tas()
implementations and see whether the default release semantics are
ok. The ones with broken semantics will need to grow their own
S_UNLOCK. I am wondering if that commit shouldn't just remove the
default S_UNLOCK and move it to the tas() implementation sites.



Please don't forget that I started this thread because I found the
current implementation lacking because s_lock neither has sane memory
release nor compiler release semantics... So it's not surprising that I
want to solve both :)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Robert Haas
On Mon, Jun 30, 2014 at 12:20 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-30 11:38:31 -0400, Robert Haas wrote:
 On Mon, Jun 30, 2014 at 11:17 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Now, we want to make these
  operations compiler fences as well, and AIUI your proposal for that is
  to make NEW_S_UNLOCK(lock) = out of line function call + S_LOCK(dummy)
  + S_UNLOCK(dummy) + S_UNLOCK(lock).
 
  My proposal was to use barrier.h provided barriers as long as it
  provides a native implementation. If neither a compiler nor a memory
  barrier is provided my proposal was to fall back to the memory barrier
  emulation we already have, but move it out of line (to avoid reordering
  issues). The reason for doing so is that the release *has* to be a
  release barrier.

 What do you mean by the memory barrier emulation we already have?
 The only memory barrier emulation we have, to my knowledge, is a
 spinlock acquire-release cycle.

 Yes.

 Unrelated, but why haven't we defined it as if (TAS(dummy))
 S_UNLOCK(dummy);? That's just about as strong and less of a performance
 drain?
 Hm, I guess platforms that do an unlocked test first would be
 problematic :/

Yes; also, there's no requirement for S_LOCK() to be defined in terms
of TAS(), and thus no requirement for TAS() to exist at all.

 But trying to use a spinlock
 acquire-release to shore up problems with the spinlock release
 implementation makes my head explode.

 Well, it actually makes some sense. Nearly any TAS() implementation is
 going to have some memory barrier semantics - so using a TAS() as
 fallback makes sense. That's why we're relying on it for use in memory
 barrier emulation after all.

As far as I know, all of our TAS() implementations prevent CPU
reordering in the acquire direction.  It is not clear that they
provide CPU-reordering guarantees adequate for the release direction,
even when paired with whatever S_UNLOCK() implementation they're mated
with.  And it's quite clear that many of them aren't adequate to
prevent compiler-reordering in any case.

 Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a
 compiler barrier - which really isn't guaranteed by the current contract
 of s_lock.h. Although the tas() implementations look safe.

Ugh, you're right.  That should really be moved out-of-line.

  On the other hand, the API contract change making
  these into compiler barriers is a master-only change.  I think for
  purposes of this patch we should assume that the existing code is
  sufficient to prevent CPU reordering and just focus on fixing compiler
  ordering problems.  Whatever needs to change on the CPU-reordering end
  of things is a separate commit.
 
  I'm not arguing against that it should be a separate commit. Just that
  the proper memory barrier bit has to come first.

 I feel like you're speaking somewhat imprecisely in an area where very
 precise speech is needed to avoid confusion.   If you're saying that
 you think we should fix the S_UNLOCK() implementations that fail to
 prevent CPU-ordering before we change all the S_UNLOCK()
 implementations to prevent compiler-reordering, then that is fine with
 me;

 Yes, that's what I think is needed.

OK, let's do it that way then.

 Do you want to propose a patch that does *only* that first part before
 we go further here?  Do you want me to try to put together such a
 patch based on the details mentioned on this thread?

 I'm fine with either - we're both going to flying pretty blind :/.

 I think the way S_UNLOCK's release memory barrier semantics are fixed
 might influence the way the compiler barrier issue can be solved.

I think I'm starting to understand the terminological confusion: to
me, a memory barrier means a fence against both the compiler and the
CPU.  I now think you're using it to mean a fence against the CPU, as
distinct from a fence against the compiler.  That had me really
confused in some previous replies.

 Fixing
 the release semantics will involve going through all tas()
 implementations and see whether the default release semantics are
 ok. The ones with broken semantics will need to grow their own
 S_UNLOCK. I am wondering if that commit shouldn't just remove the
 default S_UNLOCK and move it to the tas() implementation sites.

So, I think that here you are talking about CPU behavior rather than
compiler behavior.  Next paragraph is on that basis.

The implementations that don't currently have their own S_UNLOCK() are
i386, x86_64, Itanium, ARM without GCC atomics, S/390 zSeries, Sparc,
Linux m68k, VAX, m32r, SuperH, non-Linux m68k, Univel CC, HP/UX
non-GCC, Sun Studio, and WIN32_ONLY_COMPILER.  Because most of those
are older platforms, I'm betting that more of them than not are OK; I
think we should confine ourselves to trying to fix the ones we're sure
are wrong, which if I understand you correctly are ARM without GCC
atomics, Sparc, and MIPS.  I think it'd be better to just add copies
of S_UNLOCK to those three 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Andres Freund
On 2014-06-30 12:46:29 -0400, Robert Haas wrote:
  But trying to use a spinlock
  acquire-release to shore up problems with the spinlock release
  implementation makes my head explode.
 
  Well, it actually makes some sense. Nearly any TAS() implementation is
  going to have some memory barrier semantics - so using a TAS() as
  fallback makes sense. That's why we're relying on it for use in memory
  barrier emulation after all.
 
 As far as I know, all of our TAS() implementations prevent CPU
 reordering in the acquire direction.  It is not clear that they
 provide CPU-reordering guarantees adequate for the release direction,
 even when paired with whatever S_UNLOCK() implementation they're mated
 with.


 And it's quite clear that many of them aren't adequate to prevent
 compiler-reordering in any case.

I actually don't think there currently are tas() implementations that
aren't compiler barriers? Maybe UNIVEL/unixware. A bit of luck.

  Also note that barrier.h assumes that a S_LOCK/S_UNLOCK acts as a
  compiler barrier - which really isn't guaranteed by the current contract
  of s_lock.h. Although the tas() implementations look safe.
 
 Ugh, you're right.  That should really be moved out-of-line.

Good. Then we already loose the compile time interdependency from
barrier.h to s_lock.h - although the fallback will have a runtime
dependency.

  Do you want to propose a patch that does *only* that first part before
  we go further here?  Do you want me to try to put together such a
  patch based on the details mentioned on this thread?
 
  I'm fine with either - we're both going to flying pretty blind :/.
 
  I think the way S_UNLOCK's release memory barrier semantics are fixed
  might influence the way the compiler barrier issue can be solved.
 
 I think I'm starting to understand the terminological confusion: to
 me, a memory barrier means a fence against both the compiler and the
 CPU.  I now think you're using it to mean a fence against the CPU, as
 distinct from a fence against the compiler.  That had me really
 confused in some previous replies.

Oh. Lets henceforth define 'fence' as the cache coherency thingy and
read/write/release/acquire/memory barrier as the combination?

  Fixing
  the release semantics will involve going through all tas()
  implementations and see whether the default release semantics are
  ok. The ones with broken semantics will need to grow their own
  S_UNLOCK. I am wondering if that commit shouldn't just remove the
  default S_UNLOCK and move it to the tas() implementation sites.
 
 So, I think that here you are talking about CPU behavior rather than
 compiler behavior.  Next paragraph is on that basis.

Yes, I am.

 The implementations that don't currently have their own S_UNLOCK() are
 i386
 x86_64

TSO, thus fine.

 Itanium

As a special case volatile stores emit release/acquire fences unless
specific compiler flags are used. Thus safe.

 ARM without GCC atomics

Borked.

 S/390 zSeries

Strongly ordered.

 Sparc
 Sun Studio

Borked. At least in some setups.

 Linux m68k

At least linux doesn't support SMP for m68k, so cache coherency
shouldn't be a problem.

 VAX

I don't really know, but I don't care. The NetBSD people statements about VAX
SMP support didn't increase my concern for VAX SMP.

 m32r

No idea.

 SuperH

Not offially supported (as it's not in installation.sgml), don't care :)

 non-Linux m68k

Couldn't figure out if anybody supports SMP here. Doesn't look like it.

 Univel CC

Don't care.

 HP/UX non-GCC

Itanics volatile semantics should work.

 and WIN32_ONLY_COMPILER

Should be fine due to TSO on x86 and itanic volatiles.

 Because most of those
 are older platforms, I'm betting that more of them than not are OK; I
 think we should confine ourselves to trying to fix the ones we're sure
 are wrong

Sounds sane.

, which if I understand you correctly are ARM without GCC
 atomics, Sparc, and MIPS.

I've to revise my statement on MIPS, it actually looks safe. I seem to
have missed that it has its own S_UNLOCK.

Do I see it correctly that !(defined(__mips__)  !defined(__sgi)) isn't
supported?

 I think it'd be better to just add copies
 of S_UNLOCK to those three rather and NOT duplicate the definition in
 the other 12 places.

Ok.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Andres Freund
On 2014-06-30 19:22:59 +0200, Andres Freund wrote:
 On 2014-06-30 12:46:29 -0400, Robert Haas wrote:
 , which if I understand you correctly are ARM without GCC
  atomics, Sparc, and MIPS.
 
 I've to revise my statement on MIPS, it actually looks safe. I seem to
 have missed that it has its own S_UNLOCK.

So, here's my first blind attempt at fixing these. Too tired to think
much more about it. Sparc's configurable cache coherency drives me
nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed
memory order) to TSO (total store order), solaris always used TSO, but
the BSDs don't. Man.

Accordingly there's a somewhat ugly thingy attached. I don't think this
is really ready, but it's what I can come up today.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 691839dc8fbb78134b530096d33d8a1307231eef Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 1 Jul 2014 00:11:17 +0200
Subject: [PATCH] Fix spinlock implementations for some sparc and arm
 platforms.

When compiling postgres on arm using a older gcc, that doesn't yet
understand __sync_lock_test_and_set(), the default S_UNLOCK routine
was used. Unfortunately that's not correct for ARM's memory model. The
support for older gccs is using the swbp instruction (available for
both arvm5 and most armv6) - unfortunately there's not a common
barrier instruction for those architecture versions. So instead
implement unlock using swbp again, that's possibly a bit slower but
correct.

Some Sparc CPUs can be run in various coherence models, ranging from
RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always
run CPUs in TSO mode, but linux didn't use to and the various *BSDs
still don't. Unfortunately the sparc TAS/S_UNLOCK were only correct
under TSO. Fix that by adding the necessary memory barrier
instructions. On sparcv8+, which should be all relavant CPUs, these
are treated as NOPs if the current consistency model doesn't require
the barriers.

Blindly written, so possibly not working.

They didn't use to use correct acquire/
---
 src/backend/port/tas/sunstudio_sparc.s |  2 +
 src/include/storage/s_lock.h   | 74 +-
 2 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/src/backend/port/tas/sunstudio_sparc.s b/src/backend/port/tas/sunstudio_sparc.s
index 486b7be..dcf54e2 100644
--- a/src/backend/port/tas/sunstudio_sparc.s
+++ b/src/backend/port/tas/sunstudio_sparc.s
@@ -37,6 +37,8 @@ pg_atomic_cas:
 	!
 	!   http://cvs.opensolaris.org/source/xref/on/usr/src/lib/libc/sparc/threads/sparc.il
 	!
+	! NB: We're assuming we're running on a TSO system here - solaris
+	! userland luckily always has done so.
 
 #if defined(__sparcv9) || defined(__sparcv8plus)
 	cas [%o0],%o2,%o1
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 895abe6..9aa9e8e 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -341,6 +341,30 @@ tas(volatile slock_t *lock)
 	return (int) _res;
 }
 
+/*
+ * Implement unlocking using swpb as well, that guarantees release
+ * semantics. Only some armv5 models have the data synchronization barrier
+ * instructions - but those need it. Luckily swpb always works.
+ */
+#define S_UNLOCK(lock)		s_unlock(lock)
+
+static __inline__ void
+s_unlock(volatile slock_t *lock)
+{
+	register slock_t _res = 0;
+
+	__asm__ __volatile__(
+			swpb 	%0, %0, [%2]	\n
+:		+r(_res), +m(*lock)
+:		r(lock)
+:		memory);
+
+	/* lock should have been held */
+	Assert(res == 1);
+}
+
+#define S_INIT_LOCK(lock)	(*(lock) == 0)
+
 #endif	 /* HAVE_GCC_INT_ATOMICS */
 #endif	 /* __arm__ */
 
@@ -393,6 +417,12 @@ tas(volatile slock_t *lock)
 
 
 #if defined(__sparc__)		/* Sparc */
+/*
+ * Solaris has always run sparc processors in TSO (total store) mode, but
+ * linux didn't use to and the *BSDs still don't. So, be careful about
+ * acquire/release semantics. The CPU will treat superflous membars as NOPs,
+ * so it's just code space.
+ */
 #define HAS_TEST_AND_SET
 
 typedef unsigned char slock_t;
@@ -414,9 +444,51 @@ tas(volatile slock_t *lock)
 :		=r(_res), +m(*lock)
 :		r(lock)
 :		memory);
+#if defined(__sparcv7)
+	/*
+	 * No stbar or membar available, luckily no actually produced hardware
+	 * requires a barrier
+	 */
+#elif defined(__sparcv8)
+	/* stbar is available (and required for both PSO, RMO), membar isn't */
+	__asm__ __volatile__ (stbar	 \n:::memory);
+#else
+	/*
+	 * #StoreLoad (RMO) | #StoreStore (PSO, RMO)are the approppriate release
+	 * barrier for sparcv8 upwards.
+	 */
+	__asm__ __volatile__ (membar  #StoreLoad | #StoreStore \n:::memory);
+#endif
 	return (int) _res;
 }
 
+#if defined(__sparcv7)
+/*
+ * No stbar or membar available, luckily no actually produced hardware
+ * requires a barrier
+ */
+#define S_UNLOCK(lock)		(*((volatile slock_t *) (lock)) = 0)
+#elif  __sparcv8
+/* stbar is available (and required 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-30 Thread Robert Haas
On Mon, Jun 30, 2014 at 6:28 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-30 19:22:59 +0200, Andres Freund wrote:
 On 2014-06-30 12:46:29 -0400, Robert Haas wrote:
 , which if I understand you correctly are ARM without GCC
  atomics, Sparc, and MIPS.

 I've to revise my statement on MIPS, it actually looks safe. I seem to
 have missed that it has its own S_UNLOCK.

 So, here's my first blind attempt at fixing these. Too tired to think
 much more about it. Sparc's configurable cache coherency drives me
 nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed
 memory order) to TSO (total store order), solaris always used TSO, but
 the BSDs don't. Man.

 Accordingly there's a somewhat ugly thingy attached. I don't think this
 is really ready, but it's what I can come up today.

You know, looking at this, I wonder if we shouldn't just remove
support for ARMv5 instead of making a blind stab at a fix.  I'm quite
in favor of doing what we can to support obscure architectures, but I
think this might be beyond what's reasonable.  First of all,
committing untested assembler code to our tree seems like an iffy idea
at best.  Secondly, even if someone is running on ARMv5, they'll be
fine as long as they're running a sufficiently new GCC, so we're not
really giving up much by dumping the hand-rolled code for that
platform.  The relevant GCC versions are several years old at this
point, and if we rely on them to get it right, that's really probably
a better bet than trying to do this blind.

Since we have a Sun Studio machine in the buildfarm, we shouldn't give
up on SPARC completely, but maybe we should only add the cases for
sparcv8+ and above?  That at least has some chance of getting tested.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-28 Thread Andres Freund
On 2014-06-27 22:34:19 -0400, Robert Haas wrote:
 On Fri, Jun 27, 2014 at 2:04 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-06-27 13:04:02 -0400, Robert Haas wrote:
  On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   Andres Freund and...@2ndquadrant.com writes:
   On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
   Surely it had better be a read barrier as well?
  
   I don't immediately see why it has to be read barrier? Hoisting a load
   from after the release into the locked area of code should be safe?
  
   No doubt, but delaying a read till after the unlocking write would
   certainly not be safe.
  
   AFAICT, README.barrier completely fails to define what we think the
   semantics of pg_read_barrier and pg_write_barrier actually are, so if
   you believe that a write barrier prevents reordering of reads relative to
   writes, you'd better propose some new text for that file.  It certainly
   doesn't say that today.
 
  The relevant text is in barrier.h
 
  Note that that definition of a write barrier is *not* sufficient for the
  release of a lock... As I said elsewhere I think all the barrier
  definitions, except maybe alpha, luckily seem to be strong enough for
  that anyway.
 
  Do we want to introduce acquire/release barriers? Or do we want to
  redefine the current barriers to be strong enough for that?
 
 Well, unless we're prepared to dump support for an awful lot of
 platfomrs, trying to support acquire and release barriers on every
 platform we support is a doomed effort.

Hm? Just declare them to be as heavy as we need them? Already several
platforms fall back to more heavyweight operations than necessary?

 The definitions of the
 barriers implemented by barrier.h are the same as the ones that Linux
 has (minus read-barrier-depends)

Linux has smb_load_acquire()/smp_store_release() for locks on all
platforms.

 If we were going
 to use any of those in s_lock.h, it'd have to be pg_memory_barrier(),
 but I don't think making s_lock.h dependent on barrier.h is the way to
 go.  I think we should just adjust s_lock.h in a minimal way, using
 inline assembler or tweaking the existing assembler or whatever.

Isn't that just going to be repeating the contents of barrier.h pretty
much again? How are you suggesting we deal with the generic S_UNLOCK
case without having a huge ifdef?
Why do we build an abstraction layer (barrier.h) and then not use it?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-28 Thread Robert Haas
On Sat, Jun 28, 2014 at 4:31 AM, Andres Freund and...@2ndquadrant.com wrote:
  Do we want to introduce acquire/release barriers? Or do we want to
  redefine the current barriers to be strong enough for that?

 Well, unless we're prepared to dump support for an awful lot of
 platfomrs, trying to support acquire and release barriers on every
 platform we support is a doomed effort.

 Hm? Just declare them to be as heavy as we need them? Already several
 platforms fall back to more heavyweight operations than necessary?

Can't we keep this simple for starters?  Strength-reducing the
existing operations is yet a third problem, on top of the
already-existing problems of (1) making spinlock operations compiler
barriers and (2) fixing any buggy implementations.  I'm explicitly
trying to avoid defining this in a way that means we need a Gigantic
Patch that Changes Everything.

 The definitions of the
 barriers implemented by barrier.h are the same as the ones that Linux
 has (minus read-barrier-depends)

 Linux has smb_load_acquire()/smp_store_release() for locks on all
 platforms.

You mean smp.

 If we were going
 to use any of those in s_lock.h, it'd have to be pg_memory_barrier(),
 but I don't think making s_lock.h dependent on barrier.h is the way to
 go.  I think we should just adjust s_lock.h in a minimal way, using
 inline assembler or tweaking the existing assembler or whatever.

 Isn't that just going to be repeating the contents of barrier.h pretty
 much again?

No, I think it's going to be *much* simpler than that.  How about I
take a crack at this next week and then either (a) I'll see why it's a
bad idea and we can go from there or (b) you can review what I come up
with and tell me why it sucks?

 How are you suggesting we deal with the generic S_UNLOCK
 case without having a huge ifdef?
 Why do we build an abstraction layer (barrier.h) and then not use it?

Because (1) the abstraction doesn't fit very well unless we do a lot
of additional work to build acquire and release barriers for every
platform we support and (2) I don't have much confidence that we can
depend on the spinlock fallback for barriers without completely
breaking obscure platforms, and I'd rather make a more minimal set of
changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-28 Thread Andres Freund
On 2014-06-28 09:25:32 -0400, Robert Haas wrote:
 On Sat, Jun 28, 2014 at 4:31 AM, Andres Freund and...@2ndquadrant.com wrote:
   Do we want to introduce acquire/release barriers? Or do we want to
   redefine the current barriers to be strong enough for that?
 
  Well, unless we're prepared to dump support for an awful lot of
  platfomrs, trying to support acquire and release barriers on every
  platform we support is a doomed effort.
 
  Hm? Just declare them to be as heavy as we need them? Already several
  platforms fall back to more heavyweight operations than necessary?
 
 Can't we keep this simple for starters?  Strength-reducing the
 existing operations is yet a third problem, on top of the
 already-existing problems of (1) making spinlock operations compiler
 barriers and (2) fixing any buggy implementations.  I'm explicitly
 trying to avoid defining this in a way that means we need a Gigantic
 Patch that Changes Everything.

I actually mean that we can just define release barriers to be full
memory barriers for platforms where we don't want to think about it. Not
that we should weaken barriers.

 No, I think it's going to be *much* simpler than that.  How about I
 take a crack at this next week and then either (a) I'll see why it's a
 bad idea and we can go from there or (b) you can review what I come up
 with and tell me why it sucks?

Ok. I think that's going in the wrong direction (duplication of
nontrivial knowledge), but maybe I'm taking a to 'purist' approach
here. Prove me wrong :)
You'll pick up the clobber changes from my patch?

  How are you suggesting we deal with the generic S_UNLOCK
  case without having a huge ifdef?
  Why do we build an abstraction layer (barrier.h) and then not use it?
 
 Because (1) the abstraction doesn't fit very well unless we do a lot
 of additional work to build acquire and release barriers for every
 platform we support and

Meh. Something like the (untested):
#if !defined(pg_release_barrier)  defined(pg_read_barrier)  
defined(pg_write_barrier)
#define pg_release_barrier() do { pg_read_barrier(); pg_write_barrier();} while 
(0)
#elif !defined(pg_release_barrier)
#define pg_release_barrier() pg_memory_barrier()
#endif

before the fallback definition of pg_read/write_barrier should suffice?

 (2) I don't have much confidence that we can
 depend on the spinlock fallback for barriers without completely
 breaking obscure platforms, and I'd rather make a more minimal set of
 changes.

Well, it's the beginning of the cycle. And we're already depending on
barriers for correctness (and it's not getting less), so I don't really
see what avoiding barrier usage buys us except harder to find breakage?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-28 Thread Andres Freund
On 2014-06-28 15:41:46 +0200, Andres Freund wrote:
 On 2014-06-28 09:25:32 -0400, Robert Haas wrote:
  No, I think it's going to be *much* simpler than that.  How about I
  take a crack at this next week and then either (a) I'll see why it's a
  bad idea and we can go from there or (b) you can review what I come up
  with and tell me why it sucks?
 
 Ok. I think that's going in the wrong direction (duplication of
 nontrivial knowledge), but maybe I'm taking a to 'purist' approach
 here. Prove me wrong :)

What I forgot: I'm also pretty sure that the more lockless stuff we
introduce the more places are going to need acquire/release semantics...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-27 Thread Robert Haas
On Thu, Jun 26, 2014 at 5:01 PM, Andres Freund and...@2ndquadrant.com wrote:
 But a) isn't really avoidable because it'll otherwise generate compiler
 warnings and b) is done that way all over the tree. E.g. lwlock.c has
 several copies of (note the nonvolatility of proc):
 volatile LWLock *lock = l;
 PGPROC *proc = MyProc;
 ...
 proc-lwWaiting = true;
 proc-lwWaitMode = LW_WAIT_UNTIL_FREE;
 proc-lwWaitLink = NULL;

 /* waiters are added to the front of the queue */
 proc-lwWaitLink = lock-head;
 if (lock-head == NULL)
 lock-tail = proc;
 lock-head = proc;

 /* Can release the mutex now */
 SpinLockRelease(lock-mutex);
 There's nothing forcing the compiler to not move any of the proc-*
 assignments past the SpinLockRelease(). And indeed in my case it was
 actually the store to lwWaitLink that was moved across the lock.

 I think it's just pure luck that there's no active bug (that we know of)
 caused by this. I wouldn't be surprised if some dubious behaviour we've
 seen would be caused by similar issues.

 Now, we can fix this and similar cases by more gratuitous use of
 volatile. But for one we're never going to find all cases. For another
 it won't help *at all* for architectures with looser CPU level memory
 ordering guarantees.
 I think we finally need to bite the bullet and make all S_UNLOCKs
 compiler/write barriers.

There are two separate issues here:

1. SpinLockAcquire and SpinLockRelease are not guaranteed to be
compiler barriers, so all relevant memory accesses in the critical
section need to be done using volatile pointers.  Failing to do this
is an easy mistake to make, and we've fixed numerous bugs of this type
over the years (most recently, to my knowledge, in
4bc15a8bfbc7856bc3426dc9ab99567eebbb64d3).  Forcing SpinLockAcquire()
and SpinLockRelease() to serve as compiler barriers would let us
dispense with a whole lot of volatile calls and make writing future
code correctly a lot easier.

2. Some of our implementations of SpinLockAcquire and/or
SpinLockRelease, but in particular SpinLockRelease, may not actually
provide the memory-ordering semantics which they are required to
provide.  In particular, ...

 I'd previously, in
 http://www.postgresql.org/message-id/20130920151110.ga8...@awork2.anarazel.de,
 gone through the list of S_UNLOCKs and found several that were
 lacking. Most prominently the default S_UNLOCK is just
 #define S_UNLOCK(lock) (*((volatile slock_t *) (lock)) = 0)
 which allows the compiler to move non volatile access across and does
 nothing for CPU level cache coherency.

...this default implementation of S_UNLOCK() is pretty sketchy.  Even
on a platform that enforces reads in program order and writes in
program order, this is still unsafe because a read within the critical
section might get postponed until after this write.  Now, x86 happens
to have an additional constraint, which is that it can reorder loads
before stores but not stores before loads; so that coding happens to
provide release semantics.  But that need not be true on every
architecture, and the trend seems to be toward weaker memory ordering.
As you pointed out to me on chat, the non-intrinsics based ARM
implementation brokenly relies on the default S_UNLOCK(), which
clearly isn't adequate.

Now, in terms of solving these problems:

I tend to think that we should try to think about these two problems
somewhat separately.  As to #1, in the back-branches, I think further
volatile-izing the LWLock* routines is probably the only realistic
solution.  In master, I fully support moving the goalposts such that
we require SpinLockAcquire() and SpinLockRelease() are compiler
barriers.  Once we do this, I think we should go back and rip out all
the places where we've used volatile-ized pointers to provide compiler
ordering.  That way, if we haven't actually managed to provide
compiler ordering everywhere, it's more likely that something will
fall over and warn us about the problem; plus, that avoids keeping
around a coding pattern which isn't actually the one we want people to
copy.  However, I think your proposed S_UNLOCKED_UNLOCK() hack is
plain ugly, probably cripplingly slow, and there's no guarantee that's
even correct; see for example the comments about Itanium's tas
implementation possibly being only an acquire barrier (blech).

On gcc and icc, which account for lines 99 through 700 of spin.h, it
should be simple and mechanical to use compiler intrinsics to make
sure that every S_UNLOCK implementation includes a compiler barrier.
However, lines 710 through 895 support non-gcc, non-icc compilers, and
some of those we may not know how to implement a compiler barrier - in
particular, Univel CC, the Tru64 Alpha compiler, HPPA, AIX, or Sun's
compilers.  Except for Sun, we have no buildfarm support for those
platforms, so we could consider just dropping support entirely, but
I'd be inclined to do 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-27 Thread Robert Haas
On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
 Surely it had better be a read barrier as well?

 I don't immediately see why it has to be read barrier? Hoisting a load
 from after the release into the locked area of code should be safe?

 No doubt, but delaying a read till after the unlocking write would
 certainly not be safe.

 AFAICT, README.barrier completely fails to define what we think the
 semantics of pg_read_barrier and pg_write_barrier actually are, so if
 you believe that a write barrier prevents reordering of reads relative to
 writes, you'd better propose some new text for that file.  It certainly
 doesn't say that today.

The relevant text is in barrier.h

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-27 Thread Andres Freund
On 2014-06-27 13:00:34 -0400, Robert Haas wrote:
 There are two separate issues here:
 
 1. SpinLockAcquire and SpinLockRelease are not guaranteed to be
 compiler barriers, so all relevant memory accesses in the critical
 section need to be done using volatile pointers.  Failing to do this
 is an easy mistake to make, and we've fixed numerous bugs of this type
 over the years (most recently, to my knowledge, in
 4bc15a8bfbc7856bc3426dc9ab99567eebbb64d3).  Forcing SpinLockAcquire()
 and SpinLockRelease() to serve as compiler barriers would let us
 dispense with a whole lot of volatile calls and make writing future
 code correctly a lot easier.

And actually faster in some cases. I'm just playing around with some
bigger POWER machine and removing the volatiles in GetSnapshotData() +
some access forcing macro for accessing xmin is worth 1% or so.

 2. Some of our implementations of SpinLockAcquire and/or
 SpinLockRelease, but in particular SpinLockRelease, may not actually
 provide the memory-ordering semantics which they are required to
 provide.

 I tend to think that we should try to think about these two problems
 somewhat separately.  As to #1, in the back-branches, I think further
 volatile-izing the LWLock* routines is probably the only realistic
 solution.

We could also decide not to do anything :/.

  In master, I fully support moving the goalposts such that
 we require SpinLockAcquire() and SpinLockRelease() are compiler
 barriers.  Once we do this, I think we should go back and rip out all
 the places where we've used volatile-ized pointers to provide compiler
 ordering.  That way, if we haven't actually managed to provide
 compiler ordering everywhere, it's more likely that something will
 fall over and warn us about the problem; plus, that avoids keeping
 around a coding pattern which isn't actually the one we want people to
 copy.

+1

 However, I think your proposed S_UNLOCKED_UNLOCK() hack is
 plain ugly, probably cripplingly slow, and there's no guarantee that's
 even correct; see for example the comments about Itanium's tas
 implementation possibly being only an acquire barrier (blech).

Heh. I don't think it's worse than the current fallback barrier
implementation. The S_UNLOCKED_UNLOCK() thing was just to avoid
recursion when using a barrier in the spinlock used to implement
barriers...

 On gcc and icc, which account for lines 99 through 700 of spin.h, it
 should be simple and mechanical to use compiler intrinsics to make
 sure that every S_UNLOCK implementation includes a compiler barrier.
 However, lines 710 through 895 support non-gcc, non-icc compilers, and
 some of those we may not know how to implement a compiler barrier - in
 particular, Univel CC, the Tru64 Alpha compiler, HPPA, AIX, or Sun's
 compilers.  Except for Sun, we have no buildfarm support for those
 platforms, so we could consider just dropping support entirely

Both sun's and AIX's compilers can relatively easily be handled:
* Solaris has atomic.h with membar_enter() et al. Apparently since at least
solaris 9.
* XLC has __fence and __isync intrinsics.

There's been recent talk about AIX, including about AIX animal, so I'd
be hesitant to drop it. It's also still developed.

I'm obviously in favor of dropping Alpha. And I'm, unsurprisingly, all
for removing unixware support (which is what univel CC seems to be used
for after you dropped the univel port proper).

I think the only person that has used postgres on hppa in the last 5
years is Tom, so I guess he'll have to speak up about it. Tom?

 void fake_compiler_barrier(void) { }
 void (*fake_compiler_barrier_hook) = fake_compiler_barrier;
 #define pg_compiler_barrier() ((*fake_compiler_barrier_hook)())

But we can do that as a fallback. It's what HPPA's example spinlock
implementation does after all.

 Now, this doesn't remove the circular dependency between s_lock.h and
 barrier.h, because we still don't have a fallback method, other than
 acquiring and releasing a spinlock, of implementing a barrier that
 blocks both compiler reordering and CPU reordering.  But it is enough
 to solve problem #1, and doesn't require that we drop support for
 anything that works now.

I think we can move the fallback into a C function. Compared to the cost
of a tas/unlock that shouldn't be significant.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-27 Thread Andres Freund
On 2014-06-27 13:04:02 -0400, Robert Haas wrote:
 On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
  Surely it had better be a read barrier as well?
 
  I don't immediately see why it has to be read barrier? Hoisting a load
  from after the release into the locked area of code should be safe?
 
  No doubt, but delaying a read till after the unlocking write would
  certainly not be safe.
 
  AFAICT, README.barrier completely fails to define what we think the
  semantics of pg_read_barrier and pg_write_barrier actually are, so if
  you believe that a write barrier prevents reordering of reads relative to
  writes, you'd better propose some new text for that file.  It certainly
  doesn't say that today.
 
 The relevant text is in barrier.h

Note that that definition of a write barrier is *not* sufficient for the
release of a lock... As I said elsewhere I think all the barrier
definitions, except maybe alpha, luckily seem to be strong enough for
that anyway.

Do we want to introduce acquire/release barriers? Or do we want to
redefine the current barriers to be strong enough for that?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-27 Thread Robert Haas
On Fri, Jun 27, 2014 at 2:04 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-27 13:04:02 -0400, Robert Haas wrote:
 On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
  Surely it had better be a read barrier as well?
 
  I don't immediately see why it has to be read barrier? Hoisting a load
  from after the release into the locked area of code should be safe?
 
  No doubt, but delaying a read till after the unlocking write would
  certainly not be safe.
 
  AFAICT, README.barrier completely fails to define what we think the
  semantics of pg_read_barrier and pg_write_barrier actually are, so if
  you believe that a write barrier prevents reordering of reads relative to
  writes, you'd better propose some new text for that file.  It certainly
  doesn't say that today.

 The relevant text is in barrier.h

 Note that that definition of a write barrier is *not* sufficient for the
 release of a lock... As I said elsewhere I think all the barrier
 definitions, except maybe alpha, luckily seem to be strong enough for
 that anyway.

 Do we want to introduce acquire/release barriers? Or do we want to
 redefine the current barriers to be strong enough for that?

Well, unless we're prepared to dump support for an awful lot of
platfomrs, trying to support acquire and release barriers on every
platform we support is a doomed effort.  The definitions of the
barriers implemented by barrier.h are the same as the ones that Linux
has (minus read-barrier-depends), which I think is probably good
evidence that they are generally useful definitions.  If we were going
to use any of those in s_lock.h, it'd have to be pg_memory_barrier(),
but I don't think making s_lock.h dependent on barrier.h is the way to
go.  I think we should just adjust s_lock.h in a minimal way, using
inline assembler or tweaking the existing assembler or whatever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-26 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I think we should rework things so that we fall back to
 pg_write_barrier(), (*((volatile slock_t *) (lock)) = 0) instead of what
 we have right now.

Surely it had better be a read barrier as well?  And S_LOCK the same?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-26 Thread Andres Freund
On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I think we should rework things so that we fall back to
  pg_write_barrier(), (*((volatile slock_t *) (lock)) = 0) instead of what
  we have right now.
 
 Surely it had better be a read barrier as well?

I don't immediately see why it has to be read barrier? Hoisting a load
from after the release into the locked area of code should be safe? Note
that 'bad' reads can only happen for variables which aren't protected by
the spinlock since the S_LOCK needs to have acquire semantics and no
other process can modify protected variables concurrently.
The important thing is that all modifications that have been done inside
the spinlock are visible to other backends and that no writes are moved
outside the protected are.

 And S_LOCK the same?

It better be a read barrier, yes. I haven't checked yet, but I assume
that pretty much all TAS/tas implementation already guarantee that. I
think if not we'd seen problems. Well, at least on platforms that
receive testing under concurrent circumstances :/

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-26 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
 Surely it had better be a read barrier as well?

 I don't immediately see why it has to be read barrier? Hoisting a load
 from after the release into the locked area of code should be safe?

No doubt, but delaying a read till after the unlocking write would
certainly not be safe.

AFAICT, README.barrier completely fails to define what we think the
semantics of pg_read_barrier and pg_write_barrier actually are, so if
you believe that a write barrier prevents reordering of reads relative to
writes, you'd better propose some new text for that file.  It certainly
doesn't say that today.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-26 Thread Andres Freund
Hi,

On 2014-06-26 23:01:10 +0200, Andres Freund wrote:
 I think we should rework things so that we fall back to
 pg_write_barrier(), (*((volatile slock_t *) (lock)) = 0) instead of what
 we have right now.
 That'd require to make barrier.h independent from s_lock.h, but I think
 that'd be a pretty clear improvement. One open question is what happens
 with the SpinlockRelease() when barriers are implemented using spinlocks
 and we need a barrier for the SpinlockRelease().

Too tired to think about this any further today. Here's my current state
of this:
* barrier.h's spinlock implementation moved to s_lock.c, loosing the
  s_lock.h include. That requires a S_UNLOCK definition which doesn't
  again use a barrier. I've coined it S_UNLOCKED_UNLOCK
* s_lock.h now includes barrier.h to implement the generic S_UNLOCK
  safely.
* gcc x86-64 had a superflous cc clobber. Likely copied from the 32bit
  version which does additional operations.
* PPC was missing a compiler barrier
* alpha was missing a cc clobber.
* mips was missing a compiler barrier and a cc clobber
* I have no idea how to fix pa-risc's S_UNLOCK for !gcc. The referenced
  spinlock paper calls a external function to avoid reordering.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index efe1b43..5052718 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -187,6 +187,23 @@ update_spins_per_delay(int shared_spins_per_delay)
 	return (shared_spins_per_delay * 15 + spins_per_delay) / 16;
 }
 
+/*
+ * Memory barrier implementation based on lock/unlock to a spinlock. Check
+ * barrier.h for details.
+ */
+#ifdef PG_SIMULATE_MEMORY_BARRIER
+void
+pg_memory_barrier_impl(void)
+{
+	S_LOCK(dummy_spinlock);
+#ifdef S_UNLOCKED_UNLOCK
+	S_UNLOCKED_UNLOCK(dummy_spinlock);
+#else
+	S_UNLOCK(dummy_spinlock);
+#endif
+}
+#endif /* PG_SIMULATE_MEMORY_BARRIER */
+
 
 /*
  * Various TAS implementations that cannot live in s_lock.h as no inline
diff --git a/src/backend/storage/lmgr/spin.c b/src/backend/storage/lmgr/spin.c
index 9b71744..03a4fe8 100644
--- a/src/backend/storage/lmgr/spin.c
+++ b/src/backend/storage/lmgr/spin.c
@@ -25,6 +25,7 @@
 #include access/xlog.h
 #include miscadmin.h
 #include replication/walsender.h
+#include storage/barrier.h
 #include storage/lwlock.h
 #include storage/pg_sema.h
 #include storage/spin.h
diff --git a/src/include/storage/barrier.h b/src/include/storage/barrier.h
index bc61de0..e5692ac 100644
--- a/src/include/storage/barrier.h
+++ b/src/include/storage/barrier.h
@@ -13,10 +13,6 @@
 #ifndef BARRIER_H
 #define BARRIER_H
 
-#include storage/s_lock.h
-
-extern slock_t dummy_spinlock;
-
 /*
  * A compiler barrier need not (and preferably should not) emit any actual
  * machine code, but must act as an optimization fence: the compiler must not
@@ -155,10 +151,14 @@ extern slock_t dummy_spinlock;
  * spinlock acquire-and-release would be equivalent to a full memory barrier.
  * For example, I'm not sure that Itanium's acq and rel add up to a full
  * fence.  But all of our actual implementations seem OK in this regard.
+ *
+ * The actual implementation is in s_lock.c so we can include barrier.h from
+ * s_lock.h. Yes, this will make things even slower, but who cares.
  */
 #if !defined(pg_memory_barrier)
-#define pg_memory_barrier() \
-	do { S_LOCK(dummy_spinlock); S_UNLOCK(dummy_spinlock); } while (0)
+#define PG_SIMULATE_MEMORY_BARRIER
+extern void pg_memory_barrier_impl(void);
+#define pg_memory_barrier() pg_memory_barrier_impl()
 #endif
 
 /*
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index ba4dfe1..2732054 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -19,7 +19,8 @@
  *		Should return number of delays; see s_lock.c
  *
  *	void S_UNLOCK(slock_t *lock)
- *		Unlock a previously acquired lock.
+ *		Unlock a previously acquired lock. Needs to imply a compiler and
+ *  	write memory barrier.
  *
  *	bool S_LOCK_FREE(slock_t *lock)
  *		Tests if the lock is free. Returns TRUE if free, FALSE if locked.
@@ -39,6 +40,7 @@
  *		Atomic test-and-set instruction.  Attempt to acquire the lock,
  *		but do *not* wait.	Returns 0 if successful, nonzero if unable
  *		to acquire the lock.
+ *		Needs to imply a compiler and read memory barrier.
  *
  *	int TAS_SPIN(slock_t *lock)
  *		Like TAS(), but this version is used when waiting for a lock
@@ -94,6 +96,8 @@
 #ifndef S_LOCK_H
 #define S_LOCK_H
 
+#include storage/barrier.h
+
 #ifdef HAVE_SPINLOCKS	/* skip spinlocks if requested */
 
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
@@ -224,7 +228,7 @@ tas(volatile slock_t *lock)
 			xchgb	%0,%1	\n
 :		+q(_res), +m(*lock)
 :
-:		memory, cc);
+:		memory);
 	return (int) _res;
 }
 
@@ -478,14 +482,14 @@ tas(volatile slock_t *lock)
 #define S_UNLOCK(lock)	\
 do \
 

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-06-26 Thread Andres Freund
On 2014-06-26 15:40:11 -0700, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-26 14:13:07 -0700, Tom Lane wrote:
  Surely it had better be a read barrier as well?
 
  I don't immediately see why it has to be read barrier? Hoisting a load
  from after the release into the locked area of code should be safe?
 
 No doubt, but delaying a read till after the unlocking write would
 certainly not be safe.

Right. What we actually want for locking is what several systems
(e.g. C11, linux) coin acquire/release barriers.
Not sure whether we should introduce a separate set of acquire/release
macros, or promote our barriers to have stronger guarantees than the
name implies.

The definition as I understand it is:

A acquire barrier implies that:
* memory operations from after the barrier cannot appear to have
  happened before the barrier
* but: memory operations from *before* the barrier are *not* guaranteed to be
  finished

A finished release barrier implies:
* stores from before the barrier cannot be moved past
* loads from before the barrier cannot be moved past
* but: reads from *after* the barrier might occur *before* the barrier.

I believe that all our current barrier definitions (except maybe alpha
which I haven't bothered to check thoroughly) satisfy those
constraints. That's primarily because we don't have support for all that
many platforms and use full memory barriers for read/write barriers in
several places.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers