On Wed, Jul 3, 2024 at 8:09 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.mu...@gmail.com> writes:
> > Here are some experimental patches to try out some ideas mentioned
> > upthread, that are approximately unlocked by that cleanup.
>
> FWIW, I'm good with getting rid of --disable-spinlocks and
> --disable-atomics.  That's a fair amount of code and needing to
> support it causes problems, as you say.  I am very much less
> excited about ripping out our spinlock and/or atomics code in favor
> of <stdatomic.h>; I just don't see the gain there, and I do see risk
> in ceding control of the semantics and performance of those
> primitives.

OK, <stdatomic.h> part on ice for now.  Here's an update of the rest,
this time also removing the barrier fallbacks as discussed in the LTO
thread[1].

I guess we should also consider reimplementing the spinlock on the
atomic API, but I can see that Andres is poking at spinlock code right
now so I'll keep out of his way...

Side issue: I noticed via CI failure when I tried to require
read/write barriers to be provided (a choice I backed out of), that on
MSVC we seem to be using the full memory barrier fallback for those.
Huh?  For x86, I think they should be using pg_compiler_barrier() (no
code gen, just prevent reordering), not pg_pg_memory_barrier(), no?
Perhaps I'm missing something but I suspect we might be failing to
include arch-x86.h on that compiler when we should... maybe it needs
to detect _M_AMD64 too?  For ARM, from a quick look, the only way to
reach real acquire/release barriers seems to be to use the C11
interface (which would also be fine on x86 where it should degrade to
a no-op compiler barrier or signal fence as the standard calls it),
but IIRC the Windows/ARM basics haven't gone in yet anyway.

[1] 
https://www.postgresql.org/message-id/flat/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de#66ba381b05e8ee08b11503b846acc4a1
From 33533817949052f7af423aaee0ef6e737031effb Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 2 Jul 2024 14:47:59 +1200
Subject: [PATCH v2 1/4] Remove --disable-spinlocks.

A later change will require atomic support, so it wouldn't make sense
for a new system not to be able to implement true spinlocks.

Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
---
 configure                               |  40 ------
 configure.ac                            |  13 --
 doc/src/sgml/installation.sgml          |  37 +----
 meson.build                             |   6 -
 src/backend/port/atomics.c              |  26 ----
 src/backend/port/posix_sema.c           |   3 +-
 src/backend/port/sysv_sema.c            |   3 +-
 src/backend/postmaster/launch_backend.c |   8 --
 src/backend/storage/ipc/ipci.c          |  10 --
 src/backend/storage/lmgr/Makefile       |   1 -
 src/backend/storage/lmgr/meson.build    |   1 -
 src/backend/storage/lmgr/s_lock.c       |   2 +-
 src/backend/storage/lmgr/spin.c         | 180 ------------------------
 src/include/pg_config.h.in              |   3 -
 src/include/pg_config_manual.h          |  15 --
 src/include/port/atomics.h              |   4 +-
 src/include/port/atomics/fallback.h     |   4 +-
 src/include/storage/s_lock.h            |  39 +----
 src/include/storage/spin.h              |  18 +--
 src/test/regress/regress.c              |  86 -----------
 20 files changed, 13 insertions(+), 486 deletions(-)
 delete mode 100644 src/backend/storage/lmgr/spin.c

diff --git a/configure b/configure
index ea5514fab1..f8deaa8d78 100755
--- a/configure
+++ b/configure
@@ -836,7 +836,6 @@ enable_integer_datetimes
 enable_nls
 with_pgport
 enable_rpath
-enable_spinlocks
 enable_atomics
 enable_debug
 enable_profiling
@@ -1529,7 +1528,6 @@ Optional Features:
                           enable Native Language Support
   --disable-rpath         do not embed shared library search path in
                           executables
-  --disable-spinlocks     do not use spinlocks
   --disable-atomics       do not use atomic operations
   --enable-debug          build with debugging symbols (-g)
   --enable-profiling      build with profiling enabled
@@ -3266,33 +3264,6 @@ fi
 
 
 
-#
-# Spinlocks
-#
-
-
-# Check whether --enable-spinlocks was given.
-if test "${enable_spinlocks+set}" = set; then :
-  enableval=$enable_spinlocks;
-  case $enableval in
-    yes)
-      :
-      ;;
-    no)
-      :
-      ;;
-    *)
-      as_fn_error $? "no argument expected for --enable-spinlocks option" "$LINENO" 5
-      ;;
-  esac
-
-else
-  enable_spinlocks=yes
-
-fi
-
-
-
 #
 # Atomic operations
 #
@@ -12185,17 +12156,6 @@ fi
 
 fi
 
-if test "$enable_spinlocks" = yes; then
-
-$as_echo "#define HAVE_SPINLOCKS 1" >>confdefs.h
-
-else
-  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
-*** Not using spinlocks will cause poor performance." >&5
-$as_echo "$as_me: WARNING:
-*** Not using spinlocks will cause poor performance." >&2;}
-fi
-
 if test "$enable_atomics" = yes; then
 
 $as_echo "#define HAVE_ATOMICS 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index 0089e78b68..a72169f574 100644
--- a/configure.ac
+++ b/configure.ac
@@ -186,12 +186,6 @@ PGAC_ARG_BOOL(enable, rpath, yes,
               [do not embed shared library search path in executables])
 AC_SUBST(enable_rpath)
 
-#
-# Spinlocks
-#
-PGAC_ARG_BOOL(enable, spinlocks, yes,
-              [do not use spinlocks])
-
 #
 # Atomic operations
 #
@@ -1296,13 +1290,6 @@ failure.  It is possible the compiler isn't looking in the proper directory.
 Use --without-zlib to disable zlib support.])])
 fi
 
-if test "$enable_spinlocks" = yes; then
-  AC_DEFINE(HAVE_SPINLOCKS, 1, [Define to 1 if you have spinlocks.])
-else
-  AC_MSG_WARN([
-*** Not using spinlocks will cause poor performance.])
-fi
-
 if test "$enable_atomics" = yes; then
   AC_DEFINE(HAVE_ATOMICS, 1, [Define to 1 if you want to use atomics if available.])
 else
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 4784834ab9..3f19f272b1 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1258,22 +1258,6 @@ build-postgresql:
        </listitem>
       </varlistentry>
 
-      <varlistentry id="configure-option-disable-spinlocks">
-       <term><option>--disable-spinlocks</option></term>
-       <listitem>
-        <para>
-         Allow the build to succeed even if <productname>PostgreSQL</productname>
-         has no CPU spinlock support for the platform.  The lack of
-         spinlock support will result in very poor performance; therefore,
-         this option should only be used if the build aborts and
-         informs you that the platform lacks spinlock support. If this
-         option is required to build <productname>PostgreSQL</productname> on
-         your platform, please report the problem to the
-         <productname>PostgreSQL</productname> developers.
-        </para>
-       </listitem>
-      </varlistentry>
-
       <varlistentry id="configure-option-disable-atomics">
        <term><option>--disable-atomics</option></term>
        <listitem>
@@ -2690,23 +2674,6 @@ ninja install
       </listitem>
      </varlistentry>
 
-     <varlistentry id="configure-spinlocks-meson">
-      <term><option>-Dspinlocks={ true | false }</option></term>
-      <listitem>
-       <para>
-        This option is set to true by default; setting it to false will
-        allow the build to succeed even if <productname>PostgreSQL</productname>
-        has no CPU spinlock support for the platform.  The lack of
-        spinlock support will result in very poor performance; therefore,
-        this option should only be changed if the build aborts and
-        informs you that the platform lacks spinlock support. If setting this
-        option to false is required to build <productname>PostgreSQL</productname> on
-        your platform, please report the problem to the
-        <productname>PostgreSQL</productname> developers.
-       </para>
-      </listitem>
-     </varlistentry>
-
      <varlistentry id="configure-atomics-meson">
       <term><option>-Datomics={ true | false }</option></term>
       <listitem>
@@ -2719,6 +2686,7 @@ ninja install
        </para>
       </listitem>
      </varlistentry>
+
     </variablelist>
    </sect3>
 
@@ -3393,9 +3361,6 @@ export MANPATH
    these CPU architectures: x86, PowerPC, S/390, SPARC, ARM, MIPS,
    and RISC-V, including
    big-endian, little-endian, 32-bit, and 64-bit variants where applicable.
-   It is often
-   possible to build on an unsupported CPU type by configuring with
-   <option>--disable-spinlocks</option>, but performance will be poor.
   </para>
 
   <para>
diff --git a/meson.build b/meson.build
index 27805b9bcc..6a0d538365 100644
--- a/meson.build
+++ b/meson.build
@@ -2089,12 +2089,6 @@ endif
 # Atomics
 ###############################################################
 
-if not get_option('spinlocks')
-  warning('Not using spinlocks will cause poor performance')
-else
-  cdata.set('HAVE_SPINLOCKS', 1)
-endif
-
 if not get_option('atomics')
   warning('Not using atomics will cause poor performance')
 else
diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
index 93789b4e05..cd7ede9672 100644
--- a/src/backend/port/atomics.c
+++ b/src/backend/port/atomics.c
@@ -57,17 +57,7 @@ pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr)
 	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
 					 "size mismatch of atomic_flag vs slock_t");
 
-#ifndef HAVE_SPINLOCKS
-
-	/*
-	 * NB: If we're using semaphore based TAS emulation, be careful to use a
-	 * separate set of semaphores. Otherwise we'd get in trouble if an atomic
-	 * var would be manipulated while spinlock is held.
-	 */
-	s_init_lock_sema((slock_t *) &ptr->sema, true);
-#else
 	SpinLockInit((slock_t *) &ptr->sema);
-#endif
 
 	ptr->value = false;
 }
@@ -108,15 +98,7 @@ pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_)
 	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
 					 "size mismatch of atomic_uint32 vs slock_t");
 
-	/*
-	 * If we're using semaphore based atomic flags, be careful about nested
-	 * usage of atomics while a spinlock is held.
-	 */
-#ifndef HAVE_SPINLOCKS
-	s_init_lock_sema((slock_t *) &ptr->sema, true);
-#else
 	SpinLockInit((slock_t *) &ptr->sema);
-#endif
 	ptr->value = val_;
 }
 
@@ -184,15 +166,7 @@ pg_atomic_init_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val_)
 	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
 					 "size mismatch of atomic_uint64 vs slock_t");
 
-	/*
-	 * If we're using semaphore based atomic flags, be careful about nested
-	 * usage of atomics while a spinlock is held.
-	 */
-#ifndef HAVE_SPINLOCKS
-	s_init_lock_sema((slock_t *) &ptr->sema, true);
-#else
 	SpinLockInit((slock_t *) &ptr->sema);
-#endif
 	ptr->value = val_;
 }
 
diff --git a/src/backend/port/posix_sema.c b/src/backend/port/posix_sema.c
index 5886d2233f..64186ec0a7 100644
--- a/src/backend/port/posix_sema.c
+++ b/src/backend/port/posix_sema.c
@@ -217,8 +217,7 @@ PGReserveSemaphores(int maxSemas)
 
 	/*
 	 * We must use ShmemAllocUnlocked(), since the spinlock protecting
-	 * ShmemAlloc() won't be ready yet.  (This ordering is necessary when we
-	 * are emulating spinlocks with semaphores.)
+	 * ShmemAlloc() won't be ready yet.
 	 */
 	sharedSemas = (PGSemaphore)
 		ShmemAllocUnlocked(PGSemaphoreShmemSize(maxSemas));
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index 1454f96b5f..5b88a92bc9 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -325,8 +325,7 @@ PGReserveSemaphores(int maxSemas)
 
 	/*
 	 * We must use ShmemAllocUnlocked(), since the spinlock protecting
-	 * ShmemAlloc() won't be ready yet.  (This ordering is necessary when we
-	 * are emulating spinlocks with semaphores.)
+	 * ShmemAlloc() won't be ready yet.
 	 */
 	sharedSemas = (PGSemaphore)
 		ShmemAllocUnlocked(PGSemaphoreShmemSize(maxSemas));
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 8d4589846a..69ffbd169f 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -107,9 +107,7 @@ typedef struct
 #ifdef USE_INJECTION_POINTS
 	struct InjectionPointsCtl *ActiveInjectionPoints;
 #endif
-#ifndef HAVE_SPINLOCKS
 	PGSemaphore *SpinlockSemaArray;
-#endif
 	int			NamedLWLockTrancheRequests;
 	NamedLWLockTranche *NamedLWLockTrancheArray;
 	LWLockPadded *MainLWLockArray;
@@ -716,9 +714,6 @@ save_backend_variables(BackendParameters *param, ClientSocket *client_sock,
 	param->ActiveInjectionPoints = ActiveInjectionPoints;
 #endif
 
-#ifndef HAVE_SPINLOCKS
-	param->SpinlockSemaArray = SpinlockSemaArray;
-#endif
 	param->NamedLWLockTrancheRequests = NamedLWLockTrancheRequests;
 	param->NamedLWLockTrancheArray = NamedLWLockTrancheArray;
 	param->MainLWLockArray = MainLWLockArray;
@@ -978,9 +973,6 @@ restore_backend_variables(BackendParameters *param)
 	ActiveInjectionPoints = param->ActiveInjectionPoints;
 #endif
 
-#ifndef HAVE_SPINLOCKS
-	SpinlockSemaArray = param->SpinlockSemaArray;
-#endif
 	NamedLWLockTrancheRequests = param->NamedLWLockTrancheRequests;
 	NamedLWLockTrancheArray = param->NamedLWLockTrancheArray;
 	MainLWLockArray = param->MainLWLockArray;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index b6c3b16950..34e4d17b67 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -94,7 +94,6 @@ CalculateShmemSize(int *num_semaphores)
 
 	/* Compute number of semaphores we'll need */
 	numSemas = ProcGlobalSemas();
-	numSemas += SpinlockSemas();
 
 	/* Return the number of semaphores if requested by the caller */
 	if (num_semaphores)
@@ -111,7 +110,6 @@ CalculateShmemSize(int *num_semaphores)
 	 */
 	size = 100000;
 	size = add_size(size, PGSemaphoreShmemSize(numSemas));
-	size = add_size(size, SpinlockSemaSize());
 	size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
 											 sizeof(ShmemIndexEnt)));
 	size = add_size(size, dsm_estimate_size());
@@ -225,14 +223,6 @@ CreateSharedMemoryAndSemaphores(void)
 	 */
 	PGReserveSemaphores(numSemas);
 
-	/*
-	 * If spinlocks are disabled, initialize emulation layer (which depends on
-	 * semaphores, so the order is important here).
-	 */
-#ifndef HAVE_SPINLOCKS
-	SpinlockSemaInit();
-#endif
-
 	/*
 	 * Set up shared memory allocation mechanism
 	 */
diff --git a/src/backend/storage/lmgr/Makefile b/src/backend/storage/lmgr/Makefile
index 3f89548bde..6cbaf23b85 100644
--- a/src/backend/storage/lmgr/Makefile
+++ b/src/backend/storage/lmgr/Makefile
@@ -21,7 +21,6 @@ OBJS = \
 	predicate.o \
 	proc.o \
 	s_lock.o \
-	spin.o
 
 include $(top_srcdir)/src/backend/common.mk
 
diff --git a/src/backend/storage/lmgr/meson.build b/src/backend/storage/lmgr/meson.build
index 05ac41e809..d43511925e 100644
--- a/src/backend/storage/lmgr/meson.build
+++ b/src/backend/storage/lmgr/meson.build
@@ -9,5 +9,4 @@ backend_sources += files(
   'predicate.c',
   'proc.c',
   's_lock.c',
-  'spin.c',
 )
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index cba48b3e77..69549a65db 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * s_lock.c
- *	   Hardware-dependent implementation of spinlocks.
+ *	   Implementation of spinlocks.
  *
  * When waiting for a contended spinlock we loop tightly for awhile, then
  * delay using pg_usleep() and try again.  Preferably, "awhile" should be a
diff --git a/src/backend/storage/lmgr/spin.c b/src/backend/storage/lmgr/spin.c
deleted file mode 100644
index 50cb99cd3b..0000000000
--- a/src/backend/storage/lmgr/spin.c
+++ /dev/null
@@ -1,180 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * spin.c
- *	   Hardware-independent implementation of spinlocks.
- *
- *
- * For machines that have test-and-set (TAS) instructions, s_lock.h/.c
- * define the spinlock implementation.  This file contains only a stub
- * implementation for spinlocks using PGSemaphores.  Unless semaphores
- * are implemented in a way that doesn't involve a kernel call, this
- * is too slow to be very useful :-(
- *
- *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- *	  src/backend/storage/lmgr/spin.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "storage/pg_sema.h"
-#include "storage/shmem.h"
-#include "storage/spin.h"
-
-
-#ifndef HAVE_SPINLOCKS
-
-/*
- * No TAS, so spinlocks are implemented as PGSemaphores.
- */
-
-#ifndef HAVE_ATOMICS
-#define NUM_EMULATION_SEMAPHORES (NUM_SPINLOCK_SEMAPHORES + NUM_ATOMICS_SEMAPHORES)
-#else
-#define NUM_EMULATION_SEMAPHORES (NUM_SPINLOCK_SEMAPHORES)
-#endif							/* HAVE_ATOMICS */
-
-PGSemaphore *SpinlockSemaArray;
-
-#else							/* !HAVE_SPINLOCKS */
-
-#define NUM_EMULATION_SEMAPHORES 0
-
-#endif							/* HAVE_SPINLOCKS */
-
-/*
- * Report the amount of shared memory needed to store semaphores for spinlock
- * support.
- */
-Size
-SpinlockSemaSize(void)
-{
-	return NUM_EMULATION_SEMAPHORES * sizeof(PGSemaphore);
-}
-
-/*
- * Report number of semaphores needed to support spinlocks.
- */
-int
-SpinlockSemas(void)
-{
-	return NUM_EMULATION_SEMAPHORES;
-}
-
-#ifndef HAVE_SPINLOCKS
-
-/*
- * Initialize spinlock emulation.
- *
- * This must be called after PGReserveSemaphores().
- */
-void
-SpinlockSemaInit(void)
-{
-	PGSemaphore *spinsemas;
-	int			nsemas = SpinlockSemas();
-	int			i;
-
-	/*
-	 * We must use ShmemAllocUnlocked(), since the spinlock protecting
-	 * ShmemAlloc() obviously can't be ready yet.
-	 */
-	spinsemas = (PGSemaphore *) ShmemAllocUnlocked(SpinlockSemaSize());
-	for (i = 0; i < nsemas; ++i)
-		spinsemas[i] = PGSemaphoreCreate();
-	SpinlockSemaArray = spinsemas;
-}
-
-/*
- * s_lock.h hardware-spinlock emulation using semaphores
- *
- * We map all spinlocks onto NUM_EMULATION_SEMAPHORES semaphores.  It's okay to
- * map multiple spinlocks onto one semaphore because no process should ever
- * hold more than one at a time.  We just need enough semaphores so that we
- * aren't adding too much extra contention from that.
- *
- * There is one exception to the restriction of only holding one spinlock at a
- * time, which is that it's ok if emulated atomic operations are nested inside
- * spinlocks. To avoid the danger of spinlocks and atomic using the same sema,
- * we make sure "normal" spinlocks and atomics backed by spinlocks use
- * distinct semaphores (see the nested argument to s_init_lock_sema).
- *
- * slock_t is just an int for this implementation; it holds the spinlock
- * number from 1..NUM_EMULATION_SEMAPHORES.  We intentionally ensure that 0
- * is not a valid value, so that testing with this code can help find
- * failures to initialize spinlocks.
- */
-
-static inline void
-s_check_valid(int lockndx)
-{
-	if (unlikely(lockndx <= 0 || lockndx > NUM_EMULATION_SEMAPHORES))
-		elog(ERROR, "invalid spinlock number: %d", lockndx);
-}
-
-void
-s_init_lock_sema(volatile slock_t *lock, bool nested)
-{
-	static uint32 counter = 0;
-	uint32		offset;
-	uint32		sema_total;
-	uint32		idx;
-
-	if (nested)
-	{
-		/*
-		 * To allow nesting atomics inside spinlocked sections, use a
-		 * different spinlock. See comment above.
-		 */
-		offset = 1 + NUM_SPINLOCK_SEMAPHORES;
-		sema_total = NUM_ATOMICS_SEMAPHORES;
-	}
-	else
-	{
-		offset = 1;
-		sema_total = NUM_SPINLOCK_SEMAPHORES;
-	}
-
-	idx = (counter++ % sema_total) + offset;
-
-	/* double check we did things correctly */
-	s_check_valid(idx);
-
-	*lock = idx;
-}
-
-void
-s_unlock_sema(volatile slock_t *lock)
-{
-	int			lockndx = *lock;
-
-	s_check_valid(lockndx);
-
-	PGSemaphoreUnlock(SpinlockSemaArray[lockndx - 1]);
-}
-
-bool
-s_lock_free_sema(volatile slock_t *lock)
-{
-	/* We don't currently use S_LOCK_FREE anyway */
-	elog(ERROR, "spin.c does not support S_LOCK_FREE()");
-	return false;
-}
-
-int
-tas_sema(volatile slock_t *lock)
-{
-	int			lockndx = *lock;
-
-	s_check_valid(lockndx);
-
-	/* Note that TAS macros return 0 if *success* */
-	return !PGSemaphoreTryLock(SpinlockSemaArray[lockndx - 1]);
-}
-
-#endif							/* !HAVE_SPINLOCKS */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 3dea3856aa..e6c06f6102 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -382,9 +382,6 @@
 /* Define to 1 if the system has the type `socklen_t'. */
 #undef HAVE_SOCKLEN_T
 
-/* Define to 1 if you have spinlocks. */
-#undef HAVE_SPINLOCKS
-
 /* Define to 1 if you have the `SSL_CTX_set_cert_cb' function. */
 #undef HAVE_SSL_CTX_SET_CERT_CB
 
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index f941ee2faf..11f74f4b56 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -86,21 +86,6 @@
 #define USE_FLOAT8_BYVAL 1
 #endif
 
-/*
- * When we don't have native spinlocks, we use semaphores to simulate them.
- * Decreasing this value reduces consumption of OS resources; increasing it
- * may improve performance, but supplying a real spinlock implementation is
- * probably far better.
- */
-#define NUM_SPINLOCK_SEMAPHORES		128
-
-/*
- * When we have neither spinlocks nor atomic operations support we're
- * implementing atomic operations on top of spinlock on top of semaphores. To
- * be safe against atomic operations while holding a spinlock separate
- * semaphores have to be used.
- */
-#define NUM_ATOMICS_SEMAPHORES		64
 
 /*
  * MAXPGPATH: standard size of a pathname buffer in PostgreSQL (hence,
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index f6fa432d2d..03134e3b7b 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -16,8 +16,8 @@
  *
  * There exist generic, hardware independent, implementations for several
  * compilers which might be sufficient, although possibly not optimal, for a
- * new platform. If no such generic implementation is available spinlocks (or
- * even OS provided semaphores) will be used to implement the API.
+ * new platform. If no such generic implementation is available spinlocks will
+ * be used to implement the API.
  *
  * Implement _u64 atomics if and only if your platform can use them
  * efficiently (and obviously correctly).
diff --git a/src/include/port/atomics/fallback.h b/src/include/port/atomics/fallback.h
index 34cfee110f..2e3eef4aca 100644
--- a/src/include/port/atomics/fallback.h
+++ b/src/include/port/atomics/fallback.h
@@ -20,9 +20,7 @@
 #ifndef pg_memory_barrier_impl
 /*
  * If we have no memory barrier implementation for this architecture, we
- * fall back to acquiring and releasing a spinlock.  This might, in turn,
- * fall back to the semaphore-based spinlock implementation, which will be
- * amazingly slow.
+ * fall back to acquiring and releasing a spinlock.
  *
  * It's not self-evident that every possible legal implementation of a
  * spinlock acquire-and-release would be equivalent to a full memory barrier.
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 02c68513a5..e94ed5f48b 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -1,10 +1,10 @@
 /*-------------------------------------------------------------------------
  *
  * s_lock.h
- *	   Hardware-dependent implementation of spinlocks.
+ *	   Implementation of spinlocks.
  *
  *	NOTE: none of the macros in this file are intended to be called directly.
- *	Call them through the hardware-independent macros in spin.h.
+ *	Call them through the macros in spin.h.
  *
  *	The following hardware-dependent macros must be provided for each
  *	supported platform:
@@ -78,13 +78,6 @@
  *	in assembly language to execute a hardware atomic-test-and-set
  *	instruction.  Equivalent OS-supplied mutex routines could be used too.
  *
- *	If no system-specific TAS() is available (ie, HAVE_SPINLOCKS is not
- *	defined), then we fall back on an emulation that uses SysV semaphores
- *	(see spin.c).  This emulation will be MUCH MUCH slower than a proper TAS()
- *	implementation, because of the cost of a kernel call per lock or unlock.
- *	An old report is that Postgres spends around 40% of its time in semop(2)
- *	when using the SysV semaphore code.
- *
  *
  * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -100,8 +93,6 @@
 #error "s_lock.h may not be included from frontend code"
 #endif
 
-#ifdef HAVE_SPINLOCKS	/* skip spinlocks if requested */
-
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
 /*************************************************************************
  * All the gcc inlines
@@ -655,34 +646,10 @@ spin_delay(void)
 
 /* Blow up if we didn't have any way to do spinlocks */
 #ifndef HAS_TEST_AND_SET
-#error PostgreSQL does not have native spinlock support on this platform.  To continue the compilation, rerun configure using --disable-spinlocks.  However, performance will be poor.  Please report this to pgsql-b...@lists.postgresql.org.
+#error PostgreSQL does not have spinlock support on this platform.  Please report this to pgsql-b...@lists.postgresql.org.
 #endif
 
 
-#else	/* !HAVE_SPINLOCKS */
-
-
-/*
- * Fake spinlock implementation using semaphores --- slow and prone
- * to fall foul of kernel limits on number of semaphores, so don't use this
- * unless you must!  The subroutines appear in spin.c.
- */
-typedef int slock_t;
-
-extern bool s_lock_free_sema(volatile slock_t *lock);
-extern void s_unlock_sema(volatile slock_t *lock);
-extern void s_init_lock_sema(volatile slock_t *lock, bool nested);
-extern int	tas_sema(volatile slock_t *lock);
-
-#define S_LOCK_FREE(lock)	s_lock_free_sema(lock)
-#define S_UNLOCK(lock)	 s_unlock_sema(lock)
-#define S_INIT_LOCK(lock)	s_init_lock_sema(lock, false)
-#define TAS(lock)	tas_sema(lock)
-
-
-#endif	/* HAVE_SPINLOCKS */
-
-
 /*
  * Default Definitions - override these above as needed.
  */
diff --git a/src/include/storage/spin.h b/src/include/storage/spin.h
index c0679c5999..3ae2a56d07 100644
--- a/src/include/storage/spin.h
+++ b/src/include/storage/spin.h
@@ -1,11 +1,11 @@
 /*-------------------------------------------------------------------------
  *
  * spin.h
- *	   Hardware-independent implementation of spinlocks.
+ *	   API for spinlocks.
  *
  *
- *	The hardware-independent interface to spinlocks is defined by the
- *	typedef "slock_t" and these macros:
+ *	The interface to spinlocks is defined by the typedef "slock_t" and
+ *	these macros:
  *
  *	void SpinLockInit(volatile slock_t *lock)
  *		Initialize a spinlock (to the unlocked state).
@@ -52,9 +52,6 @@
 #define SPIN_H
 
 #include "storage/s_lock.h"
-#ifndef HAVE_SPINLOCKS
-#include "storage/pg_sema.h"
-#endif
 
 
 #define SpinLockInit(lock)	S_INIT_LOCK(lock)
@@ -65,13 +62,4 @@
 
 #define SpinLockFree(lock)	S_LOCK_FREE(lock)
 
-
-extern int	SpinlockSemas(void);
-extern Size SpinlockSemaSize(void);
-
-#ifndef HAVE_SPINLOCKS
-extern void SpinlockSemaInit(void);
-extern PGDLLIMPORT PGSemaphore *SpinlockSemaArray;
-#endif
-
 #endif							/* SPIN_H */
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 45a6ad3c49..14aad5a0c6 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -887,91 +887,7 @@ test_spinlock(void)
 		if (memcmp(struct_w_lock.data_after, "ef12", 4) != 0)
 			elog(ERROR, "padding after spinlock modified");
 	}
-
-	/*
-	 * Ensure that allocating more than INT32_MAX emulated spinlocks works.
-	 * That's interesting because the spinlock emulation uses a 32bit integer
-	 * to map spinlocks onto semaphores. There've been bugs...
-	 */
-#ifndef HAVE_SPINLOCKS
-	{
-		/*
-		 * Initialize enough spinlocks to advance counter close to wraparound.
-		 * It's too expensive to perform acquire/release for each, as those
-		 * may be syscalls when the spinlock emulation is used (and even just
-		 * atomic TAS would be expensive).
-		 */
-		for (uint32 i = 0; i < INT32_MAX - 100000; i++)
-		{
-			slock_t		lock;
-
-			SpinLockInit(&lock);
-		}
-
-		for (uint32 i = 0; i < 200000; i++)
-		{
-			slock_t		lock;
-
-			SpinLockInit(&lock);
-
-			SpinLockAcquire(&lock);
-			SpinLockRelease(&lock);
-			SpinLockAcquire(&lock);
-			SpinLockRelease(&lock);
-		}
-	}
-#endif
-}
-
-/*
- * Verify that performing atomic ops inside a spinlock isn't a
- * problem. Realistically that's only going to be a problem when both
- * --disable-spinlocks and --disable-atomics are used, but it's cheap enough
- * to just always test.
- *
- * The test works by initializing enough atomics that we'd conflict if there
- * were an overlap between a spinlock and an atomic by holding a spinlock
- * while manipulating more than NUM_SPINLOCK_SEMAPHORES atomics.
- *
- * NUM_TEST_ATOMICS doesn't really need to be more than
- * NUM_SPINLOCK_SEMAPHORES, but it seems better to test a bit more
- * extensively.
- */
-static void
-test_atomic_spin_nest(void)
-{
-	slock_t		lock;
-#define NUM_TEST_ATOMICS (NUM_SPINLOCK_SEMAPHORES + NUM_ATOMICS_SEMAPHORES + 27)
-	pg_atomic_uint32 atomics32[NUM_TEST_ATOMICS];
-	pg_atomic_uint64 atomics64[NUM_TEST_ATOMICS];
-
-	SpinLockInit(&lock);
-
-	for (int i = 0; i < NUM_TEST_ATOMICS; i++)
-	{
-		pg_atomic_init_u32(&atomics32[i], 0);
-		pg_atomic_init_u64(&atomics64[i], 0);
-	}
-
-	/* just so it's not all zeroes */
-	for (int i = 0; i < NUM_TEST_ATOMICS; i++)
-	{
-		EXPECT_EQ_U32(pg_atomic_fetch_add_u32(&atomics32[i], i), 0);
-		EXPECT_EQ_U64(pg_atomic_fetch_add_u64(&atomics64[i], i), 0);
-	}
-
-	/* test whether we can do atomic op with lock held */
-	SpinLockAcquire(&lock);
-	for (int i = 0; i < NUM_TEST_ATOMICS; i++)
-	{
-		EXPECT_EQ_U32(pg_atomic_fetch_sub_u32(&atomics32[i], i), i);
-		EXPECT_EQ_U32(pg_atomic_read_u32(&atomics32[i]), 0);
-		EXPECT_EQ_U64(pg_atomic_fetch_sub_u64(&atomics64[i], i), i);
-		EXPECT_EQ_U64(pg_atomic_read_u64(&atomics64[i]), 0);
-	}
-	SpinLockRelease(&lock);
 }
-#undef NUM_TEST_ATOMICS
 
 PG_FUNCTION_INFO_V1(test_atomic_ops);
 Datum
@@ -989,8 +905,6 @@ test_atomic_ops(PG_FUNCTION_ARGS)
 	 */
 	test_spinlock();
 
-	test_atomic_spin_nest();
-
 	PG_RETURN_BOOL(true);
 }
 
-- 
2.39.2

From ca0d057fdf8b47957ce469c1300835564a82da92 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 2 Jul 2024 15:13:00 +1200
Subject: [PATCH v2 2/4] Remove --disable-atomics, require 32 bit atomics.

All targeted systems have real atomics support in practice.  Since
edadeb07 there is no remaining reason to carry code that simulates
atomic flag and uint32 imperfectly with spinlocks.

Any modern toolchain capable of implementing C11 <stdatomic.h> must have
the underlying operations we need, though we don't require C11 yet.

Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
---
 configure                                 |  40 --------
 configure.ac                              |  13 ---
 doc/src/sgml/installation.sgml            |  25 -----
 meson.build                               |  65 ++++++-------
 src/backend/port/atomics.c                | 109 ----------------------
 src/include/pg_config.h.in                |   3 -
 src/include/port/atomics.h                |  18 ++--
 src/include/port/atomics/arch-x86.h       |   8 --
 src/include/port/atomics/fallback.h       |  87 +----------------
 src/include/port/atomics/generic-gcc.h    |   4 -
 src/include/port/atomics/generic-msvc.h   |   4 -
 src/include/port/atomics/generic-sunpro.h |   8 --
 12 files changed, 39 insertions(+), 345 deletions(-)

diff --git a/configure b/configure
index f8deaa8d78..8f684f7945 100755
--- a/configure
+++ b/configure
@@ -836,7 +836,6 @@ enable_integer_datetimes
 enable_nls
 with_pgport
 enable_rpath
-enable_atomics
 enable_debug
 enable_profiling
 enable_coverage
@@ -1528,7 +1527,6 @@ Optional Features:
                           enable Native Language Support
   --disable-rpath         do not embed shared library search path in
                           executables
-  --disable-atomics       do not use atomic operations
   --enable-debug          build with debugging symbols (-g)
   --enable-profiling      build with profiling enabled
   --enable-coverage       build with coverage testing instrumentation
@@ -3264,33 +3262,6 @@ fi
 
 
 
-#
-# Atomic operations
-#
-
-
-# Check whether --enable-atomics was given.
-if test "${enable_atomics+set}" = set; then :
-  enableval=$enable_atomics;
-  case $enableval in
-    yes)
-      :
-      ;;
-    no)
-      :
-      ;;
-    *)
-      as_fn_error $? "no argument expected for --enable-atomics option" "$LINENO" 5
-      ;;
-  esac
-
-else
-  enable_atomics=yes
-
-fi
-
-
-
 #
 # --enable-debug adds -g to compiler flags
 #
@@ -12156,17 +12127,6 @@ fi
 
 fi
 
-if test "$enable_atomics" = yes; then
-
-$as_echo "#define HAVE_ATOMICS 1" >>confdefs.h
-
-else
-  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
-*** Not using atomic operations will cause poor performance." >&5
-$as_echo "$as_me: WARNING:
-*** Not using atomic operations will cause poor performance." >&2;}
-fi
-
 if test "$with_gssapi" = yes ; then
   if test "$PORTNAME" != "win32"; then
     { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gss_store_cred_into" >&5
diff --git a/configure.ac b/configure.ac
index a72169f574..75b73532fe 100644
--- a/configure.ac
+++ b/configure.ac
@@ -186,12 +186,6 @@ PGAC_ARG_BOOL(enable, rpath, yes,
               [do not embed shared library search path in executables])
 AC_SUBST(enable_rpath)
 
-#
-# Atomic operations
-#
-PGAC_ARG_BOOL(enable, atomics, yes,
-              [do not use atomic operations])
-
 #
 # --enable-debug adds -g to compiler flags
 #
@@ -1290,13 +1284,6 @@ failure.  It is possible the compiler isn't looking in the proper directory.
 Use --without-zlib to disable zlib support.])])
 fi
 
-if test "$enable_atomics" = yes; then
-  AC_DEFINE(HAVE_ATOMICS, 1, [Define to 1 if you want to use atomics if available.])
-else
-  AC_MSG_WARN([
-*** Not using atomic operations will cause poor performance.])
-fi
-
 if test "$with_gssapi" = yes ; then
   if test "$PORTNAME" != "win32"; then
     AC_SEARCH_LIBS(gss_store_cred_into, [gssapi_krb5 gss 'gssapi -lkrb5 -lcrypto'], [],
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 3f19f272b1..4ab8ddba7c 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1258,18 +1258,6 @@ build-postgresql:
        </listitem>
       </varlistentry>
 
-      <varlistentry id="configure-option-disable-atomics">
-       <term><option>--disable-atomics</option></term>
-       <listitem>
-        <para>
-         Disable use of CPU atomic operations.  This option does nothing on
-         platforms that lack such operations.  On platforms that do have
-         them, this will result in poor performance.  This option is only
-         useful for debugging or making performance comparisons.
-        </para>
-       </listitem>
-      </varlistentry>
-
      </variablelist>
 
    </sect3>
@@ -2674,19 +2662,6 @@ ninja install
       </listitem>
      </varlistentry>
 
-     <varlistentry id="configure-atomics-meson">
-      <term><option>-Datomics={ true | false }</option></term>
-      <listitem>
-       <para>
-        This option is set to true by default; setting it to false will
-        disable use of CPU atomic operations.  The option does nothing on
-        platforms that lack such operations.  On platforms that do have
-        them, disabling atomics will result in poor performance.  Changing
-        this option is only useful for debugging or making performance comparisons.
-       </para>
-      </listitem>
-     </varlistentry>
-
     </variablelist>
    </sect3>
 
diff --git a/meson.build b/meson.build
index 6a0d538365..7de0371226 100644
--- a/meson.build
+++ b/meson.build
@@ -2089,70 +2089,61 @@ endif
 # Atomics
 ###############################################################
 
-if not get_option('atomics')
-  warning('Not using atomics will cause poor performance')
-else
-  # XXX: perhaps we should require some atomics support in this case these
-  # days?
-  cdata.set('HAVE_ATOMICS', 1)
-
-  atomic_checks = [
-    {'name': 'HAVE_GCC__SYNC_CHAR_TAS',
-     'desc': '__sync_lock_test_and_set(char)',
-     'test': '''
+atomic_checks = [
+  {'name': 'HAVE_GCC__SYNC_CHAR_TAS',
+   'desc': '__sync_lock_test_and_set(char)',
+   'test': '''
 char lock = 0;
 __sync_lock_test_and_set(&lock, 1);
 __sync_lock_release(&lock);'''},
 
-    {'name': 'HAVE_GCC__SYNC_INT32_TAS',
-     'desc': '__sync_lock_test_and_set(int32)',
-     'test': '''
+  {'name': 'HAVE_GCC__SYNC_INT32_TAS',
+   'desc': '__sync_lock_test_and_set(int32)',
+   'test': '''
 int lock = 0;
 __sync_lock_test_and_set(&lock, 1);
 __sync_lock_release(&lock);'''},
 
-    {'name': 'HAVE_GCC__SYNC_INT32_CAS',
-     'desc': '__sync_val_compare_and_swap(int32)',
-     'test': '''
+  {'name': 'HAVE_GCC__SYNC_INT32_CAS',
+   'desc': '__sync_val_compare_and_swap(int32)',
+   'test': '''
 int val = 0;
 __sync_val_compare_and_swap(&val, 0, 37);'''},
 
-    {'name': 'HAVE_GCC__SYNC_INT64_CAS',
-     'desc': '__sync_val_compare_and_swap(int64)',
-     'test': '''
+  {'name': 'HAVE_GCC__SYNC_INT64_CAS',
+   'desc': '__sync_val_compare_and_swap(int64)',
+   'test': '''
 INT64 val = 0;
 __sync_val_compare_and_swap(&val, 0, 37);'''},
 
-    {'name': 'HAVE_GCC__ATOMIC_INT32_CAS',
-     'desc': ' __atomic_compare_exchange_n(int32)',
-     'test': '''
+  {'name': 'HAVE_GCC__ATOMIC_INT32_CAS',
+   'desc': ' __atomic_compare_exchange_n(int32)',
+   'test': '''
 int val = 0;
 int expect = 0;
 __atomic_compare_exchange_n(&val, &expect, 37, 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);'''},
 
-    {'name': 'HAVE_GCC__ATOMIC_INT64_CAS',
-     'desc': ' __atomic_compare_exchange_n(int64)',
-     'test': '''
+  {'name': 'HAVE_GCC__ATOMIC_INT64_CAS',
+   'desc': ' __atomic_compare_exchange_n(int64)',
+   'test': '''
 INT64 val = 0;
 INT64 expect = 0;
 __atomic_compare_exchange_n(&val, &expect, 37, 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);'''},
-  ]
+]
 
-  foreach check : atomic_checks
-    test = '''
+foreach check : atomic_checks
+  test = '''
 int main(void)
 {
 @0@
 }'''.format(check['test'])
 
-    cdata.set(check['name'],
-      cc.links(test,
-        name: check['desc'],
-        args: test_c_args + ['-DINT64=@0@'.format(cdata.get('PG_INT64_TYPE'))]) ? 1 : false
-    )
-  endforeach
-
-endif
+  cdata.set(check['name'],
+    cc.links(test,
+      name: check['desc'],
+      args: test_c_args + ['-DINT64=@0@'.format(cdata.get('PG_INT64_TYPE'))]) ? 1 : false
+  )
+endforeach
 
 
 ###############################################################
diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
index cd7ede9672..6f1e014d0b 100644
--- a/src/backend/port/atomics.c
+++ b/src/backend/port/atomics.c
@@ -49,115 +49,6 @@ pg_extern_compiler_barrier(void)
 #endif
 
 
-#ifdef PG_HAVE_ATOMIC_FLAG_SIMULATION
-
-void
-pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
-					 "size mismatch of atomic_flag vs slock_t");
-
-	SpinLockInit((slock_t *) &ptr->sema);
-
-	ptr->value = false;
-}
-
-bool
-pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	uint32		oldval;
-
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	oldval = ptr->value;
-	ptr->value = true;
-	SpinLockRelease((slock_t *) &ptr->sema);
-
-	return oldval == 0;
-}
-
-void
-pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	ptr->value = false;
-	SpinLockRelease((slock_t *) &ptr->sema);
-}
-
-bool
-pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	return ptr->value == 0;
-}
-
-#endif							/* PG_HAVE_ATOMIC_FLAG_SIMULATION */
-
-#ifdef PG_HAVE_ATOMIC_U32_SIMULATION
-void
-pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_)
-{
-	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
-					 "size mismatch of atomic_uint32 vs slock_t");
-
-	SpinLockInit((slock_t *) &ptr->sema);
-	ptr->value = val_;
-}
-
-void
-pg_atomic_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val)
-{
-	/*
-	 * One might think that an unlocked write doesn't need to acquire the
-	 * spinlock, but one would be wrong. Even an unlocked write has to cause a
-	 * concurrent pg_atomic_compare_exchange_u32() (et al) to fail.
-	 */
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	ptr->value = val;
-	SpinLockRelease((slock_t *) &ptr->sema);
-}
-
-bool
-pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
-									uint32 *expected, uint32 newval)
-{
-	bool		ret;
-
-	/*
-	 * Do atomic op under a spinlock. It might look like we could just skip
-	 * the cmpxchg if the lock isn't available, but that'd just emulate a
-	 * 'weak' compare and swap. I.e. one that allows spurious failures. Since
-	 * several algorithms rely on a strong variant and that is efficiently
-	 * implementable on most major architectures let's emulate it here as
-	 * well.
-	 */
-	SpinLockAcquire((slock_t *) &ptr->sema);
-
-	/* perform compare/exchange logic */
-	ret = ptr->value == *expected;
-	*expected = ptr->value;
-	if (ret)
-		ptr->value = newval;
-
-	/* and release lock */
-	SpinLockRelease((slock_t *) &ptr->sema);
-
-	return ret;
-}
-
-uint32
-pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
-{
-	uint32		oldval;
-
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	oldval = ptr->value;
-	ptr->value += add_;
-	SpinLockRelease((slock_t *) &ptr->sema);
-	return oldval;
-}
-
-#endif							/* PG_HAVE_ATOMIC_U32_SIMULATION */
-
-
 #ifdef PG_HAVE_ATOMIC_U64_SIMULATION
 
 void
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index e6c06f6102..0e9b108e66 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -57,9 +57,6 @@
 /* Define to 1 if you have the `ASN1_STRING_get0_data' function. */
 #undef HAVE_ASN1_STRING_GET0_DATA
 
-/* Define to 1 if you want to use atomics if available. */
-#undef HAVE_ATOMICS
-
 /* Define to 1 if you have the <atomic.h> header file. */
 #undef HAVE_ATOMIC_H
 
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index 03134e3b7b..c2ce10a718 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -17,7 +17,7 @@
  * There exist generic, hardware independent, implementations for several
  * compilers which might be sufficient, although possibly not optimal, for a
  * new platform. If no such generic implementation is available spinlocks will
- * be used to implement the API.
+ * be used to implement the 64-bit parts of the API.
  *
  * Implement _u64 atomics if and only if your platform can use them
  * efficiently (and obviously correctly).
@@ -91,17 +91,17 @@
 #elif defined(__SUNPRO_C) && !defined(__GNUC__)
 #include "port/atomics/generic-sunpro.h"
 #else
-/*
- * Unsupported compiler, we'll likely use slower fallbacks... At least
- * compiler barriers should really be provided.
- */
+/* Unknown compiler. */
+#endif
+
+/* Fail if we couldn't find implementations of required facilities. */
+#if !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
+#error "could not find an implementation of pg_atomic_uint32"
 #endif
 
 /*
- * Provide a full fallback of the pg_*_barrier(), pg_atomic**_flag and
- * pg_atomic_* APIs for platforms without sufficient spinlock and/or atomics
- * support. In the case of spinlock backed atomics the emulation is expected
- * to be efficient, although less so than native atomics support.
+ * Provide a spinlock-based implementation of the 64 bit variants, if
+ * necessary.
  */
 #include "port/atomics/fallback.h"
 
diff --git a/src/include/port/atomics/arch-x86.h b/src/include/port/atomics/arch-x86.h
index 2a8eca30fc..c12f8a6069 100644
--- a/src/include/port/atomics/arch-x86.h
+++ b/src/include/port/atomics/arch-x86.h
@@ -49,8 +49,6 @@
  * nice to support older gcc's and the compare/exchange implementation here is
  * actually more efficient than the * __sync variant.
  */
-#if defined(HAVE_ATOMICS)
-
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
 
 #define PG_HAVE_ATOMIC_FLAG_SUPPORT
@@ -80,8 +78,6 @@ typedef struct pg_atomic_uint64
 
 #endif /* defined(__GNUC__) || defined(__INTEL_COMPILER) */
 
-#endif /* defined(HAVE_ATOMICS) */
-
 #if !defined(PG_HAVE_SPIN_DELAY)
 /*
  * This sequence is equivalent to the PAUSE instruction ("rep" is
@@ -132,8 +128,6 @@ pg_spin_delay_impl(void)
 #endif /* !defined(PG_HAVE_SPIN_DELAY) */
 
 
-#if defined(HAVE_ATOMICS)
-
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
 
 #define PG_HAVE_ATOMIC_TEST_SET_FLAG
@@ -250,5 +244,3 @@ pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 	defined(__x86_64__) || defined(__x86_64) || defined(_M_X64) /* gcc, sunpro, msvc */
 #define PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
 #endif /* 8 byte single-copy atomicity */
-
-#endif /* HAVE_ATOMICS */
diff --git a/src/include/port/atomics/fallback.h b/src/include/port/atomics/fallback.h
index 2e3eef4aca..8ffd1a8fd3 100644
--- a/src/include/port/atomics/fallback.h
+++ b/src/include/port/atomics/fallback.h
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * fallback.h
- *    Fallback for platforms without spinlock and/or atomics support. Slower
+ *    Fallback for platforms without 64 bit atomics support. Slower
  *    than native atomics support, but not unusably slow.
  *
  * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
@@ -49,50 +49,6 @@ extern void pg_extern_compiler_barrier(void);
 #endif
 
 
-/*
- * If we have atomics implementation for this platform, fall back to providing
- * the atomics API using a spinlock to protect the internal state. Possibly
- * the spinlock implementation uses semaphores internally...
- *
- * We have to be a bit careful here, as it's not guaranteed that atomic
- * variables are mapped to the same address in every process (e.g. dynamic
- * shared memory segments). We can't just hash the address and use that to map
- * to a spinlock. Instead assign a spinlock on initialization of the atomic
- * variable.
- */
-#if !defined(PG_HAVE_ATOMIC_FLAG_SUPPORT) && !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
-
-#define PG_HAVE_ATOMIC_FLAG_SIMULATION
-#define PG_HAVE_ATOMIC_FLAG_SUPPORT
-
-typedef struct pg_atomic_flag
-{
-	/*
-	 * To avoid circular includes we can't use s_lock as a type here. Instead
-	 * just reserve enough space for all spinlock types. Some platforms would
-	 * be content with just one byte instead of 4, but that's not too much
-	 * waste.
-	 */
-	int			sema;
-	volatile bool value;
-} pg_atomic_flag;
-
-#endif /* PG_HAVE_ATOMIC_FLAG_SUPPORT */
-
-#if !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
-
-#define PG_HAVE_ATOMIC_U32_SIMULATION
-
-#define PG_HAVE_ATOMIC_U32_SUPPORT
-typedef struct pg_atomic_uint32
-{
-	/* Check pg_atomic_flag's definition above for an explanation */
-	int			sema;
-	volatile uint32 value;
-} pg_atomic_uint32;
-
-#endif /* PG_HAVE_ATOMIC_U32_SUPPORT */
-
 #if !defined(PG_HAVE_ATOMIC_U64_SUPPORT)
 
 #define PG_HAVE_ATOMIC_U64_SIMULATION
@@ -100,49 +56,10 @@ typedef struct pg_atomic_uint32
 #define PG_HAVE_ATOMIC_U64_SUPPORT
 typedef struct pg_atomic_uint64
 {
-	/* Check pg_atomic_flag's definition above for an explanation */
 	int			sema;
 	volatile uint64 value;
 } pg_atomic_uint64;
 
-#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
-
-#ifdef PG_HAVE_ATOMIC_FLAG_SIMULATION
-
-#define PG_HAVE_ATOMIC_INIT_FLAG
-extern void pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr);
-
-#define PG_HAVE_ATOMIC_TEST_SET_FLAG
-extern bool pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr);
-
-#define PG_HAVE_ATOMIC_CLEAR_FLAG
-extern void pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr);
-
-#define PG_HAVE_ATOMIC_UNLOCKED_TEST_FLAG
-extern bool pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr);
-
-#endif /* PG_HAVE_ATOMIC_FLAG_SIMULATION */
-
-#ifdef PG_HAVE_ATOMIC_U32_SIMULATION
-
-#define PG_HAVE_ATOMIC_INIT_U32
-extern void pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_);
-
-#define PG_HAVE_ATOMIC_WRITE_U32
-extern void pg_atomic_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val);
-
-#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
-extern bool pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
-												uint32 *expected, uint32 newval);
-
-#define PG_HAVE_ATOMIC_FETCH_ADD_U32
-extern uint32 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_);
-
-#endif /* PG_HAVE_ATOMIC_U32_SIMULATION */
-
-
-#ifdef PG_HAVE_ATOMIC_U64_SIMULATION
-
 #define PG_HAVE_ATOMIC_INIT_U64
 extern void pg_atomic_init_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val_);
 
@@ -153,4 +70,4 @@ extern bool pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 #define PG_HAVE_ATOMIC_FETCH_ADD_U64
 extern uint64 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_);
 
-#endif /* PG_HAVE_ATOMIC_U64_SIMULATION */
+#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
diff --git a/src/include/port/atomics/generic-gcc.h b/src/include/port/atomics/generic-gcc.h
index 872d2f02af..cfbcbe0fff 100644
--- a/src/include/port/atomics/generic-gcc.h
+++ b/src/include/port/atomics/generic-gcc.h
@@ -53,8 +53,6 @@
 #endif
 
 
-#ifdef HAVE_ATOMICS
-
 /* generic gcc based atomic flag implementation */
 #if !defined(PG_HAVE_ATOMIC_FLAG_SUPPORT) \
 	&& (defined(HAVE_GCC__SYNC_INT32_TAS) || defined(HAVE_GCC__SYNC_CHAR_TAS))
@@ -319,5 +317,3 @@ pg_atomic_fetch_or_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 or_)
 #endif
 
 #endif /* !defined(PG_DISABLE_64_BIT_ATOMICS) */
-
-#endif /* defined(HAVE_ATOMICS) */
diff --git a/src/include/port/atomics/generic-msvc.h b/src/include/port/atomics/generic-msvc.h
index c013aca5e7..677436f260 100644
--- a/src/include/port/atomics/generic-msvc.h
+++ b/src/include/port/atomics/generic-msvc.h
@@ -30,8 +30,6 @@
 #define pg_memory_barrier_impl()	MemoryBarrier()
 #endif
 
-#if defined(HAVE_ATOMICS)
-
 #define PG_HAVE_ATOMIC_U32_SUPPORT
 typedef struct pg_atomic_uint32
 {
@@ -115,5 +113,3 @@ pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 }
 
 #endif /* _WIN64 */
-
-#endif /* HAVE_ATOMICS */
diff --git a/src/include/port/atomics/generic-sunpro.h b/src/include/port/atomics/generic-sunpro.h
index 840a45e778..08f093ed2c 100644
--- a/src/include/port/atomics/generic-sunpro.h
+++ b/src/include/port/atomics/generic-sunpro.h
@@ -17,8 +17,6 @@
  * -------------------------------------------------------------------------
  */
 
-#if defined(HAVE_ATOMICS)
-
 #ifdef HAVE_MBARRIER_H
 #include <mbarrier.h>
 
@@ -66,10 +64,6 @@ typedef struct pg_atomic_uint64
 
 #endif /* HAVE_ATOMIC_H */
 
-#endif /* defined(HAVE_ATOMICS) */
-
-
-#if defined(HAVE_ATOMICS)
 
 #ifdef HAVE_ATOMIC_H
 
@@ -117,5 +111,3 @@ pg_atomic_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 newval)
 }
 
 #endif /* HAVE_ATOMIC_H */
-
-#endif /* defined(HAVE_ATOMICS) */
-- 
2.39.2

From e0a1081846cb907a71adf13050605478c71a827c Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 30 Jul 2024 06:27:57 +1200
Subject: [PATCH v2 3/4] Require compiler barrier support.

Previously we had a fallback implementation of pg_compiler_barrier()
that called an empty function across a translation unit boundary so the
compiler couldn't see what it did.  That might not work with a link time
optimizer.  Since we now require knowledge of how to implement atomics,
there shouldn't be any cases where we don't also know how to implement
compiler barriers.

Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
---
 src/backend/port/atomics.c          |  8 --------
 src/include/port/atomics.h          |  3 +++
 src/include/port/atomics/fallback.h | 15 ---------------
 3 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
index 6f1e014d0b..19a84a7849 100644
--- a/src/backend/port/atomics.c
+++ b/src/backend/port/atomics.c
@@ -40,14 +40,6 @@ pg_spinlock_barrier(void)
 }
 #endif
 
-#ifdef PG_HAVE_COMPILER_BARRIER_EMULATION
-void
-pg_extern_compiler_barrier(void)
-{
-	/* do nothing */
-}
-#endif
-
 
 #ifdef PG_HAVE_ATOMIC_U64_SIMULATION
 
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index c2ce10a718..edb0ae40dc 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -98,6 +98,9 @@
 #if !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
 #error "could not find an implementation of pg_atomic_uint32"
 #endif
+#if !defined(pg_compiler_barrier_impl)
+#error "could not find an implementation of pg_compiler_barrier"
+#endif
 
 /*
  * Provide a spinlock-based implementation of the 64 bit variants, if
diff --git a/src/include/port/atomics/fallback.h b/src/include/port/atomics/fallback.h
index 8ffd1a8fd3..9f83827d83 100644
--- a/src/include/port/atomics/fallback.h
+++ b/src/include/port/atomics/fallback.h
@@ -33,21 +33,6 @@ extern void pg_spinlock_barrier(void);
 #define pg_memory_barrier_impl pg_spinlock_barrier
 #endif
 
-#ifndef pg_compiler_barrier_impl
-/*
- * If the compiler/arch combination does not provide compiler barriers,
- * provide a fallback.  The fallback simply consists of a function call into
- * an externally defined function.  That should guarantee compiler barrier
- * semantics except for compilers that do inter translation unit/global
- * optimization - those better provide an actual compiler barrier.
- *
- * A native compiler barrier for sure is a lot faster than this...
- */
-#define PG_HAVE_COMPILER_BARRIER_EMULATION
-extern void pg_extern_compiler_barrier(void);
-#define pg_compiler_barrier_impl pg_extern_compiler_barrier
-#endif
-
 
 #if !defined(PG_HAVE_ATOMIC_U64_SUPPORT)
 
-- 
2.39.2

From 2ccc8fdaf2f1441d9ba2924451ffa53bf1a1f4a5 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 30 Jul 2024 07:14:59 +1200
Subject: [PATCH v2 4/4] Require memory barrier support.

Previously we had a fallback implementation that made a harmless system
call, on the theory that system calls must contain a memory barrier.
(Some of the comments removed refer to a spinlock implementation, but
that changed in 1b468a13 which left some stray comments.)

We don't require 'read' and 'write' barriers, falling back to full
memory barriers still.  Notably, MSVC relies on that, which is probably
incorrect XXX?

Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
---
 src/backend/port/atomics.c          | 23 -----------------------
 src/include/port/atomics.h          |  4 ++++
 src/include/port/atomics/fallback.h | 16 ----------------
 src/include/port/atomics/generic.h  | 10 ----------
 4 files changed, 4 insertions(+), 49 deletions(-)

diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
index 19a84a7849..f98f6b6dbd 100644
--- a/src/backend/port/atomics.c
+++ b/src/backend/port/atomics.c
@@ -17,29 +17,6 @@
 #include "port/atomics.h"
 #include "storage/spin.h"
 
-#ifdef PG_HAVE_MEMORY_BARRIER_EMULATION
-#ifdef WIN32
-#error "barriers are required (and provided) on WIN32 platforms"
-#endif
-#include <signal.h>
-#endif
-
-#ifdef PG_HAVE_MEMORY_BARRIER_EMULATION
-void
-pg_spinlock_barrier(void)
-{
-	/*
-	 * NB: we have to be reentrant here, some barriers are placed in signal
-	 * handlers.
-	 *
-	 * We use kill(0) for the fallback barrier as we assume that kernels on
-	 * systems old enough to require fallback barrier support will include an
-	 * appropriate barrier while checking the existence of the postmaster pid.
-	 */
-	(void) kill(PostmasterPid, 0);
-}
-#endif
-
 
 #ifdef PG_HAVE_ATOMIC_U64_SIMULATION
 
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index edb0ae40dc..c0c8688f73 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -101,6 +101,10 @@
 #if !defined(pg_compiler_barrier_impl)
 #error "could not find an implementation of pg_compiler_barrier"
 #endif
+#if !defined(pg_memory_barrier_impl)
+#error "could not find an implementation of pg_memory_barrier_impl"
+#endif
+
 
 /*
  * Provide a spinlock-based implementation of the 64 bit variants, if
diff --git a/src/include/port/atomics/fallback.h b/src/include/port/atomics/fallback.h
index 9f83827d83..2c0eb28768 100644
--- a/src/include/port/atomics/fallback.h
+++ b/src/include/port/atomics/fallback.h
@@ -17,22 +17,6 @@
 #	error "should be included via atomics.h"
 #endif
 
-#ifndef pg_memory_barrier_impl
-/*
- * If we have no memory barrier implementation for this architecture, we
- * fall back to acquiring and releasing a spinlock.
- *
- * It's not self-evident that every possible legal implementation of a
- * spinlock acquire-and-release would be equivalent to a full memory barrier.
- * For example, I'm not sure that Itanium's acq and rel add up to a full
- * fence.  But all of our actual implementations seem OK in this regard.
- */
-#define PG_HAVE_MEMORY_BARRIER_EMULATION
-
-extern void pg_spinlock_barrier(void);
-#define pg_memory_barrier_impl pg_spinlock_barrier
-#endif
-
 
 #if !defined(PG_HAVE_ATOMIC_U64_SUPPORT)
 
diff --git a/src/include/port/atomics/generic.h b/src/include/port/atomics/generic.h
index 6113ab62a3..b636f95142 100644
--- a/src/include/port/atomics/generic.h
+++ b/src/include/port/atomics/generic.h
@@ -135,19 +135,9 @@ pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
 static inline void
 pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr)
 {
-	/*
-	 * Use a memory barrier + plain write if we have a native memory
-	 * barrier. But don't do so if memory barriers use spinlocks - that'd lead
-	 * to circularity if flags are used to implement spinlocks.
-	 */
-#ifndef PG_HAVE_MEMORY_BARRIER_EMULATION
 	/* XXX: release semantics suffice? */
 	pg_memory_barrier_impl();
 	pg_atomic_write_u32_impl(ptr, 0);
-#else
-	uint32 value = 1;
-	pg_atomic_compare_exchange_u32_impl(ptr, &value, 0);
-#endif
 }
 
 #elif !defined(PG_HAVE_ATOMIC_TEST_SET_FLAG)
-- 
2.39.2

Reply via email to