It looks like this patch was developed concurrently with the switch from pg_attribute_aligned to C11 standard alignas, and it ended up being committed with the "old" style. I propose the attached patch to update that. It's only a stylistic change, but good for consistency and as an example for future code.

On 03.04.26 10:22, John Naylor wrote:
On Thu, Apr 2, 2026 at 11:17 PM Nathan Bossart <[email protected]> wrote:

On Thu, Apr 02, 2026 at 10:53:24AM -0500, Nathan Bossart wrote:
I think the new pg_comp_crc32_choose() is infinitely recursing on macOS
because USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK is not defined but
pg_crc32c_armv8_available() returns false.  If I trace through that
function, I see that it's going straight to the

       #else
               return false;
       #endif

at the end.  And sure enough, both HAVE_ELF_AUX_INFO and HAVE_GETAUXVAL

Ah of course.

aren't defined in pg_config.h.  I think we might need to use sysctlbyname()
to determine PMULL support on macOS, but at this stage of the development
cycle, I would probably lean towards just compiling in the sb8
implementation.

Hm.  On second thought, that probably regresses macOS builds because it was
presumably using the armv8 path without runtime checks before...

I went with the following for v5, and it passes MacOS on my Github CI:

+  /* set fallbacks */
+#ifdef USE_ARMV8_CRC32C
+  /* On e.g. MacOS, our runtime feature detection doesn't work */
+  pg_comp_crc32c = pg_comp_crc32c_armv8;
+#else
+  pg_comp_crc32c = pg_comp_crc32c_sb8;
+#endif
+ [...crc and pmull checks]

That should keep scalar hardware support working, but now it'll only
use direct calls for constant inputs.

I also did some benchmarking on an ARM Neoverse N1 / gcc 8.3
(attached). There the vector loop still works well all the way down to
the minimum input size of 64 bytes, and on long inputs it's almost
twice as fast as scalar. For reproduceability, I slightly modified the
benchmark we used last year, to make sure the input is aligned
(attached but not for CI). In the end, I want to add a length check so
that inputs smaller than 80 bytes go straight to the scalar path.
Above 80, after alignment adjustments in the preamble, that still
guarantees at least one loop iteration in the vector path.

--
John Naylor
Amazon Web Services
From de2c64985662e0a788da5a4ed9f4e7358ed544f4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <[email protected]>
Date: Mon, 29 Jun 2026 10:46:51 +0200
Subject: [PATCH] Use C11 alignas instead of pg_attribute_aligned

Replace pg_attribute_aligned with C11 alignas, for consistency with
current conventions.

Discussion: 
https://postgr.es/m/CANWCAZaKhE+RD5KKouUFoxx1EbUNrNhcduM1VQ=dksdadne...@mail.gmail.com
---
 src/port/pg_crc32c_armv8.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/port/pg_crc32c_armv8.c b/src/port/pg_crc32c_armv8.c
index 8cf838a510a..8fad2f83c24 100644
--- a/src/port/pg_crc32c_armv8.c
+++ b/src/port/pg_crc32c_armv8.c
@@ -166,7 +166,7 @@ pg_comp_crc32c_pmull(pg_crc32c crc, const void *data, 
size_t len)
                uint64x2_t      k;
 
                {
-                       static const uint64_t pg_attribute_aligned(16) k_[] = 
{0x740eef02, 0x9e4addf8};
+                       static const alignas(16) uint64_t k_[] = {0x740eef02, 
0x9e4addf8};
 
                        k = vld1q_u64(k_);
                }
@@ -192,14 +192,14 @@ pg_comp_crc32c_pmull(pg_crc32c crc, const void *data, 
size_t len)
 
                /* Reduce x0 ... x3 to just x0. */
                {
-                       static const uint64_t pg_attribute_aligned(16) k_[] = 
{0xf20c0dfe, 0x493c7d27};
+                       static const alignas(16) uint64_t k_[] = {0xf20c0dfe, 
0x493c7d27};
 
                        k = vld1q_u64(k_);
                }
                y0 = clmul_lo_e(x0, k, x1), x0 = clmul_hi_e(x0, k, y0);
                y2 = clmul_lo_e(x2, k, x3), x2 = clmul_hi_e(x2, k, y2);
                {
-                       static const uint64_t pg_attribute_aligned(16) k_[] = 
{0x3da6d0cb, 0xba4fc28e};
+                       static const alignas(16) uint64_t k_[] = {0x3da6d0cb, 
0xba4fc28e};
 
                        k = vld1q_u64(k_);
                }
-- 
2.54.0

Reply via email to