https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110773
Bug ID: 110773
Summary: [Aarch64] crash (SIGBUS) due to atomic instructions on under-aligned memory
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: scw-gcc at google dot com
Target Milestone: ---

This reproduces in versions as far back as godbolt has ARM64 gcc (5.4).

The following code snippet has two 4-byte-aligned, 8-byte-sized objects, `fp1` and `fp2`. Their placement in `storage` guarantees that they are 12 bytes apart, so one is 8-byte aligned and the other is not (it is still 4-byte aligned, though).

=====
struct FloatPair {
  float f1;
  float f2;
};

struct Storage {
  FloatPair fp1;
  float padding;
  FloatPair fp2;
} storage;

float f() {
  FloatPair fp1, fp2;
  __atomic_load(&storage.fp1, &fp1, __ATOMIC_SEQ_CST);
  __atomic_load(&storage.fp2, &fp2, __ATOMIC_SEQ_CST);
  return fp1.f1 + fp1.f2 + fp2.f1 + fp2.f2;
}
=====

Godbolt with GCC and Clang: https://godbolt.org/z/P9rbTePnG

GCC uses two `ldar` instructions for the loads while Clang makes calls to libatomic. The GCC codegen crashes on AArch64 machines (tested on Cavium ThunderX2 as well as Neoverse-N1).

AArch64 allows unaligned memory access, except for atomic operations:
https://developer.arm.com/documentation/ddi0596/2021-03/Shared-Pseudocode/AArch64-Functions?lang=en#AArch64.CheckAlignment.4

For memory reads, it uses the size as the alignment (unless it's operating on a pair of 64-bit registers):
https://developer.arm.com/documentation/ddi0596/2021-03/Shared-Pseudocode/AArch64-Functions?lang=en#impl-aarch64.Mem.read.3

In this case, `FloatPair` has size 8 and fits in one 64-bit register, so the 64-bit `ldar` can only be used on 8-byte-aligned reads. One of the two calls will thus violate the alignment requirement.

Potentially related, GCC also uses single atomic RMW instructions when available regardless of alignment.
To trigger this, one has to construct unaligned pointers, so I'm not sure if it's a problem.

=====
struct Storage {
  char c;
  int i;
} __attribute__((packed)) storage;

int inc() {
  return __atomic_add_fetch(&storage.i, 1, __ATOMIC_SEQ_CST);
}
=====

Needs -march=armv8.1-a (LSE) to reproduce: https://godbolt.org/z/qKM1fGbrj

In my testing, even with the Clang codegen, it still crashes inside libatomic, though.
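
For reference, the layout claim behind the first example (two `FloatPair` members 12 bytes apart, so exactly one of them is 8-byte aligned) can be checked with a small sketch like the one below. It is not part of the reproducer. It assumes `float` is 4 bytes with 4-byte alignment, and the helpers `is_aligned`, `count_8_byte_aligned` and `check_report_claim` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same layout as the first example in the report.
struct FloatPair { float f1; float f2; };
struct Storage { FloatPair fp1; float padding; FloatPair fp2; };

// Assumption: float is 4 bytes with 4-byte alignment (as on AArch64).
static_assert(sizeof(FloatPair) == 8, "FloatPair fits in one 64-bit register");
static_assert(alignof(FloatPair) == 4, "FloatPair is only 4-byte aligned");
static_assert(offsetof(Storage, fp2) - offsetof(Storage, fp1) == 12,
              "fp1 and fp2 are 12 bytes apart");

// Hypothetical helper: true if the address of p is a multiple of n bytes.
inline bool is_aligned(const void *p, std::size_t n) {
  return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Hypothetical helper: how many of the two members are 8-byte aligned.
inline int count_8_byte_aligned(const Storage &s) {
  return int(is_aligned(&s.fp1, 8)) + int(is_aligned(&s.fp2, 8));
}

// Storage is 4-byte aligned, so its address is either 0 or 4 modulo 8.
// With the members 12 bytes apart, exactly one of them is 8-byte aligned
// in both cases, so one of the two 64-bit atomic loads is misaligned.
inline void check_report_claim() {
  static Storage storage;
  assert(count_8_byte_aligned(storage) == 1);
}
```

Calling `check_report_claim()` from any function should pass regardless of where the linker places the object, which is why one of the two `ldar` loads is expected to fault no matter which member happens to be the aligned one.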