https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110773

            Bug ID: 110773
           Summary: [Aarch64] crash (SIGBUS) due to atomic instructions on
                    under-aligned memory
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: scw-gcc at google dot com
  Target Milestone: ---

This reproduces with every ARM64 GCC version available on godbolt, going back
to 5.4.

The following code snippet has two copies of 4-byte-aligned, 8-byte-sized
objects `fp1` and `fp2`. Their placements in `storage` guarantee that they are
12 bytes apart, and thus one would be 8-byte aligned and one would not (it'd
still be 4-byte aligned, though).

=====
struct FloatPair {
    float f1;
    float f2;
};

struct Storage {
    FloatPair fp1;
    float padding;
    FloatPair fp2;
} storage;

float f() {
    FloatPair fp1, fp2;
    __atomic_load(&storage.fp1, &fp1, __ATOMIC_SEQ_CST);
    __atomic_load(&storage.fp2, &fp2, __ATOMIC_SEQ_CST);
    return fp1.f1 + fp1.f2 + fp2.f1 + fp2.f2;
}
=====
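
As a sanity check on the layout claim, here is a minimal compile-time sketch.
It assumes a 4-byte `float` with 4-byte alignment (true on AArch64 Linux, but
an assumption in general); the static_asserts are my addition and not part of
the reproducer.

```cpp
#include <cstddef>  // offsetof

struct FloatPair {
    float f1;
    float f2;
};

struct Storage {
    FloatPair fp1;
    float padding;
    FloatPair fp2;
};

// Assumption: float is 4 bytes with 4-byte alignment.
static_assert(sizeof(FloatPair) == 8, "FloatPair is 8 bytes in size");
static_assert(alignof(FloatPair) == 4, "FloatPair is only 4-byte aligned");
static_assert(alignof(Storage) == 4, "Storage is only 4-byte aligned");

// fp1 and fp2 are 12 bytes apart, so their addresses differ by 4 modulo 8.
static_assert(offsetof(Storage, fp1) == 0, "fp1 sits at offset 0");
static_assert(offsetof(Storage, fp2) == 12, "fp2 sits at offset 12");
static_assert((offsetof(Storage, fp2) - offsetof(Storage, fp1)) % 8 == 4,
              "exactly one of fp1/fp2 can be 8-byte aligned");
```

Since `storage` itself is only guaranteed 4-byte alignment, its address is 0 or
4 modulo 8, and in either case exactly one of the two members ends up 8-byte
aligned.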

Godbolt with GCC and Clang: https://godbolt.org/z/P9rbTePnG

GCC uses two `ldar` instructions for the loads while Clang makes calls to
libatomic. The GCC codegen crashes on AArch64 machines (tested on Cavium
ThunderX2 as well as Neoverse-N1).

AArch64 allows unaligned memory access, except for atomic operations:
https://developer.arm.com/documentation/ddi0596/2021-03/Shared-Pseudocode/AArch64-Functions?lang=en#AArch64.CheckAlignment.4

For memory reads, the pseudocode uses the access size as the required alignment
(unless it's operating on a pair of 64-bit registers):
https://developer.arm.com/documentation/ddi0596/2021-03/Shared-Pseudocode/AArch64-Functions?lang=en#impl-aarch64.Mem.read.3

In this case, `FloatPair` has size 8 and fits in one 64-bit register, so the
64-bit `ldar` can only be used on 8-byte-aligned reads. One of the two calls
will thus violate the alignment requirement.
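
To make the requirement concrete, here is a rough sketch of the check that
would have to hold before a single N-byte atomic instruction is safe. The
helper names `is_naturally_aligned` and `assert_atomic_alignment` are
hypothetical (they are not GCC or libatomic APIs), and this only illustrates
the condition; it is not how GCC or Clang decide on codegen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `p` is aligned to `size` bytes.
// `size` must be a power of two (1, 2, 4, 8, 16).
inline bool is_naturally_aligned(const void* p, std::size_t size) {
    return (reinterpret_cast<std::uintptr_t>(p) & (size - 1)) == 0;
}

// Hypothetical debug-build guard to put in front of an N-byte atomic access.
// With NDEBUG defined the assert compiles away and nothing is checked.
inline void assert_atomic_alignment(const void* p, std::size_t size) {
    assert(is_naturally_aligned(p, size) &&
           "atomic access would be under-aligned");
}
```

For the reproducer, calling `assert_atomic_alignment(&storage.fp1, 8)` and
`assert_atomic_alignment(&storage.fp2, 8)` would fail for exactly one of the
two members, whichever one lands on an address that is 4 modulo 8.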

Potentially related: GCC also uses single atomic RMW instructions when
available, regardless of alignment. Triggering this requires constructing an
unaligned pointer, so I'm not sure whether it counts as a problem.

=====
struct Storage {
    char c;
    int i;
} __attribute__((packed))
storage;

int inc() {
    return __atomic_add_fetch(&storage.i, 1, __ATOMIC_SEQ_CST);
}
=====

Needs -march=armv8.1-a (LSE) to reproduce: https://godbolt.org/z/qKM1fGbrj
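
For reference, a small compile-time sketch of why `storage.i` is under-aligned
in the packed case. The struct is renamed to `PackedStorage` here purely for
illustration, and the checks assume a 4-byte `int` and the GCC/Clang `packed`
attribute.

```cpp
#include <cstddef>  // offsetof

// Same layout as the packed `Storage` above; renamed for illustration only.
struct PackedStorage {
    char c;
    int i;
} __attribute__((packed));

// Assumption: int is 4 bytes with 4-byte natural alignment.
static_assert(sizeof(PackedStorage) == 5, "no padding between c and i");
static_assert(alignof(PackedStorage) == 1, "packed drops alignment to 1");
static_assert(offsetof(PackedStorage, i) == 1, "i starts at byte offset 1");
```

With `i` at offset 1 and the struct itself only 1-byte aligned, nothing
guarantees that `&storage.i` is a multiple of 4, which is what a single 4-byte
atomic instruction requires.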

In my testing, though, even the Clang codegen still crashes; the crash is
inside libatomic.
