https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110100
Bug ID: 110100
Summary: __builtin_aarch64_st64b stores to the wrong address
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ianthompson at microsoft dot com
Target Milestone: ---

Created attachment 55247
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55247&action=edit
Preprocessed reproduction, compile with "aarch64-none-elf-gcc -march=armv8.7-a -c"

On aarch64, the __arm_st64b function (which is just a thin wrapper around
__builtin_aarch64_st64b) produces incorrect code which uses an uninitialized
register as the address for an ST64B store instruction.

A full preprocessed file is attached, but this snippet is sufficient to
reproduce:

#include <arm_acle.h>

void do_st64b(data512_t data) {
  __arm_st64b((void*)0x10000000, data);
}

Compiling with "aarch64-none-elf-gcc -O2 -march=armv8.7-a", I get the following
assembly snippet:

do_st64b:
        ldp     x8, x9, [x0]
        ldp     x10, x11, [x0, 16]
        ldp     x12, x13, [x0, 32]
        ldp     x14, x15, [x0, 48]
        st64b   x8, [x1]
        ret

Notice that the st64b instruction uses the uninitialized register x1 as an
address, and the constant 0x10000000 is not loaded. I turned optimizations on
to keep the assembly small, but the same issue also occurs without
optimizations.

Digging a bit into aarch64-builtins.cc, it seems like
AARCH64_LS64_BUILTIN_ST64B incorrectly declares the address register an output
instead of an input, leading to the uninitialized register seen above. The
builtins for LD64B, ST64BV, and ST64BV0 appear to be fine, it's just ST64B which
has this issue.

Initially discovered on Arm's build of the toolchain, but I can also reproduce
on trunk and an official 13.1.0 release:

aarch64-none-elf-gcc (Arm GNU Toolchain 12.2.Rel1 (Build arm-12.24)) 12.2.1 20221205
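
For illustration only, here is a rough sketch of the operand direction the report says the builtin gets wrong, written as GCC extended inline assembly rather than as a change to aarch64-builtins.cc. The address is passed as an input operand, so the compiler has to materialize it in a register before the store. The names block512_t and store_block512 are hypothetical and are not part of ACLE or GCC; block512_t only mirrors the eight 64-bit words of data512_t. The memcpy fallback is an assumption added so the sketch also builds on targets without LS64. It has not been tested on LS64 hardware and is not a verified workaround.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for data512_t from <arm_acle.h>: eight 64-bit words. */
typedef struct { uint64_t val[8]; } block512_t;

static_assert(sizeof(block512_t) == 64, "block512_t must be exactly 64 bytes");

/* Hypothetical helper, not part of ACLE or GCC. */
static inline void store_block512(void *addr, const block512_t *data)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_LS64)
    /* ST64B reads eight consecutive registers starting at an even-numbered
       one. Pin the data to x8..x15, matching the assembly in the report. */
    register uint64_t r8  __asm__("x8")  = data->val[0];
    register uint64_t r9  __asm__("x9")  = data->val[1];
    register uint64_t r10 __asm__("x10") = data->val[2];
    register uint64_t r11 __asm__("x11") = data->val[3];
    register uint64_t r12 __asm__("x12") = data->val[4];
    register uint64_t r13 __asm__("x13") = data->val[5];
    register uint64_t r14 __asm__("x14") = data->val[6];
    register uint64_t r15 __asm__("x15") = data->val[7];

    __asm__ volatile("st64b x8, [%0]"
                     : /* no outputs: the address is only read */
                     : "r"(addr), /* address as an INPUT operand */
                       "r"(r8), "r"(r9), "r"(r10), "r"(r11),
                       "r"(r12), "r"(r13), "r"(r14), "r"(r15)
                     : "memory");
#else
    /* Assumed fallback so the sketch builds on targets without LS64. */
    memcpy(addr, data, sizeof *data);
#endif
}
```

With the address listed among the inputs, the compiler cannot leave its register uninitialized, which is the failure shown in the st64b x8, [x1] line above.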