https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618
--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Wilco Dijkstra <wi...@gcc.gnu.org>:

https://gcc.gnu.org/g:19b23bf3c32df3cbb96b3d898a1d7142f7bea4a0

commit r14-9373-g19b23bf3c32df3cbb96b3d898a1d7142f7bea4a0
Author: Wilco Dijkstra <wilco.dijks...@arm.com>
Date:   Wed Feb 21 23:33:58 2024 +0000

    AArch64: memcpy/memset expansions should not emit LDP/STP [PR113618]

    The new RTL introduced for LDP/STP results in regressions due to use of
    UNSPEC. Given the new LDP fusion pass is good at finding LDP opportunities,
    change the memcpy, memmove and memset expansions to emit single vector
    loads/stores. This fixes the regression and enables more RTL optimization
    on the standard memory accesses. Handling of unaligned tail of
    memcpy/memmove is improved with -mgeneral-regs-only. SPEC2017 performance
    improves slightly. Codesize is a bit worse due to missed LDP opportunities
    as discussed in the PR.

    gcc/ChangeLog:
            PR target/113618
            * config/aarch64/aarch64.cc (aarch64_copy_one_block): Remove.
            (aarch64_expand_cpymem): Emit single load/store only.
            (aarch64_set_one_block): Emit single stores only.

    gcc/testsuite/ChangeLog:
            PR target/113618
            * gcc.target/aarch64/pr113618.c: New test.
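
For readers who want to look at the affected expansions themselves, the following is a minimal sketch of the kind of source that goes through the inline memcpy/memset expansion paths named in the ChangeLog. It is not the contents of gcc.target/aarch64/pr113618.c; the struct name, the function names and the 48-byte size are all hypothetical and chosen only for illustration. Compiling it for AArch64 with something like `gcc -O2 -S`, before and after this commit, should show how the loads and stores are emitted, though the exact instructions will depend on the compiler revision and options.

```c
#include <string.h>

/* Hypothetical example, NOT the actual pr113618.c test case.
   A small, constant-size block: the kind of copy/set the compiler
   can expand inline instead of calling the library routine.  */
struct block48 { unsigned char bytes[48]; };

/* Constant-size copy; on AArch64 this is a candidate for the
   inline memcpy expansion (aarch64_expand_cpymem).  */
void copy_block48(struct block48 *dst, const struct block48 *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Constant-size clear; a candidate for the inline memset expansion.  */
void clear_block48(struct block48 *dst)
{
    memset(dst, 0, sizeof *dst);
}
```

Adding -mgeneral-regs-only to the command line is one way to look at the unaligned-tail handling that the commit message mentions.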