https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125795
--- Comment #13 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-14 branch has been updated by Kyrylo Tkachov <[email protected]>:

https://gcc.gnu.org/g:49deb28ce2c8f8442cdc33d318bf25d1c54c9ae4

commit r14-12674-g49deb28ce2c8f8442cdc33d318bf25d1c54c9ae4
Author: Kyrylo Tkachov <[email protected]>
Date:   Mon Jun 15 04:53:34 2026 -0700

    aarch64: Fix early-ra wrong code with full-width FPR color groups [PR125795]

    early_ra::allocate_colors marks the FPRs occupied by a color with

      m_allocated_fprs |= ((1U << color->group->size) - 1) << best;

    When a color group spans the whole register file (size == 32), as can
    happen for a heavily unrolled, vectorized loop, "1U << 32" is undefined
    and evaluates to 1 on AArch64 hosts, so the expression sets no bits at
    all.  The 32 FPRs of the group are therefore not recorded as allocated.
    Subsequent colors (and broaden_colors) then reuse those registers, which
    breaks the invariant that distinct colors receive disjoint FPRs.

    In PR125795 this let the loop-invariant TBL permute index, which is live
    across the whole loop, share v28 with the LD2 tuple destinations, so the
    index was clobbered mid-loop and the loop produced wrong results.

    Fix this by using a 64-bit shift base: unsigned long long is at least
    64 bits on every host, so "1ULL << 32" is well-defined.  best + size <= 32
    is guaranteed by the candidate search, which the patch also asserts, so
    the result still fits in the 32-bit m_allocated_fprs.

    When the full-width group can no longer be hidden, allocate_colors
    correctly fails to find a register for the other color and the region is
    left to the real register allocator, matching -mearly-ra=none.

    Bootstrapped and tested on aarch64-none-linux-gnu.
    Pushing to trunk and later to the branches after testing.

    Signed-off-by: Kyrylo Tkachov <[email protected]>

    gcc/ChangeLog:

            PR target/125795
            * config/aarch64/aarch64-early-ra.cc (early_ra::allocate_colors):
            Compute the allocated-FPR mask as
            ((1ULL << color->group->size) - 1) << best.

    gcc/testsuite/ChangeLog:

            PR target/125795
            * gcc.target/aarch64/pr125795.c: New test.

    (cherry picked from commit 39de311c74d13949feab1fc9fe45654e0219b065)
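
For readers who want to see the arithmetic in isolation, the mask computation described in the commit message can be sketched as a standalone function. This is only an illustration, not the code in aarch64-early-ra.cc: the helper name fpr_mask and its parameters are hypothetical, and the assertion simply mirrors the "best + size <= 32" invariant that the commit message says the candidate search guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of GCC): returns the bit mask covering
// `size` consecutive FPRs starting at register number `best`.
static uint32_t fpr_mask(unsigned size, unsigned best)
{
    // Assumed precondition, mirroring the invariant named in the commit
    // message: the group must fit inside the 32-entry FPR file.
    assert(size >= 1 && best + size <= 32);

    // unsigned long long is at least 64 bits wide, so a shift by 32 is
    // well-defined here.  With a 32-bit "1U" base, size == 32 would be
    // undefined behavior.
    unsigned long long bits = (1ULL << size) - 1;

    // best + size <= 32, so the shifted value fits in 32 bits and the
    // narrowing conversion loses nothing.
    return static_cast<uint32_t>(bits << best);
}
```

With this sketch, fpr_mask(32, 0) covers all 32 registers (0xFFFFFFFF), and fpr_mask(2, 28) covers v28 and v29 (0x30000000). The 32-bit form of the same expression has no defined result when size is 32, which is the case the commit fixes.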
