https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122069
--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <[email protected]>:

https://gcc.gnu.org/g:c8dc5d5070c09792bf8d224cac90989885818aaf

commit r16-4477-gc8dc5d5070c09792bf8d224cac90989885818aaf
Author: Tamar Christina <[email protected]>
Date:   Sat Oct 18 08:20:07 2025 +0100

    AArch64: add double widen_sum optab using dotprod for Adv.SIMD [PR122069]

    This patch implements support for using dotproduct to do sum reductions by
    changing += a into += (a * 1).  i.e. we seed the multiplication with 1.

    Given the example

    int foo_int(unsigned char *x, unsigned char * restrict y) {
      int sum = 0;
      for (int i = 0; i < 8000; i++)
        sum += char_abs(x[i] - y[i]);
      return sum;
    }

    we used to generate

    .L2:
            ldr     q0, [x0, x2]
            ldr     q28, [x1, x2]
            sub     v28.16b, v0.16b, v28.16b
            zip1    v29.16b, v28.16b, v31.16b
            zip2    v28.16b, v28.16b, v31.16b
            uaddw   v30.4s, v30.4s, v29.4h
            uaddw2  v30.4s, v30.4s, v29.8h
            uaddw   v30.4s, v30.4s, v28.4h
            uaddw2  v30.4s, v30.4s, v28.8h
            add     x2, x2, 16
            cmp     x2, x3
            bne     .L2
            addv    s31, v30.4s

    but now generates with +dotprod

    .L2:
            ldr     q29, [x0, x2]
            ldr     q28, [x1, x2]
            sub     v28.16b, v29.16b, v28.16b
            udot    v31.4s, v28.16b, v30.16b
            add     x2, x2, 16
            cmp     x2, x3
            bne     .L2
            addv    s31, v31.4s

    gcc/ChangeLog:

            PR middle-end/122069
            * config/aarch64/aarch64-simd.md (widen_ssum<mode><vsi2qi>3): New.
            (widen_usum<mode><vsi2qi>3): New.

    gcc/testsuite/ChangeLog:

            PR middle-end/122069
            * gcc.target/aarch64/pr122069_3.c: New test.
            * gcc.target/aarch64/pr122069_4.c: New test.
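
For illustration only, the "+= a into += (a * 1)" rewrite described in the commit message can be modelled in scalar C. This is a rough sketch, not code from the patch: the function names (widen_sum_u8, udot_model, widen_sum_via_dot) are hypothetical, the input length is assumed to be a multiple of 16, and udot_model only imitates the per-lane arithmetic of a 128-bit UDOT (each 32-bit lane accumulates four unsigned byte products).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: plain widening sum, each byte zero-extended into a 32-bit total. */
static uint32_t widen_sum_u8(const uint8_t *a, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Scalar model of one UDOT step on 16 bytes (hypothetical helper):
   lane j accumulates the products of bytes 4j .. 4j+3. */
static void udot_model(uint32_t acc[4], const uint8_t a[16], const uint8_t b[16])
{
    for (int lane = 0; lane < 4; lane++)
        for (int k = 0; k < 4; k++)
            acc[lane] += (uint32_t)a[4 * lane + k] * b[4 * lane + k];
}

/* Same reduction expressed as a dot product against a vector of ones.
   Assumption: n is a multiple of 16 (no scalar tail is handled here). */
static uint32_t widen_sum_via_dot(const uint8_t *a, size_t n)
{
    uint8_t ones[16];
    uint32_t acc[4] = { 0, 0, 0, 0 };

    assert(n % 16 == 0);
    for (int i = 0; i < 16; i++)
        ones[i] = 1;                      /* the multiplication is seeded with 1 */

    for (size_t i = 0; i < n; i += 16)
        udot_model(acc, a + i, ones);     /* one accumulate step per 16 bytes */

    /* Final cross-lane add, playing the role of the addv after the loop. */
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Because every product is a[i] * 1, each lane simply accumulates four zero-extended bytes, so the two functions should agree for any input whose total fits in 32 bits. In the generated code quoted above, that is what lets a single udot replace the zip1/zip2 and four uaddw/uaddw2 instructions per iteration.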
