https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101390
Bug ID: 101390
Summary: Expand vector mod as vector div + multiply-subtract
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---

When the target supports an sdiv/udiv pattern for vector modes, we could
synthesise a vector modulus operation using the division and a
multiply-subtract operation.

#define N 128

extern signed int si_a[N], si_b[N], si_c[N];

void test_si ()
{
  for (int i = 0; i < N; i++)
    si_c[i] = si_a[i] % si_b[i];
}

AArch64 SVE (but not Neon) has vector SDIV/UDIV instructions, so for this
loop GCC could generate:

.L2:
        ld1w    z2.s, p0/z, [x4, x0, lsl 2]
        ld1w    z1.s, p0/z, [x3, x0, lsl 2]
        movprfx z0, z2
        sdiv    z0.s, p1/m, z0.s, z1.s
        msb     z0.s, p1/m, z1.s, z2.s
        st1w    z0.s, p0, [x1, x0, lsl 2]
        incw    x0
        whilelo p0.s, w0, w2
        b.any   .L2

Here sdiv computes the quotient a / b in z0, and msb then computes
a - (a / b) * b, which is the remainder.

This can be achieved by implementing the smod and umod optabs in the aarch64
backend for SVE, but it is a generic transformation, so it could be handled
more generally in vect_recog_divmod_pattern and/or the vector lowering code
so that more targets can benefit.
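
For illustration only, the proposed expansion can be written out by hand at the source level. This is a minimal sketch of the identity a % b == a - (a / b) * b that the transformation relies on, not the code the vectorizer would emit. The function name mod_via_div_msub and its parameters are hypothetical. It assumes every b[i] is non-zero and that a[i] / b[i] does not overflow, which are the same preconditions as for the plain % operator.

```c
#include <stddef.h>

/* Hypothetical helper: computes c[i] = a[i] % b[i] without using '%'.
   Assumes b[i] != 0 and that a[i] / b[i] does not overflow.  */
void mod_via_div_msub (const int *a, const int *b, int *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Division step: truncating quotient (maps to SDIV on SVE).  */
      int q = a[i] / b[i];
      /* Multiply-subtract step: a - q * b (maps to MSB on SVE).  */
      c[i] = a[i] - q * b[i];
    }
}
```

Because C division truncates towards zero, the result has the same sign as the dividend, matching the % operator for signed operands.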