https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122722

            Bug ID: 122722
           Summary: Fail to SLP vectorize in-order reduction pairs
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

void foo (double * __restrict sums, double *a, double *b, int n)
{
  for (int i = 0; i < n; ++i)
    {
      sums[0] = sums[0] + a[2*i];
      sums[1] = sums[1] + a[2*i+1];
      sums[2] = sums[2] + b[2*i];
      sums[3] = sums[3] + b[2*i+1];
    }
}

The loop above should be vectorizable with V2DFmode, in pairs for 'a' and
'b', even when using in-order reductions.  But SLP discovery for the SLP
reduction covering all four reductions fails, and we fall back to single-lane
reductions, which is not profitable.
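
For illustration only, here is a hand-written sketch of the shape one might
expect from such a V2DFmode vectorization, using GCC's generic vector
extension.  The name foo_v2df_sketch, the v2df typedef and the const
qualifiers are assumptions made for this example; this is not what the
vectorizer emits.  Each lane still accumulates its own sum in the original
iteration order, so no reassociation should be needed.

```c
#include <string.h>

/* Hypothetical two-lane double vector type (GCC generic vector extension).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hand-written equivalent of foo: one V2DF accumulator for the 'a' pair
   and one for the 'b' pair.  Per lane, additions happen in the same order
   as in the scalar loop.  */
void foo_v2df_sketch (double * __restrict sums, const double *a,
                      const double *b, int n)
{
  v2df acc_a, acc_b, va, vb;

  /* memcpy is used for unaligned vector loads and stores.  */
  memcpy (&acc_a, sums, sizeof acc_a);      /* { sums[0], sums[1] } */
  memcpy (&acc_b, sums + 2, sizeof acc_b);  /* { sums[2], sums[3] } */

  for (int i = 0; i < n; ++i)
    {
      memcpy (&va, a + 2*i, sizeof va);     /* { a[2*i], a[2*i+1] } */
      memcpy (&vb, b + 2*i, sizeof vb);     /* { b[2*i], b[2*i+1] } */
      acc_a += va;
      acc_b += vb;
    }

  memcpy (sums, &acc_a, sizeof acc_a);
  memcpy (sums + 2, &acc_b, sizeof acc_b);
}
```

Keeping the accumulators in registers across the loop relies on 'sums' not
aliasing 'a' or 'b', which the __restrict qualifier in the testcase provides.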

With -ffast-math we get a profitable reduction, but with too high a
vectorization factor and with interleaving required for the memory accesses.
