https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110935

            Bug ID: 110935
           Summary: Missed BB reduction vectorization because of missed
                    eliding of a permute
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

double vals[16];
double test ()
{
  vals[0]++;
  return vals[2] + vals[4] + vals[1] + vals[3];
}

The reduction above is not vectorized with -ffast-math because

t.c:5:38: note:   === vect_slp_analyze_operations ===
t.c:5:38: note:   ==> examining statement: _8 = vals[3];
t.c:5:38: missed:   BB vectorization with gaps at the end of a load is not supported
t.c:5:44: missed:   not vectorized: relevant stmt not supported: _8 = vals[3];
t.c:5:38: note:   removing SLP instance operations starting from: _11 = _7 + _8;
t.c:5:38: missed:  not vectorized: bad operation in basic block.

We fail to elide the load permutation (BB vectorization allows loading a
consecutive sub-set):

t.c:5:38: note:   Final SLP tree for instance 0x51c8d60:
t.c:5:38: note:   node 0x5285860 (max_nunits=2, refcnt=2) vector(2) double
t.c:5:38: note:   op template: _8 = vals[3];
t.c:5:38: note:         stmt 0 _8 = vals[3];
t.c:5:38: note:         stmt 1 _6 = vals[1];
t.c:5:38: note:         stmt 2 _3 = vals[2];
t.c:5:38: note:         stmt 3 _4 = vals[4];
t.c:5:38: note:         load permutation { 3 1 2 4 }
t.c:5:38: note:    === vect_match_slp_patterns ===
t.c:5:38: note:    Analyzing SLP tree 0x5285860 for patterns
t.c:5:38: note:  SLP optimize permutations:
t.c:5:38: note:    1: { 2, 0, 1, 3 }
t.c:5:38: note:  SLP optimize partitions:
t.c:5:38: note:    -------------
t.c:5:38: note:    partition 0 (layout 0):
t.c:5:38: note:      nodes:
t.c:5:38: note:        - 0x5285860:
t.c:5:38: note:            weight: 1.000000
t.c:5:38: note:            op template: _8 = vals[3];
t.c:5:38: note:      edges:
t.c:5:38: note:      layout 0: (*)
t.c:5:38: note:          {depth: 0.000000, total: 0.000000}
t.c:5:38: note:        + {depth: 1.000000, total: 1.000000}
t.c:5:38: note:        + {depth: 0.000000, total: 0.000000}
t.c:5:38: note:        = {depth: 1.000000, total: 1.000000}
t.c:5:38: note:      layout 1:
t.c:5:38: note:          {depth: 0.000000, total: 0.000000}
t.c:5:38: note:        + {depth: 1.000000, total: 1.000000}
t.c:5:38: note:        + {depth: 0.000000, total: 0.000000}
t.c:5:38: note:        = {depth: 1.000000, total: 1.000000}
t.c:5:38: note:  recording new base alignment for &vals
  alignment:    32
  misalignment: 0
  based on:     _1 = vals[0];
t.c:5:38: note:   === vect_slp_analyze_instance_alignment ===
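
The "consecutive sub-set" remark can be illustrated with a small sketch. The load permutation { 3 1 2 4 } names elements vals[1] through vals[4], which are contiguous in memory, and with -ffast-math the additions may be reassociated, so the order of the reduction operands should not matter. The code below only shows that equivalence at the source level; it is not the vectorizer's transformation and not a proposed fix. test_reordered and check_sums_agree are hypothetical names introduced for this illustration.

```c
#include <assert.h>

double vals[16];

/* Form from the report: reduction operands in the order 2, 4, 1, 3. */
double test_original(void)
{
  vals[0]++;
  return vals[2] + vals[4] + vals[1] + vals[3];
}

/* Hypothetical hand-reordered form: the same four elements, read in
   memory order, i.e. the consecutive sub-set vals[1..4].  */
double test_reordered(void)
{
  vals[0]++;
  return vals[1] + vals[2] + vals[3] + vals[4];
}

/* Hypothetical helper: fill vals with small integers, which are exact
   in double, so both operand orders must produce the same sum.  */
void check_sums_agree(void)
{
  for (int i = 0; i < 16; i++)
    vals[i] = (double) i;
  double a = test_original();   /* 2 + 4 + 1 + 3 */
  double b = test_reordered();  /* 1 + 2 + 3 + 4 */
  assert(a == 10.0);
  assert(b == 10.0);
}
```

Comparing the vectorizer dumps of the two forms may help show whether the permutation alone is what blocks vectorization, though that outcome is not stated in the report.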
