pr52252-ld.c shows store permutation runs into three vector limit

rguenth at gcc dot gnu.org via Gcc-bugs Wed, 18 Sep 2024 05:56:54 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116762


            Bug ID: 116762
           Summary: gcc.dg/vect/pr52252-ld.c shows store permutation runs
                    into three vector limit
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

When failing vectorization without SLP, we see that gcc.dg/vect/pr52252-ld.c
ends up using single-lane SLP.  That is much better than what GCC 14 does, which
is hybrid SLP, but it might be possible to use a better strategy for lowering
the store permutation node:

  node 0x4b25cf0 (max_nunits=1, refcnt=1) vector(16) unsigned char
      op: VEC_PERM_EXPR
      { }
      lane permutation { 0[0] 0[1] 0[2] 1[0] }
      children 0x4b25750 0x4b25990

that merges the three-lane and single-lane values.
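
For readers unfamiliar with the dump notation, each entry c[l] in the lane
permutation picks lane l of child c.  The following is only a minimal sketch
of that selection on one scalar group.  It is not the vectorizer's actual
lowering code, and the names lane_sel, apply_lane_perm and pr116762_perm are
hypothetical, introduced purely for illustration.

```c
#include <stddef.h>

/* One entry of an SLP lane permutation, written child[lane] in the dump.
   Hypothetical type for illustration, not a GCC internal.  */
struct lane_sel
{
  unsigned child;
  unsigned lane;
};

/* Hypothetical helper: build one output group of N lanes by picking, for
   each output lane I, lane PERM[I].lane of child PERM[I].child.  */
static void
apply_lane_perm (const unsigned char *const *children,
                 const struct lane_sel *perm, size_t n,
                 unsigned char *out)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = children[perm[i].child][perm[i].lane];
}

/* The permutation from the dump above: { 0[0] 0[1] 0[2] 1[0] }.
   Child 0 supplies three lanes, child 1 supplies a single lane.  */
static const struct lane_sel pr116762_perm[4] = {
  { 0, 0 }, { 0, 1 }, { 0, 2 }, { 1, 0 }
};
```

Applied to a three-lane group {a, b, c} and a single-lane group {d}, the
sketch yields {a, b, c, d}, which is the interleaving the VEC_PERM_EXPR node
has to produce on whole vectors.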
