https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123175

            Bug ID: 123175
           Summary: Wrong folding of VEC_PERM_EXPR
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

When we use the relaxed rules for VEC_PERM_EXPR, with input vectors shorter
than the result, like

typedef int v4si __attribute__((vector_size(16)));
typedef int v2si __attribute__((vector_size(8)));
typedef char v4qi __attribute__((vector_size(4)));

v4si __GIMPLE() foo (v2si a, v2si b)
{
  v4si res;

  res = __VEC_PERM (a, b, _Literal (v4qi) { 0, 1, 2, 3 });
  return res;
}

we mis-fold that to

  res_3 = a_1(D);

because of the match.pd rule

(simplify
 (vec_perm @0 @1 VECTOR_CST@2)
 (with
  {
    tree op0 = @0, op1 = @1, op2 = @2;
    machine_mode result_mode = TYPE_MODE (type);
...

which uses

      /* Create a vec_perm_indices for the integer vector.  */
      poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
      bool single_arg = (op0 == op1);
      vec_perm_indices sel (builder, single_arg ? 1 : 2, nelts);

with nelts == 4 (the number of result elements, not the 2 elements of each
input), which misleads sel.series_p (0, 1, 0, 1) but also
sel.all_from_input_p, as can be seen with

v4si __GIMPLE() foo (v2si a, v2si b)
{
  v4si res;

  res = __VEC_PERM (a, b, _Literal (v4qi) { 0, 2, 2, 1 });
  return res;
}

which we fold to

  res_3 = VEC_PERM_EXPR <a_1(D), a_1(D), { 0, 2, 2, 1 }>;
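
To illustrate the effect of the element count, here is a minimal standalone
C sketch. It is not GCC code: input_of and all_from_input are hypothetical
helpers that only model how a selector index is attributed to an input for a
given per-input element count, assuming that under the relaxed rules the
indices are meant relative to the input length.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: return which input (0 or 1)
   selector element IDX reads from when each input is assumed to have
   NELTS_PER_INPUT elements.  */
static unsigned
input_of (unsigned idx, unsigned nelts_per_input)
{
  return (idx / nelts_per_input) % 2;
}

/* Hypothetical helper: true if all N selector elements read from
   input WHICH under the given per-input element count.  */
static bool
all_from_input (const unsigned *sel, size_t n,
                unsigned nelts_per_input, unsigned which)
{
  for (size_t i = 0; i < n; i++)
    if (input_of (sel[i], nelts_per_input) != which)
      return false;
  return true;
}
```

With the selector { 0, 2, 2, 1 }, all_from_input (sel, 4, 4, 0) is true, which
corresponds to the fold above that replaces b with a, while
all_from_input (sel, 4, 2, 0) is false because index 2 then refers to the
first element of b.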

I've seen this when removing the unnecessary padding of shufflevector inputs
to mask length for PR123156.
