https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, let's consider e.g.
typedef char V __attribute__ ((vector_size (32)));

V
foo (V v)
{
  V a = v >> 5;
  return (V) {} < v ? v : a;
}
compiled with -O2 -mavx512{vl,dq,cd,bw} -mgfni and with -O2 -mavx512vl -mgfni,
and then the same again with s/32/64/.
For the first set of options, before combine I see
(insn 9 8 10 2 (set (reg:V32QI 106 [ a_3 ])
        (unspec:V32QI [
                (reg/v:V32QI 101 [ v ])
                (reg:V32QI 107)
                (const_int 0 [0])
            ] UNSPEC_GF2P8AFFINE)) "pr122598-5.C":6:5 10237 {vgf2p8affineqb_v32qi}
     (expr_list:REG_DEAD (reg:V32QI 107)
        (expr_list:REG_EQUAL (ashiftrt:V32QI (reg/v:V32QI 101 [ v ])
                (const_int 5 [0x5]))
            (nil))))
(insn 10 9 15 2 (set (reg:V32QI 103)
        (vec_merge:V32QI (reg/v:V32QI 101 [ v ])
            (reg:V32QI 106 [ a_3 ])
            (reg:SI 105 [ _1 ]))) "pr122598-5.C":7:27 discrim 1 2583 {avx512vl_blendmv32qi}
     (expr_list:REG_DEAD (reg:V32QI 106 [ a_3 ])
        (expr_list:REG_DEAD (reg:SI 105 [ _1 ])
            (expr_list:REG_DEAD (reg/v:V32QI 101 [ v ])
                (nil)))))
and in the combine dump I see
Trying 9 -> 10:
    9: r106:V32QI=unspec[r101:V32QI,[`*.LC0'],0] 200
   10: r103:V32QI=vec_merge(r101:V32QI,r106:V32QI,r105:SI)
      REG_DEAD r106:V32QI
      REG_DEAD r105:SI
      REG_DEAD r101:V32QI
Failed to match this instruction:
(set (reg:V32QI 103)
    (vec_merge:V32QI (reg/v:V32QI 101 [ v ])
        (unspec:V32QI [
                (reg/v:V32QI 101 [ v ])
                (mem/u/c:V32QI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S32 A256])
                (const_int 0 [0])
            ] UNSPEC_GF2P8AFFINE)
        (reg:SI 105 [ _1 ])))
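vec_merge is not commutative, so with the unspec as the second operand rather
than the first, this doesn't match the masked pattern.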
Now, if I try
typedef char V __attribute__ ((vector_size (32)));

V
foo (V v)
{
  V a = v >> 5;
  return (V) {} < v ? a : v;
}
instead, it is already handled as a masked insn with
-O2 -mavx512{vl,dq,cd,bw} -mgfni, with no changes needed:
        vpxor   %xmm1, %xmm1, %xmm1
        vpcmpb  $6, %ymm1, %ymm0, %k1
        vgf2p8affineqb  $0, .LC0(%rip), %ymm0, %ymm0{%k1}
        ret
Could we try swapping the VEC_MERGE arguments when the pattern doesn't match
and, if the swapped form matches, invert the mask?  Yes, but that would be a
general change, not something specific to this particular insn.
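
To illustrate why the swap needs the mask inverted, here is a minimal scalar
sketch of the identity vec_merge (a, b, m) == vec_merge (b, a, ~m).  All names
in it (Vec, vec_merge, ashiftrt5, gt_zero_mask, merge_as_written,
merge_swapped) are hypothetical and only model the RTL semantics; nothing here
is GCC code, and the shift is simply a stand-in for the UNSPEC_GF2P8AFFINE
result.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model; none of these names come from GCC.
constexpr int kLanes = 32;
using Vec = std::array<std::int8_t, kLanes>;

// Model of RTL vec_merge: lane i is taken from `a` when bit i of `mask`
// is set, otherwise from `b`.
Vec vec_merge(const Vec &a, const Vec &b, std::uint32_t mask) {
  Vec r{};
  for (int i = 0; i < kLanes; ++i)
    r[i] = ((mask >> i) & 1u) ? a[i] : b[i];
  return r;
}

// Per-lane arithmetic shift right by 5, standing in for the
// UNSPEC_GF2P8AFFINE result (its REG_EQUAL note is ashiftrt by 5).
Vec ashiftrt5(const Vec &v) {
  Vec r{};
  for (int i = 0; i < kLanes; ++i) {
    int x = v[i];
    // Written out so negative lanes do not rely on >> of a negative value.
    r[i] = static_cast<std::int8_t>(x >= 0 ? x >> 5 : ~(~x >> 5));
  }
  return r;
}

// Bit i is set when lane i is greater than zero, i.e. (V) {} < v.
std::uint32_t gt_zero_mask(const Vec &v) {
  std::uint32_t m = 0;
  for (int i = 0; i < kLanes; ++i)
    if (v[i] > 0)
      m |= std::uint32_t{1} << i;
  return m;
}

// Shape of the first testcase: the shifted value is the second operand.
Vec merge_as_written(const Vec &v) {
  return vec_merge(v, ashiftrt5(v), gt_zero_mask(v));
}

// Operands swapped and mask inverted: the shifted value is the first operand,
// which is the shape of the second testcase.
Vec merge_swapped(const Vec &v) {
  return vec_merge(ashiftrt5(v), v, ~gt_zero_mask(v));
}
```

For any input, merge_as_written (v) and merge_swapped (v) give the same lanes,
so the swapped form is only valid together with the inverted mask.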
