https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122598
--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, let's consider e.g.
typedef char V __attribute__ ((vector_size (32)));
V
foo (V v)
{
  V a = v >> 5;
  return (V) {} < v ? v : a;
}
compiled with -O2 -mavx512{vl,dq,cd,bw} -mgfni and with -O2 -mavx512vl -mgfni,
and then the same with s/32/64/.
For the first one before combine I see
(insn 9 8 10 2 (set (reg:V32QI 106 [ a_3 ])
        (unspec:V32QI [
                (reg/v:V32QI 101 [ v ])
                (reg:V32QI 107)
                (const_int 0 [0])
            ] UNSPEC_GF2P8AFFINE)) "pr122598-5.C":6:5 10237 {vgf2p8affineqb_v32qi}
     (expr_list:REG_DEAD (reg:V32QI 107)
        (expr_list:REG_EQUAL (ashiftrt:V32QI (reg/v:V32QI 101 [ v ])
                (const_int 5 [0x5]))
            (nil))))
(insn 10 9 15 2 (set (reg:V32QI 103)
        (vec_merge:V32QI (reg/v:V32QI 101 [ v ])
            (reg:V32QI 106 [ a_3 ])
            (reg:SI 105 [ _1 ]))) "pr122598-5.C":7:27 discrim 1 2583 {avx512vl_blendmv32qi}
     (expr_list:REG_DEAD (reg:V32QI 106 [ a_3 ])
        (expr_list:REG_DEAD (reg:SI 105 [ _1 ])
            (expr_list:REG_DEAD (reg/v:V32QI 101 [ v ])
                (nil)))))
and in the combine dump I see
Trying 9 -> 10:
    9: r106:V32QI=unspec[r101:V32QI,[`*.LC0'],0] 200
   10: r103:V32QI=vec_merge(r101:V32QI,r106:V32QI,r105:SI)
      REG_DEAD r106:V32QI
      REG_DEAD r105:SI
      REG_DEAD r101:V32QI
Failed to match this instruction:
(set (reg:V32QI 103)
    (vec_merge:V32QI (reg/v:V32QI 101 [ v ])
        (unspec:V32QI [
                (reg/v:V32QI 101 [ v ])
                (mem/u/c:V32QI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S32 A256])
                (const_int 0 [0])
            ] UNSPEC_GF2P8AFFINE)
        (reg:SI 105 [ _1 ])))
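vec_merge is not commutative, so this form, with the unspec as the second
vec_merge operand rather than the first, doesn't match the masked insn pattern.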
Now, if I try
typedef char V __attribute__ ((vector_size (32)));
V
foo (V v)
{
  V a = v >> 5;
  return (V) {} < v ? a : v;
}
instead, it is handled as a masked insn with -O2 -mavx512{vl,dq,cd,bw} -mgfni
without any changes:
        vpxor   %xmm1, %xmm1, %xmm1
        vpcmpb  $6, %ymm1, %ymm0, %k1
        vgf2p8affineqb  $0, .LC0(%rip), %ymm0, %ymm0{%k1}
        ret
Could we try swapping the VEC_MERGE arguments when the pattern doesn't match
and, if the swapped form matches, invert the mask? Yes, but that would be a
general change, not one related to this particular insn.
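
To illustrate why swapping the operands only works together with a mask
inversion, here is a minimal sketch in plain C. It is a simplified scalar model
of vec_merge semantics (a set mask bit takes the element from the first
operand, a clear bit from the second), not GCC code. All names in it
(vec_merge_model, same_vector, swap_with_inverted_mask_matches,
plain_swap_matches, self_check, NELTS) are hypothetical and exist only for
this example.

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 32  /* Models V32QI: 32 byte elements, one mask bit each.  */

/* Simplified model of vec_merge: a set mask bit selects the element
   from A, a clear bit selects it from B.  */
static void
vec_merge_model (int8_t *dst, const int8_t *a, const int8_t *b, uint32_t mask)
{
  for (int i = 0; i < NELTS; i++)
    dst[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* Return 1 if both vectors hold the same elements.  */
static int
same_vector (const int8_t *x, const int8_t *y)
{
  for (int i = 0; i < NELTS; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Return 1 if swapping the operands AND inverting the mask gives the
   same result as the original merge.  */
static int
swap_with_inverted_mask_matches (const int8_t *a, const int8_t *b,
                                 uint32_t mask)
{
  int8_t r1[NELTS], r2[NELTS];
  vec_merge_model (r1, a, b, mask);
  vec_merge_model (r2, b, a, ~mask);
  return same_vector (r1, r2);
}

/* Return 1 if swapping the operands alone (same mask) gives the same
   result as the original merge.  */
static int
plain_swap_matches (const int8_t *a, const int8_t *b, uint32_t mask)
{
  int8_t r1[NELTS], r2[NELTS];
  vec_merge_model (r1, a, b, mask);
  vec_merge_model (r2, b, a, mask);
  return same_vector (r1, r2);
}

/* Small self-check on inputs whose elements differ in every lane.  */
static void
self_check (void)
{
  int8_t a[NELTS], b[NELTS];
  for (int i = 0; i < NELTS; i++)
    {
      a[i] = (int8_t) i;
      b[i] = (int8_t) (-i - 1);
    }
  assert (swap_with_inverted_mask_matches (a, b, 0x0000ffffu));
  assert (!plain_swap_matches (a, b, 0x0000ffffu));
}
```

Under this model, a fallback that retries the swapped form would also need to
emit an inverted mask, which is why it would be a generic change rather than
something specific to the GFNI pattern.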