https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671
--- Comment #14 from Kirill Yukhin <kyukhin at gcc dot gnu.org> ---
Okay,
I've tried:
1. Ran AVX-512 testing on SPEC2006 and saw no impact from the one-liner:
Geomeans:
INT : 5.11 5.11 +0.05%
FP : 2.73 2.73 -0.08%
ALL : 3.54 3.54 -0.02%
2. Tried Uroš's proposal: adding a condition like this to the guilty pattern:
"TARGET_AVX512VL
 && ((REG_P (operands[2]) && REG_P (operands[0])
      && REGNO (operands[0]) == REGNO (operands[2]))
     || (operands[2] == CONST0_RTX (<MODE>mode)))"
No success here either. The problem is that zero-masked built-ins have a
register as the second source at expand time, which is only later
rematerialized to zero. So adding this condition leads to an ICE in recog
at expand.
So, for v6 it looks like we need to remove the one-liner.
For v7 we need to extend define_subst a bit to allow multiple output patterns.
E.g. currently:
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
        (match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_operand:SUBST_V 2 "vector_move_operand" "0C")
          (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))])
It would solve the problem if we had this instead:
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
        (match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_dup 0)
          (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))])
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_operand:SUBST_V 2 "const0_operand" "C")
          (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))])
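For readers less familiar with the two masking flavours, here is a rough
scalar model in C of what the two proposed output patterns would compute.
It is only a sketch: the names vec_merge_model, merge_masked, zero_masked
and NLANES are made up for illustration and are not GCC code, and it
assumes sixteen 32-bit lanes. It relies only on the documented vec_merge
semantics: a lane is taken from the first vector when its mask bit is set
and from the second vector otherwise.

```c
#include <stddef.h>
#include <stdint.h>

#define NLANES 16  /* assumption: sixteen 32-bit lanes, as in a 512-bit vector */

/* Scalar model of (vec_merge a b mask): lane i is taken from a when
   bit i of mask is set, and from b otherwise.  */
static void vec_merge_model(int32_t *dst, const int32_t *a,
                            const int32_t *b, uint16_t mask)
{
    for (size_t i = 0; i < NLANES; i++)
        dst[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* First proposed pattern: the second source is (match_dup 0), so lanes
   with a clear mask bit keep the old destination value.  */
static void merge_masked(int32_t *dst, const int32_t *result, uint16_t mask)
{
    vec_merge_model(dst, result, dst, mask);
}

/* Second proposed pattern: the second source is const0_operand, so lanes
   with a clear mask bit become zero.  */
static void zero_masked(int32_t *dst, const int32_t *result, uint16_t mask)
{
    static const int32_t zero[NLANES] = {0};
    vec_merge_model(dst, result, zero, mask);
}
```

With a mask of 0x0005, merge_masked overwrites only lanes 0 and 2 and
leaves the rest of dst untouched, while zero_masked writes lanes 0 and 2
and clears every other lane.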
Opinions?