https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123779

--- Comment #3 from Hongyu Wang <hongyuw at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #2)
> (In reply to Hongyu Wang from comment #1)
> > This is because define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> > has mask_name subst, but it doesn't generate corresponding split for the
> > mask variant.
> 
> we should remove <mask_name>

Just removing <mask_name> produces an extra blend:

vmovdqa e(%rip), %xmm0
vmovdqa g(%rip), %xmm1
vpcmpw  $1, f(%rip), %xmm0, %k1
vpmovm2w        %k1, %xmm0
vpmovzxbw       d(%rip), %xmm0
vpblendmw       %xmm0, %xmm1, %xmm0{%k1}

I think it is better to have separate define_insn_and_split patterns for the
non-mask and mask variants, which produces:

vmovdqa e(%rip), %xmm0
vpcmpw  $1, f(%rip), %xmm0, %k1
vpmovm2w        %k1, %xmm0
vmovdqa g(%rip), %xmm0
vpmovzxbw       d(%rip), %xmm0{%k1}
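
For reference, here is a scalar C sketch of what the final instructions of the
two listings compute. It is only an illustrative model, not GCC code and not
the testcase from this bug: all function names are hypothetical, and it assumes
8 word lanes, signed less-than for vpcmpw $1, and merge-masking for
vpmovzxbw ...{%k1}. Under those assumptions the unmasked extend followed by
vpblendmw gives the same lanes as the single merge-masked vpmovzxbw.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 8 };  /* 8 x 16-bit lanes in an xmm register */

/* Models vpcmpw $1 (predicate 1 = signed less-than): k[i] = e[i] < f[i]. */
static uint8_t cmp_lt_mask(const int16_t *e, const int16_t *f)
{
    uint8_t k = 0;
    for (int i = 0; i < LANES; i++)
        if (e[i] < f[i])
            k |= (uint8_t)(1u << i);
    return k;
}

/* First listing: unmasked vpmovzxbw, then vpblendmw selecting under k. */
static void extend_then_blend(int16_t *out, const uint8_t *d,
                              const int16_t *g, uint8_t k)
{
    int16_t tmp[LANES];
    for (int i = 0; i < LANES; i++)
        tmp[i] = (int16_t)d[i];              /* zero-extend byte to word */
    for (int i = 0; i < LANES; i++)
        out[i] = ((k >> i) & 1) ? tmp[i] : g[i];
}

/* Second listing: load g, then merge-masked vpmovzxbw under k. */
static void masked_extend(int16_t *out, const uint8_t *d,
                          const int16_t *g, uint8_t k)
{
    for (int i = 0; i < LANES; i++)
        out[i] = g[i];                       /* destination starts as g */
    for (int i = 0; i < LANES; i++)
        if ((k >> i) & 1)
            out[i] = (int16_t)d[i];          /* only masked lanes are written */
}

/* Returns 1 if both models agree for all 256 possible masks. */
static int sequences_agree(const uint8_t *d, const int16_t *g)
{
    for (unsigned k = 0; k < 256; k++) {
        int16_t a[LANES], b[LANES];
        extend_then_blend(a, d, g, (uint8_t)k);
        masked_extend(b, d, g, (uint8_t)k);
        for (int i = 0; i < LANES; i++)
            if (a[i] != b[i])
                return 0;
    }
    return 1;
}

/* Small self-check on made-up sample data. */
static void self_check(void)
{
    const uint8_t d[LANES] = {200, 1, 2, 3, 4, 5, 6, 255};
    const int16_t g[LANES] = {-1, -2, -3, -4, -5, -6, -7, -8};
    assert(sequences_agree(d, g));
}
```

Calling self_check() from a main() exercises every mask value on the sample
data. The model only covers the result lanes, so it says nothing about
register allocation or the vpmovm2w that appears in both listings.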
