https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101639
--- Comment #25 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #24)
> (In reply to Hongtao Liu from comment #22)
> > (In reply to Richard Biener from comment #21)
> > > (In reply to Hongtao Liu from comment #20)
> > > > (In reply to Hongtao Liu from comment #19)
> > > > > Created attachment 62562 [details]
> > > > > avx512/avx2 reduc_mask_{and,ixor,xor}_m
> > > > >
> > > > > I didn't support V*HImode for reduc_mask_xor_m since x86 only has
> > > > > vmovmskps/pd and vpmovmskb. For the others, the unit tests look OK
> > > > > and I'm going to add more tests for that.
> > > >
> > > > It failed bootstrap in stage3 with --with-arch=native on SPR, need to
> > > > take a look.
> > >
> > > It might very well be a bug on the vectorizer side of course.
> >
> > should be related to reduc_mask_and, the mask needs to be compared to
> > allones(-1), not zero, since any zero bit will cause the result to be zero.
>
> Yes, I also see the new gcc.dg/vect/vect-reduc-bool-1.c fail execution with
> AVX2. For an AND reduction of 16 char elements we create
>
> vpxor %xmm1, %xmm1, %xmm1
> vpcmpeqb (%rdi), %xmm1, %xmm0
> vpcmpeqb %xmm1, %xmm0, %xmm0
> vptest %xmm0, %xmm0
> sete %al
>
> clang produces
>
> vpxor %xmm0, %xmm0, %xmm0
> vpcmpeqb (%rdi), %xmm0, %xmm0
> vpmovmskb %xmm0, %eax
> testl %eax, %eax
> sete %al
>
> seems it's tricky to mate a != 0 compare with the all-zero vptest optimally,
It could possibly be handled in combine: we already have ptest patterns for CCZ
and CCC separately, so if only CCZ is needed, then
(unspec:CCZ (eq (eq op const0) const0) unspec_ptest) can be simplified.
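
To make the reasoning concrete, here is a rough scalar model in C (not GCC
code, and not the actual pattern). It assumes 16 byte lanes, and all function
names are made up for illustration. It shows two things: the ZF produced by
ptest on (eq (eq op 0) 0) matches the ZF of ptest on op itself, and an AND
reduction of a lane mask has to compare the movemask result against all-ones
rather than zero.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16

/* Lane-wise byte compare in the style of vpcmpeqb:
   0xff where a[i] == b[i], 0x00 otherwise.  */
static void cmpeq_b(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = (a[i] == b[i]) ? 0xff : 0x00;
}

/* ZF as set by "ptest x, x": 1 iff every bit of x is zero.  */
static int ptest_zf(const uint8_t *x)
{
    uint8_t acc = 0;
    for (int i = 0; i < LANES; i++)
        acc |= x[i];
    return acc == 0;
}

/* ZF of ptest applied to (eq (eq op 0) 0).  The double compare only
   normalizes each nonzero lane to 0xff, so ZF is the same as for op.  */
static int zf_double_eq(const uint8_t *op)
{
    uint8_t zero[LANES] = {0}, t1[LANES], t2[LANES];
    cmpeq_b(op, zero, t1);   /* t1[i] = (op[i] == 0) */
    cmpeq_b(t1, zero, t2);   /* t2[i] = (op[i] != 0) */
    return ptest_zf(t2);
}

/* Collect the top bit of each byte lane, in the style of vpmovmskb.  */
static unsigned movemask_b(const uint8_t *x)
{
    unsigned m = 0;
    for (int i = 0; i < LANES; i++)
        m |= (unsigned)(x[i] >> 7) << i;
    return m;
}

/* AND reduction of a lane mask: true only if every lane is set, so the
   movemask result is compared against all-ones, not against zero.  */
static int and_reduce_mask(const uint8_t *mask)
{
    return movemask_b(mask) == 0xffffu;
}
```

In this model zf_double_eq(op) == ptest_zf(op) for any op, which is the
property the simplification relies on when only ZF is consumed. It says
nothing about CF, so the CCC case is not covered here.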