https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121059
--- Comment #10 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> vectorizable_operation during transform does
>
> /* When combining two masks check if either of them is elsewhere
> combined with a loop mask, if that's the case we can mark that the
> new combined mask doesn't need to be combined with a loop mask. */
> if (masked_loop_p
> && code == BIT_AND_EXPR
> && VECTOR_BOOLEAN_TYPE_P (vectype))
> {
> if (loop_vinfo->scalar_cond_masked_set.contains ({ op0, 1 }))
> {
> mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
> vec_num, vectype, i);
>
> but that's not reflected by analysis, which fails to record a loop mask
> for !mask_out_inactive operations. So the fix is as simple as the following,
> but this might push us into using masks? There is no good way to do this
> I guess. The scalar_cond_masked_set optimization does not have a
> corresponding len operation. I'm not sure what we can do here?
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 4aa69da2218..55002bd0cc2 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6978,6 +6978,16 @@ vectorizable_operation (vec_info *vinfo,
> LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
> }
> }
> + else if (loop_vinfo
> + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> + && code == BIT_AND_EXPR
> + && VECTOR_BOOLEAN_TYPE_P (vectype))
> + vect_record_loop_mask (loop_vinfo, masks, vec_num, vectype, NULL);
>
> /* Put types on constant and invariant SLP children. */
> if (!vect_maybe_update_slp_op_vectype (slp_op0, vectype)

Yeah, we shouldn't do that. The question is why op0 is in
scalar_cond_masked_set with masked_loop_p true if there's no associated loop
mask.
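
To make the question concrete, here is a toy model of the invariant I would
expect to hold. It is only a sketch and not GCC code: ToyLoopInfo and the
toy_* functions are hypothetical names, and the model assumes that an operand
only enters the set at a point where analysis also records a loop mask.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Toy model only: none of these names are GCC APIs.
struct ToyLoopInfo {
  // (scalar operand, code) pairs believed to be combined with a loop mask.
  std::set<std::pair<std::string, int>> scalar_cond_masked_set;
  // Number of loop masks requested during analysis.
  unsigned recorded_masks = 0;
};

// Analysis step (assumed behaviour): adding an operand to the set goes
// hand in hand with recording the loop mask it will be combined with.
void toy_analyze_masked_cond(ToyLoopInfo &info, const std::string &op) {
  info.scalar_cond_masked_set.insert(std::make_pair(op, 1));
  ++info.recorded_masks;
}

// Transform-time test, shaped like the quoted condition: the loop is
// masked and op0 is in the set, so a loop mask is about to be fetched.
bool toy_transform_wants_loop_mask(const ToyLoopInfo &info,
                                   bool masked_loop_p,
                                   const std::string &op0) {
  return masked_loop_p
         && info.scalar_cond_masked_set.count(std::make_pair(op0, 1)) != 0;
}

// Invariant: whenever transform wants a loop mask, analysis must have
// recorded at least one.
bool toy_invariant_holds(const ToyLoopInfo &info, bool masked_loop_p,
                         const std::string &op0) {
  if (!toy_transform_wants_loop_mask(info, masked_loop_p, op0))
    return true;
  return info.recorded_masks > 0;
}
```

In this model the failure corresponds to an operand being in the set while no
mask was recorded, which is the state the invariant rejects.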