Leaving the main review to Richard, just some comments...

Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> @@ -9774,6 +9777,10 @@ vect_is_simple_cond (tree cond, vec_info *vinfo,
>  
>     When STMT_INFO is vectorized as a nested cycle, for_reduction is true.
>  
> +   For COND_EXPR<C, T, E> if T comes from masked load, and is conditional
> +   on C, we apply loop mask to result of vector comparison, if it's present.
> +   Similarly for E, if it is conditional on !C.
> +
>     Return true if STMT_INFO is vectorizable in this way.  */
>  
>  bool

I think this is a bit misleading.  But IMO it'd be better not to have
a comment here and just rely on the one in the main function body.
This optimisation isn't really changing the vectorisation strategy,
and the comment could easily get forgotten if things change in future.

> [...]
> @@ -9999,6 +10006,35 @@ vectorizable_condition (stmt_vec_info stmt_info, 
> gimple_stmt_iterator *gsi,
>    /* Handle cond expr.  */
>    for (j = 0; j < ncopies; j++)
>      {
> +      tree loop_mask = NULL_TREE;
> +      bool swap_cond_operands = false;
> +
> +      /* Look up if there is a loop mask associated with the
> +      scalar cond, or it's inverse.  */

Maybe:

   See whether another part of the vectorized code applies a loop
   mask to the condition, or to its inverse.
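
(For what it's worth, a concrete example of the "inverse" case, not taken
from the patch, just for illustration:

   for (int i = 0; i < n; ++i)
     r[i] = x[i] < 0 ? 1 : y[i];

Here the load of y[i] is guarded by x[i] >= 0 rather than by the
COND_EXPR's own condition, so if I'm reading the patch right, it's the
second, inverted lookup that finds the loop mask.)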

> +
> +      if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
> +     {
> +       scalar_cond_masked_key cond (cond_expr, ncopies);
> +       if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> +         {
> +           vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> +           loop_mask = vect_get_loop_mask (gsi, masks, ncopies, vectype, j);
> +         }
> +       else
> +         {
> +           bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
> +           cond.code = invert_tree_comparison (cond.code, honor_nans);
> +           if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> +             {
> +               vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> +               loop_mask = vect_get_loop_mask (gsi, masks, ncopies,
> +                                               vectype, j);
> +               cond_code = cond.code;
> +               swap_cond_operands = true;
> +             }
> +         }
> +     }
> +
>        stmt_vec_info new_stmt_info = NULL;
>        if (j == 0)
>       {
> @@ -10114,6 +10153,47 @@ vectorizable_condition (stmt_vec_info stmt_info, 
> gimple_stmt_iterator *gsi,
>                   }
>               }
>           }
> +
> +       /* If loop mask is present, then AND it with

Maybe "If we decided to apply a loop mask, ..."

> +          result of vec comparison, so later passes (fre4)

Probably better not to name the pass -- could easily change in future.

> +          will reuse the same condition used in masked load.

Could be a masked store, or potentially other things too.
So maybe just "will reuse the masked condition"?

> +
> +          For example:
> +          for (int i = 0; i < 100; ++i)
> +            x[i] = y[i] ? z[i] : 10;
> +
> +          results in following optimized GIMPLE: 
> +
> +          mask__35.8_43 = vect__4.7_41 != { 0, ... };
> +          vec_mask_and_46 = loop_mask_40 & mask__35.8_43;
> +          _19 = &MEM[base: z_12(D), index: ivtmp_56, step: 4, offset: 0B];
> +          vect_iftmp.11_47 = .MASK_LOAD (_19, 4B, vec_mask_and_46);
> +          vect_iftmp.12_52 = VEC_COND_EXPR <vec_mask_and_46,
> +                                            vect_iftmp.11_47, { 10, ... }>;
> +
> +          instead of recomputing vec != { 0, ... } in vec_cond_expr  */

That's true, but it gives the impression that avoiding the vec != { 0, ... }
is the main goal, whereas we could do that just by forcing a three-operand
COND_EXPR.  It's really more about making sure that vec != { 0, ... }
and its masked form aren't both live at the same time.  So maybe:

             instead of using masked and unmasked forms of
             vect__4.7_41 != { 0, ... } (masked in the MASK_LOAD,
             unmasked in the VEC_COND_EXPR).  */
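
(For illustration, reusing the names from the dump above, the form we
want to avoid would look roughly like:

   mask__35.8_43 = vect__4.7_41 != { 0, ... };
   vec_mask_and_46 = loop_mask_40 & mask__35.8_43;
   vect_iftmp.11_47 = .MASK_LOAD (_19, 4B, vec_mask_and_46);
   vect_iftmp.12_52 = VEC_COND_EXPR <mask__35.8_43,
                                     vect_iftmp.11_47, { 10, ... }>;

where mask__35.8_43 has to stay live all the way to the VEC_COND_EXPR
even though its masked form vec_mask_and_46 is already available.)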

Thanks,
Richard
