On Mon, 15 Sep 2025, Avinash Jayakar wrote:

> Hello Richard,
> 
> Thank you for reviewing the patch! I have made changes based on your
> comments, but I have some doubts for a few comments as mentioned below.
> 
> On Thu, 2025-09-11 at 13:08 +0200, Richard Biener wrote:
> > On Wed, 10 Sep 2025, Avinash Jayakar wrote:
> > > +  bool unsignedp = TYPE_UNSIGNED (itype) && (tree_int_cst_sgn
> > > (oprnd1) > 0);
> > > +
> > > +  if (unsignedp) 
> > > +  {
> > > +    switch (rhs_code) 
> > > +    {
> > > +      case FLOOR_DIV_EXPR:
> > > +        rhs_code = TRUNC_DIV_EXPR;
> > > +        break;
> > > +      case FLOOR_MOD_EXPR:
> > > +        rhs_code = TRUNC_MOD_EXPR;
> > > +        break;
> > 
> > I would have expected this to be already done on gimple, in
> > a more generic way by
> > 
> > /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
> >    TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  Similarly
> >    for MOD instead of DIV.  */
> > (for floor_divmod (floor_div floor_mod)
> >      trunc_divmod (trunc_div trunc_mod)
> >  (simplify
> >   (floor_divmod @0 @1)
> >   (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
> >        && TYPE_UNSIGNED (type))
> >    (trunc_divmod @0 @1))))
> > 
> 
> I think this happens only if types for both operands is of integer
> type. But in this particular case operand #1 is a constant so I think
> this rule is not working for that case. 

A constant also has a type and for binary ops the types of the
operands and the result agrees.

> bool unsignedp = TYPE_UNSIGNED (itype) && (tree_int_cst_sgn
> (oprnd1) > 0);
> This line however checks if operand #1 > 0, the previous checks in this
> function also makes sure that operand #1 must be a constant, else this
> pattern returns NULL.

That extra tree_int_cst_sgn (oprnd1) > 0 simply makes sure this isn't
division by zero.  So as said, you shouldn't see an unsigned
floor_div/mod.

> > You are emitting code that might not be vectorizable and which needs
> > post-processing with bool vector patterns.  So you should
> > 
> >  1) use the appropriate precision scalar bools, anticipating the
> > vector 
> >     mask type used
> >  2) check at least whether the compares are supported, I think we can
> >     rely on bit operations suppoort
> 
> I have limited COND_EXPR stmt in the following code to just 1 stmt.
> Also added checks to see all used operations are supported for the
> target. 
> I did not understand point 1), would it still be valid if I use only
> the result of LT_EXPR only as the input for COND_EXPR as in the below
> code? 
> if (rhs_code == FLOOR_MOD_EXPR) 
>       {
>         if (!target_has_vecop_for_code(NEGATE_EXPR, vectype)
>             || !target_has_vecop_for_code(BIT_XOR_EXPR, vectype)
>             || !target_has_vecop_for_code(LT_EXPR, vectype)
>             || !target_has_vecop_for_code(BIT_IOR_EXPR, vectype)
>             || !target_has_vecop_for_code(COND_EXPR, vectype)

this check isn't going to work and will always fail.
The same is true for LT_EXPR.

>             || !target_has_vecop_for_code(PLUS_EXPR, vectype)
>           )
>           return NULL;
> 
>         append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt);
>         // r = x %[fl] y;                                             
>         // is                                                         
>         // r = x % y; if (r && (x ^ y) < 0) r += y; 
>         // Produce following sequence
>       // v0 = x^y
>       // v1 = -r
>       // v2 = r | -r
>       // v3 = v0 & v2
>       // v4 = v3 < 0 (equivalent to (r && (x ^ y) < 0))
>       // v5 = v4 ? y : 0
>       // v6 = r + v5  (final result)
>         tree cond_reg = vect_recog_temp_ssa_var(itype, NULL);
>         def_stmt = gimple_build_assign(cond_reg, BIT_XOR_EXPR, oprnd0,
> oprnd1);
>         append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
>        
>         // -r
>         tree negate_r = vect_recog_temp_ssa_var(itype, NULL);
>         def_stmt = gimple_build_assign(negate_r, NEGATE_EXPR, r);
>         append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> 
>         // r | -r , sign bit is set if r!=0
>         tree r_or_negr = vect_recog_temp_ssa_var(itype, NULL);
>         def_stmt = gimple_build_assign(r_or_negr, BIT_IOR_EXPR, r,
> negate_r);
>         append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> 
>         // (x^y) & (r|-r)
>         tree r_or_negr_and_xor = vect_recog_temp_ssa_var(itype, NULL);
>         def_stmt = gimple_build_assign(r_or_negr_and_xor, BIT_AND_EXPR,
> r_or_negr,
>            cond_reg);
>         append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> 
>         // (x^y) & (r|-r) < 0 which is equivalent to  (x^y < 0 && r!=0)
>         tree bool_cond = vect_recog_temp_ssa_var(boolean_type_node,
> NULL);

So what I meant with 1) is that here you should do sth like

   tree mask_vectype = truth_type_for (vectype);
   if (!mask_vectype)
     return NULL;

   and for compaes use the TREE_TYPE (mask_vectype) component
   type for the scalar result of the LT_EXPR and as the first
   operand of COND_EXPRs.

You need to verify expand_vec_cmp_expr_p (vectype, mask_vectype, LT_EXPR)
and expand_vec_cond_expr_p (vectype, mask_vectype) for the COND_EXPR.

>         def_stmt = gimple_build_assign(bool_cond, LT_EXPR,
> r_or_negr_and_xor, 
>           build_int_cst(itype, 0));
>         append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt,
>            truth_type_for(vectype), itype);
> 
>         //  (x^y < 0 && r) ? y : 0
>         tree extr_cond = vect_recog_temp_ssa_var(itype, NULL);
>         def_stmt = gimple_build_assign(extr_cond, COND_EXPR, bool_cond,
> oprnd1, 
>           build_int_cst(itype, 0));
>         append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
>         
>         // r += (x ^ y < 0 && r) ? y : 0
>         tree floor_mod_r = vect_recog_temp_ssa_var(itype, NULL);
>         pattern_stmt = gimple_build_assign(floor_mod_r, PLUS_EXPR, r,
> extr_cond);
>       }
> 
> Thanks and regards,
> Avinash Jayakar
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to