Ping!

Kindly review. The vmod instructions on powerpc are much more expensive
than using a sequence of shifts/multiplies/adds.
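
For illustration, the kind of loop under discussion might look like the
sketch below. This is only an assumed example: the function name and the
constant divisor 7 are hypothetical and are not taken from the PR testcase.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not the PR testcase: a 32-bit signed modulo in a
   loop.  In GIMPLE the % operator on integers is a TRUNC_MOD_EXPR, which
   is one of the two tree codes whose scalar cost the patch adjusts.  */
void
mod_by_const (int32_t *restrict dst, const int32_t *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] % 7;	/* Result takes the sign of the dividend.  */
}
```

Comparing the output of -O2 with and without the patch on a loop of this
shape should show whether the vectorizer's cost comparison changes.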

Thanks and regards,
Avinash Jayakar

On Mon, 2025-09-15 at 17:09 +0530, Avinash Jayakar wrote:
> Hi,
> 
> Below is a patch for fixing PR121700 proposed for master. Kindly review.
> Regtesting on powerpc64le is still running. Will update the status soon
> in this thread.
> 
> Thank you,
> Avinash Jayakar
> 
> rs6000: Update scalar cost of {TRUNC,FLOOR}_MOD_EXPR[PR121700]
> 
> The default cost model with the -O2 flag is the VERY_CHEAP model, which
> produces sub-optimal code for loops with TRUNC/FLOOR modulo expressions.
> Currently the vectorized variant of the modulo operator is almost 4 times
> faster than the scalar variant for 32-bit integers on power10.
> 
> In order to fairly compare the scalar and vectorized variants of the loop
> in function vect_analyze_loop_costing, update the scalar cost for
> TRUNC_MOD_EXPR and FLOOR_MOD_EXPR. The value 6 is currently the number
> of instructions generated for these expressions with the -O2 flag.
> 
> 2025-09-15  Avinash Jayakar <[email protected]>
> 
> gcc/ChangeLog:
>       PR target/121700
>       * config/rs6000/rs6000.cc (rs6000_adjust_vect_cost_per_stmt): Add cost
>       for {FLOOR,TRUNC}_MOD_EXPR.
> ---
>  gcc/config/rs6000/rs6000.cc | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 8dd23f8619c..183e454c5bc 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -5311,6 +5311,12 @@ rs6000_adjust_vect_cost_per_stmt (enum vect_cost_for_stmt kind,
>        tree_code subcode = gimple_assign_rhs_code (stmt_info->stmt);
>        if (subcode == COND_EXPR)
>       return 2;
> +/* For {FLOOR,TRUNC}_MOD_EXPR, cost them a bit higher in order to fairly
> +   compare the scalar and vector costs, since there is no direct instruction
> +   that can evaluate these expressions with just 1 instruction.  Currently
> +   using the number of instructions generated for these expressions.*/
> +      if (subcode == FLOOR_MOD_EXPR || subcode == TRUNC_MOD_EXPR)
> +  return 6;
>      }
>  
>    return 0;
