On Tue, Mar 26, 2024 at 10:52 AM Xi Ruoyao <xry...@xry111.site> wrote:
>
> The latency of LA464 and LA664 division instructions depends on the
> input.  When I updated the costs in r14-6642, I unintentionally set the
> division costs to the best-case latency (when the first operand is 0).
> Per a recent discussion [1] we should use "something sensible"
> instead.
>
> Use the average of the minimum and maximum latency observed instead.
> This enables the reduction of division by a constant to a
> multiply-by-reciprocal sequence and speeds up the following test case
> by about 30%:
>
>     int
>     main (void)
>     {
>       unsigned long stat = 0xdeadbeef;
>       for (int i = 0; i < 100000000; i++)
>         stat = (stat * stat + stat * 114514 + 1919810) % 1000000007;
>       asm(""::"r"(stat));
>     }

I think you should be able to see a constant divisor in the cost
computation, and thus could do better than returning the same latency
for everything.  For non-constant divisors, using the best-case latency
shouldn't be a problem.
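
For reference, the division-by-constant reduction mentioned in the patch
description can be sketched in plain C as below.  This is only an
illustration under stated assumptions, not the sequence GCC actually
emits (GCC picks its own magic constant and shift): the helper names
recip_magic, div_by_recip and mod_by_recip are hypothetical, the
operands are restricted to 32-bit unsigned values, and the divisor must
be greater than 1.

```c
#include <stdint.h>

/* Hypothetical helper: m = ceil(2^64 / d).  Requires d > 1, because
   the result for d == 1 would not fit in 64 bits.  */
static uint64_t
recip_magic (uint32_t d)
{
  return UINT64_MAX / d + 1;
}

/* Quotient n / d computed as the high 64 bits of the 96-bit product
   m * n, i.e. (m * n) >> 64, using only 64-bit multiplies.  With
   m = ceil(2^64 / d) this is exact for every 32-bit n.  */
static uint32_t
div_by_recip (uint32_t n, uint64_t m)
{
  uint64_t hi = (m >> 32) * n;           /* upper half of m times n */
  uint64_t lo = (m & 0xffffffffu) * n;   /* lower half of m times n */
  return (uint32_t) ((hi + (lo >> 32)) >> 32);
}

/* Remainder n % d: one more multiply and a subtract, no division.  */
static uint32_t
mod_by_recip (uint32_t n, uint32_t d, uint64_t m)
{
  return n - div_by_recip (n, m) * d;
}
```

Whether such a sequence beats a real div instruction depends on the
relative multiply and divide costs, which is why the division cost in
the table matters for this transformation.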

> [1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648348.html
>
> gcc/ChangeLog:
>
>         * config/loongarch/loongarch-def.cc
>         (loongarch_rtx_cost_data::loongarch_rtx_cost_data): Increase
>         default division cost to the average of the best case and worst
>         case scenarios observed.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/loongarch/div-const-reduction.c: New test.
> ---
>
> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
>
>  gcc/config/loongarch/loongarch-def.cc                    | 8 ++++----
>  gcc/testsuite/gcc.target/loongarch/div-const-reduction.c | 9 +++++++++
>  2 files changed, 13 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/div-const-reduction.c
>
> diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
> index e8c129ce643..93e72a520d5 100644
> --- a/gcc/config/loongarch/loongarch-def.cc
> +++ b/gcc/config/loongarch/loongarch-def.cc
> @@ -95,12 +95,12 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
>    : fp_add (COSTS_N_INSNS (5)),
>      fp_mult_sf (COSTS_N_INSNS (5)),
>      fp_mult_df (COSTS_N_INSNS (5)),
> -    fp_div_sf (COSTS_N_INSNS (8)),
> -    fp_div_df (COSTS_N_INSNS (8)),
> +    fp_div_sf (COSTS_N_INSNS (12)),
> +    fp_div_df (COSTS_N_INSNS (15)),
>      int_mult_si (COSTS_N_INSNS (4)),
>      int_mult_di (COSTS_N_INSNS (4)),
> -    int_div_si (COSTS_N_INSNS (5)),
> -    int_div_di (COSTS_N_INSNS (5)),
> +    int_div_si (COSTS_N_INSNS (14)),
> +    int_div_di (COSTS_N_INSNS (22)),
>      movcf2gr (COSTS_N_INSNS (7)),
>      movgr2cf (COSTS_N_INSNS (15)),
>      branch_cost (6),
> diff --git a/gcc/testsuite/gcc.target/loongarch/div-const-reduction.c b/gcc/testsuite/gcc.target/loongarch/div-const-reduction.c
> new file mode 100644
> index 00000000000..0ee86410dd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/div-const-reduction.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mtune=la464" } */
> +/* { dg-final { scan-assembler-not "div\.\[dw\]" } } */
> +
> +int
> +test (int a)
> +{
> +  return a % 1000000007;
> +}
> --
> 2.44.0
>
