On Tue, Mar 26, 2024 at 10:52 AM Xi Ruoyao <xry...@xry111.site> wrote:
>
> The latency of LA464 and LA664 division instructions depends on the
> input.  When I updated the costs in r14-6642, I unintentionally set the
> division costs to the best-case latency (when the first operand is 0).
> Per a recent discussion [1] we should use "something sensible" instead
> of it.
>
> Use the average of the minimum and maximum latency observed instead.
> This enables multiplication to reciprocal sequence reduction and speeds
> up the following test case by about 30%:
>
>     int
>     main (void)
>     {
>       unsigned long stat = 0xdeadbeef;
>       for (int i = 0; i < 100000000; i++)
>         stat = (stat * stat + stat * 114514 + 1919810) % 1000000007;
>       asm(""::"r"(stat));
>     }
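
For readers unfamiliar with the "multiplication to reciprocal" reduction
mentioned in the quoted text, here is a minimal sketch of the general idea
for an unsigned 32-bit operand: multiply by a precomputed reciprocal, keep
the high half of the product, and derive the remainder from the quotient.
The names DIVISOR, RECIP, div_by_recip, mod_by_recip and check_one are
made up for illustration, `unsigned __int128` is a GCC/Clang extension, and
the sequence GCC actually emits (especially for signed or 64-bit operands
as in the test cases) may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; nothing here is part of GCC.  */
#define DIVISOR 1000000007u

/* ceil (2^64 / DIVISOR).  UINT64_MAX / DIVISOR equals floor (2^64 / DIVISOR)
   because DIVISOR is not a power of two, so adding 1 rounds up.  */
static const uint64_t RECIP = UINT64_MAX / DIVISOR + 1;

/* Quotient n / DIVISOR without a divide instruction: take the high
   64 bits of the 128-bit product RECIP * n.  Exact for every 32-bit n.  */
uint32_t
div_by_recip (uint32_t n)
{
  return (uint32_t) (((unsigned __int128) RECIP * n) >> 64);
}

/* Remainder n % DIVISOR, derived from the quotient with one multiply
   and one subtract.  */
uint32_t
mod_by_recip (uint32_t n)
{
  return n - div_by_recip (n) * DIVISOR;
}

/* Cross-check one input against the compiler's own division.  */
void
check_one (uint32_t n)
{
  assert (div_by_recip (n) == n / DIVISOR);
  assert (mod_by_recip (n) == n % DIVISOR);
}
```

Whether the compiler prefers such a sequence over a real divide depends on
the relative costs it is given, which is why the division cost matters here.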

I think you should be able to see a constant divisor and thus could do
better than returning the same latency for everything.  For non-constant
divisors, using the best-case latency shouldn't be a problem.

> [1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648348.html
>
> gcc/ChangeLog:
>
>         * config/loongarch/loongarch-def.cc
>         (loongarch_rtx_cost_data::loongarch_rtx_cost_data): Increase
>         default division cost to the average of the best case and worst
>         case scenarios observed.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/loongarch/div-const-reduction.c: New test.
> ---
>
> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
>
>  gcc/config/loongarch/loongarch-def.cc                    | 8 ++++----
>  gcc/testsuite/gcc.target/loongarch/div-const-reduction.c | 9 +++++++++
>  2 files changed, 13 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/div-const-reduction.c
>
> diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
> index e8c129ce643..93e72a520d5 100644
> --- a/gcc/config/loongarch/loongarch-def.cc
> +++ b/gcc/config/loongarch/loongarch-def.cc
> @@ -95,12 +95,12 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
>    : fp_add (COSTS_N_INSNS (5)),
>      fp_mult_sf (COSTS_N_INSNS (5)),
>      fp_mult_df (COSTS_N_INSNS (5)),
> -    fp_div_sf (COSTS_N_INSNS (8)),
> -    fp_div_df (COSTS_N_INSNS (8)),
> +    fp_div_sf (COSTS_N_INSNS (12)),
> +    fp_div_df (COSTS_N_INSNS (15)),
>      int_mult_si (COSTS_N_INSNS (4)),
>      int_mult_di (COSTS_N_INSNS (4)),
> -    int_div_si (COSTS_N_INSNS (5)),
> -    int_div_di (COSTS_N_INSNS (5)),
> +    int_div_si (COSTS_N_INSNS (14)),
> +    int_div_di (COSTS_N_INSNS (22)),
>      movcf2gr (COSTS_N_INSNS (7)),
>      movgr2cf (COSTS_N_INSNS (15)),
>      branch_cost (6),
> diff --git a/gcc/testsuite/gcc.target/loongarch/div-const-reduction.c b/gcc/testsuite/gcc.target/loongarch/div-const-reduction.c
> new file mode 100644
> index 00000000000..0ee86410dd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/div-const-reduction.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mtune=la464" } */
> +/* { dg-final { scan-assembler-not "div\.\[dw\]" } } */
> +
> +int
> +test (int a)
> +{
> +  return a % 1000000007;
> +}
> --
> 2.44.0
>
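
To make the constant versus non-constant distinction concrete, here is a
rough, standalone sketch of one possible reading of it.  It is not GCC code:
sketch_int_div_cost and its two boolean parameters are hypothetical, a real
implementation would inspect the divisor operand inside the target's cost
hook, and the numbers are simply borrowed from the patch (5 as the earlier
best-case value, 14 and 22 as the proposed averages) for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Same scaling as GCC's macro: one instruction costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

static_assert (COSTS_N_INSNS (1) == 4, "cost unit is 4 per instruction");

/* Hypothetical helper, NOT an existing GCC function.
   Assumption: only divisions by a constant need the higher cost, so that
   the reciprocal sequence is chosen for them; divisions by a non-constant
   keep the best-case latency.  */
int
sketch_int_div_cost (bool divisor_is_constant, bool is_64bit)
{
  if (!divisor_is_constant)
    return COSTS_N_INSNS (5);   /* best-case value used before the patch */

  /* Averaged values proposed in the patch for SImode and DImode.  */
  return is_64bit ? COSTS_N_INSNS (22) : COSTS_N_INSNS (14);
}
```

With such a split, the default table would not have to charge every
division the averaged latency.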