On Wed, Feb 5, 2025 at 1:29 PM Richard Biener <[email protected]> wrote:
>
> The PR shows fold-mem-offsets taking ages and a lot of memory computing
> DU/UD chains as that requires the RD problem. The issue is not so much
> the memory required for the pruned sets but the high CFG connectivity
> (and that the CFG is cyclic) which makes solving the dataflow problem
> expensive.
>
> The following adds the same limit as the one imposed by GCSE and CPROP.
>
> Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu, this
> reduces the compile-time of the PR26854 testcase from 480s to 150s.
>
> OK?

Thank you for sending this.
I was working on something similar as well, but I could not
find a reasonable threshold expression for disabling the pass.

>
> Thanks,
> Richard.
>
> PR rtl-optimization/117922
> * fold-mem-offsets.cc (pass_fold_mem_offsets::execute):
> Do nothing for a highly connected CFG.
> ---
> gcc/fold-mem-offsets.cc | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/gcc/fold-mem-offsets.cc b/gcc/fold-mem-offsets.cc
> index a816006e207..c1c94472a07 100644
> --- a/gcc/fold-mem-offsets.cc
> +++ b/gcc/fold-mem-offsets.cc
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. If not see
> #include "df.h"
> #include "tree-pass.h"
> #include "cfgrtl.h"
> +#include "diagnostic-core.h"
>
> /* This pass tries to optimize memory offset calculations by moving constants
> from add instructions to the memory instructions (loads / stores).
> @@ -841,6 +842,23 @@ do_commit_insn (rtx_insn *insn)
> unsigned int
> pass_fold_mem_offsets::execute (function *fn)
> {
> + /* Computing UD/DU chains for flow graphs which have a high connectivity
> + will take a long time and is unlikely to be particularly useful.
> +
> + In normal circumstances a cfg should have about twice as many
> + edges as blocks. But we do not want to punish small functions
> + which have a couple switch statements. Rather than simply
> + threshold the number of blocks, use something with a more
> + graceful degradation. */
> + if (n_edges_for_fn (fn) > 20000 + n_basic_blocks_for_fn (fn) * 4)
> + {
> + warning (OPT_Wdisabled_optimization,
> + "fold-mem-offsets: %d basic blocks and %d edges/basic block",
> + n_basic_blocks_for_fn (cfun),
> + n_edges_for_fn (cfun) / n_basic_blocks_for_fn (cfun));
> + return 0;
> + }
> +
> df_set_flags (DF_EQ_NOTES + DF_RD_PRUNE_DEAD_DEFS + DF_DEFER_INSN_RESCAN);
> df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
> df_analyze ();
> --
> 2.43.0
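
For reference, the cutoff in the hunk above can be restated as a
standalone predicate, which makes the boundary behaviour easy to check.
This is only a sketch: the name cfg_too_connected and its parameters are
hypothetical and not part of GCC; in the patch the two counts come from
n_edges_for_fn (fn) and n_basic_blocks_for_fn (fn).

```cpp
#include <cstdint>

// Hypothetical restatement of the cutoff in the quoted hunk.
// The function name and parameters are illustrative, not GCC API.
constexpr bool
cfg_too_connected (std::int64_t n_edges, std::int64_t n_blocks)
{
  // Fixed slack of 20000 edges plus 4 edges per block, as in the patch.
  return n_edges > 20000 + n_blocks * 4;
}

// Small function with dense switches: 100 blocks, 5000 edges.
// The fixed slack keeps the pass enabled.
static_assert (!cfg_too_connected (5000, 100), "");

// Boundary for 100 blocks: 20000 + 4 * 100 = 20400 edges is still
// accepted, 20401 edges is rejected.
static_assert (!cfg_too_connected (20400, 100), "");
static_assert (cfg_too_connected (20401, 100), "");

// A CFG with about two edges per block stays below the cutoff
// regardless of size, since 2 * n never exceeds 20000 + 4 * n.
static_assert (!cfg_too_connected (200000, 100000), "");
```

The constant term is what protects small functions with a few switch
statements, while the per-block term only trips once the average
out-degree is well above the usual two edges per block.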