Re: [PATCH] Cleanup --params for backward threader.
That's odd. Yeah, please open a PR. Thanks. Aldy On Tue, Oct 19, 2021, 22:47 Jan-Benedict Glaw wrote: > Hi Aldy! > > On Thu, 2021-10-14 16:25:48 +0200, Aldy Hernandez via Gcc-patches < > gcc-patches@gcc.gnu.org> wrote: > > The new backward threader makes some of the --param knobs used to > > control it questionable at best or no longer applicable at worst. > > > > The fsm-maximum-phi-arguments param is unused and can be removed. > > > > The max-fsm-thread-length param is block based which is a bit redundant, > > since we already restrict paths based on instruction estimates. > > > > The max-fsm-thread-paths restricts the total number of threadable paths > > in a function. We probably don't need this. Besides, the forward > > threader has no such restriction. > > > > OK pending tests? > > This causes a regression for me. I'm auto-building lots of GCC > cross-compilers and use these to cross-build the Linux kernel. > > Using binutils/gas/gcc configured for --target=sh-linux (actual > configuration for GCC is this: > > .../gcc/configure --target=sh-linux --enable-werror-always \ > --enable-languages=all --disable-gcov\ > --disable-shared --disable-threads \ > --without-headers \ > > --prefix=/var/lib/laminar/run/gcc-sh-linux/13/toolchain-install > ) > > Then, building Linux for a good number of default configurations for > ARCH=sh and ARCH=arm, GCC will just loop: > > $ make ARCH=sh distclean > $ cp arch/sh/configs/r7780mp_defconfig .config > $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- oldconfig < /dev/null > $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- prepare > $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- all > [...] > sh-linux-gcc -Wp,-MMD,drivers/ata/.libata-core.o.d -nostdinc -isystem > /tmp/testbed/install/lib/gcc/sh-linux/12.0.0/include -I./arch/sh/include > -I./arch/sh/include/generated -I./include -I./arch/sh/include/uapi > -I./arch/sh/include/generated/uapi -I./include/uapi > -I./include/generated/uapi -include ./include/linux/compiler-version.h > -include ./include/linux/kconfig.h -include > ./include/linux/compiler_types.h -D__KERNEL__ -m4 -m4-nofpu -m4a -m4a-nofpu > -ml -mno-fdpic -Wa,-isa=sh4a-up -ffreestanding -I > ./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 -I > ./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I > ./arch/sh/include/mach-common -fmacro-prefix-map=./= -Wall -Wundef > -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common > -fshort-wchar -fno-PIE -Werror=implicit-function-declaration > -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 > -pipe -m4 -m4-nofpu -m4a -m4a-nofpu -ml -mno-fdpic -Wa,-isa=sh4a-up > -ffreestanding -I ./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 > -I ./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I > ./arch/sh/include/mach-common -fno-delete-null-pointer-checks > -Wno-frame-address -Wno-format-truncation -Wno-format-overflow > -Wno-address-of-packed-member -O2 -fno-allow-store-data-races > -Wframe-larger-than=1024 -fstack-protector-strong -Wimplicit-fallthrough=5 > -Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable > -fomit-frame-pointer -ftrivial-auto-var-init=zero > -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang > -fno-stack-clash-protection -g -Wdeclaration-after-statement -Wvla > -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds > -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict > -Wno-maybe-uninitialized -fno-strict-overflow -fno-stack-check > -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types > -Werror=designated-init -Wno-packed-not-aligned > -DKBUILD_MODFILE='"drivers/ata/libata"' -DKBUILD_BASENAME='"libata_core"' > -DKBUILD_MODNAME='"libata"' -D__KBUILD_MODNAME=kmod_libata -c -o > drivers/ata/libata-core.o drivers/ata/libata-core.c > > > (gdb) bt > #0 0x0100318e in vec vl_ptr>::operator[] (ix=0, this=0x50ee6e0) at ../../gcc/gcc/vec.h:1495 > #1 back_jt_path_registry::adjust_paths_after_duplication > (this=0x7ffdf8b6e868, curr_path_num=0) at > ../../gcc/gcc/tree-ssa-threadupdate.c:2315 > #2 0x01003c0d in back_jt_path_registry::duplicate_thread_path > (this=0x7ffdf8b6e868, entry=0x7f92651000c0, exit=, > region=, n_region=8, > current_path_no=0) at ../../gcc/gcc/tree-ssa-threadupdate.c:2546 > #3 0x010051e4 in back_jt_path_registry::update_cfg > (this=0x7ffdf8b6e868) at ../../gcc/gcc/tree-ssa-threadupdate.c:2656 > #4 0x01003ecc in jt_path_registry::thread_through_all_blocks > (this=0x7ffdf8b6e868, peel_loop_headers=) at > ../../gcc/gcc/tree-ssa-threadupdate.c:2604 > #5 0x00ffb5a7 in > back_threader_registry::thread_through_all_blocks > (may_peel_loop_headers=true, this=0x7ffdf8b6e868) at > ../../gcc/gcc/tree-ssa-threadbackward.c:556 > #6 back_threader::thread_through_all_blocks
Re: [PATCH] Cleanup --params for backward threader.
Hi Aldy! On Thu, 2021-10-14 16:25:48 +0200, Aldy Hernandez via Gcc-patches wrote: > The new backward threader makes some of the --param knobs used to > control it questionable at best or no longer applicable at worst. > > The fsm-maximum-phi-arguments param is unused and can be removed. > > The max-fsm-thread-length param is block based which is a bit redundant, > since we already restrict paths based on instruction estimates. > > The max-fsm-thread-paths restricts the total number of threadable paths > in a function. We probably don't need this. Besides, the forward > threader has no such restriction. > > OK pending tests? This causes a regression for me. I'm auto-building lots of GCC cross-compilers and use these to cross-build the Linux kernel. Using binutils/gas/gcc configured for --target=sh-linux (actual configuration for GCC is this: .../gcc/configure --target=sh-linux --enable-werror-always \ --enable-languages=all --disable-gcov\ --disable-shared --disable-threads \ --without-headers \ --prefix=/var/lib/laminar/run/gcc-sh-linux/13/toolchain-install ) Then, building Linux for a good number of default configurations for ARCH=sh and ARCH=arm, GCC will just loop: $ make ARCH=sh distclean $ cp arch/sh/configs/r7780mp_defconfig .config $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- oldconfig < /dev/null $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- prepare $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- all [...] sh-linux-gcc -Wp,-MMD,drivers/ata/.libata-core.o.d -nostdinc -isystem /tmp/testbed/install/lib/gcc/sh-linux/12.0.0/include -I./arch/sh/include -I./arch/sh/include/generated -I./include -I./arch/sh/include/uapi -I./arch/sh/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -m4 -m4-nofpu -m4a -m4a-nofpu -ml -mno-fdpic -Wa,-isa=sh4a-up -ffreestanding -I ./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 -I ./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I ./arch/sh/include/mach-common -fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 -pipe -m4 -m4-nofpu -m4a -m4a-nofpu -ml -mno-fdpic -Wa,-isa=sh4a-up -ffreestanding -I ./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 -I ./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I ./arch/sh/include/mach-common -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -O2 -fno-allow-store-data-races -Wframe-larger-than=1024 -fstack-protector-strong -Wimplicit-fallthrough=5 -Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable -fomit-frame-pointer -ftrivial-auto-var-init=zero -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang -fno-stack-clash-protection -g -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned-DKBUILD_MODFILE='"drivers/ata/libata"' -DKBUILD_BASENAME='"libata_core"' -DKBUILD_MODNAME='"libata"' -D__KBUILD_MODNAME=kmod_libata -c -o drivers/ata/libata-core.o drivers/ata/libata-core.c (gdb) bt #0 0x0100318e in vec::operator[] (ix=0, this=0x50ee6e0) at ../../gcc/gcc/vec.h:1495 #1 back_jt_path_registry::adjust_paths_after_duplication (this=0x7ffdf8b6e868, curr_path_num=0) at ../../gcc/gcc/tree-ssa-threadupdate.c:2315 #2 0x01003c0d in back_jt_path_registry::duplicate_thread_path (this=0x7ffdf8b6e868, entry=0x7f92651000c0, exit=, region=, n_region=8, current_path_no=0) at ../../gcc/gcc/tree-ssa-threadupdate.c:2546 #3 0x010051e4 in back_jt_path_registry::update_cfg (this=0x7ffdf8b6e868) at ../../gcc/gcc/tree-ssa-threadupdate.c:2656 #4 0x01003ecc in jt_path_registry::thread_through_all_blocks (this=0x7ffdf8b6e868, peel_loop_headers=) at ../../gcc/gcc/tree-ssa-threadupdate.c:2604 #5 0x00ffb5a7 in back_threader_registry::thread_through_all_blocks (may_peel_loop_headers=true, this=0x7ffdf8b6e868) at ../../gcc/gcc/tree-ssa-threadbackward.c:556 #6 back_threader::thread_through_all_blocks (may_peel_loop_headers=true, this=0x7ffdf8b6e860) at ../../gcc/gcc/tree-ssa-threadbackward.c:501 #7 (anonymous namespace)::try_thread_blocks (fun=fun@entry=0x7f926eb389c0) at ../../gcc/gcc/tree-ssa-threadbackward.c:946 #8 0x00ffb5eb in (anonymous
Re: [PATCH] Cleanup --params for backward threader.
On 10/14/2021 8:25 AM, Aldy Hernandez wrote: The new backward threader makes some of the --param knobs used to control it questionable at best or no longer applicable at worst. The fsm-maximum-phi-arguments param is unused and can be removed. The max-fsm-thread-length param is block based which is a bit redundant, since we already restrict paths based on instruction estimates. The max-fsm-thread-paths restricts the total number of threadable paths in a function. We probably don't need this. Besides, the forward threader has no such restriction. OK pending tests? gcc/ChangeLog: * doc/invoke.texi: Remove max-fsm-thread-length, max-fsm-thread-paths, and fsm-maximum-phi-arguments. * params.opt: Same. * tree-ssa-threadbackward.c (back_threader::back_threader): Remove argument. (back_threader_registry::back_threader_registry): Same. (back_threader_profitability::profitable_path_p): Remove param_max_fsm_thread-length. (back_threader_registry::register_path): Remove m_max_allowable_paths. OK. I don't think any of those params were addressing pathological cases and thread-length is really handled better by costing based on the # statements. jeff
[PATCH] Cleanup --params for backward threader.
The new backward threader makes some of the --param knobs used to control it questionable at best or no longer applicable at worst. The fsm-maximum-phi-arguments param is unused and can be removed. The max-fsm-thread-length param is block based which is a bit redundant, since we already restrict paths based on instruction estimates. The max-fsm-thread-paths restricts the total number of threadable paths in a function. We probably don't need this. Besides, the forward threader has no such restriction. OK pending tests? gcc/ChangeLog: * doc/invoke.texi: Remove max-fsm-thread-length, max-fsm-thread-paths, and fsm-maximum-phi-arguments. * params.opt: Same. * tree-ssa-threadbackward.c (back_threader::back_threader): Remove argument. (back_threader_registry::back_threader_registry): Same. (back_threader_profitability::profitable_path_p): Remove param_max_fsm_thread-length. (back_threader_registry::register_path): Remove m_max_allowable_paths. --- gcc/doc/invoke.texi | 12 gcc/params.opt| 12 gcc/tree-ssa-threadbackward.c | 27 +++ 3 files changed, 3 insertions(+), 48 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 03234c887dc..69993270b39 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -14464,14 +14464,6 @@ Emit instrumentation calls to __tsan_func_entry() and __tsan_func_exit(). Maximum number of instructions to copy when duplicating blocks on a finite state automaton jump thread path. -@item max-fsm-thread-length -Maximum number of basic blocks on a finite state automaton jump thread -path. - -@item max-fsm-thread-paths -Maximum number of new jump thread paths to create for a finite state -automaton. - @item parloops-chunk-size Chunk size of omp schedule for loops parallelized by parloops. @@ -14626,10 +14618,6 @@ The maximum depth of recursive inlining for non-inline functions. Scale factor to apply to the number of statements in a threading path when comparing to the number of (scaled) blocks. -@item fsm-maximum-phi-arguments -Maximum number of arguments a PHI may have before the FSM threader -will not try to thread through its block. - @item uninit-control-dep-attempts Maximum number of nested calls to search for control dependencies during uninitialized variable analysis. diff --git a/gcc/params.opt b/gcc/params.opt index 84d642d72c5..06a6fdc9deb 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -173,10 +173,6 @@ Common Joined UInteger Var(param_ranger_logical_depth) Init(6) IntegerRange(1, 9 Maximum depth of logical expression evaluation ranger will look through when evaluating outgoing edge ranges. --param=fsm-maximum-phi-arguments= -Common Joined UInteger Var(param_fsm_maximum_phi_arguments) Init(100) IntegerRange(1, 99) Param Optimization -Maximum number of arguments a PHI may have before the FSM threader will not try to thread through its block. - -param=fsm-scale-path-blocks= Common Joined UInteger Var(param_fsm_scale_path_blocks) Init(3) IntegerRange(1, 10) Param Optimization Scale factor to apply to the number of blocks in a threading path when comparing to the number of (scaled) statements. @@ -537,18 +533,10 @@ The maximum number of nested indirect inlining performed by early inliner. Common Joined UInteger Var(param_max_fields_for_field_sensitive) Param Maximum number of fields in a structure before pointer analysis treats the structure as a single variable. --param=max-fsm-thread-length= -Common Joined UInteger Var(param_max_fsm_thread_length) Init(10) IntegerRange(1, 99) Param Optimization -Maximum number of basic blocks on a finite state automaton jump thread path. - -param=max-fsm-thread-path-insns= Common Joined UInteger Var(param_max_fsm_thread_path_insns) Init(100) IntegerRange(1, 99) Param Optimization Maximum number of instructions to copy when duplicating blocks on a finite state automaton jump thread path. --param=max-fsm-thread-paths= -Common Joined UInteger Var(param_max_fsm_thread_paths) Init(50) IntegerRange(1, 99) Param Optimization -Maximum number of new jump thread paths to create for a finite state automaton. - -param=max-gcse-insertion-ratio= Common Joined UInteger Var(param_max_gcse_insertion_ratio) Init(20) Param Optimization The maximum ratio of insertions to deletions of expressions in GCSE. diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 1999ccf4834..7c2c1112bee 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -52,12 +52,11 @@ along with GCC; see the file COPYING3. If not see class back_threader_registry { public: - back_threader_registry (int max_allowable_paths); + back_threader_registry (); bool register_path (const vec &, edge taken); bool thread_through_all_blocks (bool may_peel_loop_headers); private: