Re: [PATCH] Cleanup --params for backward threader.

2021-10-19 Thread Aldy Hernandez via Gcc-patches
That's odd. Yeah, please open a PR.

Thanks.
Aldy

On Tue, Oct 19, 2021, 22:47 Jan-Benedict Glaw  wrote:

> Hi Aldy!
>
> On Thu, 2021-10-14 16:25:48 +0200, Aldy Hernandez via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> > The new backward threader makes some of the --param knobs used to
> > control it questionable at best or no longer applicable at worst.
> >
> > The fsm-maximum-phi-arguments param is unused and can be removed.
> >
> > The max-fsm-thread-length param is block based which is a bit redundant,
> > since we already restrict paths based on instruction estimates.
> >
> > The max-fsm-thread-paths restricts the total number of threadable paths
> > in a function.  We probably don't need this.  Besides, the forward
> > threader has no such restriction.
> >
> > OK pending tests?
>
> This causes a regression for me. I'm auto-building lots of GCC
> cross-compilers and use these to cross-build the Linux kernel.
>
>   Using binutils/gas/gcc configured for --target=sh-linux (actual
> configuration for GCC is this:
>
> .../gcc/configure --target=sh-linux --enable-werror-always \
>   --enable-languages=all --disable-gcov\
>   --disable-shared --disable-threads   \
>   --without-headers \
>
> --prefix=/var/lib/laminar/run/gcc-sh-linux/13/toolchain-install
> )
>
> Then, building Linux for a good number of default configurations for
> ARCH=sh and ARCH=arm, GCC will just loop:
>
> $ make ARCH=sh distclean
> $ cp arch/sh/configs/r7780mp_defconfig .config
> $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- oldconfig < /dev/null
> $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- prepare
> $ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- all
> [...]
>   sh-linux-gcc -Wp,-MMD,drivers/ata/.libata-core.o.d -nostdinc -isystem
> /tmp/testbed/install/lib/gcc/sh-linux/12.0.0/include -I./arch/sh/include
> -I./arch/sh/include/generated  -I./include -I./arch/sh/include/uapi
> -I./arch/sh/include/generated/uapi -I./include/uapi
> -I./include/generated/uapi -include ./include/linux/compiler-version.h
> -include ./include/linux/kconfig.h -include
> ./include/linux/compiler_types.h -D__KERNEL__ -m4 -m4-nofpu -m4a -m4a-nofpu
> -ml -mno-fdpic -Wa,-isa=sh4a-up -ffreestanding -I
> ./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 -I
> ./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I
> ./arch/sh/include/mach-common -fmacro-prefix-map=./= -Wall -Wundef
> -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
> -fshort-wchar -fno-PIE -Werror=implicit-function-declaration
> -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89
> -pipe -m4 -m4-nofpu -m4a -m4a-nofpu -ml -mno-fdpic -Wa,-isa=sh4a-up
> -ffreestanding -I ./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4
> -I ./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I
> ./arch/sh/include/mach-common -fno-delete-null-pointer-checks
> -Wno-frame-address -Wno-format-truncation -Wno-format-overflow
> -Wno-address-of-packed-member -O2 -fno-allow-store-data-races
> -Wframe-larger-than=1024 -fstack-protector-strong -Wimplicit-fallthrough=5
> -Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable
> -fomit-frame-pointer -ftrivial-auto-var-init=zero
> -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
> -fno-stack-clash-protection -g -Wdeclaration-after-statement -Wvla
> -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds
> -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict
> -Wno-maybe-uninitialized -fno-strict-overflow -fno-stack-check
> -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types
> -Werror=designated-init -Wno-packed-not-aligned
> -DKBUILD_MODFILE='"drivers/ata/libata"' -DKBUILD_BASENAME='"libata_core"'
> -DKBUILD_MODNAME='"libata"' -D__KBUILD_MODNAME=kmod_libata -c -o
> drivers/ata/libata-core.o drivers/ata/libata-core.c
>
>
> (gdb) bt
> #0  0x0100318e in vec vl_ptr>::operator[] (ix=0, this=0x50ee6e0) at ../../gcc/gcc/vec.h:1495
> #1  back_jt_path_registry::adjust_paths_after_duplication
> (this=0x7ffdf8b6e868, curr_path_num=0) at
> ../../gcc/gcc/tree-ssa-threadupdate.c:2315
> #2  0x01003c0d in back_jt_path_registry::duplicate_thread_path
> (this=0x7ffdf8b6e868, entry=0x7f92651000c0, exit=,
> region=, n_region=8,
> current_path_no=0) at ../../gcc/gcc/tree-ssa-threadupdate.c:2546
> #3  0x010051e4 in back_jt_path_registry::update_cfg
> (this=0x7ffdf8b6e868) at ../../gcc/gcc/tree-ssa-threadupdate.c:2656
> #4  0x01003ecc in jt_path_registry::thread_through_all_blocks
> (this=0x7ffdf8b6e868, peel_loop_headers=) at
> ../../gcc/gcc/tree-ssa-threadupdate.c:2604
> #5  0x00ffb5a7 in
> back_threader_registry::thread_through_all_blocks
> (may_peel_loop_headers=true, this=0x7ffdf8b6e868) at
> ../../gcc/gcc/tree-ssa-threadbackward.c:556
> #6  back_threader::thread_through_all_blocks 

Re: [PATCH] Cleanup --params for backward threader.

2021-10-19 Thread Jan-Benedict Glaw
Hi Aldy!

On Thu, 2021-10-14 16:25:48 +0200, Aldy Hernandez via Gcc-patches 
 wrote:
> The new backward threader makes some of the --param knobs used to
> control it questionable at best or no longer applicable at worst.
> 
> The fsm-maximum-phi-arguments param is unused and can be removed.
> 
> The max-fsm-thread-length param is block based which is a bit redundant,
> since we already restrict paths based on instruction estimates.
> 
> The max-fsm-thread-paths restricts the total number of threadable paths
> in a function.  We probably don't need this.  Besides, the forward
> threader has no such restriction.
> 
> OK pending tests?

This causes a regression for me. I'm auto-building lots of GCC
cross-compilers and use these to cross-build the Linux kernel.

  Using binutils/gas/gcc configured for --target=sh-linux (actual
configuration for GCC is this:

.../gcc/configure --target=sh-linux --enable-werror-always \
  --enable-languages=all --disable-gcov\
  --disable-shared --disable-threads   \
  --without-headers \
  
--prefix=/var/lib/laminar/run/gcc-sh-linux/13/toolchain-install
)

Then, building Linux for a good number of default configurations for
ARCH=sh and ARCH=arm, GCC will just loop:

$ make ARCH=sh distclean
$ cp arch/sh/configs/r7780mp_defconfig .config
$ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- oldconfig < /dev/null
$ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- prepare
$ make V=1 ARCH=sh CROSS_COMPILE=sh-linux- all
[...]
  sh-linux-gcc -Wp,-MMD,drivers/ata/.libata-core.o.d -nostdinc -isystem 
/tmp/testbed/install/lib/gcc/sh-linux/12.0.0/include -I./arch/sh/include 
-I./arch/sh/include/generated  -I./include -I./arch/sh/include/uapi 
-I./arch/sh/include/generated/uapi -I./include/uapi -I./include/generated/uapi 
-include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h 
-include ./include/linux/compiler_types.h -D__KERNEL__ -m4 -m4-nofpu -m4a 
-m4a-nofpu -ml -mno-fdpic -Wa,-isa=sh4a-up -ffreestanding -I 
./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 -I 
./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I 
./arch/sh/include/mach-common -fmacro-prefix-map=./= -Wall -Wundef 
-Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common 
-fshort-wchar -fno-PIE -Werror=implicit-function-declaration 
-Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu89 -pipe 
-m4 -m4-nofpu -m4a -m4a-nofpu -ml -mno-fdpic -Wa,-isa=sh4a-up -ffreestanding -I 
./arch/sh/include/cpu-sh4a -I ./arch/sh/include/cpu-sh4 -I 
./arch/sh/include/cpu-common -I ./arch/sh/include/mach-highlander -I 
./arch/sh/include/mach-common -fno-delete-null-pointer-checks 
-Wno-frame-address -Wno-format-truncation -Wno-format-overflow 
-Wno-address-of-packed-member -O2 -fno-allow-store-data-races 
-Wframe-larger-than=1024 -fstack-protector-strong -Wimplicit-fallthrough=5 
-Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable 
-fomit-frame-pointer -ftrivial-auto-var-init=zero 
-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang 
-fno-stack-clash-protection -g -Wdeclaration-after-statement -Wvla 
-Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds 
-Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized 
-fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time 
-Werror=incompatible-pointer-types -Werror=designated-init 
-Wno-packed-not-aligned-DKBUILD_MODFILE='"drivers/ata/libata"' 
-DKBUILD_BASENAME='"libata_core"' -DKBUILD_MODNAME='"libata"' 
-D__KBUILD_MODNAME=kmod_libata -c -o drivers/ata/libata-core.o 
drivers/ata/libata-core.c


(gdb) bt
#0  0x0100318e in vec::operator[] 
(ix=0, this=0x50ee6e0) at ../../gcc/gcc/vec.h:1495
#1  back_jt_path_registry::adjust_paths_after_duplication (this=0x7ffdf8b6e868, 
curr_path_num=0) at ../../gcc/gcc/tree-ssa-threadupdate.c:2315
#2  0x01003c0d in back_jt_path_registry::duplicate_thread_path 
(this=0x7ffdf8b6e868, entry=0x7f92651000c0, exit=, 
region=, n_region=8, 
current_path_no=0) at ../../gcc/gcc/tree-ssa-threadupdate.c:2546
#3  0x010051e4 in back_jt_path_registry::update_cfg 
(this=0x7ffdf8b6e868) at ../../gcc/gcc/tree-ssa-threadupdate.c:2656
#4  0x01003ecc in jt_path_registry::thread_through_all_blocks 
(this=0x7ffdf8b6e868, peel_loop_headers=) at 
../../gcc/gcc/tree-ssa-threadupdate.c:2604
#5  0x00ffb5a7 in back_threader_registry::thread_through_all_blocks 
(may_peel_loop_headers=true, this=0x7ffdf8b6e868) at 
../../gcc/gcc/tree-ssa-threadbackward.c:556
#6  back_threader::thread_through_all_blocks (may_peel_loop_headers=true, 
this=0x7ffdf8b6e860) at ../../gcc/gcc/tree-ssa-threadbackward.c:501
#7  (anonymous namespace)::try_thread_blocks (fun=fun@entry=0x7f926eb389c0) at 
../../gcc/gcc/tree-ssa-threadbackward.c:946
#8  0x00ffb5eb in (anonymous 

Re: [PATCH] Cleanup --params for backward threader.

2021-10-14 Thread Jeff Law via Gcc-patches




On 10/14/2021 8:25 AM, Aldy Hernandez wrote:

The new backward threader makes some of the --param knobs used to
control it questionable at best or no longer applicable at worst.

The fsm-maximum-phi-arguments param is unused and can be removed.

The max-fsm-thread-length param is block based which is a bit redundant,
since we already restrict paths based on instruction estimates.

The max-fsm-thread-paths restricts the total number of threadable paths
in a function.  We probably don't need this.  Besides, the forward
threader has no such restriction.

OK pending tests?

gcc/ChangeLog:

* doc/invoke.texi: Remove max-fsm-thread-length,
max-fsm-thread-paths, and fsm-maximum-phi-arguments.
* params.opt: Same.
* tree-ssa-threadbackward.c (back_threader::back_threader): Remove
argument.
(back_threader_registry::back_threader_registry): Same.
(back_threader_profitability::profitable_path_p): Remove
param_max_fsm_thread-length.
(back_threader_registry::register_path): Remove
m_max_allowable_paths.
OK.  I don't think any of those params were addressing pathological 
cases and thread-length is really handled better by costing based on the 
# statements.


jeff



[PATCH] Cleanup --params for backward threader.

2021-10-14 Thread Aldy Hernandez via Gcc-patches
The new backward threader makes some of the --param knobs used to
control it questionable at best or no longer applicable at worst.

The fsm-maximum-phi-arguments param is unused and can be removed.

The max-fsm-thread-length param is block based which is a bit redundant,
since we already restrict paths based on instruction estimates.

The max-fsm-thread-paths restricts the total number of threadable paths
in a function.  We probably don't need this.  Besides, the forward
threader has no such restriction.

OK pending tests?

gcc/ChangeLog:

* doc/invoke.texi: Remove max-fsm-thread-length,
max-fsm-thread-paths, and fsm-maximum-phi-arguments.
* params.opt: Same.
* tree-ssa-threadbackward.c (back_threader::back_threader): Remove
argument.
(back_threader_registry::back_threader_registry): Same.
(back_threader_profitability::profitable_path_p): Remove
param_max_fsm_thread-length.
(back_threader_registry::register_path): Remove
m_max_allowable_paths.
---
 gcc/doc/invoke.texi   | 12 
 gcc/params.opt| 12 
 gcc/tree-ssa-threadbackward.c | 27 +++
 3 files changed, 3 insertions(+), 48 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 03234c887dc..69993270b39 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14464,14 +14464,6 @@ Emit instrumentation calls to __tsan_func_entry() and 
__tsan_func_exit().
 Maximum number of instructions to copy when duplicating blocks on a
 finite state automaton jump thread path.
 
-@item max-fsm-thread-length
-Maximum number of basic blocks on a finite state automaton jump thread
-path.
-
-@item max-fsm-thread-paths
-Maximum number of new jump thread paths to create for a finite state
-automaton.
-
 @item parloops-chunk-size
 Chunk size of omp schedule for loops parallelized by parloops.
 
@@ -14626,10 +14618,6 @@ The maximum depth of recursive inlining for non-inline 
functions.
 Scale factor to apply to the number of statements in a threading path
 when comparing to the number of (scaled) blocks.
 
-@item fsm-maximum-phi-arguments
-Maximum number of arguments a PHI may have before the FSM threader
-will not try to thread through its block.
-
 @item uninit-control-dep-attempts
 Maximum number of nested calls to search for control dependencies
 during uninitialized variable analysis.
diff --git a/gcc/params.opt b/gcc/params.opt
index 84d642d72c5..06a6fdc9deb 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -173,10 +173,6 @@ Common Joined UInteger Var(param_ranger_logical_depth) 
Init(6) IntegerRange(1, 9
 Maximum depth of logical expression evaluation ranger will look through when
 evaluating outgoing edge ranges.
 
--param=fsm-maximum-phi-arguments=
-Common Joined UInteger Var(param_fsm_maximum_phi_arguments) Init(100) 
IntegerRange(1, 99) Param Optimization
-Maximum number of arguments a PHI may have before the FSM threader will not 
try to thread through its block.
-
 -param=fsm-scale-path-blocks=
 Common Joined UInteger Var(param_fsm_scale_path_blocks) Init(3) 
IntegerRange(1, 10) Param Optimization
 Scale factor to apply to the number of blocks in a threading path when 
comparing to the number of (scaled) statements.
@@ -537,18 +533,10 @@ The maximum number of nested indirect inlining performed 
by early inliner.
 Common Joined UInteger Var(param_max_fields_for_field_sensitive) Param
 Maximum number of fields in a structure before pointer analysis treats the 
structure as a single variable.
 
--param=max-fsm-thread-length=
-Common Joined UInteger Var(param_max_fsm_thread_length) Init(10) 
IntegerRange(1, 99) Param Optimization
-Maximum number of basic blocks on a finite state automaton jump thread path.
-
 -param=max-fsm-thread-path-insns=
 Common Joined UInteger Var(param_max_fsm_thread_path_insns) Init(100) 
IntegerRange(1, 99) Param Optimization
 Maximum number of instructions to copy when duplicating blocks on a finite 
state automaton jump thread path.
 
--param=max-fsm-thread-paths=
-Common Joined UInteger Var(param_max_fsm_thread_paths) Init(50) 
IntegerRange(1, 99) Param Optimization
-Maximum number of new jump thread paths to create for a finite state automaton.
-
 -param=max-gcse-insertion-ratio=
 Common Joined UInteger Var(param_max_gcse_insertion_ratio) Init(20) Param 
Optimization
 The maximum ratio of insertions to deletions of expressions in GCSE.
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 1999ccf4834..7c2c1112bee 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -52,12 +52,11 @@ along with GCC; see the file COPYING3.  If not see
 class back_threader_registry
 {
 public:
-  back_threader_registry (int max_allowable_paths);
+  back_threader_registry ();
   bool register_path (const vec &, edge taken);
   bool thread_through_all_blocks (bool may_peel_loop_headers);
 private: