Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On Tue, 30 Jun 2015, James Greenhalgh wrote: On Fri, Jun 26, 2015 at 06:10:00PM +0100, Jakub Jelinek wrote: On Fri, Jun 26, 2015 at 06:03:34PM +0100, James Greenhalgh wrote: --- /dev/null +++ b/gcc/testsuite/g++.dg/pr66119.C I think generally testcases shouldn't be added into g++.dg/ directly, but subdirectories. So g++.dg/opt/ ? @@ -0,0 +1,69 @@ +/* PR66119 - MOVE_RATIO is not constant in a compiler run, so Scalar + Reduction of Aggregates must ask the back-end more than once what + the value of MOVE_RATIO now is. */ + +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ In g++.dg/, dejagnu cycles through all 3 major -std=c* versions, thus using -std=c++11 is inappropriate. If the test requires c++11, instead you do // { dg-do compile { target { { i?86-*-* x86_64-*-* } c++11 } } } +/* { dg-options -std=c++11 -O3 -mavx -fdump-tree-sra -march=slm { target avx_runtime } } */ and remove -std=c++11 here. I don't see any point in guarding it with avx_runtime, after all, if not avx_runtime, the test will be compiled with -O0 and thus very likely fail the scan-tree-dump test. As it is dg-do compile test only, you have no dependency on assembler nor linker nor runtime. But I'd add -mtune=slm too. Thanks, I'm used to the dance we try to do to get Neon enabled/disabled correctly when testing multilib environments on ARM so tried to overengineer things! I've updated the testcase as you suggested, and moved it to g++.dg/opt. OK? Looks good to me now. Thanks, Richard. Thanks, James --- gcc/ 2015-06-30 James Greenhalgh james.greenha...@arm.com PR tree-optimization/66119 * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults. gcc/testsuite/ 2015-06-30 James Greenhalgh james.greenha...@arm.com * g++.dg/opt/pr66119.C: New. -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On Fri, Jun 26, 2015 at 06:10:00PM +0100, Jakub Jelinek wrote: On Fri, Jun 26, 2015 at 06:03:34PM +0100, James Greenhalgh wrote: --- /dev/null +++ b/gcc/testsuite/g++.dg/pr66119.C I think generally testcases shouldn't be added into g++.dg/ directly, but subdirectories. So g++.dg/opt/ ? @@ -0,0 +1,69 @@ +/* PR66119 - MOVE_RATIO is not constant in a compiler run, so Scalar + Reduction of Aggregates must ask the back-end more than once what + the value of MOVE_RATIO now is. */ + +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ In g++.dg/, dejagnu cycles through all 3 major -std=c* versions, thus using -std=c++11 is inappropriate. If the test requires c++11, instead you do // { dg-do compile { target { { i?86-*-* x86_64-*-* } c++11 } } } +/* { dg-options -std=c++11 -O3 -mavx -fdump-tree-sra -march=slm { target avx_runtime } } */ and remove -std=c++11 here. I don't see any point in guarding it with avx_runtime, after all, if not avx_runtime, the test will be compiled with -O0 and thus very likely fail the scan-tree-dump test. As it is dg-do compile test only, you have no dependency on assembler nor linker nor runtime. But I'd add -mtune=slm too. Thanks, I'm used to the dance we try to do to get Neon enabled/disabled correctly when testing multilib environments on ARM so tried to overengineer things! I've updated the testcase as you suggested, and moved it to g++.dg/opt. OK? Thanks, James --- gcc/ 2015-06-30 James Greenhalgh james.greenha...@arm.com PR tree-optimization/66119 * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults. gcc/testsuite/ 2015-06-30 James Greenhalgh james.greenha...@arm.com * g++.dg/opt/pr66119.C: New. diff --git a/gcc/testsuite/g++.dg/opt/pr66119.C b/gcc/testsuite/g++.dg/opt/pr66119.C new file mode 100644 index 000..5b420c2 --- /dev/null +++ b/gcc/testsuite/g++.dg/opt/pr66119.C @@ -0,0 +1,69 @@ +/* PR66119 - MOVE_RATIO is not constant in a compiler run, so Scalar + Reduction of Aggregates must ask the back-end more than once what + the value of MOVE_RATIO now is. */ + +/* { dg-do compile { target { { i?86-*-* x86_64-*-* } c++11 } } } */ +/* { dg-options -O3 -mavx -fdump-tree-sra -march=slm -mtune=slm } */ + +#include immintrin.h + +class MyAVX +{ + __m256d data; +public: + MyAVX () = default; + MyAVX (const MyAVX ) = default; + MyAVX (__m256d _data) : data(_data) { ; } + + MyAVX operator= (const MyAVX ) = default; + + operator __m256d () const { return data; } + MyAVX operator+ (MyAVX s2) { return data+s2.data; } +}; + +template typename T class AVX_trait { ; }; + +template class AVX_traitdouble { +public: + typedef __m256d TSIMD; +}; + + +template typename T +class MyTSIMD +{ + typename AVX_traitT::TSIMD data; + +public: + MyTSIMD () = default; + MyTSIMD (const MyTSIMD ) = default; + // MyTSIMD (const MyTSIMD s2) : data(s2.data) { ; } + MyTSIMD (typename AVX_traitT::TSIMD _data) : data(_data) { ; } + + operator typename AVX_traitT::TSIMD() const { return data; } + MyTSIMD operator+ (MyTSIMD s2) { return data+s2.data; } +}; + +// using MyVec = MyAVX; +using MyVec = MyTSIMDdouble; + +class Vec2 +{ + MyVec a, b; +public: + Vec2 (MyVec aa, MyVec ab) : a(aa), b(ab) { ; } + Vec2 operator+ (Vec2 v2) { return Vec2(a+v2.a, b+v2.b); } +}; + +inline __attribute__ ((__always_inline__)) +Vec2 ComputeSomething (Vec2 a, Vec2 b) +{ + return a+b; +} + +Vec2 TestFunction (Vec2 a, Vec2 b) +{ + return ComputeSomething (a,b); +} + +/* { dg-final { scan-tree-dump Created a replacement for b sra } } */ diff --git a/gcc/toplev.c b/gcc/toplev.c index 573b144..d97d852 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1299,20 +1299,6 @@ process_options (void) so we can correctly initialize debug output. */ no_backend = lang_hooks.post_options (main_input_filename); - /* Set default values for parameters relation to the Scalar Reduction - of Aggregates passes (SRA and IP-SRA). We must do this here, rather - than in opts.c:default_options_optimization as historically these - tuning heuristics have been based on MOVE_RATIO, which on some - targets requires other symbols from the backend. */ - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED, - get_move_ratio (true) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE, - get_move_ratio (false) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - /* Some machines may reject certain combinations of options. */ targetm.target_option.override (); diff --git
Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On Fri, Jun 26, 2015 at 06:03:34PM +0100, James Greenhalgh wrote: --- /dev/null +++ b/gcc/testsuite/g++.dg/pr66119.C I think generally testcases shouldn't be added into g++.dg/ directly, but subdirectories. So g++.dg/opt/ ? @@ -0,0 +1,69 @@ +/* PR66119 - MOVE_RATIO is not constant in a compiler run, so Scalar + Reduction of Aggregates must ask the back-end more than once what + the value of MOVE_RATIO now is. */ + +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ In g++.dg/, dejagnu cycles through all 3 major -std=c* versions, thus using -std=c++11 is inappropriate. If the test requires c++11, instead you do // { dg-do compile { target { { i?86-*-* x86_64-*-* } c++11 } } } +/* { dg-options -std=c++11 -O3 -mavx -fdump-tree-sra -march=slm { target avx_runtime } } */ and remove -std=c++11 here. I don't see any point in guarding it with avx_runtime, after all, if not avx_runtime, the test will be compiled with -O0 and thus very likely fail the scan-tree-dump test. As it is dg-do compile test only, you have no dependency on assembler nor linker nor runtime. But I'd add -mtune=slm too. Jakub
Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On Thu, Jun 25, 2015 at 05:05:22AM +0100, Jeff Law wrote: On 06/23/2015 09:42 AM, James Greenhalgh wrote: On Tue, Jun 23, 2015 at 09:52:01AM +0100, Jakub Jelinek wrote: On Tue, Jun 23, 2015 at 09:18:52AM +0100, James Greenhalgh wrote: This patch fixes the issue by always calling get_move_ratio in the SRA code, ensuring that an up-to-date value is used. Unfortunately, this means we have to use 0 as a sentinel value for the parameter - indicating no user override of the feature - and therefore cannot use it to disable scalarization. However, there are other ways to disable scalarazation (-fno-tree-sra) so this is not a great loss. You can handle even that. snip enum compiler_param param = optimize_function_for_size_p (cfun) ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE : PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED; unsigned max_scalarization_size = PARAM_VALUE (param) * BITS_PER_UNIT; if (!max_scalarization_size !global_options_set.x_param_values[param]) Then it will handle explicit --param sra-max-scalarization-size-Os*=0 differently from implicit 0. Ah hah! OK, I've respun the patch removing this extra justification in the documentation and reshuffling the logic a little. OT, shouldn't max_scalarization_size be at least unsigned HOST_WIDE_INT, so that it doesn't overflow for larger values (0x4000 etc.)? Probably need some cast in the multiplication to avoid UB in the compiler. I've increased the size of max_scalarization_size to a UHWI in this spin. Bootstrapped and tested on AArch64 and x86-64 with no issues and checked to see the PR is fixed. OK for trunk, and gcc-5 in a few days? Thanks, James --- gcc/ 2015-06-23 James Greenhalgh james.greenha...@arm.com PR tree-optimization/66119 * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults. Any testcase for this change? OK with a testcase. jeff Thanks Jeff, I'm not really up to scratch on the gotchas for the dg directives targetting avx on x86, does the testcase in this patch (taken from the PR) look OK? Cheers, James diff --git a/gcc/testsuite/g++.dg/pr66119.C b/gcc/testsuite/g++.dg/pr66119.C new file mode 100644 index 000..9979e91 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr66119.C @@ -0,0 +1,69 @@ +/* PR66119 - MOVE_RATIO is not constant in a compiler run, so Scalar + Reduction of Aggregates must ask the back-end more than once what + the value of MOVE_RATIO now is. */ + +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options -std=c++11 -O3 -mavx -fdump-tree-sra -march=slm { target avx_runtime } } */ + +#include immintrin.h + +class MyAVX +{ + __m256d data; +public: + MyAVX () = default; + MyAVX (const MyAVX ) = default; + MyAVX (__m256d _data) : data(_data) { ; } + + MyAVX operator= (const MyAVX ) = default; + + operator __m256d () const { return data; } + MyAVX operator+ (MyAVX s2) { return data+s2.data; } +}; + +template typename T class AVX_trait { ; }; + +template class AVX_traitdouble { +public: + typedef __m256d TSIMD; +}; + + +template typename T +class MyTSIMD +{ + typename AVX_traitT::TSIMD data; + +public: + MyTSIMD () = default; + MyTSIMD (const MyTSIMD ) = default; + // MyTSIMD (const MyTSIMD s2) : data(s2.data) { ; } + MyTSIMD (typename AVX_traitT::TSIMD _data) : data(_data) { ; } + + operator typename AVX_traitT::TSIMD() const { return data; } + MyTSIMD operator+ (MyTSIMD s2) { return data+s2.data; } +}; + +// using MyVec = MyAVX; +using MyVec = MyTSIMDdouble; + +class Vec2 +{ + MyVec a, b; +public: + Vec2 (MyVec aa, MyVec ab) : a(aa), b(ab) { ; } + Vec2 operator+ (Vec2 v2) { return Vec2(a+v2.a, b+v2.b); } +}; + +inline __attribute__ ((__always_inline__)) +Vec2 ComputeSomething (Vec2 a, Vec2 b) +{ + return a+b; +} + +Vec2 TestFunction (Vec2 a, Vec2 b) +{ + return ComputeSomething (a,b); +} + +/* { dg-final { scan-tree-dump Created a replacement for b sra } } */ diff --git a/gcc/toplev.c b/gcc/toplev.c index 2f43a89..902bfc7 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1301,20 +1301,6 @@ process_options (void) so we can correctly initialize debug output. */ no_backend = lang_hooks.post_options (main_input_filename); - /* Set default values for parameters relation to the Scalar Reduction - of Aggregates passes (SRA and IP-SRA). We must do this here, rather - than in opts.c:default_options_optimization as historically these - tuning heuristics have been based on MOVE_RATIO, which on some - targets requires other symbols from the backend. */ - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED, - get_move_ratio (true) * UNITS_PER_WORD, -
Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On 06/23/2015 09:42 AM, James Greenhalgh wrote: On Tue, Jun 23, 2015 at 09:52:01AM +0100, Jakub Jelinek wrote: On Tue, Jun 23, 2015 at 09:18:52AM +0100, James Greenhalgh wrote: This patch fixes the issue by always calling get_move_ratio in the SRA code, ensuring that an up-to-date value is used. Unfortunately, this means we have to use 0 as a sentinel value for the parameter - indicating no user override of the feature - and therefore cannot use it to disable scalarization. However, there are other ways to disable scalarazation (-fno-tree-sra) so this is not a great loss. You can handle even that. snip enum compiler_param param = optimize_function_for_size_p (cfun) ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE : PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED; unsigned max_scalarization_size = PARAM_VALUE (param) * BITS_PER_UNIT; if (!max_scalarization_size !global_options_set.x_param_values[param]) Then it will handle explicit --param sra-max-scalarization-size-Os*=0 differently from implicit 0. Ah hah! OK, I've respun the patch removing this extra justification in the documentation and reshuffling the logic a little. OT, shouldn't max_scalarization_size be at least unsigned HOST_WIDE_INT, so that it doesn't overflow for larger values (0x4000 etc.)? Probably need some cast in the multiplication to avoid UB in the compiler. I've increased the size of max_scalarization_size to a UHWI in this spin. Bootstrapped and tested on AArch64 and x86-64 with no issues and checked to see the PR is fixed. OK for trunk, and gcc-5 in a few days? Thanks, James --- gcc/ 2015-06-23 James Greenhalgh james.greenha...@arm.com PR tree-optimization/66119 * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults. Any testcase for this change? OK with a testcase. jeff
Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On Tue, Jun 23, 2015 at 09:52:01AM +0100, Jakub Jelinek wrote: On Tue, Jun 23, 2015 at 09:18:52AM +0100, James Greenhalgh wrote: This patch fixes the issue by always calling get_move_ratio in the SRA code, ensuring that an up-to-date value is used. Unfortunately, this means we have to use 0 as a sentinel value for the parameter - indicating no user override of the feature - and therefore cannot use it to disable scalarization. However, there are other ways to disable scalarazation (-fno-tree-sra) so this is not a great loss. You can handle even that. snip enum compiler_param param = optimize_function_for_size_p (cfun) ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE : PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED; unsigned max_scalarization_size = PARAM_VALUE (param) * BITS_PER_UNIT; if (!max_scalarization_size !global_options_set.x_param_values[param]) Then it will handle explicit --param sra-max-scalarization-size-Os*=0 differently from implicit 0. Ah hah! OK, I've respun the patch removing this extra justification in the documentation and reshuffling the logic a little. OT, shouldn't max_scalarization_size be at least unsigned HOST_WIDE_INT, so that it doesn't overflow for larger values (0x4000 etc.)? Probably need some cast in the multiplication to avoid UB in the compiler. I've increased the size of max_scalarization_size to a UHWI in this spin. Bootstrapped and tested on AArch64 and x86-64 with no issues and checked to see the PR is fixed. OK for trunk, and gcc-5 in a few days? Thanks, James --- gcc/ 2015-06-23 James Greenhalgh james.greenha...@arm.com PR tree-optimization/66119 * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults. diff --git a/gcc/toplev.c b/gcc/toplev.c index 2f43a89..902bfc7 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1301,20 +1301,6 @@ process_options (void) so we can correctly initialize debug output. */ no_backend = lang_hooks.post_options (main_input_filename); - /* Set default values for parameters relation to the Scalar Reduction - of Aggregates passes (SRA and IP-SRA). We must do this here, rather - than in opts.c:default_options_optimization as historically these - tuning heuristics have been based on MOVE_RATIO, which on some - targets requires other symbols from the backend. */ - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED, - get_move_ratio (true) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE, - get_move_ratio (false) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - /* Some machines may reject certain combinations of options. */ targetm.target_option.override (); diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 8e34244..5f573f6 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -2549,11 +2549,20 @@ analyze_all_variable_accesses (void) bitmap tmp = BITMAP_ALLOC (NULL); bitmap_iterator bi; unsigned i; - unsigned max_scalarization_size -= (optimize_function_for_size_p (cfun) - ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE) - : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)) - * BITS_PER_UNIT; + bool optimize_speed_p = !optimize_function_for_size_p (cfun); + + enum compiler_param param = optimize_speed_p + ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED + : PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE; + + /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_..., + fall back to a target default. */ + unsigned HOST_WIDE_INT max_scalarization_size += global_options_set.x_param_values[param] + ? PARAM_VALUE (param) + : get_move_ratio (optimize_speed_p) * UNITS_PER_WORD; + + max_scalarization_size *= BITS_PER_UNIT; EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) if (bitmap_bit_p (should_scalarize_away_bitmap, i)
[Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
Hi, The problem in PR66119 is that we assume MOVE_RATIO will be constant for a compilation run, such that we only need to read it once at compiler startup if we want to set up defaults for --param sra-max-scalarization-size-Osize and --param sra-max-scalarization-size-Osize. This assumption is faulty. Some targets may have MOVE_RATIO set up to use state which depends on the selected processor - which may vary per function for a switchable target. This patch fixes the issue by always calling get_move_ratio in the SRA code, ensuring that an up-to-date value is used. Unfortunately, this means we have to use 0 as a sentinel value for the parameter - indicating no user override of the feature - and therefore cannot use it to disable scalarization. However, there are other ways to disable scalarazation (-fno-tree-sra) so this is not a great loss. Bootstrapped and tested on x86-64 and AArch64 with no issues. OK for trunk (and 5.2 after a few days watching for fallout)? Thanks, James --- gcc/ 2015-06-23 James Greenhalgh james.greenha...@arm.com PR tree-optimization/66119 * doc/invoke.texi (sra-max-scalarization-size-Osize): Mention that 0 is used as a sentinel value. (sra-max-scalarization-size-Ospeed): Likewise. * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index b99ab1c..fc9dad7 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -10894,7 +10894,9 @@ variables. These parameters control the maximum size, in storage units, of aggregate which is considered for replacement when compiling for speed (@option{sra-max-scalarization-size-Ospeed}) or size -(@option{sra-max-scalarization-size-Osize}) respectively. +(@option{sra-max-scalarization-size-Osize}) respectively. The +value 0 indicates that the compiler should use an appropriate size +for the target processor, this is the default behaviour. @item tm-max-aggregate-size When making copies of thread-local variables in a transaction, this diff --git a/gcc/toplev.c b/gcc/toplev.c index 2f43a89..902bfc7 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1301,20 +1301,6 @@ process_options (void) so we can correctly initialize debug output. */ no_backend = lang_hooks.post_options (main_input_filename); - /* Set default values for parameters relation to the Scalar Reduction - of Aggregates passes (SRA and IP-SRA). We must do this here, rather - than in opts.c:default_options_optimization as historically these - tuning heuristics have been based on MOVE_RATIO, which on some - targets requires other symbols from the backend. */ - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED, - get_move_ratio (true) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - maybe_set_param_value -(PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE, - get_move_ratio (false) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - /* Some machines may reject certain combinations of options. */ targetm.target_option.override (); diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 8e34244..e2419af 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -2549,12 +2549,24 @@ analyze_all_variable_accesses (void) bitmap tmp = BITMAP_ALLOC (NULL); bitmap_iterator bi; unsigned i; + bool optimize_speed_p = !optimize_function_for_size_p (cfun); + unsigned max_scalarization_size -= (optimize_function_for_size_p (cfun) - ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE) - : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)) += (optimize_speed_p + ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED) + : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)) * BITS_PER_UNIT; + /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_..., + fall back to a target default. This means that zero cannot be + used to disable scalarization as we've taken it as a sentinel + value. This is not ideal, but see PR66119 for the reason we + can't simply set the target defaults ahead of time during option + handling. */ + if (!max_scalarization_size) +max_scalarization_size = get_move_ratio (optimize_speed_p) + * UNITS_PER_WORD * BITS_PER_UNIT; + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) if (bitmap_bit_p (should_scalarize_away_bitmap, i) !bitmap_bit_p (cannot_scalarize_away_bitmap, i))
Re: [Patch SRA] Fix PR66119 by calling get_move_ratio in SRA
On Tue, Jun 23, 2015 at 09:18:52AM +0100, James Greenhalgh wrote: This patch fixes the issue by always calling get_move_ratio in the SRA code, ensuring that an up-to-date value is used. Unfortunately, this means we have to use 0 as a sentinel value for the parameter - indicating no user override of the feature - and therefore cannot use it to disable scalarization. However, there are other ways to disable scalarazation (-fno-tree-sra) so this is not a great loss. You can handle even that. diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 8e34244..e2419af 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -2549,12 +2549,24 @@ analyze_all_variable_accesses (void) bitmap tmp = BITMAP_ALLOC (NULL); bitmap_iterator bi; unsigned i; + bool optimize_speed_p = !optimize_function_for_size_p (cfun); + unsigned max_scalarization_size -= (optimize_function_for_size_p (cfun) - ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE) - : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)) += (optimize_speed_p + ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED) + : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)) * BITS_PER_UNIT; + /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_..., + fall back to a target default. This means that zero cannot be + used to disable scalarization as we've taken it as a sentinel + value. This is not ideal, but see PR66119 for the reason we + can't simply set the target defaults ahead of time during option + handling. */ + if (!max_scalarization_size) enum compiler_param param = optimize_function_for_size_p (cfun) ? PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE : PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED; unsigned max_scalarization_size = PARAM_VALUE (param) * BITS_PER_UNIT; if (!max_scalarization_size !global_options_set.x_param_values[param]) Then it will handle explicit --param sra-max-scalarization-size-Os*=0 differently from implicit 0. Or you could allow value of -1 for those params and make that the default - -1 would mean the special get_move_ration value, 0 would disable, 0 would be the requested scalarization. OT, shouldn't max_scalarization_size be at least unsigned HOST_WIDE_INT, so that it doesn't overflow for larger values (0x4000 etc.)? Probably need some cast in the multiplication to avoid UB in the compiler. Jakub