[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #16 from Jakub Jelinek ---
Author: jakub
Date: Fri Dec 20 23:51:15 2019
New Revision: 279687

URL: https://gcc.gnu.org/viewcvs?rev=279687=gcc=rev
Log:
        PR middle-end/91512
        PR fortran/92738
        * lang.opt (-finline-arg-packing): Add trailing dot to help text.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/lang.opt
    trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

Thomas Koenig changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from Thomas Koenig ---
The user can now circumvent this with -finline-arg-packing.

Closing as FIXED.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #14 from Thomas Koenig ---
Author: tkoenig
Date: Fri Dec 20 11:51:05 2019
New Revision: 279639

URL: https://gcc.gnu.org/viewcvs?rev=279639=gcc=rev
Log:
Introduce -finline-arg-packing.

2019-12-20  Thomas Koenig

        PR middle-end/91512
        PR fortran/92738
        * invoke.texi: Document -finline-arg-packing.
        * lang.opt: Add -finline-arg-packing.
        * options.c (gfc_post_options): Handle -finline-arg-packing.
        * trans-array.c (gfc_conv_array_parameter): Use
        flag_inline_arg_packing instead of checking for optimize and
        optimize_size.

2019-12-20  Thomas Koenig

        PR middle-end/91512
        PR fortran/92738
        * gfortran.dg/inline_pack_25.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/internal_pack_25.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/invoke.texi
    trunk/gcc/fortran/lang.opt
    trunk/gcc/fortran/options.c
    trunk/gcc/fortran/trans-array.c
    trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #13 from Thomas Koenig ---
(In reply to Thomas Koenig from comment #12)

> People who have problems can then enable

I meant disable, of course.

> that option for
> the specific files they have the problems with.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #12 from Thomas Koenig ---
(In reply to Wilco from comment #11)

> Would using -frepack-arrays solve this issue? I proposed making that the
> default a while back. It would do any repacking that is necessary at call
> sites rather than creating multiple copies of all loops in every function.

I'm not convinced that that is the answer. It would penalize (at run time)
programs which use non-contiguous memory when _not_ passing it to an
explicit size or assumed size parameter. For example, all the optimizations
for passing a transposed argument would then no longer work.

What we could do instead is to introduce another option, -frepack-inline
(or whatever we want to call it), and enable it by default at all -O levels
except -Os. People who have problems can then enable that option for
the specific files they have the problems with.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #11 from Wilco ---
(In reply to Thomas Koenig from comment #10)

> (In reply to Martin Liška from comment #6)
> > So wrf grew starting with r271377, size (w/o debug info) goes from 20164464B
> > to 23674792.
>
> I think we've had this discussion before, although I cannot offhand
> recall the PR number. PR91512 is closely related.
>
> Since r271377, arguments which may be contiguous are now (conditionally)
> packed and unpacked inline (see PR88821).
>
> This was done so that the middle end can look into these conversions
> and possibly eliminate them, if it can be determined via inlining
> or LTO that the argument is contiguous anyway). This can lead to an
> extremely large performance boost for some test cases (*10 or more),
> but will, in general, lead to a size increase.
>
> Now, wrf has an extremely strange (and rare) programming style where they
> pass
> a ton of assumed shape arguments (where it is not clear, at compile-time,
> if they need packing/unpacking) to an old-style array argument. This
> causes considerable code size increase.
>
> So, it's a tradeoff, which we discussed at the time. This is why this
> is not done at -Os.
>
> Should we "fix" this? I think not, the style of wrf is just too horrid,
> and pessimizing other programs for the sake of one benchmark makes little
> sense to me.

Would using -frepack-arrays solve this issue? I proposed making that the
default a while back. It would do any repacking that is necessary at call
sites rather than creating multiple copies of all loops in every function.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #10 from Thomas Koenig ---
(In reply to Martin Liška from comment #6)

> So wrf grew starting with r271377, size (w/o debug info) goes from 20164464B
> to 23674792.

I think we've had this discussion before, although I cannot offhand
recall the PR number. PR91512 is closely related.

Since r271377, arguments which may be contiguous are now (conditionally)
packed and unpacked inline (see PR88821).

This was done so that the middle end can look into these conversions
and possibly eliminate them (if it can be determined via inlining
or LTO that the argument is contiguous anyway). This can lead to an
extremely large performance boost for some test cases (a factor of 10 or
more), but will, in general, lead to a size increase.

Now, wrf has an extremely strange (and rare) programming style where they
pass a ton of assumed-shape arguments (where it is not clear, at compile
time, whether they need packing/unpacking) to an old-style array argument.
This causes a considerable code size increase.

So, it's a tradeoff, which we discussed at the time. This is why this
is not done at -Os.

Should we "fix" this? I think not, the style of wrf is just too horrid,
and pessimizing other programs for the sake of one benchmark makes little
sense to me.
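For readers unfamiliar with the transformation described in comment #10, here is a rough conceptual model in C of what "conditionally packed and unpacked inline" means at a single call site. It is only a sketch under stated assumptions, not the code gfortran actually emits: `desc_t`, `scale_contiguous` and `call_with_inline_packing` are hypothetical names, and the one-dimensional descriptor is a simplification. The point it tries to show is that every such call site carries a contiguity test plus two copy loops, which is where the size growth would come from, and that the slow path becomes dead code once the compiler can prove the stride is 1.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array descriptor: base pointer, element
   count and stride (in elements). Not gfortran's real descriptor layout. */
typedef struct {
    double   *base;
    size_t    n;
    ptrdiff_t stride;
} desc_t;

/* "Old-style" callee: assumes n contiguous elements starting at a. */
static void scale_contiguous(double *a, size_t n, double factor)
{
    for (size_t i = 0; i < n; i++)
        a[i] *= factor;
}

/* Call site with the packing decision made inline.
   Returns 0 on success, -1 if the temporary cannot be allocated. */
static int call_with_inline_packing(desc_t d, double factor)
{
    if (d.n == 0)
        return 0;                       /* nothing to pass */

    if (d.stride == 1) {
        /* Fast path: already contiguous, pass the data directly. */
        scale_contiguous(d.base, d.n, factor);
        return 0;
    }

    /* Slow path: pack into a contiguous temporary ... */
    double *tmp = malloc(d.n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (size_t i = 0; i < d.n; i++)
        tmp[i] = d.base[(ptrdiff_t)i * d.stride];

    scale_contiguous(tmp, d.n, factor);

    /* ... and unpack the (possibly modified) values afterwards. */
    for (size_t i = 0; i < d.n; i++)
        d.base[(ptrdiff_t)i * d.stride] = tmp[i];
    free(tmp);
    return 0;
}
```

Calling `call_with_inline_packing` with a descriptor whose `stride` is 2 exercises the pack/unpack path; with `stride` equal to 1 only the direct call runs. A program that passes many such arguments at many call sites repeats this whole pattern each time, which matches the tradeoff described above.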
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #9 from Wilco ---
(In reply to Martin Liška from comment #8)

> (In reply to Wilco from comment #7)
> > (In reply to Martin Liška from comment #6)
> > > So wrf grew starting with r271377, size (w/o debug info) goes from
> > > 20164464B
> > > to 23674792.
> >
> > Also check the build time of wrf. Looking at my logs trunk takes 2x as long
> > to build it since June.
>
> Maybe related to:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91509
> ?

I think not, this is plain -Ofast, no LTO or prefetching. The same slowdown
happens with -O2.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #8 from Martin Liška ---
(In reply to Wilco from comment #7)

> (In reply to Martin Liška from comment #6)
> > So wrf grew starting with r271377, size (w/o debug info) goes from 20164464B
> > to 23674792.
>
> Also check the build time of wrf. Looking at my logs trunk takes 2x as long
> to build it since June.

Maybe related to:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91509
?
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

Wilco changed:

           What    |Removed                     |Added
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #7 from Wilco ---
(In reply to Martin Liška from comment #6)

> So wrf grew starting with r271377, size (w/o debug info) goes from 20164464B
> to 23674792.

Also check the build time of wrf. Looking at my logs trunk takes 2x as long
to build it since June.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

Martin Liška changed:

           What    |Removed                     |Added
                 CC|                            |tkoenig at gcc dot gnu.org

--- Comment #6 from Martin Liška ---
So wrf grew starting with r271377, size (w/o debug info) goes from 20164464B to
23674792.
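To put the two sizes from comment #6 in proportion, the small C sketch below computes the absolute and relative growth. The constants are the byte counts quoted in the comment; the names `SIZE_BEFORE`, `SIZE_AFTER`, `growth_bytes` and `growth_percent` are made up for this illustration. The result is 3510328 bytes, or roughly 17.4% growth.

```c
#include <stdint.h>

/* wrf binary sizes in bytes (without debug info), as quoted in comment #6. */
static const int64_t SIZE_BEFORE = 20164464;  /* before r271377 */
static const int64_t SIZE_AFTER  = 23674792;  /* starting with r271377 */

/* Absolute growth in bytes. */
static int64_t growth_bytes(int64_t before, int64_t after)
{
    return after - before;
}

/* Relative growth as a percentage of the original size. */
static double growth_percent(int64_t before, int64_t after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```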
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #5 from Martin Liška ---
OK, I've just updated the LNT filter, and one can see it better with:
https://lnt.opensuse.org/db_default/v4/SPEC/spec_report/branch?sorting=gcc-9%2Cgcc-trunk_elf_detail_stats=on

I'm going to bisect the WRF size bump.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

Richard Biener changed:

           What    |Removed                     |Added
           Keywords|                            |missed-optimization,
                   |                            |needs-bisection

--- Comment #4 from Richard Biener ---
There is

2019-05-23  Richard Biener

        PR tree-optimization/88440
        * opts.c (default_options_table): Enable
        -ftree-loop-distribute-patterns at -O[2s]+.
        * tree-loop-distribution.c (generate_memset_builtin): Fold the
        generated call.
        (generate_memcpy_builtin): Likewise.
        (distribute_loop): Pass in whether to only distribute patterns.
        (prepare_perfect_loop_nest): Also allow size optimization.
        (pass_loop_distribution::execute): When optimizing a loop
        nest for size allow pattern replacement.

but that should cause code-size shrinking... (just try
-fno-tree-loop-distribute-patterns and see if fixed)
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #3 from Martin Liška ---
One of the big changes that caused that:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=21.264.4
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

Martin Liška changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-12-01
                 CC|                            |marxin at gcc dot gnu.org
      Known to work|                            |9.2.0
             Blocks|                            |26163
   Target Milestone|---                         |10.0
     Ever confirmed|0                           |1
      Known to fail|                            |10.0

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #2 from Jan Hubicka ---
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=10.542.4_run=7354
shows shorter range

+2019-05-24  Jakub Jelinek
+
+       * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__CONDTEMP_.
+       * tree.h (OMP_CLAUSE_DECL): Use OMP_CLAUSE__CONDTEMP_ instead of
+       OMP_CLAUSE__REDUCTEMP_.
+       * tree.c (omp_clause_num_ops, omp_clause_code_name): Add
+       OMP_CLAUSE__CONDTEMP_.
+2019-05-19  Segher Boessenkool
+
+       * config/rs6000/constraints.md (define_register_constraint "wo"):
+       Delete.
+       * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Delete
+       RS6000_CONSTRAINT_wo.
+       * config/rs6000/rs6000.c (rs6000_debug_reg_global): Adjust.
+       (rs6000_init_hard_regno_mode_ok): Adjust.
+       * config/rs6000/rs6000.md: Replace "wo" constraint by "wa" with "p9v".
+       * config/rs6000/altivec.md: Ditto.
+       * doc/md.texi (Machine Constraints): Adjust.
+
 2019-05-18  Iain Sandoe

It may be easy to bisect.
[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738

--- Comment #1 from Jan Hubicka ---
This is seen on
https://lnt.opensuse.org/db_default/v4/SPEC/graph?highlight_run=7361=31.574.4