On May 26, 2018 11:32:29 AM GMT+02:00, Allan Sandfeld Jensen <li...@carewolf.com> wrote:
>I brought this subject up earlier, and was told to suggest it again for
>gcc 9, so I have attached the preliminary changes.
>
>My studies have shown that with generic x86-64 optimization it reduces
>binary size by around 0.5%, and when optimizing for x64 targets with
>SSE4 or better, it reduces binary size by 2-3% on average. The
>performance changes are negligible however*, and I haven't been able to
>detect changes in compile time big enough to penetrate general noise on
>my platform, but perhaps someone has a better setup for that?
>
>* I believe that is because it currently works best on non-optimized
>code, it is better at big basic blocks doing all kinds of things than
>tightly written inner loops.
>
>Anything else I should test or report?

If you have access to SPEC CPU, I'd like to see performance, size and
compile-time effects of the patch on that. Embedded folks may want to
run their favorite benchmark and report results as well.

Richard.

>Best regards
>'Allan
>
>
>diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>index beba295bef5..05851229354 100644
>--- a/gcc/doc/invoke.texi
>+++ b/gcc/doc/invoke.texi
>@@ -7612,6 +7612,7 @@ also turns on the following optimization flags:
> -fstore-merging @gol
> -fstrict-aliasing @gol
> -ftree-builtin-call-dce @gol
>+-ftree-slp-vectorize @gol
> -ftree-switch-conversion -ftree-tail-merge @gol
> -fcode-hoisting @gol
> -ftree-pre @gol
>@@ -7635,7 +7636,6 @@ by @option{-O2} and also turns on the following optimization flags:
> -floop-interchange @gol
> -floop-unroll-and-jam @gol
> -fsplit-paths @gol
>--ftree-slp-vectorize @gol
> -fvect-cost-model @gol
> -ftree-partial-pre @gol
> -fpeel-loops @gol
>@@ -8932,7 +8932,7 @@ Perform loop vectorization on trees. This flag is enabled by default at
> @item -ftree-slp-vectorize
> @opindex ftree-slp-vectorize
> Perform basic block vectorization on trees. This flag is enabled by default at
>-@option{-O3} and when @option{-ftree-vectorize} is enabled.
>+@option{-O2} or higher, and when @option{-ftree-vectorize} is enabled.
>
> @item -fvect-cost-model=@var{model}
> @opindex fvect-cost-model
>diff --git a/gcc/opts.c b/gcc/opts.c
>index 33efcc0d6e7..11027b847e8 100644
>--- a/gcc/opts.c
>+++ b/gcc/opts.c
>@@ -523,6 +523,7 @@ static const struct default_options default_options_table[] =
> { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_fstore_merging, NULL, 1 },
>+ { OPT_LEVELS_2_PLUS, OPT_ftree_slp_vectorize, NULL, 1 },
>
> /* -O3 optimizations. */
> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
>@@ -539,7 +540,6 @@ static const struct default_options default_options_table[] =
> { OPT_LEVELS_3_PLUS, OPT_floop_unroll_and_jam, NULL, 1 },
> { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_vectorize, NULL, 1 },
>- { OPT_LEVELS_3_PLUS, OPT_ftree_slp_vectorize, NULL, 1 },
> { OPT_LEVELS_3_PLUS, OPT_fvect_cost_model_, NULL, VECT_COST_MODEL_DYNAMIC },
> { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
> { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },
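
For reference, a minimal sketch of the kind of straight-line code that
basic block (SLP) vectorization looks at, as opposed to loop
vectorization. It is not part of the patch: the function name `add4` is
hypothetical and chosen only for illustration, and whether GCC actually
emits vector instructions for it depends on the target and the cost
model.

```c
/* Hypothetical example, not from the patch: four independent,
   isomorphic additions on adjacent elements within one basic block.
   The `restrict` qualifiers tell the compiler that the arrays do not
   overlap, so the four statements can legally be grouped.  */
void add4(float *restrict out, const float *restrict a,
          const float *restrict b)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

Compiling this with `-O2 -ftree-slp-vectorize -fopt-info-vec` should
report whether the block was vectorized on a given target; comparing the
object size with and without the flag is one way to reproduce size
measurements like those quoted above on other code.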