Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-10-10 Thread Joey Ye
Committed to ARM/arm-9-branch

On Wed, Sep 11, 2019 at 4:50 PM Wilco Dijkstra  wrote:
>
> While code hoisting generally improves codesize, it can affect performance
> negatively.  Benchmarking shows it doesn't help SPEC and negatively affects
> embedded benchmarks, so only enable code hoisting with -Os on Arm.
>
> Bootstrap OK, OK for commit?
>
> ChangeLog:
> 2019-09-11  Wilco Dijkstra  
>
>
> PR tree-optimization/80155
> * common/config/arm/arm-common.c (arm_option_optimization_table):
> Enable -fcode-hoisting with -Os.
>
> --
> diff --git a/gcc/common/config/arm/arm-common.c 
> b/gcc/common/config/arm/arm-common.c
> index 
> 41a920f6dc96833e778faa8dbcc19beac483734c..b0d5fb300bf01acc1fb6f4631635f8a1bfe6441c
>  100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -39,6 +39,8 @@ static const struct default_options 
> arm_option_optimization_table[] =
>  /* Enable section anchors by default at -O1 or higher.  */
>  { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
>  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
> +/* Enable code hoisting only with -Os.  */
> +{ OPT_LEVELS_SIZE, OPT_fcode_hoisting, NULL, 1 },
>  { OPT_LEVELS_NONE, 0, NULL, 0 }
>};
>
>


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-19 Thread Richard Biener
On Thu, Sep 19, 2019 at 3:28 AM Prathamesh Kulkarni
 wrote:
>
> On Wed, 18 Sep 2019 at 22:17, Prathamesh Kulkarni
>  wrote:
> >
> > On Wed, 18 Sep 2019 at 01:46, Richard Biener  
> > wrote:
> > >
> > > On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  
> > > wrote:
> > > >
> > > > Hi Richard,
> > > >
> > > > > The issue with the bugzilla is that it lacked appropriate testcase(s) 
> > > > > and thus
> > > > > it is now a mess.  There are clear testcases (maybe not in the 
> > > > > benchmarks you
> > > >
> > > > Agreed - it's not clear whether any of the proposed changes would 
> > > > actually
> > > > help the original issue. My patch absolutely does.
> > > >
> > > > > care about) that benefit from code hoisting as enabler, mainly when 
> > > > > control
> > > > > flow can be then converted to data flow.  Also note that "size 
> > > > > optimizations"
> > > > > are important for all cases where followup transforms have size 
> > > > > limits on the IL
> > > > > in place.
> > > >
> > > > The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. 
> > > > it's definitely
> > > > useful, but there are much larger gains to be had from other tweaks 
> > > > [1]. So we can
> > > > live without it until a better solution is found.
> > >
> > > A "solution" for better eembc benchmark results?
> > >
> > > The issues are all latent even w/o code-hoisting since you can do the
> > > same transform at the source level.  Which is usually why I argue
> > > trying to fix this in code-hoisting is not a complete fix.  Nor is turning
> > > off random GIMPLE passes for specific benchmark regressions.
> > >
> > > Anyway, it's arm maintainers call if you want to have such hacks in
> > > place or not.
> > >
> > > As a release manager I say that GCC isn't a benchmark compiler.
> > >
> > > As the one "responsible" for the code-hoisting introduction I say that
> > > as long as I don't have access to the actual benchmark I can't assess
> > > wrongdoing of the pass nor suggest an appropriate place for optimization.
> > Hi Richard,
> > The actual benchmark function for PR80155 is almost identical to FMS()
> > function defined in
> > pr77445-2.c, and the test-case reproduces the same issue as in the 
> > benchmark.
> Hi,
> The attached patch is another workaround for hoisting. The rationale
> behind the patch is
> to avoid "long range" hoistings  for a "large enough" CFG.

But that isn't a good heuristic.  The issue with coming up with a good heuristic
is that we'd need to evaluate the effect on register pressure which, on GIMPLE
at least, can be only approximated.  The insertion phase of PRE isn't a good
place to do that, the only place you could reasonably decide on this is
the elimination phase which sees the earlier inserted load and increment
and needs to decide whether to replace the later load and increment with
the loaded value.  But in the end this boils down to do scheduling and
register rematerialization on GIMPLE based on the value graph which is
of course possible but not implemented and if implemented is probably
better done as a standalone pass.

> The patch walks dom tree for the block and finds the "furthest" dom
> block, for which
> intersect_p (availout_in_some, AVAIL_OUT (dom_block)) is true. The
> "distance" is measured
> in terms of dom depth.
>
> We can have two params say param_hoist_n_bbs to determine "large enough"
> CFG, and param_hoist_expr_dist, that avoids hoisting if the "distance"
> exceeds the param threshold.
> For the values (hardcoded) in the patch, it "works" for avoiding the
> spill and does not regress ssa-pre-*.c and ssa-hoist-*.c
> tests (the param values could be made target-specific). Does the
> approach look reasonable ?

As said, not really.  First you only see parts of the expression hoisted;
in the testcase it's *transitions + 1 where you first see _1 = *transition
and then _2 = _1 + 1; if it were *transitions + _3 and _3 dies here then
hoisting doesn't make register pressure worse; if it were _4 + _3 and
both operands die here then hoisting decreases register pressure.

On x86 I don't see any spilling so the other question is whether this
is really a register pressure issue?

> My concern with current version is that walking dom tree per block in
> do_hoist_insertion can end up being
> pretty expensive. Any suggestions on how we could speed that up ?

You can probably use the DFS numbers directly?

> Alternatively, we could consider hoisting from "bottom up", one block
> at a time, and keep a map to count number of
> times an expr is hoisted and refuse hoisting the expr further up if it
> exceeds target-specific param threshold ?

Hoisting piggy-backs on the ANTIC data flow solution so that doesn't work
in the current framework.  In fact the current framework relies on top-down
operation to not hoist things one at a time.

Richard.

> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Richard.
> > >
> > > >
> > > > [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739

Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-19 Thread Richard Biener
On Wed, Sep 18, 2019 at 6:47 PM Prathamesh Kulkarni
 wrote:
>
> On Wed, 18 Sep 2019 at 01:46, Richard Biener  
> wrote:
> >
> > On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  
> > wrote:
> > >
> > > Hi Richard,
> > >
> > > > The issue with the bugzilla is that it lacked appropriate testcase(s) 
> > > > and thus
> > > > it is now a mess.  There are clear testcases (maybe not in the 
> > > > benchmarks you
> > >
> > > Agreed - it's not clear whether any of the proposed changes would actually
> > > help the original issue. My patch absolutely does.
> > >
> > > > care about) that benefit from code hoisting as enabler, mainly when 
> > > > control
> > > > flow can be then converted to data flow.  Also note that "size 
> > > > optimizations"
> > > > are important for all cases where followup transforms have size limits 
> > > > on the IL
> > > > in place.
> > >
> > > The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. it's 
> > > definitely
> > > useful, but there are much larger gains to be had from other tweaks [1]. 
> > > So we can
> > > live without it until a better solution is found.
> >
> > A "solution" for better eembc benchmark results?
> >
> > The issues are all latent even w/o code-hoisting since you can do the
> > same transform at the source level.  Which is usually why I argue
> > trying to fix this in code-hoisting is not a complete fix.  Nor is turning
> > off random GIMPLE passes for specific benchmark regressions.
> >
> > Anyway, it's arm maintainers call if you want to have such hacks in
> > place or not.
> >
> > As a release manager I say that GCC isn't a benchmark compiler.
> >
> > As the one "responsible" for the code-hoisting introduction I say that
> > as long as I don't have access to the actual benchmark I can't assess
> > wrongdoing of the pass nor suggest an appropriate place for optimization.
> Hi Richard,
> The actual benchmark function for PR80155 is almost identical to FMS()
> function defined in
> pr77445-2.c, and the test-case reproduces the same issue as in the benchmark.

So on that code-hoisting identifies the redudnant
transitions[S1]++;
stmts, but only the load and increment because it doesn't hoist stores.
I think that's ultimately useful and would be even profitable if we manage
to hoist the stores as well.  If you have access to the actual benchmark
you can do this transform in the source and measure the effect.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > >
> > > [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739.html
> > >
> > > Wilco


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-18 Thread Prathamesh Kulkarni
On Wed, 18 Sep 2019 at 22:17, Prathamesh Kulkarni
 wrote:
>
> On Wed, 18 Sep 2019 at 01:46, Richard Biener  
> wrote:
> >
> > On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  
> > wrote:
> > >
> > > Hi Richard,
> > >
> > > > The issue with the bugzilla is that it lacked appropriate testcase(s) 
> > > > and thus
> > > > it is now a mess.  There are clear testcases (maybe not in the 
> > > > benchmarks you
> > >
> > > Agreed - it's not clear whether any of the proposed changes would actually
> > > help the original issue. My patch absolutely does.
> > >
> > > > care about) that benefit from code hoisting as enabler, mainly when 
> > > > control
> > > > flow can be then converted to data flow.  Also note that "size 
> > > > optimizations"
> > > > are important for all cases where followup transforms have size limits 
> > > > on the IL
> > > > in place.
> > >
> > > The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. it's 
> > > definitely
> > > useful, but there are much larger gains to be had from other tweaks [1]. 
> > > So we can
> > > live without it until a better solution is found.
> >
> > A "solution" for better eembc benchmark results?
> >
> > The issues are all latent even w/o code-hoisting since you can do the
> > same transform at the source level.  Which is usually why I argue
> > trying to fix this in code-hoisting is not a complete fix.  Nor is turning
> > off random GIMPLE passes for specific benchmark regressions.
> >
> > Anyway, it's arm maintainers call if you want to have such hacks in
> > place or not.
> >
> > As a release manager I say that GCC isn't a benchmark compiler.
> >
> > As the one "responsible" for the code-hoisting introduction I say that
> > as long as I don't have access to the actual benchmark I can't assess
> > wrongdoing of the pass nor suggest an appropriate place for optimization.
> Hi Richard,
> The actual benchmark function for PR80155 is almost identical to FMS()
> function defined in
> pr77445-2.c, and the test-case reproduces the same issue as in the benchmark.
Hi,
The attached patch is another workaround for hoisting. The rationale
behind the patch is
to avoid "long range" hoistings  for a "large enough" CFG.

The patch walks dom tree for the block and finds the "furthest" dom
block, for which
intersect_p (availout_in_some, AVAIL_OUT (dom_block)) is true. The
"distance" is measured
in terms of dom depth.

We can have two params say param_hoist_n_bbs to determine "large enough"
CFG, and param_hoist_expr_dist, that avoids hoisting if the "distance"
exceeds the param threshold.
For the values (hardcoded) in the patch, it "works" for avoiding the
spill and does not regress ssa-pre-*.c and ssa-hoist-*.c
tests (the param values could be made target-specific). Does the
approach look reasonable ?
My concern with current version is that walking dom tree per block in
do_hoist_insertion can end up being
pretty expensive. Any suggestions on how we could speed that up ?

Alternatively, we could consider hoisting from "bottom up", one block
at a time, and keep a map to count number of
times an expr is hoisted and refuse hoisting the expr further up if it
exceeds target-specific param threshold ?

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > >
> > > [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739.html
> > >
> > > Wilco
diff --git a/gcc/graph.h b/gcc/graph.h
index 5ec4f1c107f..c05cc6df25b 100644
--- a/gcc/graph.h
+++ b/gcc/graph.h
@@ -24,5 +24,6 @@ extern void print_graph_cfg (const char *, struct function *);
 extern void clean_graph_dump_file (const char *);
 extern void finish_graph_dump_file (const char *);
 extern void debug_dot_cfg (struct function *);
+extern void save_dot_cfg (struct function *);
 
 #endif /* ! GCC_GRAPH_H */
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index c618601a184..d13d1fdfd8a 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -3463,6 +3463,35 @@ do_pre_partial_partial_insertion (basic_block block, basic_block dom)
   return new_stuff;
 }
 
+/* Return longest "distance" of block, for which
+   intersect_p (availout_in_some, AVAIL_OUT (block)) is true.
+   Distance is measured in terms of dom depth.  */
+
+static int
+get_longest_dist (bitmap_head availout_in_some, basic_block block)
+{
+  basic_block son;
+  int dist;
+  int max_dist = 0;
+
+  for (son = first_dom_son (CDI_DOMINATORS, block);
+   son;
+   son = next_dom_son (CDI_DOMINATORS, son))
+{
+  dist = get_longest_dist (availout_in_some, son);
+  if (dist > max_dist)
+	max_dist = dist;
+}
+
+  if (max_dist != -1)
+return max_dist + 1;
+
+  if (bitmap_intersect_p (&availout_in_some, &AVAIL_OUT (block)->values))
+return 0;
+
+  return -1;
+}
+
 /* Insert expressions in BLOCK to compute hoistable values up.
Return TRUE if something was inserted, otherwise return FALSE.
The caller has to make sure that BLOCK has at least two successors.  */
@@ -3509,6 +3538,7 @@ do_hoist_insertion (basic_block b

Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-18 Thread Prathamesh Kulkarni
On Wed, 18 Sep 2019 at 01:46, Richard Biener  wrote:
>
> On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  wrote:
> >
> > Hi Richard,
> >
> > > The issue with the bugzilla is that it lacked appropriate testcase(s) and 
> > > thus
> > > it is now a mess.  There are clear testcases (maybe not in the benchmarks 
> > > you
> >
> > Agreed - it's not clear whether any of the proposed changes would actually
> > help the original issue. My patch absolutely does.
> >
> > > care about) that benefit from code hoisting as enabler, mainly when 
> > > control
> > > flow can be then converted to data flow.  Also note that "size 
> > > optimizations"
> > > are important for all cases where followup transforms have size limits on 
> > > the IL
> > > in place.
> >
> > The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. it's 
> > definitely
> > useful, but there are much larger gains to be had from other tweaks [1]. So 
> > we can
> > live without it until a better solution is found.
>
> A "solution" for better eembc benchmark results?
>
> The issues are all latent even w/o code-hoisting since you can do the
> same transform at the source level.  Which is usually why I argue
> trying to fix this in code-hoisting is not a complete fix.  Nor is turning
> off random GIMPLE passes for specific benchmark regressions.
>
> Anyway, it's arm maintainers call if you want to have such hacks in
> place or not.
>
> As a release manager I say that GCC isn't a benchmark compiler.
>
> As the one "responsible" for the code-hoisting introduction I say that
> as long as I don't have access to the actual benchmark I can't assess
> wrongdoing of the pass nor suggest an appropriate place for optimization.
Hi Richard,
The actual benchmark function for PR80155 is almost identical to FMS()
function defined in
pr77445-2.c, and the test-case reproduces the same issue as in the benchmark.

Thanks,
Prathamesh
>
> Richard.
>
> >
> > [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739.html
> >
> > Wilco


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-18 Thread Richard Biener
On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  wrote:
>
> Hi Richard,
>
> > The issue with the bugzilla is that it lacked appropriate testcase(s) and 
> > thus
> > it is now a mess.  There are clear testcases (maybe not in the benchmarks 
> > you
>
> Agreed - it's not clear whether any of the proposed changes would actually
> help the original issue. My patch absolutely does.
>
> > care about) that benefit from code hoisting as enabler, mainly when control
> > flow can be then converted to data flow.  Also note that "size 
> > optimizations"
> > are important for all cases where followup transforms have size limits on 
> > the IL
> > in place.
>
> The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. it's 
> definitely
> useful, but there are much larger gains to be had from other tweaks [1]. So 
> we can
> live without it until a better solution is found.

A "solution" for better eembc benchmark results?

The issues are all latent even w/o code-hoisting since you can do the
same transform at the source level.  Which is usually why I argue
trying to fix this in code-hoisting is not a complete fix.  Nor is turning
off random GIMPLE passes for specific benchmark regressions.

Anyway, it's arm maintainers call if you want to have such hacks in
place or not.

As a release manager I say that GCC isn't a benchmark compiler.

As the one "responsible" for the code-hoisting introduction I say that
as long as I don't have access to the actual benchmark I can't assess
wrongdoing of the pass nor suggest an appropriate place for optimization.

Richard.

>
> [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739.html
>
> Wilco


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-17 Thread Wilco Dijkstra
Hi Richard,

> The issue with the bugzilla is that it lacked appropriate testcase(s) and thus
> it is now a mess.  There are clear testcases (maybe not in the benchmarks you

Agreed - it's not clear whether any of the proposed changes would actually
help the original issue. My patch absolutely does.

> care about) that benefit from code hoisting as enabler, mainly when control
> flow can be then converted to data flow.  Also note that "size optimizations"
> are important for all cases where followup transforms have size limits on the 
> IL
> in place.

The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. it's 
definitely
useful, but there are much larger gains to be had from other tweaks [1]. So we 
can
live without it until a better solution is found.

[1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739.html

Wilco

Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-16 Thread Richard Biener
On Mon, Sep 16, 2019 at 2:59 PM Wilco Dijkstra  wrote:
>
> Hi Prathamesh,
>
> > My only concern with the patch is that the issue isn't specific to
> > code-hoisting.
> > For this particular case (reproducible with pr77445-2.c), disabling
> > jump threading
> > doesn't cause the register spill with hoisting enabled.
> > Likewise disabling forwprop3 and forwprop4 prevents the spill.
> > The last time I tried, setting setting
> > max-jump-thread-duplication-stmts to 20 and
> > fsm-scale-path-stmts to 3, not only removed the spill but also
> > resulted in 9 more hoistings,
> > but regressed some other test on jump threading.
> > So I was wondering whether we should look into fine-tuning relevant params,
> > instead of disabling code hoisting entirely (we will end up regressing
> > some test cases either way) ?
>
> The point is that -fcode-hoisting is the optimization that made some 
> benchmarks
> run significantly slower compared to older GCC versions. It's been more than 
> 2.5
> years without progress, so let's just fix it now in the obvious way.

The issue with the bugzilla is that it lacked appropriate testcase(s) and thus
it is now a mess.  There are clear testcases (maybe not in the benchmarks you
care about) that benefit from code hoisting as enabler, mainly when control
flow can be then converted to data flow.  Also note that "size optimizations"
are important for all cases where followup transforms have size limits on the IL
in place.

Richard.

> Yes, there are likely better solutions. And if someone writes an acceptable 
> patch
> then we can easily enable hoisting again. Surely it is far better to have a 
> fixed
> compiler *now* rather than wait a few more years for a solution that has the
> same effect but which might not be trivially backportable?
>
> Cheers,
> Wilco
>


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-16 Thread Wilco Dijkstra
Hi Prathamesh,

> My only concern with the patch is that the issue isn't specific to
> code-hoisting.
> For this particular case (reproducible with pr77445-2.c), disabling
> jump threading
> doesn't cause the register spill with hoisting enabled.
> Likewise disabling forwprop3 and forwprop4 prevents the spill.
> The last time I tried, setting setting
> max-jump-thread-duplication-stmts to 20 and
> fsm-scale-path-stmts to 3, not only removed the spill but also
> resulted in 9 more hoistings,
> but regressed some other test on jump threading.
> So I was wondering whether we should look into fine-tuning relevant params,
> instead of disabling code hoisting entirely (we will end up regressing
> some test cases either way) ?

The point is that -fcode-hoisting is the optimization that made some benchmarks
run significantly slower compared to older GCC versions. It's been more than 2.5
years without progress, so let's just fix it now in the obvious way.

Yes, there are likely better solutions. And if someone writes an acceptable 
patch
then we can easily enable hoisting again. Surely it is far better to have a 
fixed
compiler *now* rather than wait a few more years for a solution that has the
same effect but which might not be trivially backportable?

Cheers,
Wilco



Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-14 Thread Prathamesh Kulkarni
On Wed, 11 Sep 2019 at 11:50, Wilco Dijkstra  wrote:
>
> While code hoisting generally improves codesize, it can affect performance
> negatively.  Benchmarking shows it doesn't help SPEC and negatively affects
> embedded benchmarks, so only enable code hoisting with -Os on Arm.
>
> Bootstrap OK, OK for commit?
Hi Wilco,
My only concern with the patch is that the issue isn't specific to
code-hoisting.
For this particular case (reproducible with pr77445-2.c), disabling
jump threading
doesn't cause the register spill with hoisting enabled.
Likewise disabling forwprop3 and forwprop4 prevents the spill.
The last time I tried, setting setting
max-jump-thread-duplication-stmts to 20 and
fsm-scale-path-stmts to 3, not only removed the spill but also
resulted in 9 more hoistings,
but regressed some other test on jump threading.
So I was wondering whether we should look into fine-tuning relevant params,
instead of disabling code hoisting entirely (we will end up regressing
some test cases either way) ?

Thanks,
Prathamesh
>
> ChangeLog:
> 2019-09-11  Wilco Dijkstra  
>
>
> PR tree-optimization/80155
> * common/config/arm/arm-common.c (arm_option_optimization_table):
> Enable -fcode-hoisting with -Os.
>
> --
> diff --git a/gcc/common/config/arm/arm-common.c 
> b/gcc/common/config/arm/arm-common.c
> index 
> 41a920f6dc96833e778faa8dbcc19beac483734c..b0d5fb300bf01acc1fb6f4631635f8a1bfe6441c
>  100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -39,6 +39,8 @@ static const struct default_options 
> arm_option_optimization_table[] =
>  /* Enable section anchors by default at -O1 or higher.  */
>  { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
>  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
> +/* Enable code hoisting only with -Os.  */
> +{ OPT_LEVELS_SIZE, OPT_fcode_hoisting, NULL, 1 },
>  { OPT_LEVELS_NONE, 0, NULL, 0 }
>};
>
>


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-12 Thread Richard Biener
On Thu, Sep 12, 2019 at 1:29 PM Wilco Dijkstra  wrote:
>
> Hi Richard,
>
> > Do we document target specific deviations from "default" behavior somewhere?
>
> Not as far as I know. The other option changes in arm-common.c are not 
> mentioned
> anywhere, neither is any of arm_option_override_internal.
>
> If we want to keep documentation useful, we shouldn't clutter the generic 
> options
> with target-specific deviations from the default settings. What might be 
> feasible is
> adding a section for each target which describes changes from the default 
> settings.

Agreed.

Richard.

> Wilco


Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-12 Thread Wilco Dijkstra
Hi Richard,

> Do we document target specific deviations from "default" behavior somewhere?

Not as far as I know. The other option changes in arm-common.c are not mentioned
anywhere, neither is any of arm_option_override_internal.
 
If we want to keep documentation useful, we shouldn't clutter the generic 
options
with target-specific deviations from the default settings. What might be 
feasible is
adding a section for each target which describes changes from the default 
settings.

Wilco

Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-12 Thread Richard Biener
On Wed, Sep 11, 2019 at 5:50 PM Wilco Dijkstra  wrote:
>
> While code hoisting generally improves codesize, it can affect performance
> negatively.  Benchmarking shows it doesn't help SPEC and negatively affects
> embedded benchmarks, so only enable code hoisting with -Os on Arm.
>
> Bootstrap OK, OK for commit?

Do we document target specific deviations from "default" behavior somewhere?
invoke.texi has

@item -fcode-hoisting
@opindex fcode-hoisting
Perform code hoisting.  Code hoisting tries to move the
evaluation of expressions executed on all paths to the function exit
as early as possible.  This is especially useful as a code size
optimization, but it often helps for code speed as well.
This flag is enabled by default at @option{-O2} and higher.


> ChangeLog:
> 2019-09-11  Wilco Dijkstra  
>
>
> PR tree-optimization/80155
> * common/config/arm/arm-common.c (arm_option_optimization_table):
> Enable -fcode-hoisting with -Os.
>
> --
> diff --git a/gcc/common/config/arm/arm-common.c 
> b/gcc/common/config/arm/arm-common.c
> index 
> 41a920f6dc96833e778faa8dbcc19beac483734c..b0d5fb300bf01acc1fb6f4631635f8a1bfe6441c
>  100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -39,6 +39,8 @@ static const struct default_options 
> arm_option_optimization_table[] =
>  /* Enable section anchors by default at -O1 or higher.  */
>  { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
>  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
> +/* Enable code hoisting only with -Os.  */
> +{ OPT_LEVELS_SIZE, OPT_fcode_hoisting, NULL, 1 },
>  { OPT_LEVELS_NONE, 0, NULL, 0 }
>};
>
>


[PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-11 Thread Wilco Dijkstra
While code hoisting generally improves codesize, it can affect performance
negatively.  Benchmarking shows it doesn't help SPEC and negatively affects
embedded benchmarks, so only enable code hoisting with -Os on Arm.

Bootstrap OK, OK for commit?

ChangeLog:
2019-09-11  Wilco Dijkstra  


PR tree-optimization/80155
* common/config/arm/arm-common.c (arm_option_optimization_table):
Enable -fcode-hoisting with -Os.

--
diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 
41a920f6dc96833e778faa8dbcc19beac483734c..b0d5fb300bf01acc1fb6f4631635f8a1bfe6441c
 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -39,6 +39,8 @@ static const struct default_options 
arm_option_optimization_table[] =
 /* Enable section anchors by default at -O1 or higher.  */
 { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
 { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+/* Enable code hoisting only with -Os.  */
+{ OPT_LEVELS_SIZE, OPT_fcode_hoisting, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };