The following patch can fix an ICE when compiling with LIPO. OK for google-4_9?
Thanks,
Dehao
Index: gcc/l-ipo.c
===
--- gcc/l-ipo.c (revision 225685)
+++ gcc/l-ipo.c (working copy)
@@ -731,6 +731,7 @@ lipo_cmp_type (tree t1, tree
comments?
Bootstrapped and testing on-going.
OK for trunk?
Thanks,
Dehao
ChangeLog:
2015-06-23 Dehao Chen de...@google.com
* opts.c (finish_options): Disable reorder_blocks_and_partition for DWARF2.
Index: opts.c
===
--- opts.c
ok for google branch.
Dehao
On Tue, Mar 3, 2015 at 12:26 PM, Cary Coutant ccout...@google.com wrote:
@@ -21817,22 +21823,39 @@ out_subprog_directive (subprog_entry *su
{
tree decl = subprog->decl;
tree decl_name = DECL_NAME (decl);
- const char *name;
+ tree origin;
Explicitly
ok.
Dehao
On Mon, Feb 23, 2015 at 11:02 AM, Cary Coutant ccout...@google.com wrote:
Minor changes to -ftwo-level-line-tables.
This patch is for the google/gcc-4_9 branch.
Originally, -ftwo-level-line-tables would output .subprog directives
only for inlined subprograms, and not for
The offset overflow warning would cause build failures when a function's
start line is missing (0). Until the start line issue is fixed, we
will suppress this warning.
Testing on-going. OK for google-4_9?
Thanks,
Dehao
Index: gcc/auto-profile.c
patch is ok for google branch.
Dehao
On Thu, Jan 29, 2015 at 1:11 PM, Cary Coutant ccout...@google.com wrote:
Here's a very slightly revised patch, fixing a couple of bugs found
during GDB testing.
In out_logical_entry, I should pass along the value of is_stmt when
creating a logical for
On Sun, Jan 25, 2015 at 6:06 PM, Cary Coutant ccout...@google.com wrote:
Add -ftwo-level-line-tables and -gline-tables-only options.
With -ftwo-level-line-tables, GCC will generate two-level line tables,
which adds inline call information to the line tables, obviating the
need to keep bulky
On Wed, Jan 28, 2015 at 3:04 PM, Cary Coutant ccout...@google.com wrote:
+static subprog_entry *
+add_subprog_entry (tree decl, bool is_inlined)
+{
+ subprog_entry **slot;
+ subprog_entry *entry;
+
+ slot = subprog_table->find_slot_with_hash (decl, DECL_UID (decl),
INSERT);
+
On Wed, Jan 28, 2015 at 4:34 PM, Cary Coutant ccout...@google.com wrote:
Not quite clear why we need block_table. This table is not gonna be
emitted. And we can easily get subprog_entry through block->block_num
When final_scan_insn() calls dwarf2out_begin_block(), all it passes is
a block
. Free Software Foundation, Inc.
Contributed by Dehao Chen (de...@google.com)
This file is part of GCC.
@@ -18,19 +18,17 @@ You should have received a copy of the GNU General
along with GCC; see the file COPYING3. If not see
http://www.gnu.org/licenses/. */
-/* Read and annotate call graph
This patch fixes the bug for undefined symbol in AutoFDO build.
Testing on going. OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 218784)
+++ gcc/auto-profile.c (working copy)
ping...
Thanks,
Dehao
On Tue, Nov 18, 2014 at 2:29 PM, Dehao Chen de...@google.com wrote:
This patch updates ssa and inline summary in the correct location for AutoFDO.
Bootstrapped and passed regression test. OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-11-18 Dehao Chen de
the
indirect call anyway.
Dehao
On Tue, Dec 16, 2014 at 2:45 PM, Xinliang David Li davi...@google.com wrote:
Does it paper over the real bug?
David
On Tue, Dec 16, 2014 at 2:38 PM, Dehao Chen de...@google.com wrote:
This patch fixes the bug for undefined symbol in AutoFDO build.
Testing on going
This patch updates ssa and inline summary in the correct location for AutoFDO.
Bootstrapped and passed regression test. OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-11-18 Dehao Chen de...@google.com
* auto-profile.c (afdo_annotate_cfg): Invoke update_ssa in the right
place
(callee->decl)
+ else if (!flag_auto_profile && DECL_COMDAT (callee->decl)
&& growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT))
;
else if ((n = num_calls (callee)) != 0
On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen de...@google.com wrote:
We do not do sophisticated
.
Is the workaround needed for the mainline autofdo version too?
-Andi
David
On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen de...@google.com wrote:
The patch was updated to ignore comdat einline tuning for AutoFDO.
Performance testing is green.
In AutoFDO, we increase einline iterations. This could lead to
extensive code bloat if we have recursive calls like:
dtor() {
destroy(node);
}
destroy(node) {
destroy(left)
destroy(right)
}
In this case, the size growth will be around 8 which is smaller than
threshold (11). However, if we
at 2:25 PM, Dehao Chen de...@google.com wrote:
In AutoFDO, we increase einline iterations. This could lead to
extensive code bloat if we have recursive calls like:
dtor() {
destroy(node);
}
destroy(node) {
destroy(left)
destroy(right)
}
In this case, the size growth will be around 8
We do not do sophisticated recursive call detection in einline phase.
It only happens in ipa-inline phase.
Dehao
On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li davi...@google.com wrote:
On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen de...@google.com wrote:
IIRC, AutoFDO the actual iteration
The patch tested OK. I think it's a trivial patch, so I have already
committed it to trunk.
About the perf parser: I'm syncing the toolchain to head, which should
already have newer kernel support.
Thanks,
Dehao
On Wed, Oct 22, 2014 at 10:07 AM, Xinliang David Li davi...@google.com wrote:
Can
CPU. But you are
more than welcome to tune the propagation algorithm to get most out of
inaccurate instruction profile.
Cheers,
Dehao
On Tue, Oct 21, 2014 at 12:30 PM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:
On 2014.10.20 at 14:21 -0700, Dehao Chen wrote:
+If @var{path} is specified
Looks like the perf data type is incompatible with quipper (perf data
parser). Can you send me the perf.data file so that I can take a look.
Thanks,
Dehao
On Tue, Oct 21, 2014 at 2:25 PM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:
On 2014.10.21 at 13:53 -0700, Dehao Chen wrote
. Free Software Foundation, Inc.
+ Contributed by Dehao Chen (de...@google.com)
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3
and annotate call graph profile from the auto profile data file.
+ Copyright (C) 2014. Free Software Foundation, Inc.
+ Contributed by Dehao Chen (de...@google.com)
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General
This patch recalculates dominance info before the update_ssa call in
AutoFDO. This fixes a bug where out-of-date dominance info causes
segfaults during update_ssa.
Bootstrapped and regression test on-going.
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
...@google.com wrote:
Is it destroyed by value profile transformations? Can you move the
dominance recomputing code closer to where it gets invalidated?
David
On Wed, Oct 15, 2014 at 10:37 AM, Dehao Chen de...@google.com wrote:
This patch recalculates dominance info before update_ssa call
===
--- gcc/auto-profile.c (revision 0)
+++ gcc/auto-profile.c (revision 0)
@@ -0,0 +1,1664 @@
+/* Read and annotate call graph profile from the auto profile data file.
+ Copyright (C) 2014. Free Software Foundation, Inc.
+ Contributed by Dehao Chen (de...@google.com)
+
+This file is part
The new patch is attached. I used clang-format to format auto-profile.{c|h}.
Thanks,
Dehao
On Tue, Oct 14, 2014 at 2:05 PM, Dehao Chen de...@google.com wrote:
On Tue, Oct 14, 2014 at 8:02 AM, Jan Hubicka hubi...@ucw.cz wrote:
Index: gcc/cgraphclones.c
This will cause bzip2 performance to degrade 6%. I haven't had time to
triage the problem. Will investigate this later.
Still, I would prefer to default
flag_reorder_blocks_and_partition
to false with auto_profile. We could do that incrementally; let's just drop
this from
/auto-profile.c (revision 0)
@@ -0,0 +1,1662 @@
+/* Read and annotate call graph profile from the auto profile data file.
+ Copyright (C) 2014. Free Software Foundation, Inc.
+ Contributed by Dehao Chen (de...@google.com)
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute
OK for google-4_8 and google-4_9. David and Teresa may have further comments.
Dehao
On Wed, Aug 6, 2014 at 3:36 PM, Yi Yang ahyan...@google.com wrote:
This currently puts split sections together again in the specified
section and breaks DWARF output. This patch disables the partitioning
for
This patch replaces getline with fgets so that gcc builds fine on Darwin.
Testing on going, ok for google-4_9 if test passes?
Thanks,
Dehao
Index: gcc/coverage.c
===
--- gcc/coverage.c (revision 212523)
+++ gcc/coverage.c (working
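For readers unfamiliar with the getline-to-fgets substitution mentioned above, here is a minimal sketch of reading a line of arbitrary length with only fgets. It is not the code from the patch; read_line_fgets is a hypothetical helper name, and the growth policy (start at 64 bytes, double when full) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch): read one line of any length
   from FP using only fgets.  Returns a malloc'd string without the
   trailing newline, or NULL at end of file or on allocation failure.
   The caller frees the result.  */
char *
read_line_fgets (FILE *fp)
{
  size_t cap = 64, len = 0;
  char *buf;

  assert (fp != NULL);
  buf = malloc (cap);
  if (buf == NULL)
    return NULL;

  while (fgets (buf + len, (int) (cap - len), fp) != NULL)
    {
      len += strlen (buf + len);
      if (len > 0 && buf[len - 1] == '\n')
        {
          /* Complete line: strip the newline and return it.  */
          buf[len - 1] = '\0';
          return buf;
        }
      if (len + 1 == cap)
        {
          /* Buffer filled without a newline: grow and keep reading.  */
          char *bigger = realloc (buf, cap * 2);
          if (bigger == NULL)
            {
              free (buf);
              return NULL;
            }
          buf = bigger;
          cap *= 2;
        }
    }

  if (len == 0)
    {
      /* Nothing was read: end of file.  */
      free (buf);
      return NULL;
    }
  /* Last line of the file had no trailing newline.  */
  return buf;
}
```

Unlike getline, this does not handle embedded NUL bytes, since it relies on strlen to find how much fgets stored.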
wrote:
Done.
On Mon, Jun 30, 2014 at 5:20 PM, Dehao Chen de...@google.com wrote:
For get_locus_information, can you call get_inline_stack and directly use
its
output to get the function name instead?
Dehao
On Mon, Jun 30, 2014 at 4:47 PM, Yi Yang ahyan...@google.com wrote:
Removed
You don't need extra space to store file name in locus_information_t.
Use pointer instead.
Dehao
On Mon, Jun 30, 2014 at 1:36 PM, Yi Yang ahyan...@google.com wrote:
I refactored the code and added comments. A bug (prematurely breaking
from a loop) was fixed during the refactoring.
(My last
Let's use %d to replace %f (manual conversion, let's do xx%).
Dehao
On Mon, Jun 30, 2014 at 2:06 PM, Yi Yang ahyan...@google.com wrote:
Fixed.
Also, I spotted some warnings caused by me using %lfs in snprintf().
I changed these to %f and tested.
On Mon, Jun 30, 2014 at 1:49 PM, Dehao Chen
that for the actual probability, the best way to store it is to
store the edge count, since the probability is just
edge_count/bb_count. But this causes disparity in the formats of the
two probabilities.
On Mon, Jun 30, 2014 at 2:12 PM, Dehao Chen de...@google.com wrote:
Let's use %d to replace %f
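The exchange above (store the edge count, since the probability is just edge_count/bb_count, and print it with %d as "xx%") can be sketched as follows. This is an illustration under assumptions, not the code under review: edge_percent and format_edge_percent are hypothetical names, and rounding to the nearest percent and returning 0 for an unexecuted block are choices made here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: branch probability as a whole percentage,
   rounded to nearest.  Returns 0 when the block was never executed
   (an assumption made for this sketch).  */
int
edge_percent (long long edge_count, long long bb_count)
{
  assert (edge_count >= 0 && bb_count >= 0);
  if (bb_count == 0)
    return 0;
  return (int) ((edge_count * 100 + bb_count / 2) / bb_count);
}

/* Hypothetical helper: write "NN%" into BUF using %d instead of %f.
   Returns what snprintf returns.  */
int
format_edge_percent (char *buf, size_t size,
                     long long edge_count, long long bb_count)
{
  return snprintf (buf, size, "%d%%", edge_percent (edge_count, bb_count));
}
```

Keeping the raw counts and converting only at print time avoids floating-point format specifiers altogether.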
OK for google-4_8 and google-4_9
Thanks,
Dehao
On Tue, Jun 24, 2014 at 3:09 PM, Yi Yang ahyan...@google.com wrote:
Hi,
This patch removes unnecessary edge probability calculations in
afdo_propagate_circuit() that would eventually be overridden by
afdo_calculate_branch_prob().
This would
. Ok for Google/4_8?
Teresa
2014-06-12 Teresa Johnson tejohn...@google.com
Dehao Chen de...@google.com
Google ref b/15521327.
* cgraphclones.c (cgraph_clone_edge): Use resolved node.
* l-ipo.c (resolve_cgraph_node): Resolve to non-removable node
ping...
Dehao
On Fri, May 30, 2014 at 4:13 PM, Dehao Chen de...@google.com wrote:
This will increase c++ g1/g2 binary size a little. For all spec
cint2006 benchmarks, the binary size change is shown below.
400 0.00% 0.00% 0.00% 0.00%
401 0.00% 0.00% 0.00% 0.00%
403 0.00% 0.00% 0.00% 0.00
On Wed, Jun 11, 2014 at 10:38 AM, Cary Coutant ccout...@google.com wrote:
This will increase c++ g1/g2 binary size a little. For all spec
cint2006 benchmarks, the binary size change is shown below.
400 0.00% 0.00% 0.00% 0.00%
401 0.00% 0.00% 0.00% 0.00%
403 0.00% 0.00% 0.00% 0.00%
429 0.00%
This patch rebuilds frequency after vrp.
Bootstrapped and testing on-going. OK for trunk if test pass?
Thanks,
Dehao
gcc/ChangeLog:
2014-06-02 Dehao Chen de...@google.com
PR tree-optimization/61384
* tree-vrp.c (execute_vrp): rebuild frequency after vrp.
gcc/testsuite
,
Dehao
gcc/ChangeLog:
2014-06-02 Dehao Chen de...@google.com
PR tree-optimization/61384
* tree-vrp.c (execute_vrp): rebuild frequency after vrp.
gcc/testsuite/ChangeLog:
2014-06-02 Dehao Chen de...@google.com
PR tree-optimization/61384
* gcc.dg/pr61384.c
Just tried with Teresa's patch, the ICE in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 is not resolved.
Dehao
On Mon, Jun 2, 2014 at 9:45 AM, Jeff Law l...@redhat.com wrote:
On 06/02/14 10:17, Dehao Chen wrote:
We need to rebuild frequency after vrp, otherwise the following code
This patch updates the merged bb count only when they are in the same loop.
Bootstrapped and passed regression test.
Ok for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-30 Dehao Chen de...@google.com
* tree-cfg.c (gimple_merge_blocks): Only reset count when BBs
As we are pushing AutoFDO patch upstream, is this patch OK for trunk?
Thanks,
Dehao
On Mon, Aug 19, 2013 at 1:32 PM, Dehao Chen de...@google.com wrote:
After rerunning the tests, this will fail one gcc regression test. So I
updated the patch to make sure all tests pass:
Index: gcc/dwarf2out.c
Chen de...@google.com wrote:
As we are pushing AutoFDO patch upstream, is this patch OK for trunk?
Thanks,
Dehao
On Mon, Aug 19, 2013 at 1:32 PM, Dehao Chen de...@google.com wrote:
After rerunning test, this will fail one gcc regression test. So I
updated the patch to make sure all test
Thanks for the suggestion. I actually want this function to be inlined
in ipa-inline phase, not einline phase.
Dehao
On Fri, May 30, 2014 at 4:50 PM, Steven Bosscher stevenb@gmail.com wrote:
On Fri, May 30, 2014 at 11:43 PM, Dehao Chen wrote:
Index: gcc/testsuite/gcc.dg/tree-prof
This patch fixes a LIPO ICE where an unresolved node escaped after LIPO fixup.
Testing on-going. OK for google-4_9?
Thanks,
Dehao
Index: gcc/ipa.c
===
--- gcc/ipa.c (revision 210864)
+++ gcc/ipa.c (working copy)
@@ -39,6 +39,7 @@
If a loop's header count is less than iteration count, the iteration
estimation is apparently incorrect for this loop. Thus disable
unrolling of such loops.
Testing on going. OK for trunk if test pass?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-21 Dehao Chen de...@google.com
* cfgloop.h
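The sanity check described above can be sketched as a small predicate. This is only an illustration of the stated rule, not the patch itself: iteration_estimate_is_bogus is a hypothetical name, and plain integer counts stand in for GCC's profile data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: the loop header runs at least once per
   iteration, so a header count below the estimated iteration count
   means the estimate cannot be trusted and unrolling should be
   skipped for that loop.  */
bool
iteration_estimate_is_bogus (long long header_count,
                             long long estimated_iterations)
{
  assert (header_count >= 0 && estimated_iterations >= 0);
  return header_count < estimated_iterations;
}
```

A caller would consult the predicate before deciding to unroll and leave the loop alone when it returns true.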
I've updated the patch. Shall I move the check inside cgraph_clone_node?
Thanks,
Dehao
Index: gcc/ipa-inline-transform.c
===
--- gcc/ipa-inline-transform.c (revision 210535)
+++ gcc/ipa-inline-transform.c (working copy)
@@ -183,8
On Mon, May 19, 2014 at 1:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
I've updated the patch. Shall I move the check inside cgraph_clone_node?
Thanks,
I think it is OK as it is. I believe individual users should know what to do
in such cases themselves.
You may want to also check what ipa-cp is
if test pass?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-16 Dehao Chen de...@google.com
* cfghooks.c (make_forwarder_block): Use direct computation to
get fall-through edge's count and frequency.
Index: gcc/cfghooks.c
This patch uses optimize_function_for_size_p to replace old
optimize_size check in regs.h and ira-int.h to make it consistent.
Bootstrapped and testing on-going.
OK for trunk if test passes?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-16 Dehao Chen de...@google.com
* ira-int.h
This patch makes sure max count is used when merging two basic blocks.
Bootstrapped and testing on-going.
OK for trunk if test is ok?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-16 Dehao Chen de...@google.com
* tree-cfg.c (gimple_merge_blocks): Updates bb count with max count.
Index: gcc
Is this patch ok for trunk? Bootstrapped and regression test on-going.
Thanks,
Dehao
2014-05-16 Dehao Chen de...@google.com
* tree-inline.c (initialize_cfun): Ensure count_scale is no larger
than REG_BR_PROB_BASE.
(copy_cfg_body): Likewise.
Index: gcc/tree-inline.c
On Fri, May 16, 2014 at 4:41 PM, Jan Hubicka hubi...@ucw.cz wrote:
Is this patch ok for trunk? Bootstrapped and regression test on-going.
Thanks,
Dehao
2014-05-16 Dehao Chen de...@google.com
* tree-inline.c (initialize_cfun): Ensure count_scale is no larger
Do you mean adjusting bb->count? Because in
expand_call_inline (tree-inline.c), it will use bb->count to pass into
copy_body to calculate count_scale.
Thanks,
Dehao
On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka hubi...@ucw.cz wrote:
In AutoFDO, a basic block's count can be much larger than its
The previous check-in breaks the build for most applications:
http://gcc.gnu.org/viewcvs/gcc/branches/google/gcc-4_9/gcc/?view=log
This patch fixes the regression by updating highest_location.
Testing on-going,
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/input.c
The problem is that linemap_location_from_macro_expansion_p will
always return true if the locus has a discriminator. In linemap_lookup,
this leads to a call to linemap_macro_map_lookup, in which there is an
assertion:
linemap_assert (line >= LINEMAPS_MACRO_LOWEST_LOCATION (set));
However, line is
As discussed offline, this is actually due to missing parts of the
previous patch (some changes do not appear in the change log of
r199154). I've updated the patch to include those missing pieces.
Testing on going.
Dehao
On Tue, May 13, 2014 at 10:04 AM, Cary Coutant ccout...@google.com wrote:
Attached patch passes regression tests and benchmark test. OK for google-4_9?
Thanks,
Dehao
On Tue, May 13, 2014 at 10:43 AM, Dehao Chen de...@google.com wrote:
As discussed offline, this is actually due to missing parts of the
previous patch (some changes do not appear in the change log
This patch backports r199154 from google-4_8 to google-4_9
Bootstrapped and passed regression test.
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/final.c
===
--- gcc/final.c (revision 210329)
+++ gcc/final.c (working copy)
@@
Yes, this patch is a combination of all these patches. Some of them
are already in trunk.
Dehao
On Mon, May 12, 2014 at 1:28 PM, Cary Coutant ccout...@google.com wrote:
On Mon, May 12, 2014 at 1:11 PM, Dehao Chen de...@google.com wrote:
This patch backports r199154 from google-4_8 to google
We have open-sourced AutoFDO profile toolchain in:
https://github.com/google/autofdo
For GCC developers, the most important tool is create_gcov, which
converts sampling based profile to GCC-readable profile. Please refer
to the readme file
This patch handles TYPE_PACK_EXPANSION in lipo_cmp_type.
testing on going. OK for google-4_8?
Thanks,
Dehao
Index: gcc/l-ipo.c
===
--- gcc/l-ipo.c (revision 209226)
+++ gcc/l-ipo.c (working copy)
@@ -676,6 +676,7 @@
This patch calls add_fake_edge for the AutoFDO+LIPO path.
Bootstrapped and passed regression test and performance test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 209123)
+++
This patch updates SSA after VPT transformation. This is needed
because compute_inline_parameters will ICE without updated SSA.
Testing on-going.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
---
On Wed, Mar 26, 2014 at 3:54 PM, Dehao Chen de...@google.com wrote:
Patch updated, passed performance tests.
Dehao
On Tue, Mar 25, 2014 at 4:03 PM, Xinliang David Li davi...@google.com
wrote:
Add comment to the new function. init_node_map is better invoked after
the link step to avoid
.
David
On Tue, Mar 25, 2014 at 3:38 PM, Dehao Chen de...@google.com wrote:
This patch refactors LIPO fixup related code to move it into a
standalone function. This makes sure that
symtab_remove_unreachable_nodes is called right after the fixup so
that there are no dangling cgraph nodes at any time
This patch refactors LIPO fixup related code to move it into a
standalone function. This makes sure that
symtab_remove_unreachable_nodes is called right after the fixup so
that there are no dangling cgraph nodes at any time.
Bootstrapped and regression test on-going.
OK for google-4_8?
Thanks,
:
2014-03-21 Dehao Chen de...@google.com
* ipa-inline.c (early_inliner): updates overall summary.
Looks reasonable; do you have a testcase where it would make a difference?
Sorry, no small test case because this depends on autofdo profile.
The problem actually does not manifest in trunk unless
This patch guards autofdo annotation coverage recording with a flag.
Test on-going.
OK for google-4_8 if test passes?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 208753)
+++ gcc/auto-profile.c
ping ^2...
Dehao
On Mon, Feb 10, 2014 at 8:35 AM, Dehao Chen de...@google.com wrote:
ping...
Dehao
On Fri, Jan 24, 2014 at 1:54 PM, Dehao Chen de...@google.com wrote:
Thanks, test updated:
Index: gcc/testsuite/gcc.dg/predict-8.c
Hi,
This patch updates node's inline summary after edge_summary is
updated. Otherwise it could lead to incorrect inline summary.
Bootstrapped and gcc regression test on-going.
OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-03-21 Dehao Chen de...@google.com
* ipa-inline.c (early_inliner
This patch calls update_ssa before compute_inline_parameters.
Bootstrapped and perf test on-going.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 208726)
+++ gcc/auto-profile.c
) && need_ssa_update_p (cfun))
flags |= TODO_update_ssa;
}
if (flags & TODO_update_ssa_any)
{
unsigned update_flags = flags & TODO_update_ssa_any;
update_ssa (update_flags);
cfun->last_verified &= ~TODO_verify_ssa;
}
David
On Thu, Mar 20, 2014 at 10:39 AM, Dehao
On Thu, Mar 20, 2014 at 1:02 PM, Xinliang David Li davi...@google.com wrote:
On Thu, Mar 20, 2014 at 12:40 PM, Dehao Chen de...@google.com wrote:
Patch updated to add a wrapper early_inline function
Index: gcc/auto-profile.c
Thanks Cary for the comments.
Patch updated, and I also added a tool in contrib/ to dump the profile
annotation coverage.
Dehao
On Wed, Mar 12, 2014 at 9:48 AM, Cary Coutant ccout...@google.com wrote:
+void autofdo_source_profile::write_annotated_count () const
+{
+ switch_to_section
Looks good to me.
Dehao
On Wed, Mar 12, 2014 at 3:35 PM, Hán Shěn (沈涵) shen...@google.com wrote:
ARM build (on chrome) is broken because of duplicate entries in arm.md
and unspecs.md. Fixed by removing the duplicates and merging those in
arm.md into unspecs.md.
(We had a similar fix for
Looks good to me.
Dehao
On Tue, Mar 11, 2014 at 3:22 PM, Hán Shěn (沈涵) shen...@google.com wrote:
Hi, current google/main fails to build for arm because of duplicated
header file entries in gtyp-input.list.
Fixed by removing the duplication in macro tm_file. This only affects the arm
platform. Tested
During AutoFDO annotation, we want to record the annotation stats into
an ELF section, so that we can calculate what percentage of the
profile is annotated, which can be used as an indicator of whether the code
has changed significantly compared with the profiled source.
Bootstrapped and
This patch removes the size limit for loop unroll/peel when the loop
is truly hot. This makes the implementation easier to maintain between
FDO and AutoFDO.
Bootstrapped, and loadtest perf shows neutral impact.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/loop-unroll.c
This patch fixes the bug of not calling compute_inline_parameters
before early_inliner, which would lead to ICE.
Testing on going, OK for google-4_8 if test passes?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c
, Xinliang David Li davi...@google.com wrote:
On Wed, Feb 26, 2014 at 3:23 PM, Dehao Chen de...@google.com wrote:
This patch fixes the bug of not calling compute_inline_parameters
before early_inliner, which would lead to ICE.
Testing on going, OK for google-4_8 if test passes?
Thanks,
Dehao
ping...
Dehao
On Fri, Jan 24, 2014 at 1:54 PM, Dehao Chen de...@google.com wrote:
Thanks, test updated:
Index: gcc/testsuite/gcc.dg/predict-8.c
===
--- gcc/testsuite/gcc.dg/predict-8.c (revision 0)
+++ gcc/testsuite/gcc.dg
A new test is added:
gcc/testsuite/ChangeLog:
2014-01-24 Dehao Chen de...@google.com
* gcc.dg/predict-8.c: New test.
Index: gcc/testsuite/gcc.dg/predict-8.c
===
--- gcc/testsuite/gcc.dg/predict-8.c (revision 0)
+++ gcc
. Lu hjl.to...@gmail.com wrote:
On Fri, Jan 24, 2014 at 10:57 AM, Jakub Jelinek ja...@redhat.com wrote:
On Fri, Jan 24, 2014 at 10:20:53AM -0800, Dehao Chen wrote:
--- gcc/testsuite/gcc.dg/predict-8.c (revision 0)
+++ gcc/testsuite/gcc.dg/predict-8.c (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do
This patch fixes a performance regression for AutoFDO. When the entry
block count is 0, which is quite possible in AutoFDO, it can still
make the right optimization decision.
Bootstrapped and passed regression test and performance test (improves 0.5%
on average).
OK for google-4_8?
Thanks,
Dehao
Index:
, Jan 17, 2014 at 3:12 PM, Xinliang David Li davi...@google.com wrote:
Can callgraph node count be fixed up properly instead of doing
individual fixups like this?
David
On Fri, Jan 17, 2014 at 2:38 PM, Dehao Chen de...@google.com wrote:
In AutoFDO, sometime edge count might be propagated
as 1%.
Bootstrapped and passed regression test.
OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-01-22 Dehao Chen de...@google.com
* dojump.c (do_compare_rtx_and_jump): Sets correct probability for
compiler inserted conditional jumps for NAN float check.
Index: gcc/dojump.c
If a loop is cunrolled/vectorized, the AutoFDO computed trip count
will be very small. This patch disallows overwriting of the precomputed
loop bound in AutoFDO mode.
Bootstrapped and passed regression test. Performance test on-going.
OK for Google branches?
Thanks,
Dehao
Index:
In AutoFDO, sometimes an edge count might be propagated to be too large
due to bad debug info. In these cases, we need to make sure the count
scale is no larger than 100%; otherwise it'll make really hot code cold.
Bootstrapped and passed regression test. Performance test on-going.
OK for google-4_8 if
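The capping described above can be sketched as follows. This is a simplified illustration, not the patch: clamped_count_scale and apply_count_scale are hypothetical names, plain integers stand in for GCC's count types, and treating a zero callee entry count as a 100% scale is an assumption of the sketch. REG_BR_PROB_BASE is defined as 10000, the fixed-point base GCC uses for probabilities.

```c
#include <assert.h>

/* Fixed-point base for probabilities; 10000 represents 100%.  */
#define REG_BR_PROB_BASE 10000

/* Hypothetical helper: scale factor, in units of 1/REG_BR_PROB_BASE,
   for copying a callee body into a call site, capped at 100% so an
   over-propagated edge count cannot inflate the copied counts.  */
long long
clamped_count_scale (long long call_count, long long callee_entry_count)
{
  long long scale;

  assert (call_count >= 0 && callee_entry_count >= 0);
  if (callee_entry_count == 0)
    return REG_BR_PROB_BASE;	/* Assumption: no data, keep counts as-is.  */
  scale = call_count * REG_BR_PROB_BASE / callee_entry_count;
  return scale > REG_BR_PROB_BASE ? REG_BR_PROB_BASE : scale;
}

/* Hypothetical helper: apply SCALE to COUNT, rounding to nearest.  */
long long
apply_count_scale (long long count, long long scale)
{
  return (count * scale + REG_BR_PROB_BASE / 2) / REG_BR_PROB_BASE;
}
```

With the cap in place, a call edge count of 300 against a callee entry count of 100 yields a scale of 10000 rather than 30000, so the inlined copy never ends up with counts larger than the original body.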
This patch moves the LIPO linking before profile annotation so that
iterative-early-inline can cover functions from aux-module.
Bootstrapped and passed regression test and benchmark test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
This patch removes the mod_id_to_name map because the info is already
there in module_infos. Also, AutoFDO doesn't have access to update
this map because it's a file-static structure.
Bootstrapped and passed regression test.
OK for google branch?
Thanks,
Dehao
Index: gcc/coverage.c
This patch fixes a bug so that max-lipo-group is honored for AutoFDO.
Bootstrapped and passed regression test.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 206135)
+++
afdo_propagate_multi_edge can do everything afdo_propagate_single_edge
does. So we refactor the code to keep only one afdo_propagate_edge
function.
Bootstrapped and passed all unittests and performance tests.
OK for google branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
with that heuristic?
In this case, the propagate_edge function will keep increasing the BB
count. We set a threshold (PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS) to
prevent it from making BB count too large.
Dehao
Thanks. Diego.
On Mon, Nov 25, 2013 at 12:56 PM, Dehao Chen de...@google.com wrote
On Mon, Nov 25, 2013 at 10:26 AM, Diego Novillo dnovi...@google.com wrote:
On Mon, Nov 25, 2013 at 1:22 PM, Xinliang David Li davi...@google.com wrote:
In this case the backedge will be a critical edge, which will be split by
GCC.
Right. So, if I split it, I will reach essentially the same
This patch removes the zero_edge heuristic during profile propagation.
The zero_edge heuristic does not seem to be effective in improving
performance.
Tested:
Bootstrapped and passed regression test and performance test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
, Xinliang David Li davi...@google.com wrote:
On Fri, Nov 22, 2013 at 12:27 PM, Dehao Chen de...@google.com wrote:
This patch removes the zero_edge heuristic during profile propagation.
The zero_edge heuristic does not seem to be effective in improving
performance.
not effective here means degrading