On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka hubi...@ucw.cz wrote:
2013-09-29 Teresa Johnson tejohn...@google.com
* bb-reorder.c
(find_rarely_executed_basic_blocks_and_crossing_edges):
Treat profile insanities conservatively.
* predict.c (probably_never_executed):
On Thu, Oct 3, 2013 at 6:41 AM, Teresa Johnson tejohn...@google.com wrote:
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka hubi...@ucw.cz wrote:
2013-09-29 Teresa Johnson tejohn...@google.com
* bb-reorder.c
(find_rarely_executed_basic_blocks_and_crossing_edges):
Treat profile
2013-09-29 Teresa Johnson tejohn...@google.com
* bb-reorder.c (find_rarely_executed_basic_blocks_and_crossing_edges):
Treat profile insanities conservatively.
* predict.c (probably_never_executed): New function. Treat profile
insanities conservatively.
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka hubi...@ucw.cz wrote:
2013-09-29 Teresa Johnson tejohn...@google.com
* bb-reorder.c
(find_rarely_executed_basic_blocks_and_crossing_edges):
Treat profile insanities conservatively.
* predict.c (probably_never_executed):
But why do we want to consider blocks as probably never executed
when the frequency suggests they are sometimes executed?
Well, probably never executed is mean to reffer to one run. If you have
something like code handling fatal errors, you probably still want to have it
in cold secion even if
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson tejohn...@google.com wrote:
On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka hubi...@ucw.cz wrote:
I looked at one that failed after 100 as well (20031204-1.c). In this
case, it was due to expansion which was creating multiple branches/bbs
from a
On Fri, Sep 27, 2013 at 7:15 AM, Teresa Johnson tejohn...@google.com wrote:
On Thu, Sep 26, 2013 at 3:02 PM, Jan Hubicka hubi...@ucw.cz wrote:
Why not just have probably_never_executed_bb_p return simply return
false bb-frequency is non-zero (right now it does the opposite -
We want to have
On Thu, Sep 26, 2013 at 3:02 PM, Jan Hubicka hubi...@ucw.cz wrote:
Why not just have probably_never_executed_bb_p return simply return
false bb-frequency is non-zero (right now it does the opposite -
We want to have frequencies guessed for functions that was not trained
in the profiling run
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka hubi...@ucw.cz wrote:
Hi Honza,
I am finally getting back to working on this after a few weeks of
working on some other priorities.
I am also trying to return to this, so good timming ;)
Martin has got smaller C++ programs (Inkscape) to not touch
As for COMDAT merging, i would like to see the patch. I am experimenting
now with a patch to also privatize COMDATs during -fprofile-generate to
avoid problems with lost profiles mentioned above.
Do you mean you privatize every COMDAT function in the profile-generate?
We discussed
Why not just have probably_never_executed_bb_p return simply return
false bb-frequency is non-zero (right now it does the opposite -
We want to have frequencies guessed for functions that was not trained
in the profiling run (that was patch I posted earlier that I think did not
go in, yet).
On Thu, Sep 26, 2013 at 2:54 PM, Jan Hubicka hubi...@ucw.cz wrote:
As for COMDAT merging, i would like to see the patch. I am experimenting
now with a patch to also privatize COMDATs during -fprofile-generate to
avoid problems with lost profiles mentioned above.
Do you mean you
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson tejohn...@google.com wrote:
On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka hubi...@ucw.cz wrote:
I looked at one that failed after 100 as well (20031204-1.c). In this
case, it was due to expansion which was creating multiple branches/bbs
from a
On Wed, Sep 25, 2013 at 2:33 PM, Teresa Johnson tejohn...@google.com wrote:
On Tue, Sep 24, 2013 at 11:25 AM, Teresa Johnson tejohn...@google.com wrote:
On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka hubi...@ucw.cz wrote:
I looked at one that failed after 100 as well (20031204-1.c). In this
Hi Honza,
I am finally getting back to working on this after a few weeks of
working on some other priorities.
I am also trying to return to this, so good timming ;)
Martin has got smaller C++ programs (Inkscape) to not touch cold segment
during the startup with FDO (w/o partitioning).
I looked at one that failed after 100 as well (20031204-1.c). In this
case, it was due to expansion which was creating multiple branches/bbs
from a logical OR and guessing incorrectly on how to assign the
counts:
if (octets == 4 (*cp == ':' || *cp == '\0')) {
The (*cp == ':' || *cp
On Tue, Sep 24, 2013 at 10:57 AM, Jan Hubicka hubi...@ucw.cz wrote:
I looked at one that failed after 100 as well (20031204-1.c). In this
case, it was due to expansion which was creating multiple branches/bbs
from a logical OR and guessing incorrectly on how to assign the
counts:
if
Rong - can you answer the questions below on the comdat patch?
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka hubi...@ucw.cz wrote:
Hi Honza,
I am finally getting back to working on this after a few weeks of
working on some other priorities.
I am also trying to return to this, so good
Hi Honza,
I am finally getting back to working on this after a few weeks of
working on some other priorities.
On Sat, Aug 31, 2013 at 2:46 PM, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
I run my script on execute testsuite and looked into few testcases. The
problem I found
was roundoff errors -
Hi,
With Martin we made script for testing the profiling failures.
First do
ld --verbose ~/script
then apply
--- /home/jh/script22013-08-31 17:59:11.0 +0200
+++ /home/jh/script 2013-08-31 17:39:40.0 +0200
@@ -1,12 +1,3 @@
-GNU ld (GNU Binutils for Debian)
Hi,
I run my script on execute testsuite and looked into few testcases. The problem
I found
was roundoff errors - i.e. when expanding switch we set 50% chage that out of
bound
value is above or bellow. Both gets rounded to 0, because switch is executed
once
and the value is bellow.
Partly
Great! Is this the LTO merging you were talking about in an earlier
message, or the gcov runtime fix (that would presumably not be
lto-specific)?
It is the LTO path - we need to merge profiles there anyway for his code
unification
work.
I have patch to track this. Moreover vforks seems to
The esitmated profile is already there before reading the profile in, so we
only do not want to overwrite it. Does the following work for you?
It does get the estimated frequencies on the bbs.
Good.
We wil also need to solve problem that in this case cgraph edges will have
0
Can someone review and ok the attached patch for trunk? It has been
bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by
enabling -freorder-blocks-and-partition enabled for a
profiledbootstrap as well.
(Honza, see more responses inlined below. Rong, please see note below as well).
The frequency condition needs to be done only when you walk predecestors -
when
you walk down the edge probabilities are just fine.
True. For simplicity I think it should be fine to leave as-is so there
isn't more special casing as the current approach works in both
directions.
Yep,
On Fri, Aug 30, 2013 at 2:31 AM, Jan Hubicka hubi...@ucw.cz wrote:
The esitmated profile is already there before reading the profile in, so we
only do not want to overwrite it. Does the following work for you?
It does get the estimated frequencies on the bbs.
Good.
I ended up making some
On Fri, Aug 30, 2013 at 8:26 AM, Jan Hubicka hubi...@ucw.cz wrote:
The frequency condition needs to be done only when you walk predecestors -
when
you walk down the edge probabilities are just fine.
True. For simplicity I think it should be fine to leave as-is so there
isn't more
On Fri, Aug 30, 2013 at 7:50 AM, Teresa Johnson tejohn...@google.com wrote:
Can someone review and ok the attached patch for trunk? It has been
bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by
enabling -freorder-blocks-and-partition enabled for a
profiledbootstrap as well.
On Wed, Aug 28, 2013 at 11:20 AM, Teresa Johnson tejohn...@google.com wrote:
On Wed, Aug 28, 2013 at 9:58 AM, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
with Martin we did bit of progress on analyzing the problems. We now have
COMDAT profile merging for FDO
Great! Is this the LTO merging you
Hi,
with Martin we did bit of progress on analyzing the problems. We now have
COMDAT profile merging for FDO and we also noticed that forks can make your
basic blocks appear never executed even though they are executed every run:
the fork is accounted as 3 independnet runs of the program. First
On Wed, Aug 28, 2013 at 9:58 AM, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
with Martin we did bit of progress on analyzing the problems. We now have
COMDAT profile merging for FDO
Great! Is this the LTO merging you were talking about in an earlier
message, or the gcov runtime fix (that would
On Mon, Aug 19, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com wrote:
On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka hubi...@ucw.cz wrote:
Remember it isn't using dominance anymore. The latest patch was
instead ensuring the most frequent path between hot blocks and the
entry/exit are
On Wed, Aug 21, 2013 at 8:25 AM, Jan Hubicka hubi...@ucw.cz wrote:
Because offline COMDAT functoin will be porduced for every COMDAT used, I
think
it is bad to porduce any COMDAT (or any reachable function via calls with
non-0
count) that has empty profile (either because it got lost
Because offline COMDAT functoin will be porduced for every COMDAT used, I
think
it is bad to porduce any COMDAT (or any reachable function via calls with
non-0
count) that has empty profile (either because it got lost by COMDAT merging
or because of reading mismatch).
The
On Sat, Aug 17, 2013 at 1:44 PM, Jan Hubicka hubi...@ucw.cz wrote:
I added both of these and ran into issues due to profile maintenance.
For example, there were non-zero blocks in the cold section because
pro_and_epilogue split a simple return block that was previously reach
by both hot and
Remember it isn't using dominance anymore. The latest patch was
instead ensuring the most frequent path between hot blocks and the
entry/exit are marked hot. That should be better than the dominance
approach used in the earlier version.
Indeed, that looks more resonable approach.
Can you
On Sat, Aug 17, 2013 at 1:44 PM, Jan Hubicka hubi...@ucw.cz wrote:
patch for updating counts based on estimated frequencies to address
inlined comdats with 0 profile counts:
013-08-16 Teresa Johnson tejohn...@google.com
* tree-inline.c (copy_bb): Compute count based on frequency.
On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka hubi...@ucw.cz wrote:
Remember it isn't using dominance anymore. The latest patch was
instead ensuring the most frequent path between hot blocks and the
entry/exit are marked hot. That should be better than the dominance
approach used in the earlier
Dear Teresa,
On 19 August 2013 19:47, Teresa Johnson tejohn...@google.com wrote:
On Mon, Aug 19, 2013 at 8:09 AM, Jan Hubicka hubi...@ucw.cz wrote:
Remember it isn't using dominance anymore. The latest patch was
instead ensuring the most frequent path between hot blocks and the
entry/exit are
On Fri, Aug 9, 2013 at 2:02 PM, Teresa Johnson tejohn...@google.com wrote:
On Fri, Aug 9, 2013 at 8:28 AM, Jan Hubicka hubi...@ucw.cz wrote:
Do we sanity check that the cold partition does not contain any blocks of
count 0? It may be that the profile is broken enough to make partitioning
I added both of these and ran into issues due to profile maintenance.
For example, there were non-zero blocks in the cold section because
pro_and_epilogue split a simple return block that was previously reach
by both hot and cold paths. The new return block that was then only
reached via
I see, yes LTO can deal with this better since it has global
information. In non-LTO mode (including LIPO) we have the issue.
Either Martin or me will implement merging of the multiple copies at
LTO link time. This is needed for Martin's code unification patch anyway.
Theoretically gcov
Cc'ing Rong since he is also working on trying to address the comdat
profile issue. Rong, you may need to see an earlier message for more
context:
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00558.html
Teresa
On Sun, Aug 11, 2013 at 5:21 AM, Jan Hubicka hubi...@ucw.cz wrote:
I see, yes LTO
Hello,
I did a collection of systemtap graphs for GIMP.
All these graphs were created with enabled LTO, profiling and -O2.
1) gimp-reordered.pdf - function are reordered according to my newly
created profile that utilizes LTO infrastructure
2) gimp-no-top-level-reorder.pdf - (GCC rev.
Hi,
thinking about it a bit more, I suppose easiest way is to
1) make separate sets of counters for each comdat and place them
into comdat section named as DECL_COMDAT_GROUP (node) + cfg_checksum +
individual_counter_counts.
This will make linker to unify the sections for us.
2) extend API
On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
Martin Liska was kind enough to generate disk seeking graph of gimp statrup
with his function reordering.
His code simply measures time of firest execution of a function and orders
functions in the given order.
On Fri, Aug 9, 2013 at 2:58 AM, Jan Hubicka hubi...@ucw.cz wrote:
On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
Martin Liska was kind enough to generate disk seeking graph of gimp
statrup with his function reordering.
His code simply measures time of firest
Do we sanity check that the cold partition does not contain any blocks of
count 0? It may be that the profile is broken enough to make partitioning
not work.
Do you mean sanity check that the cold partition does not contain any
blocks of count 0? (they should all be zero) I don't think
Hi
On 9 August 2013 17:28, Jan Hubicka hubi...@ucw.cz wrote:
Do we sanity check that the cold partition does not contain any blocks of
count 0? It may be that the profile is broken enough to make partitioning
not work.
Do you mean sanity check that the cold partition does not contain any
On Fri, Aug 9, 2013 at 8:54 AM, Martin Liška marxin.li...@gmail.com wrote:
Hi
On 9 August 2013 17:28, Jan Hubicka hubi...@ucw.cz wrote:
Do we sanity check that the cold partition does not contain any blocks of
count 0? It may be that the profile is broken enough to make partitioning
not
On Fri, Aug 9, 2013 at 8:28 AM, Jan Hubicka hubi...@ucw.cz wrote:
Do we sanity check that the cold partition does not contain any blocks of
count 0? It may be that the profile is broken enough to make partitioning
not work.
Do you mean sanity check that the cold partition does not contain
I see, yes LTO can deal with this better since it has global
information. In non-LTO mode (including LIPO) we have the issue.
Thinking about it, there is still one problem left: I usually suggest
users to train with -fno-lto to avoid excessive linking time with
instrumentation. This actually
On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka hubi...@ucw.cz wrote:
Hi,
Martin Liska was kind enough to generate disk seeking graph of gimp statrup
with his function reordering.
His code simply measures time of firest execution of a function and orders
functions in the given order.
The
On Fri, Aug 2, 2013 at 9:48 PM, Teresa Johnson tejohn...@google.com wrote:
On Fri, Aug 2, 2013 at 8:05 AM, Jan Hubicka hubi...@ucw.cz wrote:
2013-08-01 Teresa Johnson tejohn...@google.com
Steven Bosscher ste...@gcc.gnu.org
* cfgrtl.c (fixup_bb_partition): New routine.
The patch looks OK to me in general (I can not approve it).
Still have one question...
+
+/* Ensure that no cold bbs dominate hot bbs along the dominance or
+ post-dominance DIR, for example as a result of edge weight insanities.
+ Returns the updated value of COLD_BB_COUNT and adds
On Mon, Aug 5, 2013 at 7:11 AM, Jan Hubicka hubi...@ucw.cz wrote:
The patch looks OK to me in general (I can not approve it).
Still have one question...
+
+/* Ensure that no cold bbs dominate hot bbs along the dominance or
+ post-dominance DIR, for example as a result of edge weight
On Mon, Aug 5, 2013 at 7:57 AM, Teresa Johnson tejohn...@google.com wrote:
On Mon, Aug 5, 2013 at 7:11 AM, Jan Hubicka hubi...@ucw.cz wrote:
The patch looks OK to me in general (I can not approve it).
Still have one question...
+
+/* Ensure that no cold bbs dominate hot bbs along the
On 1 August 2013 18:32, Teresa Johnson tejohn...@google.com wrote:
Patch 3 of 3 split out from the patch I sent in May that fixes problems with
-freorder-blocks-and-partition, with changes/fixes discussed in that thread.
See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for
On Fri, Aug 2, 2013 at 4:22 AM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
On 1 August 2013 18:32, Teresa Johnson tejohn...@google.com wrote:
Patch 3 of 3 split out from the patch I sent in May that fixes problems with
-freorder-blocks-and-partition, with changes/fixes discussed in
2013-08-01 Teresa Johnson tejohn...@google.com
Steven Bosscher ste...@gcc.gnu.org
* cfgrtl.c (fixup_bb_partition): New routine.
(commit_edge_insertions): Invoke fixup_partitions.
(find_partition_fixes): New routine.
(fixup_partitions):
On Fri, Aug 2, 2013 at 5:05 PM, Jan Hubicka hubi...@ucw.cz wrote:
+/* Called when block BB has been reassigned to a different partition,
+ to ensure that the region crossing attributes are updated. */
+
+static void
+fixup_bb_partition (basic_block bb)
+{
+ edge e;
+ edge_iterator ei;
On Fri, Aug 2, 2013 at 8:05 AM, Jan Hubicka hubi...@ucw.cz wrote:
2013-08-01 Teresa Johnson tejohn...@google.com
Steven Bosscher ste...@gcc.gnu.org
* cfgrtl.c (fixup_bb_partition): New routine.
(commit_edge_insertions): Invoke fixup_partitions.
On Fri, Aug 2, 2013 at 4:04 PM, Steven Bosscher stevenb@gmail.com wrote:
On Fri, Aug 2, 2013 at 5:05 PM, Jan Hubicka hubi...@ucw.cz wrote:
+/* Called when block BB has been reassigned to a different partition,
+ to ensure that the region crossing attributes are updated. */
+
+static
Patch 3 of 3 split out from the patch I sent in May that fixes problems with
-freorder-blocks-and-partition, with changes/fixes discussed in that thread.
See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context.
This patch sanitizes the partitioning to address issues such as
64 matches
Mail list logo