Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-22 Thread Jan Hubicka
On Tue, Aug 21, 2012 at 6:56 PM, Jan Hubicka hubi...@ucw.cz wrote: I can go ahead with the histogram approach. There is some roundoff error from the working set scaling approach that can affect different merging orders as you note, although I think this only really affects the small

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-22 Thread Teresa Johnson
On Tue, Aug 21, 2012 at 11:18 PM, Jan Hubicka hubi...@ucw.cz wrote: On Tue, Aug 21, 2012 at 6:56 PM, Jan Hubicka hubi...@ucw.cz wrote: I can go ahead with the histogram approach. There is some roundoff error from the working set scaling approach that can affect different merging orders

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-22 Thread Jan Hubicka
I'm concerned about inaccuracies arising when multiple runs have different sizes and therefore counters are in different buckets in the corresponding histograms. We would end up in a situation where the merged histogram has more counters in it than there are actual counters in the program.

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Xinliang David Li
On Mon, Aug 20, 2012 at 10:29 PM, Jan Hubicka hubi...@ucw.cz wrote: On Mon, Aug 20, 2012 at 6:27 PM, Jan Hubicka hubi...@ucw.cz wrote: Xinliang David Li davi...@google.com writes: Process level synchronization problems can happen when two processes (running the instrumented binary)

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Jan Hubicka
This is useful for large applications with a long tail. The instruction working set for those applications are very large, and inliner and unroller need to be aware of that and good heuristics can be developed to throttle aggressive code bloat transformations. For inliner, it is kind of the

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Xinliang David Li
On Mon, Aug 20, 2012 at 11:33 PM, Jan Hubicka hubi...@ucw.cz wrote: This is useful for large applications with a long tail. The instruction working set for those applications are very large, and inliner and unroller need to be aware of that and good heuristics can be developed to throttle

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Jan Hubicka
Teresa has done some tunings for the unroller so far. The inliner tuning is the next step. What concerns me that it is greatly inaccurate - you have no idea how many instructions given counter is guarding and it can differ quite a lot. Also inlining/optimization makes working sets

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Andi Kleen
The issue here is holding lock for all the files (that can be many) versus number of locks limits possibilities for deadlocking (mind that updating may happen in different orders on the same files for different programs built from same objects) lockf typically has a deadlock detector, and

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Xinliang David Li
On Tue, Aug 21, 2012 at 12:34 AM, Jan Hubicka hubi...@ucw.cz wrote: Teresa has done some tunings for the unroller so far. The inliner tuning is the next step. What concerns me that it is greatly inaccurate - you have no idea how many instructions given counter is guarding and it can

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Jan Hubicka
I can go ahead with the histogram approach. There is some roundoff error from the working set scaling approach that can affect different merging orders as you note, although I think this only really affects the small counter values. The other place where saving/merging the histogram Do

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-21 Thread Teresa Johnson
On Tue, Aug 21, 2012 at 6:56 PM, Jan Hubicka hubi...@ucw.cz wrote: I can go ahead with the histogram approach. There is some roundoff error from the working set scaling approach that can affect different merging orders as you note, although I think this only really affects the small

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Jan Hubicka
Well, it should store the largest working set in BBs, or one that came from a much longer run. But I see the issue you are pointing out. The num_counters (the number of hot bbs) should be reasonable, as the total number of BBs is the same between all runs, and I want to show the largest or

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Teresa Johnson
On Mon, Aug 20, 2012 at 2:48 AM, Jan Hubicka hubi...@ucw.cz wrote: Well, it should store the largest working set in BBs, or one that came from a much longer run. But I see the issue you are pointing out. The num_counters (the number of hot bbs) should be reasonable, as the total number of BBs

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Steven Bosscher
On Mon, Aug 20, 2012 at 11:48 AM, Jan Hubicka hubi...@ucw.cz wrote: The summary merging in coverage.c confuses me a bit as it seems to be handling the case when there are multiple program summaries in a single gcda file. When would this happen? It looks like the merge handling in libgcov

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Xinliang David Li
So I definitely preffer 2 or 3 over 1. David has experience with 3. How well does it work for LIPO? This (lack of locking, races) is not a new problem. There is no synchronization in libgcov for profile update/merge at both thread and process level. Thread level data races leads to

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Xinliang David Li
If this approach seems like it is feasible, then we could stick with the current approach of emitting the working set array in the summary, mitigating it somewhat by doing the sum_all based scaling of the counter values, then in a follow on patch restructure the merging code to delay the

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Teresa Johnson
On Mon, Aug 20, 2012 at 8:35 AM, Steven Bosscher stevenb@gmail.com wrote: On Mon, Aug 20, 2012 at 11:48 AM, Jan Hubicka hubi...@ucw.cz wrote: The summary merging in coverage.c confuses me a bit as it seems to be handling the case when there are multiple program summaries in a single gcda

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Steven Bosscher
On Mon, Aug 20, 2012 at 7:44 PM, Teresa Johnson tejohn...@google.com wrote: But the code in coverage.c is dealing with a single gcda file containing multiple program summaries. Is there code somewhere that will cause this to happen? Not that I know of, no. (There are enhancement requests for

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Andi Kleen
Xinliang David Li davi...@google.com writes: Process level synchronization problems can happen when two processes (running the instrumented binary) exit at the same time. The updated/merged counters from one process may be overwritten by another process -- this is true for both counter data

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Xinliang David Li
I was mistaken here -- gcov_open actually uses locking via fcntl interface -- so we do need to find a way to solve the summary update synchronization problem when it is separate out of the per-file update loop. David On Mon, Aug 20, 2012 at 12:03 PM, Andi Kleen a...@firstfloor.org wrote:

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Jan Hubicka
Xinliang David Li davi...@google.com writes: Process level synchronization problems can happen when two processes (running the instrumented binary) exit at the same time. The updated/merged counters from one process may be overwritten by another process -- this is true for both counter

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Teresa Johnson
On Mon, Aug 20, 2012 at 6:27 PM, Jan Hubicka hubi...@ucw.cz wrote: Xinliang David Li davi...@google.com writes: Process level synchronization problems can happen when two processes (running the instrumented binary) exit at the same time. The updated/merged counters from one process may be

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-20 Thread Jan Hubicka
On Mon, Aug 20, 2012 at 6:27 PM, Jan Hubicka hubi...@ucw.cz wrote: Xinliang David Li davi...@google.com writes: Process level synchronization problems can happen when two processes (running the instrumented binary) exit at the same time. The updated/merged counters from one process

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-19 Thread Teresa Johnson
On Sat, Aug 18, 2012 at 1:19 AM, Jan Hubicka hubi...@ucw.cz wrote: +{ + cs_prg-num = cs_tprg-num; + /* Allocate the working set array for the merged summary. */ + if (ws_cnt) +{ +

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-19 Thread Xinliang David Li
On Sun, Aug 19, 2012 at 9:59 PM, Teresa Johnson tejohn...@google.com wrote: On Sat, Aug 18, 2012 at 1:19 AM, Jan Hubicka hubi...@ucw.cz wrote: +{ + cs_prg-num = cs_tprg-num; + /* Allocate the working set array for the merged summary. */

Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)

2012-08-18 Thread Jan Hubicka
+{ + cs_prg-num = cs_tprg-num; + /* Allocate the working set array for the merged summary. */ + if (ws_cnt) +{ + cs_prg-working_set_count = ws_cnt; +