subject:"Re\: \[PATCH\] Convert more passes to new dump framework"

Re: [PATCH] Convert more passes to new dump framework

2013-12-21 Thread Xinliang David Li

thanks.

David

On Fri, Dec 20, 2013 at 11:48 PM, Sharad Singhai sing...@google.com wrote:
 Committed documentation as r206161. Sorry about the delay.

 Thanks,
 Sharad

 On Thu, Nov 28, 2013 at 10:03 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 10:18:05AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 10:10 AM, Martin Jambor mjam...@suse.cz wrote:
  On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
  On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com 
  wrote:
   On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
   Hi,
  
   On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
   This patch ports messages to the new dump framework,
  
   It would be great this new framework was documented somewhere.  I lost
   track of what was agreed it would be and from the uses in the
   vectorizer I was never quite sure how to utilize it in other passes.
  
   Sharad, can you put the documentation in GCC wiki.
 
  Sure. I had user documentation in form of gcc info. But I will add
  more developer details to a GCC wiki.
 
 
  I have built trunk gccint.info yesterday but could not find any string
  dump_enabled_p there, for example.  And when I quickly searched just
  for the string dump, I did not find any thing that looked like
  dumping infrastructure either.  OTOH, I agree that fie would be the
  best place for the documentation.
 
  Or did I just miss it?  What section is it in then?

 Actually, the user-facing documentation is in doc/invoke.texi.
 However, that doesn't describe dump_enabled_p. Do you think
 gccint.info would be a good place? I can add documentation there
 instead of creating a GCC wiki.


 please do not forget about this, otherwise few people will use your
 framework.

 Thanks,

 Martin

Re: [PATCH] Convert more passes to new dump framework

2013-12-20 Thread Sharad Singhai

Committed documentation as r206161. Sorry about the delay.

Thanks,
Sharad

On Thu, Nov 28, 2013 at 10:03 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 10:18:05AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 10:10 AM, Martin Jambor mjam...@suse.cz wrote:
  On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
  On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com 
  wrote:
   On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
   Hi,
  
   On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
   This patch ports messages to the new dump framework,
  
   It would be great this new framework was documented somewhere.  I lost
   track of what was agreed it would be and from the uses in the
   vectorizer I was never quite sure how to utilize it in other passes.
  
   Sharad, can you put the documentation in GCC wiki.
 
  Sure. I had user documentation in form of gcc info. But I will add
  more developer details to a GCC wiki.
 
 
  I have built trunk gccint.info yesterday but could not find any string
  dump_enabled_p there, for example.  And when I quickly searched just
  for the string dump, I did not find any thing that looked like
  dumping infrastructure either.  OTOH, I agree that fie would be the
  best place for the documentation.
 
  Or did I just miss it?  What section is it in then?

 Actually, the user-facing documentation is in doc/invoke.texi.
 However, that doesn't describe dump_enabled_p. Do you think
 gccint.info would be a good place? I can add documentation there
 instead of creating a GCC wiki.


 please do not forget about this, otherwise few people will use your
 framework.

 Thanks,

 Martin

Re: [PATCH] Convert more passes to new dump framework

2013-11-28 Thread Martin Jambor

Hi,

On Tue, Aug 06, 2013 at 10:18:05AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 10:10 AM, Martin Jambor mjam...@suse.cz wrote:
  On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
  On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com 
  wrote:
   On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
   Hi,
  
   On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
   This patch ports messages to the new dump framework,
  
   It would be great this new framework was documented somewhere.  I lost
   track of what was agreed it would be and from the uses in the
   vectorizer I was never quite sure how to utilize it in other passes.
  
   Sharad, can you put the documentation in GCC wiki.
 
  Sure. I had user documentation in form of gcc info. But I will add
  more developer details to a GCC wiki.
 
 
  I have built trunk gccint.info yesterday but could not find any string
  dump_enabled_p there, for example.  And when I quickly searched just
  for the string dump, I did not find any thing that looked like
  dumping infrastructure either.  OTOH, I agree that fie would be the
  best place for the documentation.
 
  Or did I just miss it?  What section is it in then?
 
 Actually, the user-facing documentation is in doc/invoke.texi.
 However, that doesn't describe dump_enabled_p. Do you think
 gccint.info would be a good place? I can add documentation there
 instead of creating a GCC wiki.
 

please do not forget about this, otherwise few people will use your
framework.

Thanks,

Martin

Re: [PATCH] Convert more passes to new dump framework

2013-09-04 Thread Richard Biener

On Tue, Sep 3, 2013 at 9:39 PM, Teresa Johnson tejohn...@google.com wrote:
 On Fri, Aug 30, 2013 at 11:28 PM, Sharad Singhai sing...@google.com wrote:
 Found the issue. The stream was incorrectly being closed when it was
 stderr/stdout. So only the dump output before the first dump_finish
 call was being emitted to stderr. I fixed this the same way the
 alt_dump_file was being handled just below - don't close if it is
 stderr/stdout. Confirmed that this fixes the problem.

 (So the real ratio between the volume of -fdump-...=stderr and
 -fopt-info is much higher than what I reported in an earlier email)

 Is the following patch ok, pending regression tests?

 2013-08-30  Teresa Johnson  tejohn...@google.com

 * dumpfile.c (dump_finish): Don't close stderr/stdout.

 Index: dumpfile.c
 ===
 --- dumpfile.c  (revision 202059)
 +++ dumpfile.c  (working copy)
 @@ -450,7 +450,8 @@ dump_finish (int phase)
if (phase  0)
  return;
dfi = get_dump_file_info (phase);
 -  if (dfi-pstream)
 +  if (dfi-pstream  strcmp(stderr, dfi-pfilename) != 0
 +   strcmp(stdout, dfi-pfilename) != 0)
  fclose (dfi-pstream);

if (dfi-alt_stream  strcmp(stderr, dfi-alt_filename) != 0

 Yes, this is clearly a bug which I missed. Thanks for fixing it. Is it
 feasible to add a test case for it?

 Thanks,
 Sharad


 Good idea. I modified an existing test to dump to stderr instead of a
 dump file. Since it has 2 functions with messages from each, I
 confirmed that it exposed the bug.

 Here is the full patch. Bootstrapped and tested on
 x86_64-unknown-linux-gnu. Ok for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 Teresa

 2013-09-03  Teresa Johnson  tejohn...@google.com

 * dumpfile.c (dump_finish): Don't close stderr/stdout.

 * testsuite/gcc.dg/unroll_1.c: Test dumping to stderr.

 Index: dumpfile.c
 ===
 --- dumpfile.c  (revision 202121)
 +++ dumpfile.c  (working copy)
 @@ -450,7 +450,9 @@ dump_finish (int phase)
if (phase  0)
  return;
dfi = get_dump_file_info (phase);
 -  if (dfi-pstream)
 +  if (dfi-pstream  (!dfi-pfilename
 +   || (strcmp(stderr, dfi-pfilename) != 0
 +strcmp(stdout, dfi-pfilename) != 0)))
  fclose (dfi-pstream);

if (dfi-alt_stream  strcmp(stderr, dfi-alt_filename) != 0
 Index: testsuite/gcc.dg/unroll_1.c
 ===
 --- testsuite/gcc.dg/unroll_1.c (revision 202121)
 +++ testsuite/gcc.dg/unroll_1.c (working copy)
 @@ -1,5 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops
 -fdisable-tree-cunroll -fdisable-tree-cunrolli
 -fenable-rtl-loop2_unroll } */
 +/* { dg-options -O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops
 -fdisable-tree-cunroll -fdisable-tree-cunrolli
 -fenable-rtl-loop2_unroll } */

  unsigned a[100], b[100];
  inline void bar()
 @@ -11,7 +11,7 @@ int foo(void)
  {
int i;
bar();
 -  for (i = 0; i  2; i++)
 +  for (i = 0; i  2; i++) /* { dg-message note: loop turned into
 non-loop; it never loops } */
{
   a[i]= b[i] + 1;
}
 @@ -21,12 +21,10 @@ int foo(void)
  int foo2(void)
  {
int i;
 -  for (i = 0; i  2; i++)
 +  for (i = 0; i  2; i++) /* { dg-message note: loop turned into
 non-loop; it never loops } */
{
   a[i]= b[i] + 1;
}
return 1;
  }
 -
 -/* { dg-final { scan-rtl-dump-times loop turned into non-loop; it
 never loops 2 loop2_unroll } } */
 -/* { dg-final { cleanup-rtl-dump loop2_unroll } } */
 +/* { dg-prune-output .* } */


 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Convert more passes to new dump framework

2013-09-03 Thread Teresa Johnson

On Fri, Aug 30, 2013 at 11:28 PM, Sharad Singhai sing...@google.com wrote:
 Found the issue. The stream was incorrectly being closed when it was
 stderr/stdout. So only the dump output before the first dump_finish
 call was being emitted to stderr. I fixed this the same way the
 alt_dump_file was being handled just below - don't close if it is
 stderr/stdout. Confirmed that this fixes the problem.

 (So the real ratio between the volume of -fdump-...=stderr and
 -fopt-info is much higher than what I reported in an earlier email)

 Is the following patch ok, pending regression tests?

 2013-08-30  Teresa Johnson  tejohn...@google.com

 * dumpfile.c (dump_finish): Don't close stderr/stdout.

 Index: dumpfile.c
 ===
 --- dumpfile.c  (revision 202059)
 +++ dumpfile.c  (working copy)
 @@ -450,7 +450,8 @@ dump_finish (int phase)
if (phase  0)
  return;
dfi = get_dump_file_info (phase);
 -  if (dfi-pstream)
 +  if (dfi-pstream  strcmp(stderr, dfi-pfilename) != 0
 +   strcmp(stdout, dfi-pfilename) != 0)
  fclose (dfi-pstream);

if (dfi-alt_stream  strcmp(stderr, dfi-alt_filename) != 0

 Yes, this is clearly a bug which I missed. Thanks for fixing it. Is it
 feasible to add a test case for it?

 Thanks,
 Sharad


Good idea. I modified an existing test to dump to stderr instead of a
dump file. Since it has 2 functions with messages from each, I
confirmed that it exposed the bug.

Here is the full patch. Bootstrapped and tested on
x86_64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

2013-09-03  Teresa Johnson  tejohn...@google.com

* dumpfile.c (dump_finish): Don't close stderr/stdout.

* testsuite/gcc.dg/unroll_1.c: Test dumping to stderr.

Index: dumpfile.c
===
--- dumpfile.c  (revision 202121)
+++ dumpfile.c  (working copy)
@@ -450,7 +450,9 @@ dump_finish (int phase)
   if (phase  0)
 return;
   dfi = get_dump_file_info (phase);
-  if (dfi-pstream)
+  if (dfi-pstream  (!dfi-pfilename
+   || (strcmp(stderr, dfi-pfilename) != 0
+strcmp(stdout, dfi-pfilename) != 0)))
 fclose (dfi-pstream);

   if (dfi-alt_stream  strcmp(stderr, dfi-alt_filename) != 0
Index: testsuite/gcc.dg/unroll_1.c
===
--- testsuite/gcc.dg/unroll_1.c (revision 202121)
+++ testsuite/gcc.dg/unroll_1.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops
-fdisable-tree-cunroll -fdisable-tree-cunrolli
-fenable-rtl-loop2_unroll } */
+/* { dg-options -O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops
-fdisable-tree-cunroll -fdisable-tree-cunrolli
-fenable-rtl-loop2_unroll } */

 unsigned a[100], b[100];
 inline void bar()
@@ -11,7 +11,7 @@ int foo(void)
 {
   int i;
   bar();
-  for (i = 0; i  2; i++)
+  for (i = 0; i  2; i++) /* { dg-message note: loop turned into
non-loop; it never loops } */
   {
  a[i]= b[i] + 1;
   }
@@ -21,12 +21,10 @@ int foo(void)
 int foo2(void)
 {
   int i;
-  for (i = 0; i  2; i++)
+  for (i = 0; i  2; i++) /* { dg-message note: loop turned into
non-loop; it never loops } */
   {
  a[i]= b[i] + 1;
   }
   return 1;
 }
-
-/* { dg-final { scan-rtl-dump-times loop turned into non-loop; it
never loops 2 loop2_unroll } } */
-/* { dg-final { cleanup-rtl-dump loop2_unroll } } */
+/* { dg-prune-output .* } */


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Convert more passes to new dump framework

2013-09-02 Thread Richard Biener

On Fri, Aug 30, 2013 at 6:27 PM, Xinliang David Li davi...@google.com wrote:
 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

So?  You asked for it and you can easily grep the output before storing it
away.

 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.  How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)


 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages
-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

 thanks,

 David

 On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson tejohn...@google.com wrote:
 On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

 Ick.  This looks both redundant and cluttered.  This is supposed to be
 understandable by GCC users, not only GCC developers.

 The main part that is only useful/understandable to gcc developers is
 the node-symbol.order in square brackes, requested by Martin. One
 possibility is that I could put that part under a param, disabled by
 default. We have something similar on the google branches that emits
 LIPO module info in the message, enabled via a param.

 But we have _dump files_ for that.  That's the developer-consumed
 form of opt-info.  -fopt-info is purely user sugar and for usual 
 translation
 units it shouldn't exceed a single terminal full of output.

 But as a developer I don't want to have to parse lots of dump files
 for a summary of the major optimizations performed (e.g. inlining,
 unrolling) for an application, unless I am diving into the reasons for
 why or why not one of those optimizations occurred in a particular
 location. I really do want a summary emitted to stderr so that it is
 easily searchable/summarizable for the app as a whole.

 For example, some of the apps I am interested in have thousands of
 input files, and trying to collect and parse dump files for each and
 every one is overwhelming (it probably would be even if my input files
 numbered in the hundreds). What has been very useful is having these
 high level summary messages of inlines and unrolls emitted to stderr
 by -fopt-info. Then it is easy to search and sort by hotness to get a
 feel for things like what inlines are missing when moving to a new
 compiler, or compiling a new version of the source, for example. Then
 you know which files to focus on and collect dump files for.

 I thought we can direct dump files to stderr now?  So, just use
 -fdump-tree-all=stderr

 and grep its contents.


 I'd argue that the other information (the profile counts, emitted only
 when using -fprofile-use, and the inline call chains) are useful if
 you want to understand whether and how critical inlines are occurring.
 I think this is the type of information that users focused on
 optimizations, as well as gcc developers, want when they use
 -fopt-info. Otherwise it is difficult to make sense of the inline
 information.

 Well, I doubt that inline information is interesting to users unless we are
 able to aggressively filter it to what users are interested in.  Which IMHO
 isn't possible - users are interested in I have not inlined this even 
 though
 inlining would severely improve performance which would indicate a bug
 in the heuristics we can reliably detect and thus it wouldn't be there.

 I have interacted with users who are aware of optimizations such as
 inlining and unrolling and want to look at that information to
 diagnose performance differences when refactoring code or using a new
 compiler version. I also think inlining (especially cross-module) is
 one example of an optimization that is still being tuned, and user
 reports of performance issues related to that have been useful.

 I really think that the two groups of people who will find -fopt-info
 useful are gcc developers and savvy performance-hungry users. For the
 former group the additional info

Re: [PATCH] Convert more passes to new dump framework

2013-09-02 Thread Richard Biener

On Fri, Aug 30, 2013 at 9:51 PM, Teresa Johnson tejohn...@google.com wrote:
 On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com wrote:
 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

 Yes. I did some measurements for both a fairly large source file that
 is heavily optimized with LIPO and for a simple toy example that has
 some inlining. For the large source file, the output from
 -fdump-ipa-inline=stderr was almost 100x the line count of the
 -fopt-info output. For the toy source file it was 43x. The size of the
 -details output was 250x and 100x, respectively. Which is untenable
 for a large app.

 The issue I am having here is that I want a more verbose message, not
 a more voluminous set of messages. Using either -fopt-info-all or
 -fdump-ipa-inline to provoke the more verbose inline message will give
 me a much greater volume of output.

I think we will never reach the state where the dumping is exactly what
each developer wants (because their wants will differ).  Developers can
easily post-process the stderr output with piping through grep.

Richard.

 One compromise could be to emit the more verbose inliner message under
 a param (and a more concise foo inlined into bar by default with
 -fopt-info). Or we could do some variant of what David talks about
 below.

 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

 How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)

 (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )

 Typically OR-ing together flags like this indicates dump under any of
 those conditions. But we could implement special handling for
 OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
 the primary dump file, and only under the other conditions specified
 in the flag (here under -optimized)



 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages

 According to the documentation (see the -fdump-tree- documentation on
 http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
 the above are already supposed to be there (-optimized, -missed, -note
 and -optall). However, specifying any of these gives a warning like:
cc1: warning: ignoring unknown option ‘optimized’ in
 ‘-fdump-ipa-inline’ [enabled by default]
 Probably because none is listed in the dump_options[] array in dumpfile.c.

 However, I don't think there is currently a way to use -fdump- options
 and *only* get one of these, as much of the current dump output is
 emitted whenever there is a dump_file defined. Until everything is
 migrated to the new framework it may be difficult to get this to work.

-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

 Yes, if we can figure out a good way to get this to work (i.e. only
 emit the optimized messages and not the rest of the dump messages).
 And unfortunately to get them all you need to specify
 -fdump-ipa-all-optimized -fdump-tree-all-optimized
 -fdump-rtl-all-optimized instead of just -fopt-info. Unless we can
 add -fdump-all-all-optimized.

 Teresa


 thanks,

 David

 On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

 Ick.  This looks both redundant and cluttered.  This is supposed to be
 understandable by GCC users, not only GCC developers.

 The main part that is only useful/understandable to gcc developers is
 the node-symbol.order in square brackes, requested

Re: [PATCH] Convert more passes to new dump framework

2013-09-02 Thread Richard Biener

On Fri, Aug 30, 2013 at 11:23 PM, Teresa Johnson tejohn...@google.com wrote:
 On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li davi...@google.com wrote:
 On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com 
 wrote:
 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

 Yes. I did some measurements for both a fairly large source file that
 is heavily optimized with LIPO and for a simple toy example that has
 some inlining. For the large source file, the output from
 -fdump-ipa-inline=stderr was almost 100x the line count of the
 -fopt-info output. For the toy source file it was 43x. The size of the
 -details output was 250x and 100x, respectively. Which is untenable
 for a large app.

 The issue I am having here is that I want a more verbose message, not
 a more voluminous set of messages. Using either -fopt-info-all or
 -fdump-ipa-inline to provoke the more verbose inline message will give
 me a much greater volume of output.

 One compromise could be to emit the more verbose inliner message under
 a param (and a more concise foo inlined into bar by default with
 -fopt-info). Or we could do some variant of what David talks about
 below.

 something like --param=verbose-opt-info=1

 Yes. Richard, would this be acceptable for now?

 i.e. the inliner messages would be like:

 -fopt-info:
test.c:8:3: note: foobar inlined into foo with call count 9000
 (the with call count X only when there is profile feedback)

 -fopt-info --param=verbose-opt-info=1:
test.c:8:3: note: foobar/0 (9000) inlined into foo/2 (1000)
 with call count 9000 (via inline instance bar [3] (9000))
 (again the call counts only emitted under profile feedback)

It looks like a hack to me.  Is -fdump-ipa-inline useful at all?  That is,
can't we simply push some of the -details dumping into the non-details
dump?

Richard.




 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

 It works for vectorizer pass.

 Ok, let me see what is going on - I just confirmed that it is not
 working for the loop unroller messages either.



 How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)

 (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )


 Yes.

 Typically OR-ing together flags like this indicates dump under any of
 those conditions. But we could implement special handling for
 OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
 the primary dump file, and only under the other conditions specified
 in the flag (here under -optimized)



 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages

 According to the documentation (see the -fdump-tree- documentation on
 http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
 the above are already supposed to be there (-optimized, -missed, -note
 and -optall). However, specifying any of these gives a warning like:
cc1: warning: ignoring unknown option ‘optimized’ in
 ‘-fdump-ipa-inline’ [enabled by default]
 Probably because none is listed in the dump_options[] array in dumpfile.c.

 However, I don't think there is currently a way to use -fdump- options
 and *only* get one of these, as much of the current dump output is
 emitted whenever there is a dump_file defined. Until everything is
 migrated to the new framework it may be difficult to get this to work.

-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

 Yes, if we can figure out a good way to get this to work (i.e. only
 emit the optimized messages and not the rest of the dump messages).
 And unfortunately to get them all you need to specify
 -fdump-ipa-all-optimized -fdump-tree-all-optimized
 -fdump-rtl-all-optimized instead of just -fopt-info. Unless we can
 add -fdump-all-all-optimized.

 Having general support requires cleanup of all the old style  if
 (dump_file) fprintf

Re: [PATCH] Convert more passes to new dump framework

2013-08-31 Thread Sharad Singhai

On Fri, Aug 30, 2013 at 9:58 PM, Teresa Johnson tejohn...@google.com wrote:
 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

 It works for vectorizer pass.

 Ok, let me see what is going on - I just confirmed that it is not
 working for the loop unroller messages either.


 Found the issue. The stream was incorrectly being closed when it was
 stderr/stdout. So only the dump output before the first dump_finish
 call was being emitted to stderr. I fixed this the same way the
 alt_dump_file was being handled just below - don't close if it is
 stderr/stdout. Confirmed that this fixes the problem.

 (So the real ratio between the volume of -fdump-...=stderr and
 -fopt-info is much higher than what I reported in an earlier email)

 Is the following patch ok, pending regression tests?

 2013-08-30  Teresa Johnson  tejohn...@google.com

 * dumpfile.c (dump_finish): Don't close stderr/stdout.

 Index: dumpfile.c
 ===
 --- dumpfile.c  (revision 202059)
 +++ dumpfile.c  (working copy)
 @@ -450,7 +450,8 @@ dump_finish (int phase)
if (phase  0)
  return;
dfi = get_dump_file_info (phase);
 -  if (dfi-pstream)
 +  if (dfi-pstream  strcmp(stderr, dfi-pfilename) != 0
 +   strcmp(stdout, dfi-pfilename) != 0)
  fclose (dfi-pstream);

if (dfi-alt_stream  strcmp(stderr, dfi-alt_filename) != 0

Yes, this is clearly a bug which I missed. Thanks for fixing it. Is it
feasible to add a test case for it?

Thanks,
Sharad

 Thanks,
 Teresa

Re: [PATCH] Convert more passes to new dump framework

2013-08-31 Thread Bernhard Reutner-Fischer


On 30 August 2013 23:23:16 Teresa Johnson tejohn...@google.com wrote:

On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li davi...@google.com wrote:
 On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson tejohn...@google.com 
wrote:
 On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com 
wrote:

 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

 Yes. I did some measurements for both a fairly large source file that
 is heavily optimized with LIPO and for a simple toy example that has
 some inlining. For the large source file, the output from
 -fdump-ipa-inline=stderr was almost 100x the line count of the
 -fopt-info output. For the toy source file it was 43x. The size of the
 -details output was 250x and 100x, respectively. Which is untenable
 for a large app.

 The issue I am having here is that I want a more verbose message, not
 a more voluminous set of messages. Using either -fopt-info-all or
 -fdump-ipa-inline to provoke the more verbose inline message will give
 me a much greater volume of output.

 One compromise could be to emit the more verbose inliner message under
 a param (and a more concise foo inlined into bar by default with
 -fopt-info). Or we could do some variant of what David talks about
 below.

 something like --param=verbose-opt-info=1

Yes. Richard, would this be acceptable for now?

i.e. the inliner messages would be like:

-fopt-info:
   test.c:8:3: note: foobar inlined into foo with call count 9000
(the with call count X only when there is profile feedback)

-fopt-info --param=verbose-opt-info=1:
   test.c:8:3: note: foobar/0 (9000) inlined into foo/2 (1000)
with call count 9000 (via inline instance bar [3] (9000))
(again the call counts only emitted under profile feedback)


Assuming the [3] is order, please change that to match what the in liner 
uses, I.e. /3


Thanks





 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

 It works for vectorizer pass.

Ok, let me see what is going on - I just confirmed that it is not
working for the loop unroller messages either.



 How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)

 (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )


 Yes.

 Typically OR-ing together flags like this indicates dump under any of
 those conditions. But we could implement special handling for
 OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
 the primary dump file, and only under the other conditions specified
 in the flag (here under -optimized)



 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages

 According to the documentation (see the -fdump-tree- documentation on
 http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
 the above are already supposed to be there (-optimized, -missed, -note
 and -optall). However, specifying any of these gives a warning like:
cc1: warning: ignoring unknown option ‘optimized’ in
 ‘-fdump-ipa-inline’ [enabled by default]
 Probably because none is listed in the dump_options[] array in dumpfile.c.

 However, I don't think there is currently a way to use -fdump- options
 and *only* get one of these, as much of the current dump output is
 emitted whenever there is a dump_file defined. Until everything is
 migrated to the new framework it may be difficult to get this to work.

-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

 Yes, if we can figure out a good way to get this to work (i.e. only
 emit the optimized messages and not the rest of the dump messages).
 And unfortunately to get them all you need to specify
 -fdump-ipa-all-optimized -fdump-tree-all-optimized
 -fdump-rtl-all-optimized instead of just -fopt-info. Unless we can
 add -fdump-all-all-optimized.

 Having general support requires cleanup of all the old style  if
 (dump_file) fprintf (dump_file, ...) instances to be:

   if (dump_enabled_p ())
 dump_printf

Re: [PATCH] Convert more passes to new dump framework

2013-08-31 Thread Teresa Johnson

On Sat, Aug 31, 2013 at 12:26 AM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 On 30 August 2013 23:23:16 Teresa Johnson tejohn...@google.com wrote:

 On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li davi...@google.com
 wrote:
  On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson tejohn...@google.com
  wrote:
  On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com
  wrote:
  Except that in this form, the dump will be extremely large and not
  suitable for very large applications.
 
  Yes. I did some measurements for both a fairly large source file that
  is heavily optimized with LIPO and for a simple toy example that has
  some inlining. For the large source file, the output from
  -fdump-ipa-inline=stderr was almost 100x the line count of the
  -fopt-info output. For the toy source file it was 43x. The size of the
  -details output was 250x and 100x, respectively. Which is untenable
  for a large app.
 
  The issue I am having here is that I want a more verbose message, not
  a more voluminous set of messages. Using either -fopt-info-all or
  -fdump-ipa-inline to provoke the more verbose inline message will give
  me a much greater volume of output.
 
  One compromise could be to emit the more verbose inliner message under
  a param (and a more concise foo inlined into bar by default with
  -fopt-info). Or we could do some variant of what David talks about
  below.
 
  something like --param=verbose-opt-info=1

 Yes. Richard, would this be acceptable for now?

 i.e. the inliner messages would be like:

 -fopt-info:
test.c:8:3: note: foobar inlined into foo with call count 9000
 (the with call count X only when there is profile feedback)

 -fopt-info --param=verbose-opt-info=1:
test.c:8:3: note: foobar/0 (9000) inlined into foo/2 (1000)
 with call count 9000 (via inline instance bar [3] (9000))
 (again the call counts only emitted under profile feedback)


 Assuming the [3] is order, please change that to match what the in liner
 uses, I.e. /3

Agreed - I meant to switch that back to / in both places but missed
the last. It should read:

test.c:8:3: note: foobar/0 (9000) inlined into foo/2 (1000) with
call count 9000 (via inline instance bar/3 (9000))

Thanks,
Teresa


 Thanks


 
 
 
  Besides, we might also want to
  use the same machinery (dump_printf_loc etc) for dump file dumping.
  The current behavior of using '-details' to turn on opt-info-all
  messages for dump files are not desirable.
 
  Interestingly, this doesn't even work. When I do
  -fdump-ipa-inline-details=stderr (with my patch containing the inliner
  messages) I am not getting those inliner messages emitted to stderr.
  Even though in dumpfile.c details is set to (TDF_DETAILS |
  MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
  sure why, but will need to debug this.
 
  It works for vectorizer pass.

 Ok, let me see what is going on - I just confirmed that it is not
 working for the loop unroller messages either.

 
 
  How about the following:
 
  1) add a new dump_kind modifier so that when that modifier is
  specified, the messages won't goto the alt_dumpfile (controlled by
  -fopt-info), but only to primary dump file. With this, the inline
  messages can be dumped via:
 
 dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY,
  .)
 
  (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )
 
 
  Yes.
 
  Typically OR-ing together flags like this indicates dump under any of
  those conditions. But we could implement special handling for
  OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
  the primary dump file, and only under the other conditions specified
  in the flag (here under -optimized)
 
 
 
  2) add more flags in -fdump- support:
 
 -fdump-ipa-inline-opt   -- turn on opt-info messages only
 -fdump-ipa-inline-optall -- turn on opt-info-all messages
 
  According to the documentation (see the -fdump-tree- documentation on
 
  http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
  the above are already supposed to be there (-optimized, -missed, -note
  and -optall). However, specifying any of these gives a warning like:
 cc1: warning: ignoring unknown option ‘optimized’ in
  ‘-fdump-ipa-inline’ [enabled by default]
  Probably because none is listed in the dump_options[] array in
  dumpfile.c.
 
  However, I don't think there is currently a way to use -fdump- options
  and *only* get one of these, as much of the current dump output is
  emitted whenever there is a dump_file defined. Until everything is
  migrated to the new framework it may be difficult to get this to work.
 
 -fdump-tree-pre-ir -- turn on GIMPLE dump only
 -fdump-tree-pre-details -- turn on everything (ir, optall, trace)
 
  With this, developers can really just use
 
 
  -fdump-ipa-inline-opt=stderr for inline messages.
 
  Yes, if we can figure out a good way to get this to work (i.e. only

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Andreas Schwab

Teresa Johnson tejohn...@google.com writes:

 Index: testsuite/gcc.dg/inline-dump.c
 ===
 --- testsuite/gcc.dg/inline-dump.c  (revision 0)
 +++ testsuite/gcc.dg/inline-dump.c  (revision 0)
 @@ -0,0 +1,11 @@
 +/* Verify that -fopt-info can output correct inline info.  */
 +/* { dg-do compile } */
 +/* { dg-options -Wall -fopt-info-inline=stderr -O2 -fno-early-inlining } */
 +static inline int leaf() {
 +  int i, ret = 0;
 +  for (i = 0; i  10; i++)
 +ret += i;
 +  return ret;
 +}
 +static inline int foo(void) { return leaf(); } /* { dg-message note:
 leaf .*inlined into bar .*via inline instance foo.*\n } */

I don't see that message, neither on ia64 nor m68k.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Richard Biener

On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson tejohn...@google.com wrote:
 On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

 Ick.  This looks both redundant and cluttered.  This is supposed to be
 understandable by GCC users, not only GCC developers.

 The main part that is only useful/understandable to gcc developers is
 the node-symbol.order in square brackes, requested by Martin. One
 possibility is that I could put that part under a param, disabled by
 default. We have something similar on the google branches that emits
 LIPO module info in the message, enabled via a param.

 But we have _dump files_ for that.  That's the developer-consumed
 form of opt-info.  -fopt-info is purely user sugar and for usual translation
 units it shouldn't exceed a single terminal full of output.

 But as a developer I don't want to have to parse lots of dump files
 for a summary of the major optimizations performed (e.g. inlining,
 unrolling) for an application, unless I am diving into the reasons for
 why or why not one of those optimizations occurred in a particular
 location. I really do want a summary emitted to stderr so that it is
 easily searchable/summarizable for the app as a whole.

 For example, some of the apps I am interested in have thousands of
 input files, and trying to collect and parse dump files for each and
 every one is overwhelming (it probably would be even if my input files
 numbered in the hundreds). What has been very useful is having these
 high level summary messages of inlines and unrolls emitted to stderr
 by -fopt-info. Then it is easy to search and sort by hotness to get a
 feel for things like what inlines are missing when moving to a new
 compiler, or compiling a new version of the source, for example. Then
 you know which files to focus on and collect dump files for.

I thought we can direct dump files to stderr now?  So, just use
-fdump-tree-all=stderr

and grep its contents.


 I'd argue that the other information (the profile counts, emitted only
 when using -fprofile-use, and the inline call chains) are useful if
 you want to understand whether and how critical inlines are occurring.
 I think this is the type of information that users focused on
 optimizations, as well as gcc developers, want when they use
 -fopt-info. Otherwise it is difficult to make sense of the inline
 information.

 Well, I doubt that inline information is interesting to users unless we are
 able to aggressively filter it to what users are interested in.  Which IMHO
 isn't possible - users are interested in I have not inlined this even though
 inlining would severely improve performance which would indicate a bug
 in the heuristics we can reliably detect and thus it wouldn't be there.

 I have interacted with users who are aware of optimizations such as
 inlining and unrolling and want to look at that information to
 diagnose performance differences when refactoring code or using a new
 compiler version. I also think inlining (especially cross-module) is
 one example of an optimization that is still being tuned, and user
 reports of performance issues related to that have been useful.

 I really think that the two groups of people who will find -fopt-info
 useful are gcc developers and savvy performance-hungry users. For the
 former group the additional info is extremely useful. For the latter
 group some of the extra information may not be required (although a
 call count is useful for those using profile feedback), but IMO is not
 unreasonable.

well, your proposed output wrecks my 80x24 terminal already due to overly
long lines.

In the end we may up with a verbosity level for each sub-set of opt-info
messages.  Ick.

Richard.

 Teresa


 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Teresa Johnson

Sorry, I should not have committed that new test along with this
portion of the patch. Removed as of r202106.
Teresa

On Fri, Aug 30, 2013 at 12:17 AM, Andreas Schwab sch...@linux-m68k.org wrote:
 Teresa Johnson tejohn...@google.com writes:

 Index: testsuite/gcc.dg/inline-dump.c
 ===
 --- testsuite/gcc.dg/inline-dump.c  (revision 0)
 +++ testsuite/gcc.dg/inline-dump.c  (revision 0)
 @@ -0,0 +1,11 @@
 +/* Verify that -fopt-info can output correct inline info.  */
 +/* { dg-do compile } */
 +/* { dg-options -Wall -fopt-info-inline=stderr -O2 -fno-early-inlining } 
 */
 +static inline int leaf() {
 +  int i, ret = 0;
 +  for (i = 0; i  10; i++)
 +ret += i;
 +  return ret;
 +}
 +static inline int foo(void) { return leaf(); } /* { dg-message note:
 leaf .*inlined into bar .*via inline instance foo.*\n } */

 I don't see that message, neither on ia64 nor m68k.

 Andreas.

 --
 Andreas Schwab, sch...@linux-m68k.org
 GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 And now for something completely different.



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Xinliang David Li

Except that in this form, the dump will be extremely large and not
suitable for very large applications. Besides, we might also want to
use the same machinery (dump_printf_loc etc) for dump file dumping.
The current behavior of using '-details' to turn on opt-info-all
messages for dump files are not desirable.  How about the following:

1) add a new dump_kind modifier so that when that modifier is
specified, the messages won't goto the alt_dumpfile (controlled by
-fopt-info), but only to primary dump file. With this, the inline
messages can be dumped via:

   dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)


2) add more flags in -fdump- support:

   -fdump-ipa-inline-opt   -- turn on opt-info messages only
   -fdump-ipa-inline-optall -- turn on opt-info-all messages
   -fdump-tree-pre-ir -- turn on GIMPLE dump only
   -fdump-tree-pre-details -- turn on everything (ir, optall, trace)

With this, developers can really just use


-fdump-ipa-inline-opt=stderr for inline messages.

thanks,

David

On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson tejohn...@google.com wrote:
 On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

 Ick.  This looks both redundant and cluttered.  This is supposed to be
 understandable by GCC users, not only GCC developers.

 The main part that is only useful/understandable to gcc developers is
 the node-symbol.order in square brackes, requested by Martin. One
 possibility is that I could put that part under a param, disabled by
 default. We have something similar on the google branches that emits
 LIPO module info in the message, enabled via a param.

 But we have _dump files_ for that.  That's the developer-consumed
 form of opt-info.  -fopt-info is purely user sugar and for usual translation
 units it shouldn't exceed a single terminal full of output.

 But as a developer I don't want to have to parse lots of dump files
 for a summary of the major optimizations performed (e.g. inlining,
 unrolling) for an application, unless I am diving into the reasons for
 why or why not one of those optimizations occurred in a particular
 location. I really do want a summary emitted to stderr so that it is
 easily searchable/summarizable for the app as a whole.

 For example, some of the apps I am interested in have thousands of
 input files, and trying to collect and parse dump files for each and
 every one is overwhelming (it probably would be even if my input files
 numbered in the hundreds). What has been very useful is having these
 high level summary messages of inlines and unrolls emitted to stderr
 by -fopt-info. Then it is easy to search and sort by hotness to get a
 feel for things like what inlines are missing when moving to a new
 compiler, or compiling a new version of the source, for example. Then
 you know which files to focus on and collect dump files for.

 I thought we can direct dump files to stderr now?  So, just use
 -fdump-tree-all=stderr

 and grep its contents.


 I'd argue that the other information (the profile counts, emitted only
 when using -fprofile-use, and the inline call chains) are useful if
 you want to understand whether and how critical inlines are occurring.
 I think this is the type of information that users focused on
 optimizations, as well as gcc developers, want when they use
 -fopt-info. Otherwise it is difficult to make sense of the inline
 information.

 Well, I doubt that inline information is interesting to users unless we are
 able to aggressively filter it to what users are interested in.  Which IMHO
 isn't possible - users are interested in I have not inlined this even 
 though
 inlining would severely improve performance which would indicate a bug
 in the heuristics we can reliably detect and thus it wouldn't be there.

 I have interacted with users who are aware of optimizations such as
 inlining and unrolling and want to look at that information to
 diagnose performance differences when refactoring code or using a new
 compiler version. I also think inlining (especially cross-module) is
 one example of an optimization that is still being tuned, and user
 reports of performance issues related to that have been useful.

 I really think that the two groups of people who will find -fopt-info
 useful are gcc developers and savvy performance-hungry users. For the
 former group the additional info is extremely useful. For the latter
 group some of the extra information may not be required (although a
 call count is useful for those using profile feedback), but IMO is not

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Teresa Johnson

On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com wrote:
 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

Yes. I did some measurements for both a fairly large source file that
is heavily optimized with LIPO and for a simple toy example that has
some inlining. For the large source file, the output from
-fdump-ipa-inline=stderr was almost 100x the line count of the
-fopt-info output. For the toy source file it was 43x. The size of the
-details output was 250x and 100x, respectively. Which is untenable
for a large app.

The issue I am having here is that I want a more verbose message, not
a more voluminous set of messages. Using either -fopt-info-all or
-fdump-ipa-inline to provoke the more verbose inline message will give
me a much greater volume of output.

One compromise could be to emit the more verbose inliner message under
a param (and a more concise foo inlined into bar by default with
-fopt-info). Or we could do some variant of what David talks about
below.

 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

Interestingly, this doesn't even work. When I do
-fdump-ipa-inline-details=stderr (with my patch containing the inliner
messages) I am not getting those inliner messages emitted to stderr.
Even though in dumpfile.c details is set to (TDF_DETAILS |
MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
sure why, but will need to debug this.

 How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)

(you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )

Typically OR-ing together flags like this indicates dump under any of
those conditions. But we could implement special handling for
OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
the primary dump file, and only under the other conditions specified
in the flag (here under -optimized)



 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages

According to the documentation (see the -fdump-tree- documentation on
http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
the above are already supposed to be there (-optimized, -missed, -note
and -optall). However, specifying any of these gives a warning like:
   cc1: warning: ignoring unknown option ‘optimized’ in
‘-fdump-ipa-inline’ [enabled by default]
Probably because none is listed in the dump_options[] array in dumpfile.c.

However, I don't think there is currently a way to use -fdump- options
and *only* get one of these, as much of the current dump output is
emitted whenever there is a dump_file defined. Until everything is
migrated to the new framework it may be difficult to get this to work.

-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

Yes, if we can figure out a good way to get this to work (i.e. only
emit the optimized messages and not the rest of the dump messages).
And unfortunately to get them all you need to specify
-fdump-ipa-all-optimized -fdump-tree-all-optimized
-fdump-rtl-all-optimized instead of just -fopt-info. Unless we can
add -fdump-all-all-optimized.

Teresa


 thanks,

 David

 On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson tejohn...@google.com wrote:
 On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

 Ick.  This looks both redundant and cluttered.  This is supposed to be
 understandable by GCC users, not only GCC developers.

 The main part that is only useful/understandable to gcc developers is
 the node-symbol.order in square brackes, requested by Martin. One
 possibility is that I could put that part under a param, disabled by
 default. We have something similar on the google branches that emits
 LIPO module info in the message, enabled via a param.

 But we have _dump files_ for that.  That's the developer-consumed
 form of opt-info.  -fopt-info is purely user sugar and for usual

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Xinliang David Li

On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson tejohn...@google.com wrote:
 On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com wrote:
 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

 Yes. I did some measurements for both a fairly large source file that
 is heavily optimized with LIPO and for a simple toy example that has
 some inlining. For the large source file, the output from
 -fdump-ipa-inline=stderr was almost 100x the line count of the
 -fopt-info output. For the toy source file it was 43x. The size of the
 -details output was 250x and 100x, respectively. Which is untenable
 for a large app.

 The issue I am having here is that I want a more verbose message, not
 a more voluminous set of messages. Using either -fopt-info-all or
 -fdump-ipa-inline to provoke the more verbose inline message will give
 me a much greater volume of output.

 One compromise could be to emit the more verbose inliner message under
 a param (and a more concise foo inlined into bar by default with
 -fopt-info). Or we could do some variant of what David talks about
 below.

something like --param=verbose-opt-info=1



 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

It works for vectorizer pass.


 How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)

 (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )


Yes.

 Typically OR-ing together flags like this indicates dump under any of
 those conditions. But we could implement special handling for
 OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
 the primary dump file, and only under the other conditions specified
 in the flag (here under -optimized)



 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages

 According to the documentation (see the -fdump-tree- documentation on
 http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
 the above are already supposed to be there (-optimized, -missed, -note
 and -optall). However, specifying any of these gives a warning like:
cc1: warning: ignoring unknown option ‘optimized’ in
 ‘-fdump-ipa-inline’ [enabled by default]
 Probably because none is listed in the dump_options[] array in dumpfile.c.

 However, I don't think there is currently a way to use -fdump- options
 and *only* get one of these, as much of the current dump output is
 emitted whenever there is a dump_file defined. Until everything is
 migrated to the new framework it may be difficult to get this to work.

-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

 Yes, if we can figure out a good way to get this to work (i.e. only
 emit the optimized messages and not the rest of the dump messages).
 And unfortunately to get them all you need to specify
 -fdump-ipa-all-optimized -fdump-tree-all-optimized
 -fdump-rtl-all-optimized instead of just -fopt-info. Unless we can
 add -fdump-all-all-optimized.

Having general support requires cleanup of all the old style  if
(dump_file) fprintf (dump_file, ...) instances to be:

  if (dump_enabled_p ())
dump_printf (dump_kind );


However, it might be easier to do this filtering for IR dump only (in
execute_function_dump) -- do not dump IR if any of the MSG_ is
specified unless IR flag (a new flag) is also specified.

David



 Teresa


 thanks,

 David

 On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Teresa Johnson

On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li davi...@google.com wrote:
 On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson tejohn...@google.com wrote:
 On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li davi...@google.com 
 wrote:
 Except that in this form, the dump will be extremely large and not
 suitable for very large applications.

 Yes. I did some measurements for both a fairly large source file that
 is heavily optimized with LIPO and for a simple toy example that has
 some inlining. For the large source file, the output from
 -fdump-ipa-inline=stderr was almost 100x the line count of the
 -fopt-info output. For the toy source file it was 43x. The size of the
 -details output was 250x and 100x, respectively. Which is untenable
 for a large app.

 The issue I am having here is that I want a more verbose message, not
 a more voluminous set of messages. Using either -fopt-info-all or
 -fdump-ipa-inline to provoke the more verbose inline message will give
 me a much greater volume of output.

 One compromise could be to emit the more verbose inliner message under
 a param (and a more concise foo inlined into bar by default with
 -fopt-info). Or we could do some variant of what David talks about
 below.

 something like --param=verbose-opt-info=1

Yes. Richard, would this be acceptable for now?

i.e. the inliner messages would be like:

-fopt-info:
   test.c:8:3: note: foobar inlined into foo with call count 9000
(the with call count X only when there is profile feedback)

-fopt-info --param=verbose-opt-info=1:
   test.c:8:3: note: foobar/0 (9000) inlined into foo/2 (1000)
with call count 9000 (via inline instance bar [3] (9000))
(again the call counts only emitted under profile feedback)




 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

 It works for vectorizer pass.

Ok, let me see what is going on - I just confirmed that it is not
working for the loop unroller messages either.



 How about the following:

 1) add a new dump_kind modifier so that when that modifier is
 specified, the messages won't goto the alt_dumpfile (controlled by
 -fopt-info), but only to primary dump file. With this, the inline
 messages can be dumped via:

dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .)

 (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )


 Yes.

 Typically OR-ing together flags like this indicates dump under any of
 those conditions. But we could implement special handling for
 OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
 the primary dump file, and only under the other conditions specified
 in the flag (here under -optimized)



 2) add more flags in -fdump- support:

-fdump-ipa-inline-opt   -- turn on opt-info messages only
-fdump-ipa-inline-optall -- turn on opt-info-all messages

 According to the documentation (see the -fdump-tree- documentation on
 http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
 the above are already supposed to be there (-optimized, -missed, -note
 and -optall). However, specifying any of these gives a warning like:
cc1: warning: ignoring unknown option ‘optimized’ in
 ‘-fdump-ipa-inline’ [enabled by default]
 Probably because none is listed in the dump_options[] array in dumpfile.c.

 However, I don't think there is currently a way to use -fdump- options
 and *only* get one of these, as much of the current dump output is
 emitted whenever there is a dump_file defined. Until everything is
 migrated to the new framework it may be difficult to get this to work.

-fdump-tree-pre-ir -- turn on GIMPLE dump only
-fdump-tree-pre-details -- turn on everything (ir, optall, trace)

 With this, developers can really just use


 -fdump-ipa-inline-opt=stderr for inline messages.

 Yes, if we can figure out a good way to get this to work (i.e. only
 emit the optimized messages and not the rest of the dump messages).
 And unfortunately to get them all you need to specify
 -fdump-ipa-all-optimized -fdump-tree-all-optimized
 -fdump-rtl-all-optimized instead of just -fopt-info. Unless we can
 add -fdump-all-all-optimized.

 Having general support requires cleanup of all the old style  if
 (dump_file) fprintf (dump_file, ...) instances to be:

   if (dump_enabled_p ())
 dump_printf (dump_kind );

Right. But that is going to be a big longer-term effort - grepping for
dump_file in gcc/*.c gives about 6000 instances.



 However, it might be easier to

Re: [PATCH] Convert more passes to new dump framework

2013-08-30 Thread Teresa Johnson

 Besides, we might also want to
 use the same machinery (dump_printf_loc etc) for dump file dumping.
 The current behavior of using '-details' to turn on opt-info-all
 messages for dump files are not desirable.

 Interestingly, this doesn't even work. When I do
 -fdump-ipa-inline-details=stderr (with my patch containing the inliner
 messages) I am not getting those inliner messages emitted to stderr.
 Even though in dumpfile.c details is set to (TDF_DETAILS |
 MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
 sure why, but will need to debug this.

 It works for vectorizer pass.

 Ok, let me see what is going on - I just confirmed that it is not
 working for the loop unroller messages either.


Found the issue. The stream was incorrectly being closed when it was
stderr/stdout. So only the dump output before the first dump_finish
call was being emitted to stderr. I fixed this the same way the
alt_dump_file was being handled just below - don't close if it is
stderr/stdout. Confirmed that this fixes the problem.

(So the real ratio between the volume of -fdump-...=stderr and
-fopt-info is much higher than what I reported in an earlier email)

Is the following patch ok, pending regression tests?

2013-08-30  Teresa Johnson  tejohn...@google.com

* dumpfile.c (dump_finish): Don't close stderr/stdout.

Index: dumpfile.c
===
--- dumpfile.c  (revision 202059)
+++ dumpfile.c  (working copy)
@@ -450,7 +450,8 @@ dump_finish (int phase)
   if (phase  0)
 return;
   dfi = get_dump_file_info (phase);
-  if (dfi-pstream)
+  if (dfi-pstream  strcmp(stderr, dfi-pfilename) != 0
+   strcmp(stdout, dfi-pfilename) != 0)
 fclose (dfi-pstream);

   if (dfi-alt_stream  strcmp(stderr, dfi-alt_filename) != 0


Thanks,
Teresa

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Richard Biener

On Wed, Aug 28, 2013 at 4:09 PM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Richard Biener

On Wed, Aug 28, 2013 at 5:20 PM, Xinliang David Li davi...@google.com wrote:
 On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set 
  is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson

On Wed, Aug 28, 2013 at 9:07 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set 
  is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson

On Thu, Aug 29, 2013 at 3:05 AM, Richard Biener
richard.guent...@gmail.com wrote:
 Does
 'make newlines consitent' avoid all the spurious vertical spacing I see 
 with
 -fopt-info?

 Well, it helps get us there. The problem was that before, since
 dump_loc was not consistently emitting newlines, the calls had to emit
 their own newlines manually in the string to ensure there was a
 newline at all. I was thinking that once this is fixed I could go back
 and clean up all those calls by removing the newlines in the string. I
 could split this part into a separate patch and do both at once.

 However, after thinking about this some more this morning, I am
 wondering whether it is better to remove the newline emission
 completely from dump_loc and rely on the caller to put the newline in
 the string. The reason is that there are 2 high level interfaces to
 the new dump infrastructure, dump_printf() and dump_printf_loc(). Only
 the latter invokes dump_loc and gets the newline at the start of the
 message. The typical usage seems to be to start a message via
 dump_printf_loc, and then use dump_printf to emit parts of the message
 (thus not requiring a newline), but I think it may lead to problems to
 rely on this assumption.

 So if you agree, I will simply remove the newline altogether from
 dump_loc, and ensure that all clients of dump_printf/dump_printf_loc
 include a newline char as appropriate in the string they pass.


 As a helper function, dump_loc should not blindly emit new line as it
 has no context.  I have tried to remove it, and push the newline to
 higher level helpers -- it mostly works, but the vectorizer verbose
 messages need serious clean up -- most of them assume that
 dump_printf_loc does not end with new line, so that the expression
 dump can follow in the same line (the message texts need clean up too
 -- i do not like the === === in info messages).

 I know, but we should really do that cleanup.

I can work on this and will send a separate patch.
Teresa

-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Richard Biener

Ok.

Richard.

On Thu, Aug 29, 2013 at 3:18 PM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 9:07 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I 
  lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set 
  is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson

On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
richard.guent...@gmail.com wrote:
 New patch below that removes this global variable, and also outputs
 the node-symbol.order (in square brackets after the function name so
 as to not clutter it). Inline messages with profile data look look:

 test.c:8:3: note: foobar [0] (9000) inlined into foo [2] (1000)
 with call count 9000 (via inline instance bar [3] (9000))

 Ick.  This looks both redundant and cluttered.  This is supposed to be
 understandable by GCC users, not only GCC developers.

 The main part that is only useful/understandable to gcc developers is
 the node-symbol.order in square brackes, requested by Martin. One
 possibility is that I could put that part under a param, disabled by
 default. We have something similar on the google branches that emits
 LIPO module info in the message, enabled via a param.

 But we have _dump files_ for that.  That's the developer-consumed
 form of opt-info.  -fopt-info is purely user sugar and for usual translation
 units it shouldn't exceed a single terminal full of output.

But as a developer I don't want to have to parse lots of dump files
for a summary of the major optimizations performed (e.g. inlining,
unrolling) for an application, unless I am diving into the reasons for
why or why not one of those optimizations occurred in a particular
location. I really do want a summary emitted to stderr so that it is
easily searchable/summarizable for the app as a whole.

For example, some of the apps I am interested in have thousands of
input files, and trying to collect and parse dump files for each and
every one is overwhelming (it probably would be even if my input files
numbered in the hundreds). What has been very useful is having these
high level summary messages of inlines and unrolls emitted to stderr
by -fopt-info. Then it is easy to search and sort by hotness to get a
feel for things like what inlines are missing when moving to a new
compiler, or compiling a new version of the source, for example. Then
you know which files to focus on and collect dump files for.


 I'd argue that the other information (the profile counts, emitted only
 when using -fprofile-use, and the inline call chains) are useful if
 you want to understand whether and how critical inlines are occurring.
 I think this is the type of information that users focused on
 optimizations, as well as gcc developers, want when they use
 -fopt-info. Otherwise it is difficult to make sense of the inline
 information.

 Well, I doubt that inline information is interesting to users unless we are
 able to aggressively filter it to what users are interested in.  Which IMHO
 isn't possible - users are interested in I have not inlined this even though
 inlining would severely improve performance which would indicate a bug
 in the heuristics we can reliably detect and thus it wouldn't be there.

I have interacted with users who are aware of optimizations such as
inlining and unrolling and want to look at that information to
diagnose performance differences when refactoring code or using a new
compiler version. I also think inlining (especially cross-module) is
one example of an optimization that is still being tuned, and user
reports of performance issues related to that have been useful.

I really think that the two groups of people who will find -fopt-info
useful are gcc developers and savvy performance-hungry users. For the
former group the additional info is extremely useful. For the latter
group some of the extra information may not be required (although a
call count is useful for those using profile feedback), but IMO is not
unreasonable.

Teresa


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Convert more passes to new dump framework

2013-08-29 Thread Teresa Johnson

On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 The inline stuff should be split and re-sent, it's non-obvious to me (extra
 function parameters are not documented for example).  I'd rather have
 inline_and_report_call () for example instead of an extra bool parameter.
 But let's iterate over this once it's split out.

 Ok, I will send this separately. I guess we could have a separate
 interface inline_and_report_call that is a wrapper around inline_call
 and simply invokes the dumper. Note that flatten_function will need to
 conditionally call one of the two interfaces based on the value of its
 bool early parameter though.

Here is the split out patch. I realized why I need the extra bool
parameter to indicate early inlining: the early inliner messages are
emitted only under the more verbose MSG_NOTE, so we need to know
whether we are in early or ipa inlining in dump_inline_decision. I
have documented the additional parameter.

Thanks,
Teresa

2013-08-29  Teresa Johnson  tejohn...@google.com
Dehao Chen  de...@google.com

* ipa-inline.c (recursive_inlining): New inline_call parameter.
(inline_small_functions): Ditto.
(flatten_function): Ditto.
(ipa_inline): Ditto.
(inline_always_inline_functions): Ditto.
(early_inline_small_functions): Ditto.
* ipa-inline.h: Ditto.
* ipa-inline-transform.c (cgraph_node_opt_info): New function.
(cgraph_node_call_chain): Ditto.
(dump_inline_decision): Ditto.
(inline_call): Invoke dump_inline_decision, new parameter.

Index: ipa-inline.c
===
--- ipa-inline.c(revision 202059)
+++ ipa-inline.c(working copy)
@@ -1342,7 +1342,7 @@ recursive_inlining (struct cgraph_edge *edge,
   reset_edge_growth_cache (curr);
}

-  inline_call (curr, false, new_edges, overall_size, true);
+  inline_call (curr, false, new_edges, overall_size, true, false);
   lookup_recursive_calls (node, curr-callee, heap);
   n++;
 }
@@ -1757,7 +1757,8 @@ inline_small_functions (void)
fprintf (dump_file,  Peeling recursion with depth %i\n, depth);

  gcc_checking_assert (!callee-global.inlined_to);
- inline_call (edge, true, new_indirect_edges, overall_size, true);
+ inline_call (edge, true, new_indirect_edges, overall_size, true,
+   false);
  if (flag_indirect_inlining)
add_new_edges_to_heap (edge_heap, new_indirect_edges);

@@ -1879,7 +1880,7 @@ flatten_function (struct cgraph_node *node, bool e
 xstrdup (cgraph_node_name (callee)),
 xstrdup (cgraph_node_name (e-caller)));
   orig_callee = callee;
-  inline_call (e, true, NULL, NULL, false);
+  inline_call (e, true, NULL, NULL, false, early);
   if (e-callee != orig_callee)
orig_callee-symbol.aux = (void *) node;
   flatten_function (e-callee, early);
@@ -2017,7 +2018,7 @@ ipa_inline (void)
   inline_summary (node-callers-caller)-size);
}

- inline_call (node-callers, true, NULL, NULL, true);
+ inline_call (node-callers, true, NULL, NULL, true, false);
  if (dump_file)
fprintf (dump_file,
  Inlined into %s which now has %i size\n,
@@ -2089,7 +2090,7 @@ inline_always_inline_functions (struct cgraph_node
fprintf (dump_file,   Inlining %s into %s (always_inline).\n,
 xstrdup (cgraph_node_name (e-callee)),
 xstrdup (cgraph_node_name (e-caller)));
-  inline_call (e, true, NULL, NULL, false);
+  inline_call (e, true, NULL, NULL, false, true);
   inlined = true;
 }
   if (inlined)
@@ -2141,7 +2142,7 @@ early_inline_small_functions (struct cgraph_node *
fprintf (dump_file,  Inlining %s into %s.\n,
 xstrdup (cgraph_node_name (callee)),
 xstrdup (cgraph_node_name (e-caller)));
-  inline_call (e, true, NULL, NULL, true);
+  inline_call (e, true, NULL, NULL, true, true);
   inlined = true;
 }

Index: ipa-inline.h
===
--- ipa-inline.h(revision 202059)
+++ ipa-inline.h(working copy)
@@ -229,7 +229,8 @@ void compute_inline_parameters (struct cgraph_node
 bool speculation_useful_p (struct cgraph_edge *e, bool anticipate_inlining);

 /* In ipa-inline-transform.c  */
-bool inline_call (struct cgraph_edge *, bool, veccgraph_edge_p *,
int *, bool);
+bool inline_call (struct cgraph_edge *, bool, veccgraph_edge_p *, int *,
+  bool, bool);
 unsigned int inline_transform (struct cgraph_node *);
 void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *);

Index: ipa-inline-transform.c
===

Re: [PATCH] Convert more passes to new dump framework

2013-08-28 Thread Richard Biener

On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]
 
  Index: ipa-inline.c
  ===
  --- ipa-inline.c(revision 201461)
  +++ ipa-inline.c

Re: [PATCH] Convert more passes to new dump framework

2013-08-28 Thread Teresa Johnson

On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool 
  d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]
 
  Index: ipa-inline.c

Re: [PATCH] Convert more passes to new dump framework

2013-08-28 Thread Xinliang David Li

On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]

Re: [PATCH] Convert more passes to new dump framework

2013-08-28 Thread Teresa Johnson

On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]

Re: [PATCH] Convert more passes to new dump framework

2013-08-27 Thread Teresa Johnson

Ping #3.

Thanks,
Teresa

On Mon, Aug 19, 2013 at 11:33 AM, Teresa Johnson tejohn...@google.com wrote:
 Ping.
 Thanks,
 Teresa

 On Mon, Aug 12, 2013 at 6:54 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 10:23 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go

Re: [PATCH] Convert more passes to new dump framework

2013-08-27 Thread Xinliang David Li

+ Honza

On Tue, Aug 27, 2013 at 10:56 AM, Teresa Johnson tejohn...@google.com wrote:
 Ping #3.

 Thanks,
 Teresa

 On Mon, Aug 19, 2013 at 11:33 AM, Teresa Johnson tejohn...@google.com wrote:
 Ping.
 Thanks,
 Teresa

 On Mon, Aug 12, 2013 at 6:54 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 10:23 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump 
  framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set 
  is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, 
  bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 
  0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it

Re: [PATCH] Convert more passes to new dump framework

2013-08-19 Thread Teresa Johnson

Ping.
Thanks,
Teresa

On Mon, Aug 12, 2013 at 6:54 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 10:23 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool 
  d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]
 
  Index: ipa-inline.c

Re: [PATCH] Convert more passes to new dump framework

2013-08-12 Thread Teresa Johnson

On Tue, Aug 6, 2013 at 10:23 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under 
  OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, 
  node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]
 
  Index: ipa-inline.c
  ===
  --- ipa-inline.c(revision 201461)
  +++ ipa-inline.c

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Martin Jambor

Hi,

On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
 This patch ports messages to the new dump framework,

It would be great this new framework was documented somewhere.  I lost
track of what was agreed it would be and from the uses in the
vectorizer I was never quite sure how to utilize it in other passes.

I'd also like to point out two other minor things inline:

[...]

 2013-08-06  Teresa Johnson  tejohn...@google.com
 Dehao Chen  de...@google.com
 
 * dumpfile.c (dump_loc): Add column number to output, make newlines
 consistent.
 * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
 * ipa-inline-transform.c (clone_inlined_nodes):
 (cgraph_node_opt_info): New function.
 (cgraph_node_call_chain): Ditto.
 (dump_inline_decision): Ditto.
 (inline_call): Invoke dump_inline_decision.
 * doc/invoke.texi: Document optall -fopt-info flag.
 * profile.c (read_profile_edge_counts): Use new dump framework.
 (compute_branch_probabilities): Ditto.
 * passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER
 when pass not in any opt group.
 * value-prof.c (check_counter): Use new dump framework.
 (find_func_by_funcdef_no): Ditto.
 (check_ic_target): Ditto.
 * coverage.c (get_coverage_counts): Ditto.
 (coverage_init): Setup new dump framework.
 * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
 * ipa-inline.h (is_in_ipa_inline): Declare.
 
 * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
 * testsuite/gcc.dg/pr26570.c: Ditto.
 * testsuite/gcc.dg/pr32773.c: Ditto.
 * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 

[...]

 Index: ipa-inline-transform.c
 ===
 --- ipa-inline-transform.c  (revision 201461)
 +++ ipa-inline-transform.c  (working copy)
 @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
  }
 
 
 +#define MAX_INT_LENGTH 20
 +
 +/* Return NODE's name and profile count, if available.  */
 +
 +static const char *
 +cgraph_node_opt_info (struct cgraph_node *node)
 +{
 +  char *buf;
 +  size_t buf_size;
 +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
 +
 +  if (!bfd_name)
 +bfd_name = unknown;
 +
 +  buf_size = strlen (bfd_name) + 1;
 +  if (profile_info)
 +buf_size += (MAX_INT_LENGTH + 3);
 +
 +  buf = (char *) xmalloc (buf_size);
 +
 +  strcpy (buf, bfd_name);
 +
 +  if (profile_info)
 +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
 +  return buf;
 +}

I'm not sure if output of this function is aimed only at the user or
if it is supposed to be used by gcc developers as well.  If the
latter, an incredibly useful thing is to also dump node-symbol.order
too.  We usually dump it after / sign separating it from node name.
It is invaluable when examining decisions in C++ code where you can
have lots of clones of a node (and also because existing dumps print
it, it is easy to combine them).

[...]

 Index: ipa-inline.c
 ===
 --- ipa-inline.c(revision 201461)
 +++ ipa-inline.c(working copy)
 @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
  static int overall_size;
  static gcov_type max_count;
 
 +/* Global variable to denote if it is in ipa-inline pass. */
 +bool is_in_ipa_inline = false;
 +
  /* Return false when inlining edge E would lead to violating
 limits on function unit growth or stack usage growth.
 

In this age of removing global variables, are you sure you need this?
The only user of this seems to be a function that is only being called
from inline_call... can that ever happen when not inlining?  If you
plan to use this function also elsewhere, perhaps the callers will
know whether we are inlining or not and can provide this in a
parameter?

Thanks,

Martin

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Teresa Johnson

On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
 This patch ports messages to the new dump framework,

 It would be great this new framework was documented somewhere.  I lost
 track of what was agreed it would be and from the uses in the
 vectorizer I was never quite sure how to utilize it in other passes.

Cc'ing Sharad who implemented this - Sharad, is this documented on a
wiki or elsewhere?


 I'd also like to point out two other minor things inline:

 [...]

 2013-08-06  Teresa Johnson  tejohn...@google.com
 Dehao Chen  de...@google.com

 * dumpfile.c (dump_loc): Add column number to output, make newlines
 consistent.
 * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
 * ipa-inline-transform.c (clone_inlined_nodes):
 (cgraph_node_opt_info): New function.
 (cgraph_node_call_chain): Ditto.
 (dump_inline_decision): Ditto.
 (inline_call): Invoke dump_inline_decision.
 * doc/invoke.texi: Document optall -fopt-info flag.
 * profile.c (read_profile_edge_counts): Use new dump framework.
 (compute_branch_probabilities): Ditto.
 * passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER
 when pass not in any opt group.
 * value-prof.c (check_counter): Use new dump framework.
 (find_func_by_funcdef_no): Ditto.
 (check_ic_target): Ditto.
 * coverage.c (get_coverage_counts): Ditto.
 (coverage_init): Setup new dump framework.
 * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
 * ipa-inline.h (is_in_ipa_inline): Declare.

 * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
 * testsuite/gcc.dg/pr26570.c: Ditto.
 * testsuite/gcc.dg/pr32773.c: Ditto.
 * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.


 [...]

 Index: ipa-inline-transform.c
 ===
 --- ipa-inline-transform.c  (revision 201461)
 +++ ipa-inline-transform.c  (working copy)
 @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
  }


 +#define MAX_INT_LENGTH 20
 +
 +/* Return NODE's name and profile count, if available.  */
 +
 +static const char *
 +cgraph_node_opt_info (struct cgraph_node *node)
 +{
 +  char *buf;
 +  size_t buf_size;
 +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
 +
 +  if (!bfd_name)
 +bfd_name = unknown;
 +
 +  buf_size = strlen (bfd_name) + 1;
 +  if (profile_info)
 +buf_size += (MAX_INT_LENGTH + 3);
 +
 +  buf = (char *) xmalloc (buf_size);
 +
 +  strcpy (buf, bfd_name);
 +
 +  if (profile_info)
 +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
 +  return buf;
 +}

 I'm not sure if output of this function is aimed only at the user or
 if it is supposed to be used by gcc developers as well.  If the
 latter, an incredibly useful thing is to also dump node-symbol.order
 too.  We usually dump it after / sign separating it from node name.
 It is invaluable when examining decisions in C++ code where you can
 have lots of clones of a node (and also because existing dumps print
 it, it is easy to combine them).

The output is useful for both power users doing performance tuning of
their application, and by gcc developers. Adding the id is not so
useful for the former, but I agree that it is very useful for compiler
developers. In fact, in the google branch version we emit more verbose
information (the lipo module id and the funcdef_no) to help uniquely
identify the routines and to aid in post-processing by humans and
tools. So it is probably useful to add something similar here too. Is
the node-symbol.order more or less unique than the funcdef_no? I see
that you added a patch a few months ago to print the
node-symbol.order in the function header, and it also has the
advantage as you note of matching up with existing ipa dumps.


 [...]

 Index: ipa-inline.c
 ===
 --- ipa-inline.c(revision 201461)
 +++ ipa-inline.c(working copy)
 @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
  static int overall_size;
  static gcov_type max_count;

 +/* Global variable to denote if it is in ipa-inline pass. */
 +bool is_in_ipa_inline = false;
 +
  /* Return false when inlining edge E would lead to violating
 limits on function unit growth or stack usage growth.


 In this age of removing global variables, are you sure you need this?
 The only user of this seems to be a function that is only being called
 from inline_call... can that ever happen when not inlining?  If you
 plan to use this function also elsewhere, perhaps the callers will
 know whether we are inlining or not and can provide this in a
 parameter?

This is to distinguish early inlining from ipa inlining. The

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Xinliang David Li

On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
 This patch ports messages to the new dump framework,

 It would be great this new framework was documented somewhere.  I lost
 track of what was agreed it would be and from the uses in the
 vectorizer I was never quite sure how to utilize it in other passes.

Sharad, can you put the documentation in GCC wiki.

In a nutshell, the new dumping interfaces produces information notes
which have 'dual' outputs -- controlled by different options. When
-fdump-phase-pass is on, the dump info will be dumped into the
pass specific dump file, and when -fopt-info=.. is on, the information
will be dumped into stderr.

The dump call should be guarded by dump_enabled_p().

thanks,

David


 I'd also like to point out two other minor things inline:

 [...]

 2013-08-06  Teresa Johnson  tejohn...@google.com
 Dehao Chen  de...@google.com

 * dumpfile.c (dump_loc): Add column number to output, make newlines
 consistent.
 * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
 * ipa-inline-transform.c (clone_inlined_nodes):
 (cgraph_node_opt_info): New function.
 (cgraph_node_call_chain): Ditto.
 (dump_inline_decision): Ditto.
 (inline_call): Invoke dump_inline_decision.
 * doc/invoke.texi: Document optall -fopt-info flag.
 * profile.c (read_profile_edge_counts): Use new dump framework.
 (compute_branch_probabilities): Ditto.
 * passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER
 when pass not in any opt group.
 * value-prof.c (check_counter): Use new dump framework.
 (find_func_by_funcdef_no): Ditto.
 (check_ic_target): Ditto.
 * coverage.c (get_coverage_counts): Ditto.
 (coverage_init): Setup new dump framework.
 * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
 * ipa-inline.h (is_in_ipa_inline): Declare.

 * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
 * testsuite/gcc.dg/pr26570.c: Ditto.
 * testsuite/gcc.dg/pr32773.c: Ditto.
 * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.


 [...]

 Index: ipa-inline-transform.c
 ===
 --- ipa-inline-transform.c  (revision 201461)
 +++ ipa-inline-transform.c  (working copy)
 @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
  }


 +#define MAX_INT_LENGTH 20
 +
 +/* Return NODE's name and profile count, if available.  */
 +
 +static const char *
 +cgraph_node_opt_info (struct cgraph_node *node)
 +{
 +  char *buf;
 +  size_t buf_size;
 +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
 +
 +  if (!bfd_name)
 +bfd_name = unknown;
 +
 +  buf_size = strlen (bfd_name) + 1;
 +  if (profile_info)
 +buf_size += (MAX_INT_LENGTH + 3);
 +
 +  buf = (char *) xmalloc (buf_size);
 +
 +  strcpy (buf, bfd_name);
 +
 +  if (profile_info)
 +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
 +  return buf;
 +}

 I'm not sure if output of this function is aimed only at the user or
 if it is supposed to be used by gcc developers as well.  If the
 latter, an incredibly useful thing is to also dump node-symbol.order
 too.  We usually dump it after / sign separating it from node name.
 It is invaluable when examining decisions in C++ code where you can
 have lots of clones of a node (and also because existing dumps print
 it, it is easy to combine them).

 [...]

 Index: ipa-inline.c
 ===
 --- ipa-inline.c(revision 201461)
 +++ ipa-inline.c(working copy)
 @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
  static int overall_size;
  static gcov_type max_count;

 +/* Global variable to denote if it is in ipa-inline pass. */
 +bool is_in_ipa_inline = false;
 +
  /* Return false when inlining edge E would lead to violating
 limits on function unit growth or stack usage growth.


 In this age of removing global variables, are you sure you need this?
 The only user of this seems to be a function that is only being called
 from inline_call... can that ever happen when not inlining?  If you
 plan to use this function also elsewhere, perhaps the callers will
 know whether we are inlining or not and can provide this in a
 parameter?

 Thanks,

 Martin

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Martin Jambor

Hi,

On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.
 
 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

Thanks

 
 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).
 
 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

node-symbol.order is unique and if I remember correctly, it is not
even recycled.  Clones, inline clones, thunks, every symbol table node
gets its own symbol order so it should be more unique than funcdef_no.
On the other hand it may be a bit cryptic for users but at the same
time it is only one number.

 
 
  [...]
 
  Index: ipa-inline.c
  ===
  --- ipa-inline.c(revision 201461)
  +++ ipa-inline.c(working copy)
  @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
   static int overall_size;
   static gcov_type max_count;
 
  +/* Global variable to denote if it is in ipa-inline pass. */
  +bool is_in_ipa_inline = false;
  +
   /* Return false when inlining edge E would

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Sharad Singhai

On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
 This patch ports messages to the new dump framework,

 It would be great this new framework was documented somewhere.  I lost
 track of what was agreed it would be and from the uses in the
 vectorizer I was never quite sure how to utilize it in other passes.

 Sharad, can you put the documentation in GCC wiki.

Sure. I had user documentation in form of gcc info. But I will add
more developer details to a GCC wiki.

Thanks,
Sharad

 In a nutshell, the new dumping interfaces produces information notes
 which have 'dual' outputs -- controlled by different options. When
 -fdump-phase-pass is on, the dump info will be dumped into the
 pass specific dump file, and when -fopt-info=.. is on, the information
 will be dumped into stderr.

 The dump call should be guarded by dump_enabled_p().

 thanks,

 David


 I'd also like to point out two other minor things inline:

 [...]

 2013-08-06  Teresa Johnson  tejohn...@google.com
 Dehao Chen  de...@google.com

 * dumpfile.c (dump_loc): Add column number to output, make newlines
 consistent.
 * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
 * ipa-inline-transform.c (clone_inlined_nodes):
 (cgraph_node_opt_info): New function.
 (cgraph_node_call_chain): Ditto.
 (dump_inline_decision): Ditto.
 (inline_call): Invoke dump_inline_decision.
 * doc/invoke.texi: Document optall -fopt-info flag.
 * profile.c (read_profile_edge_counts): Use new dump framework.
 (compute_branch_probabilities): Ditto.
 * passes.c (pass_manager::register_one_dump_file): Use 
 OPTGROUP_OTHER
 when pass not in any opt group.
 * value-prof.c (check_counter): Use new dump framework.
 (find_func_by_funcdef_no): Ditto.
 (check_ic_target): Ditto.
 * coverage.c (get_coverage_counts): Ditto.
 (coverage_init): Setup new dump framework.
 * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
 * ipa-inline.h (is_in_ipa_inline): Declare.

 * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
 * testsuite/gcc.dg/pr26570.c: Ditto.
 * testsuite/gcc.dg/pr32773.c: Ditto.
 * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.


 [...]

 Index: ipa-inline-transform.c
 ===
 --- ipa-inline-transform.c  (revision 201461)
 +++ ipa-inline-transform.c  (working copy)
 @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
  }


 +#define MAX_INT_LENGTH 20
 +
 +/* Return NODE's name and profile count, if available.  */
 +
 +static const char *
 +cgraph_node_opt_info (struct cgraph_node *node)
 +{
 +  char *buf;
 +  size_t buf_size;
 +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
 +
 +  if (!bfd_name)
 +bfd_name = unknown;
 +
 +  buf_size = strlen (bfd_name) + 1;
 +  if (profile_info)
 +buf_size += (MAX_INT_LENGTH + 3);
 +
 +  buf = (char *) xmalloc (buf_size);
 +
 +  strcpy (buf, bfd_name);
 +
 +  if (profile_info)
 +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
 +  return buf;
 +}

 I'm not sure if output of this function is aimed only at the user or
 if it is supposed to be used by gcc developers as well.  If the
 latter, an incredibly useful thing is to also dump node-symbol.order
 too.  We usually dump it after / sign separating it from node name.
 It is invaluable when examining decisions in C++ code where you can
 have lots of clones of a node (and also because existing dumps print
 it, it is easy to combine them).

 [...]

 Index: ipa-inline.c
 ===
 --- ipa-inline.c(revision 201461)
 +++ ipa-inline.c(working copy)
 @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
  static int overall_size;
  static gcov_type max_count;

 +/* Global variable to denote if it is in ipa-inline pass. */
 +bool is_in_ipa_inline = false;
 +
  /* Return false when inlining edge E would lead to violating
 limits on function unit growth or stack usage growth.


 In this age of removing global variables, are you sure you need this?
 The only user of this seems to be a function that is only being called
 from inline_call... can that ever happen when not inlining?  If you
 plan to use this function also elsewhere, perhaps the callers will
 know whether we are inlining or not and can provide this in a
 parameter?

 Thanks,

 Martin

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Teresa Johnson

On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

Ok, I am going to go ahead and add this to the output.



 
  [...]
 
  Index: ipa-inline.c
  ===
  --- ipa-inline.c(revision 201461)
  +++ ipa-inline.c(working copy)
  @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
   static int overall_size;
   static gcov_type max_count;
 
  +/* Global

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Martin Jambor

On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com wrote:
  On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  Hi,
 
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.
 
  Sharad, can you put the documentation in GCC wiki.
 
 Sure. I had user documentation in form of gcc info. But I will add
 more developer details to a GCC wiki.
 

I have built trunk gccint.info yesterday but could not find any string
dump_enabled_p there, for example.  And when I quickly searched just
for the string dump, I did not find any thing that looked like
dumping infrastructure either.  OTOH, I agree that fie would be the
best place for the documentation.

Or did I just miss it?  What section is it in then?

Thanks,

Martin

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Sharad Singhai

On Tue, Aug 6, 2013 at 10:10 AM, Martin Jambor mjam...@suse.cz wrote:
 On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com wrote:
  On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  Hi,
 
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.
 
  Sharad, can you put the documentation in GCC wiki.

 Sure. I had user documentation in form of gcc info. But I will add
 more developer details to a GCC wiki.


 I have built trunk gccint.info yesterday but could not find any string
 dump_enabled_p there, for example.  And when I quickly searched just
 for the string dump, I did not find any thing that looked like
 dumping infrastructure either.  OTOH, I agree that fie would be the
 best place for the documentation.

 Or did I just miss it?  What section is it in then?

Actually, the user-facing documentation is in doc/invoke.texi.
However, that doesn't describe dump_enabled_p. Do you think
gccint.info would be a good place? I can add documentation there
instead of creating a GCC wiki.

Thanks,
Sharad

 Thanks,

 Martin

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Xinliang David Li

yes -- if this is the place developers look at the most.

David

On Tue, Aug 6, 2013 at 10:18 AM, Sharad Singhai sing...@google.com wrote:
 On Tue, Aug 6, 2013 at 10:10 AM, Martin Jambor mjam...@suse.cz wrote:
 On Tue, Aug 06, 2013 at 09:22:02AM -0700, Sharad Singhai wrote:
 On Tue, Aug 6, 2013 at 8:57 AM, Xinliang David Li davi...@google.com 
 wrote:
  On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  Hi,
 
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.
 
  Sharad, can you put the documentation in GCC wiki.

 Sure. I had user documentation in form of gcc info. But I will add
 more developer details to a GCC wiki.


 I have built trunk gccint.info yesterday but could not find any string
 dump_enabled_p there, for example.  And when I quickly searched just
 for the string dump, I did not find any thing that looked like
 dumping infrastructure either.  OTOH, I agree that fie would be the
 best place for the documentation.

 Or did I just miss it?  What section is it in then?

 Actually, the user-facing documentation is in doc/invoke.texi.
 However, that doesn't describe dump_enabled_p. Do you think
 gccint.info would be a good place? I can add documentation there
 instead of creating a GCC wiki.

 Thanks,
 Sharad

 Thanks,

 Martin

Re: [PATCH] Convert more passes to new dump framework

2013-08-06 Thread Teresa Johnson

On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
 On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
  This patch ports messages to the new dump framework,
 
  It would be great this new framework was documented somewhere.  I lost
  track of what was agreed it would be and from the uses in the
  vectorizer I was never quite sure how to utilize it in other passes.

 Cc'ing Sharad who implemented this - Sharad, is this documented on a
 wiki or elsewhere?

 Thanks


 
  I'd also like to point out two other minor things inline:
 
  [...]
 
  2013-08-06  Teresa Johnson  tejohn...@google.com
  Dehao Chen  de...@google.com
 
  * dumpfile.c (dump_loc): Add column number to output, make 
  newlines
  consistent.
  * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
  * ipa-inline-transform.c (clone_inlined_nodes):
  (cgraph_node_opt_info): New function.
  (cgraph_node_call_chain): Ditto.
  (dump_inline_decision): Ditto.
  (inline_call): Invoke dump_inline_decision.
  * doc/invoke.texi: Document optall -fopt-info flag.
  * profile.c (read_profile_edge_counts): Use new dump framework.
  (compute_branch_probabilities): Ditto.
  * passes.c (pass_manager::register_one_dump_file): Use 
  OPTGROUP_OTHER
  when pass not in any opt group.
  * value-prof.c (check_counter): Use new dump framework.
  (find_func_by_funcdef_no): Ditto.
  (check_ic_target): Ditto.
  * coverage.c (get_coverage_counts): Ditto.
  (coverage_init): Setup new dump framework.
  * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
  * ipa-inline.h (is_in_ipa_inline): Declare.
 
  * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
  * testsuite/gcc.dg/pr26570.c: Ditto.
  * testsuite/gcc.dg/pr32773.c: Ditto.
  * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
 
 
  [...]
 
  Index: ipa-inline-transform.c
  ===
  --- ipa-inline-transform.c  (revision 201461)
  +++ ipa-inline-transform.c  (working copy)
  @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   }
 
 
  +#define MAX_INT_LENGTH 20
  +
  +/* Return NODE's name and profile count, if available.  */
  +
  +static const char *
  +cgraph_node_opt_info (struct cgraph_node *node)
  +{
  +  char *buf;
  +  size_t buf_size;
  +  const char *bfd_name = lang_hooks.dwarf_name (node-symbol.decl, 0);
  +
  +  if (!bfd_name)
  +bfd_name = unknown;
  +
  +  buf_size = strlen (bfd_name) + 1;
  +  if (profile_info)
  +buf_size += (MAX_INT_LENGTH + 3);
  +
  +  buf = (char *) xmalloc (buf_size);
  +
  +  strcpy (buf, bfd_name);
  +
  +  if (profile_info)
  +sprintf (buf, %s (HOST_WIDEST_INT_PRINT_DEC), buf, node-count);
  +  return buf;
  +}
 
  I'm not sure if output of this function is aimed only at the user or
  if it is supposed to be used by gcc developers as well.  If the
  latter, an incredibly useful thing is to also dump node-symbol.order
  too.  We usually dump it after / sign separating it from node name.
  It is invaluable when examining decisions in C++ code where you can
  have lots of clones of a node (and also because existing dumps print
  it, it is easy to combine them).

 The output is useful for both power users doing performance tuning of
 their application, and by gcc developers. Adding the id is not so
 useful for the former, but I agree that it is very useful for compiler
 developers. In fact, in the google branch version we emit more verbose
 information (the lipo module id and the funcdef_no) to help uniquely
 identify the routines and to aid in post-processing by humans and
 tools. So it is probably useful to add something similar here too. Is
 the node-symbol.order more or less unique than the funcdef_no? I see
 that you added a patch a few months ago to print the
 node-symbol.order in the function header, and it also has the
 advantage as you note of matching up with existing ipa dumps.

 node-symbol.order is unique and if I remember correctly, it is not
 even recycled.  Clones, inline clones, thunks, every symbol table node
 gets its own symbol order so it should be more unique than funcdef_no.
 On the other hand it may be a bit cryptic for users but at the same
 time it is only one number.

 Ok, I am going to go ahead and add this to the output.



 
  [...]
 
  Index: ipa-inline.c
  ===
  --- ipa-inline.c(revision 201461)
  +++ ipa-inline.c(working copy)
  @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see

44 matches

Mail list logo