Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-12 Thread Pekka Enberg

On 11/06/2013 05:33 PM, David Ahern wrote:
> On 11/6/13, 4:47 AM, Ingo Molnar wrote:
> > I'm not too worried about call-graph 'legacies': it generates such huge
> > perf.data files which is parsed so slowly at the moment that there's very
> > little user base ... Anyone who absolutely needs call-graph profiling uses
> > SysProf which performs well.
>
> Actually, perf with callchains is used quite heavily on my products.
> One of the selling points of perf.

I use perf with callchains all the time as well.

Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Frederic Weisbecker
On Mon, Nov 11, 2013 at 02:56:37PM +0100, Ingo Molnar wrote:
> 
> * Frederic Weisbecker  wrote:
> 
> > On Mon, Nov 11, 2013 at 01:13:52PM +0100, Ingo Molnar wrote:
> > > 
> > > It's not an irrelevant feature at all! :-)
> > > 
> > > It's just that for any sort of longer profile it was pretty 
> > > difficult/frustrating to use, which I think held back adoption.
> > > 
> > > That performance problem got fixed now by you and Namhyung, so I think 
> > > we'll see even wider adoption of call-graph profiling...
> > 
> > Ah I see now. At the time Linus reported his issue, I had the feeling 
> > his usecase was a bit "extreme", but I actually have no idea how far 
> > perf can be used given that I'm mostly used to short benchmarks, 
> > typically hackbench, perf bench sched messaging et al. Thing is I don't 
> > use it enough for my real usecases :)
> 
> Well, it's a bit of a catch-22: if there are severe scalability problems 
> for a usecase then people won't use it because they cannot use it. So 
> developers should usually try to over-measure things and go for extreme 
> uses and such.

Agreed :)


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread David Ahern

On 11/11/13, 5:19 AM, Ingo Molnar wrote:
> In what way is call-graph profiling utilized typically?
>
> Is it system-wide, i.e. something like:
>
> perf record -a -g sleep 10
>
> ? If yes then that would explain why scalability problems rarely surfaced,
> it takes a longer user-space profile to get to the event counts where
> scalability started hurting.

Both. But not time periods long enough to generate GB sized files
(limitations in the product). I get various reports of hung commands
(usually perf not terminated properly (now fixed) or the file is
corrupted on transfer), but no one has complained to me about perf-report
appearing to hang or taking 20-30 minutes to generate a result.


David


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Mon, Nov 11, 2013 at 01:13:52PM +0100, Ingo Molnar wrote:
> > 
> > It's not an irrelevant feature at all! :-)
> > 
> > It's just that for any sort of longer profile it was pretty 
> > difficult/frustrating to use, which I think held back adoption.
> > 
> > That performance problem got fixed now by you and Namhyung, so I think 
> > we'll see even wider adoption of call-graph profiling...
> 
> Ah I see now. At the time Linus reported his issue, I had the feeling 
> his usecase was a bit "extreme", but I actually have no idea how far 
> perf can be used given that I'm mostly used to short benchmarks, 
> typically hackbench, perf bench sched messaging et al. Thing is I don't 
> use it enough for my real usecases :)

Well, it's a bit of a catch-22: if there are severe scalability problems 
for a usecase then people won't use it because they cannot use it. So 
developers should usually try to over-measure things and go for extreme 
uses and such.

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Frederic Weisbecker
On Mon, Nov 11, 2013 at 01:13:52PM +0100, Ingo Molnar wrote:
> 
> It's not an irrelevant feature at all! :-)
> 
> It's just that for any sort of longer profile it was pretty 
> difficult/frustrating to use, which I think held back adoption.
> 
> That performance problem got fixed now by you and Namhyung, so I think 
> we'll see even wider adoption of call-graph profiling...

Ah I see now. At the time Linus reported his issue, I had the feeling his
usecase was a bit "extreme", but I actually have no idea how far perf can be
used given that I'm mostly used to short benchmarks, typically hackbench,
perf bench sched messaging et al. Thing is I don't use it enough for my
real usecases :)


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Frederic Weisbecker
On Mon, Nov 11, 2013 at 01:12:12PM +0100, Ingo Molnar wrote:
> > I'm not sure why you want to add a new -F that adds news way to display 
> > fields. Isn't -s enough for that?
> 
> Well, -s implies sorting.
> 
> With -F we could decouple sorting from display order, and allow output 
> like:
> 
>   # Symbol   Command   Shared Object   Overhead
> 
> Where we still sort by 'overhead', yet display things by having 'overhead' 
> last.
> 
> So basically have maximum flexibility of output and sorting - into which 
> the new 'total' field for accumulated stats would fit automatically.

Ok, I haven't followed the details on why we want this to display the cumulated
overhead.

But reordering the columns should be ok as long as we have the same fields 
present in -F and -s.


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Ingo Molnar

* David Ahern  wrote:

> On 11/6/13, 4:47 AM, Ingo Molnar wrote:
> >I'm not too worried about call-graph 'legacies': it generates such huge
> >perf.data files which is parsed so slowly at the moment that there's very
> >little user base ... Anyone who absolutely needs call-graph profiling uses
> >SysProf which performs well.
> 
> Actually, perf with callchains is used quite heavily on my products.
> One of the selling points of perf.

That's nice to hear :)

In what way is call-graph profiling utilized typically?

Is it system-wide, i.e. something like:

perf record -a -g sleep 10

? If yes then that would explain why scalability problems rarely surfaced, 
it takes a longer user-space profile to get to the event counts where 
scalability started hurting.

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Wed, Nov 06, 2013 at 12:47:01PM +0100, Ingo Molnar wrote:
>
> > I'm not too worried about call-graph 'legacies': it generates such 
> > huge perf.data files which is parsed so slowly at the moment that 
> > there's very little user base ... Anyone who absolutely needs 
> > call-graph profiling uses SysProf which performs well.
> 
> Uhm, say what? I use it, and I don't use sysprof since that thing is 
> totally not usable ;-)

You aren't a typical case at all! :-)

Just look back at the example where Linus tried to use call-graph profiling 
to profile a mild 60-second workload (a kernel build) and came away 
reporting that his perf session locked up.

I think many other people ran into that performance problem. Those who are 
using it must be using it for far shorter workloads.

Anyway, that's all fixed now, and I do think that call-graph profiling is 
one of perf's killer features - I thought that from day 1 on when I 
suggested to Frederic that it would be really important to implement it 
;-)

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Wed, Nov 06, 2013 at 12:47:01PM +0100, Ingo Molnar wrote:
> > 
> > * Namhyung Kim  wrote:
> > 
> > > On Wed, 6 Nov 2013 09:30:46 +0100, Ingo Molnar wrote:
> > > > * Namhyung Kim  wrote:
> > > >
> > > >> Hi Ingo,
> > > >> 
> > > >> On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> > > >> > * Namhyung Kim  wrote:
> > > >> >> But the 'cumulative' (btw, I feel a bit hard to type this word..) 
> > > >> >> is 
> > > >> >> different in that it *generates* entries didn't get sampled 
> > > >> >> originally. 
> > > >> >> And as it requires callchains, total field will not work if 
> > > >> >> callchains 
> > > >> >> are missing.
> > > >> >
> > > >> > Well, 'total' should disappear if it's not available.
> > > >> 
> > > >> But what if it's the only sort key user gave?
> > > >
> > > > Do you mean something like:
> > > >
> > > >   -F self,name -s total
> > > >
> > > > i.e. if a sort key not displayed?
> > > 
> > > What I worry is when no -F option was given at all.
> > 
> > In that case the default list applied, plus whatever new fields are 
> > mentioned in -s would also be added (appended or prepended).
> > 
> > The display order of columns should _probably_ be something like:
> > 
> >   key1 key2 ... non-key1 non-key2
> > 
> > there's not much point in sorting and then displaying the key not in 
> > front, right?
> > 
> > > > I think sort keys should be automatically added to the displayed 
> > > > fields list.
> > > 
> > > Agreed.
> > 
> > > > This problem should be solved if all -s fields are displayed - i.e. 
> > > > they are added to the -F list, right?
> > > 
> > > But old users might not aware of the new -F option, and use -s option 
> > > only.  If so, she will get output like the first example, right?
> > 
> > Well, there's a default -F list that applies - so this shouldn't be a 
> > problem, agreed? So output should be like the second (expected) example.
> > 
> > > > Basically there's just a single concept: the -F list. The -s option 
> > > > simply modifies and extends the -F list but internally perf report 
> > > > would not know anything about '-s', it only knows about fields to 
> > > > display and it would know which of those fields are to be sorted and 
> > > > in what order.
> > > >
> > > > Does that make sense to you? Does it cover everything needed?
> > > 
> > > I like the concept.  I'm just looking for a way to add it without 
> > > upsetting old users. :)
> > 
> > If the default -F list matches our current displayed fields list then 
> > there should not be much change in behavior (beyond the addition of total 
> > for call-graph outputs - which can be kept completely separate).
> > 
> > I'm not too worried about call-graph 'legacies': it generates such huge 
> > perf.data files which is parsed so slowly at the moment that there's very 
> > little user base ... Anyone who absolutely needs call-graph profiling uses 
> > SysProf which performs well.
> 
> I'm a bit confused by what will be changed with call-graph here. Also 
> I've seen perf callgraph reports quite often on emails not even related 
> to perf developement. It doesn't appear to me like an irrelevant 
> feature...

It's not an irrelevant feature at all! :-)

It's just that for any sort of longer profile it was pretty 
difficult/frustrating to use, which I think held back adoption.

That performance problem got fixed now by you and Namhyung, so I think 
we'll see even wider adoption of call-graph profiling...

Thanks,

Ingo
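
To make the rule discussed above concrete, here is a minimal Python sketch (not code from the patchset) of how a display list could be built from the -F fields and the -s sort keys: sort keys are always displayed, and they are placed in front of the non-key fields. The default field names and the function name are assumptions made for illustration only.

```python
# Hypothetical default display list; the real default in perf report may differ.
DEFAULT_FIELDS = ["overhead", "comm", "dso", "sym"]


def build_display_fields(sort_keys, fields=None):
    """Return the columns to display: sort keys first, then the other fields.

    sort_keys -- names given via -s (may be empty)
    fields    -- names given via -F, or None to use the default list
    """
    base = list(fields) if fields is not None else list(DEFAULT_FIELDS)
    ordered = []
    # Walk the keys first so they end up in front, skipping duplicates.
    for name in list(sort_keys) + base:
        if name not in ordered:
            ordered.append(name)
    return ordered
```

With this sketch, `-F self,name -s total` would display `total self name`, and a plain `-s sym` with no -F would move `sym` to the front of the default list. Whether keys should be prepended or appended is still open in the thread; prepending is just the choice made here.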


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-11 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Wed, Nov 06, 2013 at 09:30:46AM +0100, Ingo Molnar wrote:
> > 
> > * Namhyung Kim  wrote:
> > 
> > > Hi Ingo,
> > > 
> > > On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> > > > * Namhyung Kim  wrote:
> > > >> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
> > > >> different in that it *generates* entries didn't get sampled 
> > > >> originally. 
> > > >> And as it requires callchains, total field will not work if callchains 
> > > >> are missing.
> > > >
> > > > Well, 'total' should disappear if it's not available.
> > > 
> > > But what if it's the only sort key user gave?
> > 
> > Do you mean something like:
> > 
> >   -F self,name -s total
> > 
> > i.e. if a sort key not displayed?
> > 
> > I think sort keys should be automatically added to the displayed fields 
> > list.
> > 
> > This rule is obviously met with the -F total:2,self:1,name:0 kind of 
> > sorting syntax (you can only sort by fields that get displayed) - if 
> > mixed with -s then it should be implicit I think.
> 
> I'm not sure why you want to add a new -F that adds news way to display 
> fields. Isn't -s enough for that?

Well, -s implies sorting.

With -F we could decouple sorting from display order, and allow output 
like:

  # Symbol   Command   Shared Object   Overhead

Where we still sort by 'overhead', yet display things by having 'overhead' 
last.

So basically have maximum flexibility of output and sorting - into which 
the new 'total' field for accumulated stats would fit automatically.

Thanks,

Ingo
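
As a rough illustration of decoupling sorting from display order (again a sketch, not the perf implementation), the following Python snippet sorts rows by one field while printing the columns in an independent order. The row layout, field names, and function name are hypothetical.

```python
def format_report(rows, display_fields, sort_key):
    """Sort rows by sort_key (descending) and print columns in display order."""
    # Sorting looks only at sort_key; it has no influence on column order.
    ordered = sorted(rows, key=lambda row: row[sort_key], reverse=True)
    header = "# " + "  ".join(display_fields)
    body = ["  ".join(str(row[f]) for f in display_fields) for row in ordered]
    return [header] + body


# Made-up sample data, only to exercise the function.
rows = [
    {"symbol": "do_page_fault", "command": "make", "dso": "[kernel]", "overhead": 4.2},
    {"symbol": "main", "command": "cc1", "dso": "cc1", "overhead": 12.5},
]
lines = format_report(rows, ["symbol", "command", "dso", "overhead"], "overhead")
```

Here the rows are still ordered by `overhead`, yet `overhead` is shown as the last column, which matches the output layout described in the message.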


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread Peter Zijlstra
On Wed, Nov 06, 2013 at 12:47:01PM +0100, Ingo Molnar wrote:
> I'm not too worried about call-graph 'legacies': it generates such huge 
> perf.data files which is parsed so slowly at the moment that there's very 
> little user base ... Anyone who absolutely needs call-graph profiling uses 
> SysProf which performs well.

Uhm, say what? I use it, and I don't use sysprof since that thing is
totally not usable ;-)




Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread David Ahern

On 11/6/13, 4:47 AM, Ingo Molnar wrote:
> I'm not too worried about call-graph 'legacies': it generates such huge
> perf.data files which is parsed so slowly at the moment that there's very
> little user base ... Anyone who absolutely needs call-graph profiling uses
> SysProf which performs well.

Actually, perf with callchains is used quite heavily on my products. One 
of the selling points of perf.


David



Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread Frederic Weisbecker
On Wed, Nov 06, 2013 at 12:47:01PM +0100, Ingo Molnar wrote:
> 
> * Namhyung Kim  wrote:
> 
> > On Wed, 6 Nov 2013 09:30:46 +0100, Ingo Molnar wrote:
> > > * Namhyung Kim  wrote:
> > >
> > >> Hi Ingo,
> > >> 
> > >> On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> > >> > * Namhyung Kim  wrote:
> > > >> >> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
> > > >> >> different in that it *generates* entries didn't get sampled originally. 
> > > >> >> And as it requires callchains, total field will not work if callchains 
> > > >> >> are missing.
> > >> >
> > >> > Well, 'total' should disappear if it's not available.
> > >> 
> > >> But what if it's the only sort key user gave?
> > >
> > > Do you mean something like:
> > >
> > >   -F self,name -s total
> > >
> > > i.e. if a sort key not displayed?
> > 
> > What I worry is when no -F option was given at all.
> 
> In that case the default list applied, plus whatever new fields are 
> mentioned in -s would also be added (appended or prepended).
> 
> The display order of columns should _probably_ be something like:
> 
>   key1 key2 ... non-key1 non-key2
> 
> there's not much point in sorting and then displaying the key not in 
> front, right?
> 
> > > I think sort keys should be automatically added to the displayed 
> > > fields list.
> > 
> > Agreed.
> 
> > > This problem should be solved if all -s fields are displayed - i.e. 
> > > they are added to the -F list, right?
> > 
> > But old users might not aware of the new -F option, and use -s option 
> > only.  If so, she will get output like the first example, right?
> 
> Well, there's a default -F list that applies - so this shouldn't be a 
> problem, agreed? So output should be like the second (expected) example.
> 
> > > Basically there's just a single concept: the -F list. The -s option 
> > > simply modifies and extends the -F list but internally perf report 
> > > would not know anything about '-s', it only knows about fields to 
> > > display and it would know which of those fields are to be sorted and 
> > > in what order.
> > >
> > > Does that make sense to you? Does it cover everything needed?
> > 
> > I like the concept.  I'm just looking for a way to add it without 
> > upsetting old users. :)
> 
> If the default -F list matches our current displayed fields list then 
> there should not be much change in behavior (beyond the addition of total 
> for call-graph outputs - which can be kept completely separate).
> 
> I'm not too worried about call-graph 'legacies': it generates such huge 
> perf.data files which is parsed so slowly at the moment that there's very 
> little user base ... Anyone who absolutely needs call-graph profiling uses 
> SysProf which performs well.

I'm a bit confused by what will be changed with call-graph here. Also, I've
seen perf callgraph reports quite often in emails not even related to perf
development. It doesn't appear to me like an irrelevant feature...


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread Frederic Weisbecker
On Wed, Nov 06, 2013 at 09:30:46AM +0100, Ingo Molnar wrote:
> 
> * Namhyung Kim  wrote:
> 
> > Hi Ingo,
> > 
> > On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> > > * Namhyung Kim  wrote:
> > >> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
> > >> different in that it *generates* entries didn't get sampled originally. 
> > >> And as it requires callchains, total field will not work if callchains 
> > >> are missing.
> > >
> > > Well, 'total' should disappear if it's not available.
> > 
> > But what if it's the only sort key user gave?
> 
> Do you mean something like:
> 
>   -F self,name -s total
> 
> i.e. if a sort key not displayed?
> 
> I think sort keys should be automatically added to the displayed fields 
> list.
> 
> This rule is obviously met with the -F total:2,self:1,name:0 kind of 
> sorting syntax (you can only sort by fields that get displayed) - if mixed 
> with -s then it should be implicit I think.

I'm not sure why you want to add a new -F that adds a new way to display fields.
Isn't -s enough for that?


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread Ingo Molnar

* Namhyung Kim  wrote:

> On Wed, 6 Nov 2013 09:30:46 +0100, Ingo Molnar wrote:
> > * Namhyung Kim  wrote:
> >
> >> Hi Ingo,
> >> 
> >> On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> >> > * Namhyung Kim  wrote:
> >> >> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
> >> >> different in that it *generates* entries didn't get sampled originally. 
> >> >> And as it requires callchains, total field will not work if callchains 
> >> >> are missing.
> >> >
> >> > Well, 'total' should disappear if it's not available.
> >> 
> >> But what if it's the only sort key user gave?
> >
> > Do you mean something like:
> >
> >   -F self,name -s total
> >
> > i.e. if a sort key not displayed?
> 
> What I worry is when no -F option was given at all.

In that case the default list applies, and whatever new fields are 
mentioned in -s would also be added (appended or prepended).

The display order of columns should _probably_ be something like:

  key1 key2 ... non-key1 non-key2

there's not much point in sorting and then displaying the key not in 
front, right?
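
A minimal sketch of the keys-first column order suggested above, not perf's actual code. The function name and the default field list are hypothetical and only serve the illustration; sort keys are placed in front (the "prepended" variant).

```python
def display_columns(default_fields, sort_keys):
    """Return the column order: sort keys first, then the remaining fields."""
    # dict.fromkeys() drops duplicate keys while keeping their order.
    columns = list(dict.fromkeys(sort_keys))
    # Append every default field that is not already shown as a sort key.
    columns += [f for f in default_fields if f not in columns]
    return columns


# Hypothetical default list, used only for this example.
DEFAULT_FIELDS = ["self", "comm", "dso", "symbol"]

print(display_columns(DEFAULT_FIELDS, ["comm", "dso"]))
# ['comm', 'dso', 'self', 'symbol']
```

With no sort keys at all, the function simply returns the default list, which is the "no change for old users" case.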

> > I think sort keys should be automatically added to the displayed 
> > fields list.
> 
> Agreed.

> > This problem should be solved if all -s fields are displayed - i.e. 
> > they are added to the -F list, right?
> 
> But old users might not aware of the new -F option, and use -s option 
> only.  If so, she will get output like the first example, right?

Well, there's a default -F list that applies - so this shouldn't be a 
problem, agreed? So output should be like the second (expected) example.

> > Basically there's just a single concept: the -F list. The -s option 
> > simply modifies and extends the -F list but internally perf report 
> > would not know anything about '-s', it only knows about fields to 
> > display and it would know which of those fields are to be sorted and 
> > in what order.
> >
> > Does that make sense to you? Does it cover everything needed?
> 
> I like the concept.  I'm just looking for a way to add it without 
> upsetting old users. :)

If the default -F list matches our current displayed fields list then 
there should not be much change in behavior (beyond the addition of total 
for call-graph outputs - which can be kept completely separate).

I'm not too worried about call-graph 'legacies': it generates such huge 
perf.data files which is parsed so slowly at the moment that there's very 
little user base ... Anyone who absolutely needs call-graph profiling uses 
SysProf which performs well.

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread Namhyung Kim
On Wed, 6 Nov 2013 09:30:46 +0100, Ingo Molnar wrote:
> * Namhyung Kim  wrote:
>
>> Hi Ingo,
>> 
>> On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
>> > * Namhyung Kim  wrote:
>> >> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
>> >> different in that it *generates* entries didn't get sampled originally. 
>> >> And as it requires callchains, total field will not work if callchains 
>> >> are missing.
>> >
>> > Well, 'total' should disappear if it's not available.
>> 
>> But what if it's the only sort key user gave?
>
> Do you mean something like:
>
>   -F self,name -s total
>
> i.e. if a sort key not displayed?

What I worry about is the case where no -F option is given at all.

>
> I think sort keys should be automatically added to the displayed fields 
> list.

Agreed.

>
> This rule is obviously met with the -F total:2,self:1,name:0 kind of 
> sorting syntax (you can only sort by fields that get displayed) - if mixed 
> with -s then it should be implicit I think.
>
>> >> But for compatibility we need to use 'self' sort key internally iff 
>> >> neither the -F option nor the config option was given by user.  And 
>> >> it might warn (or notice) users to add 'self' column in the sort key 
>> >> for future use.
>> >
>> > Mind explaining what the problem here is? I don't think I get it.
>> 
>> Well, normal users still use it as they used to - like 
>> 'perf report -s comm,dso' without -F option and the config.
>> 
>> In that case, what would the output look like?  According to the above
>> proposal it'd look like below.
>> 
>>   # Command  Shared object
>>   # .......  .............
>>         aaa  aaa
>>         aaa  libc.so
>>         bbb  bbb
>>         bbb  libc.so
>> 
>> 
>> But the user might want see this:
>> 
>>   # Overhead (self)  Command  Shared object
>>   # ...............  .......  .............
>>              30.00%      bbb  bbb
>>              25.00%      aaa  aaa
>>              25.00%      aaa  libc.so
>>              20.00%      bbb  libc.so
>> 
>> 
>> If she really wants to see it sorted by comm and dso, the command line
>> should be 'perf report -F self,comm,dso -s comm,dso'
>> (or just 'perf report -F self -s comm,dso' could do the same).
>> 
>>   # Overhead (self)  Command  Shared object
>>   # ...............  .......  .............
>>              25.00%      aaa  aaa
>>              25.00%      aaa  libc.so
>>              30.00%      bbb  bbb
>>              20.00%      bbb  libc.so
>
> This problem should be solved if all -s fields are displayed - i.e. they 
> are added to the -F list, right?

But old users might not be aware of the new -F option, and use the -s option
only.  If so, they will get output like the first example, right?

>
> Basically there's just a single concept: the -F list. The -s option simply 
> modifies and extends the -F list but internally perf report would not know 
> anything about '-s', it only knows about fields to display and it would 
> know which of those fields are to be sorted and in what order.
>
> Does that make sense to you? Does it cover everything needed?

I like the concept.  I'm just looking for a way to add it without
upsetting old users. :)

Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-06 Thread Ingo Molnar

* Namhyung Kim  wrote:

> Hi Ingo,
> 
> On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> > * Namhyung Kim  wrote:
> >> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
> >> different in that it *generates* entries didn't get sampled originally. 
> >> And as it requires callchains, total field will not work if callchains 
> >> are missing.
> >
> > Well, 'total' should disappear if it's not available.
> 
> But what if it's the only sort key user gave?

Do you mean something like:

  -F self,name -s total

i.e. if a sort key not displayed?

I think sort keys should be automatically added to the displayed fields 
list.

This rule is obviously met with the -F total:2,self:1,name:0 kind of 
sorting syntax (you can only sort by fields that get displayed) - if mixed 
with -s then it should be implicit I think.
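
A minimal sketch of how such a `field:priority` list could be parsed. The function name is hypothetical, and the sketch assumes that a lower number means a more significant sort key, as in the `process:0` example earlier in this thread.

```python
def parse_fields(spec):
    """Split an -F style spec into (display order, sort key order)."""
    fields = []   # columns in display order
    keyed = []    # (priority, name) pairs for fields that carry ':N'
    for item in spec.split(","):
        name, sep, prio = item.partition(":")
        fields.append(name)
        if sep:   # a ':' was present, so this field is also a sort key
            keyed.append((int(prio), name))
    # Assumption: priority 0 is the primary key, 1 the secondary, and so on.
    sort_keys = [name for _, name in sorted(keyed)]
    return fields, sort_keys


print(parse_fields("total:2,self:1,name:0"))
# (['total', 'self', 'name'], ['name', 'self', 'total'])
```

Since the priorities are attached to displayed fields, every sort key is displayed by construction, which is the rule stated above.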

> >> But for compatibility we need to use 'self' sort key internally iff 
> >> neither the -F option nor the config option was given by user.  And 
> >> it might warn (or notice) users to add 'self' column in the sort key 
> >> for future use.
> >
> > Mind explaining what the problem here is? I don't think I get it.
> 
> Well, normal users still use it as they used to - like 
> 'perf report -s comm,dso' without -F option and the config.
> 
> In that case, what would the output look like?  According to the above
> proposal it'd look like below.
> 
>   # Command  Shared object
>   # .......  .............
>         aaa  aaa
>         aaa  libc.so
>         bbb  bbb
>         bbb  libc.so
> 
> 
> But the user might want see this:
> 
>   # Overhead (self)  Command  Shared object
>   # ...............  .......  .............
>              30.00%      bbb  bbb
>              25.00%      aaa  aaa
>              25.00%      aaa  libc.so
>              20.00%      bbb  libc.so
> 
> 
> If she really wants to see it sorted by comm and dso, the command line
> should be 'perf report -F self,comm,dso -s comm,dso'
> (or just 'perf report -F self -s comm,dso' could do the same).
> 
>   # Overhead (self)  Command  Shared object
>   # ...............  .......  .............
>              25.00%      aaa  aaa
>              25.00%      aaa  libc.so
>              30.00%      bbb  bbb
>              20.00%      bbb  libc.so

This problem should be solved if all -s fields are displayed - i.e. they 
are added to the -F list, right?

Basically there's just a single concept: the -F list. The -s option simply 
modifies and extends the -F list but internally perf report would not know 
anything about '-s', it only knows about fields to display and it would 
know which of those fields are to be sorted and in what order.

Does that make sense to you? Does it cover everything needed?
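
A minimal sketch of the single-list idea, assuming -s keys are appended to the displayed fields when they are missing. The `Field` type and `build_field_list` are hypothetical names for illustration, not perf internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    name: str
    sort_rank: Optional[int] = None  # None: displayed but not sorted on


def build_field_list(fields, sort_keys):
    """Fold the -s keys into the -F list so only one list remains."""
    # Displayed fields first, then any sort key that was not listed in -F.
    names = list(dict.fromkeys(list(fields) + list(sort_keys)))
    # The position in -s gives the sort rank (0 is the primary key).
    rank = {key: i for i, key in enumerate(sort_keys)}
    return [Field(name, rank.get(name)) for name in names]


# '-F self,name -s total' becomes one list that also displays 'total'.
for field in build_field_list(["self", "name"], ["total"]):
    print(field)
```

After this step the rest of the code only needs the list itself: the names give the columns and the ranks give the sort order.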

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-05 Thread Namhyung Kim
Hi Ingo,

On Tue, 5 Nov 2013 12:58:02 +0100, Ingo Molnar wrote:
> * Namhyung Kim  wrote:
>> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
>> different in that it *generates* entries didn't get sampled originally. 
>> And as it requires callchains, total field will not work if callchains 
>> are missing.
>
> Well, 'total' should disappear if it's not available.

But what if it's the only sort key user gave?

>
> We already have some 'column elimination/optimization' logic - like the 
> 'dso' will disappear already if it's a single dso everywhere, IIRC?

When user explicitly gives a single name as the column filter with -c,
-d and/or -S options.

But it seems to have the same issue that I described above:

  $ perf report -s comm -c perf --stdio
  (...)
  # Overhead
  # ........
  #
     100.00%


And the TUI even shows noise in the output.

>
>> But as Frederic noted, it might affect the performance of perf report, 
>> so it might be better to delay this behavior to make default after users 
>> feel comfortable with an option?
>
> I think with call-chain speedups it should be fast enough, right?

Yeah, it should speed things up significantly.

>
> We can argue about the default separately - if it's all done correctly 
> then it should be really easy to change the default layout of 'perf 
> report'.
>

I just think that the perf tools are going so fast. ;-)


>> For now, there're two kind of columns:
>> 
>> - one for showing entry's overhead percentage: self, sys, user,
>>   guest_sys and guest_user.  So the 'total' should go into this
>>   category.  I named it hpp (hist_entry period percentage) functions and
>>   yes, I know it's an awfully bad name. :)  Please see perf_hpp__format.
>> 
>>   There're controlled by a couple of options:  --show-total-period,
>>   --show-nr-samples and --showcpuutilization (I hate this!).  And event
>>   group also can affect its output.
>> 
>> - one for grouping entries: cpu, pid, comm, dso, symbol, srcline and
>>   parent.  We call it "sort keys" but confusingly it doesn't affect 
>>   output sorting for now.
>
> Well, it's still a sort key in a sense, a string lexicographical ordering 
> in essence, right?

Right.  But it only affects the grouping of entries when they are added and
collapsed, not the output ordering.
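
A minimal sketch of that distinction, not perf's actual code: the keys decide which samples are merged into one entry, while the output is still ordered by the accumulated period. The sample layout and the function name are hypothetical.

```python
from collections import defaultdict


def group_samples(samples, sort_keys):
    """Merge samples that share the same key values, then order by period."""
    totals = defaultdict(int)
    for sample in samples:
        # The "sort keys" only select which samples collapse into one entry.
        key = tuple(sample[k] for k in sort_keys)
        totals[key] += sample["period"]
    # The output order comes from the accumulated period, not from the keys.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical samples for illustration.
samples = [
    {"comm": "aaa", "dso": "aaa", "period": 25},
    {"comm": "aaa", "dso": "libc.so", "period": 25},
    {"comm": "bbb", "dso": "bbb", "period": 30},
    {"comm": "bbb", "dso": "libc.so", "period": 20},
]

print(group_samples(samples, ["dso"]))
# [(('libc.so',), 45), (('bbb',), 30), (('aaa',), 25)]
```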

>
>> > If there's demand then we could decouple sort keys from the display 
>> > order, by slightly augmenting the field format:
>> >
>> >  -F total,self:2,process:0,dso:1,name
>> >
>> > This would sort by 'process' field as the primary key, 'dso' the secondary 
>> > key and 'self' as the tertiary key.
>> >
>> > And we could also keep the -s/--sort option:
>> >
>> >  -s process,dso,self
>> >
>> > So the above -F line would be equivalent to:
>> >
>> >  -F total,self,process,dso,name -s process,dso,self
>> >
>> > What do you think?
>> 
>> I like the second one.  It can sustain the old way but can support the 
>> new way easily.
>>
>> But for compatibility we need to use 'self' sort key internally iff 
>> neither the -F option nor the config option was given by user.  And it 
>> might warn (or notice) users to add 'self' column in the sort key for 
>> future use.
>
> Mind explaining what the problem here is? I don't think I get it.

Well, normal users still use it as they used to - like 
'perf report -s comm,dso' without -F option and the config.

In that case, what would the output look like?  According to the above
proposal it'd look like below.

  # Command  Shared object
  # .......  .............
        aaa  aaa
        aaa  libc.so
        bbb  bbb
        bbb  libc.so


But the user might want to see this:

  # Overhead (self)  Command  Shared object
  # ...............  .......  .............
             30.00%      bbb  bbb
             25.00%      aaa  aaa
             25.00%      aaa  libc.so
             20.00%      bbb  libc.so


If she really wants to see it sorted by comm and dso, the command line
should be 'perf report -F self,comm,dso -s comm,dso'
(or just 'perf report -F self -s comm,dso' could do the same).

  # Overhead (self)  Command  Shared object
  # ...............  .......  .............
             25.00%      aaa  aaa
             25.00%      aaa  libc.so
             30.00%      bbb  bbb
             20.00%      bbb  libc.so
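
To make the difference between grouping and output ordering concrete,
here is a minimal Python sketch.  It is not perf code: the sample data,
group() and report() are hypothetical and only mirror the percentages in
the tables above.  Entries are always grouped by (comm, dso); the only
thing that changes is whether the rows are ordered by 'self' or by the
grouping keys.

```python
from collections import defaultdict

# Hypothetical samples: (comm, dso, period), chosen to match the tables above.
samples = [
    ("aaa", "aaa", 25),
    ("aaa", "libc.so", 25),
    ("bbb", "bbb", 30),
    ("bbb", "libc.so", 20),
]


def group(samples):
    """Collapse samples sharing the same (comm, dso) into one entry."""
    entries = defaultdict(int)
    for comm, dso, period in samples:
        entries[(comm, dso)] += period
    return dict(entries)


def report(entries, sort_by_self):
    """Return rows of (self percentage, comm, dso) in output order."""
    total = sum(entries.values())
    rows = [(100.0 * p / total, comm, dso) for (comm, dso), p in entries.items()]
    if sort_by_self:
        # Compatibility case: overhead descending, ties broken by comm, dso.
        rows.sort(key=lambda r: (-r[0], r[1], r[2]))
    else:
        # Explicit '-s comm,dso' case: order by the grouping keys only.
        rows.sort(key=lambda r: (r[1], r[2]))
    return rows
```

With sort_by_self=True the rows come out in the order of the second
table, and with sort_by_self=False in the order of the third one.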


Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-05 Thread Ingo Molnar

* Namhyung Kim  wrote:

> On Tue, 5 Nov 2013 08:46:50 +0100, Ingo Molnar wrote:
> > * Namhyung Kim  wrote:
> >> I think it'd better to separate the option and pass column and
> >> (optional) sort key argument.
> >> 
> >>   --cumulative both,total (default)
> >>   --cumulative both,self
> >>   --cumulative total
> >>   --cumulative self (meaningless?)
> >> 
> >> Maybe we need a config option and a single letter option for the default
> >> case like --call-graph and -g options do.
> >> 
> >> What do you think?
> >
> > So why restrict it to 'cumulative'? Why not have a generic --fields/-F, 
> > with a good default. The ordering of the fields determines sorting 
> > behavior.
> 
> Ah, I didn't know you meant that too. :)
> 
> But the 'cumulative' (btw, I feel a bit hard to type this word..) is 
> different in that it *generates* entries didn't get sampled originally. 
> And as it requires callchains, total field will not work if callchains 
> are missing.

Well, 'total' should disappear if it's not available.

We already have some 'column elimination/optimization' logic - like the 
'dso' will disappear already if it's a single dso everywhere, IIRC?
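
A rough Python sketch of the kind of column elimination meant here.
This is an illustration under assumptions, not how perf decides it:
visible_columns() and the row layout are made up for the example.

```python
def visible_columns(requested, rows, have_callchains):
    """Return the requested columns that are worth displaying.

    requested       -- column names in display order, e.g. from -F
    rows            -- list of dicts, each with at least a 'dso' key
    have_callchains -- whether the data file contains callchains
    """
    shown = []
    for col in requested:
        if col == "total" and not have_callchains:
            # Cumulative periods cannot be computed without callchains.
            continue
        if col == "dso" and len({row["dso"] for row in rows}) <= 1:
            # Every entry has the same dso, so the column carries no information.
            continue
        shown.append(col)
    return shown
```

For a profile without callchains where every entry lives in one dso,
'-F total,self,process,dso,name' would shrink to self, process and name.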

> So I tried to make it a standalone option.
> 
> >
> > The default would be something like:
> >
> >   -F total,self,process,dso,name
> >
> > Whether 'cumulative' data is calculated is not a function of any direct 
> > option, but simply a function of whether the 'total' field is in the -F 
> > list of columns displayed or not.
> 
> So you want to turn the cumulative behavior always on, right?

Yes.

> But as Frederic noted, it might affect the performance of perf report, 
> so it might be better to delay this behavior to make default after users 
> feel comfortable with an option?

I think with call-chain speedups it should be fast enough, right?

We can argue about the default separately - if it's all done correctly 
then it should be really easy to change the default layout of 'perf 
report'.

> > With that scheme we could also do things like this to get old-style 
> > sorting:
> >
> >  -F self,process,dso,name
> >
> > Or a really frugal 'readprofile'-style output:
> >
> >  -F self,name
> >
> > if one is only interested in percentages and raw function names.
> 
> s/name/sym(bol)/ :)

Yeah.

> Yes, this is what we do with -s option now.
> 
> > Wrt. sorting order, by default the first column in the list of columns 
> > would be the primary (and only) sort key.
> 
> Ah, I never thought it like this way.  It makes sense as sort orders 
> really affect the output sorting.
> 
> > (The -F field setup list could also be specified in the .perfconfig.)
> >
> > With this method we could do away with all this geometrical explosion 
> > of somewhat inconsistent formatting and sorting options...
> 
> For now, there're two kind of columns:
> 
> - one for showing entry's overhead percentage: self, sys, user,
>   guest_sys and guest_user.  So the 'total' should go into this
>   category.  I named it hpp (hist_entry period percentage) functions and
>   yes, I know it's an awfully bad name. :)  Please see perf_hpp__format.
> 
>   There're controlled by a couple of options:  --show-total-period,
>   --show-nr-samples and --showcpuutilization (I hate this!).  And event
>   group also can affect its output.
> 
> - one for grouping entries: cpu, pid, comm, dso, symbol, srcline and
>   parent.  We call it "sort keys" but confusingly it doesn't affect 
>   output sorting for now.

Well, it's still a sort key in a sense, a string lexicographical ordering 
in essence, right?

> So I think cleaning this up with -F option is good and I've been wanting 
> this discussion for a long time. :)

Okay :-)

> > If there's demand then we could decouple sort keys from the display 
> > order, by slightly augmenting the field format:
> >
> >  -F total,self:2,process:0,dso:1,name
> >
> > This would sort by 'process' field as the primary key, 'dso' the secondary 
> > key and 'self' as the tertiary key.
> >
> > And we could also keep the -s/--sort option:
> >
> >  -s process,dso,self
> >
> > So the above -F line would be equivalent to:
> >
> >  -F total,self,process,dso,name -s process,dso,self
> >
> > What do you think?
> 
> I like the second one.  It can sustain the old way but can support the 
> new way easily.
>
> But for compatibility we need to use 'self' sort key internally iff 
> neither the -F option nor the config option was given by user.  And it 
> might warn (or notice) users to add 'self' column in the sort key for 
> future use.

Mind explaining what the problem here is? I don't think I get it.

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-05 Thread Namhyung Kim
On Tue, 5 Nov 2013 08:46:50 +0100, Ingo Molnar wrote:
> * Namhyung Kim  wrote:
>> I think it'd better to separate the option and pass column and
>> (optional) sort key argument.
>> 
>>   --cumulative both,total (default)
>>   --cumulative both,self
>>   --cumulative total
>>   --cumulative self (meaningless?)
>> 
>> Maybe we need a config option and a single letter option for the default
>> case like --call-graph and -g options do.
>> 
>> What do you think?
>
> So why restrict it to 'cumulative'? Why not have a generic --fields/-F, 
> with a good default. The ordering of the fields determines sorting 
> behavior.

Ah, I didn't know you meant that too. :)

But the 'cumulative' (btw, I find this word a bit hard to type..) is
different in that it *generates* entries that didn't get sampled
originally.  And as it requires callchains, the total field will not work
if callchains are missing.
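
A minimal Python sketch of what "generates entries" means here.  It is
not the patchset's code: accumulate() is a hypothetical helper, and
counting a symbol only once per sample for recursive chains is an
assumption of this sketch.

```python
from collections import defaultdict


def accumulate(samples):
    """Compute self and total (cumulative) periods per symbol.

    samples -- list of (callchain, period); each callchain is a list of
               symbol names ordered from the sampled function to the
               outermost caller.
    """
    self_period = defaultdict(int)
    total_period = defaultdict(int)
    for chain, period in samples:
        if not chain:
            # No callchain: nothing can be attributed to callers.
            continue
        # Only the sampled function gets the 'self' period.
        self_period[chain[0]] += period
        # Every function on the chain gets the period added to 'total'.
        # set() avoids double counting recursive calls (an assumption).
        for sym in set(chain):
            total_period[sym] += period
    return dict(self_period), dict(total_period)
```

Callers such as main never show up in the self map, but they do get a
total entry, which is why the cumulative view contains rows that were
never sampled directly.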

So I tried to make it a standalone option.

>
> The default would be something like:
>
>   -F total,self,process,dso,name
>
> Whether 'cumulative' data is calculated is not a function of any direct 
> option, but simply a function of whether the 'total' field is in the -F 
> list of columns displayed or not.

So you want to turn the cumulative behavior always on, right?

But as Frederic noted, it might affect the performance of perf report,
so it might be better to delay making this behavior the default until
users feel comfortable with it as an option?

>
> With that scheme we could also do things like this to get old-style 
> sorting:
>
>  -F self,process,dso,name
>
> Or a really frugal 'readprofile'-style output:
>
>  -F self,name
>
> if one is only interested in percentages and raw function names.

s/name/sym(bol)/ :)

Yes, this is what we do with -s option now.

>
> Wrt. sorting order, by default the first column in the list of columns 
> would be the primary (and only) sort key.

Ah, I never thought of it this way.  It makes sense, as the sort order
would then really affect the output sorting.

>
> (The -F field setup list could also be specified in the .perfconfig.)
>
> With this method we could do away with all this geometrical explosion of 
> somewhat inconsistent formatting and sorting options...

For now, there are two kinds of columns:

- one for showing an entry's overhead percentage: self, sys, user,
  guest_sys and guest_user.  So the 'total' should go into this
  category.  I named them hpp (hist_entry period percentage) functions and
  yes, I know it's an awfully bad name. :)  Please see perf_hpp__format.

  They're controlled by a couple of options:  --show-total-period,
  --show-nr-samples and --showcpuutilization (I hate this!).  And event
  groups can also affect their output.

- one for grouping entries: cpu, pid, comm, dso, symbol, srcline and
  parent.  We call them "sort keys" but confusingly they don't affect
  output sorting for now.


So I think cleaning this up with -F option is good and I've been wanting
this discussion for a long time. :)

>
> If there's demand then we could decouple sort keys from the display order, 
> by slightly augmenting the field format:
>
>  -F total,self:2,process:0,dso:1,name
>
> This would sort by 'process' field as the primary key, 'dso' the secondary 
> key and 'self' as the tertiary key.
>
> And we could also keep the -s/--sort option:
>
>  -s process,dso,self
>
> So the above -F line would be equivalent to:
>
>  -F total,self,process,dso,name -s process,dso,self
>
> What do you think?

I like the second one.  It can sustain the old way but can support the
new way easily.

But for compatibility we need to use the 'self' sort key internally iff
neither the -F option nor the config option was given by the user.  And
it might warn (or notify) users to add the 'self' column to the sort keys
for future use.

Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Ingo Molnar

* Namhyung Kim  wrote:

> Hi Ingo,
> 
> On Fri, 1 Nov 2013 10:27:59 +0100, Ingo Molnar wrote:
> > * Namhyung Kim  wrote:
> >
> >> >> > 2)
> >> >> >
> >> >> > Is it possible to configure the default 'report -g' style, so that 
> >> >> > people who'd like to use it all the time don't have to type '-g 
> >> >> > cumulative' all the time?
> >> >> 
> >> >> Hmm.. maybe I can add support for the 'report.call-graph' config option.
> >> >
> >> > If we display your new 'total' field by default then it's not as 
> >> > pressing to me :)
> >> 
> >> Do you mean -g cumulative without 'self' column?
> >
> > So, if by default we display both 'self' and 'total' and sort by 
> > 'total', then I'm personally a happy camper: while it's a change of 
> > the default perf report output, it's also a step forward.
> >
> > But some people might complain about it once this hits v3.13 (or 
> > v3.14) and might want to hide individual columns and have different 
> > sorting preferences.
> >
> > So out of defensive caution you might want to provide toggles for 
> > such things, maybe even generalize it a bit to allow arbitrary 
> > hiding/display of individual colums in perf report.
> >
> > That would probably also make it easier to do minimal tweaks to the 
> > upstream reporting defaults.
> 
> Okay, so what would the interface look like?
> 
> I think it'd better to separate the option and pass column and
> (optional) sort key argument.
> 
>   --cumulative both,total (default)
>   --cumulative both,self
>   --cumulative total
>   --cumulative self (meaningless?)
> 
> Maybe we need a config option and a single letter option for the default
> case like --call-graph and -g options do.
> 
> What do you think?

So why restrict it to 'cumulative'? Why not have a generic --fields/-F, 
with a good default. The ordering of the fields determines sorting 
behavior.

The default would be something like:

  -F total,self,process,dso,name

Whether 'cumulative' data is calculated is not a function of any direct 
option, but simply a function of whether the 'total' field is in the -F 
list of columns displayed or not.

With that scheme we could also do things like this to get old-style 
sorting:

 -F self,process,dso,name

Or a really frugal 'readprofile'-style output:

 -F self,name

if one is only interested in percentages and raw function names.

Wrt. sorting order, by default the first column in the list of columns 
would be the primary (and only) sort key.

(The -F field setup list could also be specified in the .perfconfig.)

With this method we could do away with all this geometrical explosion of 
somewhat inconsistent formatting and sorting options...

If there's demand then we could decouple sort keys from the display order, 
by slightly augmenting the field format:

 -F total,self:2,process:0,dso:1,name

This would sort by 'process' field as the primary key, 'dso' the secondary 
key and 'self' as the tertiary key.

And we could also keep the -s/--sort option:

 -s process,dso,self

So the above -F line would be equivalent to:

 -F total,self,process,dso,name -s process,dso,self
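
As a rough illustration of the two notations, here is a small Python
sketch.  It is not perf code: parse_fields() is a hypothetical helper
and the ':N' suffix is only the syntax floated in this mail.  It splits
a field list into display columns and sort keys, falling back to the
first column as the only sort key when no priorities are given.

```python
def parse_fields(spec):
    """Split a -F style field list into (columns, sort_keys).

    'name:N' marks a column as sort key with priority N (0 = primary).
    Without any ':N' suffix the first column is the only sort key.
    """
    columns = []
    prioritized = []
    for item in spec.split(","):
        name, sep, prio = item.partition(":")
        columns.append(name)
        if sep:
            prioritized.append((int(prio), name))
    # Lower priority number sorts first.
    sort_keys = [name for _, name in sorted(prioritized)]
    if not sort_keys and columns:
        sort_keys = [columns[0]]
    return columns, sort_keys
```

Under these assumptions 'total,self:2,process:0,dso:1,name' yields the
same columns as 'total,self,process,dso,name' with the sort keys
process, dso, self, which is exactly what the explicit -s form spells
out.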

What do you think?

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Namhyung Kim
Hi Ingo,

On Fri, 1 Nov 2013 10:27:59 +0100, Ingo Molnar wrote:
> * Namhyung Kim  wrote:
>
>> >> > 2)
>> >> >
>> >> > Is it possible to configure the default 'report -g' style, so that 
>> >> > people who'd like to use it all the time don't have to type '-g 
>> >> > cumulative' all the time?
>> >> 
>> >> Hmm.. maybe I can add support for the 'report.call-graph' config option.
>> >
>> > If we display your new 'total' field by default then it's not as 
>> > pressing to me :)
>> 
>> Do you mean -g cumulative without 'self' column?
>
> So, if by default we display both 'self' and 'total' and sort by 
> 'total', then I'm personally a happy camper: while it's a change of 
> the default perf report output, it's also a step forward.
>
> But some people might complain about it once this hits v3.13 (or 
> v3.14) and might want to hide individual columns and have different 
> sorting preferences.
>
> So out of defensive caution you might want to provide toggles for 
> such things, maybe even generalize it a bit to allow arbitrary 
> hiding/display of individual colums in perf report.
>
> That would probably also make it easier to do minimal tweaks to the 
> upstream reporting defaults.

Okay, so what would the interface look like?

I think it'd be better to separate the option and pass the column and an
(optional) sort key argument.

  --cumulative both,total (default)
  --cumulative both,self
  --cumulative total
  --cumulative self (meaningless?)

Maybe we need a config option and a single letter option for the default
case like --call-graph and -g options do.

What do you think?

Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Frederic Weisbecker
On Thu, Oct 31, 2013 at 09:09:32AM +0100, Ingo Molnar wrote:
> 
> 
> * Namhyung Kim  wrote:
> 
> > When the -g cumulative option is given, it'll be shown like this:
> > 
> >   $ perf report -g cumulative --stdio
> > 
> >   # Overhead  Overhead (Acc)  Command  Shared Object      Symbol
> >   # ........  ..............  .......  .................  .....................
> >   #
> >        0.00%          88.29%  abc      libc-2.17.so       [.] __libc_start_main
> >        0.00%          88.29%  abc      abc                [.] main
> >        0.00%          88.29%  abc      abc                [.] c
> >        0.00%          88.29%  abc      abc                [.] b
> >       88.29%          88.29%  abc      abc                [.] a
> >        0.00%          11.61%  abc      ld-2.17.so         [k] _dl_sysdep_start
> >        0.00%           9.43%  abc      ld-2.17.so         [.] dl_main
> >        9.43%           9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
> >        2.27%           2.27%  abc      [kernel.kallsyms]  [k] page_fault
> >        0.00%           2.18%  abc      ld-2.17.so         [k] _dl_start_user
> >        0.00%           0.10%  abc      ld-2.17.so         [.] _start
> > 
> > As you can see __libc_start_main -> main -> c -> b -> a callchain 
> > show up in the output.
> 
> This looks really useful!
> 
> A couple of details:
> 
> 1)
> 
> This is pretty close to SysProf output, right? So why not use the 
> well-known SysProf naming and call the first column 'self' and the 
> second column 'total'? I think those names are pretty intuitive and 
> it would help people who come from SysProf over to perf.
> 
> 2)
> 
> Is it possible to configure the default 'report -g' style, so that 
> people who'd like to use it all the time don't have to type '-g 
> cumulative' all the time?
> 
> 3)
> 
> I'd even argue that we enable this reporting feature by default, if 
> a data file includes call-chain data: the first column will still 
> show the well-known percentage that perf report produces today, the 
> second column will be a new feature in essence.
> 
> The only open question would be, by which column should we sort: 
> 'sysprof style' sorts by 'total', 'perf style' sorts by 'self'. 
> Agreed?

Defaulting that behaviour may make sense but we can expect some visible overhead
out of it though. Adding one hist per callchain entry has probably some
measurable impact.

How about we wait to see that option mature upstream a bit first then we can
decide about making it the default once we have measured and maybe optimized
the resulting overhead?

Thanks.


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Frederic Weisbecker
On Fri, Nov 01, 2013 at 03:48:37PM +0900, Namhyung Kim wrote:
> Hi Ingo,
> 
> On Thu, 31 Oct 2013 09:09:32 +0100, Ingo Molnar wrote:
> > * Namhyung Kim  wrote:
> >
> >> When the -g cumulative option is given, it'll be shown like this:
> >> 
> >>   $ perf report -g cumulative --stdio
> >> 
> >>   # Overhead  Overhead (Acc)  Command  Shared Object      Symbol
> >>   # ........  ..............  .......  .................  .....................
> >>   #
> >>        0.00%          88.29%  abc      libc-2.17.so       [.] __libc_start_main
> >>        0.00%          88.29%  abc      abc                [.] main
> >>        0.00%          88.29%  abc      abc                [.] c
> >>        0.00%          88.29%  abc      abc                [.] b
> >>       88.29%          88.29%  abc      abc                [.] a
> >>        0.00%          11.61%  abc      ld-2.17.so         [k] _dl_sysdep_start
> >>        0.00%           9.43%  abc      ld-2.17.so         [.] dl_main
> >>        9.43%           9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
> >>        2.27%           2.27%  abc      [kernel.kallsyms]  [k] page_fault
> >>        0.00%           2.18%  abc      ld-2.17.so         [k] _dl_start_user
> >>        0.00%           0.10%  abc      ld-2.17.so         [.] _start
> >> 
> >> As you can see __libc_start_main -> main -> c -> b -> a callchain 
> >> show up in the output.
> >
> > This looks really useful!
> 
> Thanks! :)
> 
> >
> > A couple of details:
> >
> > 1)
> >
> > This is pretty close to SysProf output, right? So why not use the 
> > well-known SysProf naming and call the first column 'self' and the 
> > second column 'total'? I think those names are pretty intuitive and 
> > it would help people who come from SysProf over to perf.
> 
> Okay, I can do it. (Although sysprof seems to call it 'cumulative'
> rather than 'total' - but I think the 'total' is better since it's
> simpler and shorter.)

OTOH 'cumulative' probably expresses better what it is about. Or maybe
'branch cumulative'.

'Total' is confusing because it doesn't say what it is a total of.


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Frederic Weisbecker
On Thu, Oct 31, 2013 at 09:09:32AM +0100, Ingo Molnar wrote:
> 
> 
> * Namhyung Kim  wrote:
> 
> > When the -g cumulative option is given, it'll be shown like this:
> > 
> >   $ perf report -g cumulative --stdio
> > 
> >   # Overhead  Overhead (Acc)  Command  Shared Object      Symbol
> >   # ........  ..............  .......  .................  .....................
> >   #
> >        0.00%          88.29%  abc      libc-2.17.so       [.] __libc_start_main
> >        0.00%          88.29%  abc      abc                [.] main
> >        0.00%          88.29%  abc      abc                [.] c
> >        0.00%          88.29%  abc      abc                [.] b
> >       88.29%          88.29%  abc      abc                [.] a
> >        0.00%          11.61%  abc      ld-2.17.so         [k] _dl_sysdep_start
> >        0.00%           9.43%  abc      ld-2.17.so         [.] dl_main
> >        9.43%           9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
> >        2.27%           2.27%  abc      [kernel.kallsyms]  [k] page_fault
> >        0.00%           2.18%  abc      ld-2.17.so         [k] _dl_start_user
> >        0.00%           0.10%  abc      ld-2.17.so         [.] _start
> > 
> > As you can see __libc_start_main -> main -> c -> b -> a callchain 
> > show up in the output.
> 
> This looks really useful!
> 
> A couple of details:
> 
> 1)
> 
> This is pretty close to SysProf output, right? So why not use the 
> well-known SysProf naming and call the first column 'self' and the 
> second column 'total'? I think those names are pretty intuitive and 
> it would help people who come from SysProf over to perf.

Makes sense. Or "Overhead" and "Total overhead"?


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Namhyung Kim
Hi Ingo,

On Fri, 1 Nov 2013 10:27:59 +0100, Ingo Molnar wrote:
> * Namhyung Kim namhy...@kernel.org wrote:
>
>> >> > 2)
>> >> >
>> >> > Is it possible to configure the default 'report -g' style, so that
>> >> > people who'd like to use it all the time don't have to type '-g
>> >> > cumulative' all the time?
>> >>
>> >> Hmm.. maybe I can add support for the 'report.call-graph' config option.
>> >
>> > If we display your new 'total' field by default then it's not as
>> > pressing to me :)
>>
>> Do you mean -g cumulative without 'self' column?
>
> So, if by default we display both 'self' and 'total' and sort by
> 'total', then I'm personally a happy camper: while it's a change of
> the default perf report output, it's also a step forward.
>
> But some people might complain about it once this hits v3.13 (or
> v3.14) and might want to hide individual columns and have different
> sorting preferences.
>
> So out of defensive caution you might want to provide toggles for
> such things, maybe even generalize it a bit to allow arbitrary
> hiding/display of individual columns in perf report.
>
> That would probably also make it easier to do minimal tweaks to the
> upstream reporting defaults.

Okay, so what would the interface look like?

I think it'd be better to make it a separate option and pass the column and
an (optional) sort key as its argument.

  --cumulative both,total (default)
  --cumulative both,self
  --cumulative total
  --cumulative self (meaningless?)

Maybe we need a config option and a single letter option for the default
case like --call-graph and -g options do.
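
A rough sketch of how such an argument could be interpreted. The option does not exist yet; the function name and the error handling are assumptions made only to illustrate the "columns[,sort key]" idea above.

```python
def parse_cumulative(arg="both,total"):
    """Parse a hypothetical '--cumulative <columns>[,<sort key>]' argument.

    Returns (shown_columns, sort_key).
    """
    columns, _, sort_key = arg.partition(",")
    if columns not in ("both", "total", "self"):
        raise ValueError(f"unknown column selection: {columns!r}")
    shown = ["self", "total"] if columns == "both" else [columns]
    if not sort_key:
        # No explicit key: prefer 'total' when it is displayed.
        sort_key = "total" if "total" in shown else shown[0]
    if sort_key not in shown:
        raise ValueError(f"cannot sort by hidden column: {sort_key!r}")
    return shown, sort_key

print(parse_cumulative())             # the proposed default
print(parse_cumulative("both,self"))  # both columns, old-style sorting
```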

What do you think?

Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-04 Thread Ingo Molnar

* Namhyung Kim namhy...@kernel.org wrote:

> Hi Ingo,
> 
> On Fri, 1 Nov 2013 10:27:59 +0100, Ingo Molnar wrote:
> > * Namhyung Kim namhy...@kernel.org wrote:
> >
> >> >> > 2)
> >> >> >
> >> >> > Is it possible to configure the default 'report -g' style, so that
> >> >> > people who'd like to use it all the time don't have to type '-g
> >> >> > cumulative' all the time?
> >> >>
> >> >> Hmm.. maybe I can add support for the 'report.call-graph' config option.
> >> >
> >> > If we display your new 'total' field by default then it's not as
> >> > pressing to me :)
> >>
> >> Do you mean -g cumulative without 'self' column?
> >
> > So, if by default we display both 'self' and 'total' and sort by
> > 'total', then I'm personally a happy camper: while it's a change of
> > the default perf report output, it's also a step forward.
> >
> > But some people might complain about it once this hits v3.13 (or
> > v3.14) and might want to hide individual columns and have different
> > sorting preferences.
> >
> > So out of defensive caution you might want to provide toggles for
> > such things, maybe even generalize it a bit to allow arbitrary
> > hiding/display of individual columns in perf report.
> >
> > That would probably also make it easier to do minimal tweaks to the
> > upstream reporting defaults.
> 
> Okay, so what would the interface look like?
> 
> I think it'd be better to separate the option and pass column and
> (optional) sort key argument.
> 
>   --cumulative both,total (default)
>   --cumulative both,self
>   --cumulative total
>   --cumulative self (meaningless?)
> 
> Maybe we need a config option and a single letter option for the default
> case like --call-graph and -g options do.
> 
> What do you think?

So why restrict it to 'cumulative'? Why not have a generic --fields/-F, 
with a good default. The ordering of the fields determines sorting 
behavior.

The default would be something like:

  -F total,self,process,dso,name

Whether 'cumulative' data is calculated is not a function of any direct 
option, but simply a function of whether the 'total' field is in the -F 
list of columns displayed or not.

With that scheme we could also do things like this to get old-style 
sorting:

 -F self,process,dso,name

Or a really frugal 'readprofile'-style output:

 -F self,name

if one is only interested in percentages and raw function names.

Wrt. sorting order, by default the first column in the list of columns 
would be the primary (and only) sort key.

(The -F field setup list could also be specified in the .perfconfig.)

With this method we could do away with all this geometrical explosion of 
somewhat inconsistent formatting and sorting options...

If there's demand then we could decouple sort keys from the display order, 
by slightly augmenting the field format:

 -F total,self:2,process:0,dso:1,name

This would sort by 'process' field as the primary key, 'dso' the secondary 
key and 'self' as the tertiary key.

And we could also keep the -s/--sort option:

 -s process,dso,self

So the above -F line would be equivalent to:

 -F total,self,process,dso,name -s process,dso,self
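
As a sketch of the proposed semantics (nothing here is existing perf code; the function name and the return shape are assumptions), the field list could be turned into a display order plus a sort-key order like this:

```python
def parse_fields(spec):
    """Split a '-F'-style list into (display_columns, sort_keys).

    'name:N' gives column 'name' sort priority N (0 = primary key).
    Without any ':N' suffix the first column is the only sort key.
    """
    columns = []
    ranked = []
    for item in spec.split(","):
        name, sep, prio = item.partition(":")
        columns.append(name)
        if sep:
            ranked.append((int(prio), name))
    if ranked:
        sort_keys = [name for _, name in sorted(ranked)]
    else:
        sort_keys = columns[:1]
    return columns, sort_keys

print(parse_fields("total,self:2,process:0,dso:1,name"))
print(parse_fields("self,name"))
```

The first call yields the sort order process, dso, self while keeping the display order unchanged, which matches the equivalent -F/-s pair above.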

What do you think?

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Ingo Molnar

* Namhyung Kim  wrote:

> >> > 2)
> >> >
> >> > Is it possible to configure the default 'report -g' style, so that 
> >> > people who'd like to use it all the time don't have to type '-g 
> >> > cumulative' all the time?
> >> 
> >> Hmm.. maybe I can add support for the 'report.call-graph' config option.
> >
> > If we display your new 'total' field by default then it's not as 
> > pressing to me :)
> 
> Do you mean -g cumulative without 'self' column?

So, if by default we display both 'self' and 'total' and sort by 
'total', then I'm personally a happy camper: while it's a change of 
the default perf report output, it's also a step forward.

But some people might complain about it once this hits v3.13 (or 
v3.14) and might want to hide individual columns and have different 
sorting preferences.

So out of defensive caution you might want to provide toggles for 
such things, maybe even generalize it a bit to allow arbitrary 
hiding/display of individual columns in perf report.

That would probably also make it easier to do minimal tweaks to the 
upstream reporting defaults.

> > Btw., if anyone is interested in improving the GTK front-end, it 
> > would be _really_ nice if it had a 'start profiling' button like 
> > sysprof has today, with a 'samples' field showing the current 
> > number of samples. (We could even improve upon sysprof by adding 
> > 'stop' functionality as well ;-)
> 
> Wow, I'm impressed that the sysprof doesn't have one. :)

At least I haven't found it: I tried pressing 'start' once more but 
that doesn't do it, it just keeps collecting data.

Still, many developers love sysprof, so I think there would be quite 
some plus in providing a gtk perf top version with the controls 
Pekka and I listed.

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Namhyung Kim
On Fri, 1 Nov 2013 08:55:02 +0100, Ingo Molnar wrote:
> * Namhyung Kim  wrote:
>
>> > A couple of details:
>> >
>> > 1)
>> >
>> > This is pretty close to SysProf output, right? So why not use the 
>> > well-known SysProf naming and call the first column 'self' and the 
>> > second column 'total'? I think those names are pretty intuitive and 
>> > it would help people who come from SysProf over to perf.
>> 
>> Okay, I can do it. (Although sysprof seems to call it 'cumulative'
>> rather than 'total' - but I think the 'total' is better since it's
>> simpler and shorter.)
>
> So sysprof-1.2 has the following two windows:
>
>  'functions',   with 'self' and 'total' fields
>  'descendants', with 'self' and 'cumulative' fields
>
> 'descendants' appears to be similar to the perf 'dso' concept.

Arh, okay.  Thanks for the info.

>
>> > 2)
>> >
>> > Is it possible to configure the default 'report -g' style, so that 
>> > people who'd like to use it all the time don't have to type '-g 
>> > cumulative' all the time?
>> 
>> Hmm.. maybe I can add support for the 'report.call-graph' config option.
>
> If we display your new 'total' field by default then it's not as 
> pressing to me :)

Do you mean -g cumulative without 'self' column?

>
>> > 3)
>> >
>> > I'd even argue that we enable this reporting feature by default, if 
>> > a data file includes call-chain data: the first column will still 
>> > show the well-known percentage that perf report produces today, the 
>> > second column will be a new feature in essence.
>> >
>> > The only open question would be, by which column should we sort: 
>> > 'sysprof style' sorts by 'total', 'perf style' sorts by 'self'. 
>> > Agreed?
>> 
>> Right, I defaulted to go by 'total'.  But we can add an option for 
>> it.
>
> The purpose would be to allow people to do old-style 'sort by 
> function overhead' output, while still seeing the 'total' field as 
> well.

Right.

>
> Btw., if anyone is interested in improving the GTK front-end, it 
> would be _really_ nice if it had a 'start profiling' button like 
> sysprof has today, with a 'samples' field showing the current number 
> of samples. (We could even improve upon sysprof by adding 'stop' 
> functionality as well ;-)

Wow, I'm impressed that the sysprof doesn't have one. :)

>
> A bit like perf top, except the reporting session is hidden until 
> the user actively requests the profile.
>
> Maybe it could even be called a gtk version of 'perf top', with a 
> button to start/stop collection, with another button to 
> activate/deactivate reporting output, and yet another button to 
> reset the profiling buffer.
>
> With that feature set perf would be a ready sysprof workflow 
> replacement I think. (I've Cc:-ed Pekka, just in case! :-)

Sounds nice.  I'm not sure I can find the time to do it anytime soon.

>
>> > 4)
>> >
>> > This is not directly related to the new feature you added: 
>> > call-graph profiling still takes quite a bit of time. It might 
>> > make sense to save the ordered histogram to a perf.data.ordered 
>> > file, so that repeat invocations of 'perf report' don't have to 
>> > recalculate everything again and again?
>> >
>> > This file would be maintained transparently and would only be 
>> > re-created when the perf.data file changes, or something like 
>> > that.
>> 
>> Hmm.. good idea.  We may discuss it along with Jiri's multiple 
>> file storage patches.  I haven't had a time to review - maybe next 
>> week.
>
> So Arnaldo tells me that with your and Frederic's latest 
> callgraph-speedup patches the parsing of perf.data got _really_ 
> fast, so maybe my performance complaint is moot and we should delay 
> complicating the primary perf.data file model with a 'cache' until 
> your patches are in and we see the full impact.

Okay, let's see what happens. :)

Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Pekka Enberg
On Fri, Nov 1, 2013 at 9:55 AM, Ingo Molnar  wrote:
> Btw., if anyone is interested in improving the GTK front-end, it
> would be _really_ nice if it had a 'start profiling' button like
> sysprof has today, with a 'samples' field showing the current number
> of samples. (We could even improve upon sysprof by adding 'stop'
> functionality as well ;-)
>
> A bit like perf top, except the reporting session is hidden until
> the user actively requests the profile.
>
> Maybe it could even be called a gtk version of 'perf top', with a
> button to start/stop collection, with another button to
> activate/deactivate reporting output, and yet another button to
> reset the profiling buffer.
>
> With that feature set perf would be a ready sysprof workflow
> replacement I think. (I've Cc:-ed Pekka, just in case! :-)

Sure, a "start/stop" button is useful for system-wide profiling, but you
also want to be able to start an application for profiling from the UI.
I don't remember if SysProf supports that but the Shark profiler on Mac
OS X does.

Btw, to make the GTK front-end even more useful, we should:

  (1) Have a drop-down for selecting a specific process from system-wide
  profile.

  (2) Show 'perf annotate' from 'perf report' when a function is
  double-clicked.

Pekka


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Ingo Molnar

* Namhyung Kim  wrote:

> > A couple of details:
> >
> > 1)
> >
> > This is pretty close to SysProf output, right? So why not use the 
> > well-known SysProf naming and call the first column 'self' and the 
> > second column 'total'? I think those names are pretty intuitive and 
> > it would help people who come from SysProf over to perf.
> 
> Okay, I can do it. (Although sysprof seems to call it 'cumulative'
> rather than 'total' - but I think the 'total' is better since it's
> simpler and shorter.)

So sysprof-1.2 has the following two windows:

 'functions',   with 'self' and 'total' fields
 'descendants', with 'self' and 'cumulative' fields

'descendants' appears to be similar to the perf 'dso' concept.

> > 2)
> >
> > Is it possible to configure the default 'report -g' style, so that 
> > people who'd like to use it all the time don't have to type '-g 
> > cumulative' all the time?
> 
> Hmm.. maybe I can add support for the 'report.call-graph' config option.

If we display your new 'total' field by default then it's not as 
pressing to me :)

> > 3)
> >
> > I'd even argue that we enable this reporting feature by default, if 
> > a data file includes call-chain data: the first column will still 
> > show the well-known percentage that perf report produces today, the 
> > second column will be a new feature in essence.
> >
> > The only open question would be, by which column should we sort: 
> > 'sysprof style' sorts by 'total', 'perf style' sorts by 'self'. 
> > Agreed?
> 
> Right, I defaulted to go by 'total'.  But we can add an option for 
> it.

The purpose would be to allow people to do old-style 'sort by 
function overhead' output, while still seeing the 'total' field as 
well.
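
A small illustration of the two orderings, using a few rows copied from the example report earlier in this thread. The tuple layout and the helper name are just for illustration.

```python
rows = [
    # (self %, total %, symbol), taken from the 'abc' example output
    (0.00, 88.29, "__libc_start_main"),
    (0.00, 88.29, "main"),
    (88.29, 88.29, "a"),
    (9.43, 9.43, "_dl_relocate_object"),
    (2.27, 2.27, "page_fault"),
    (0.00, 0.10, "_start"),
]

def sort_rows(rows, key="total"):
    """Sort descending by 'total' (sysprof style) or 'self' (perf style)."""
    index = {"self": 0, "total": 1}[key]
    # sorted() is stable, so rows with equal values keep their input order
    return sorted(rows, key=lambda row: row[index], reverse=True)

for key in ("total", "self"):
    print(f"sorted by {key}:")
    for self_pct, total_pct, sym in sort_rows(rows, key):
        print(f"  {self_pct:6.2f}%  {total_pct:6.2f}%  {sym}")
```

Sorting by 'self' brings the leaf function 'a' to the top while the 'total' column stays visible next to it.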

Btw., if anyone is interested in improving the GTK front-end, it 
would be _really_ nice if it had a 'start profiling' button like 
sysprof has today, with a 'samples' field showing the current number 
of samples. (We could even improve upon sysprof by adding 'stop' 
functionality as well ;-)

A bit like perf top, except the reporting session is hidden until 
the user actively requests the profile.

Maybe it could even be called a gtk version of 'perf top', with a 
button to start/stop collection, with another button to 
activate/deactivate reporting output, and yet another button to 
reset the profiling buffer.

With that feature set perf would be a ready sysprof workflow 
replacement I think. (I've Cc:-ed Pekka, just in case! :-)

> > 4)
> >
> > This is not directly related to the new feature you added: 
> > call-graph profiling still takes quite a bit of time. It might 
> > make sense to save the ordered histogram to a perf.data.ordered 
> > file, so that repeat invocations of 'perf report' don't have to 
> > recalculate everything again and again?
> >
> > This file would be maintained transparently and would only be 
> > re-created when the perf.data file changes, or something like 
> > that.
> 
> Hmm.. good idea.  We may discuss it along with Jiri's multiple 
> file storage patches.  I haven't had a time to review - maybe next 
> week.

So Arnaldo tells me that with your and Frederic's latest 
callgraph-speedup patches the parsing of perf.data got _really_ 
fast, so maybe my performance complaint is moot and we should delay 
complicating the primary perf.data file model with a 'cache' until 
your patches are in and we see the full impact.
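
Should the cache idea from 4) be revisited later, the "only re-created when the perf.data file changes" rule could be as simple as an mtime comparison. This is a minimal sketch under that assumption; the helper name is hypothetical and a real implementation might prefer a build-id or checksum stored in the cache header.

```python
import os

def cache_is_fresh(data_path, cache_path):
    """Return True if the cache exists and is at least as new as the data file."""
    try:
        cache_mtime = os.stat(cache_path).st_mtime_ns
        data_mtime = os.stat(data_path).st_mtime_ns
    except FileNotFoundError:
        # Missing cache (or missing data file): nothing usable to reuse.
        return False
    return cache_mtime >= data_mtime

# Typical use: rebuild the ordered histogram only when needed.
if not cache_is_fresh("perf.data", "perf.data.ordered"):
    print("cache missing or stale: the histogram would be rebuilt here")
```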

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Namhyung Kim
Hi Ingo,

On Thu, 31 Oct 2013 09:09:32 +0100, Ingo Molnar wrote:
> * Namhyung Kim  wrote:
>
>> When the -g cumulative option is given, it'll be shown like this:
>> 
>>   $ perf report -g cumulative --stdio
>> 
>>   # Overhead  Overhead (Acc)  Command  Shared Object      Symbol
>>   # ........  ..............  .......  .................  .....................
>>   #
>>        0.00%          88.29%  abc      libc-2.17.so       [.] __libc_start_main
>>        0.00%          88.29%  abc      abc                [.] main
>>        0.00%          88.29%  abc      abc                [.] c
>>        0.00%          88.29%  abc      abc                [.] b
>>       88.29%          88.29%  abc      abc                [.] a
>>        0.00%          11.61%  abc      ld-2.17.so         [k] _dl_sysdep_start
>>        0.00%           9.43%  abc      ld-2.17.so         [.] dl_main
>>        9.43%           9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
>>        2.27%           2.27%  abc      [kernel.kallsyms]  [k] page_fault
>>        0.00%           2.18%  abc      ld-2.17.so         [k] _dl_start_user
>>        0.00%           0.10%  abc      ld-2.17.so         [.] _start
>> 
>> As you can see __libc_start_main -> main -> c -> b -> a callchain 
>> show up in the output.
>
> This looks really useful!

Thanks! :)

>
> A couple of details:
>
> 1)
>
> This is pretty close to SysProf output, right? So why not use the 
> well-known SysProf naming and call the first column 'self' and the 
> second column 'total'? I think those names are pretty intuitive and 
> it would help people who come from SysProf over to perf.

Okay, I can do it. (Although sysprof seems to call it 'cumulative'
rather than 'total' - but I think the 'total' is better since it's
simpler and shorter.)

>
> 2)
>
> Is it possible to configure the default 'report -g' style, so that 
> people who'd like to use it all the time don't have to type '-g 
> cumulative' all the time?

Hmm.. maybe I can add support for the 'report.call-graph' config option.

>
> 3)
>
> I'd even argue that we enable this reporting feature by default, if 
> a data file includes call-chain data: the first column will still 
> show the well-known percentage that perf report produces today, the 
> second column will be a new feature in essence.
>
> The only open question would be, by which column should we sort: 
> 'sysprof style' sorts by 'total', 'perf style' sorts by 'self'. 
> Agreed?

Right, I defaulted to go by 'total'.  But we can add an option for it.

>
> 4)
>
> This is not directly related to the new feature you added: 
> call-graph profiling still takes quite a bit of time. It might make 
> sense to save the ordered histogram to a perf.data.ordered file, so 
> that repeat invocations of 'perf report' don't have to recalculate 
> everything again and again?
>
> This file would be maintained transparently and would only be 
> re-created when the perf.data file changes, or something like that.

Hmm.. good idea.  We may discuss it along with Jiri's multiple file
storage patches.  I haven't had time to review them yet - maybe next week.

>
> 5)
>
> I realize that this is an early RFC, still there are some usability 
> complaints I have about call-graph recording/reporting which should 
> be addressed before adding new features.
>
> For example I tried to get a list of the -g output modi via:
>
>$ perf report -g help
>
> Which produced a lot of options - I think it should produce only a 
> list of -g options.

Right.  I have a patchset for this.  Will send it soon.


> It also doesn't list cumulative:
>
> -g, --call-graph 
>   Display callchains using output_type 
> (graph, flat, fractal, or none) , min percent threshold, optional 
> print limit, callchain order, key (function or address). Default: 
> fractal,0.5,callee,function

Ah, I forgot to add it.  Will fix!

>
> Also, the list is very long and not very readable - I think there 
> should be more newlines.
>
> Then I tried to do:
>
>$ perf report -g
>
> which, somewhat surprisingly, was accepted. Given that call-graph 
> perf.data is recognized automatically by 'perf report', the -g 
> option should only accept -g  syntax and provide a list of 
> options when '-g' or '-g help' is provided.

Will check.

>
> 6)
>
> A similar UI problem exists on the 'perf record' side: 'perf record 
> --call-graph help' should produce a specific list of call-graph 
> possibilities, not the two screens full output it does today.

Right.  The patch will come soonish. :)

>
>> I know it have some rough edges or even bugs, but I really want to 
>> release it and get reviews.  It does not handle event groups and 
>> annotations and it has a bug on TUI.
>> 
>> You can also get this series on 'perf/cumulate-v2' branch in my tree at:

Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Ingo Molnar

* Namhyung Kim <namhy...@kernel.org> wrote:

> > A couple of details:
> >
> > 1)
> >
> > This is pretty close to SysProf output, right? So why not use the
> > well-known SysProf naming and call the first column 'self' and the
> > second column 'total'? I think those names are pretty intuitive and
> > it would help people who come from SysProf over to perf.
>
> Okay, I can do it. (Although sysprof seems to call it 'cumulative'
> rather than 'total' - but I think the 'total' is better since it's
> simpler and shorter.)

So sysprof-1.2 has the following two windows:

 'functions',   with 'self' and 'total' fields
 'descendants', with 'self' and 'cumulative' fields

'descendants' appears to be similar to the perf 'dso' concept.

> > 2)
> >
> > Is it possible to configure the default 'report -g' style, so that
> > people who'd like to use it all the time don't have to type '-g
> > cumulative' all the time?
>
> Hmm.. maybe I can add support for the 'report.call-graph' config option.

If we display your new 'total' field by default then it's not as 
pressing to me :)

> > 3)
> >
> > I'd even argue that we enable this reporting feature by default, if
> > a data file includes call-chain data: the first column will still
> > show the well-known percentage that perf report produces today, the
> > second column will be a new feature in essence.
> >
> > The only open question would be, by which column should we sort:
> > 'sysprof style' sorts by 'total', 'perf style' sorts by 'self'.
> > Agreed?
>
> Right, I defaulted to go by 'total'.  But we can add an option for
> it.

The purpose would be to allow people to do old-style 'sort by 
function overhead' output, while still seeing the 'total' field as 
well.

Btw., if anyone is interested in improving the GTK front-end, it 
would be _really_ nice if it had a 'start profiling' button like 
sysprof has today, with a 'samples' field showing the current number 
of samples. (We could even improve upon sysprof by adding 'stop' 
functionality as well ;-)

A bit like perf top, except the reporting session is hidden until 
the user actively requests the profile.

Maybe it could even be called a gtk version of 'perf top', with a 
button to start/stop collection, with another button to 
activate/deactivate reporting output, and yet another button to 
reset the profiling buffer.

With that feature set perf would be a ready sysprof workflow 
replacement I think. (I've Cc:-ed Pekka, just in case! :-)

> > 4)
> >
> > This is not directly related to the new feature you added:
> > call-graph profiling still takes quite a bit of time. It might
> > make sense to save the ordered histogram to a perf.data.ordered
> > file, so that repeat invocations of 'perf report' don't have to
> > recalculate everything again and again?
> >
> > This file would be maintained transparently and would only be
> > re-created when the perf.data file changes, or something like
> > that.
>
> Hmm.. good idea.  We may discuss it along with Jiri's multiple
> file storage patches.  I haven't had a time to review - maybe next
> week.

So Arnaldo tells me that with your and Frederic's latest 
callgraph-speedup patches the parsing of perf.data got _really_ 
fast, so maybe my performance complaint is moot and we should delay 
complicating the primary perf.data file model with a 'cache' until 
your patches are in and we see the full impact.
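
For reference, the "only re-created when the perf.data file changes" part of such a cache could be as small as a timestamp comparison. Below is a minimal sketch in C, assuming that modification times are a good enough change signal. It is not perf code: cache_is_fresh() and can_reuse_cache() are hypothetical names, and perf.data.ordered is just the file name proposed above.

```c
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Policy only: the cache is usable if it is at least as new as the data. */
int cache_is_fresh(time_t data_mtime, time_t cache_mtime)
{
	return cache_mtime >= data_mtime;
}

/*
 * Return 1 if cache_path exists and is not older than data_path,
 * 0 otherwise (including when either file cannot be stat()ed).
 */
int can_reuse_cache(const char *data_path, const char *cache_path)
{
	struct stat data_st, cache_st;

	assert(data_path && cache_path);

	if (stat(data_path, &data_st) != 0 || stat(cache_path, &cache_st) != 0)
		return 0;

	return cache_is_fresh(data_st.st_mtime, cache_st.st_mtime);
}
```

A caller would try can_reuse_cache("perf.data", "perf.data.ordered") first and fall back to the full parse when it returns 0. With one-second mtime granularity a rewrite within the same second would go unnoticed, so a real implementation would probably want a stronger check as well.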

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Pekka Enberg
On Fri, Nov 1, 2013 at 9:55 AM, Ingo Molnar <mi...@kernel.org> wrote:
> Btw., if anyone is interested in improving the GTK front-end, it
> would be _really_ nice if it had a 'start profiling' button like
> sysprof has today, with a 'samples' field showing the current number
> of samples. (We could even improve upon sysprof by adding 'stop'
> functionality as well ;-)
>
> A bit like perf top, except the reporting session is hidden until
> the user actively requests the profile.
>
> Maybe it could even be called a gtk version of 'perf top', with a
> button to start/stop collection, with another button to
> activate/deactivate reporting output, and yet another button to
> reset the profiling buffer.
>
> With that feature set perf would be a ready sysprof workflow
> replacement I think. (I've Cc:-ed Pekka, just in case! :-)

Sure, start/stop button is useful for system-wide profiling but you
also want to be able to start an application for profiling from the UI.
I don't remember if SysProf supports that but the Shark profiler on Mac
OS X does.

Btw, to make GTK-front end even more useful, we should:

  (1) Have a drop-down for selecting a specific process from system-wide
  profile.

  (2) Show 'perf annotate' from 'perf report' when a function is
  double-clicked.

Pekka


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Namhyung Kim
On Fri, 1 Nov 2013 08:55:02 +0100, Ingo Molnar wrote:
> * Namhyung Kim <namhy...@kernel.org> wrote:
>
>> > A couple of details:
>> >
>> > 1)
>> >
>> > This is pretty close to SysProf output, right? So why not use the
>> > well-known SysProf naming and call the first column 'self' and the
>> > second column 'total'? I think those names are pretty intuitive and
>> > it would help people who come from SysProf over to perf.
>>
>> Okay, I can do it. (Although sysprof seems to call it 'cumulative'
>> rather than 'total' - but I think the 'total' is better since it's
>> simpler and shorter.)
>
> So sysprof-1.2 has the following two windows:
>
>  'functions',   with 'self' and 'total' fields
>  'descendants', with 'self' and 'cumulative' fields
>
> 'descendants' appears to be similar to the perf 'dso' concept.

Arh, okay.  Thanks for the info.


>> > 2)
>> >
>> > Is it possible to configure the default 'report -g' style, so that
>> > people who'd like to use it all the time don't have to type '-g
>> > cumulative' all the time?
>>
>> Hmm.. maybe I can add support for the 'report.call-graph' config option.
>
> If we display your new 'total' field by default then it's not as
> pressing to me :)

Do you mean -g cumulative without 'self' column?


>> > 3)
>> >
>> > I'd even argue that we enable this reporting feature by default, if
>> > a data file includes call-chain data: the first column will still
>> > show the well-known percentage that perf report produces today, the
>> > second column will be a new feature in essence.
>> >
>> > The only open question would be, by which column should we sort:
>> > 'sysprof style' sorts by 'total', 'perf style' sorts by 'self'.
>> > Agreed?
>>
>> Right, I defaulted to go by 'total'.  But we can add an option for
>> it.
>
> The purpose would be to allow people to do old-style 'sort by
> function overhead' output, while still seeing the 'total' field as
> well.

Right.


> Btw., if anyone is interested in improving the GTK front-end, it
> would be _really_ nice if it had a 'start profiling' button like
> sysprof has today, with a 'samples' field showing the current number
> of samples. (We could even improve upon sysprof by adding 'stop'
> functionality as well ;-)

Wow, I'm impressed that the sysprof doesn't have one. :)


> A bit like perf top, except the reporting session is hidden until
> the user actively requests the profile.
>
> Maybe it could even be called a gtk version of 'perf top', with a
> button to start/stop collection, with another button to
> activate/deactivate reporting output, and yet another button to
> reset the profiling buffer.
>
> With that feature set perf would be a ready sysprof workflow
> replacement I think. (I've Cc:-ed Pekka, just in case! :-)

Sounds nice.  I'm not sure I can find the time to do it anytime soon.


>> > 4)
>> >
>> > This is not directly related to the new feature you added:
>> > call-graph profiling still takes quite a bit of time. It might
>> > make sense to save the ordered histogram to a perf.data.ordered
>> > file, so that repeat invocations of 'perf report' don't have to
>> > recalculate everything again and again?
>> >
>> > This file would be maintained transparently and would only be
>> > re-created when the perf.data file changes, or something like
>> > that.
>>
>> Hmm.. good idea.  We may discuss it along with Jiri's multiple
>> file storage patches.  I haven't had a time to review - maybe next
>> week.
>
> So Arnaldo tells me that with your and Frederic's latest
> callgraph-speedup patches the parsing of perf.data got _really_
> fast, so maybe my performance complaint is moot and we should delay
> complicating the primary perf.data file model with a 'cache' until
> your patches are in and we see the full impact.

Okay, let's see what happens. :)

Thanks,
Namhyung


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-11-01 Thread Ingo Molnar

* Namhyung Kim <namhy...@kernel.org> wrote:

> >> > 2)
> >> >
> >> > Is it possible to configure the default 'report -g' style, so that
> >> > people who'd like to use it all the time don't have to type '-g
> >> > cumulative' all the time?
> >>
> >> Hmm.. maybe I can add support for the 'report.call-graph' config option.
> >
> > If we display your new 'total' field by default then it's not as
> > pressing to me :)
>
> Do you mean -g cumulative without 'self' column?

So, if by default we display both 'self' and 'total' and sort by 
'total', then I'm personally a happy camper: while it's a change of 
the default perf report output, it's also a step forward.

But some people might complain about it once this hits v3.13 (or 
v3.14) and might want to hide individual columns and have different 
sorting preferences.

So out of defensive caution you might want to provide toggles for 
such things, maybe even generalize it a bit to allow arbitrary 
hiding/display of individual columns in perf report.

That would probably also make it easier to do minimal tweaks to the 
upstream reporting defaults.
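
A sort toggle of this kind is cheap to express. The following is a minimal C sketch, not perf code: struct row, enum sort_key and sort_rows() are hypothetical names chosen for illustration, and the real hist entry sorting in perf may be organized quite differently. It only shows that 'perf style' (by self) and 'sysprof style' (by total) differ in nothing but the comparison key.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical report row carrying both percentages discussed above. */
struct row {
	const char *sym;
	double self;   /* 'self' column, in percent */
	double total;  /* 'total' column, in percent */
};

enum sort_key { SORT_BY_SELF, SORT_BY_TOTAL };

/* Descending order on the 'self' column. */
int cmp_self(const void *a, const void *b)
{
	const struct row *x = a, *y = b;

	return (x->self < y->self) - (x->self > y->self);
}

/* Descending order on the 'total' column. */
int cmp_total(const void *a, const void *b)
{
	const struct row *x = a, *y = b;

	return (x->total < y->total) - (x->total > y->total);
}

/* Sort the rows in place by the requested column. */
void sort_rows(struct row *rows, size_t n, enum sort_key key)
{
	assert(rows || n == 0);
	qsort(rows, n, sizeof(*rows),
	      key == SORT_BY_SELF ? cmp_self : cmp_total);
}
```

Note that qsort() is not stable, so rows with equal keys (main, c, b and a all have the same total in the example output) may come out in any relative order unless a tie-breaker is added.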

> > Btw., if anyone is interested in improving the GTK front-end, it
> > would be _really_ nice if it had a 'start profiling' button like
> > sysprof has today, with a 'samples' field showing the current
> > number of samples. (We could even improve upon sysprof by adding
> > 'stop' functionality as well ;-)
>
> Wow, I'm impressed that the sysprof doesn't have one. :)

At least I haven't found it: I tried pressing 'start' once more but 
that doesn't do it, it just keeps collecting data.

Still, many developers love sysprof, so I think there would be quite
some plus in providing a gtk perf top version with the controls
Pekka and I listed.

Thanks,

Ingo


Re: [RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-10-31 Thread Ingo Molnar


* Namhyung Kim <namhy...@kernel.org> wrote:

> When the -g cumulative option is given, it'll be shown like this:
> 
>   $ perf report -g cumulative --stdio
> 
>   # Overhead  Overhead (Acc)  Command  Shared Object      Symbol
>   # ........  ..............  .......  .................  ...................
>   #
>        0.00%          88.29%  abc      libc-2.17.so       [.] __libc_start_main
>        0.00%          88.29%  abc      abc                [.] main
>        0.00%          88.29%  abc      abc                [.] c
>        0.00%          88.29%  abc      abc                [.] b
>       88.29%          88.29%  abc      abc                [.] a
>        0.00%          11.61%  abc      ld-2.17.so         [k] _dl_sysdep_start
>        0.00%           9.43%  abc      ld-2.17.so         [.] dl_main
>        9.43%           9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
>        2.27%           2.27%  abc      [kernel.kallsyms]  [k] page_fault
>        0.00%           2.18%  abc      ld-2.17.so         [k] _dl_start_user
>        0.00%           0.10%  abc      ld-2.17.so         [.] _start
> 
> As you can see __libc_start_main -> main -> c -> b -> a callchain 
> show up in the output.

This looks really useful!

A couple of details:

1)

This is pretty close to SysProf output, right? So why not use the 
well-known SysProf naming and call the first column 'self' and the 
second column 'total'? I think those names are pretty intuitive and 
it would help people who come from SysProf over to perf.

2)

Is it possible to configure the default 'report -g' style, so that 
people who'd like to use it all the time don't have to type '-g 
cumulative' all the time?

3)

I'd even argue that we enable this reporting feature by default, if 
a data file includes call-chain data: the first column will still 
show the well-known percentage that perf report produces today, the 
second column will be a new feature in essence.

The only open question would be, by which column should we sort: 
'sysprof style' sorts by 'total', 'perf style' sorts by 'self'. 
Agreed?

4)

This is not directly related to the new feature you added: 
call-graph profiling still takes quite a bit of time. It might make 
sense to save the ordered histogram to a perf.data.ordered file, so 
that repeat invocations of 'perf report' don't have to recalculate 
everything again and again?

This file would be maintained transparently and would only be 
re-created when the perf.data file changes, or something like that.

5)

I realize that this is an early RFC, still there are some usability 
complaints I have about call-graph recording/reporting which should 
be addressed before adding new features.

For example I tried to get a list of the -g output modes via:

   $ perf report -g help

Which produced a lot of options - I think it should produce only a 
list of -g options. It also doesn't list cumulative:

-g, --call-graph <output_type,min_percent[,print_limit],call_order>
  Display callchains using output_type (graph, flat, fractal, or none),
  min percent threshold, optional print limit, callchain order, key
  (function or address). Default: fractal,0.5,callee,function

Also, the list is very long and not very readable - I think there 
should be more newlines.

Then I tried to do:

   $ perf report -g

which, somewhat surprisingly, was accepted. Given that call-graph 
perf.data is recognized automatically by 'perf report', the -g 
option should only accept -g <type> syntax and provide a list of
options when '-g' or '-g help' is provided.

6)

A similar UI problem exists on the 'perf record' side: 'perf record 
--call-graph help' should produce a specific list of call-graph 
possibilities, not the two screens full output it does today.

> I know it has some rough edges or even bugs, but I really want to
> release it and get reviews.  It does not handle event groups and 
> annotations and it has a bug on TUI.
> 
> You can also get this series on 'perf/cumulate-v2' branch in my tree at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

So I tried it out on top of tip:master, with your testcase, and in 
the --stdio case it works very well:

# Overhead  Overhead (Acc)  Command  Shared Object  Symbol
# ........  ..............  .......  .............  .................
#
     0.00%         100.00%  abc      abc            [.] _start
     0.00%         100.00%  abc      libc-2.17.so   [.] __libc_start_main
     0.00%         100.00%  abc      abc            [.] main
     0.00%         100.00%  abc      abc            [.] c
     0.00%         100.00%  abc      abc            [.]

[RFC/PATCHSET 00/14] perf report: Add support to accumulate hist periods (v2)

2013-10-31 Thread Namhyung Kim
Hi,

This is my second attempt to implement cumulative hist period report.
This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely
rewrote it from scratch.

Please see the first two patches.  I refactored the functions that add
hist entries to use struct add_entry_iter.  While I converted all
functions carefully, it'd be better if someone could test and confirm
that I didn't mess something up - especially for branch stack and mem
stuff.

This patchset basically adds the period in a sample to every node in
the callchain.  A hist_entry now has additional fields to keep the
cumulative period if the -g cumulative option is given on perf report.

Let me show you an example:

  $ cat abc.c
  #define barrier() asm volatile("" ::: "memory")

  void a(void)
  {
          int i;
          for (i = 0; i < 100; i++)
                  barrier();
  }
  void b(void)
  {
          a();
  }
  void c(void)
  {
          b();
  }
  int main(void)
  {
          c();
          return 0;
  }

With this simple program I ran perf record and report:

  $ perf record -g -e cycles:u ./abc

  $ perf report --stdio
      88.29%  abc  abc                [.] a
              |
              --- a
                  b
                  c
                  main
                  __libc_start_main

       9.43%  abc  ld-2.17.so         [.] _dl_relocate_object
              |
              --- _dl_relocate_object
                  dl_main
                  _dl_sysdep_start

       2.27%  abc  [kernel.kallsyms]  [k] page_fault
              |
              --- page_fault
                 |
                 |--95.94%-- _dl_sysdep_start
                 |          _dl_start_user
                 |
                  --4.06%-- _start

       0.00%  abc  ld-2.17.so         [.] _start
              |
              --- _start

When the -g cumulative option is given, it'll be shown like this:

  $ perf report -g cumulative --stdio

  # Overhead  Overhead (Acc)  Command  Shared Object      Symbol
  # ........  ..............  .......  .................  ...................
  #
       0.00%          88.29%  abc      libc-2.17.so       [.] __libc_start_main
       0.00%          88.29%  abc      abc                [.] main
       0.00%          88.29%  abc      abc                [.] c
       0.00%          88.29%  abc      abc                [.] b
      88.29%          88.29%  abc      abc                [.] a
       0.00%          11.61%  abc      ld-2.17.so         [k] _dl_sysdep_start
       0.00%           9.43%  abc      ld-2.17.so         [.] dl_main
       9.43%           9.43%  abc      ld-2.17.so         [.] _dl_relocate_object
       2.27%           2.27%  abc      [kernel.kallsyms]  [k] page_fault
       0.00%           2.18%  abc      ld-2.17.so         [k] _dl_start_user
       0.00%           0.10%  abc      ld-2.17.so         [.] _start

As you can see, the __libc_start_main -> main -> c -> b -> a callchain
shows up in the output.
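
To make the accumulation concrete, here is a minimal sketch in C of the idea described above: the sample period is added to the "self" count of the sampled function and to the cumulative count of every function in its callchain. This is not the code from the patchset; struct entry, struct hist, hist_find() and hist_add_sample() are hypothetical names, and crediting a recursive function only once per sample is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 32

/* Hypothetical stand-in for a hist entry: one row of the report. */
struct entry {
	const char *sym;
	unsigned long self;   /* period of samples taken in this symbol */
	unsigned long total;  /* period of samples with this symbol in the chain */
};

struct hist {
	struct entry entries[MAX_ENTRIES];
	int nr;
};

/* Look up the row for a symbol, creating it on first use. */
struct entry *hist_find(struct hist *h, const char *sym)
{
	for (int i = 0; i < h->nr; i++)
		if (strcmp(h->entries[i].sym, sym) == 0)
			return &h->entries[i];

	assert(h->nr < MAX_ENTRIES);
	h->entries[h->nr] = (struct entry){ .sym = sym };
	return &h->entries[h->nr++];
}

/*
 * Account one sample. chain[0] is the sampled function and
 * chain[n - 1] is the outermost caller.
 */
void hist_add_sample(struct hist *h, const char **chain, int n,
		     unsigned long period)
{
	assert(n > 0);
	hist_find(h, chain[0])->self += period;

	for (int i = 0; i < n; i++) {
		int seen = 0;

		/* Credit a function that recurses only once per sample. */
		for (int j = 0; j < i; j++)
			if (strcmp(chain[j], chain[i]) == 0)
				seen = 1;
		if (!seen)
			hist_find(h, chain[i])->total += period;
	}
}
```

Feeding it the chain a, b, c, main, __libc_start_main with some period leaves only a with a non-zero self value, while all five entries get the same total, which is the shape of the first five rows in the output above.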

I know it has some rough edges or even bugs, but I really want to
release it and get reviews.  It does not handle event groups and
annotations and it has a bug on TUI.

You can also get this series on 'perf/cumulate-v2' branch in my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Arun Sharma <asha...@fb.com>
Cc: Frederic Weisbecker <fweis...@gmail.com>

[1] https://lkml.org/lkml/2012/3/31/6

Namhyung Kim (14):
  perf tools: Consolidate __hists__add_*entry()
  perf tools: Introduce struct add_entry_iter
  perf hists: Convert hist entry functions to use struct he_stat
  perf hists: Add support for accumulated stat of hist entry
  perf hists: Check if accumulated when adding a hist entry
  perf hists: Accumulate hist entry stat based on the callchain
  perf tools: Update cpumode for each cumulative entry
  perf report: Cache cumulative callchains
  perf hists: Sort hist entries by accumulated period
  perf ui/hist: Add support to accumulated hist stat
  perf ui/browser: Add support to accumulated hist stat
  perf ui/gtk: Add support to accumulated hist stat
  perf tools: Apply percent-limit to cumulative percentage
  perf report: Add -g cumulative option

 tools/perf/Documentation/perf-report.txt |   2 +
 tools/perf/builtin-annotate.c            |   3 +-
 tools/perf/builtin-diff.c                |   3 +-
 tools/perf/builtin-report.c              | 659 ---
 tools/perf/builtin-top.c                 |   5 +-
 tools/perf/tests/hists_link.c            |   6 +-
 tools/perf/ui/browsers/hists.c           |  32 +-
 tools/perf/ui/gtk/hists.c                |  20 +
 tools/perf/ui/hist.c                     |  41 ++
 tools/perf/ui/stdio/hist.c               |   5 +
 
