Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-05 Thread Namhyung Kim
Hi Arnaldo,

On Wed, Nov 04, 2015 at 03:08:58PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 05, 2015 at 12:34:57AM +0900, Namhyung Kim escreveu:
> > Hi Arnaldo and Brendan,
> > 
> > On Wed, Nov 04, 2015 at 11:51:31AM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> > > > On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim wrote:
> > > > > Ah, makes sense.  So it'd look like
> > > 
> > > > >   $ perf report --stdio -g folded,count,info -F none -s comm
> > > > >   $ perf report --stdio -g folded,count,info -F none -s pid
> > > 
> > > > > The output would be
> > > 
> > > > >   809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
> > >  
> > > > Thanks, looks almost right: a couple of minor changes:
> > >  
> > > > 1. If perf already has the precedent of "PID:comm", instead of my
> > > > "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> > > > Doesn't make much difference to me.
>  
> > Right.  Actually I'd like to write it that way.. ;-)
> 
> Well, those are two pieces of information: "comm" and "pid", so it would
> be nice that we could take this opportunity to remove it, i.e. just
> treat it as any other field and separate it via the designated
> separator, and only show the ones specified.

So do you want to change '-s pid' to print 'PID' part only?


>  
> > > > 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> > > > I'm nervous about using space as a delimiter any more than once, since
> > > > it can also appear in comm (eg, "java main") and frames (eg,
> > > > "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> > > > Thread*)" -- that's direct from "perf script"!). I'd consider making
> > > > it a semicolon:
> 
> The C++ symbol names are the biggest challenge here for a single line in
> CSV ("comma" quoted) record :-\
>  
> > Fair enough.
> 
> > > > 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
> > >  
> > > > So the output is "value key", and key is a semicolon delimited stack
> > > > with an optional comm or PID:comm frame at the start.
> > > 
> > > Agreed, but then, we can have some sort of default and also be able to,
> > > using -F, specify what are the fields we want, and in which order, and I
> > > liked your suggestion of being able to specify "-F none" and that mean
> > > no hist line to be produced.
> > > 
> > > Likewise, the way that each callchain line should be formatted should be
> > > programmable via the command line, via the -g option, no? Then script
> > > > writers could use it in a way that doesn't require further processing,
> > > as Brendan showed.
> > 
> > Right.  So '-s <key1>[,<key2>,...] -g info' can control which info is
> > displayed along with the callchains.
> 
> So you force the same selection of fields to be used for both the
> hist_entry and the callchains?

Yes.


> 
> And why is that some of the fields will be selected via -s (comm, dso)
> and other fields will be selected via -g (count, this "info" thing)?

Because it affects how hist entries are aggregated..


> 
> Why not be flexible and allow any set of fields to be used in both
> cases, without one being tied to the other?
> 
> I.e. instead of:
> 
> -s <key1>[,<key2>,...] -g info
> 
> We use:
> 
> -s <key1>[,<key2>,...] -g <keyA>[[,<keyB>],...]

But then we need to aggregate hist entries using all of key1, key2,
keyA, keyB and so on.  Otherwise callchain info with keyA and keyB
might be stale.

If so, we need to group hist entries again using key1 and key2 only
for printing hist part.  For example, entries for (1,2,A,B) and
(1,2,C,D) should be shown as single entry for (1,2).
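
The regrouping step described above could be sketched roughly as follows.
This is only an illustration in Python, not perf code: the function name
regroup, the tuple layout, and the sample counts are all assumptions.

```python
from collections import defaultdict

def regroup(entries, n_hist_keys):
    """Collapse entries keyed by (hist keys + callchain keys) to hist keys only.

    entries: iterable of (key_tuple, count); the first n_hist_keys items of
    key_tuple are the -s keys, the rest are the extra callchain keys.
    """
    totals = defaultdict(int)
    for keys, count in entries:
        totals[keys[:n_hist_keys]] += count
    return dict(totals)

# Hypothetical counts for the (1,2,A,B) and (1,2,C,D) entries mentioned above.
entries = [(("1", "2", "A", "B"), 5), (("1", "2", "C", "D"), 3)]
merged = regroup(entries, n_hist_keys=2)
print(merged)
```

Both entries end up under the single key ("1", "2"), which is the second
grouping pass that independent key sets would make necessary.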

I think this 'info' part is only needed when hist entries are omitted
(i.e. -F none).  If so, no need to bother with new options..

> 
> If one would want to have the same set for both, then yeah, a keyword
> for that would be interesting, reusing your "info":
> 
> -s <key1>[,<key2>,...] -g info
> 
> Would mean:
> 
> -s <key1>[,<key2>,...] -g <key1>[[,<key2>],...]
> 
> With both ... equal
> 
> But "info" is way too vague, perhaps "hist_keys", or something more
> compact, like: "\-s", to reuse the semantic of regular expression groups
> (\1).

I prefer "hist_keys".


>  
> >   $ perf report -s comm,dso -g folded,count,info -F none
> >   809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry;...
>  
> > Note that the info part (swapper;[kernel.vmlinux]) is also separated
> > by a semicolon.  But I think it's ok since it's controlled by command
> > line, so script can know how many entries will be.
>  
> > > But yeah, the value is the semicolon delimited stack all the way to the
> > > comm/PID:comm if there are more than one or if the user asks it to be
> > > there via a -g keyword, all the other counts/info are just relative to
> > > that, CSV or whatever other delimiter the user asks it to, and space is
> > > not an option, as we know it can appear in the middle of a 

Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-04 Thread Arnaldo Carvalho de Melo
Em Thu, Nov 05, 2015 at 12:34:57AM +0900, Namhyung Kim escreveu:
> Hi Arnaldo and Brendan,
> 
> On Wed, Nov 04, 2015 at 11:51:31AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> > > On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> > > > Ah, makes sense.  So it'd look like
> > 
> > > >   $ perf report --stdio -g folded,count,info -F none -s comm
> > > >   $ perf report --stdio -g folded,count,info -F none -s pid
> > 
> > > > The output would be
> > 
> > > >   809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
> >  
> > > Thanks, looks almost right: a couple of minor changes:
> >  
> > > 1. If perf already has the precedent of "PID:comm", instead of my
> > > "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> > > Doesn't make much difference to me.
 
> Right.  Actually I'd like to write it that way.. ;-)

Well, those are two pieces of information: "comm" and "pid", so it would
be nice that we could take this opportunity to remove it, i.e. just
treat it as any other field and separate it via the designated
separator, and only show the ones specified.
 
> > > 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> > > I'm nervous about using space as a delimiter any more than once, since
> > > it can also appear in comm (eg, "java main") and frames (eg,
> > > "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> > > Thread*)" -- that's direct from "perf script"!). I'd consider making
> > > it a semicolon:

The C++ symbol names are the biggest challenge here for a single line in
CSV ("comma" quoted) record :-\
 
> Fair enough.

> > > 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
> >  
> > > So the output is "value key", and key is a semicolon delimited stack
> > > with an optional comm or PID:comm frame at the start.
> > 
> > Agreed, but then, we can have some sort of default and also be able to,
> > using -F, specify what are the fields we want, and in which order, and I
> > liked your suggestion of being able to specify "-F none" and that mean
> > no hist line to be produced.
> > 
> > Likewise, the way that each callchain line should be formatted should be
> > programmable via the command line, via the -g option, no? Then script
> > writers could use it in a way that doesn't require further processing,
> > as Brendan showed.
> 
> Right.  So '-s <key1>[,<key2>,...] -g info' can control which info is
> displayed along with the callchains.

So you force the same selection of fields to be used for both the
hist_entry and the callchains?

And why is that some of the fields will be selected via -s (comm, dso)
and other fields will be selected via -g (count, this "info" thing)?

Why not be flexible and allow any set of fields to be used in both
cases, without one being tied to the other?

I.e. instead of:

-s <key1>[,<key2>,...] -g info

We use:

-s <key1>[,<key2>,...] -g <keyA>[[,<keyB>],...]

If one would want to have the same set for both, then yeah, a keyword
for that would be interesting, reusing your "info":

-s <key1>[,<key2>,...] -g info

Would mean:

-s <key1>[,<key2>,...] -g <key1>[[,<key2>],...]

With both ... equal

But "info" is way too vague, perhaps "hist_keys", or something more
compact, like: "\-s", to reuse the semantic of regular expression groups
(\1).
 
>   $ perf report -s comm,dso -g folded,count,info -F none
>   809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry;...
 
> Note that the info part (swapper;[kernel.vmlinux]) is also separated
> by a semicolon.  But I think it's ok since it's controlled by command
> line, so script can know how many entries will be.
 
> > But yeah, the value is the semicolon delimited stack all the way to the
> > comm/PID:comm if there are more than one or if the user asks it to be
> > there via a -g keyword, all the other counts/info are just relative to
> > that, CSV or whatever other delimiter the user asks it to, and space is
> > not an option, as we know it can appear in the middle of a COMM:
> 
> Yes, I think that we should use a given separator (using -t option)
> instead of hard-coded semicolon.  Although it'd be rare, it seems
> possible to use semicolons in the comm name too.

Well, we can have an option to specify what would be the separator for
the callchains.

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-04 Thread Namhyung Kim
Hi Arnaldo and Brendan,

On Wed, Nov 04, 2015 at 11:51:31AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> > On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> > > Ah, makes sense.  So it'd look like
> 
> > >   $ perf report --stdio -g folded,count,info -F none -s comm
> > >   $ perf report --stdio -g folded,count,info -F none -s pid
> 
> > > The output would be
> 
> > >   809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
>  
> > Thanks, looks almost right: a couple of minor changes:
>  
> > 1. If perf already has the precedent of "PID:comm", instead of my
> > "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> > Doesn't make much difference to me.

Right.  Actually I'd like to write it that way.. ;-)


> > 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> > I'm nervous about using space as a delimiter any more than once, since
> > it can also appear in comm (eg, "java main") and frames (eg,
> > "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> > Thread*)" -- that's direct from "perf script"!). I'd consider making
> > it a semicolon:

Fair enough.


>  
> > 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
>  
> > So the output is "value key", and key is a semicolon delimited stack
> > with an optional comm or PID:comm frame at the start.
> 
> Agreed, but then, we can have some sort of default and also be able to,
> using -F, specify what are the fields we want, and in which order, and I
> liked your suggestion of being able to specify "-F none" and that mean
> no hist line to be produced.
> 
> Likewise, the way that each callchain line should be formatted should be
> programmable via the command line, via the -g option, no? Then script
> writers could use it in a way that doesn't require further processing,
> as Brendan showed.

Right.  So '-s <key1>[,<key2>,...] -g info' can control which info is
displayed along with the callchains.

  $ perf report -s comm,dso -g folded,count,info -F none
  809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry;...

Note that the info part (swapper;[kernel.vmlinux]) is also separated
by a semicolon.  But I think it's ok since it's controlled by command
line, so script can know how many entries will be.
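
A script consuming this output might look something like the sketch below.
It is Python for illustration only; parse_folded and its n_info parameter
are hypothetical names, and the sample line is the one above with the
elided frames left out.

```python
def parse_folded(line, n_info, sep=";"):
    """Split one folded line into (count, info fields, stack frames).

    n_info is the number of sort keys given with -s, which the script
    knows because it built the command line itself.
    """
    count_str, key = line.strip().split(" ", 1)
    fields = key.split(sep)
    return int(count_str), fields[:n_info], fields[n_info:]

# Shortened sample: only the frames visible in the example are kept.
line = "809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry"
count, info, stack = parse_folded(line, n_info=2)
print(count, info, stack)
```

With "-s comm,dso" the script passes n_info=2, so the first two fields are
taken as the info part and the rest as the callchain.  A field that itself
contains the separator would still be split incorrectly.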


> 
> But yeah, the value is the semicolon delimited stack all the way to the
> comm/PID:comm if there are more than one or if the user asks it to be
> there via a -g keyword, all the other counts/info are just relative to
> that, CSV or whatever other delimiter the user asks it to, and space is
> not an option, as we know it can appear in the middle of a COMM:

Yes, I think that we should use a given separator (using -t option)
instead of hard-coded semicolon.  Although it'd be rare, it seems
possible to use semicolons in the comm name too.

Thanks,
Namhyung


> 
> [root@zoo ~]# perf report -s comm | grep '[a-zA-Z] [a-zA-Z]'
> # To display the perf.data header info, please use
> # --header/--header-only options.
> # Total Lost Samples: 0
> # Samples: 164K of event 'cycles:pp'
> # Event count (approx.): 34422160859
>  0.11%  DOM Worker 
>  0.10%  JS Helper  
>  0.01%  Qt bearer threa
>  0.00%  Socket Thread  
>  0.00%  dconf worker   
>  0.00%  JS Watchdog
> [root@zoo ~]#
> 
> - Arnaldo


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-04 Thread Arnaldo Carvalho de Melo
Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> > Ah, makes sense.  So it'd look like

> >   $ perf report --stdio -g folded,count,info -F none -s comm
> >   $ perf report --stdio -g folded,count,info -F none -s pid

> > The output would be

> >   809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
 
> Thanks, looks almost right: a couple of minor changes:
 
> 1. If perf already has the precedent of "PID:comm", instead of my
> "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> Doesn't make much difference to me.
> 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> I'm nervous about using space as a delimiter any more than once, since
> it can also appear in comm (eg, "java main") and frames (eg,
> "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> Thread*)" -- that's direct from "perf script"!). I'd consider making
> it a semicolon:
 
> 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
 
> So the output is "value key", and key is a semicolon delimited stack
> with an optional comm or PID:comm frame at the start.

Agreed, but then, we can have some sort of default and also be able to,
using -F, specify what are the fields we want, and in which order, and I
liked your suggestion of being able to specify "-F none" and that mean
no hist line to be produced.

Likewise, the way that each callchain line should be formatted should be
programmable via the command line, via the -g option, no? Then script
writers could use it in a way that doesn't require further processing,
as Brendan showed.

But yeah, the value is the semicolon delimited stack all the way to the
comm/PID:comm if there are more than one or if the user asks it to be
there via a -g keyword, all the other counts/info are just relative to
that, CSV or whatever other delimiter the user asks it to, and space is
not an option, as we know it can appear in the middle of a COMM:

[root@zoo ~]# perf report -s comm | grep '[a-zA-Z] [a-zA-Z]'
# To display the perf.data header info, please use
# --header/--header-only options.
# Total Lost Samples: 0
# Samples: 164K of event 'cycles:pp'
# Event count (approx.): 34422160859
 0.11%  DOM Worker 
 0.10%  JS Helper  
 0.01%  Qt bearer threa
 0.00%  Socket Thread  
 0.00%  dconf worker   
 0.00%  JS Watchdog
[root@zoo ~]#
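
To make the problem concrete, here is a small Python sketch of what a
consumer script would see.  The comm "DOM Worker" comes from the listing
above; the count, the frame names, and the helper split_folded are made up
for illustration.

```python
def split_folded(line, sep=";"):
    """Split 'value key' at the first space only, then split the key on sep."""
    value, key = line.split(" ", 1)
    return int(value), key.split(sep)

# Hypothetical folded line with a comm that contains a space.
line = "12 DOM Worker;main;run"

naive = line.split()                # whitespace splitting breaks the comm apart
value, frames = split_folded(line)  # only the first space is a delimiter
print(naive)
print(value, frames)
```

Splitting on every space yields three tokens and cuts the comm in half,
while treating only the first space as the delimiter keeps "DOM Worker"
intact as the first frame.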

- Arnaldo


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-04 Thread Arnaldo Carvalho de Melo
Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> > Ah, makes sense.  So it'd look like

> >   $ perf report --stdio -g folded,count,info -F none -s comm
> >   $ perf report --stdio -g folded,count,info -F none -s pid

> > The output would be

> >   809 swapper-0 
> > cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
 
> Thanks, looks almost right: a couple of minor changes:
 
> 1. If perf already has the precedent of "PID:comm", instead of my
> "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> Doesn't make much difference to me.
> 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> I'm nervous about using space as a delimiter any more than once, since
> it can also appear in comm (eg, "java main") and frames (eg,
> "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> Thread*)" -- that's direct from "perf script"!). I'd consider making
> it a semicolon:
 
> 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
 
> So the output is "value key", and key is a semicolon delimited stack
> with an optional comm or PID:comm frame at the start.

Agreed, but then, we can have some sort of default and also be able to,
using -F, specify what are the fields we want, and in which order, and I
liked your suggestion of being able to specify "-F none" and that mean
no hist line to be produced.

Likewise, the way that each callchain line should be formatted should be
programmable via the command line, via the -g option, no? Then script
writers could use it in a way that doesn't requires further processing,
as Brendan showed.

But yeah, the value is the semicolon delimited stack all the way to the
comm/PID:comm if there are more than one or if the user asks it to be
there via a -g keyword, all the other counts/info are just relative to
that, CSV or whatever other delimiter the user asks it to, and space is
not an option, as we know it can appear in the middle of a COMM:

[root@zoo ~]# perf report -s comm | grep '[a-zA-Z] [a-zA-Z]'
# To display the perf.data header info, please use
# --header/--header-only options.
# Total Lost Samples: 0
# Samples: 164K of event 'cycles:pp'
# Event count (approx.): 34422160859
 0.11%  DOM Worker 
 0.10%  JS Helper  
 0.01%  Qt bearer threa
 0.00%  Socket Thread  
 0.00%  dconf worker   
 0.00%  JS Watchdog
[root@zoo ~]#

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-04 Thread Namhyung Kim
Hi Arnaldo and Brendan,

On Wed, Nov 04, 2015 at 11:51:31AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> > On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> > > Ah, makes sense.  So it'd look like
> 
> > >   $ perf report --stdio -g folded,count,info -F none -s comm
> > >   $ perf report --stdio -g folded,count,info -F none -s pid
> 
> > > The output would be
> 
> > >   809 swapper-0 
> > > cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
>  
> > Thanks, looks almost right: a couple of minor changes:
>  
> > 1. If perf already has the precedent of "PID:comm", instead of my
> > "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> > Doesn't make much difference to me.

Right.  Actually I'd like to write it that way.. ;-)


> > 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> > I'm nervous about using space as a delimiter any more than once, since
> > it can also appear in comm (eg, "java main") and frames (eg,
> > "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> > Thread*)" -- that's direct from "perf script"!). I'd consider making
> > it a semicolon:

Fair enough.


>  
> > 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
>  
> > So the output is "value key", and key is a semicolon delimited stack
> > with an optional comm or PID:comm frame at the start.
> 
> Agreed, but then, we can have some sort of default and also be able to,
> using -F, specify what are the fields we want, and in which order, and I
> liked your suggestion of being able to specify "-F none" and that mean
> no hist line to be produced.
> 
> Likewise, the way that each callchain line should be formatted should be
> programmable via the command line, via the -g option, no? Then script
> writers could use it in a way that doesn't requires further processing,
> as Brendan showed.

Right.  So '-s [,,...] -g info' can control which info is
displayed along with the callchains.

  $ perf report -s comm,dso -g folded,count,info -F none
  809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry;...

Note that the info part (swapper;[kernel.vmlinux]) is also separated
by a semicolon.  But I think it's ok since it's controlled by command
line, so script can know how many entries will be.


> 
> But yeah, the value is the semicolon delimited stack all the way to the
> comm/PID:comm if there are more than one or if the user asks it to be
> there via a -g keyword, all the other counts/info are just relative to
> that, CSV or whatever other delimiter the user asks it to, and space is
> not an option, as we know it can appear in the middle of a COMM:

Yes, I think that we should use a given separator (using -t option)
instead of hard-coded semicolon.  Although it'd be rare, it seems
possible to use semicolons in the comm name too.

Thanks,
Namhyung


> 
> [root@zoo ~]# perf report -s comm | grep '[a-zA-Z] [a-zA-Z]'
> # To display the perf.data header info, please use
> # --header/--header-only options.
> # Total Lost Samples: 0
> # Samples: 164K of event 'cycles:pp'
> # Event count (approx.): 34422160859
>  0.11%  DOM Worker 
>  0.10%  JS Helper  
>  0.01%  Qt bearer threa
>  0.00%  Socket Thread  
>  0.00%  dconf worker   
>  0.00%  JS Watchdog
> [root@zoo ~]#
> 
> - Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-04 Thread Arnaldo Carvalho de Melo
Em Thu, Nov 05, 2015 at 12:34:57AM +0900, Namhyung Kim escreveu:
> Hi Arnaldo and Brendan,
> 
> On Wed, Nov 04, 2015 at 11:51:31AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Nov 03, 2015 at 10:02:32PM -0800, Brendan Gregg escreveu:
> > > On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> > > > Ah, makes sense.  So it'd look like
> > 
> > > >   $ perf report --stdio -g folded,count,info -F none -s comm
> > > >   $ perf report --stdio -g folded,count,info -F none -s pid
> > 
> > > > The output would be
> > 
> > > >   809 swapper-0 
> > > > cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
> >  
> > > Thanks, looks almost right: a couple of minor changes:
> >  
> > > 1. If perf already has the precedent of "PID:comm", instead of my
> > > "comm-PID", then maybe it should use "PID:comm" for perf consistency.
> > > Doesn't make much difference to me.
 
> Right.  Actually I'd like to write it that way.. ;-)

Well, those are two pieces of information: "comm" and "pid", so it would
be nice that we could take this opportunity to remove it, i.e. just
treat it as any other field and separate it via the designated
separator, and only show the ones specified.
 
> > > 2. The second space, delimiting "PID:comm" (or comm) and the stack...
> > > I'm nervous about using space as a delimiter any more than once, since
> > > it can also appear in comm (eg, "java main") and frames (eg,
> > > "JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
> > > Thread*)" -- that's direct from "perf script"!). I'd consider making
> > > it a semicolon:

The C++ symbol names are the biggest challenge here for a single line in
CSV ("comma" quoted) record :-\
 
> Fair enough.

> > > 809 swapper-0;cpu_bringup_and_idle;cpu_startup_entry;...
> >  
> > > So the output is "value key", and key is a semicolon delimited stack
> > > with an optional comm or PID:comm frame at the start.
> > 
> > Agreed, but then, we can have some sort of default and also be able to,
> > using -F, specify what are the fields we want, and in which order, and I
> > liked your suggestion of being able to specify "-F none" and that mean
> > no hist line to be produced.
> > 
> > Likewise, the way that each callchain line should be formatted should be
> > programmable via the command line, via the -g option, no? Then script
> > writers could use it in a way that doesn't requires further processing,
> > as Brendan showed.
> 
> Right.  So '-s [,,...] -g info' can control which info is
> displayed along with the callchains.

So you force the same selection of fields to be used for both the
hist_entry and the callchains?

And why is that some of the fields will be selected via -s (comm, dso)
and other fields will be selected via -g (count, this "info" thing)?

Why not be flexible and allow any set of fields to be used in both
cases, without one being tied to the other?

I.e. instead of:

-s [,,...] -g info

We use:

-s [,,...] -g [[,],...]

If one would want to have the same set for both, then yeah, a keyword
for that would be interesting, reusing your "info":

-s [,,...] -g info

Would mean:

-s [,,...] -g [[,],...]

With both ... equal

But "info" is way too vague, perhaps "hist_keys", or something more
compact, like: "\-s", to reuse the semantic of regular expression groups
(\1).
 
>   $ perf report -s comm,dso -g folded,count,info -F none
>   809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry;...
 
> Note that the info part (swapper;[kernel.vmlinux]) is also separated
> by a semicolon.  But I think it's ok since it's controlled by command
> line, so script can know how many entries will be.
 
> > But yeah, the value is the semicolon delimited stack all the way to the
> > comm/PID:comm if there are more than one or if the user asks it to be
> > there via a -g keyword, all the other counts/info are just relative to
> > that, CSV or whatever other delimiter the user asks it to, and space is
> > not an option, as we know it can appear in the middle of a COMM:
> 
> Yes, I think that we should use a given separator (using -t option)
> instead of hard-coded semicolon.  Although it'd be rare, it seems
> possible to use semicolons in the comm name too.

Well, we can have an option to specify what would be the separator for
the callchains.
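
To make the parsing side of this concrete, here is a minimal sketch of how a script might consume one folded line, assuming the "count, space, info fields, frames" layout from the example quoted above. The helper name parse_folded and its parameters are hypothetical, and the second input line (including the symbol do_idle) is invented purely to show the failure mode:

```python
# Sketch only: split one folded line into (count, info fields, stack frames).
# n_info must match the number of -s keys; sep is whatever separator was used.
def parse_folded(line, n_info, sep=";"):
    count, rest = line.strip().split(" ", 1)  # only the first space delimits
    parts = rest.split(sep)
    return int(count), parts[:n_info], parts[n_info:]


line = "809 swapper;[kernel.vmlinux];cpu_bringup_and_idle;cpu_startup_entry"
print(parse_folded(line, n_info=2))

# A comm that itself contains the separator shifts every later field,
# which is the argument for making the separator selectable.
bad = parse_folded("3 my;comm;[kernel.vmlinux];do_idle", n_info=2)
print(bad)
```

The first call splits cleanly because the script knows there are two info entries. The second silently yields the wrong comm and dso, and a fixed field count alone cannot detect that.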

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Brendan Gregg
On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> Hi Brendan,
>
> On Tue, Nov 03, 2015 at 01:33:43PM -0800, Brendan Gregg wrote:
>> On Tue, Nov 3, 2015 at 6:40 AM, Arnaldo Carvalho de Melo
>>  wrote:
>> > On Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim wrote:
>> >> Hello,
>> >>
>> >> This is what Brendan requested on the perf-users mailing list [1] to
>> >> support FlameGraphs [2] more efficiently.  This patchset adds a few
>> >> more callchain options to adjust the output for it.
>> >>
>> >>  * changes in v4)
>> >>   - add missing doc update
>> >>   - cleanup/fix callchain value print code
>> >>   - add Acked-by from Brendan and Jiri
>> >
>> > Do those Acked-by stand? Things changed, the values moved from the end
>> > of the line to the start, etc.
>> >
>> [...]
>>
>> I'd Ack this change as it's a useful addition. It doesn't quite
>> address the folded-only output, but it's a step in that direction. I
>> think having the value at the start of a line only makes sense for the
>> perf report output containing the hist summary lines, for consistency.
>
> Right, thanks!
>
>
>>
>> Here's how I'd shuffle the output of this patch (ignore word wrap
>> issues with this email):
>>
>> # ./perf report --stdio -g folded,count,caller -F pid | \
>> awk '/^ / { n = $1 }
>> /^[0-9]/ { split(n,a,":"); print a[2] "-" a[1] ";" $2,$1 }'
>> swapper-0;cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 809
>> swapper-0;xen_start_kernel;x86_64_start_reservations;start_kernel;rest_init;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 135
>> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;check_events;xen_hypercall_xen_version 63
>> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf 54
>> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;memset_erms 3
>> dd-30551;xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version 3
>>
>> So the output is folded stacks, prefixed by comm-PID. Shuffling the
>> summarized output is a lot better than doing a "perf script" dump and
>> re-processing call chains. (Note that since I'm using -F, I didn't
>> need --no-children;
>
> Nope.  The '-F pid' doesn't affect --children.  It doesn't show the
> children overhead column but we still have hist entries for
> (synthesized) children..
>
>   $ perf report --no-children | wc -l
>   998
>
>   $ perf report --no-children -F pid,dso,sym | wc -l
>   998
>
>   $ perf report --children | wc -l
>   3229
>
>   $ perf report --children -F pid,dso,sym | wc -l
>   3202
>
> So I think you still need to use --no-children (or set report.children
> config variable to false) for your script.

Ok, good to know, thanks.

>
>
>> and with "-g count", I didn't need --show-nr-samples.)
>
> Yes, I used -n/--show-nr-samples just to check the number is correct.
>
>
>>
>> I notice the fields (-F) option already has this precedent:
>>
>> - "comm": prints PID:comm
>> - "pid": prints PID
>
> It's the opposite:  "comm" prints comm, "pid" prints PID:comm. :)

Ah, right, sorry, I'd typed those the wrong way around. :)

>
>
>>
>> If these were added to -g, along with a no-hists, then the two types
>> of folded-only output could be generated using:
>>
>> perf report --stdio -g folded,count,comm,no-hists,caller
>> perf report --stdio -g folded,count,pid,no-hists,caller
>
> As I said, using fields like comm or pid requires having the same keys
> in the --sort option.  So it's basically unreliable to use those specific
> field names in the -g option IMHO.  I suggested using 'info' (yes, it
> needs a better name) to print all sort keys.
>
>
>>
>> ... although "no-hists" doesn't hit me as intuitive. How about "-F
>> none" to specify zero columns? ie:
>>
>> perf report --stdio -g folded,count,comm,caller -F none
>> perf report --stdio -g folded,count,pid,caller -F none
>
> Ah, makes sense.  So it'd look like
>
>   $ perf report --stdio -g folded,count,info -F none -s comm
>   $ perf report --stdio -g folded,count,info -F none -s pid
>
> The output would be
>
>   809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
>

Thanks, looks almost right: a couple of minor changes:

1. If perf already has the precedent of "PID:comm", instead of my
"comm-PID", then maybe it should use "PID:comm" for perf consistency.
Doesn't make much difference to me.
2. The second space, delimiting "PID:comm" (or comm) and the stack...
I'm nervous about using space as a delimiter any more than once, since
it can also appear in comm (eg, "java main") and frames (eg,
"JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
Thread*)" -- that's direct from "perf script"!). I'd consider 

Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Namhyung Kim
Hi Brendan,

On Tue, Nov 03, 2015 at 01:33:43PM -0800, Brendan Gregg wrote:
> On Tue, Nov 3, 2015 at 6:40 AM, Arnaldo Carvalho de Melo
>  wrote:
> > On Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim wrote:
> >> Hello,
> >>
> >> This is what Brendan requested on the perf-users mailing list [1] to
> >> support FlameGraphs [2] more efficiently.  This patchset adds a few
> >> more callchain options to adjust the output for it.
> >>
> >>  * changes in v4)
> >>   - add missing doc update
> >>   - cleanup/fix callchain value print code
> >>   - add Acked-by from Brendan and Jiri
> >
> > Do those Acked-by stand? Things changed, the values moved from the end
> > of the line to the start, etc.
> >
> [...]
> 
> I'd Ack this change as it's a useful addition. It doesn't quite
> address the folded-only output, but it's a step in that direction. I
> think having the value at the start of a line only makes sense for the
> perf report output containing the hist summary lines, for consistency.

Right, thanks!


> 
> Here's how I'd shuffle the output of this patch (ignore word wrap
> issues with this email):
> 
> # ./perf report --stdio -g folded,count,caller -F pid | \
> awk '/^ / { n = $1 }
> /^[0-9]/ { split(n,a,":"); print a[2] "-" a[1] ";" $2,$1 }'
> swapper-0;cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 809
> swapper-0;xen_start_kernel;x86_64_start_reservations;start_kernel;rest_init;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 135
> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;check_events;xen_hypercall_xen_version 63
> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf 54
> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;memset_erms 3
> dd-30551;xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version 3
> 
> So the output is folded stacks, prefixed by comm-PID. Shuffling the
> summarized output is a lot better than doing a "perf script" dump and
> re-processing call chains. (Note that since I'm using -F, I didn't
> need --no-children;

Nope.  The '-F pid' doesn't affect --children.  It doesn't show the
children overhead column but we still have hist entries for
(synthesized) children..

  $ perf report --no-children | wc -l
  998

  $ perf report --no-children -F pid,dso,sym | wc -l
  998

  $ perf report --children | wc -l
  3229

  $ perf report --children -F pid,dso,sym | wc -l
  3202

So I think you still need to use --no-children (or set report.children
config variable to false) for your script.


> and with "-g count", I didn't need --show-nr-samples.)

Yes, I used -n/--show-nr-samples just to check the number is correct.


> 
> I notice the fields (-F) option already has this precedent:
> 
> - "comm": prints PID:comm
> - "pid": prints PID

It's the opposite:  "comm" prints comm, "pid" prints PID:comm. :)


> 
> If these were added to -g, along with a no-hists, then the two types
> of folded-only output could be generated using:
> 
> perf report --stdio -g folded,count,comm,no-hists,caller
> perf report --stdio -g folded,count,pid,no-hists,caller

As I said, using fields like comm or pid requires having the same keys
in the --sort option.  So it's basically unreliable to use those specific
field names in the -g option IMHO.  I suggested using 'info' (yes, it
needs a better name) to print all sort keys.


> 
> ... although "no-hists" doesn't hit me as intuitive. How about "-F
> none" to specify zero columns? ie:
> 
> perf report --stdio -g folded,count,comm,caller -F none
> perf report --stdio -g folded,count,pid,caller -F none

Ah, makes sense.  So it'd look like

  $ perf report --stdio -g folded,count,info -F none -s comm
  $ perf report --stdio -g folded,count,info -F none -s pid

The output would be

  809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op

Thoughts?

Thanks,
Namhyung


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Brendan Gregg
On Tue, Nov 3, 2015 at 6:40 AM, Arnaldo Carvalho de Melo
 wrote:
> On Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim wrote:
>> Hello,
>>
>> This is what Brendan requested on the perf-users mailing list [1] to
>> support FlameGraphs [2] more efficiently.  This patchset adds a few
>> more callchain options to adjust the output for it.
>>
>>  * changes in v4)
>>   - add missing doc update
>>   - cleanup/fix callchain value print code
>>   - add Acked-by from Brendan and Jiri
>
> Do those Acked-by stand? Things changed, the values moved from the end
> of the line to the start, etc.
>
[...]

I'd Ack this change as it's a useful addition. It doesn't quite
address the folded-only output, but it's a step in that direction. I
think having the value at the start of a line only makes sense for the
perf report output containing the hist summary lines, for consistency.

Here's how I'd shuffle the output of this patch (ignore word wrap
issues with this email):

# ./perf report --stdio -g folded,count,caller -F pid | \
awk '/^ / { n = $1 }
/^[0-9]/ { split(n,a,":"); print a[2] "-" a[1] ";" $2,$1 }'
swapper-0;cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 809
swapper-0;xen_start_kernel;x86_64_start_reservations;start_kernel;rest_init;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 135
dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;check_events;xen_hypercall_xen_version 63
dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf 54
dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;memset_erms 3
dd-30551;xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version 3

So the output is folded stacks, prefixed by comm-PID. Shuffling the
summarized output is a lot better than doing a "perf script" dump and
re-processing call chains. (Note that since I'm using -F, I didn't
need --no-children; and with "-g count", I didn't need
--show-nr-samples.)
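
For readers who find the awk hard to follow, the same shuffle could be sketched in Python roughly as below. The input layout is an assumption inferred from the two awk patterns (hist entry lines start with a space and carry "PID:comm" in the first field; folded lines start with the count), and the function name shuffle and the sample lines are illustrative only:

```python
# Rough Python equivalent of the awk one-liner above (a sketch, not a tool).
def shuffle(lines):
    current = None  # last "PID:comm" seen on a hist entry line
    out = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if line.startswith(" "):
            # hist entry line: remember its first field, e.g. "0:swapper"
            current = fields[0]
        elif line[0].isdigit() and current is not None and len(fields) > 1:
            # folded line: "<count> <frame;frame;...>"
            pid, _, comm = current.partition(":")
            out.append(f"{comm}-{pid};{fields[1]} {fields[0]}")
    return out


sample = [
    "     0:swapper",
    "809 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call",
    " 30551:dd",
    "3 xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version",
]
for row in shuffle(sample):
    print(row)
```

Like the awk version, this takes only the second whitespace-separated field as the stack, so a frame containing a space would be cut short.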

I notice the fields (-F) option already has this precedent:

- "comm": prints PID:comm
- "pid": prints PID

If these were added to -g, along with a no-hists, then the two types
of folded-only output could be generated using:

perf report --stdio -g folded,count,comm,no-hists,caller
perf report --stdio -g folded,count,pid,no-hists,caller

... although "no-hists" doesn't hit me as intuitive. How about "-F
none" to specify zero columns? ie:

perf report --stdio -g folded,count,comm,caller -F none
perf report --stdio -g folded,count,pid,caller -F none

Brendan


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Arnaldo Carvalho de Melo
On Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim wrote:
> Hello,
> 
> This is what Brendan requested on the perf-users mailing list [1] to
> support FlameGraphs [2] more efficiently.  This patchset adds a few
> more callchain options to adjust the output for it.
> 
>  * changes in v4)
>   - add missing doc update
>   - cleanup/fix callchain value print code
>   - add Acked-by from Brendan and Jiri

Do those Acked-by stand? Things changed, the values moved from the end
of the line to the start, etc.

You said you would consider having a --no-hists, but I see nothing about
it in this patchkit.

Some more comments below.

- Arnaldo

>  * changes in v3)
>   - put the value before callchains
>   - fix compile error
> 
> 
> First, a 'folded' output mode was added.  The folded output puts the
> value, a space and all callchain nodes separated by semicolons.  For now
> it only supports --stdio, as the other UIs provide some way of folding
> and/or expanding callchains dynamically.
> 
> The value can now be one of 'percent', 'period', or 'count'.  The
> percent is the current default output and the period is the raw number
> of sample periods.  The count is the number of samples for each callchain.
> 
> Here's an example:
> 
>   $ perf report --no-children --show-nr-samples --stdio -g folded,count
>   ...
> 39.93% 80  swapper  [kernel.vmlinux]  [k] intel_idle
>   57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
>   23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
> 
> 
>   $ perf report --no-children --stdio -g percent

So, in this first one you show the percent in both

>   ...
> 39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
> |
> ---intel_idle
>    cpuidle_enter_state
>    cpuidle_enter
>    call_cpuidle
>    cpu_startup_entry
>    |
>    |--28.63%-- start_secondary
>    |
>     --11.30%-- rest_init
> 
> 
>   $ perf report --no-children --stdio --show-total-period -g period
>   ...

then here you _add_ the period to the hist_entry line, but...

> 39.93%   13018705  swapper  [kernel.vmlinux]  [k] intel_idle
> |
> ---intel_idle
>    cpuidle_enter_state
>    cpuidle_enter
>    call_cpuidle
>    cpu_startup_entry
>    |

_replace_ the percentage with the period in the callchains.

Can't we have the same effect in both? I.e. I would expect the 39.93% to
simply be replaced with that 13018705.

>    |--9334403-- start_secondary
>    |
>     --3684302-- rest_init
> 
> 
>   $ perf report --no-children --stdio --show-nr-samples -g count
>   ...
> 39.93% 80  swapper  [kernel.vmlinux]  [k] intel_idle

Ditto for count

> |
> ---intel_idle
>    cpuidle_enter_state
>    cpuidle_enter
>    call_cpuidle
>    cpu_startup_entry
>    |
>    |--57-- start_secondary
>    |
>     --23-- rest_init
> 
> 
> You can get it from 'perf/callchain-fold-v4' branch on my tree:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Any comments are welcome, thanks
> Namhyung
> 
> 
> [1] http://www.spinics.net/lists/linux-perf-users/msg02498.html
> [2] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
> 
> 
> Namhyung Kim (4):
>   perf report: Support folded callchain mode on --stdio
>   perf callchain: Abstract callchain print function
>   perf callchain: Add count fields to struct callchain_node
>   perf report: Add callchain value option
> 
>  tools/perf/Documentation/perf-report.txt | 13 +++--
>  tools/perf/builtin-report.c  |  4 +-
>  tools/perf/ui/browsers/hists.c   |  8 +--
>  tools/perf/ui/gtk/hists.c|  8 +--
>  tools/perf/ui/stdio/hist.c   | 93 ++--
>  tools/perf/util/callchain.c  | 87 +-
>  tools/perf/util/callchain.h  | 24 -
>  tools/perf/util/util.c   |  3 +-
>  8 files changed, 205 insertions(+), 35 deletions(-)
> 
> -- 
> 2.6.2


[PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Namhyung Kim
Hello,

This is what Brendan requested on the perf-users mailing list [1] to
support FlameGraphs [2] more efficiently.  This patchset adds a few
more callchain options to adjust the output for it.

 * changes in v4)
  - add missing doc update
  - cleanup/fix callchain value print code
  - add Acked-by from Brendan and Jiri

 * changes in v3)
  - put the value before callchains
  - fix compile error


First, a 'folded' output mode was added.  The folded output puts the
value, a space and all callchain nodes separated by semicolons.  For now
it only supports --stdio, as the other UIs provide some way of folding
and/or expanding callchains dynamically.

The value can now be one of 'percent', 'period', or 'count'.  The
percent is the current default output and the period is the raw number
of sample periods.  The count is the number of samples for each callchain.

Here's an example:

  $ perf report --no-children --show-nr-samples --stdio -g folded,count
  ...
39.93% 80  swapper  [kernel.vmlinux]  [k] intel_idle
  57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
  23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...


  $ perf report --no-children --stdio -g percent
  ...
39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
|
---intel_idle
   cpuidle_enter_state
   cpuidle_enter
   call_cpuidle
   cpu_startup_entry
   |
   |--28.63%-- start_secondary
   |
    --11.30%-- rest_init


  $ perf report --no-children --stdio --show-total-period -g period
  ...
39.93%   13018705  swapper  [kernel.vmlinux]  [k] intel_idle
|
---intel_idle
   cpuidle_enter_state
   cpuidle_enter
   call_cpuidle
   cpu_startup_entry
   |
   |--9334403-- start_secondary
   |
    --3684302-- rest_init


  $ perf report --no-children --stdio --show-nr-samples -g count
  ...
39.93% 80  swapper  [kernel.vmlinux]  [k] intel_idle
|
---intel_idle
   cpuidle_enter_state
   cpuidle_enter
   call_cpuidle
   cpu_startup_entry
   |
   |--57-- start_secondary
   |
    --23-- rest_init
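
The three value types in the examples above describe the same two branches, which can be cross-checked with a few lines of Python. The numbers are copied from the output above; the proportional formula is only a reading of those numbers and is not taken from the perf source:

```python
# Cross-check of the percent / period / count examples (a sketch).
hist_percent = 39.93    # overhead shown on the hist entry line
hist_period = 13018705  # from --show-total-period
hist_count = 80         # from --show-nr-samples

branches = {            # name: (period, count) as printed with -g period / -g count
    "start_secondary": (9334403, 57),
    "rest_init": (3684302, 23),
}


def branch_percent(period):
    # Assumption: a branch's percent is the entry's percent scaled by its
    # share of the entry's period.
    return round(hist_percent * period / hist_period, 2)


for name, (period, count) in branches.items():
    print(f"{name}: percent={branch_percent(period):.2f} "
          f"period={period} count={count}")

# The branch periods and counts add up to the hist entry totals.
print(sum(p for p, _ in branches.values()) == hist_period)
print(sum(c for _, c in branches.values()) == hist_count)
```

Under that assumption the computed percentages come out as 28.63 and 11.30, matching the -g percent output.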


You can get it from 'perf/callchain-fold-v4' branch on my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks
Namhyung


[1] http://www.spinics.net/lists/linux-perf-users/msg02498.html
[2] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html


Namhyung Kim (4):
  perf report: Support folded callchain mode on --stdio
  perf callchain: Abstract callchain print function
  perf callchain: Add count fields to struct callchain_node
  perf report: Add callchain value option

 tools/perf/Documentation/perf-report.txt | 13 +++--
 tools/perf/builtin-report.c  |  4 +-
 tools/perf/ui/browsers/hists.c   |  8 +--
 tools/perf/ui/gtk/hists.c|  8 +--
 tools/perf/ui/stdio/hist.c   | 93 ++--
 tools/perf/util/callchain.c  | 87 +-
 tools/perf/util/callchain.h  | 24 -
 tools/perf/util/util.c   |  3 +-
 8 files changed, 205 insertions(+), 35 deletions(-)

-- 
2.6.2


Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Brendan Gregg
On Tue, Nov 3, 2015 at 5:54 PM, Namhyung Kim  wrote:
> Hi Brendan,
>
> On Tue, Nov 03, 2015 at 01:33:43PM -0800, Brendan Gregg wrote:
>> On Tue, Nov 3, 2015 at 6:40 AM, Arnaldo Carvalho de Melo wrote:
>> > On Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim wrote:
>> >> Hello,
>> >>
>> >> This is what Brendan requested on the perf-users mailing list [1] to
>> >> support FlameGraphs [2] more efficiently.  This patchset adds a few
>> >> more callchain options to adjust the output for it.
>> >>
>> >>  * changes in v4)
>> >>   - add missing doc update
>> >>   - cleanup/fix callchain value print code
>> >>   - add Acked-by from Brendan and Jiri
>> >
>> > Do those Acked-by stand? Things changed, the values moved from the end
>> > of the line to the start, etc.
>> >
>> [...]
>>
>> I'd Ack this change as it's a useful addition. It doesn't quite
>> address the folded-only output, but it's a step in that direction. I
>> think having the value at the start of a line only makes sense for the
>> perf report output containing the hist summary lines, for consistency.
>
> Right, thanks!
>
>
>>
>> Here's how I'd shuffle the output of this patch (ignore word wrap
>> issues with this email):
>>
>> # ./perf report --stdio -g folded,count,caller -F pid | \
>> awk '/^ / { n = $1 }
>> /^[0-9]/ { split(n,a,":"); print a[2] "-" a[1] ";" $2,$1 }'
>> swapper-0;cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 809
>> swapper-0;xen_start_kernel;x86_64_start_reservations;start_kernel;rest_init;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op 135
>> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;check_events;xen_hypercall_xen_version 63
>> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf 54
>> dd-30551;__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;memset_erms 3
>> dd-30551;xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version 3
>>
>> So the output is folded stacks, prefixed by comm-PID. Shuffling the
>> summarized output is a lot better than doing a "perf script" dump and
>> re-processing call chains. (Note that since I'm using -F, I didn't
>> need --no-children;
>
> Nope.  The '-F pid' doesn't affect --children.  It doesn't show the
> children overhead column but we still have hist entries for
> (synthesized) children..
>
>   $ perf report --no-children | wc -l
>   998
>
>   $ perf report --no-children -F pid,dso,sym | wc -l
>   998
>
>   $ perf report --children | wc -l
>   3229
>
>   $ perf report --children -F pid,dso,sym | wc -l
>   3202
>
> So I think you still need to use --no-children (or set report.children
> config variable to false) for your script.

Ok, good to know, thanks.
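
For reference, roughly the same shuffle can be sketched in Python.  This is
only an illustration: the function name shuffle is made up, and the input
layout (hist entry lines indented and carrying PID:comm, folded callchain
lines starting with the count) is assumed from the awk patterns quoted
above rather than taken from real perf output:

```python
from typing import Iterable, Iterator


def shuffle(lines: Iterable[str]) -> Iterator[str]:
    """Turn 'count stack' lines into 'comm-PID;stack count' lines."""
    key = ""  # last seen "PID:comm" from a hist entry line
    for line in lines:
        fields = line.split()
        if line.startswith(" "):
            # Hist entry line (assumed indented): remember its first field.
            key = fields[0] if fields else ""
        elif line[:1].isdigit() and len(fields) >= 2:
            # Folded callchain line: "<count> <frame;frame;...>".
            count, stack = fields[0], fields[1]
            pid, _, comm = key.partition(":")
            yield f"{comm}-{pid};{stack} {count}"


# Hypothetical input, built from the numbers quoted in this thread.
sample = [
    "     0:swapper",
    "809 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;"
    "arch_cpu_idle;default_idle;xen_hypercall_sched_op",
    " 30551:dd",
    "3 xen_irq_enable_direct_end;check_events;xen_hypercall_xen_version",
]
result = list(shuffle(sample))
for out in result:
    print(out)
```

As with the awk version, the report would still need --no-children so that
the synthesized children entries are not counted.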

>
>
>> and with "-g count", I didn't need --show-nr-samples.)
>
> Yes, I used -n/--show-nr-samples just to check the number is correct.
>
>
>>
>> I notice the fields (-F) option already has this precedent:
>>
>> - "comm": prints PID:comm
>> - "pid": prints PID
>
> It's the opposite:  "comm" prints comm, "pid" prints PID:comm. :)

Ah, right, sorry, I'd typed those the wrong way around. :)

>
>
>>
>> If these were added to -g, along with a no-hists, then the two types
>> of folded-only output could be generated using:
>>
>> perf report --stdio -g folded,count,comm,no-hists,caller
>> perf report --stdio -g folded,count,pid,no-hists,caller
>
> As I said, using fields like comm and pid requires having the same keys in
> the --sort option.  So it's basically unreliable to use those specific
> field names in the -g option IMHO.  I suggested using 'info' (yes, it
> needs a better name) to print all sort keys.
>
>
>>
>> ... although "no-hists" doesn't hit me as intuitive. How about "-F
>> none" to specify zero columns? ie:
>>
>> perf report --stdio -g folded,count,comm,caller -F none
>> perf report --stdio -g folded,count,pid,caller -F none
>
> Ah, makes sense.  So it'd look like
>
>   $ perf report --stdio -g folded,count,info -F none -s comm
>   $ perf report --stdio -g folded,count,info -F none -s pid
>
> The output would be
>
>   809 swapper-0 cpu_bringup_and_idle;cpu_startup_entry;default_idle_call;arch_cpu_idle;default_idle;xen_hypercall_sched_op
>

Thanks, looks almost right: a couple of minor changes:

1. If perf already has the precedent of "PID:comm", instead of my
"comm-PID", then maybe it should use "PID:comm" for perf consistency.
Doesn't make much difference to me.
2. The second space, delimiting "PID:comm" (or comm) and the stack...
I'm nervous about using space as a delimiter any more than once, since
it can also appear in comm (eg, "java main") and frames (eg,
"JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*,
Thread*)" -- that's 

Re: [PATCHSET 0/4] perf report: Support folded callchain output (v4)

2015-11-03 Thread Arnaldo Carvalho de Melo
On Tue, Nov 03, 2015 at 09:52:07PM +0900, Namhyung Kim wrote:
> Hello,
> 
> This is what Brendan requested on the perf-users mailing list [1] to
> support FlameGraphs [2] more efficiently.  This patchset adds a few
> more callchain options to adjust the output for it.
> 
>  * changes in v4)
>   - add missing doc update
>   - cleanup/fix callchain value print code
>   - add Acked-by from Brendan and Jiri

Do those Acked-by stand? Things changed, the values moved from the end
of the line to the start, etc.

You said you would consider having a --no-hists, but I see nothing about
it in this patchkit.

Some more comments below.

- Arnaldo

>  * changes in v3)
>   - put the value before callchains
>   - fix compile error
> 
> 
> At first, 'folded' output mode was added.  The folded output puts the
> value, a space and all callchain nodes separated by semicolons.  For now it
> only supports --stdio as other UIs provide some way of folding and/or
> expanding callchains dynamically.
> 
> The value can now be one of 'percent', 'period', or 'count'.  The
> percent is the current default output and the period is the raw number of
> sample periods.  The count is the number of samples for each callchain.
> 
> Here's an example:
> 
>   $ perf report --no-children --show-nr-samples --stdio -g folded,count
>   ...
> 39.93% 80  swapper  [kernel.vmlinux]  [k] intel_idle
>   57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
>   23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
> 
> 
>   $ perf report --no-children --stdio -g percent

So, in this first one you show the percent in both

>   ...
> 39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
>             |
>             ---intel_idle
>                cpuidle_enter_state
>                cpuidle_enter
>                call_cpuidle
>                cpu_startup_entry
>                |
>                |--28.63%-- start_secondary
>                |
>                 --11.30%-- rest_init
> 
> 
>   $ perf report --no-children --stdio --show-total-period -g period
>   ...

then here you _add_ the period to the hist_entry line, but...

> 39.93%   13018705  swapper  [kernel.vmlinux]  [k] intel_idle
>             |
>             ---intel_idle
>                cpuidle_enter_state
>                cpuidle_enter
>                call_cpuidle
>                cpu_startup_entry
>                |

_replace_ the percentage with the period in the callchains.

Can't we have the same effect in both? I.e. I would expect the 39.93% to
simply be replaced with that 13018705.
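
To make the relation between the three values concrete, here is a small
sketch in Python using only the numbers quoted in this mail.  It assumes,
as an inference from the sample output rather than from the patch
description, that a branch's percent is the hist entry's percent scaled by
the branch's share of the entry's period; all names are made up for
illustration:

```python
ENTRY_PERCENT = 39.93  # overhead of the intel_idle hist entry

# branch -> (period, sample count), as shown in the quoted callchains
BRANCHES = {
    "start_secondary": (9334403, 57),
    "rest_init": (3684302, 23),
}

# Totals of the hist entry, derived from its two branches.
ENTRY_PERIOD = sum(period for period, _ in BRANCHES.values())
ENTRY_COUNT = sum(count for _, count in BRANCHES.values())


def branch_value(mode: str, period: int, count: int) -> str:
    """Format one callchain branch value for the given -g value mode."""
    if mode == "percent":
        # Assumed relation: entry percent times the branch's period share.
        return f"{ENTRY_PERCENT * period / ENTRY_PERIOD:.2f}%"
    if mode == "period":
        return str(period)
    if mode == "count":
        return str(count)
    raise ValueError(f"unknown callchain value: {mode}")


for mode in ("percent", "period", "count"):
    row = [f"{name}={branch_value(mode, *vals)}"
           for name, vals in BRANCHES.items()]
    print(mode, *row)
```

The two periods add up to 13018705 and the two counts to 80, matching the
hist entry lines quoted in this mail.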

>                |--9334403-- start_secondary
>                |
>                 --3684302-- rest_init
> 
> 
>   $ perf report --no-children --stdio --show-nr-samples -g count
>   ...
> 39.93% 80  swapper  [kernel.vmlinux]  [k] intel_idle

Ditto for count

>             |
>             ---intel_idle
>                cpuidle_enter_state
>                cpuidle_enter
>                call_cpuidle
>                cpu_startup_entry
>                |
>                |--57-- start_secondary
>                |
>                 --23-- rest_init
> 
> 
> You can get it from 'perf/callchain-fold-v4' branch on my tree:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Any comments are welcome, thanks
> Namhyung
> 
> 
> [1] http://www.spinics.net/lists/linux-perf-users/msg02498.html
> [2] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
> 
> 
> Namhyung Kim (4):
>   perf report: Support folded callchain mode on --stdio
>   perf callchain: Abstract callchain print function
>   perf callchain: Add count fields to struct callchain_node
>   perf report: Add callchain value option
> 
>  tools/perf/Documentation/perf-report.txt | 13 +++--
>  tools/perf/builtin-report.c              |  4 +-
>  tools/perf/ui/browsers/hists.c           |  8 +--
>  tools/perf/ui/gtk/hists.c                |  8 +--
>  tools/perf/ui/stdio/hist.c               | 93 ++--
>  tools/perf/util/callchain.c              | 87 +-
>  tools/perf/util/callchain.h              | 24 -
>  tools/perf/util/util.c                   |  3 +-
>  8 files changed, 205 insertions(+), 35 deletions(-)
> 
> -- 
> 2.6.2