On Wed, Sep 21, 2016 at 6:34 PM, Namhyung Kim <namhy...@kernel.org> wrote: > On Wed, Sep 21, 2016 at 01:18:52PM -0700, Stephane Eranian wrote: >> On Wed, Sep 21, 2016 at 9:34 AM, Jiri Olsa <jo...@redhat.com> wrote: >> > On Wed, Sep 21, 2016 at 12:37:53PM -0300, Arnaldo Carvalho de Melo wrote: >> >> Em Tue, Sep 20, 2016 at 06:29:59PM -0700, Stephane Eranian escreveu: >> >> > Hi Arnaldo, >> >> > >> >> > I ran into an issue trying to use the --pid filtering option of perf >> >> > report. >> >> > >> >> > I do a system-wide collection and then I want to narrow down the >> >> > reporting to a specific process: >> >> > >> >> > $ perf record -a -e cycles:pp sleep 10 >> >> > $ perf report --sort cpu,comm --pid X >> >> > >> >> > Where X is a process sampled during the run (easy to catch with perf >> >> > report -D) >> >> > If you do it this way, it works, but if you do: >> >> > >> >> > $ perf report --sort cpu --pid X >> >> > >> >> > Then you get an empty output. >> >> > >> >> > I suspect it has to do with the way hist entries are added to the >> >> > histogram and aggregated. If the first event for a sort criteria is >> >> > not coming from pid X, it will >> >> > still be added in the histogram. if pid X aggregates to the same >> >> > sample criteria, then you will lose the pid information. And then >> >> > later when you try to apply the filter, >> >> > it will mark the hist entry as FILTERED because it does not have a >> >> > matching pid >> >> > and nothing will be printed. >> >> > I suspect you want to apply the filtering upfront for pid. It will >> >> > only add to the histograms matching samples. It changes the >> >> > percentages you will see. They will >> >> > only report the breakdown for the pid. >> >> > >> >> > I have a quick hack to do upfront filtering which does something as >> >> > follows but I am not sure this is the correct way of doing this. >> >> > >> >> > Let me know what you think. >> >> >> >> From a first look I think this makes sense, i.e. we should do the first >> >> round of filtering, one that trows away stuff, for things in the command >> >> line, when creating the histogram entries. >> >> >> >> Later, as we have now, we can apply further filters for non-collapsed >> >> fields of hist_entry. >> >> >> >> Jiri, Namhyung, are you ok with this? >> > >> > Stephan is correct with analysis, but I think we need to add both >> > non/filtered >> > entries in, because we provide that 'F' key for non/filtered counts switch >> > in tui >> > >> > how about something like below >> > >> > thanks, >> > jirka >> > >> > --- >> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c >> > index b02992efb513..659e0357be68 100644 >> > --- a/tools/perf/util/hist.c >> > +++ b/tools/perf/util/hist.c >> > @@ -536,6 +536,14 @@ static struct hist_entry *hists__findnew_entry(struct >> > hists *hists, >> > map__put(he->ms.map); >> > he->ms.map = map__get(entry->ms.map); >> > } >> > + >> > + /* >> > + * We have at least one entry in which is not >> > + * filtered, we want to display the entry. >> > + */ >> > + if (he->filtered && !entry->filtered) >> > + he->filtered = 0; >> > + >> > goto out; >> > } >> > >> Works for me. So with this approach the % shown with --pid still >> represents % of total samples and not just for that pid. >> I think this is okay as long as this is documented and understood by users. >> Thanks. > > I think we should show correct value depending on the --percentage > option. I wrote a patch to implement it by addding a > total_early_filtered_period stat to hists. Following is the result: > > > $ perf report -s cpu,comm --pid 0 --stdio > # > # Overhead CPU Command > # ........ ... ....... > # > 12.16% 000 swapper > 3.09% 001 swapper > 2.76% 002 swapper > 2.23% 003 swapper > 1.65% 007 swapper > 1.65% 008 swapper > 1.52% 009 swapper > 1.51% 006 swapper > 1.46% 004 swapper > 1.34% 005 swapper > 0.94% 010 swapper > 0.90% 011 swapper > So how do I interpret this?
Is this that 12.16% of all samples comes from pid 0 (swapper) running on CPU0? > $ perf report -s cpu --pid 0 --stdio > # > # Overhead CPU > # ........ ... > # > 12.16% 000 > 3.09% 001 > 2.76% 002 > 2.23% 003 > 1.65% 007 > 1.65% 008 > 1.52% 009 > 1.51% 006 > 1.46% 004 > 1.34% 005 > 0.94% 010 > 0.90% 011 > 12.16% of all the samples collected come from pid 0 (swapper) running CPU0? > $ perf report -s cpu --pid 0 --stdio --percentage relative > # > # Overhead CPU > # ........ ... > # > 38.95% 000 > 9.92% 001 > 8.84% 002 > 7.16% 003 > 5.30% 007 > 5.28% 008 > 4.87% 009 > 4.83% 006 > 4.66% 004 > 4.30% 005 > 3.00% 010 > 2.89% 011 > Ok, so now I am guessing 38.95% of the samples of pid 0 are on CPU0? > > Note that the --hierarchy option provides groups rather than filtering > but shows similar result.. > > $ perf report -s pid,cpu --stdio --hierarchy > # > # Overhead Pid:Command / CPU > # ........... ....................... > # > 31.21% 0:swapper > 12.16% 000 > 3.09% 001 > 2.76% 002 > 2.23% 003 > 1.65% 007 > 1.65% 008 > 1.52% 009 > 1.51% 006 > 1.46% 004 > 1.34% 005 > 0.94% 010 > 0.90% 011 > 19.15% 8618:getmail > ... > 31.21% of total samples come from pid 0 (swapper) and decompose to 12.16% for CPU0, 3.09% for CPU1, .... Is that right? > > > > > -----------------------8<------------------------------- > diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h > index 8d363d5e65a2..42b1bfd29ef8 100644 > --- a/tools/perf/util/event.h > +++ b/tools/perf/util/event.h > @@ -262,6 +262,7 @@ enum auxtrace_error_type { > */ > struct events_stats { > u64 total_period; > + u64 total_early_filtered_period; > u64 total_non_filtered_period; > u64 total_lost; > u64 total_lost_samples; > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c > index 37a08f20730a..c7045411cce2 100644 > --- a/tools/perf/util/hist.c > +++ b/tools/perf/util/hist.c > @@ -1017,12 +1017,21 @@ int hist_entry_iter__add(struct hist_entry_iter > *iter, struct addr_location *al, > int max_stack_depth, void *arg) > { > int err, err2; > + struct hists *hists = evsel__hists(iter->evsel); > > err = sample__resolve_callchain(iter->sample, &callchain_cursor, > &iter->parent, > iter->evsel, al, max_stack_depth); > if (err) > return err; > > + if (symbol__parent_filter(iter->parent)) > + al->filtered |= symbol__parent_filter(iter->parent); > + > + if (al->filtered) { > + hists->stats.total_early_filtered_period += > iter->sample->period; > + return 0; > + } > + > iter->max_stack = max_stack_depth; > > err = iter->ops->prepare_entry(iter, al); > @@ -1503,7 +1512,7 @@ static void hists__reset_filter_stats(struct hists > *hists) > void hists__reset_stats(struct hists *hists) > { > hists->nr_entries = 0; > - hists->stats.total_period = 0; > + hists->stats.total_period = hists->stats.total_early_filtered_period; > > hists__reset_filter_stats(hists); > } > @@ -1530,7 +1539,7 @@ static void hierarchy_recalc_total_periods(struct hists > *hists) > > node = rb_first(&hists->entries); > > - hists->stats.total_period = 0; > + hists->stats.total_period = hists->stats.total_early_filtered_period; > hists->stats.total_non_filtered_period = 0; > > /*