Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
> On 06-Oct-2022, at 7:33 PM, Arnaldo Carvalho de Melo wrote: > > Em Thu, Oct 06, 2022 at 06:16:14PM +0530, Athira Rajeev escreveu: >> >> >>> On 05-Oct-2022, at 6:05 PM, Arnaldo Carvalho de Melo >>> wrote: >>> >>> Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu: Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu: > diff --git a/tools/perf/util/stat-display.c > b/tools/perf/util/stat-display.c > index b82844cb0ce7..cf28020798ec 100644 > --- a/tools/perf/util/stat-display.c > +++ b/tools/perf/util/stat-display.c > @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config > *config, > id.socket, > id.die, > id.core); > - } else if (id.core > -1) { > + } else if (id.cpu.cpu > -1) { > fprintf(config->output, "\"cpu\" : \"%d\", ", > id.cpu.cpu); > } > @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config > *config, > id.die, > config->csv_output ? 0 : -3, > id.core, config->csv_sep); > - } else if (id.core > -1) { > + } else if (id.cpu.cpu > -1) { > fprintf(config->output, "CPU%*d%s", > config->csv_output ? 0 : -7, > id.cpu.cpu, config->csv_sep); > -- > If it is confusing, shall I send it as a separate patch along with > Tested-by from Ian ? I'll have to do this by hand, tried pointing b4 to this message and it picked the old one, also tried to save the message and apply by hand, its mangled. >>> >>> This is what I have now, will force push later, please triple check :-) >> >> >> Sorry for all the confusion Arnaldo. Hereafter, I will resubmit the patch to >> avoid this. >> In below patch which you shared, code change is correct. But commit message >> is different. >> So to avoid further confusion, I went ahead and posted a separate patch in >> the mailing list with: >> >> subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout >> Patch link: >> https://lore.kernel.org/lkml/20221006114225.66303-1-atraj...@linux.vnet.ibm.com/T/#u >> >> Please pick the separately send patch. > > Ok, will do. Thanks Arnaldo. > > - Arnaldo
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Em Thu, Oct 06, 2022 at 06:16:14PM +0530, Athira Rajeev escreveu: > > > > On 05-Oct-2022, at 6:05 PM, Arnaldo Carvalho de Melo > > wrote: > > > > Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu: > >> Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu: > >>> diff --git a/tools/perf/util/stat-display.c > >>> b/tools/perf/util/stat-display.c > >>> index b82844cb0ce7..cf28020798ec 100644 > >>> --- a/tools/perf/util/stat-display.c > >>> +++ b/tools/perf/util/stat-display.c > >>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config > >>> *config, > >>> id.socket, > >>> id.die, > >>> id.core); > >>> - } else if (id.core > -1) { > >>> + } else if (id.cpu.cpu > -1) { > >>> fprintf(config->output, "\"cpu\" : \"%d\", ", > >>> id.cpu.cpu); > >>> } > >>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config > >>> *config, > >>> id.die, > >>> config->csv_output ? 0 : -3, > >>> id.core, config->csv_sep); > >>> - } else if (id.core > -1) { > >>> + } else if (id.cpu.cpu > -1) { > >>> fprintf(config->output, "CPU%*d%s", > >>> config->csv_output ? 0 : -7, > >>> id.cpu.cpu, config->csv_sep); > >>> -- > >>> If it is confusing, shall I send it as a separate patch along with > >>> Tested-by from Ian ? > >> > >> I'll have to do this by hand, tried pointing b4 to this message and it > >> picked the old one, also tried to save the message and apply by hand, > >> its mangled. > > > > This is what I have now, will force push later, please triple check :-) > > > Sorry for all the confusion Arnaldo. Hereafter, I will resubmit the patch to > avoid this. > In below patch which you shared, code change is correct. But commit message > is different. > So to avoid further confusion, I went ahead and posted a separate patch in > the mailing list with: > > subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout > Patch link: > https://lore.kernel.org/lkml/20221006114225.66303-1-atraj...@linux.vnet.ibm.com/T/#u > > Please pick the separately send patch. Ok, will do. - Arnaldo
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
> On 05-Oct-2022, at 6:05 PM, Arnaldo Carvalho de Melo wrote: > > Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu: >>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c >>> index b82844cb0ce7..cf28020798ec 100644 >>> --- a/tools/perf/util/stat-display.c >>> +++ b/tools/perf/util/stat-display.c >>> @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config >>> *config, >>> id.socket, >>> id.die, >>> id.core); >>> - } else if (id.core > -1) { >>> + } else if (id.cpu.cpu > -1) { >>> fprintf(config->output, "\"cpu\" : \"%d\", ", >>> id.cpu.cpu); >>> } >>> @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config >>> *config, >>> id.die, >>> config->csv_output ? 0 : -3, >>> id.core, config->csv_sep); >>> - } else if (id.core > -1) { >>> + } else if (id.cpu.cpu > -1) { >>> fprintf(config->output, "CPU%*d%s", >>> config->csv_output ? 0 : -7, >>> id.cpu.cpu, config->csv_sep); >>> -- >>> If it is confusing, shall I send it as a separate patch along with >>> Tested-by from Ian ? >> >> I'll have to do this by hand, tried pointing b4 to this message and it >> picked the old one, also tried to save the message and apply by hand, >> its mangled. > > This is what I have now, will force push later, please triple check :-) Sorry for all the confusion Arnaldo. Hereafter, I will resubmit the patch to avoid this. In below patch which you shared, code change is correct. But commit message is different. So to avoid further confusion, I went ahead and posted a separate patch in the mailing list with: subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout Patch link: https://lore.kernel.org/lkml/20221006114225.66303-1-atraj...@linux.vnet.ibm.com/T/#u Please pick the separately send patch. Thanks Athira > > - Arnaldo > > From b7dd96f9211e4ddbd6fa080da8dec2eac98d3f2a Mon Sep 17 00:00:00 2001 > From: Athira Rajeev > Date: Tue, 13 Sep 2022 17:27:17 +0530 > Subject: [PATCH 1/1] perf stat: Fix aggr_printout to display CPU field > irrespective of core value > > perf stat includes option to specify aggr_mode to display per-socket, > per-core, per-die, per-node counter details. Also there is option -A ( > AGGR_NONE, -no-aggr ), where the counter values are displayed for each > CPU along with "CPU" value in one field of the output. > > Each of the aggregate mode uses the information fetched from > "/sys/devices/system/cpu/cpuX/topology" like core_id, > physical_package_id. Utility functions in "cpumap.c" fetches this > information and populates the socket id, core id, CPU etc. If the > platform does not expose the topology information, these values will be > set to -1. Example, in case of powerpc, details like physical_package_id > is restricted to be exposed in pSeries platform. So id.socket, id.core, > id.cpu all will be set as -1. > > In case of displaying socket or die value, there is no check done in the > "aggr_printout" function to see if it points to valid socket id or die. > But for displaying "cpu" value, there is a check for "if (id.core > > -1)". In case of powerpc pSeries where detail like physical_package_id > is restricted to be exposed, id.core will be set to -1. Hence the column > or field itself for CPU won't be displayed in the output. > > Result for per-socket: > > perf stat -e branches --per-socket -a true > > Performance counter stats for 'system wide': > > S-1 32416,851 branches > > Here S has -1 in above result. But with -A option which also expects CPU > in one column in the result, below is observed. > > perf stat -e instructions -A -a true > > Performance counter stats for 'system wide': > >47,146 instructions >45,226 instructions >43,354 instructions >45,184 instructions > > If the CPU id value is pointing to -1 also, it makes sense to display > the column in the output to replicate the behaviour or to be in > precedence with other aggr options(like per-socket, per-core). Remove > the check "id.core" so that CPU field gets displayed in the output. > > After the fix: > > perf stat -e instructions -A -a true > > Performance counter stats for 'system wide': > > CPU-1 64,034 instructions > CPU-1 68,941 instructions > CPU-1 59,418 instructions > CPU-1 70,478
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Em Wed, Oct 05, 2022 at 09:28:52AM -0300, Arnaldo Carvalho de Melo escreveu: > Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu: > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c > > index b82844cb0ce7..cf28020798ec 100644 > > --- a/tools/perf/util/stat-display.c > > +++ b/tools/perf/util/stat-display.c > > @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config > > *config, > > id.socket, > > id.die, > > id.core); > > - } else if (id.core > -1) { > > + } else if (id.cpu.cpu > -1) { > > fprintf(config->output, "\"cpu\" : \"%d\", ", > > id.cpu.cpu); > > } > > @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config > > *config, > > id.die, > > config->csv_output ? 0 : -3, > > id.core, config->csv_sep); > > - } else if (id.core > -1) { > > + } else if (id.cpu.cpu > -1) { > > fprintf(config->output, "CPU%*d%s", > > config->csv_output ? 0 : -7, > > id.cpu.cpu, config->csv_sep); > > -- > > If it is confusing, shall I send it as a separate patch along with > > Tested-by from Ian ? > > I'll have to do this by hand, tried pointing b4 to this message and it > picked the old one, also tried to save the message and apply by hand, > its mangled. This is what I have now, will force push later, please triple check :-) - Arnaldo >From b7dd96f9211e4ddbd6fa080da8dec2eac98d3f2a Mon Sep 17 00:00:00 2001 From: Athira Rajeev Date: Tue, 13 Sep 2022 17:27:17 +0530 Subject: [PATCH 1/1] perf stat: Fix aggr_printout to display CPU field irrespective of core value perf stat includes option to specify aggr_mode to display per-socket, per-core, per-die, per-node counter details. Also there is option -A ( AGGR_NONE, -no-aggr ), where the counter values are displayed for each CPU along with "CPU" value in one field of the output. Each of the aggregate mode uses the information fetched from "/sys/devices/system/cpu/cpuX/topology" like core_id, physical_package_id. Utility functions in "cpumap.c" fetches this information and populates the socket id, core id, CPU etc. If the platform does not expose the topology information, these values will be set to -1. Example, in case of powerpc, details like physical_package_id is restricted to be exposed in pSeries platform. So id.socket, id.core, id.cpu all will be set as -1. In case of displaying socket or die value, there is no check done in the "aggr_printout" function to see if it points to valid socket id or die. But for displaying "cpu" value, there is a check for "if (id.core > -1)". In case of powerpc pSeries where detail like physical_package_id is restricted to be exposed, id.core will be set to -1. Hence the column or field itself for CPU won't be displayed in the output. Result for per-socket: perf stat -e branches --per-socket -a true Performance counter stats for 'system wide': S-1 32416,851 branches Here S has -1 in above result. But with -A option which also expects CPU in one column in the result, below is observed. perf stat -e instructions -A -a true Performance counter stats for 'system wide': 47,146 instructions 45,226 instructions 43,354 instructions 45,184 instructions If the CPU id value is pointing to -1 also, it makes sense to display the column in the output to replicate the behaviour or to be in precedence with other aggr options(like per-socket, per-core). Remove the check "id.core" so that CPU field gets displayed in the output. After the fix: perf stat -e instructions -A -a true Performance counter stats for 'system wide': CPU-1 64,034 instructions CPU-1 68,941 instructions CPU-1 59,418 instructions CPU-1 70,478 instructions CPU-1 65,201 instructions CPU-1 63,704 instructions This is caught while running "perf test" for "stat+json_output.sh" and "stat+csv_output.sh". Reported-by: Disha Goel Suggested-by: Ian Rogers Suggested-by: James Clark Signed-off-by: Athira Jajeev Tested-by: Disha Goel Tested-by: Ian Rogers Cc: Jiri Olsa Cc: Kajol Jain Cc: Madhavan Srinivasan Cc: Michael Ellerman Cc: Nageswara R Sastry Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220913115717.36191-1-atraj...@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/stat-display.c | 4 ++-- 1 file changed, 2 insertions(+), 2
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu: > > > > On 04-Oct-2022, at 11:44 PM, Arnaldo Carvalho de Melo > > wrote: > > > > Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu: > >> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu: > >>> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev > Thanks for helping with testing. Can I add your Tested-by for the patch ? > >> > >>> Yep. > >> > >>> Tested-by: Ian Rogers > > > > > > Thanks, applied. > > > > - Arnaldo > > Hi Arnaldo, > > Looks like you have taken change to remove id.core check: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core=db83f447b323958cdc5fedcf2134effb2ec9a6fe > > But the patch that has to go in is : > "[PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in ggr_printout" > which is tested by Ian and "pasted" by me in same mail thread. > > Re-pasting here for reference: > > >From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001 > From: Athira Rajeev > Date: Mon, 3 Oct 2022 15:47:27 +0530 > Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout > > perf stat has options to aggregate the counts in different > modes like per socket, per core etc. The function "aggr_printout" > in util/stat-display.c which is used to print the aggregates, > has a check for cpu in case of AGGR_NONE. This check was > originally using condition : "if (id.cpu.cpu > -1)". But > this got changed after commit df936cadfb58 ("perf stat: Add > JSON output option"), which added option to output json format > for different aggregation modes. After this commit, the > check in "aggr_printout" is using "if (id.core > -1)". > > The old code was using "id.cpu.cpu > -1" while the new code > is using "id.core > -1". But since the value printed is > id.cpu.cpu, fix this check to use cpu and not core. > > Signed-off-by: Athira Rajeev > Suggested-by: James Clark > Suggested-by: Ian Rogers > --- > tools/perf/util/stat-display.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c > index b82844cb0ce7..cf28020798ec 100644 > --- a/tools/perf/util/stat-display.c > +++ b/tools/perf/util/stat-display.c > @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config, > id.socket, > id.die, > id.core); > - } else if (id.core > -1) { > + } else if (id.cpu.cpu > -1) { > fprintf(config->output, "\"cpu\" : \"%d\", ", > id.cpu.cpu); > } > @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config, > id.die, > config->csv_output ? 0 : -3, > id.core, config->csv_sep); > - } else if (id.core > -1) { > + } else if (id.cpu.cpu > -1) { > fprintf(config->output, "CPU%*d%s", > config->csv_output ? 0 : -7, > id.cpu.cpu, config->csv_sep); > -- > 2.31.1 > > If it is confusing, shall I send it as a separate patch along with Tested-by > from Ian ? I'll have to do this by hand, tried pointing b4 to this message and it picked the old one, also tried to save the message and apply by hand, its mangled. - Arnaldo > > Thanks > Athira > > > -- - Arnaldo
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Em Wed, Oct 05, 2022 at 10:23:39AM +0530, Athira Rajeev escreveu: > > > > On 04-Oct-2022, at 11:44 PM, Arnaldo Carvalho de Melo > > wrote: > > > > Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu: > >> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu: > >>> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev > Thanks for helping with testing. Can I add your Tested-by for the patch ? > >> > >>> Yep. > >> > >>> Tested-by: Ian Rogers > > > > > > Thanks, applied. > > > > - Arnaldo > > Hi Arnaldo, > > Looks like you have taken change to remove id.core check: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core=db83f447b323958cdc5fedcf2134effb2ec9a6fe > > But the patch that has to go in is : > "[PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in ggr_printout" > which is tested by Ian and "pasted" by me in same mail thread. > > Re-pasting here for reference: > > >From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001 > From: Athira Rajeev > Date: Mon, 3 Oct 2022 15:47:27 +0530 > Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout > > perf stat has options to aggregate the counts in different > modes like per socket, per core etc. The function "aggr_printout" > in util/stat-display.c which is used to print the aggregates, > has a check for cpu in case of AGGR_NONE. This check was > originally using condition : "if (id.cpu.cpu > -1)". But > this got changed after commit df936cadfb58 ("perf stat: Add > JSON output option"), which added option to output json format > for different aggregation modes. After this commit, the > check in "aggr_printout" is using "if (id.core > -1)". > > The old code was using "id.cpu.cpu > -1" while the new code > is using "id.core > -1". But since the value printed is > id.cpu.cpu, fix this check to use cpu and not core. > > Signed-off-by: Athira Rajeev > Suggested-by: James Clark > Suggested-by: Ian Rogers > --- > tools/perf/util/stat-display.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c > index b82844cb0ce7..cf28020798ec 100644 > --- a/tools/perf/util/stat-display.c > +++ b/tools/perf/util/stat-display.c > @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config, > id.socket, > id.die, > id.core); > - } else if (id.core > -1) { > + } else if (id.cpu.cpu > -1) { > fprintf(config->output, "\"cpu\" : \"%d\", ", > id.cpu.cpu); > } > @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config, > id.die, > config->csv_output ? 0 : -3, > id.core, config->csv_sep); > - } else if (id.core > -1) { > + } else if (id.cpu.cpu > -1) { > fprintf(config->output, "CPU%*d%s", > config->csv_output ? 0 : -7, > id.cpu.cpu, config->csv_sep); > -- > 2.31.1 > > If it is confusing, shall I send it as a separate patch along with Tested-by > from Ian ? > > Please revert > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core=db83f447b323958cdc5fedcf2134effb2ec9a6fe I'll do it, but in these cases just go ahead and resubmit with a v N+1 patchset, so that b4 can pick the newer version even if I point it to the previous version message-id. - Arnaldo
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
> On 04-Oct-2022, at 11:44 PM, Arnaldo Carvalho de Melo wrote: > > Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu: >>> On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev Thanks for helping with testing. Can I add your Tested-by for the patch ? >> >>> Yep. >> >>> Tested-by: Ian Rogers > > > Thanks, applied. > > - Arnaldo Hi Arnaldo, Looks like you have taken change to remove id.core check: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core=db83f447b323958cdc5fedcf2134effb2ec9a6fe But the patch that has to go in is : "[PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in ggr_printout" which is tested by Ian and "pasted" by me in same mail thread. Re-pasting here for reference: >From 4dd98d953940deb2f85176cb6b4ecbfd18dbdbf9 Mon Sep 17 00:00:00 2001 From: Athira Rajeev Date: Mon, 3 Oct 2022 15:47:27 +0530 Subject: [PATCH] tools/perf: Fix cpu check to use id.cpu.cpu in aggr_printout perf stat has options to aggregate the counts in different modes like per socket, per core etc. The function "aggr_printout" in util/stat-display.c which is used to print the aggregates, has a check for cpu in case of AGGR_NONE. This check was originally using condition : "if (id.cpu.cpu > -1)". But this got changed after commit df936cadfb58 ("perf stat: Add JSON output option"), which added option to output json format for different aggregation modes. After this commit, the check in "aggr_printout" is using "if (id.core > -1)". The old code was using "id.cpu.cpu > -1" while the new code is using "id.core > -1". But since the value printed is id.cpu.cpu, fix this check to use cpu and not core. Signed-off-by: Athira Rajeev Suggested-by: James Clark Suggested-by: Ian Rogers --- tools/perf/util/stat-display.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index b82844cb0ce7..cf28020798ec 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -168,7 +168,7 @@ static void aggr_printout(struct perf_stat_config *config, id.socket, id.die, id.core); - } else if (id.core > -1) { + } else if (id.cpu.cpu > -1) { fprintf(config->output, "\"cpu\" : \"%d\", ", id.cpu.cpu); } @@ -179,7 +179,7 @@ static void aggr_printout(struct perf_stat_config *config, id.die, config->csv_output ? 0 : -3, id.core, config->csv_sep); - } else if (id.core > -1) { + } else if (id.cpu.cpu > -1) { fprintf(config->output, "CPU%*d%s", config->csv_output ? 0 : -7, id.cpu.cpu, config->csv_sep); -- 2.31.1 If it is confusing, shall I send it as a separate patch along with Tested-by from Ian ? Please revert https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core=db83f447b323958cdc5fedcf2134effb2ec9a6fe Thanks Athira >
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev wrote: > > > > On 04-Oct-2022, at 12:21 AM, Ian Rogers wrote: > > > > On Mon, Oct 3, 2022 at 7:03 AM atrajeev > wrote: > >> > >> On 2022-10-02 05:17, Ian Rogers wrote: > >>> On Thu, Sep 29, 2022 at 5:56 AM James Clark > >>> wrote: > > > > On 29/09/2022 09:49, Athira Rajeev wrote: > > > > > >> On 28-Sep-2022, at 9:05 PM, James Clark > wrote: > >> > >> > >> > > > > Hi James, > > > > Thanks for looking at the patch and sharing review comments. > > > >> On 13/09/2022 12:57, Athira Rajeev wrote: > >>> perf stat includes option to specify aggr_mode to display > >>> per-socket, per-core, per-die, per-node counter details. > >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the > >>> counter values are displayed for each cpu along with "CPU" > >>> value in one field of the output. > >>> > >>> Each of the aggregate mode uses the information fetched > >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id, > >> > >> I thought that this wouldn't apply to the cpu field because cpu is > >> basically interchangeable as an index in cpumap, rather than > anything > >> being read from the topology file. > > > > The cpu value is filled in this function: > > > > Function : aggr_cpu_id__cpu > > Code: util/cpumap.c > > > >> > >>> physical_package_id. Utility functions in "cpumap.c" fetches > >>> this information and populates the socket id, core id, cpu etc. > >>> If the platform does not expose the topology information, > >>> these values will be set to -1. Example, in case of powerpc, > >>> details like physical_package_id is restricted to be exposed > >>> in pSeries platform. So id.socket, id.core, id.cpu all will > >>> be set as -1. > >>> > >>> In case of displaying socket or die value, there is no check > >>> done in the "aggr_printout" function to see if it points to > >>> valid socket id or die. But for displaying "cpu" value, there > >>> is a check for "if (id.core > -1)". In case of powerpc pSeries > >>> where detail like physical_package_id is restricted to be > >>> exposed, id.core will be set to -1. Hence the column or field > >>> itself for CPU won't be displayed in the output. > >>> > >>> Result for per-socket: > >>> > >>> <<>> > >>> perf stat -e branches --per-socket -a true > >>> > >>> Performance counter stats for 'system wide': > >>> > >>> S-1 32416,851 branches > >>> <<>> > >>> > >>> Here S has -1 in above result. But with -A option which also > >>> expects CPU in one column in the result, below is observed. > >>> > >>> <<>> > >>> /bin/perf stat -e instructions -A -a true > >>> > >>> Performance counter stats for 'system wide': > >>> > >>> 47,146 instructions > >>> 45,226 instructions > >>> 43,354 instructions > >>> 45,184 instructions > >>> <<>> > >>> > >>> If the cpu id value is pointing to -1 also, it makes sense > >>> to display the column in the output to replicate the behaviour > >>> or to be in precedence with other aggr options(like per-socket, > >>> per-core). Remove the check "id.core" so that CPU field gets > >>> displayed in the output. > >> > >> Why would you want to print -1 out? Seems like the if statement was > a > >> good one to me, otherwise the output looks a bit broken to users. > Are > >> the other aggregation modes even working if -1 is set for socket and > >> die? Maybe we need to not print -1 in those cases or exit earlier > with a > >> failure. > >> > >> The -1 value has a specific internal meaning which is "to not > >> aggregate". It doesn't mean "not set". > > > > Currently, this check is done only for printing cpu value. > > For socket/die/core values, this check is not done. Pasting an > > example snippet from a powerpc system ( specifically from pseries > platform where > > the value is set to -1 ) > > > > ./perf stat --per-core -a -C 1 true > > > > Performance counter stats for 'system wide': > > > > S-1-D-1-C-1 1 1.06 msec cpu-clock > #1.018 CPUs utilized > > S-1-D-1-C-1 1 2 context-switches >#1.879 K/sec > > S-1-D-1-C-1 1 0 cpu-migrations >#0.000 /sec > > > > Here though the value is -1, we are displaying it. Where as in case > of cpu, the first column will be > > empty since we do a check before printing. > > > > Example: > > > > ./perf stat --per-core -A -C 1 true > > > > Performance counter stats for 'CPU(s) 1': > > > > 0.88 msec
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Em Tue, Oct 04, 2022 at 03:14:27PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu: > > On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev > > > Thanks for helping with testing. Can I add your Tested-by for the patch ? > > > Yep. > > > Tested-by: Ian Rogers Thanks, applied. - Arnaldo
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Em Tue, Oct 04, 2022 at 07:49:21AM -0700, Ian Rogers escreveu: > On Tue, Oct 4, 2022, 12:06 AM Athira Rajeev > > Thanks for helping with testing. Can I add your Tested-by for the patch ? > Yep. > Tested-by: Ian Rogers > Thanks, > Ian
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
> On 04-Oct-2022, at 12:21 AM, Ian Rogers wrote: > > On Mon, Oct 3, 2022 at 7:03 AM atrajeev wrote: >> >> On 2022-10-02 05:17, Ian Rogers wrote: >>> On Thu, Sep 29, 2022 at 5:56 AM James Clark >>> wrote: On 29/09/2022 09:49, Athira Rajeev wrote: > > >> On 28-Sep-2022, at 9:05 PM, James Clark wrote: >> >> >> > > Hi James, > > Thanks for looking at the patch and sharing review comments. > >> On 13/09/2022 12:57, Athira Rajeev wrote: >>> perf stat includes option to specify aggr_mode to display >>> per-socket, per-core, per-die, per-node counter details. >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the >>> counter values are displayed for each cpu along with "CPU" >>> value in one field of the output. >>> >>> Each of the aggregate mode uses the information fetched >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id, >> >> I thought that this wouldn't apply to the cpu field because cpu is >> basically interchangeable as an index in cpumap, rather than anything >> being read from the topology file. > > The cpu value is filled in this function: > > Function : aggr_cpu_id__cpu > Code: util/cpumap.c > >> >>> physical_package_id. Utility functions in "cpumap.c" fetches >>> this information and populates the socket id, core id, cpu etc. >>> If the platform does not expose the topology information, >>> these values will be set to -1. Example, in case of powerpc, >>> details like physical_package_id is restricted to be exposed >>> in pSeries platform. So id.socket, id.core, id.cpu all will >>> be set as -1. >>> >>> In case of displaying socket or die value, there is no check >>> done in the "aggr_printout" function to see if it points to >>> valid socket id or die. But for displaying "cpu" value, there >>> is a check for "if (id.core > -1)". In case of powerpc pSeries >>> where detail like physical_package_id is restricted to be >>> exposed, id.core will be set to -1. Hence the column or field >>> itself for CPU won't be displayed in the output. >>> >>> Result for per-socket: >>> >>> <<>> >>> perf stat -e branches --per-socket -a true >>> >>> Performance counter stats for 'system wide': >>> >>> S-1 32416,851 branches >>> <<>> >>> >>> Here S has -1 in above result. But with -A option which also >>> expects CPU in one column in the result, below is observed. >>> >>> <<>> >>> /bin/perf stat -e instructions -A -a true >>> >>> Performance counter stats for 'system wide': >>> >>> 47,146 instructions >>> 45,226 instructions >>> 43,354 instructions >>> 45,184 instructions >>> <<>> >>> >>> If the cpu id value is pointing to -1 also, it makes sense >>> to display the column in the output to replicate the behaviour >>> or to be in precedence with other aggr options(like per-socket, >>> per-core). Remove the check "id.core" so that CPU field gets >>> displayed in the output. >> >> Why would you want to print -1 out? Seems like the if statement was a >> good one to me, otherwise the output looks a bit broken to users. Are >> the other aggregation modes even working if -1 is set for socket and >> die? Maybe we need to not print -1 in those cases or exit earlier with a >> failure. >> >> The -1 value has a specific internal meaning which is "to not >> aggregate". It doesn't mean "not set". > > Currently, this check is done only for printing cpu value. > For socket/die/core values, this check is not done. Pasting an > example snippet from a powerpc system ( specifically from pseries > platform where > the value is set to -1 ) > > ./perf stat --per-core -a -C 1 true > > Performance counter stats for 'system wide': > > S-1-D-1-C-1 1 1.06 msec cpu-clock > #1.018 CPUs utilized > S-1-D-1-C-1 1 2 context-switches > #1.879 K/sec > S-1-D-1-C-1 1 0 cpu-migrations > #0.000 /sec > > Here though the value is -1, we are displaying it. Where as in case of > cpu, the first column will be > empty since we do a check before printing. > > Example: > > ./perf stat --per-core -A -C 1 true > > Performance counter stats for 'CPU(s) 1': > > 0.88 msec cpu-clock#1.022 CPUs > utilized > 2 context-switches > 0 cpu-migrations > > > No sure, whether there are scripts out there,
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
On Mon, Oct 3, 2022 at 7:03 AM atrajeev wrote: > > On 2022-10-02 05:17, Ian Rogers wrote: > > On Thu, Sep 29, 2022 at 5:56 AM James Clark > > wrote: > >> > >> > >> > >> On 29/09/2022 09:49, Athira Rajeev wrote: > >> > > >> > > >> >> On 28-Sep-2022, at 9:05 PM, James Clark wrote: > >> >> > >> >> > >> >> > >> > > >> > Hi James, > >> > > >> > Thanks for looking at the patch and sharing review comments. > >> > > >> >> On 13/09/2022 12:57, Athira Rajeev wrote: > >> >>> perf stat includes option to specify aggr_mode to display > >> >>> per-socket, per-core, per-die, per-node counter details. > >> >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the > >> >>> counter values are displayed for each cpu along with "CPU" > >> >>> value in one field of the output. > >> >>> > >> >>> Each of the aggregate mode uses the information fetched > >> >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id, > >> >> > >> >> I thought that this wouldn't apply to the cpu field because cpu is > >> >> basically interchangeable as an index in cpumap, rather than anything > >> >> being read from the topology file. > >> > > >> > The cpu value is filled in this function: > >> > > >> > Function : aggr_cpu_id__cpu > >> > Code: util/cpumap.c > >> > > >> >> > >> >>> physical_package_id. Utility functions in "cpumap.c" fetches > >> >>> this information and populates the socket id, core id, cpu etc. > >> >>> If the platform does not expose the topology information, > >> >>> these values will be set to -1. Example, in case of powerpc, > >> >>> details like physical_package_id is restricted to be exposed > >> >>> in pSeries platform. So id.socket, id.core, id.cpu all will > >> >>> be set as -1. > >> >>> > >> >>> In case of displaying socket or die value, there is no check > >> >>> done in the "aggr_printout" function to see if it points to > >> >>> valid socket id or die. But for displaying "cpu" value, there > >> >>> is a check for "if (id.core > -1)". In case of powerpc pSeries > >> >>> where detail like physical_package_id is restricted to be > >> >>> exposed, id.core will be set to -1. Hence the column or field > >> >>> itself for CPU won't be displayed in the output. > >> >>> > >> >>> Result for per-socket: > >> >>> > >> >>> <<>> > >> >>> perf stat -e branches --per-socket -a true > >> >>> > >> >>> Performance counter stats for 'system wide': > >> >>> > >> >>> S-1 32416,851 branches > >> >>> <<>> > >> >>> > >> >>> Here S has -1 in above result. But with -A option which also > >> >>> expects CPU in one column in the result, below is observed. > >> >>> > >> >>> <<>> > >> >>> /bin/perf stat -e instructions -A -a true > >> >>> > >> >>> Performance counter stats for 'system wide': > >> >>> > >> >>>47,146 instructions > >> >>>45,226 instructions > >> >>>43,354 instructions > >> >>>45,184 instructions > >> >>> <<>> > >> >>> > >> >>> If the cpu id value is pointing to -1 also, it makes sense > >> >>> to display the column in the output to replicate the behaviour > >> >>> or to be in precedence with other aggr options(like per-socket, > >> >>> per-core). Remove the check "id.core" so that CPU field gets > >> >>> displayed in the output. > >> >> > >> >> Why would you want to print -1 out? Seems like the if statement was a > >> >> good one to me, otherwise the output looks a bit broken to users. Are > >> >> the other aggregation modes even working if -1 is set for socket and > >> >> die? Maybe we need to not print -1 in those cases or exit earlier with a > >> >> failure. > >> >> > >> >> The -1 value has a specific internal meaning which is "to not > >> >> aggregate". It doesn't mean "not set". > >> > > >> > Currently, this check is done only for printing cpu value. > >> > For socket/die/core values, this check is not done. Pasting an > >> > example snippet from a powerpc system ( specifically from pseries > >> > platform where > >> > the value is set to -1 ) > >> > > >> > ./perf stat --per-core -a -C 1 true > >> > > >> > Performance counter stats for 'system wide': > >> > > >> > S-1-D-1-C-1 1 1.06 msec cpu-clock > >> >#1.018 CPUs utilized > >> > S-1-D-1-C-1 1 2 context-switches > >> >#1.879 K/sec > >> > S-1-D-1-C-1 1 0 cpu-migrations > >> >#0.000 /sec > >> > > >> > Here though the value is -1, we are displaying it. Where as in case of > >> > cpu, the first column will be > >> > empty since we do a check before printing. > >> > > >> > Example: > >> > > >> > ./perf stat --per-core -A -C 1 true > >> > > >> > Performance counter stats for 'CPU(s) 1': > >> > > >> > 0.88 msec cpu-clock#1.022 CPUs > >> > utilized > >> > 2 context-switches > >> > 0 cpu-migrations > >> > > >> > > >> > No sure, whether
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
On Thu, Sep 29, 2022 at 5:56 AM James Clark wrote: > > > > On 29/09/2022 09:49, Athira Rajeev wrote: > > > > > >> On 28-Sep-2022, at 9:05 PM, James Clark wrote: > >> > >> > >> > > > > Hi James, > > > > Thanks for looking at the patch and sharing review comments. > > > >> On 13/09/2022 12:57, Athira Rajeev wrote: > >>> perf stat includes option to specify aggr_mode to display > >>> per-socket, per-core, per-die, per-node counter details. > >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the > >>> counter values are displayed for each cpu along with "CPU" > >>> value in one field of the output. > >>> > >>> Each of the aggregate mode uses the information fetched > >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id, > >> > >> I thought that this wouldn't apply to the cpu field because cpu is > >> basically interchangeable as an index in cpumap, rather than anything > >> being read from the topology file. > > > > The cpu value is filled in this function: > > > > Function : aggr_cpu_id__cpu > > Code: util/cpumap.c > > > >> > >>> physical_package_id. Utility functions in "cpumap.c" fetches > >>> this information and populates the socket id, core id, cpu etc. > >>> If the platform does not expose the topology information, > >>> these values will be set to -1. Example, in case of powerpc, > >>> details like physical_package_id is restricted to be exposed > >>> in pSeries platform. So id.socket, id.core, id.cpu all will > >>> be set as -1. > >>> > >>> In case of displaying socket or die value, there is no check > >>> done in the "aggr_printout" function to see if it points to > >>> valid socket id or die. But for displaying "cpu" value, there > >>> is a check for "if (id.core > -1)". In case of powerpc pSeries > >>> where detail like physical_package_id is restricted to be > >>> exposed, id.core will be set to -1. Hence the column or field > >>> itself for CPU won't be displayed in the output. > >>> > >>> Result for per-socket: > >>> > >>> <<>> > >>> perf stat -e branches --per-socket -a true > >>> > >>> Performance counter stats for 'system wide': > >>> > >>> S-1 32416,851 branches > >>> <<>> > >>> > >>> Here S has -1 in above result. But with -A option which also > >>> expects CPU in one column in the result, below is observed. > >>> > >>> <<>> > >>> /bin/perf stat -e instructions -A -a true > >>> > >>> Performance counter stats for 'system wide': > >>> > >>>47,146 instructions > >>>45,226 instructions > >>>43,354 instructions > >>>45,184 instructions > >>> <<>> > >>> > >>> If the cpu id value is pointing to -1 also, it makes sense > >>> to display the column in the output to replicate the behaviour > >>> or to be in precedence with other aggr options(like per-socket, > >>> per-core). Remove the check "id.core" so that CPU field gets > >>> displayed in the output. > >> > >> Why would you want to print -1 out? Seems like the if statement was a > >> good one to me, otherwise the output looks a bit broken to users. Are > >> the other aggregation modes even working if -1 is set for socket and > >> die? Maybe we need to not print -1 in those cases or exit earlier with a > >> failure. > >> > >> The -1 value has a specific internal meaning which is "to not > >> aggregate". It doesn't mean "not set". > > > > Currently, this check is done only for printing cpu value. > > For socket/die/core values, this check is not done. Pasting an > > example snippet from a powerpc system ( specifically from pseries platform > > where > > the value is set to -1 ) > > > > ./perf stat --per-core -a -C 1 true > > > > Performance counter stats for 'system wide': > > > > S-1-D-1-C-1 1 1.06 msec cpu-clock > > #1.018 CPUs utilized > > S-1-D-1-C-1 1 2 context-switches > > #1.879 K/sec > > S-1-D-1-C-1 1 0 cpu-migrations > > #0.000 /sec > > > > Here though the value is -1, we are displaying it. Where as in case of cpu, > > the first column will be > > empty since we do a check before printing. > > > > Example: > > > > ./perf stat --per-core -A -C 1 true > > > > Performance counter stats for 'CPU(s) 1': > > > > 0.88 msec cpu-clock#1.022 CPUs > > utilized > > 2 context-switches > > 0 cpu-migrations > > > > > > No sure, whether there are scripts out there, which consume the current > > format and > > not displaying -1 may break it. That is why we tried with change to remove > > check for cpu, similar to > > other modes like socket, die, core etc. > > I wouldn't worry about that because there are json and CSV modes which > are machine readable, and -1 is already not always displayed. If > anything this change here is also likely to break parsing by adding -1 > where it wasn't before. > > > > >
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
On 29/09/2022 09:49, Athira Rajeev wrote: > > >> On 28-Sep-2022, at 9:05 PM, James Clark wrote: >> >> >> > > Hi James, > > Thanks for looking at the patch and sharing review comments. > >> On 13/09/2022 12:57, Athira Rajeev wrote: >>> perf stat includes option to specify aggr_mode to display >>> per-socket, per-core, per-die, per-node counter details. >>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the >>> counter values are displayed for each cpu along with "CPU" >>> value in one field of the output. >>> >>> Each of the aggregate mode uses the information fetched >>> from "/sys/devices/system/cpu/cpuX/topology" like core_id, >> >> I thought that this wouldn't apply to the cpu field because cpu is >> basically interchangeable as an index in cpumap, rather than anything >> being read from the topology file. > > The cpu value is filled in this function: > > Function : aggr_cpu_id__cpu > Code: util/cpumap.c > >> >>> physical_package_id. Utility functions in "cpumap.c" fetches >>> this information and populates the socket id, core id, cpu etc. >>> If the platform does not expose the topology information, >>> these values will be set to -1. Example, in case of powerpc, >>> details like physical_package_id is restricted to be exposed >>> in pSeries platform. So id.socket, id.core, id.cpu all will >>> be set as -1. >>> >>> In case of displaying socket or die value, there is no check >>> done in the "aggr_printout" function to see if it points to >>> valid socket id or die. But for displaying "cpu" value, there >>> is a check for "if (id.core > -1)". In case of powerpc pSeries >>> where detail like physical_package_id is restricted to be >>> exposed, id.core will be set to -1. Hence the column or field >>> itself for CPU won't be displayed in the output. >>> >>> Result for per-socket: >>> >>> <<>> >>> perf stat -e branches --per-socket -a true >>> >>> Performance counter stats for 'system wide': >>> >>> S-1 32416,851 branches >>> <<>> >>> >>> Here S has -1 in above result. But with -A option which also >>> expects CPU in one column in the result, below is observed. >>> >>> <<>> >>> /bin/perf stat -e instructions -A -a true >>> >>> Performance counter stats for 'system wide': >>> >>>47,146 instructions >>>45,226 instructions >>>43,354 instructions >>>45,184 instructions >>> <<>> >>> >>> If the cpu id value is pointing to -1 also, it makes sense >>> to display the column in the output to replicate the behaviour >>> or to be in precedence with other aggr options(like per-socket, >>> per-core). Remove the check "id.core" so that CPU field gets >>> displayed in the output. >> >> Why would you want to print -1 out? Seems like the if statement was a >> good one to me, otherwise the output looks a bit broken to users. Are >> the other aggregation modes even working if -1 is set for socket and >> die? Maybe we need to not print -1 in those cases or exit earlier with a >> failure. >> >> The -1 value has a specific internal meaning which is "to not >> aggregate". It doesn't mean "not set". > > Currently, this check is done only for printing cpu value. > For socket/die/core values, this check is not done. Pasting an > example snippet from a powerpc system ( specifically from pseries platform > where > the value is set to -1 ) > > ./perf stat --per-core -a -C 1 true > > Performance counter stats for 'system wide': > > S-1-D-1-C-1 1 1.06 msec cpu-clock > #1.018 CPUs utilized > S-1-D-1-C-1 1 2 context-switches > #1.879 K/sec > S-1-D-1-C-1 1 0 cpu-migrations > #0.000 /sec > > Here though the value is -1, we are displaying it. Where as in case of cpu, > the first column will be > empty since we do a check before printing. > > Example: > > ./perf stat --per-core -A -C 1 true > > Performance counter stats for 'CPU(s) 1': > > 0.88 msec cpu-clock#1.022 CPUs > utilized > 2 context-switches > > 0 cpu-migrations > > > > No sure, whether there are scripts out there, which consume the current > format and > not displaying -1 may break it. That is why we tried with change to remove > check for cpu, similar to > other modes like socket, die, core etc. I wouldn't worry about that because there are json and CSV modes which are machine readable, and -1 is already not always displayed. If anything this change here is also likely to break parsing by adding -1 where it wasn't before. > > Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the > values to -1 . I was checking to see where we are
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
> On 28-Sep-2022, at 9:05 PM, James Clark wrote: > > > Hi James, Thanks for looking at the patch and sharing review comments. > On 13/09/2022 12:57, Athira Rajeev wrote: >> perf stat includes option to specify aggr_mode to display >> per-socket, per-core, per-die, per-node counter details. >> Also there is option -A ( AGGR_NONE, -no-aggr ), where the >> counter values are displayed for each cpu along with "CPU" >> value in one field of the output. >> >> Each of the aggregate mode uses the information fetched >> from "/sys/devices/system/cpu/cpuX/topology" like core_id, > > I thought that this wouldn't apply to the cpu field because cpu is > basically interchangeable as an index in cpumap, rather than anything > being read from the topology file. The cpu value is filled in this function: Function : aggr_cpu_id__cpu Code: util/cpumap.c > >> physical_package_id. Utility functions in "cpumap.c" fetches >> this information and populates the socket id, core id, cpu etc. >> If the platform does not expose the topology information, >> these values will be set to -1. Example, in case of powerpc, >> details like physical_package_id is restricted to be exposed >> in pSeries platform. So id.socket, id.core, id.cpu all will >> be set as -1. >> >> In case of displaying socket or die value, there is no check >> done in the "aggr_printout" function to see if it points to >> valid socket id or die. But for displaying "cpu" value, there >> is a check for "if (id.core > -1)". In case of powerpc pSeries >> where detail like physical_package_id is restricted to be >> exposed, id.core will be set to -1. Hence the column or field >> itself for CPU won't be displayed in the output. >> >> Result for per-socket: >> >> <<>> >> perf stat -e branches --per-socket -a true >> >> Performance counter stats for 'system wide': >> >> S-1 32416,851 branches >> <<>> >> >> Here S has -1 in above result. But with -A option which also >> expects CPU in one column in the result, below is observed. >> >> <<>> >> /bin/perf stat -e instructions -A -a true >> >> Performance counter stats for 'system wide': >> >>47,146 instructions >>45,226 instructions >>43,354 instructions >>45,184 instructions >> <<>> >> >> If the cpu id value is pointing to -1 also, it makes sense >> to display the column in the output to replicate the behaviour >> or to be in precedence with other aggr options(like per-socket, >> per-core). Remove the check "id.core" so that CPU field gets >> displayed in the output. > > Why would you want to print -1 out? Seems like the if statement was a > good one to me, otherwise the output looks a bit broken to users. Are > the other aggregation modes even working if -1 is set for socket and > die? Maybe we need to not print -1 in those cases or exit earlier with a > failure. > > The -1 value has a specific internal meaning which is "to not > aggregate". It doesn't mean "not set". Currently, this check is done only for printing cpu value. For socket/die/core values, this check is not done. Pasting an example snippet from a powerpc system ( specifically from pseries platform where the value is set to -1 ) ./perf stat --per-core -a -C 1 true Performance counter stats for 'system wide': S-1-D-1-C-1 1 1.06 msec cpu-clock #1.018 CPUs utilized S-1-D-1-C-1 1 2 context-switches #1.879 K/sec S-1-D-1-C-1 1 0 cpu-migrations #0.000 /sec Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be empty since we do a check before printing. Example: ./perf stat --per-core -A -C 1 true Performance counter stats for 'CPU(s) 1': 0.88 msec cpu-clock#1.022 CPUs utilized 2 context-switches 0 cpu-migrations No sure, whether there are scripts out there, which consume the current format and not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to other modes like socket, die, core etc. Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”. What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero. Reference: defined in util/stat.h enum aggr_mode { AGGR_NONE, James, can you point me to reference for that meaning if I have missed anything. Thanks Athira > >> >> After the fix: >> >> <<>> >> perf stat -e instructions -A -a true >> >> Performance counter stats for 'system wide': >> >> CPU-1 64,034
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
On 13/09/2022 12:57, Athira Rajeev wrote: > perf stat includes option to specify aggr_mode to display > per-socket, per-core, per-die, per-node counter details. > Also there is option -A ( AGGR_NONE, -no-aggr ), where the > counter values are displayed for each cpu along with "CPU" > value in one field of the output. > > Each of the aggregate mode uses the information fetched > from "/sys/devices/system/cpu/cpuX/topology" like core_id, I thought that this wouldn't apply to the cpu field because cpu is basically interchangeable as an index in cpumap, rather than anything being read from the topology file. > physical_package_id. Utility functions in "cpumap.c" fetches > this information and populates the socket id, core id, cpu etc. > If the platform does not expose the topology information, > these values will be set to -1. Example, in case of powerpc, > details like physical_package_id is restricted to be exposed > in pSeries platform. So id.socket, id.core, id.cpu all will > be set as -1. > > In case of displaying socket or die value, there is no check > done in the "aggr_printout" function to see if it points to > valid socket id or die. But for displaying "cpu" value, there > is a check for "if (id.core > -1)". In case of powerpc pSeries > where detail like physical_package_id is restricted to be > exposed, id.core will be set to -1. Hence the column or field > itself for CPU won't be displayed in the output. > > Result for per-socket: > > <<>> > perf stat -e branches --per-socket -a true > > Performance counter stats for 'system wide': > > S-1 32416,851 branches > <<>> > > Here S has -1 in above result. But with -A option which also > expects CPU in one column in the result, below is observed. > > <<>> > /bin/perf stat -e instructions -A -a true > > Performance counter stats for 'system wide': > > 47,146 instructions > 45,226 instructions > 43,354 instructions > 45,184 instructions > <<>> > > If the cpu id value is pointing to -1 also, it makes sense > to display the column in the output to replicate the behaviour > or to be in precedence with other aggr options(like per-socket, > per-core). Remove the check "id.core" so that CPU field gets > displayed in the output. Why would you want to print -1 out? Seems like the if statement was a good one to me, otherwise the output looks a bit broken to users. Are the other aggregation modes even working if -1 is set for socket and die? Maybe we need to not print -1 in those cases or exit earlier with a failure. The -1 value has a specific internal meaning which is "to not aggregate". It doesn't mean "not set". > > After the fix: > > <<>> > perf stat -e instructions -A -a true > > Performance counter stats for 'system wide': > > CPU-1 64,034 instructions > CPU-1 68,941 instructions > CPU-1 59,418 instructions > CPU-1 70,478 instructions > CPU-1 65,201 instructions > CPU-1 63,704 instructions > <<>> > > This is caught while running "perf test" for > "stat+json_output.sh" and "stat+csv_output.sh". Is it possible to fix the issue by making the tests cope with the lack of the CPU id? > > Reported-by: Disha Goel > Signed-off-by: Athira Rajeev > --- > tools/perf/util/stat-display.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c > index b82844cb0ce7..1b751a730271 100644 > --- a/tools/perf/util/stat-display.c > +++ b/tools/perf/util/stat-display.c > @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config > *config, > id.socket, > id.die, > id.core); > - } else if (id.core > -1) { > + } else This should have been "id.cpu.cpu > -1". Looks like it was changed by some kind of bad merge or rebase in df936cadfb because there is no obvious justification for the change to .core in that commit. > fprintf(config->output, "\"cpu\" : \"%d\", ", > id.cpu.cpu); > - } > } else { > if (evsel->percore && !config->percore_show_thread) { > fprintf(config->output, "S%d-D%d-C%*d%s", > @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config > *config, > id.die, > config->csv_output ? 0 : -3, > id.core, config->csv_sep); > - } else if (id.core > -1) { > + } else > fprintf(config->output, "CPU%*d%s", >
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
> On 16-Sep-2022, at 5:01 PM, Disha Goel wrote: > > This Message Is From an External Sender > This message came from outside your organization. > > > On 9/13/22 5:27 PM, Athira Rajeev wrote: >> perf stat includes option to specify aggr_mode to display >> per-socket, per-core, per-die, per-node counter details. >> Also there is option -A ( AGGR_NONE, -no-aggr ), where the >> counter values are displayed for each cpu along with "CPU" >> value in one field of the output. >> >> Each of the aggregate mode uses the information fetched >> from "/sys/devices/system/cpu/cpuX/topology" like core_id, >> physical_package_id. Utility functions in "cpumap.c" fetches >> this information and populates the socket id, core id, cpu etc. >> If the platform does not expose the topology information, >> these values will be set to -1. Example, in case of powerpc, >> details like physical_package_id is restricted to be exposed >> in pSeries platform. So id.socket, id.core, id.cpu all will >> be set as -1. >> >> In case of displaying socket or die value, there is no check >> done in the "aggr_printout" function to see if it points to >> valid socket id or die. But for displaying "cpu" value, there >> is a check for "if (id.core > -1)". In case of powerpc pSeries >> where detail like physical_package_id is restricted to be >> exposed, id.core will be set to -1. Hence the column or field >> itself for CPU won't be displayed in the output. >> >> Result for per-socket: >> >> <<>> >> perf stat -e branches --per-socket -a true >> >> Performance counter stats for 'system wide': >> >> S-1 32416,851 branches >> <<>> >> >> Here S has -1 in above result. But with -A option which also >> expects CPU in one column in the result, below is observed. >> >> <<>> >> /bin/perf stat -e instructions -A -a true >> >> Performance counter stats for 'system wide': >> >> 47,146 instructions >> 45,226 instructions >> 43,354 instructions >> 45,184 instructions >> <<>> >> >> If the cpu id value is pointing to -1 also, it makes sense >> to display the column in the output to replicate the behaviour >> or to be in precedence with other aggr options(like per-socket, >> per-core). Remove the check "id.core" so that CPU field gets >> displayed in the output. >> >> After the fix: >> >> <<>> >> perf stat -e instructions -A -a true >> >> Performance counter stats for 'system wide': >> >> CPU-1 64,034 instructions >> CPU-1 68,941 instructions >> CPU-1 59,418 instructions >> CPU-1 70,478 instructions >> CPU-1 65,201 instructions >> CPU-1 63,704 instructions >> <<>> >> >> This is caught while running "perf test" for >> "stat+json_output.sh" and "stat+csv_output.sh". >> >> Reported-by: Disha Goel >> >> >> Signed-off-by: Athira Rajeev >> > Tested-by: Disha Goel Hi, Looking for review comments for this change. Thanks Athira >> --- >> tools/perf/util/stat-display.c | 6 ++ >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c >> index b82844cb0ce7..1b751a730271 100644 >> --- a/tools/perf/util/stat-display.c >> +++ b/tools/perf/util/stat-display.c >> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config >> *config, >> id.socket, >> id.die, >> id.core); >> -} else if (id.core > -1) { >> +} else >> fprintf(config->output, "\"cpu\" : \"%d\", ", >> id.cpu.cpu); >> -} >> } else { >> if (evsel->percore && !config->percore_show_thread) { >> fprintf(config->output, "S%d-D%d-C%*d%s", >> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config >> *config, >> id.die, >> config->csv_output ? 0 : -3, >> id.core, config->csv_sep); >> -} else if (id.core > -1) { >> +} else >> fprintf(config->output, "CPU%*d%s", >> config->csv_output ? 0 : -7, >> id.cpu.cpu, config->csv_sep); >> -} >> } >> break; >> case AGGR_THREAD: >>
Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
On 9/13/22 5:27 PM, Athira Rajeev wrote: perf stat includes option to specify aggr_mode to display per-socket, per-core, per-die, per-node counter details. Also there is option -A ( AGGR_NONE, -no-aggr ), where the counter values are displayed for each cpu along with "CPU" value in one field of the output. Each of the aggregate mode uses the information fetched from "/sys/devices/system/cpu/cpuX/topology" like core_id, physical_package_id. Utility functions in "cpumap.c" fetches this information and populates the socket id, core id, cpu etc. If the platform does not expose the topology information, these values will be set to -1. Example, in case of powerpc, details like physical_package_id is restricted to be exposed in pSeries platform. So id.socket, id.core, id.cpu all will be set as -1. In case of displaying socket or die value, there is no check done in the "aggr_printout" function to see if it points to valid socket id or die. But for displaying "cpu" value, there is a check for "if (id.core > -1)". In case of powerpc pSeries where detail like physical_package_id is restricted to be exposed, id.core will be set to -1. Hence the column or field itself for CPU won't be displayed in the output. Result for per-socket: <<>> perf stat -e branches --per-socket -a true Performance counter stats for 'system wide': S-1 32416,851 branches <<>> Here S has -1 in above result. But with -A option which also expects CPU in one column in the result, below is observed. <<>> /bin/perf stat -e instructions -A -a true Performance counter stats for 'system wide': 47,146 instructions 45,226 instructions 43,354 instructions 45,184 instructions <<>> If the cpu id value is pointing to -1 also, it makes sense to display the column in the output to replicate the behaviour or to be in precedence with other aggr options(like per-socket, per-core). Remove the check "id.core" so that CPU field gets displayed in the output. After the fix: <<>> perf stat -e instructions -A -a true Performance counter stats for 'system wide': CPU-1 64,034 instructions CPU-1 68,941 instructions CPU-1 59,418 instructions CPU-1 70,478 instructions CPU-1 65,201 instructions CPU-1 63,704 instructions <<>> This is caught while running "perf test" for "stat+json_output.sh" and "stat+csv_output.sh". Reported-by: Disha Goel Signed-off-by: Athira Rajeev Tested-by: Disha Goel --- tools/perf/util/stat-display.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index b82844cb0ce7..1b751a730271 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config, id.socket, id.die, id.core); - } else if (id.core > -1) { + } else fprintf(config->output, "\"cpu\" : \"%d\", ", id.cpu.cpu); - } } else { if (evsel->percore && !config->percore_show_thread) { fprintf(config->output, "S%d-D%d-C%*d%s", @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config, id.die, config->csv_output ? 0 : -3, id.core, config->csv_sep); - } else if (id.core > -1) { + } else fprintf(config->output, "CPU%*d%s", config->csv_output ? 0 : -7, id.cpu.cpu, config->csv_sep); - } } break; case AGGR_THREAD:
[PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
perf stat includes option to specify aggr_mode to display per-socket, per-core, per-die, per-node counter details. Also there is option -A ( AGGR_NONE, -no-aggr ), where the counter values are displayed for each cpu along with "CPU" value in one field of the output. Each of the aggregate mode uses the information fetched from "/sys/devices/system/cpu/cpuX/topology" like core_id, physical_package_id. Utility functions in "cpumap.c" fetches this information and populates the socket id, core id, cpu etc. If the platform does not expose the topology information, these values will be set to -1. Example, in case of powerpc, details like physical_package_id is restricted to be exposed in pSeries platform. So id.socket, id.core, id.cpu all will be set as -1. In case of displaying socket or die value, there is no check done in the "aggr_printout" function to see if it points to valid socket id or die. But for displaying "cpu" value, there is a check for "if (id.core > -1)". In case of powerpc pSeries where detail like physical_package_id is restricted to be exposed, id.core will be set to -1. Hence the column or field itself for CPU won't be displayed in the output. Result for per-socket: <<>> perf stat -e branches --per-socket -a true Performance counter stats for 'system wide': S-1 32416,851 branches <<>> Here S has -1 in above result. But with -A option which also expects CPU in one column in the result, below is observed. <<>> /bin/perf stat -e instructions -A -a true Performance counter stats for 'system wide': 47,146 instructions 45,226 instructions 43,354 instructions 45,184 instructions <<>> If the cpu id value is pointing to -1 also, it makes sense to display the column in the output to replicate the behaviour or to be in precedence with other aggr options(like per-socket, per-core). Remove the check "id.core" so that CPU field gets displayed in the output. After the fix: <<>> perf stat -e instructions -A -a true Performance counter stats for 'system wide': CPU-1 64,034 instructions CPU-1 68,941 instructions CPU-1 59,418 instructions CPU-1 70,478 instructions CPU-1 65,201 instructions CPU-1 63,704 instructions <<>> This is caught while running "perf test" for "stat+json_output.sh" and "stat+csv_output.sh". Reported-by: Disha Goel Signed-off-by: Athira Rajeev --- tools/perf/util/stat-display.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index b82844cb0ce7..1b751a730271 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config, id.socket, id.die, id.core); - } else if (id.core > -1) { + } else fprintf(config->output, "\"cpu\" : \"%d\", ", id.cpu.cpu); - } } else { if (evsel->percore && !config->percore_show_thread) { fprintf(config->output, "S%d-D%d-C%*d%s", @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config, id.die, config->csv_output ? 0 : -3, id.core, config->csv_sep); - } else if (id.core > -1) { + } else fprintf(config->output, "CPU%*d%s", config->csv_output ? 0 : -7, id.cpu.cpu, config->csv_sep); - } } break; case AGGR_THREAD: -- 2.31.1