Re: [PATCH] perf stat: Take cgroups into account for shadow stats

2020-11-16 Thread Namhyung Kim
On Sun, Nov 15, 2020 at 10:08 PM Andi Kleen  wrote:
>
> Actually, thinking about it more, you should probably pass around ctx/cgroup
> in a single abstract argument.  Otherwise you have to change all the metrics
> functions for the next filter too.

Ok, will do.

Thanks,
Namhyung


Re: [PATCH] perf stat: Take cgroups into account for shadow stats

2020-11-16 Thread Namhyung Kim
Hi Andi,

On Sun, Nov 15, 2020 at 10:05 PM Andi Kleen  wrote:
>
> > @@ -57,6 +59,9 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
> >   if (a->ctx != b->ctx)
> >   return a->ctx - b->ctx;
> >
> > + if (a->cgrp != b->cgrp)
> > + return (char *)a->cgrp < (char *)b->cgrp ? -1 : +1;
>
> This means the sort order will depend on heap randomization,
> which will make it harder to debug.
>
> Better use something stable like the inode number of the cgroup.

I don't think they are used for sorting.  It's just compared to find a
matching event value.

As for heap randomization, we already use the same technique for the
evsel and runtime_stat pointers.  I could make it use the cgroup id
instead, but I'm not sure it's really worth it.

>
> Do we have the same problem with other filters?

I'm not aware of it.

>
> The rest of the patch looks good to me.

Thanks for the review!

Namhyung


Re: [PATCH] perf stat: Take cgroups into account for shadow stats

2020-11-15 Thread Andi Kleen
Actually, thinking about it more, you should probably pass around ctx/cgroup
in a single abstract argument.  Otherwise you have to change all the metrics
functions for the next filter too.


Re: [PATCH] perf stat: Take cgroups into account for shadow stats

2020-11-15 Thread Andi Kleen
> @@ -57,6 +59,9 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
>   if (a->ctx != b->ctx)
>   return a->ctx - b->ctx;
>  
> + if (a->cgrp != b->cgrp)
> + return (char *)a->cgrp < (char *)b->cgrp ? -1 : +1;

This means the sort order will depend on heap randomization, 
which will make it harder to debug.

Better use something stable like the inode number of the cgroup.

Do we have the same problem with other filters?

The rest of the patch looks good to me.

-Andi


[PATCH] perf stat: Take cgroups into account for shadow stats

2020-11-13 Thread Namhyung Kim
As of now, perf stat doesn't consider cgroups when collecting shadow
stats and metrics, so counter values from different cgroups are saved
in the same slot.  This results in incorrect numbers when those
cgroups run different workloads.

For example, look at the output below - cgroups A and C run the same
CPU-burning workload while cgroup B runs a light workload.

  $ perf stat -a -e cycles,instructions --for-each-cgroup A,B,C  sleep 1

   Performance counter stats for 'system wide':

       3,958,116,522      cycles                 A
       6,722,650,929      instructions           A    #    2.53  insn per cycle
           1,132,741      cycles                 B
             571,743      instructions           B    #    0.00  insn per cycle
       4,007,799,935      cycles                 C
       6,793,181,523      instructions           C    #    2.56  insn per cycle

   1.001050869 seconds time elapsed

When I run perf stat with a single workload, it usually shows an IPC
around 1.7.  We can verify it for cgroup A: 6,722,650,929 /
3,958,116,522 = 1.698.

But in this case, since cgroups are ignored, the cycle counts were
averaged across all cgroups, so the lower averaged value was used for
the IPC calculation, resulting in values around 2.5.

  avg cycle: (3958116522 + 1132741 + 4007799935) / 3 = 2655683066
  IPC (A)  :  6722650929 / 2655683066 = 2.531
  IPC (B)  :  571743 / 2655683066 = 0.0002
  IPC (C)  :  6793181523 / 2655683066 = 2.557

We can simply compare the cgroup pointers in the evsel; they will be
NULL when no cgroups are specified.  With this patch, I can see
correct numbers like below:

  $ perf stat -a -e cycles,instructions --for-each-cgroup A,B,C  sleep 1

  Performance counter stats for 'system wide':

       4,171,051,687      cycles                 A
       7,219,793,922      instructions           A    #    1.73  insn per cycle
           1,051,189      cycles                 B
             583,102      instructions           B    #    0.55  insn per cycle
       4,171,124,710      cycles                 C
       7,192,944,580      instructions           C    #    1.72  insn per cycle

   1.007909814 seconds time elapsed

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/stat-shadow.c | 243 ++
 1 file changed, 132 insertions(+), 111 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 901265127e36..10d0f5a0fd4a 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -8,6 +8,7 @@
 #include "evlist.h"
 #include "expr.h"
 #include "metricgroup.h"
+#include "cgroup.h"
 #include <linux/zalloc.h>
 
 /*
@@ -28,6 +29,7 @@ struct saved_value {
enum stat_type type;
int ctx;
int cpu;
+   struct cgroup *cgrp;
struct runtime_stat *stat;
struct stats stats;
u64 metric_total;
@@ -57,6 +59,9 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
if (a->ctx != b->ctx)
return a->ctx - b->ctx;
 
+   if (a->cgrp != b->cgrp)
+   return (char *)a->cgrp < (char *)b->cgrp ? -1 : +1;
+
if (a->evsel == NULL && b->evsel == NULL) {
if (a->stat == b->stat)
return 0;
@@ -100,7 +105,8 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
  bool create,
  enum stat_type type,
  int ctx,
- struct runtime_stat *st)
+ struct runtime_stat *st,
+ struct cgroup *cgrp)
 {
struct rblist *rblist;
struct rb_node *nd;
@@ -110,6 +116,7 @@ static struct saved_value *saved_value_lookup(struct evsel *evsel,
.type = type,
.ctx = ctx,
.stat = st,
+   .cgrp = cgrp,
};
 
	rblist = &st->value_list;
@@ -193,10 +200,11 @@ void perf_stat__reset_shadow_per_stat(struct runtime_stat *st)
 
 static void update_runtime_stat(struct runtime_stat *st,
enum stat_type type,
-   int ctx, int cpu, u64 count)
+   int ctx, int cpu, u64 count,
+   struct cgroup *cgrp)
 {
struct saved_value *v = saved_value_lookup(NULL, cpu, true,
-  type, ctx, st);
+  type, ctx, st, cgrp);
 
if (v)
		update_stats(&v->stats, count);
@@ -212,80 +220,81 @@ void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
 {
int ctx = evsel_context(counter);
u64 count_ns = count;
+   struct cgroup *cgrp = counter->cgrp;
struct saved_value *v;
 
count *= counter->scale;
 
if (evsel__is_clock(counter))
-   update_runtime_stat(st, STAT_NSECS, 0, cpu, count_ns);
+		update_runtime_stat(st, STAT_NSECS, 0, cpu, count_ns, cgrp);