On Sun, Mar 11, 2018 at 10:32:44PM -0700, Song Liu wrote: > When a perf_event is attached to parent cgroup, it should count events > for all children cgroups: > > parent_group <---- perf_event > \ > - child_group <---- process(es) > > However, in our tests, we found this perf_event cannot report reliable > results. This is because perf_event->cgrp and cpuctx->cgrp are not > identical, thus perf_event->cgrp are not updated properly.
It might help now and in for our older selves, if you could provide a simple reproducer for this. > Signed-off-by: Song Liu <songliubrav...@fb.com> > Reported-by: Ephraim Park <ephiep...@fb.com> > --- > kernel/events/core.c | 68 > +++++++++++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 67 insertions(+), 1 deletion(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 5789810..623d38f 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -766,6 +821,17 @@ perf_cgroup_set_timestamp(struct task_struct *task, > cgrp = perf_cgroup_from_task(task, ctx); > info = this_cpu_ptr(cgrp->info); > info->timestamp = ctx->timestamp; > + > + /* set timestamp for ancestor cgroups */ > + if (cgrp->css.cgroup->level > 1) { > + struct perf_cgroup_info *info = this_cpu_ptr(cgrp->info); > + struct perf_event *event; > + > + list_for_each_entry(event, &ctx->pinned_groups, group_entry) > + perf_ancestor_cgroup_set_timestamp(event, cgrp, info); > + list_for_each_entry(event, &ctx->flexible_groups, group_entry) > + perf_ancestor_cgroup_set_timestamp(event, cgrp, info); > + } > } That doesn't make any kind of sense... if we're interested in ancestor groups, then WTH are you iterating _events_ ? I'm expecting something like: struct cgroup_subsys_state *css; for (css = cgrp->css; css; css = css->parent) { /* fudge time */ }