On Wed, Feb 9, 2011 at 10:47 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Tue, 2011-02-08 at 23:31 +0100, Stephane Eranian wrote:
>> Peter,
>>
>> See comments below.
>>
>>
>> On Mon, Feb 7, 2011 at 5:10 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>> > Compile tested only, depends on the cgroup::exit patch
>> >
>> > --- linux-2.6.orig/include/linux/perf_event.h
>> > +++ linux-2.6/include/linux/perf_event.h
>> > @@ -905,6 +929,9 @@ struct perf_cpu_context {
>> >        struct list_head                rotation_list;
>> >        int                             jiffies_interval;
>> >        struct pmu                      *active_pmu;
>> > +#ifdef CONFIG_CGROUP_PERF
>> > +       struct perf_cgroup              *cgrp;
>> > +#endif
>> >  };
>> >
>> I don't quite understand the motivation for adding cgrp to cpuctx.
>>
>> > --- linux-2.6.orig/kernel/perf_event.c
>> > +++ linux-2.6/kernel/perf_event.c
>> > +static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx)
>> > +{
>> > +       struct perf_cgroup *cgrp_out = cpuctx->cgrp;
>> > +       if (cgrp_out)
>> > +               __update_cgrp_time(cgrp_out);
>> > +}
>> > +
>> What's the benefit of this form compared to the original from_task() version?
>
> Answering both questions, I did this so we could still do the
> sched_out() while the task has already been flipped to a new cgroup.
> Note that both attach and the new exit cgroup_subsys methods are called
> after they update the task's cgroup. While they do provide the old
> cgroup as an argument, making use of that requires passing that along
> which would have been a much more invasive change.
>
>
>> > +                       if (mode & PERF_CGROUP_SWOUT)
>> > +                               cpu_ctx_sched_out(cpuctx, EVENT_ALL);
>> > +
>> > +                       if (mode & PERF_CGROUP_SWIN) {
>> > +                               cpu_ctx_sched_in(cpuctx, EVENT_ALL, task, 1);
>> > +                               cpuctx->cgrp = perf_cgroup_from_task(task);
>> > +                       }
>> > +               }
>> I think there is a risk on cpuctx->cgrp pointing to stale cgrp information.
>> Shouldn't we also set cpuctx->cgrp = NULL on SWOUT?
>
> Yeah, we probably should.
>
>> > +static int __perf_cgroup_move(void *info)
>> > +{
>> > +       struct task_struct *task = info;
>> > +       perf_cgroup_switch(task, PERF_CGROUP_SWOUT | PERF_CGROUP_SWIN);
>> > +       return 0;
>> > +}
>> > +
>> > +static void perf_cgroup_move(struct task_struct *task)
>> > +{
>> > +       task_function_call(task, __perf_cgroup_move, task);
>> > +}
>> > +
>> > +static void perf_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>> > +               struct cgroup *old_cgrp, struct task_struct *task,
>> > +               bool threadgroup)
>> > +{
>> > +       perf_cgroup_move(task);
>> > +       if (threadgroup) {
>> > +               struct task_struct *c;
>> > +               rcu_read_lock();
>> > +               list_for_each_entry_rcu(c, &task->thread_group, thread_group) {
>> > +                       perf_cgroup_move(c);
>> > +               }
>> > +               rcu_read_unlock();
>> > +       }
>> > +}
>> > +
>> I suspect my original patch was not necessarily handling the attach
>> completely when you move an existing task into a cgroup which was already
>> monitored. I think you may have had to wait until a ctxsw. Looks like this
>> callback handles this better.
>
> Right, this deals with moving a task into a cgroup that isn't currently
> being monitored and its converse, moving it out of a cgroup that is
> being monitored.
>
>> Let me make sure I understand the threadgroup iteration, though. I suspect
>> this handles the situation where a multi-threaded app is moved into a cgroup
>
> Indeed.
>
>> while there is already cgroup monitoring active. In that case, and if we do
>> not want to wait until there is at least one ctxsw on all CPUs, then we have
>> to check whether the other threads are already running on the other CPUs.
>> If so, we need to do a cgroup switch on those CPUs. Otherwise, we have
>> nothing to do. Am I getting this right?
>
> Right, so if any of those tasks is currently running, that cpu will be
> monitoring their old cgroup, hence we send an IPI to flip cgroups.
>
I have built a test case where this would trigger. I launched a multi-threaded
app and then moved the pid into a cgroup via: echo PID >/cgroup/tests/tasks.
I don't see any perf_cgroup move beyond the one for the PID passed.

I looked at kernel/cgroup.c and I could not find an invocation of
ss->attach() that would pass threadgroup = true. So I am confused here.

I wonder how the cgroupfs 'echo PID >tasks' interface would make the distinction
between PID and TID. It seems possible to move one thread of a multi-threaded
process into a cgroup but not the others.

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
