On Wed, Feb 2, 2011 at 4:46 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Wed, 2011-02-02 at 17:20 +0530, Balbir Singh wrote:
>> * Peter Zijlstra <pet...@infradead.org> [2011-02-02 12:29:20]:
>>
>> > On Thu, 2011-01-20 at 15:39 +0100, Peter Zijlstra wrote:
>> > > On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote:
>> > > > @@ -4259,8 +4261,20 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
>> > > >
>> > > >         /* Reassign the task to the init_css_set. */
>> > > >         task_lock(tsk);
>> > > > +       /*
>> > > > +        * we mask interrupts to prevent:
>> > > > +        * - timer tick to cause event rotation which
>> > > > +        *   could schedule back in cgroup events after
>> > > > +        *   they were switched out by perf_cgroup_sched_out()
>> > > > +        *
>> > > > +        * - preemption which could schedule back in cgroup events
>> > > > +        */
>> > > > +       local_irq_save(flags);
>> > > > +       perf_cgroup_sched_out(tsk);
>> > > >         cg = tsk->cgroups;
>> > > >         tsk->cgroups = &init_css_set;
>> > > > +       perf_cgroup_sched_in(tsk);
>> > > > +       local_irq_restore(flags);
>> > > >         task_unlock(tsk);
>> > > >         if (cg)
>> > > >                 put_css_set_taskexit(cg);
>> > >
>> > > So you too need a callback on cgroup change there.. Li, Paul, any chance
>> > > we can fix this cgroup_subsys::exit callback? The scheduler code needs
>> > > to do funny thing because its in the wrong place as well.
>> >
>> > cgroup guys? Shall I just fix this exit thing since the only user seems
>> > to be the scheduler and now perf for both of which its unfortunate at
>> > best?
>>
>> Are you suggesting that the cgroup_exit on task_exit notification should be
>> pulled out?
>
> No, just fixed. The callback as it exists isn't useful and leads to
> hacks like the above.
>
>> > Balbir, memcontrol.c uses pre_destroy(), I pose that using this method
>> > is broken per definition since it makes the cgroup empty notification
>> > void.
>> >
>>
>> We use pre_destroy() to reclaim, so that delete/rmdir() will be able
>> to clean up the node/group. I am not sure what you mean by it makes
>> the empty notification void and why pre_destroy() is broken?
>
> A quick look at the code looked like it could return -EBUSY (and other
> errors), in that case the rmdir of the empty cgroup will fail.
>
> Therefore it can happen that after the last task is removed, and we get
> the notification that the cgroup is empty, and we attempt the rmdir we
> will fail.
>
> This again means that all such notification handlers must poll state,
> which is ridiculous.
>

Not necessarily - we could make it so that a failed rmdir() sets a bit
that triggers the notification again once the final refcount on the
cgroup is dropped.

Paul

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
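
For illustration only, the re-notification idea from the reply above (a failed
rmdir() sets a bit, and dropping the final refcount fires the notification
again) could be modelled roughly as below. This is a userspace sketch, not
kernel code: struct fake_cgroup, fake_rmdir(), fake_put(), notify_release()
and demo_renotify() are all hypothetical names made up for this example, and
the real cgroup locking and refcounting are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct fake_cgroup {
	int refcount;       /* references still held, e.g. by a subsystem */
	bool renotify;      /* the "bit": set when an rmdir attempt failed */
	int notifications;  /* number of "cgroup is removable" events sent */
};

/* Stand-in for the "cgroup is empty" notification to userspace. */
static void notify_release(struct fake_cgroup *cg)
{
	cg->notifications++;
	printf("notify: cgroup may be removable (event %d)\n",
	       cg->notifications);
}

/* Returns 0 on success, -EBUSY while references remain. */
static int fake_rmdir(struct fake_cgroup *cg)
{
	if (cg->refcount > 0) {
		cg->renotify = true;  /* remember that a removal was tried */
		return -EBUSY;
	}
	return 0;
}

/* Drop one reference; re-send the notification if an rmdir failed earlier. */
static void fake_put(struct fake_cgroup *cg)
{
	if (--cg->refcount == 0 && cg->renotify) {
		cg->renotify = false;
		notify_release(cg);
	}
}

/* Walk through the sequence from the thread; returns events sent. */
int demo_renotify(void)
{
	struct fake_cgroup cg = { .refcount = 1 };

	notify_release(&cg);                /* last task left the cgroup */
	assert(fake_rmdir(&cg) == -EBUSY);  /* a reference is still held */
	fake_put(&cg);                      /* final ref dropped: notify again */
	assert(fake_rmdir(&cg) == 0);       /* the retry can now succeed */
	return cg.notifications;
}
```

With something along these lines, a handler would retry the rmdir when the
second notification arrives instead of polling state after the first -EBUSY.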