Peter, There is indeed a problem in the exit path. We cannot use TASK_DEAD. Using task->flags & PF_EXITING would work, but there is one last gotcha.
We need to have the sequence: perf_event_exit_task(tsk); cgroup_exit(tsk, 1); Be atomic w.r.t., IPI. I am running into timing problem (time_ena, time_run) if you have an IPI to read the counts between perf_event_exit_task() and cgroup_exit(). The issue is that during that time window, tsk->cgrp == current->cgrp, thus the timing code assumes cgroup monitoring is still on and it updates the cgroup time (time_enabled). I am working on a solution that does not involve interrupt masking for those two calls. On Thu, Jan 6, 2011 at 11:15 AM, Peter Zijlstra <pet...@infradead.org> wrote: > On Wed, 2011-01-05 at 22:39 +0100, Stephane Eranian wrote: >> Peter, >> >> On Wed, Jan 5, 2011 at 2:01 PM, Stephane Eranian <eran...@google.com> wrote: >> > On Wed, Jan 5, 2011 at 12:23 PM, Peter Zijlstra <pet...@infradead.org> >> > wrote: >> >> On Mon, 2011-01-03 at 18:20 +0200, Stephane Eranian wrote: >> >>> +void >> >>> +perf_cgroup_switch(struct task_struct *task, struct task_struct *next) >> >>> +{ >> >>> + struct perf_cgroup *cgrp_out = perf_cgroup_from_task(task); >> >>> + struct perf_cgroup *cgrp_in = perf_cgroup_from_task(next); >> >>> + struct perf_cpu_context *cpuctx; >> >>> + struct pmu *pmu; >> >>> + /* >> >>> + * if task is DEAD, then css_out is irrelevant, it has >> >>> + * been changed to init_cgrp in cgroup_exit() from do_exit(). >> >>> + * Furthermore, perf_cgroup_exit_task(), has scheduled out >> >>> + * all css constrained events, only unconstrained events >> >>> + * remain. Therefore we need to reschedule based on css_in. >> >>> + */ >> >>> + if (task->state != TASK_DEAD && cgrp_out == cgrp_in) >> >>> + return; >> >> >> >> I think that check is broken, TASK_DEAD is set way after calling >> >> cgroup_exit(), so if we get preempted in between there you'll still go >> >> funny. >> >> >> >> I looked at this part again. >> >> The original code checking for TASK_DEAD is correct. >> >> The reason is simple, you're looking at perf_cgroup_switch() which is >> invoked as part of schedule() and NOT perf_event_task_exit() (called >> prior to cgroup_exit()). >> Thus, by the time you do the final schedule(), the task state has indeed >> been switched to TASK_DEAD. >> >> I remember testing for this condition during the debug phase. > > But, cgroup_exit() detaches the task from the cgroup, after which the > cgroup can disappear. Furthermore, we can schedule after cgroup_exit() > and before the explicit schedule() invocation. > > Some of the exit functions (say proc_exit_connector) can block and cause > scheduling, and with PREEMPT=y we can get preempted. > > This means you'll be context switching, and thus possibly calling > perf_cgroup_switch(), on a task who's cgroup is possibly destroyed. > > So I'm not at all seeing how this is correct. > ------------------------------------------------------------------------------ Gaining the trust of online customers is vital for the success of any company that requires sensitive data to be transmitted over the Web. Learn how to best implement a security strategy that keeps consumers' information secure and instills the confidence they need to proceed with transactions. http://p.sf.net/sfu/oracle-sfdevnl _______________________________________________ perfmon2-devel mailing list perfmon2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/perfmon2-devel