Hi,

On Thu, Feb 15, 2018 at 10:08:56PM +0000, Mathieu Desnoyers wrote:
> My current theory: do_exit() gets preempted after having set current->mm
> to NULL, and after having issued mmput(), which brings the mm_count down
> to 0.
>
> Unfortunately, if the scheduler switches from a userspace thread
> to a kernel thread, context_switch() loads prev->active_mm which still
> points to the now-freed mm, mmgrab the mm, and eventually does mmdrop
> in finish_task_switch().

For this to happen, we need to get to the mmput() in exit_mm() with:

  mm->mm_count == 1
  mm->mm_users == 1
  mm == active_mm

... but AFAICT, this cannot happen.

If there's no context_switch between clearing current->mm and the
mmput(), then mm->mm_count >= 2, thanks to the prior mmgrab() and the
active_mm reference (in mm_count) that context_switch+finish_task_switch
manage.

If there is a context_switch between the two, then AFAICT, either:

a) The task re-inherits its old mm as active_mm, and mm_count >= 2. In
   context_switch we mmgrab() the active_mm to inherit it, and in
   finish_task_switch() we drop the oldmm, balancing the mmgrab() with
   an mmdrop().

   e.g. we go task -> kernel_task -> task

b) At some point, another user task is scheduled, and we switch to its
   mm. We don't mmgrab() the active_mm, but we mmdrop() the oldmm, which
   means mm_count >= 1. Since we switched to a new mm, if we switch back
   to the first task, it cannot have its own mm as active_mm.

   e.g. we go task -> other_task -> task

I suspect we have a bogus mmdrop or mmput elsewhere, and do_exit() and
finish_task_switch() aren't to blame.

Thanks,
Mark.
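
For reference, the accounting above can be replayed in a small userspace
model. This is only a sketch under assumptions: struct mm, struct task,
enum path, struct result and run_exit_mm() are made-up names, and
context_switch() here keeps just the reference counting from the kernel's
context_switch() + finish_task_switch(), nothing else. A running user
task is assumed to start with mm_users == 1 and mm_count == 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures (hypothetical, refcounts only). */
struct mm   { int mm_users; int mm_count; bool freed; };
struct task { struct mm *mm; struct mm *active_mm; };

/* Each helper asserts that the mm it touches has not been freed. */
static void mmgrab(struct mm *mm) { assert(!mm->freed); mm->mm_count++; }

static void mmdrop(struct mm *mm)
{
	assert(!mm->freed);
	if (--mm->mm_count == 0)
		mm->freed = true;	/* stands in for __mmdrop() */
}

static void mmput(struct mm *mm)
{
	assert(!mm->freed);
	if (--mm->mm_users == 0)
		mmdrop(mm);		/* all users share one mm_count ref */
}

/* Refcount-only model of context_switch() + finish_task_switch(). */
static void context_switch(struct task *prev, struct task *next)
{
	struct mm *oldmm = prev->active_mm;
	struct mm *prev_mm = NULL;

	if (!next->mm) {		/* next has no mm: borrow oldmm lazily */
		next->active_mm = oldmm;
		mmgrab(oldmm);
	}				/* else: real mm switch, no refcount change */

	if (!prev->mm) {		/* prev was only borrowing its active_mm */
		prev->active_mm = NULL;
		prev_mm = oldmm;
	}

	if (prev_mm)			/* the finish_task_switch() part */
		mmdrop(prev_mm);
}

enum path { NO_SWITCH, VIA_KTHREAD, VIA_USER_TASK };

struct result {
	int count_before_mmput;		/* mm_count when exit_mm() reaches mmput() */
	int count_after_mmput;
	bool dangling;			/* task->active_mm points at a freed mm */
};

/* Replay exit_mm() with an optional round trip through another task. */
struct result run_exit_mm(enum path p)
{
	struct mm mm       = { .mm_users = 1, .mm_count = 1, .freed = false };
	struct mm other_mm = { .mm_users = 1, .mm_count = 1, .freed = false };
	struct task task    = { &mm, &mm };
	struct task kthread = { NULL, NULL };
	struct task other   = { &other_mm, &other_mm };
	struct result r;

	mmgrab(&mm);			/* first half of exit_mm() */
	task.mm = NULL;

	if (p == VIA_KTHREAD) {		/* task -> kernel_task -> task */
		context_switch(&task, &kthread);
		context_switch(&kthread, &task);
	} else if (p == VIA_USER_TASK) { /* task -> other_task -> task */
		context_switch(&task, &other);
		context_switch(&other, &task);
	}

	r.count_before_mmput = mm.mm_count;
	mmput(&mm);			/* second half of exit_mm() */
	r.count_after_mmput = mm.mm_count;
	r.dangling = mm.freed && task.active_mm == &mm;
	return r;
}
```

In this model, run_exit_mm(NO_SWITCH) and run_exit_mm(VIA_KTHREAD) both
reach the mmput() with mm_count == 2. run_exit_mm(VIA_USER_TASK) reaches
it with mm_count == 1 and the mm is freed, but by then task.active_mm
points at the other task's mm, so nothing is left referencing the freed
one.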