On 03/08, Linus Torvalds wrote:
>
> On Sat, Mar 8, 2014 at 11:44 AM, Oleg Nesterov <o...@redhat.com> wrote:
> >
> > Sure. But another thread or CLONE_VM task can do vmacache_invalidate(),
> > hit vmacache_seqnum == 0 and call vmacache_flush_all() to solve the
> > problem with potential overflow.
>
> How?
>
> Any invalidation is supposed to hold the mm semaphore for writing.

Yes,

> And
> we should have it for reading.

No, dup_task_struct() is obviously lockless. And the new child is not yet
visible to for_each_process_thread(). clone(CLONE_VM) can create a thread
with a corrupted vmacache.

OK. Suppose we have a task T1 which has a valid vmacache,
T1->vmacache_seqnum == T1->mm->vmacache_seqnum == 0. Suppose it sleeps
a lot.

Suppose that its subthread T2 does a lot of munmap's; finally
mm->vmacache_seqnum becomes zero again and T2 calls vmacache_flush_all().

T1 wakes up and does clone(CLONE_VM). The new thread T3 gets a copy of
T1's ->vmacache_seqnum and ->vmacache[].

T2 continues, vmacache_flush_all() finds T1 and does vmacache_flush(T1).
But the new thread T3 is not on the list yet, so vmacache_flush_all()
can't find it.

So T3 will run with vmacache_valid() == true (till the next invalidate(mm),
of course) but its ->vmacache[] points to nowhere.

Oleg.