From: Vladimir Davydov <vdavy...@virtuozzo.com>

An mm_struct may be pinned by a file. An example is a vhost-net device
created by qemu/kvm (see vhost_net_ioctl -> vhost_net_set_owner ->
vhost_dev_set_owner). If such a process gets OOM-killed, the reference to
its mm_struct will only be released from exit_task_work -> ____fput ->
__fput -> vhost_net_release -> vhost_dev_cleanup, which is called after
exit_mm, where TIF_MEMDIE is cleared. As a result, we can start selecting
the next victim before giving the last one a chance to free its memory.
In practice, this leads to killing several VMs along with the fattest
one.

https://jira.sw.ru/browse/PSBM-44683

Signed-off-by: Vladimir Davydov <vdavy...@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktk...@virtuozzo.com>

khorenko@: Volodya tried to send this upstream, but the fix was not
applied:
https://lkml.org/lkml/2016/2/29/537

The patch was rejected because in mainstream (ms) it increases the chance
of a deadlock: someone takes lock A -> tries to allocate memory -> no
memory -> calls OOM -> OOM selects a task -> the task requires lock A in
order to die -> deadlock.

A better solution has not been implemented in ms. We are applying the
current patch because we have a timeout against such a deadlock: if OOM
cannot kill a task within X secs, the OOM caller drops its locks and
tries to allocate memory once again.

Signed-off-by: Andrey Ryabinin <aryabi...@virtuozzo.com>

(cherry picked from vz8 commit bd5ffae6952cb97fd97d1ffdba6049baab6c9396)
Signed-off-by: Andrey Zhadchenko <andrey.zhadche...@virtuozzo.com>
---
 kernel/exit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 9a89e7f..9e07095 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -499,8 +499,6 @@ static void exit_mm(void)
 	mmap_read_unlock(mm);
 	mm_update_next_owner(mm);
 	mmput(mm);
-	if (test_thread_flag(TIF_MEMDIE))
-		exit_oom_victim();
 }
 
 static struct task_struct *find_alive_thread(struct task_struct *p)
@@ -824,6 +822,8 @@ void __noreturn do_exit(long code)
 	exit_task_namespaces(tsk);
 	exit_task_work(tsk);
 	exit_thread(tsk);
+	if (test_thread_flag(TIF_MEMDIE))
+		exit_oom_victim();
 
 	/*
 	 * Flush inherited counters to the parent - before the parent
-- 
1.8.3.1
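
The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `toy_*` types and functions are hypothetical stand-ins invented for illustration, not kernel APIs, and the model reduces the exit path to the two steps that matter here (the task dropping its own mm reference, then task work releasing the file that pins the mm).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these types or helpers are the kernel's. */
struct toy_mm {
	int users;	/* stands in for the mm user count */
	bool freed;	/* set once the last reference is dropped */
};

struct toy_task {
	struct toy_mm *mm;
	bool memdie;		/* stands in for TIF_MEMDIE */
	bool file_pins_mm;	/* an open file (e.g. vhost-net) holds an mm reference */
};

/* Drop one reference; the memory is "freed" only when the count hits zero. */
static void toy_mmput(struct toy_mm *mm)
{
	if (--mm->users == 0)
		mm->freed = true;
}

/*
 * Walk a simplified exit path and report whether the victim's memory was
 * already freed at the moment the memdie flag was cleared.
 * clear_in_exit_mm == true models the old placement of the check,
 * false models the placement after task work has run.
 */
static bool toy_memory_freed_when_memdie_cleared(bool clear_in_exit_mm)
{
	struct toy_mm mm = { .users = 2, .freed = false };	/* task + file */
	struct toy_task t = { .mm = &mm, .memdie = true, .file_pins_mm = true };
	bool freed_at_clear = false;

	/* Step modeled on exit_mm(): the task drops its own reference. */
	toy_mmput(t.mm);
	if (clear_in_exit_mm && t.memdie) {
		t.memdie = false;
		freed_at_clear = mm.freed;
	}

	/* Step modeled on exit_task_work(): the file release drops its reference. */
	if (t.file_pins_mm) {
		toy_mmput(t.mm);
		t.file_pins_mm = false;
	}

	/* Patched placement: clear the flag only after task work has run. */
	if (!clear_in_exit_mm && t.memdie) {
		t.memdie = false;
		freed_at_clear = mm.freed;
	}

	return freed_at_clear;
}
```

In this model, calling `toy_memory_freed_when_memdie_cleared(true)` reports that the flag is cleared while the file still pins the memory, whereas `toy_memory_freed_when_memdie_cleared(false)` reports that the memory is gone by the time the flag is cleared. The model does not cover the lock-related deadlock that the timeout is meant to handle.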