Ok, I think I found one first deadlock source during swapping with the mmu notifiers and it's a KVM bug. I got a deadlock inversion between PT lock and mmu_lock because of this bug. With PREEMPT=n it's not enough to spin_lock(mmu_lock) to disable preempt and in turn the page fault will go through and it'll take the PT lock _after_ the mmu_lock which is a forbidden ordering (the VM often calls the mmu notifier invalidate page with the PT lock held). For example FNAME(fetch) calls kvm_read_guest_atomic with the mmu_lock held (taken by FNAME(page_fault)). Not sure why I didn't trigger yet on my workstation but only on the test system. This seem not enough to get V2 stable yet (but I think it was enough to get my old codebase stable on the test system). I'll now try to rollback kvm source to my last stable status and to apply this fix and run it on V2/V3.
Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 8fc12dc..48d9a11 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -553,7 +565,9 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data, addr = gfn_to_hva(kvm, gfn); if (kvm_is_error_hva(addr)) return -EFAULT; + pagefault_disable(); r = __copy_from_user_inatomic(data, (void __user *)addr + offset, len); + pagefault_enable(); if (r) return -EFAULT; return 0; ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel