Hello Avi,
while fixing a module reference counting problem for s390 I spotted
another problem in the kvm main code. Any comments on the patch?
From: Christian Borntraeger <[EMAIL PROTECTED]>
There is a race between a "close of the file descriptors" and module
unload in the kvm module.
You can easily trigger this problem by applying this debug patch:
>--- kvm.orig/virt/kvm/kvm_main.c
>+++ kvm/virt/kvm/kvm_main.c
>@@ -648,10 +648,14 @@ void kvm_free_physmem(struct kvm *kvm)
> kvm_free_physmem_slot(&kvm->memslots[i], NULL);
> }
>
>+#include <linux/delay.h>
> static void kvm_destroy_vm(struct kvm *kvm)
> {
> struct mm_struct *mm = kvm->mm;
>
>+ printk("off1\n");
>+ msleep(5000);
>+ printk("off2\n");
> spin_lock(&kvm_lock);
> list_del(&kvm->vm_list);
> spin_unlock(&kvm_lock);
and changing userspace to closing the vcpu file descriptors after
the vm file descriptors. (killall qemu ; rmmod kvm_intel, rmmod kvm
also crashes my system)
The problem is that kvm_destroy_vm can run while the module count
is 0. That means, you can remove the module while kvm_destroy_vm
is running. But kvm_destroy_vm is part of the module text. This
causes a kerneloops. The race exists without the msleep but is much
harder to trigger.
Nevertheless, the right solution is to call kvm_destroy_vm only
with module_ref_count > 0. This can be done by calling kvm_destroy_vm
only via a release function, since the VFS will not allow module unload.
This patch sets kvm_vcpu_fops.owner to the module and manually
increases the module refcount after anon_inode_getfd, since
anon_inode_getfd does not do it.
Signed-off-by: Christian Borntraeger <[EMAIL PROTECTED]>
---
virt/kvm/kvm_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Index: kvm/virt/kvm/kvm_main.c
===================================================================
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -1303,7 +1303,7 @@ static int kvm_vcpu_release(struct inode
return 0;
}
-static const struct file_operations kvm_vcpu_fops = {
+static struct file_operations kvm_vcpu_fops = {
.release = kvm_vcpu_release,
.unlocked_ioctl = kvm_vcpu_ioctl,
.compat_ioctl = kvm_vcpu_ioctl,
@@ -1318,6 +1318,7 @@ static int create_vcpu_fd(struct kvm_vcp
int fd = anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, 0);
if (fd < 0)
kvm_put_kvm(vcpu->kvm);
+ __module_get(kvm_vcpu_fops.owner);
return fd;
}
@@ -2061,6 +2062,7 @@ int kvm_init(void *opaque, unsigned int
}
kvm_chardev_ops.owner = module;
+ kvm_vcpu_fops.owner = module;
r = misc_register(&kvm_dev);
if (r) {
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html