On Mon, Apr 28, 2008 at 11:11:56AM -0500, Anthony Liguori wrote: > Here's my thinking as to why we don't want to destroy the VM in the mmu > notifiers ->release method. I don't have a valid use-case for this but my > argument depends on the fact that this is something that should work. > Daemonizing a running VM may be a reasonable use-case. It's useful to wait > to daemonize until you are sure that everything is working correctly so > it's not all that unreasonable to move the daemonize until after the VCPUs > have been launched. > > If you take a running VM, and pause all of the VCPUs, and then issue a > fork() followed by an immediate exit() in the parent process, the child > process should be able to unpause all the VCPUs and the guest should > continue running uninterrupted.
Fork shares nothing, the only thing that don't need to be duplicated and copied are the ones that can be marked wrprotect in the pagetables. > From KVM's perspective, issuing the fork() will increment the reference > count of the file descriptor for the VM but otherwise, no real change > should happen. The issue would now be that we must completely flush the > shadow page table cache. In theory, MMU notifiers should do this for us. Let's ignore sptes for now. What you miss for something like this to remotely work is to copy all memslots so an ioctl issued on the parent won't break the child too. If you want to ever support the above you have to change fork() so that fork() will also mmu_notifier_register in the child, copy memslots, patch all kvm->mm, and do all other things needed for this to work. Then yes, it'll work, but not because exit_mmap didn't shutdown the VM of the parent, but because you won't notice anymore that the vm in the parent was _entirely_ shutdown by ->release. > When the parent process exits, this will result in exit_mmap() and will > destroy the KVM guest. This leaves the child process with a file > descriptor that refers to a VM that is no longer valid. Rightfully so, as nothing could possibly work anymore in that VM. If it works in the child it'd be only because you copied all memslots in the child and did many other things and you changed the fd to work on a different kvm structure for parent and child. You have to copy the memslots or then parent couldn't run at the same time of the child. > Just avoiding destroying the VM in the ->release() method won't fix this > use-case I don't think. In general, I think we need to think a little more > about how fork() is handled with respect to mmu notifiers. This has very little to do with mmu notifiers. Let's make this work first: if (fork()) { ioctl(delete a memslot) } else { ioctl(create a memslot) } those two ops won't invoke mmu notifiers at all, and there's no chance the above will work right now. In short you think fork() exit() should allow the child to keep working with the guest of the parent when infact they both need to work on different guests, and if they do you won't notice ->release of the parent shutting down the parent vm (just in case it runs exit()). Thanks! ------------------------------------------------------------------------- This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel