Andrea Arcangeli wrote:
On Sat, Apr 26, 2008 at 01:59:23PM -0500, Anthony Liguori wrote:
+static void kvm_unmap_spte(struct kvm *kvm, u64 *spte)
+{
+ struct page *page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
+       get_page(page);
You should not assume a struct page exists for any given spte. Instead, use kvm_get_pfn() and kvm_release_pfn_clean().

Last email from [EMAIL PROTECTED] in my inbox argues it's useless to build rmap
on mmio regions, so the above is more efficient so put_page runs
directly on the page without going back and forth between spte -> pfn
-> page -> pfn -> page in a single function.

Avi can correct me if I'm wrong, but I don't think the consensus of that discussion was that we're going to avoid putting mmio pages in the rmap. Practically speaking, replacing:

+ struct page *page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
+       get_page(page);


With:

unsigned long pfn = (*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
kvm_get_pfn(pfn);

Results in exactly the same code except the later allows mmio pfns in the rmap. So ignoring the whole mmio thing, using accessors that are already there and used elsewhere seems like a good idea :-)

Certainly if we start building rmap on mmio regions we'll have to
change that.

Perhaps I just have a weak stomach but I am uneasy having a function that takes a lock on exit. I walked through the logic and it doesn't appear to be wrong but it also is pretty clear that you could defer the acquisition of the lock to the caller (in this case, kvm_mmu_pte_write) by moving the update_pte assignment into kvm_mmu_pte_write.

I agree out_lock is an uncommon exit path, the problem is that the
code was buggy, and I tried to fix it with the smallest possible
change and that resulting in an out_lock. That section likely need a
refactoring, all those update_pte fields should be at least returned
by the function guess_....  but I tried to reduce the changes to make
the issue more readable, I didn't want to rewrite certain functions
just to take a spinlock a few instructions ahead.

I appreciate the desire to minimize changes, but taking a lock on return seems to take that to a bit of an extreme. It seems like a simple thing to fix though, no?

Why move the destruction of the vm to the MMU notifier unregister hook? Does anything else ever call mmu_notifier_unregister that would implicitly destroy the VM?

mmu notifier ->release can run at anytime before the filehandle is
closed. ->release has to zap all sptes and freeze the mmu (hence all
vcpus) to prevent any further page fault. After ->release returns all
pages are freed (we'll never relay on the page pin to avoid the
rmap_remove put_page to be a relevant unpin event). So the idea is
that I wanted to maintain the same ordering of the current code in the
vm destroy event, I didn't want to leave a partially shutdown VM on
the vmlist. If the ordering is entirely irrelevant and the
kvm_arch_destroy_vm can run well before kvm_destroy_vm is called, then
I can avoid changes to kvm_main.c but I doubt.

I've done it in a way that archs not needing mmu notifiers like s390
can simply add the kvm_destroy_common_vm at the top of their
kvm_arch_destroy_vm. All others using mmu_notifiers have to invoke
kvm_destroy_common_vm in the ->release of the mmu notifiers.

This will ensure that everything will be ok regardless if exit_mmap is
called before/after exit_files, and it won't make a whole lot of
difference anymore, if the driver fd is pinned through vmas->vm_file
released in exit_mmap or through the task filedescriptors relased in
exit_files etc... Infact this allows to call mmu_notifier_unregister
at anytime later after the task has already been killed, without any
trouble (like if the mmu notifier owner isn't registering in
current->mm but some other tasks mm).

I see. It seems a little strange to me as a KVM guest isn't really tied to the current mm. It seems like the net effect of this is that we are now tying a KVM guest to an mm.

For instance, if you create a guest, but didn't assign any memory to it, you could transfer the fd to another process and then close the fd (without destroying the guest). The other process then could assign memory to it and presumably run the guest.

With your change, as soon as the first process exits, the guest will be destroyed. I'm not sure this behavioral difference really matters but it is a behavioral difference.

Regards,

Anthony Liguori

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/kvm-devel

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to