On 23.06.2010, at 09:57, Avi Kivity wrote:
> On 06/22/2010 05:54 PM, Nadav Har'El wrote:
>>
>>> I'm not sure what we need to do with vmcs that is not in RAM. It may
>>> simplify things to return the error_page to the caller and set
>>> KVM_REQ_TRIPLE_FAULT, so we don't have to deal with error handling later on.
>>>
>> This is a very good point. The approach in the patches I sent was to pin
>> the L1-specified VMCS page and map it every time it was needed, and unpin
>> and unmap it immediately afterward. This indeed will not work correctly if
>> called with interrupts disabled and for some reason the vmcs12 page is
>> swapped
>> out...
>>
>> I decided to reconsider this whole approach. In the new approach, we pin
>> and kmap the guest vmcs12 page on VMPTRLD time (when sleeping is fine),
>> and unpin and unmap it when a different VMPTRLD is done (or VMCLEAR, or
>> cleanup). Whenever we want to use this vmcs12 in the code - we can do so
>> without needing to swap in or map anything.
>>
>> The code should now handle the rare situation when the vmcs12 gets swapped
>> out, and is cleaner (almost 100 lines of ugly nested_map_current()/
>> nested_unmap_current() calls were eliminated). The only downside I see is
>> that now, when nested vmx is being used, a single page, the current vmcs of
>> L1, is always pinned and kmapped. I believe that pinning and mapping one
>> single page (no matter how many guests there are) is a small price to pay -
>> we already spend more than that in other places (e.g., one vmcs02 page per
>> guest).
>>
>> Does this sound reasonable to you?
>>
>
> kmap() really should be avoided when possible. It is for when we don't have
> a pte pointing to the page (for example, accessing a user page from outside
> its process context).
>
> Really, the correct API is kvm_read_guest() and kvm_write_guest(). They can
> easily be wrapped in with something that takes a vmcs12 field and
> automatically references the vmptr:
>
>
> kvm_set_cr0(vcpu, gvmcs_read64(vcpu, guest_cr0));
>
> This will take care of mark_page_dirty() etc.
>
> kvm_*_guest() is a slow API since it needs to search the memory slot list but
> that can be optimized easily by caching the memory slot (and invalidating the
> cache when memory mapping changes). The optimized APIs can end up doing a
> pointer fetch and a get/put_user(), which is very efficient.
>
> Now the only problem is access from atomic contexts; as far as I can tell,
> there are two areas where this is needed:
>
> 1) interrupt injection
> 2) optimized VMREAD/VMWRITE emulation
>
> There are other reasons to move interrupt injection out of atomic context.
> If that's the only thing in the way of using kvm_read/write_guest(), I'll be
> happy to prioritize that work.
>
> Alex, Joerg, will gvmcb_{read,write}{32,64}() work for nsvm? All that
> kmapping is incredibly annoying.

I'm sceptical that we can actually get that to be as fast as a direct kmap, but
apart from that I don't see an obvious reason why it wouldn't.
Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html