> From: Nadav Har'El
> Sent: Tuesday, May 17, 2011 3:54 AM
>
> This patch implements nested_vmx_vmexit(), called when the nested L2 guest
> exits and we want to run its L1 parent and let it handle this exit.
>
> Note that this will not necessarily be called on every L2 exit. L0 may decide
> to handle a particular exit on its own, without L1's involvement; In that
> case, L0 will handle the exit, and resume running L2, without running L1 and
> without calling nested_vmx_vmexit(). The logic for deciding whether to handle
> a particular exit in L1 or in L0, i.e., whether to call nested_vmx_vmexit(),
> will appear in a separate patch below.
>
> Signed-off-by: Nadav Har'El <[email protected]>
> ---
> arch/x86/kvm/vmx.c | 257
> +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 257 insertions(+)
>
> --- .before/arch/x86/kvm/vmx.c 2011-05-16 22:36:49.000000000 +0300
> +++ .after/arch/x86/kvm/vmx.c 2011-05-16 22:36:49.000000000 +0300
> @@ -6203,6 +6203,263 @@ static int nested_vmx_run(struct kvm_vcp
> return 1;
> }
>
> +/*
> + * On a nested exit from L2 to L1, vmcs12.guest_cr0 might not be up-to-date
> + * because L2 may have changed some cr0 bits directly (see
> CRO_GUEST_HOST_MASK)
> + * without L0 trapping the change and updating vmcs12.
> + * This function returns the value we should put in vmcs12.guest_cr0. It's
> not
> + * enough to just return the current (vmcs02) GUEST_CR0 - that may not be
> the
> + * guest cr0 that L1 thought it was giving its L2 guest; It is possible that
> + * L1 wished to allow its guest to set some cr0 bit directly, but we (L0)
> asked
> + * to trap this change and instead set just the read shadow bit. If this is
> the
> + * case, we need to copy these read-shadow bits back to vmcs12.guest_cr0,
> where
> + * L1 believes they already are.
> + */
> +static inline unsigned long
> +vmcs12_guest_cr0(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> +{
> + /*
> + * As explained above, we take a bit from GUEST_CR0 if we allowed the
> + * guest to modify it untrapped (vcpu->arch.cr0_guest_owned_bits), or
> + * if we did trap it - if we did so because L1 asked to trap this bit
> + * (vmcs12->cr0_guest_host_mask). Otherwise (bits we trapped but L1
> + * didn't expect us to trap) we read from CR0_READ_SHADOW.
> + */
> + unsigned long guest_cr0_bits =
> + vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask;
> + return (vmcs_readl(GUEST_CR0) & guest_cr0_bits) |
> + (vmcs_readl(CR0_READ_SHADOW) & ~guest_cr0_bits);
> +}
Hi, Nadav,
Not sure whether I get above operation wrong. But it looks not exactly correct
to me
in a glimpse. Say a bit set both in L0/L1's cr0_guest_host_mask. In such case
that
bit from vmcs12_GUEST_CR0 resides in vmcs02_CR0_READ_SHADOW, however above
operation will make vmcs02_GUEST_CR0 bit returned instead.
Instead of constructing vmcs12_GUEST_CR0 completely from vmcs02_GUEST_CR0,
why not just updating bits which can be altered while keeping the rest bits from
vmcs12_GUEST_CR0? Say something like:
vmcs12->guest_cr0 &= vmcs12->cr0_guest_host_mask; /* keep unchanged bits */
vmcs12->guest_cr0 |= (vmcs_readl(GUEST_CR0) & vcpu->arch.cr0_guest_owned_bits) |
(vmcs_readl(CR0_READ_SHADOW) & ~( vcpu->arch.cr0_guest_owned_bits |
vmcs12->cr0_guest_host_mask))
Thanks
Kevin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html