This series of patches implements shadow-vmcs capability for nested VMX.
Shadow-vmcs - background and overview:
In Intel VMX, the privileged vmread and vmwrite instructions are used by the
hypervisor to read and modify the guest and host specifications (VMCS). In a
nested virtualization environment, L1 executes multiple vmread and vmwrite
instructions to handle a single L2 exit. Each vmread and vmwrite executed by L1
traps (causes an exit) to the L0 hypervisor (KVM). L0 emulates the instruction's
behaviour and resumes L1 execution.
Removing the need to trap and emulate these special instructions reduces the
number of exits and improves nested virtualization performance. As first
evaluated in [1], exit-less vmread and vmwrite can reduce nested virtualization
overhead by up to 40%.
Intel introduced a new feature to their processors called shadow-vmcs. Using
shadow-vmcs, L0 can configure the processor to let L1 running in guest-mode
access VMCS12 fields using vmread and vmwrite instructions but without causing
an exit to L0. The VMCS12 fields' data is stored in a shadow-vmcs controlled
by L0.
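As a rough illustration of how L0 can decide which fields L1 may access
without an exit, the sketch below models per-field vmread/vmwrite bitmaps.
The bitmaps are not described in this cover letter; the model follows my
reading of the Intel SDM (one 4 KiB bitmap, indexed by bits 14:0 of the field
encoding, where a set bit forces an exit) and should be checked against the
SDM. The helper names shadowed_access_exits() and allow_exitless() are
hypothetical and are not functions from these patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BITMAP_BYTES 4096       /* one 4 KiB page per bitmap */
#define GUEST_RIP    0x681EULL  /* encoding of the guest RIP field */

/* Model of the hardware check: true if the access must exit to L0. */
static bool shadowed_access_exits(const uint8_t *bitmap, uint64_t encoding)
{
    if (encoding >> 15)         /* encodings above bit 14 always exit */
        return true;
    return (bitmap[encoding >> 3] >> (encoding & 7)) & 1;
}

/* L0 clears a bit so that L1 can access the field without an exit. */
static void allow_exitless(uint8_t *bitmap, uint64_t encoding)
{
    if (encoding >> 15)         /* not representable in the bitmap */
        return;
    bitmap[encoding >> 3] &= (uint8_t)~(1u << (encoding & 7));
}
```

Starting from an all-ones bitmap (every access exits) and clearing bits only
for fields that L0 is prepared to sync keeps the default behaviour
conservative.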
Shadow-vmcs - design considerations:
A shadow-vmcs is processor-dependent and must be accessed by L0 or L1 using
vmread and vmwrite instructions. With nested virtualization we aim to abstract
the hardware from the L1 hypervisor. Thus, to avoid hardware dependencies, we
preferred to keep the software-defined VMCS12 format as part of L1's address
space and hold the processor-specific shadow-vmcs format only in L0's address
space.
In other words, the shadow-vmcs is used by L0 as an accelerator, but its format
and content are never exposed to L1 directly. L0 syncs the content of the
processor-specific shadow-vmcs with the content of the software-controlled
VMCS12 format.
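The two-way sync described above can be sketched in user space as follows.
This is a minimal model, not the code in these patches: the struct layouts,
the three-field list, and every function name here are assumptions made for
illustration, and the opaque array merely stands in for the
processor-specific shadow-vmcs that real code would access with vmread and
vmwrite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field list; real VMCS fields use architectural encodings. */
enum model_field { F_GUEST_RIP, F_GUEST_RSP, F_EXIT_REASON, F_COUNT };

/* Software-defined VMCS12 layout, the only format L1 ever sees. */
struct vmcs12_model { uint64_t guest_rip, guest_rsp, exit_reason; };

/* Stand-in for the processor-specific shadow-vmcs, private to L0. */
struct shadow_model { uint64_t slot[F_COUNT]; };

static void shadow_vmwrite(struct shadow_model *s, enum model_field f, uint64_t v)
{
    s->slot[f] = v;
}

static uint64_t shadow_vmread(const struct shadow_model *s, enum model_field f)
{
    return s->slot[f];
}

/* Map a field to its location in the software-defined layout. */
static uint64_t *vmcs12_slot(struct vmcs12_model *v, enum model_field f)
{
    switch (f) {
    case F_GUEST_RIP:   return &v->guest_rip;
    case F_GUEST_RSP:   return &v->guest_rsp;
    case F_EXIT_REASON: return &v->exit_reason;
    default:            return NULL;
    }
}

/* Before resuming L1: push the software copy into the shadow. */
static void sync_vmcs12_to_shadow(struct vmcs12_model *v, struct shadow_model *s)
{
    for (int f = 0; f < F_COUNT; f++)
        shadow_vmwrite(s, (enum model_field)f, *vmcs12_slot(v, (enum model_field)f));
}

/* After L1 has run: pull L1's exit-less writes back into VMCS12. */
static void sync_shadow_to_vmcs12(struct shadow_model *s, struct vmcs12_model *v)
{
    for (int f = 0; f < F_COUNT; f++)
        *vmcs12_slot(v, (enum model_field)f) = shadow_vmread(s, (enum model_field)f);
}
```

Keeping the field mapping in one place (vmcs12_slot in this sketch) means the
VMCS12 layout can stay stable even if the shadow format differs between
processors.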
We could have kept the processor-specific shadow-vmcs format in L1's address
space to avoid using the software-defined VMCS12 format. However, this type of
design/implementation would have created hardware dependencies and
would complicate other capabilities (e.g. Live Migration of L1).
Changes since v1:
1) Added a sync_shadow_vmcs flag that indicates when the content of VMCS12
must be copied to the shadow-vmcs. The flag is checked during
vmx_vcpu_run.
2) Code quality improvements
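The deferred copy behind the sync_shadow_vmcs flag from item 1 above can be
pictured with the user-space model below. Only the flag name and the fact that
it is checked during vmx_vcpu_run come from this series; the struct, the
single modelled field, and the function names are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VCPU state; a single field stands in for all of VMCS12. */
struct vcpu_model {
    bool     sync_shadow_vmcs; /* VMCS12 changed; shadow copy is stale */
    uint64_t vmcs12_rip;       /* software-defined copy */
    uint64_t shadow_rip;       /* stand-in for the shadow-vmcs content */
    unsigned copies;           /* number of VMCS12 -> shadow copies performed */
};

/* L0 modifies VMCS12 (e.g. while emulating an L2 exit) and marks it dirty. */
static void l0_update_vmcs12(struct vcpu_model *v, uint64_t rip)
{
    v->vmcs12_rip = rip;
    v->sync_shadow_vmcs = true;
}

/* Model of the check on the run path: copy at most once per entry. */
static void vcpu_run_model(struct vcpu_model *v)
{
    if (v->sync_shadow_vmcs) {
        v->shadow_rip = v->vmcs12_rip;
        v->copies++;
        v->sync_shadow_vmcs = false;
    }
    /* guest entry would happen here */
}
```

In this model, several updates to VMCS12 between two guest entries cost one
copy rather than one copy per update.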
Changes since v2:
1) Allocate the shadow-vmcs only once per VCPU, in handle_vmxon, and reuse the
same instance for multiple VMCS12s
2) More code quality improvements
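The allocation lifetime from item 1 above can be modelled in user space as
below. This is a sketch under stated assumptions: the struct and function
names are hypothetical, calloc stands in for the kernel's page allocation,
and releasing the buffer on vmxoff is my assumption rather than something
stated in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-VCPU nested state. */
struct nested_model {
    void       *shadow_vmcs;    /* allocated once, on vmxon */
    const void *current_vmcs12; /* VMCS12 currently loaded by L1 */
    unsigned    allocations;    /* how many times the shadow was allocated */
};

/* Model of vmxon handling: allocate the shadow-vmcs if not present yet. */
static int vmxon_model(struct nested_model *n)
{
    if (!n->shadow_vmcs) {
        n->shadow_vmcs = calloc(1, 4096);
        if (!n->shadow_vmcs)
            return -1;
        n->allocations++;
    }
    return 0;
}

/* Model of L1 switching VMCS12: the shadow buffer is reused, not reallocated. */
static void vmptrld_model(struct nested_model *n, const void *vmcs12)
{
    n->current_vmcs12 = vmcs12;
}

/* Assumed teardown: release the shadow when L1 leaves VMX operation. */
static void vmxoff_model(struct nested_model *n)
{
    free(n->shadow_vmcs);
    n->shadow_vmcs = NULL;
    n->current_vmcs12 = NULL;
}
```

Switching between VMCS12s then only changes which software copy is synced
into the one shadow buffer.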
Acknowledgments:
Many thanks to
"Natapov, Gleb" <[email protected]>
"Xu, Dongxiao" <[email protected]>
"Nakajima, Jun" <[email protected]>
"Har'El, Nadav" <[email protected]>
for the insightful discussions, comments and reviews.
These patches were easily created and maintained using
Patchouli -- patch creator
http://patchouli.sourceforge.net/
[1] "The Turtles Project: Design and Implementation of Nested Virtualization",
http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf