Does it make sense to release this as a ReadyKernel patch?

Or is the code executed too early, before the modules are loaded?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 02/14/2018 01:43 PM, Konstantin Khorenko wrote:
The commit is pushed to "branch-rh7-3.10.0-693.17.1.vz7.43.x-ovz" and will
appear at https://src.openvz.org/scm/ovz/vzkernel.git after
rh7-3.10.0-693.17.1.vz7.43.4
------>
commit 988f48689f68fe80186f2ec994e4f135769c1d1b
Author: Vitaly Kuznetsov <vkuzn...@redhat.com>
Date:   Wed Feb 14 13:43:38 2018 +0300

    ms/x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested

    I was investigating an issue with seabios >= 1.10 which stopped working
    for nested KVM on Hyper-V. The problem appears to be in
    handle_ept_violation() function: when we do fast mmio we need to skip
    the instruction so we do kvm_skip_emulated_instruction(). This, however,
    depends on VM_EXIT_INSTRUCTION_LEN field being set correctly in VMCS.
    However, this is not the case.

    Intel's manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set when
    EPT MISCONFIG occurs. While on real hardware it was observed to be set,
    some hypervisors follow the spec and don't set it; we end up advancing
    IP with some random value.

    I checked with Microsoft and they confirmed they don't fill
    VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG.

    Fix the issue by doing instruction skip through emulator when running
    nested.

    Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae
    Suggested-by: Radim Krčmář <rkrc...@redhat.com>
    Suggested-by: Paolo Bonzini <pbonz...@redhat.com>
    Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com>
    Acked-by: Michael S. Tsirkin <m...@redhat.com>
    Signed-off-by: Radim Krčmář <rkrc...@redhat.com>

    [rkagan: the problem pertains to ESXi as well]
    (cherry picked from commit d391f1207067268261add0485f0f34503539c5b0)
    [rkagan: adjusted for vz7.7]

    https://jira.sw.ru/browse/PSBM-81462
    Signed-off-by: Roman Kagan <rka...@virtuozzo.com>
---
 arch/x86/kvm/vmx.c | 18 ++++++++++++++++--
 arch/x86/kvm/x86.c |  3 ++-
 2 files changed, 18 insertions(+), 3 deletions(-)
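
For readers skimming the diff below, the decision made in handle_ept_misconfig()
can be modelled in plain userspace C. This is only a sketch under stated
assumptions, not kernel code: struct model_vcpu, its fields, and
model_fast_mmio_skip() are hypothetical names invented for illustration.
vmcs_insn_len stands in for VM_EXIT_INSTRUCTION_LEN, and decoded_insn_len
stands in for the length the instruction emulator would obtain by decoding
the instruction at the current IP.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_vcpu {
	uint64_t rip;              /* guest instruction pointer */
	uint32_t vmcs_insn_len;    /* stand-in for VM_EXIT_INSTRUCTION_LEN */
	uint32_t decoded_insn_len; /* length found by decoding; 0 = decode failed */
};

/*
 * Skip the instruction that triggered a fast MMIO exit.
 * Returns 1 if the guest can be resumed, 0 if the skip could not be done
 * (mirroring the "== EMULATE_DONE" result in the patch).
 */
static int model_fast_mmio_skip(struct model_vcpu *v, bool under_hypervisor)
{
	if (!under_hypervisor) {
		/* Bare metal: the length field was observed to be set, so use it. */
		v->rip += v->vmcs_insn_len;
		return 1;
	}

	/*
	 * Nested: the outer hypervisor may leave the length field unset,
	 * so ignore it and rely on decoding the instruction instead.
	 */
	if (v->decoded_insn_len == 0)
		return 0;
	v->rip += v->decoded_insn_len;
	return 1;
}
```

In the model, a bogus vmcs_insn_len has no effect on the nested path, which is
the property the patch relies on. The price is that every fast MMIO exit under
a hypervisor pays for an instruction decode.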

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d94946a07de6..8ab2b3d42adc 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5934,9 +5934,23 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)

        gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
        if (!kvm_io_bus_write(vcpu->kvm, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
-               skip_emulated_instruction(vcpu);
                trace_kvm_fast_mmio(gpa);
-               return 1;
+               /*
+                * Doing kvm_skip_emulated_instruction() depends on undefined
+                * behavior: Intel's manual doesn't mandate
+                * VM_EXIT_INSTRUCTION_LEN to be set in VMCS when EPT MISCONFIG
+                * occurs and while on real hardware it was observed to be set,
+                * other hypervisors (namely Hyper-V) don't set it, we end up
+                * advancing IP with some random value. Disable fast mmio when
+                * running nested and keep it for real hardware in hope that
+                * VM_EXIT_INSTRUCTION_LEN will always be set correctly.
+                */
+               if (!static_cpu_has(X86_FEATURE_HYPERVISOR)) {
+                       skip_emulated_instruction(vcpu);
+                       return 1;
+               } else
+                       return x86_emulate_instruction(vcpu, gpa, EMULTYPE_SKIP,
+                                                      NULL, 0) == EMULATE_DONE;
        }

        ret = handle_mmio_page_fault(vcpu, gpa, true);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c3103bdd2d37..ddb3c2529ca8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5503,7 +5503,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
                 * handle watchpoints yet, those would be handled in
                 * the emulate_ops.
                 */
-               if (kvm_vcpu_check_breakpoint(vcpu, &r))
+               if (!(emulation_type & EMULTYPE_SKIP) &&
+                   kvm_vcpu_check_breakpoint(vcpu, &r))
                        return r;

                ctxt->interruptibility = 0;
