I've added instructions for a workaround. The code paths I've seen in
crashes have been the following:

  kvm_sched_in
   -> kvm_arch_vcpu_load
    -> vmx_vcpu_load
     -> loaded_vmcs_clear
      -> smp_call_function_single

  pmdp_clear_flush
   -> flush_tlb_mm_range
    -> native_flush_tlb_others
     -> smp_call_function_many

Generally this has been caused by workloads that use nested VMs and
stress the L2/L1 VMs (causing non-local CPU TLB flushing or VMCS clearing).
The hang is in csd_lock_wait, which waits for the CSD_FLAG_LOCK bit to
be cleared; this path is only reached by non-local smp_call_function_*
calls.
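
To illustrate the wait described above, here is a minimal user-space sketch in C. It is not the kernel code: the csd_model_* names, the struct layout, the flag value and the bounded spin count are all assumptions made for illustration, and only the CSD_FLAG_LOCK name comes from the kernel. The real csd_lock_wait spins without a bound, which is why a flag that is never cleared by the remote CPU shows up as a soft lockup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative model only; the bit value is an assumption. */
#define CSD_FLAG_LOCK 0x01u

/* Hypothetical stand-in for the kernel's per-call data. */
struct csd_model {
    atomic_uint flags;
};

/* Sender side: mark the request as in flight before the IPI is sent.
 * Simplification: this model asserts the lock was free instead of
 * waiting for a previous request to finish. */
static void csd_model_lock(struct csd_model *csd)
{
    unsigned int prev = atomic_fetch_or_explicit(&csd->flags, CSD_FLAG_LOCK,
                                                 memory_order_acquire);
    assert(!(prev & CSD_FLAG_LOCK));
}

/* Target CPU side: clear the bit once the requested function has run. */
static void csd_model_unlock(struct csd_model *csd)
{
    atomic_fetch_and_explicit(&csd->flags, ~CSD_FLAG_LOCK,
                              memory_order_release);
}

/* Sender side: spin until the bit is cleared. The spin is capped at
 * max_spins here so the model can report a stuck lock; returns true if
 * the bit was seen cleared, false if the cap was reached first. */
static bool csd_model_lock_wait(struct csd_model *csd, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        unsigned int f = atomic_load_explicit(&csd->flags,
                                              memory_order_acquire);
        if (!(f & CSD_FLAG_LOCK))
            return true;
    }
    return false;
}
```

In this model, calling csd_model_lock and then csd_model_lock_wait without an intervening csd_model_unlock from the "target" side exhausts the spin budget, which corresponds to the sender CPU being stuck when the remote CPU never processes the call.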
Another data point is that this can happen with x2apic as well as flat
apic (as tested with nox2apic).
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1413540
Title:
Trusty soft lockup issues with nested KVM