2014-11-16 23:49+0200, Nadav Amit:
> apic_find_highest_irr assumes irr_pending is set if any vector in APIC_IRR is
> set. If this assumption is broken and apicv is disabled, the injection of
> interrupts may be deferred until another interrupt is delivered to the guest.
> Ultimately, if no other interrupt should be injected to that vCPU, the pending
> interrupt may be lost.
>
> commit 56cc2406d68c ("KVM: nVMX: fix "acknowledge interrupt on exit" when APICv
> is in use") changed the behavior of apic_clear_irr so irr_pending is cleared
> after clearing the APIC_IRR vector. After this commit, if apic_set_irr and
> apic_clear_irr run simultaneously, a race may occur, resulting in APIC_IRR
> vector set, and irr_pending cleared. In the following example, assume a single
> vector is set in IRR prior to calling apic_clear_irr:
>
> apic_set_irr                  apic_clear_irr
> ------------                  --------------
> apic->irr_pending = true;
>                               apic_clear_vector(...);
>                               vec = apic_search_irr(apic);
>                               // => vec == -1
> apic_set_vector(...);
>                               apic->irr_pending = (vec != -1);
>                               // => apic->irr_pending == false
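
To make the interleaving above concrete, here is a minimal single-threaded replay of it. Everything in it is an assumption for illustration: struct toy_lapic, the toy_* helpers and replay_race() are hypothetical stand-ins, and the IRR is shrunk to one 32-bit word. It only executes the statements in the order the diagram lists them; it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct kvm_lapic (hypothetical, not the kernel layout). */
struct toy_lapic {
	uint32_t irr;      /* models the APIC_IRR bitmap, vectors 0..31 only */
	bool irr_pending;  /* models apic->irr_pending */
};

static void toy_set_vector(struct toy_lapic *apic, int vec)
{
	assert(vec >= 0 && vec < 32);
	apic->irr |= UINT32_C(1) << vec;
}

static void toy_clear_vector(struct toy_lapic *apic, int vec)
{
	assert(vec >= 0 && vec < 32);
	apic->irr &= ~(UINT32_C(1) << vec);
}

/* Highest pending vector, or -1 if the toy IRR is empty. */
static int toy_search_irr(const struct toy_lapic *apic)
{
	for (int vec = 31; vec >= 0; vec--)
		if (apic->irr & (UINT32_C(1) << vec))
			return vec;
	return -1;
}

/* Replays the interleaving from the diagram, one statement at a time. */
static struct toy_lapic replay_race(int old_vec, int new_vec)
{
	struct toy_lapic apic = { .irr = 0, .irr_pending = true };
	int vec;

	toy_set_vector(&apic, old_vec);    /* the single vector already in IRR */

	apic.irr_pending = true;           /* apic_set_irr */
	toy_clear_vector(&apic, old_vec);  /* apic_clear_irr */
	vec = toy_search_irr(&apic);       /* apic_clear_irr: vec == -1 */
	toy_set_vector(&apic, new_vec);    /* apic_set_irr */
	apic.irr_pending = (vec != -1);    /* apic_clear_irr */

	return apic;
}
```

Calling replay_race(3, 5) ends with vector 5 set in the toy IRR while irr_pending is false, which is the inconsistent state the commit message describes.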
>
> Nonetheless, it appears the race might even occur prior to this commit:
>
> apic_set_irr apic_clear_irr
> ------------ --------------
> apic->irr_pending = true;
> apic->irr_pending = false;
> apic_clear_vector(...);
> if (apic_search_irr(apic) != -1)
> apic->irr_pending = true;
> // => apic->irr_pending == false
> apic_set_vector(...);
>
> Fixing this issue by:
> 1. Restoring the previous behavior of apic_clear_irr: clear irr_pending, call
> apic_clear_vector, and then if APIC_IRR is non-zero, set irr_pending.
> 2. On apic_set_irr: first call apic_set_vector, then set irr_pending.
>
> Signed-off-by: Nadav Amit <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6e8ce5a..e0e5642 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -341,8 +341,12 @@ EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>
>  static inline void apic_set_irr(int vec, struct kvm_lapic *apic)
>  {
> -	apic->irr_pending = true;
>  	apic_set_vector(vec, apic->regs + APIC_IRR);
> +	/*
> +	 * irr_pending must be true if any interrupt is pending; set it after
> +	 * APIC_IRR to avoid race with apic_clear_irr
> +	 */
> +	apic->irr_pending = true;
(A race that ends up with 'irr_pending = true' and zero IRR is
harmless.)
>  }
>
>  static inline int apic_search_irr(struct kvm_lapic *apic)
> @@ -374,13 +378,15 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
>
>  	vcpu = apic->vcpu;
>
> -	apic_clear_vector(vec, apic->regs + APIC_IRR);
> -	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
> +	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) {
>  		/* try to update RVI */
> +		apic_clear_vector(vec, apic->regs + APIC_IRR);
>  		kvm_make_request(KVM_REQ_EVENT, vcpu);
> -	else {
> -		vec = apic_search_irr(apic);
> -		apic->irr_pending = (vec != -1);
> +	} else {
> +		apic->irr_pending = false;
> +		apic_clear_vector(vec, apic->regs + APIC_IRR);
> +		if (apic_search_irr(apic) != -1)
> +			apic->irr_pending = true;
>  	}
Works because apic_clear_vector() is also a compiler barrier ...
Reviewed-by: Radim Krčmář <[email protected]>
(I hope the performance gain of irr_pending is worth its complexity.)