2017-02-08 12:10+0100, Paolo Bonzini:
> The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
> a VCPU out of KVM_RUN through a POSIX signal.  A signal is attached
> to a dummy signal handler; by blocking the signal outside KVM_RUN and
> unblocking it inside, this possible race is closed:
> 
>           VCPU thread                     service thread
>    --------------------------------------------------------------
>         check flag
>                                           set flag
>                                           raise signal
>         (signal handler does nothing)
>         KVM_RUN
> 
> However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
> tsk->sighand->siglock on every KVM_RUN.  This lock is often on a
> remote NUMA node, because it is on the node of a thread's creator.
> Taking this lock can be very expensive if there are many userspace
> exits (as is the case for SMP Windows VMs without Hyper-V reference
> time counter).
> 
> As an alternative, we can put the flag directly in kvm_run so that
> KVM can see it:
> 
>           VCPU thread                     service thread
>    --------------------------------------------------------------
>                                           raise signal
>         signal handler
>           set run->immediate_exit
>         KVM_RUN
>           check run->immediate_exit
> 
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> @@ -2564,9 +2565,15 @@ static long kvm_vcpu_ioctl(struct file *filp,
>                               synchronize_rcu();
>                       put_pid(oldpid);
>               }
> -             r = kvm_arch_vcpu_ioctl_run(vcpu, vcpu->run);
> -             trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
> +             run = vcpu->run;
> +             if (run->immediate_exit) {
> +                     WRITE_ONCE(run->immediate_exit, 0);
> +                     return -EINTR;
> +             }

QEMU also uses self-kick to complete IO, but run->immediate_exit is
checked too soon for that.  I think we should move the check at least
into kvm_arch_vcpu_ioctl_run(), to cover both uses of the signal mask.
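
To illustrate the ordering I mean, here is a rough userspace model, not
kernel code.  It assumes that completion of pending IO happens at the
start of kvm_arch_vcpu_ioctl_run(), so the immediate_exit check has to
come after it.  struct fake_run, struct fake_vcpu, fake_arch_run(),
kick_handler() and demo_self_kick() are all hypothetical names made up
for this sketch; only immediate_exit and the -EINTR return come from the
patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one kvm_run field the patch adds. */
struct fake_run {
    volatile uint8_t immediate_exit;
};

/* Hypothetical VCPU state: just enough to model "IO still pending". */
struct fake_vcpu {
    struct fake_run run;
    int io_pending;    /* an MMIO/PIO exit has not been completed yet */
    int io_completed;  /* set once the model has completed it */
};

/* The run area the handler writes to; one VCPU thread in this model. */
static struct fake_run *volatile current_run;

/* Kick handler: record the kick where the run loop can see it. */
static void kick_handler(int sig)
{
    (void)sig;
    if (current_run)
        current_run->immediate_exit = 1;
}

/*
 * Models the proposed placement: complete pending IO first (assumed to
 * be what the arch code does on entry), then honour immediate_exit
 * before entering the guest.  The flag is cleared as in the quoted hunk.
 */
static int fake_arch_run(struct fake_vcpu *vcpu)
{
    if (vcpu->io_pending) {
        vcpu->io_pending = 0;
        vcpu->io_completed = 1;
    }
    if (vcpu->run.immediate_exit) {
        vcpu->run.immediate_exit = 0;
        return -EINTR;
    }
    return 0;  /* the guest would run here */
}

/* Self-kick with IO pending; returns what the modelled KVM_RUN returned. */
int demo_self_kick(int *io_completed)
{
    struct fake_vcpu vcpu = { .io_pending = 1 };
    struct sigaction sa;
    int r;

    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);

    current_run = &vcpu.run;
    raise(SIGUSR1);             /* handler runs before raise() returns */
    r = fake_arch_run(&vcpu);
    current_run = NULL;

    assert(vcpu.run.immediate_exit == 0);
    *io_completed = vcpu.io_completed;
    return r;
}
```

With the check placed before the arch call, as in the hunk above, the
same sequence would return -EINTR with the IO still pending, which is
the case I am worried about.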

(I don't remember the reason behind QEMU's mask on SIGBUS any more.)

Thanks.

> +             r = kvm_arch_vcpu_ioctl_run(vcpu, run);
> +             trace_kvm_userspace_exit(run->exit_reason, r);
>               break;
> +     }
>       case KVM_GET_REGS: {
>               struct kvm_regs *kvm_regs;
>  
