On Sun, Sep 24, 2023 at 01:07:43AM -0400, Dave Voutila wrote:
> vmd has a sneaky little deadlock hidden in the pause logic related to
> the use of mutexes and condition variables.
>
> When pausing, the vcpu thread holds the "run" mutex. It then sleeps
> waiting for the unpause condition, still holding that mutex. If the
> event thread then tries to assert an irq, it will try to lock that
> "run" mutex in an attempt to signal a halted vcpu that it should start
> running, and blocks forever. This deadlocks the event thread.
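>
> As a standalone illustration, a minimal sketch of that pattern
> (hypothetical names, not vmd's actual code; compile with -lpthread,
> it hangs by design):
>
> #include <pthread.h>
> #include <unistd.h>
>
> pthread_mutex_t run_mtx = PTHREAD_MUTEX_INITIALIZER;
> pthread_mutex_t unpause_mtx = PTHREAD_MUTEX_INITIALIZER;
> pthread_cond_t unpause_cond = PTHREAD_COND_INITIALIZER;
> pthread_cond_t run_cond = PTHREAD_COND_INITIALIZER;
>
> void *
> vcpu_thread(void *arg)
> {
>         pthread_mutex_lock(&run_mtx);      /* held across the wait */
>         pthread_mutex_lock(&unpause_mtx);
>         /* releases unpause_mtx while asleep, but NOT run_mtx */
>         pthread_cond_wait(&unpause_cond, &unpause_mtx);
>         pthread_mutex_unlock(&unpause_mtx);
>         pthread_mutex_unlock(&run_mtx);
>         return (NULL);
> }
>
> void *
> event_thread(void *arg)
> {
>         sleep(1);                          /* let the vcpu park first */
>         pthread_mutex_lock(&run_mtx);      /* blocks forever: deadlock */
>         pthread_cond_signal(&run_cond);    /* never reached */
>         pthread_mutex_unlock(&run_mtx);
>         return (NULL);
> }
>
> int
> main(void)
> {
>         pthread_t t[2];
>
>         pthread_create(&t[0], NULL, vcpu_thread, NULL);
>         pthread_create(&t[1], NULL, event_thread, NULL);
>         pthread_join(t[0], NULL);          /* never returns */
>         return (0);
> }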
>
> Diff below releases the run mutex (by the vcpu thread) before sleeping
> on the "pause" condition and reacquires it afterwards. This lets the
> event thread advance as needed.
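>
> With the same toy names as the sketch above, the wait becomes roughly
> (again just the shape of the fix, not the actual diff context):
>
> void *
> vcpu_thread(void *arg)
> {
>         pthread_mutex_lock(&run_mtx);
>         /* ... pause requested ... */
>         pthread_mutex_lock(&unpause_mtx);
>         pthread_mutex_unlock(&run_mtx);    /* let the event thread advance */
>         pthread_cond_wait(&unpause_cond, &unpause_mtx);
>         pthread_mutex_lock(&run_mtx);      /* reacquire before running on */
>         pthread_mutex_unlock(&unpause_mtx);
>         /* ... continue the run loop with run_mtx held ... */
>         pthread_mutex_unlock(&run_mtx);
>         return (NULL);
> }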
>
> A simple reproducer: run `iperf3 -s` inside a guest, run `iperf3 -c
> <ip> -t 60` from outside the guest, and then `vmctl pause <vm>`. vmctl
> will hang as the vm can't respond to the pause request.
>
> I'm also simplifying the offending vcpu_assert_pic_irq (called by the
> event thread), so the diff should be pretty self-contained for review.
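>
> Here mutex_lock()/mutex_unlock() are the fatal-on-error convenience
> wrappers, roughly of this shape (a sketch of the idea, not copied from
> the tree):
>
> static void
> mutex_lock(pthread_mutex_t *m)
> {
>         int ret;
>
>         if ((ret = pthread_mutex_lock(m)) != 0)
>                 fatalx("%s: can't lock mutex (%d)", __func__, ret);
> }
>
> static void
> mutex_unlock(pthread_mutex_t *m)
> {
>         int ret;
>
>         if ((ret = pthread_mutex_unlock(m)) != 0)
>                 fatalx("%s: can't unlock mutex (%d)", __func__, ret);
> }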
>
> ok?

ok mlarkin

>
> diffstat /usr/src
>  M  usr.sbin/vmd/vm.c  |  6+  8-
>
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff /usr/src
> commit - 89fcd96b33617c4eddd2306889179a96a934ebe8
> path + /usr/src
> blob - fe804b4e3b9e8b59b883e58f02e15a178c035742
> file + usr.sbin/vmd/vm.c
> --- usr.sbin/vmd/vm.c
> +++ usr.sbin/vmd/vm.c
> @@ -1564,16 +1564,20 @@ vcpu_run_loop(void *arg)
>                                   __func__, (int)ret);
>                               return ((void *)ret);
>                       }
>
> +                     /* i8259 may be firing as we pause, release run mtx. */
> +                     mutex_unlock(&vcpu_run_mtx[n]);
>                       ret = pthread_cond_wait(&vcpu_unpause_cond[n],
>                           &vcpu_unpause_mtx[n]);
>                       if (ret) {
>                               log_warnx(
>                                   "%s: can't wait on unpause cond (%d)",
>                                   __func__, (int)ret);
>                               break;
>                       }
> +                     mutex_lock(&vcpu_run_mtx[n]);
> +
>                       ret = pthread_mutex_unlock(&vcpu_unpause_mtx[n]);
>                       if (ret) {
>                               log_warnx("%s: can't unlock unpause mtx (%d)",
>                                   __func__, (int)ret);
> @@ -2135,20 +2139,14 @@ vcpu_assert_pic_irq(uint32_t vm_id, uint32_t vcpu_id,
>
>       if (i8259_is_pending()) {
>               if (vcpu_pic_intr(vm_id, vcpu_id, 1))
>                       fatalx("%s: can't assert INTR", __func__);
> -
> -             ret = pthread_mutex_lock(&vcpu_run_mtx[vcpu_id]);
> -             if (ret)
> -                     fatalx("%s: can't lock vcpu mtx (%d)", __func__, ret);
> -
> +             mutex_lock(&vcpu_run_mtx[vcpu_id]);
>               vcpu_hlt[vcpu_id] = 0;
>               ret = pthread_cond_signal(&vcpu_run_cond[vcpu_id]);
>               if (ret)
>                       fatalx("%s: can't signal (%d)", __func__, ret);
> -             ret = pthread_mutex_unlock(&vcpu_run_mtx[vcpu_id]);
> -             if (ret)
> -                     fatalx("%s: can't unlock vcpu mtx (%d)", __func__, ret);
> +             mutex_unlock(&vcpu_run_mtx[vcpu_id]);
>       }
>  }
>
>  /*
>
