On Sun, Sep 24, 2023 at 01:07:43AM -0400, Dave Voutila wrote:
> vmd has a sneaky little deadlock hidden in the pause logic related to
> the use of mutexes and condition variables.
>
> When pausing, the vcpu is holding the "run" mutex. It then sleeps
> waiting for the unpause condition. If the event thread is trying to
> assert an irq, it will try to lock that "run" mutex in an attempt to
> signal a halted vcpu that it should start running. Since the paused
> vcpu never releases that mutex, the event thread blocks forever.
>
> Diff below releases the run mutex (by the vcpu thread) before sleeping
> on the "pause" condition and reacquires it afterwards. This lets the
> event thread advance as needed.
>
> A simple reproducer: run `iperf3 -s` inside a guest, `iperf3 -c <ip>
> -t 60` from outside the guest, and then `vmctl pause <vm>`. vmctl will
> hang as the vm can't respond to the pause request.
>
> I'm also simplifying the offending vcpu_assert_pic_irq (called by the
> event thread), so the diff should be pretty self-contained for review.
>
> ok?
ok mlarkin
>
> diffstat /usr/src
> M usr.sbin/vmd/vm.c | 6+ 8-
>
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff /usr/src
> commit - 89fcd96b33617c4eddd2306889179a96a934ebe8
> path + /usr/src
> blob - fe804b4e3b9e8b59b883e58f02e15a178c035742
> file + usr.sbin/vmd/vm.c
> --- usr.sbin/vmd/vm.c
> +++ usr.sbin/vmd/vm.c
> @@ -1564,16 +1564,20 @@ vcpu_run_loop(void *arg)
> __func__, (int)ret);
> return ((void *)ret);
> }
>
> + /* i8259 may be firing as we pause, release run mtx. */
> + mutex_unlock(&vcpu_run_mtx[n]);
> ret = pthread_cond_wait(&vcpu_unpause_cond[n],
> &vcpu_unpause_mtx[n]);
> if (ret) {
> log_warnx(
> "%s: can't wait on unpause cond (%d)",
> __func__, (int)ret);
> break;
> }
> + mutex_lock(&vcpu_run_mtx[n]);
> +
> ret = pthread_mutex_unlock(&vcpu_unpause_mtx[n]);
> if (ret) {
> log_warnx("%s: can't unlock unpause mtx (%d)",
> __func__, (int)ret);
> @@ -2135,20 +2139,14 @@ vcpu_assert_pic_irq(uint32_t vm_id, uint32_t vcpu_id,
>
> if (i8259_is_pending()) {
> if (vcpu_pic_intr(vm_id, vcpu_id, 1))
> fatalx("%s: can't assert INTR", __func__);
> -
> - ret = pthread_mutex_lock(&vcpu_run_mtx[vcpu_id]);
> - if (ret)
> - fatalx("%s: can't lock vcpu mtx (%d)", __func__, ret);
> -
> + mutex_lock(&vcpu_run_mtx[vcpu_id]);
> vcpu_hlt[vcpu_id] = 0;
> ret = pthread_cond_signal(&vcpu_run_cond[vcpu_id]);
> if (ret)
> fatalx("%s: can't signal (%d)", __func__, ret);
> - ret = pthread_mutex_unlock(&vcpu_run_mtx[vcpu_id]);
> - if (ret)
> - fatalx("%s: can't unlock vcpu mtx (%d)", __func__, ret);
> + mutex_unlock(&vcpu_run_mtx[vcpu_id]);
> }
> }
>
> /*
>