Cc: Marc-André for additional monitor and chardev expertise.

Wolfgang Bumiller <w.bumil...@proxmox.com> writes:
> When a monitor's queue is filled up in handle_qmp_command()
> it gets suspended. It's the dispatcher bh's job currently to
> resume the monitor, which it does after processing an event
> from the queue. However, it is possible for a
> CHR_EVENT_CLOSED event to be processed before the bh
> is scheduled, which will clear the queue without resuming
> the monitor, thereby preventing the dispatcher from reaching
> the resume() call.

Because with the request queue cleared, there's nothing for
monitor_qmp_requests_pop_any_with_lock() to pop, so
monitor_qmp_bh_dispatcher() won't look at this monitor.  It stays
suspended forever.  Correct?

Observable effect for the monitor's user?

> Fix this by resuming the monitor when clearing a queue which
> was filled up.
>
> Signed-off-by: Wolfgang Bumiller <w.bumil...@proxmox.com>
> ---
> @Michael, we ran into this with qemu 4.0, so if the logic in this patch
> is correct it may make sense to include it in the 4.0.1 roundup.
> A backport is at [1] as 4.0 was before the monitor/ dir split.
>
> [1]
> https://gitlab.com/wbumiller/qemu/commit/9d8bbb5294ed084f282174b0c91e1a614e0a0714
>
>  monitor/qmp.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index 9d9e5d8b27..c1db5bf940 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -70,9 +70,19 @@ static void qmp_request_free(QMPRequest *req)
>  /* Caller must hold mon->qmp.qmp_queue_lock */
>  static void monitor_qmp_cleanup_req_queue_locked(MonitorQMP *mon)
>  {
> +    bool need_resume = (!qmp_oob_enabled(mon) && mon->qmp_requests->length > 0)
> +        || mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX;

Can you explain why this condition is correct?

>      while (!g_queue_is_empty(mon->qmp_requests)) {
>          qmp_request_free(g_queue_pop_head(mon->qmp_requests));
>      }
> +    if (need_resume) {
> +        /*
> +         * Pairs with the monitor_suspend() in handle_qmp_command() in case the
> +         * queue gets cleared from a CHR_EVENT_CLOSED event before the dispatch
> +         * bh got scheduled.
> +         */
> +        monitor_resume(&mon->common);
> +    }
>  }
>
>  static void monitor_qmp_cleanup_queues(MonitorQMP *mon)

Is monitor_qmp_cleanup_req_queue_locked() the correct place?  It's
called from

* monitor_qmp_event() case CHR_EVENT_CLOSED via
  monitor_qmp_cleanup_queues(), as part of destroying the monitor's
  session state.

  This is the case you're trying to fix.  Correct?

  I figure monitor_resume() is safe because we haven't really destroyed
  anything, yet, we merely flushed the request queue.  Correct?

* monitor_data_destroy() via monitor_data_destroy_qmp() when destroying
  the monitor.

  Can need_resume be true in this case?  If yes, is monitor_resume()
  still safe?  We're in the middle of destroying the monitor...
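
Aside, for anyone trying to follow the suspend/resume pairing: the
suspend side this condition is presumably meant to mirror is the check
in handle_qmp_command(), which from memory looks roughly like this
(paraphrased, not a verbatim quote of the tree):

    /*
     * Suspend when we can't queue another request after this one.
     * With OOB disabled we queue at most one command, so we suspend
     * after every request.  The dispatcher bh resumes on dequeue.
     */
    if (!qmp_oob_enabled(mon) ||
        mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) {
        monitor_suspend(&mon->common);
    }

Since that check runs before the request is pushed, a suspended monitor
at cleanup time would imply either OOB is off and at least one request
is queued, or the queue reached QMP_REQ_QUEUE_LEN_MAX; need_resume
appears to reproduce exactly that.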