On Wed, Mar 04, 2026 at 01:28:00PM +0100, Kevin Wolf wrote:
> Commit 2155d2dd introduced rate limiting for BLOCK_IO_ERROR to emit an
> event only once a second. This makes sense for cases in which the guest
> keeps running and can submit more requests that would possibly also fail
> because there is a problem with the backend.
> 
> However, if the error policy is configured so that the VM is stopped on
> errors, this is both unnecessary because stopping the VM means that the
> guest can't issue more requests and in fact harmful because stopping the
> VM is an important state change that management tools need to keep track
> of even if it happens more than once in a given second. If an event is
> dropped, the management tool would see a VM randomly going to paused
> state without an associated error, so it has a hard time deciding how to
> handle the situation.
> 
> This patch disables rate limiting for action=stop by not relying on the
> event type alone any more in monitor_qapi_event_queue_no_reenter(), but
> checking action for BLOCK_IO_ERROR, too. If the error is reported to the
> guest or ignored, the rate limiting stays in place.
> 
> Fixes: 2155d2dd7f73 ('block-backend: per-device throttling of BLOCK_IO_ERROR 
> reports')
> Signed-off-by: Kevin Wolf <[email protected]>
> ---
>  qapi/block-core.json |  2 +-
>  monitor/monitor.c    | 21 ++++++++++++++++++++-
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index b66bf316e2f..da0b36a3751 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -5794,7 +5794,7 @@
>  # .. note:: If action is "stop", a `STOP` event will eventually follow
>  #    the `BLOCK_IO_ERROR` event.
>  #
> -# .. note:: This event is rate-limited.
> +# .. note:: This event is rate-limited, except if action is "stop".
>  #
>  # Since: 0.13
>  #
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 1273eb72605..37fa674cfe6 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -367,14 +367,33 @@ monitor_qapi_event_queue_no_reenter(QAPIEvent event, 
> QDict *qdict)
>  {
>      MonitorQAPIEventConf *evconf;
>      MonitorQAPIEventState *evstate;
> +    bool throttled;
>  
>      assert(event < QAPI_EVENT__MAX);
>      evconf = &monitor_qapi_event_conf[event];
>      trace_monitor_protocol_event_queue(event, qdict, evconf->rate);
> +    throttled = evconf->rate;
> +
> +    /*
> +     * Rate limit BLOCK_IO_ERROR only for action != "stop".
> +     *
> +     * If the VM is stopped after an I/O error, this is important information
> +     * for the management tool to keep track of the state of QEMU and we 
> can't
> +     * merge any events. At the same time, stopping the VM means that the 
> guest
> +     * can't send additional requests and the number of events is already
> +     * limited, so we can do without rate limiting.
> +     */
> +    if (event == QAPI_EVENT_BLOCK_IO_ERROR) {
> +        QDict *data = qobject_to(QDict, qdict_get(qdict, "data"));
> +        const char *action = qdict_get_str(data, "action");
> +        if (!strcmp(action, "stop")) {
> +            throttled = false;
> +        }
> +    }

Can this be handled in the same way as other events viat he
qapi_event_throttle_hash & qapi_event_throttle_equal methods ?

eg if action is "stop", then ensure "equal" is always false ?
Possibly add a random token to the hash but might not be needed
if 'equal' is always false

>  
>      QEMU_LOCK_GUARD(&monitor_lock);
>  
> -    if (!evconf->rate) {
> +    if (!throttled) {
>          /* Unthrottled event */
>          monitor_qapi_event_emit(event, qdict);
>      } else {
> -- 
> 2.53.0
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com       ~~        https://hachyderm.io/@berrange :|
|: https://libvirt.org          ~~          https://entangle-photo.org :|
|: https://pixelfed.art/berrange   ~~    https://fstop138.berrange.com :|


Reply via email to