There is a race condition on the value of throttle_timers on the ThrottleGroupMember structure. Those timers should be protected by the ThrottleGroup lock but sometimes are read without the lock and the code expects their value to remain constant.
In particular, there is an assertion that can be false as the timers can change between their value is checked and the assertion is run. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3194 Signed-off-by: Jorge Merlino <[email protected]> --- This patch fixes a race condition on an assertion on the value of the ThrottleGroupMember throttle_timers. The patch is minimal as it changes only a few lines. It will probably have to be refactored, maybe removing part of the code of the throttle_group_restart_queue procedure and duplicating it before the call. As it is now, this procedure needs to be called with the ThrottleGroup lock held as it will unlock it during its execution. I left it as is now so that the changes are clear for review. As I'm messing with locks and I'm not an expert on this codebase I'm not sure if there could be side effects I'm not aware of. --- block/throttle-groups.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/block/throttle-groups.c b/block/throttle-groups.c index 5329ff1fdb..54daf7841d 100644 --- a/block/throttle-groups.c +++ b/block/throttle-groups.c @@ -423,6 +423,8 @@ static void throttle_group_restart_queue(ThrottleGroupMember *tgm, { Coroutine *co; RestartData *rd = g_new0(RestartData, 1); + ThrottleState *ts = tgm->throttle_state; + ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); rd->tgm = tgm; rd->direction = direction; @@ -433,6 +435,7 @@ static void throttle_group_restart_queue(ThrottleGroupMember *tgm, assert(!timer_pending(tgm->throttle_timers.timers[direction])); qatomic_inc(&tgm->restart_pending); + qemu_mutex_unlock(&tg->lock); co = qemu_coroutine_create(throttle_group_restart_queue_entry, rd); aio_co_enter(tgm->aio_context, co); @@ -441,11 +444,15 @@ static void throttle_group_restart_queue(ThrottleGroupMember *tgm, void throttle_group_restart_tgm(ThrottleGroupMember *tgm) { ThrottleDirection dir; + ThrottleState *ts = tgm->throttle_state; + ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); if (tgm->throttle_state) { for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) { QEMUTimer *t = tgm->throttle_timers.timers[dir]; + qemu_mutex_lock(&tg->lock); if (timer_pending(t)) { + qemu_mutex_unlock(&tg->lock); /* If there's a pending timer on this tgm, fire it now */ timer_del(t); timer_cb(tgm, dir); @@ -505,7 +512,6 @@ static void timer_cb(ThrottleGroupMember *tgm, ThrottleDirection direction) /* The timer has just been fired, so we can update the flag */ qemu_mutex_lock(&tg->lock); tg->any_timer_armed[direction] = false; - qemu_mutex_unlock(&tg->lock); /* Run the request that was waiting for this timer */ throttle_group_restart_queue(tgm, direction); --- base-commit: ae56950eac7b61b1abf42003329ee0f3ce111711 change-id: 20260310-fix-race-condition-b4-c1bd186c496e Best regards, -- Jorge Merlino <[email protected]>
