When qemu_coroutine_enter is executed in a loop (even QLIST_FOREACH_SAFE), the newly entered coroutine can modify the list, for example by removing an element, causing problems when control is given back to the caller, which continues iterating over the same list.
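
As a rough illustration of the hazard and of the restart-the-walk approach
used for blkdebug_debug_resume, here is a minimal standalone sketch. It does
not use QEMU code: Req, req_add, fake_enter and resume_matching are
hypothetical names, a hand-rolled singly linked list stands in for QLIST, and
fake_enter only imitates a coroutine that removes another element while it
runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a suspended request; not the blkdebug struct. */
typedef struct Req {
    int tag;
    struct Req *next;
} Req;

static Req *head;      /* list of suspended requests */
static int dropped;    /* elements removed behind the caller's back */

/* Push a new request at the front of the list. */
static void req_add(int tag)
{
    Req *r = malloc(sizeof(*r));
    assert(r != NULL);
    r->tag = tag;
    r->next = head;
    head = r;
}

/*
 * Stand-in for qemu_coroutine_enter(): the resumed work finishes and, as a
 * side effect, removes whatever element is currently first in the list.
 */
static void fake_enter(Req *r)
{
    free(r);
    if (head != NULL) {
        Req *victim = head;
        head = victim->next;
        free(victim);
        dropped++;
    }
}

/* Resume every request with the given tag; returns how many were resumed. */
static int resume_matching(int tag)
{
    int resumed = 0;
    Req **pp;

restart:
    for (pp = &head; *pp != NULL; pp = &(*pp)->next) {
        Req *r = *pp;
        if (r->tag == tag) {
            *pp = r->next;   /* unlink before handing over control */
            fake_enter(r);   /* may remove other elements, even r->next */
            resumed++;
            goto restart;    /* every saved position may be stale now */
        }
    }
    return resumed;
}
```

With five requests added with tags 7, 7, 3, 7, 7, the first hand-off frees
the very element that a walk with a saved next pointer would visit next.
Restarting from the head after each hand-off avoids touching it.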

Patch 1 solves the issue in blkdebug_debug_resume by restarting the list walk
after every qemu_coroutine_enter if the list has to be fully iterated.
Patches 2, 3 and 4 fix blkdebug_debug_event by gathering all actions that the
rules make in a counter and invoking the respective qemu_coroutine_yield only
after processing all requests.

Patches 5 and 6 are somewhat independent of the others: patch 5 removes the
need for the new_state field, and patch 6 adds a lock to protect rules and
suspended_reqs. Right now everything works because it is protected by the
AioContext lock.
This series prepares for the current proposal to remove the AioContext lock
and use finer-grained locks instead, so that multiple iothreads can execute
in the same block device.

Signed-off-by: Emanuele Giuseppe Esposito <eespo...@redhat.com>
---
v5:
* Add comment in patch 1 to explain why we don't need _SAFE in for loop
* Move the state update (s->state = new_state) in patch 5, to maintain the
  same existing effect in all patches

Emanuele Giuseppe Esposito (6):
  blkdebug: refactor removal of a suspended request
  blkdebug: move post-resume handling to resume_req_by_tag
  blkdebug: track all actions
  blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  block/blkdebug: remove new_state field and instead use a local
    variable
  blkdebug: protect rules and suspended_reqs with a lock

 block/blkdebug.c | 136 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 92 insertions(+), 44 deletions(-)

-- 
2.31.1