Am 04.10.23 um 18:13 schrieb Vladimir Sementsov-Ogievskiy: >> @@ -833,7 +836,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, >> DBMLoadState *s) >> bdrv_disable_dirty_bitmap(s->bitmap); >> if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) { >> + AioContext *ctx = bdrv_get_aio_context(s->bs); >> + aio_context_acquire(ctx); >> bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err); >> + aio_context_release(ctx); > > Would not this deadlock in current code? When we have the only one aio > context and therefore we are already in it? >
Yes, I noticed that myself a bit later ;) Am 01.08.23 um 09:57 schrieb Fiona Ebner: > And the patch itself also got an issue. AFAICT, when > qcow2_co_load_vmstate() is called, we already have acquired the context > for the drive we load the snapshot from, and since > polling/AIO_WAIT_WHILE requires that the caller has acquired the context > exactly once, we'd need to distinguish if the dirty bitmap is for that > drive (can't acquire the context a second time) or a different drive > (need to acquire the context for the first time). Quoted from a reply in this thread https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg00007.html Am 04.10.23 um 18:13 schrieb Vladimir Sementsov-Ogievskiy: > If as Juan said, we rework incoming migration coroutine to be a separate > thread, this patch becomes more correct, I think.. Yes, it would become an issue when the function is called from a non-coroutine context. > If keep coroutine, I think, we should check are we already in that aio > context, and if so we should not acquire it. In coroutine context, we don't need to acquire it, but it shouldn't hurt either and this approach should work for non-coroutine context too. The question is if such conditional lock-taking is acceptable (do we already have something similar somewhere?) or if it can be avoided somehow like it was preferred in another one of my patches: Am 05.05.23 um 16:59 schrieb Peter Xu: > On Fri, May 05, 2023 at 03:46:52PM +0200, Fiona Ebner wrote: >> To fix it, ensure that the BQL is held during setup. To avoid changing >> the behavior for migration too, introduce conditionals for the setup >> callbacks that need the BQL and only take the lock if it's not already >> held. > > The major complexity of this patch is the "conditionally taking" part. Quoted from https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg01514.html