On 17/04/2024 14:13, Philippe Mathieu-Daudé wrote: > On 17/4/24 04:56, Li Zhijian via wrote: >> bdrv_activate_all() should not be called from the coroutine context, move >> it to the QEMU thread colo_process_incoming_thread() with the bql_lock >> protected. >> >> The backtrace is as follows: >> #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at >> ../block/graph-lock.c:260 >> #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop >> (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259 >> #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at >> ../block.c:6906 >> #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935 >> #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at >> ../migration/migration.c:793 >> #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) >> at ../util/coroutine-ucontext.c:175 >> #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6 >> >> CC: Fabiano Rosas <faro...@suse.de> > > Cc: qemu-sta...@nongnu.org > >> Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277 >> Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() >> GRAPH_RDLOCK") >> Signed-off-by: Li Zhijian <lizhij...@fujitsu.com> >> --- >> V2: fix missing bql_unlock() in error path. >> --- >> migration/colo.c | 18 ++++++++++-------- >> 1 file changed, 10 insertions(+), 8 deletions(-) >> >> diff --git a/migration/colo.c b/migration/colo.c >> index 84632a603e..5600a43d78 100644 >> --- a/migration/colo.c >> +++ b/migration/colo.c >> @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque) >> return NULL; >> } >> + /* Make sure all file formats throw away their mutable metadata */ >> + bql_lock(); > > Note there is also the convenient BQL_LOCK_GUARD() macro.
Thanks for your reminder, Philippe. IIUC, BQL_LOCK_GUARD() will lock all the rest code till the subroutine ends. So it's not suitable for this case where we only need a partial critical section. Thanks Zhijian > >> + bdrv_activate_all(&local_err); >> + if (local_err) { >> + bql_unlock(); >> + error_report_err(local_err); >> + return NULL; >> + } >> + bql_unlock(); >> + >> failover_init_state(); >