When a BlockDriverState is about to be reopened, it can trigger
certain operations that need to write to disk. During this process a
different block job can be woken up. If that block job completes and
also needs to call bdrv_reopen(), it may have to do so on the same
BlockDriverState that is still in the process of being reopened.

This can have fatal consequences, like in this example:

  1) Block job A starts and sleeps after a while.
  2) Block job B starts and tries to reopen node1 (a qcow2 file).
  3) Reopening node1 means flushing and replacing its qcow2 cache.
  4) While the qcow2 cache is being flushed, job A wakes up.
  5) Job A completes and reopens node1, replacing its cache.
  6) Job B resumes, but the cache that was being flushed no longer
     exists.

This patch pauses all block jobs during bdrv_reopen_multiple(), so
that step 4 can never happen and the operation is safe.

Note that this scenario can only happen if both bdrv_reopen() calls
are made by block jobs on the same backing chain. Otherwise there's no
chance that the same BlockDriverState appears in both reopen queues.

Signed-off-by: Alberto Garcia <be...@igalia.com>
---
 block.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/block.c b/block.c
index bb1f1ec..c80b528 100644
--- a/block.c
+++ b/block.c
@@ -2087,9 +2087,19 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
     int ret = -1;
     BlockReopenQueueEntry *bs_entry, *next;
     Error *local_err = NULL;
+    BlockJob *job = NULL;
 
     assert(bs_queue != NULL);
 
+    /* Pause all block jobs */
+    while ((job = block_job_next(job))) {
+        AioContext *aio_context = blk_get_aio_context(job->blk);
+
+        aio_context_acquire(aio_context);
+        block_job_pause(job);
+        aio_context_release(aio_context);
+    }
+
     bdrv_drain_all();
 
     QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
@@ -2120,6 +2130,17 @@ cleanup:
         g_free(bs_entry);
     }
     g_free(bs_queue);
+
+    /* Resume all block jobs */
+    job = NULL;
+    while ((job = block_job_next(job))) {
+        AioContext *aio_context = blk_get_aio_context(job->blk);
+
+        aio_context_acquire(aio_context);
+        block_job_resume(job);
+        aio_context_release(aio_context);
+    }
+
     return ret;
 }
 
-- 
2.9.3
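
For readers who want to see the race outside QEMU, here is a toy model
in C of steps 3 to 6 of the example. Nothing in it is QEMU code:
ToyNode, ToyJob, toy_yield() and toy_reopen() are made-up names, the
qcow2 cache is reduced to a generation counter, and the coroutine yield
is modelled as a plain function call. It is only a sketch of why
pausing the other job before the flush keeps the cache intact.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a node; the cache is just a counter that
 * changes every time the cache is replaced. */
typedef struct ToyNode {
    unsigned cache_generation;
} ToyNode;

/* Hypothetical stand-in for a block job working on the same node. */
typedef struct ToyJob {
    ToyNode *node;
    int pause_count;   /* > 0 means the job must not make progress */
    bool done;
} ToyJob;

/* Models a yield point: if the other job is runnable, it completes
 * here and reopens its node, which replaces the cache (steps 4 and 5). */
void toy_yield(ToyJob *other)
{
    if (other != NULL && !other->done && other->pause_count == 0) {
        other->node->cache_generation++;
        other->done = true;
    }
}

/* Models a reopen: flush the current cache (which yields), then
 * replace it. Returns true if the cache being flushed was still the
 * node's cache when the flush finished. */
bool toy_reopen(ToyNode *node, ToyJob *other, bool pause_jobs)
{
    unsigned flushing = node->cache_generation;
    bool intact;

    if (pause_jobs && other != NULL) {
        other->pause_count++;          /* pause before flushing */
    }

    toy_yield(other);                  /* the flush yields here */
    intact = (node->cache_generation == flushing);
    node->cache_generation++;          /* install the new cache */

    if (pause_jobs && other != NULL) {
        other->pause_count--;          /* resume once the reopen is done */
    }
    return intact;
}
```

With pause_jobs set to false, toy_reopen() reports that the cache was
replaced underneath it. With pause_jobs set to true, the other job only
runs at a later toy_yield(), after the reopen has finished.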