On 02/10/2014 10:52, Alexey Kardashevskiy wrote:
> When migrated using libvirt with "--copy-storage-all", at the end of
> migration there is race between NBD mirroring task trying to do flush
> and migration completion, both end up invalidating cache. Since qcow2
> driver does not handle this situation very well, random crashes happen.
>
> This disables the BDRV_O_INCOMING flag for the block device being migrated
> and restores it when NBD task is done.
>
> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> ---
>
>
> The commit log is not full and most likely incorrect as well
> as the patch :) Please, help. Thanks!
>
> The patch seems to fix the initial problem though.
>
>
> btw is there any easy way to migrate one QEMU to another
> using NBD (i.e. not using "migrate -b") and not using libvirt?
> What would the command line be? Debugging with libvirt is real
> pain :(
>
>
> ---
>  block.c     | 17 ++++-------------
>  migration.c |  1 -
>  nbd.c       | 11 +++++++++++
>  3 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/block.c b/block.c
> index c5a251c..ed72e0a 100644
> --- a/block.c
> +++ b/block.c
> @@ -5073,6 +5073,10 @@ void bdrv_invalidate_cache_all(Error **errp)
>      QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>          AioContext *aio_context = bdrv_get_aio_context(bs);
>
> +        if (!(bs->open_flags & BDRV_O_INCOMING)) {
> +            continue;
> +        }
> +
>          aio_context_acquire(aio_context);
>          bdrv_invalidate_cache(bs, &local_err);
>          aio_context_release(aio_context);

This part is okay, though perhaps we should add it to
bdrv_invalidate_cache instead?

> @@ -5083,19 +5087,6 @@ void bdrv_invalidate_cache_all(Error **errp)
>      }
>  }
>
> -void bdrv_clear_incoming_migration_all(void)
> -{
> -    BlockDriverState *bs;
> -
> -    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> -        AioContext *aio_context = bdrv_get_aio_context(bs);
> -
> -        aio_context_acquire(aio_context);
> -        bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING);
> -        aio_context_release(aio_context);
> -    }
> -}
> -
>  int bdrv_flush(BlockDriverState *bs)
>  {
>      Coroutine *co;
> diff --git a/migration.c b/migration.c
> index 8d675b3..c49a05a 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque)
>      }
>      qemu_announce_self();
>
> -    bdrv_clear_incoming_migration_all();
>      /* Make sure all file formats flush their mutable metadata */
>      bdrv_invalidate_cache_all(&local_err);
>      if (local_err) {

This part I don't understand. Shouldn't you at least be adding

    bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING);

to bdrv_invalidate_cache?

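To make the suggestion above concrete, here is a rough, standalone sketch
of what moving both the BDRV_O_INCOMING check and the flag clearing into
bdrv_invalidate_cache might look like. It is not the real block.c code:
the struct is a stand-in with only the fields needed here, the flag value
is picked for the sketch, invalidate_count is a hypothetical counter that
replaces the call into the format driver, and the Error **errp parameter
and AioContext locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag value for this sketch; do not rely on it matching QEMU. */
#define BDRV_O_INCOMING 0x0800

/* Stand-in for the real BlockDriverState; only what the sketch needs. */
typedef struct BlockDriverState {
    int open_flags;
    int invalidate_count;   /* hypothetical: counts driver invalidations */
} BlockDriverState;

/*
 * Simplified signature (the real function also takes Error **errp).
 * Returns true only when this call actually invalidated the cache.
 */
static bool bdrv_invalidate_cache(BlockDriverState *bs)
{
    assert(bs != NULL);

    if (!(bs->open_flags & BDRV_O_INCOMING)) {
        return false;               /* an earlier caller already did it */
    }

    /* Clear the flag first, so a second caller becomes a no-op. */
    bs->open_flags &= ~BDRV_O_INCOMING;
    bs->invalidate_count++;         /* real code would call the driver here */
    return true;
}

/* With the check in the per-device function, the loop needs no test. */
static void bdrv_invalidate_cache_all(BlockDriverState *states, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bdrv_invalidate_cache(&states[i]);
    }
}
```

In this model, whichever of the NBD export and the migration completion
path runs first invalidates the cache once, and the other call returns
without touching the driver, so no caller has to save and restore the
flag.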
> diff --git a/nbd.c b/nbd.c
> index e9b539b..7b479c0 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -106,6 +106,7 @@ struct NBDExport {
>      off_t dev_offset;
>      off_t size;
>      uint32_t nbdflags;
> +    bool restore_incoming;
>      QTAILQ_HEAD(, NBDClient) clients;
>      QTAILQ_ENTRY(NBDExport) next;
>
> @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
>      exp->ctx = bdrv_get_aio_context(bs);
>      bdrv_ref(bs);
>      bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp);
> +
> +    if (bs->open_flags & BDRV_O_INCOMING) {
> +        bdrv_invalidate_cache(bs, NULL);
> +        exp->restore_incoming = !!(bs->open_flags & BDRV_O_INCOMING);
> +        bs->open_flags &= ~(BDRV_O_INCOMING);
> +    }
> +
>      return exp;
>  }
>
> @@ -1021,6 +1029,9 @@ void nbd_export_close(NBDExport *exp)
>      if (exp->bs) {
>          bdrv_remove_aio_context_notifier(exp->bs, bs_aio_attached,
>                                           bs_aio_detach, exp);
> +        if (exp->restore_incoming) {
> +            exp->bs->open_flags |= BDRV_O_INCOMING;
> +        }
>          bdrv_unref(exp->bs);
>          exp->bs = NULL;
>      }
>

For this, I don't think you even need exp->restore_incoming, and then it
can simply be a one-liner

+    bdrv_invalidate_cache(bs, NULL);

if you modify bdrv_invalidate_cache instead of bdrv_invalidate_cache_all.

Paolo