On Tue, Feb 13, 2018 at 06:56:51PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > The first allow-oob=true command. It's used on destination side when
> > the postcopy migration is paused and ready for a recovery. After
> > execution, a new migration channel will be established for postcopy to
> > continue.
> >
> > Signed-off-by: Peter Xu <pet...@redhat.com>
> > ---
> >  migration/migration.c | 26 ++++++++++++++++++++++++++
> >  migration/migration.h |  1 +
> >  migration/savevm.c    |  3 +++
> >  qapi/migration.json   | 20 ++++++++++++++++++++
> >  4 files changed, 50 insertions(+)
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index cf3a3f416c..bb57ed9ade 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -1422,6 +1422,32 @@ void qmp_migrate_incoming(const char *uri, Error **errp)
> >      once = false;
> >  }
> >
> > +void qmp_migrate_recover(const char *uri, Error **errp)
> > +{
> > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > +
> > +    if (mis->state != MIGRATION_STATUS_POSTCOPY_PAUSED) {
> > +        error_setg(errp, "Migrate recover can only be run "
> > +                   "when postcopy is paused.");
> > +        return;
> > +    }
>
> OK, if it did come back as Paused I don't think it can leave it again
> except this way, so I'm not too worried it being thread safe.
>
> > +    if (mis->postcopy_recover_triggered) {
> > +        error_setg(errp, "Migrate recovery is triggered already");
> > +        return;
> > +    }
> > +
> > +    /* This will make sure we'll only allow one recover for one pause */
> > +    mis->postcopy_recover_triggered = true;
>
> However, does that need to be done with a :
>    if (atomic_cmpxchg(mis->postcopy_recovery_triggered, false, true) ==
>        true) {
>        error_setg(errp, "Migrate recovery is triggered already");
>    }
>
> for the slim chance that someone did this command on the main and the
> oob monitor?
Yes, slim chance, but I agree. :) I wasn't that strict on this, but I
should be. Since we are at it, maybe I'll also...

>
> Dave
>
> > +    /*
> > +     * Note that this call will never start a real migration; it will
> > +     * only re-setup the migration stream and poke existing migration
> > +     * to continue using that newly established channel.
> > +     */
> > +    qemu_start_incoming_migration(uri, errp);
> > +}
> > +
> >  bool migration_is_blocked(Error **errp)
> >  {
> >      if (qemu_savevm_state_blocked(errp)) {
> > diff --git a/migration/migration.h b/migration/migration.h
> > index 88f5614b90..581bf4668b 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -65,6 +65,7 @@ struct MigrationIncomingState {
> >      QemuSemaphore colo_incoming_sem;
> >
> >      /* notify PAUSED postcopy incoming migrations to try to continue */
> > +    bool postcopy_recover_triggered;
> >      QemuSemaphore postcopy_pause_sem_dst;
> >      QemuSemaphore postcopy_pause_sem_fault;
> >  };
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index d40092a2b6..5f41b062ba 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2182,6 +2182,9 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
> >      /* Notify the fault thread for the invalidated file handle */
> >      postcopy_fault_thread_notify(mis);
> >
> > +    /* Clear the triggered bit to allow one recovery */
> > +    mis->postcopy_recover_triggered = false;
> > +

... move this clearing above migrate_set_state(), since there is also a
slim chance that we start handling migrate-recover before
postcopy_recover_triggered has been reset to false.

Thanks,

-- 
Peter Xu