> -----Original Message-----
> From: Eric Blake [mailto:[email protected]]
> Sent: Tuesday, March 25, 2014 12:01 AM
> To: Paolo Bonzini; Gonglei (Arei); [email protected]
> Cc: [email protected]; [email protected]; Yanqiangjun; Zhaoyanbin
> (A); Zengjunliang; [email protected]
> Subject: Re: [PATCH] migration: Fix possible bug for migrate cancel
>
> [adding libvirt]
>
> On 03/24/2014 09:47 AM, Paolo Bonzini wrote:
> > On 24/03/2014 14:04, [email protected] wrote:
> >> From: zengjunliang <[email protected]>
> >>
> >> Return error for migrate cancel, when migration status is not
> >> MIG_STATE_SETUP or MIG_STATE_ACTIVE. Thus, libvirt can
> >> perceive the operation fails.
> >>
> >> Signed-off-by: zengjunliang <[email protected]>
> >> Signed-off-by: Gonglei <[email protected]>
> >
> > I think this is done on purpose, because canceling migration is racy.
> > Instead, libvirt should do "query-migrate" and check if the migration
> > was completed or canceled.
>
> Can you please give more details at how you are triggering the problem
> with libvirt? I think Paolo is probably right - the bug is more likely
> to be in libvirt not expecting the race and not recovering correctly
> when the race occurs, than it is to be in changing qemu's state algorithm.
>

When the migration progress reaches 100%, the migration status in QEMU
becomes MIG_STATE_COMPLETED. After that, it takes some time before the
migration thread's resources are reclaimed. If we cancel the migration
during this window, the migrate_fd_cancel function exits early without
reporting an error code. libvirt then treats the cancel operation as a
success, which is contrary to the facts: the migration has already
completed.
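
To make the window concrete, here is a small Python sketch of the race and of the check Paolo suggests (issue the cancel, then look at the status reported by "query-migrate"). It is only a toy model: FakeQemu, MigState and cancel_and_confirm are hypothetical names invented for illustration, not QEMU or libvirt code, and the silent no-op in migrate_cancel() is my reading of the behaviour described above.

```python
from enum import Enum


class MigState(Enum):
    # Status strings modelled on what "query-migrate" reports.
    SETUP = "setup"
    ACTIVE = "active"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


class FakeQemu:
    """Toy stand-in for the QEMU side (hypothetical, for illustration)."""

    def __init__(self, state):
        self.state = state

    def migrate_cancel(self):
        # Assumed behaviour: the cancel only takes effect in SETUP or
        # ACTIVE. In any other state it returns silently, with no error.
        if self.state in (MigState.SETUP, MigState.ACTIVE):
            self.state = MigState.CANCELLED

    def query_migrate(self):
        return {"status": self.state.value}


def cancel_and_confirm(qemu):
    """Request a cancel, then confirm the outcome via query_migrate().

    Returns True only if the migration really ended up cancelled.
    """
    qemu.migrate_cancel()
    return qemu.query_migrate()["status"] == "cancelled"


if __name__ == "__main__":
    # Cancel arrives while the migration is still running.
    print(cancel_and_confirm(FakeQemu(MigState.ACTIVE)))
    # Cancel arrives in the window after completion.
    print(cancel_and_confirm(FakeQemu(MigState.COMPLETED)))
```

In the second case the cancel call itself gives no hint that anything went wrong; only the follow-up status query shows that the migration had already completed.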

Best regards,
-Gonglei
