* Kevin Wolf (kw...@redhat.com) wrote:
> On 11.04.2018 at 12:01, Jiri Denemark wrote:
> > On Tue, Apr 10, 2018 at 16:47:56 +0200, Kevin Wolf wrote:
> > > On 10.04.2018 at 16:22, Dr. David Alan Gilbert wrote:
> > > > * Kevin Wolf (kw...@redhat.com) wrote:
> > > > > On 10.04.2018 at 12:40, Dr. David Alan Gilbert wrote:
> > > > > > Hmm; having chatted to Jiri I'm OK with reverting it, on the
> > > > > > condition that I actually understand how this alternative
> > > > > > would work first.
> > > > > >
> > > > > > I can't currently see how a block-inactivate would be used.
> > > > > > I also can't see how a block-activate unless it's also with the
> > > > > > change that you're asking to revert.
> > > > > >
> > > > > > Can you explain the way you see it working?
> > > > >
> > > > > The key is making the delayed activation of block devices (and
> > > > > probably delayed announcement of NICs? - you didn't answer that
> > > > > part) optional instead of making it the default.
> > > >
> > > > NIC announcements are broken in similar but slightly different ways; we
> > > > did have a series on list to help a while ago but it never got merged;
> > > > I'd like to keep that mess separate.
> > >
> > > Okay. I just thought that it would make sense to have clear migration
> > > phases that are the same for all external resources that the QEMU
> > > processes use.
> >
> > I don't think NIC announcements should be delayed in this specific case
> > since we're dealing with a failure recovery which should be rare in
> > comparison to successful migration when we want NIC announcements to be
> > sent early. In other words, any NIC issues should be solved separately
> > and Laine would likely be a better person for discussing them since he
> > has a broader knowledge of all the fancy network stuff which libvirt
> > needs to coordinate with.
>
> Well, if I were the migration maintainer, I would insist on a properly
> designed phase model that solves the problem once and for all because it
> would be clear where everything belongs. We could still have bugs in the
> future, but that would be internal implementation bugs with no effect on
> the API.
My main reason for believing this wouldn't work is that most of the
things we've had recently have been things where we've found out about
subtle constraints that we previously didn't realise, and hence if we
were writing down the mythical phase model we wouldn't have put them in.
I'd have loved to have had some more discussion about what those
requirements were _before_ block locking went in a few versions back,
because unsurprisingly adding hard locking constraints shook a lot of
problems out (and IMHO was a much bigger API change than this change).

> But I'm not the maintainer and Dave prefers to deal with it basically as
> a bunch of one-off fixes, and that will work, too. It will probably
> clutter up the external API a bit (because the management tool will have
> to separately address migration of block devices, network devices and
> possibly other things in the future), but that shouldn't matter much for
> libvirt. Maybe what we do need is some documentation of the recommended
> process for performing a live migration so that management tools know
> which QMP commands they need to issue when.

I'd like to keep the networking stuff separate because it's got a whole
bunch of other interactions that we found out last time we tried to fix
it; in particular OpenStack's networking interaction with Libvirt isn't
quite what's expected and OpenStack have a whole bunch of different
network configurations whose behaviour when we change something is fun.
Oh and it depends heavily on the guest - which fortunately the block
stuff doesn't (because on modern virtio-net guests it does the modern
announce from the guest and so only ever happens once the guest CPU is
running which simplifies stuff massively).

Dave

> Kevin
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK