On Oct 25, 2013 8:30 AM, "Paolo Bonzini" <pbonz...@redhat.com> wrote:
>
> On 25/10/2013 06:58, Lei Li wrote:
> > Right now I only have inaccurate numbers without the new vmsplice.
> > Based on the output of info migrate, as the guest RAM size increases,
> > the 'total time' is several times lower than with the current live
> > migration, but the 'downtime' is poor.
>
> Of course.
>
> > For a 1GB RAM guest,
> >
> > total time: 702 milliseconds
> > downtime: 692 milliseconds
> >
> > And when the guest RAM size increases exponentially, those numbers
> > grow in proportion to it.
> >
> > I will make a list of the performance with the new vmsplice later, I am
> > sure it'd be much better than this at least.
>
> Yes, please.  Is the memory usage still 2x without vmsplice?
>
> I think you have a nice proof of concept, but on the other hand this
> probably needs to be coupled with some kind of postcopy live migration,
> that is:
>
> * the source starts sending data
>
> * but the destination starts running immediately
>
> * if the machine needs a page that is missing, the destination asks the
> source to send it
>
> * as soon as it arrives, the destination can restart
>
> Using postcopy is problematic for reliability: if the destination fails,
> the virtual machine is lost because the source doesn't have the latest
> content of memory.  However, this is a much, much smaller problem for
> live QEMU upgrade where the network cannot fail.
>
> If you do this, you can achieve pretty much instantaneous live upgrade,
> well within your original 200 ms goals.

This is actually a very nice justification for post copy.

Regards,

Anthony Liguori

> But the flipping code with
> vmsplice should be needed anyway to avoid doubling memory usage, and
> it's looking pretty good in this version already!  I'm relieved that the
> RDMA code was designed right!
>
> Paolo
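
For illustration only, the postcopy flow Paolo lists can be sketched as a toy simulation. This is not QEMU code: the names Source, Destination and postcopy_migrate are hypothetical, pages are plain dictionary entries, and the "network" is a direct method call. It only shows the ordering: the destination runs immediately, missing pages are fetched on demand, and the source keeps streaming the rest in the background.

```python
class Source:
    """Hypothetical source side: holds the guest pages to be transferred."""

    def __init__(self, pages):
        self.pages = dict(pages)

    def fetch(self, page_no):
        # Serve a page that the destination requested on demand.
        return self.pages[page_no]


class Destination:
    """Hypothetical destination side: starts running with no pages at all."""

    def __init__(self, source):
        self.source = source
        self.pages = {}
        self.faults = 0  # number of on-demand requests sent to the source

    def receive(self, page_no, data):
        # Background stream from the source. A page that is already
        # present is kept, since the destination copy is the newer one.
        self.pages.setdefault(page_no, data)

    def read(self, page_no):
        # The guest touches a page; if it is missing, stall and ask the
        # source for it, then resume as soon as it arrives.
        if page_no not in self.pages:
            self.faults += 1
            self.pages[page_no] = self.source.fetch(page_no)
        return self.pages[page_no]


def postcopy_migrate(source, dest, guest_accesses):
    """Interleave guest execution on the destination with background sending."""
    background = iter(sorted(source.pages))
    for page_no in guest_accesses:
        dest.read(page_no)                 # guest is already running here
        nxt = next(background, None)       # source pushes one more page
        if nxt is not None:
            dest.receive(nxt, source.pages[nxt])
    for nxt in background:                 # drain whatever is left
        dest.receive(nxt, source.pages[nxt])


source = Source({0: b"a", 1: b"b", 2: b"c", 3: b"d"})
dest = Destination(source)
postcopy_migrate(source, dest, guest_accesses=[3, 0, 2])
print(dest.faults, dest.pages == source.pages)
```

The sketch also makes the reliability concern visible: once the guest writes on the destination, only dest.pages holds the latest memory contents, so losing the destination loses the guest.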