* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Thu, May 05, 2016 at 04:39:45PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > Some interesting things that I have observed with this:
> > >
> > > - Post-copy, by its very nature, obviously ensured that the migration
> > >   would complete. While post-copy was running in pre-copy mode there was
> > >   a somewhat chaotic small impact on guest CPU performance, causing
> > >   performance to periodically oscillate between 400ms/GB and 800ms/GB.
> > >   This is less than the impact at the start of each migration iteration,
> > >   which was 1000ms/GB in this test. There was also a massive penalty at
> > >   the time of switchover from pre- to post-copy, as is to be expected.
> > >   The migration completed in the post-copy phase quite quickly, though.
> > >   For this workload, the number of iterations in pre-copy mode before
> > >   switching to post-copy did not have much impact. I expect a less
> > >   extreme workload would have shown more interesting results wrt the
> > >   number of pre-copy iterations:
> > >
> > >   https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/post-copy-iters.html
> >
> > Hmm; I hadn't actually expected that much performance difference during
> > the pre-copy phase (it used to in earlier post-copy versions, but the
> > later versions should have got simpler). The number of iterations
> > wouldn't make that much difference for your workload - because you're
> > changing all of memory, we're going to have to resend it. If you had a
> > workload where some of the memory was mostly static and some was rapidly
> > changing, then one or two passes to transfer the mostly static data
> > would show a benefit.
>
> Ok, so I have repeated the tests with a standard kernel. I also measured
> the exact same settings except without post-copy active, and I see the
> exact same magnitude of jitter without post-copy.
> IOW, this is not the fault of post-copy; it's a factor whenever migration
> is running.
OK, good.

> What is most interesting is that I see greater jitter in guest
> performance the higher the overall network transfer bandwidth is, i.e.
> with migration throttled to 100mbs the jitter is massively smaller than
> the jitter when it is allowed to use 10gbs.

That doesn't surprise me, for a few reasons; I think there are three main
sources of overhead:

  a) the syncing of the dirty bitmap
  b) write faults after a sync, when the guest redirties a page
  c) the CPU overhead of shuffling pages down a socket, checking whether
     they're zero, etc.

With a lower-bandwidth connection (a) happens more rarely and (c) is lower.
Also, since (a) happens more rarely, and you only fault on a page once
between syncs, (b) has a lower overhead too.

> Also, I only see the jitter on my 4 vCPU guest, not the 1 vCPU guest.
>
> The QEMU process is confined to only run on 4 pCPUs, so I believe the
> cause of this jitter is simply a result of the migration thread in QEMU
> stealing a little time from the vCPU threads.

Oh yes, that's cruel - you need an extra pCPU for migration if you've got
a fast network connection, because (a) & (c) are quite expensive.

> IOW, completely expected, and there is no penalty to having post-copy
> enabled even if you never get beyond the pre-copy stage :-)

Great.

Dave

> Regards,
> Daniel
> --
> |: http://berrange.com  -o-  http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org   -o-  http://virt-manager.org :|
> |: http://autobuild.org -o-  http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
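[For readers following along: the bandwidth cap and the pre- to post-copy
switchover discussed in this thread are driven through QMP. A sketch of the
command sequence - the `max-bandwidth` parameter (bytes/sec) is the newer
spelling and older QEMUs used `migrate_set_speed` instead, so check your
version's schema; the destination URI is illustrative:]

```json
{ "execute": "migrate-set-capabilities",
  "arguments": { "capabilities": [
      { "capability": "postcopy-ram", "state": true } ] } }

{ "execute": "migrate-set-parameters",
  "arguments": { "max-bandwidth": 12500000 } }

{ "execute": "migrate",
  "arguments": { "uri": "tcp:desthost:4444" } }

{ "execute": "migrate-start-postcopy" }
```

12500000 bytes/sec is roughly the 100mbs throttle used in the tests above;
`migrate-start-postcopy` is what triggers the switchover penalty Daniel
measured.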
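[Daniel's pinning observation maps directly onto libvirt's cputune: if all
4 vCPUs are pinned and nothing is left over, the migration thread steals
vCPU time. A sketch giving the emulator threads (which include the
migration thread) their own pCPU - element names are libvirt's, the cpuset
values are illustrative for a host with CPUs 0-4 free:]

```xml
<cputune>
  <!-- one dedicated host CPU per vCPU -->
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
  <!-- keep the migration (emulator) thread off the vCPU pCPUs -->
  <emulatorpin cpuset='4'/>
</cputune>
```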