Re: [Qemu-devel] Testing migration under stress

David Gibson Mon, 05 Nov 2012 22:54:20 -0800

On Tue, Nov 06, 2012 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
> On 02/11/12 23:12, Orit Wasserman wrote:
> >On 11/02/2012 05:10 AM, David Gibson wrote:
> >>Asking for some advice on the list.
> >>
> >>I have prototype savevm and migration support ready for the pseries
> >>machine.  They seem to work under simple circumstances (idle guest).
> >>To test them more extensively I've been attempting to perform live
> >>migrations (just over tcp->localhost) while the guest is active with
> >>something.  In particular I've tried while using octave to do matrix
> >>multiply (so exercising the FP unit) and my colleague Alexey has tried
> >>during some video encoding.
> >>
> >As you are doing local migration, one option is to set the speed higher
> >than line speed, as we don't actually send the data; another is to set a
> >high downtime.
> >
> >>However, in each of these cases, we've found that the migration only
> >>completes and the source instance only stops after the intensive
> >>workload has (just) completed.  What I surmise is happening is that
> >>the workload is touching memory pages fast enough that the ram
> >>migration code is never getting below the threshold to complete the
> >>migration until the guest is idle again.
> >>
> >The workload you chose is really bad for live migration, as all the guest
> >does is dirty its memory. I recommend looking for a workload that does some
> >networking or disk IO.
> >Vinod succeeded in running SwingBench and SLOB benchmarks that converged OK.
> >I don't know if they run on pseries, but a similar workload should be OK
> >(small database/warehouse).
> >We found that SpecJbb, on the other hand, is hard to converge.
> >A web workload or video streaming also does the trick.
> 
> 
> My ffmpeg workload is simple encoding h263+ac3 to h263+ac3, 64*36
> pixels. So it should not be dirtying memory too much. Or is it?


Oh.. if you're encoding the same format to the same format it may well
be optimized and therefore memory limited.  I was envisaging encoding
an uncompressed format to a highly compressed format, which should be
compute limited rather than memory bandwidth limited.  The size and
resolution of the input don't really matter as long as:
  * the output size is much smaller than the input size, and
  * the full encode takes several minutes, to give a reasonable
    amount of time for the migrate to converge.
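
To make the convergence argument concrete, here is a toy model of pre-copy
migration. It is only a sketch, not QEMU's actual algorithm: the function
name, the parameters and the fixed dirty rate are all assumptions chosen for
illustration. Each round resends whatever the guest dirtied while the
previous round was on the wire, and the migration completes once the
remaining data fits within the allowed downtime.

```python
def precopy_rounds(ram_mb, bandwidth_mb_s, dirty_mb_s, max_downtime_s,
                   max_rounds=30):
    """Toy pre-copy model (hypothetical, not QEMU code).

    Returns (converged, rounds, remaining_mb).
    """
    remaining = float(ram_mb)
    for round_no in range(1, max_rounds + 1):
        # Stop the guest and finish if the rest fits in the downtime budget.
        if remaining / bandwidth_mb_s <= max_downtime_s:
            return True, round_no, remaining
        # Otherwise send this round; the guest keeps dirtying pages meanwhile.
        send_time = remaining / bandwidth_mb_s
        remaining = min(float(ram_mb), dirty_mb_s * send_time)
    return False, max_rounds, remaining


# Guest dirties memory slower than the link can carry it: converges.
print(precopy_rounds(1024, 100, 20, 0.03))
# Guest dirties memory faster than the link: never gets below the threshold.
print(precopy_rounds(1024, 100, 150, 0.03))
# Same busy guest, but with a very large allowed downtime: completes at once.
print(precopy_rounds(1024, 100, 150, 11))
```

In this model the remaining data shrinks by the ratio dirty rate / bandwidth
each round, so a memory-bound workload never converges, while raising the
speed limit or the allowed downtime (as suggested earlier in the thread)
lets it finish.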
> 
> (qemu) info migrate
> capabilities: xbzrle: off
> Migration status: completed
> total time: 14538 milliseconds
> downtime: 1273 milliseconds
> transferred ram: 389961 kbytes
> remaining ram: 0 kbytes
> total ram: 1065024 kbytes
> duplicate: 181949 pages
> normal: 97446 pages
> normal bytes: 389784 kbytes
> 
> How many bytes were actually transferred? "duplicate" * 4K = 745MB?
> 
> Is there any tool in QEMU to see how many pages are used/dirty/etc?
> "info" does not seem to have any kind of such statistic.
> 
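
The arithmetic behind the question above can be checked from the quoted
"info migrate" output. This is a small sketch that assumes 4K pages (as the
"* 4K" in the question does); the dictionary keys and the function name are
made up for illustration.

```python
PAGE_KB = 4  # assumption: 4K pages, as implied by the question

# Figures copied from the "info migrate" output quoted above.
stats = {
    "transferred_ram_kb": 389961,
    "total_ram_kb": 1065024,
    "duplicate_pages": 181949,
    "normal_pages": 97446,
}


def summarize(s, page_kb=PAGE_KB):
    """Derive a few totals from the migration counters."""
    normal_kb = s["normal_pages"] * page_kb
    total_pages = s["total_ram_kb"] // page_kb
    pages_handled = s["duplicate_pages"] + s["normal_pages"]
    return {
        "normal_kb": normal_kb,
        # Size the duplicate pages would have if each were sent in full.
        "duplicate_kb_if_sent_in_full": s["duplicate_pages"] * page_kb,
        # Whatever was transferred beyond the normal pages.
        "overhead_kb": s["transferred_ram_kb"] - normal_kb,
        "total_pages": total_pages,
        # Pages counted beyond a single pass over guest RAM.
        "pages_beyond_one_pass": pages_handled - total_pages,
    }


for key, value in summarize(stats).items():
    print(f"{key}: {value}")
```

The normal pages account for 389784 kbytes, which matches the "normal bytes"
line. The duplicate pages would come to 727796 kbytes (about 745 MB in
decimal units) if sent in full, but "transferred ram" is only 177 kbytes
larger than "normal bytes", on the order of a byte per duplicate page. That
suggests duplicate pages are not sent as full pages, though this is an
inference from the numbers rather than something confirmed in this thread.
The two page counters together exceed the 266256 pages of guest RAM by
13139, so at least that many pages were handled more than once.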
> btw the new guest did not resume (qemu still responds to commands)
> but this is probably our problem within the "pseries" platform. What is

Uh, that's a bug, and I'm not sure when it broke.  If the migrate
isn't even working we're premature in attempting to work out why it
isn't happening when we expect.


-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
