On Tue, Nov 06, 2012 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
> On 02/11/12 23:12, Orit Wasserman wrote:
> >On 11/02/2012 05:10 AM, David Gibson wrote:
> >>Asking for some advice on the list.
> >>
> >>I have prototype savevm and migration support ready for the pseries
> >>machine.  They seem to work under simple circumstances (idle guest).
> >>To test them more extensively I've been attempting to perform live
> >>migrations (just over tcp->localhost) while the guest is active with
> >>something.  In particular I've tried while using octave to do matrix
> >>multiply (so exercising the FP unit) and my colleague Alexey has tried
> >>during some video encoding.
> >>
> >As you are doing local migration one option is to set the speed higher
> >than line speed, as we don't actually send the data, another is to set high
> >downtime.
> >
> >>However, in each of these cases, we've found that the migration only
> >>completes and the source instance only stops after the intensive
> >>workload has (just) completed.  What I surmise is happening is that
> >>the workload is touching memory pages fast enough that the ram
> >>migration code is never getting below the threshold to complete the
> >>migration until the guest is idle again.
> >>
> >The workload you chose is really bad for live migration, as all the guest
> >does is dirtying his memory. I recommend looking for workload that does
> >some networking or disk IO.
> >Vinod succeeded running SwingBench and SLOB benchmarks that converged ok,
> >I don't know if they run on pseries, but similar workload should be
> >ok(small database/warehouse).
> >We found out that SpecJbb on the other hand is hard to converge.
> >Web workload or video streaming also do the trick.
> >
> My ffmpeg workload is simple encoding h263+ac3 to h263+ac3, 64*36
> pixels. So it should not be dirtying memory too much. Or is it?

Oh.. if you're encoding the same format to the same format it may well
be optimized and therefore memory limited.  I was envisaging encoding
an uncompressed format to a highly compressed format, which should be
compute limited rather than memory bandwidth limited.  The size and
resolution of the input don't really matter as long as:
   * the output size is much smaller than the input size
and
   * it takes several minutes for the full encode to give a reasonable
     amount of time for the migrate to converge.

>
> (qemu) info migrate
> capabilities: xbzrle: off
> Migration status: completed
> total time: 14538 milliseconds
> downtime: 1273 milliseconds
> transferred ram: 389961 kbytes
> remaining ram: 0 kbytes
> total ram: 1065024 kbytes
> duplicate: 181949 pages
> normal: 97446 pages
> normal bytes: 389784 kbytes
>
> How many bytes were actually transferred? "duplicate" * 4K = 745MB?
>
> Is there any tool in QEMU to see how many pages are used/dirty/etc?
> "info" does not seem to have any kind of such statistic.
>
> btw the new guest did not resume (qemu still responds on commands)
> but this is probably our problem within "pseries" platform. What is

Uh, that's a bug, and I'm not sure when it broke.  If the migrate
isn't even working we're premature in attempting to work out why it
isn't happening when we expect.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
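
For reference, the arithmetic behind the quoted "info migrate" figures can be checked with a few lines of Python. This is only a rough sketch: the 4 KiB page size is an assumption taken from the "* 4K" in the question, the dictionary keys and the summarize() helper are made up for illustration, and the reading that "duplicate" pages are uniform-content pages sent as a short marker rather than in full is my understanding of QEMU's counters, not something this thread confirms.

```python
# Figures copied from the "info migrate" output quoted in this thread.
PAGE_KIB = 4  # assumption: 4 KiB pages, as the "* 4K" in the question implies

stats = {
    "transferred_ram_kib": 389961,
    "total_ram_kib": 1065024,
    "duplicate_pages": 181949,
    "normal_pages": 97446,
}


def summarize(s, page_kib=PAGE_KIB):
    """Derive a few sanity-check numbers from the raw counters (hypothetical helper)."""
    normal_kib = s["normal_pages"] * page_kib
    duplicate_kib = s["duplicate_pages"] * page_kib
    total_pages = s["total_ram_kib"] // page_kib
    pages_handled = s["normal_pages"] + s["duplicate_pages"]
    return {
        # Should match the reported "normal bytes" line.
        "normal_kib": normal_kib,
        # Size the duplicate pages would have had if sent in full.
        "duplicate_kib_if_sent_in_full": duplicate_kib,
        # Transferred data not accounted for by normal pages
        # (headers, duplicate-page markers, and so on).
        "overhead_kib": s["transferred_ram_kib"] - normal_kib,
        "total_pages": total_pages,
        # Page sends beyond one pass over guest RAM, i.e. a lower bound
        # on how many sends were repeats of pages dirtied during migration.
        "extra_page_sends": pages_handled - total_pages,
    }


if __name__ == "__main__":
    for name, value in summarize(stats).items():
        print(f"{name}: {value}")
```

Under those assumptions, the normal pages account for 389784 kbytes, which matches the reported "normal bytes" and is within 177 kbytes of "transferred ram", so roughly 390MB actually went over the wire. The 745MB figure is what the duplicate pages would have cost had they been sent in full. Normal plus duplicate pages also exceed the page count of total RAM by 13139, which suggests at least that many sends were repeats of pages dirtied while the migration was running.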