On 08/03/2010 18:30, Martin Mueller - Sun Germany SE wrote:
> Agreed, but Terry and me always agree ;-)
>
> I might find some spare time this week, and will try a few things (we're
> also testing LDOMs + cluster integration on that setup, the migration
> timing does have an influence on the failover times...):
>
> * measure influence of MAUs (make control bigger)
Martin, in the results I sent you, adding more than one MAU to the control
domain does not make any difference. Adding more CPUs does, up to 16 CPUs:
the migration task has 16 threads, so adding a little more than 16 will
allow the migration plus other things to happen smoothly.
> * monitor control dom utilization (like Mike suggested mpstat is the
> tool of choice here...)
Monitoring the system with ldm list -o cpu shows that the control domain is
100% busy for the duration of the migration.
> * maybe try a b2b 10GBE line
No point. I have tried, and because of the CPU limiting factor you will not
get any difference in performance.

T
>
> Any other ideas? I'll keep you posted on the progress
>
>
> Regards
> Martin
>
> Terry Smith wrote:
>> Hi Mike
>>
>> I also have done some timing tests and come to the same conclusions
>> broadly as Martin.
>>
>>
>> On 08/03/2010 17:55, Mike Gerdts wrote:
>>> On Mon, Mar 8, 2010 at 11:06 AM, Martin Mueller - Sun Germany SE
>>> <Martin.Mueller at sun.com> wrote:
>>>> Hi *
>>>>
>>>> I wanted to share some timing data of LDOM V1.3 warm migrations.
>>>> The setup under test was:
>>>>
>>>> * two T5120 connected via switched Fast Ethernet
>>>> * 1.4GHz T2 Chips
>>>> * S10U8+latest EIS CD, LDOMs V1.3 SUNWldm software
>>>> * the moving LDOM was running "xclock -update 1" (to have it doing
>>>> something, and to monitor quiescing during migration)
>>>> * Timing command:
>>>> ptime ldm migrate -p ./rootpw <ldom name> <target host>
>>>> * up to six migrations back and forth to get some statistical evidence
>>>>
>>>> The result is attached:
>>>> * The diagram contains error bars, the trend line seems to be reliable
>>>> (as reliable as a sample of five to six can be ;-))
>>>> * the "governing law" for the run time seems to be
>>>>
>>>> 27.2 sec + 14.8 sec/GB * (RAM LDom)
>>>>
>>>> In other words: the basic duration is 27 sec plus another 14.8 sec per
>>>> GB RAM the guest has assigned
>>>
>>> It looks like there are two things that may need some attention:
>>>
>>> 1) What is happening in that fixed time of 27.2 seconds?
>>>
>>> The solution to this is likely somewhere where code has to be changed.
>>
>> This is the startup overhead - it also depends on a number of things -
>> the number of cpus in the primary and now also if a mau is selected for
>> the primary
>>
>>>
>>> 2) Why is it only getting 540 Mbits/sec (1 / 14.8 * 8 * 1000) of
>>> network throughput?
>>>
>>> During the memory copy is a CPU pegged ("mpstat 1", possibly "prstat
>>> -mLc -n 5 1")? If so a fix probably requires code changes to offload
>>> crypto. If not, there is a pretty good chance that a bit of network
>>> tuning will get your throughput much closer to wire speed.
>>> http://unix.derkeiler.com/Newsgroups/comp.unix.solaris/2007-04/msg00439.html
>>>
>>> suggests several parameters that are pretty common to set - go look at
>>> disclosures for pretty much any benchmark published by Sun.
>>> Note that I haven't read that post closely, but the parameters set at
>>> the top are consistent with what my experience suggests is needed to
>>> get wire speed on gigabit NICs.
>>>
>>>>
>> The system is not network limited - but limited by the amount of work to
>> be done by the primary domain in doing memory compression and moving
>> bits around.
>>
>> T
>>
>>>> Regards
>>>> Martin
>>>
>>>
>>> Thank you very much for sharing your work!
>>>
>>>
>>>>
>>>>
>>>> --
>>>> Martin Müller                      martin.mueller at sun.com
>>>> Systems Infrastructure Ambassador  Tel.: +49 2102 45 11740
>>>> Sun Microsystems GmbH              Fax: +49 2102 499516
>>>> D-40880 Ratingen                   Mobile: +49 172 8618483
>>>>
>>>> Registered office: Sun Microsystems GmbH, Sonnenallee 1, D-85551
>>>> Kirchheim-Heimstetten
>>>> Munich District Court: HRB 161028
>>>> Managing directors: Thomas Schröder, Wolfgang Engels, Wolf Frenkel
>>>> Chairman of the Supervisory Board: Martin Haering
>>>>
>>>> _______________________________________________
>>>> ldoms-discuss mailing list
>>>> ldoms-discuss at opensolaris.org
>>>> http://mail.opensolaris.org/mailman/listinfo/ldoms-discuss
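
For anyone who wants to play with the numbers from this thread, here is a minimal Python sketch of the two calculations quoted above: Martin's linear fit (27.2 sec + 14.8 sec/GB) and the throughput Mike derived from its slope (1 / 14.8 * 8 * 1000). The constants come straight from the thread; the function names are made up for illustration, and the fit rests on only five to six samples per point on one pair of T5120s, so anything outside the measured guest sizes is extrapolation.

```python
# Constants quoted in the thread (Martin's fit for LDoms 1.3 warm migration).
FIXED_OVERHEAD_SEC = 27.2  # startup cost, independent of guest memory
SEC_PER_GB = 14.8          # additional run time per GB of guest RAM


def estimated_migration_seconds(ram_gb: float) -> float:
    """Estimate migration wall time from the linear fit (hypothetical helper)."""
    if ram_gb < 0:
        raise ValueError("ram_gb must be non-negative")
    return FIXED_OVERHEAD_SEC + SEC_PER_GB * ram_gb


def implied_throughput_mbits() -> float:
    """Throughput implied by the slope, using Mike's 1 GB = 8 * 1000 Mbit."""
    return 1 / SEC_PER_GB * 8 * 1000


if __name__ == "__main__":
    for ram_gb in (1, 4, 8, 16):
        seconds = estimated_migration_seconds(ram_gb)
        print(f"{ram_gb:>2} GB guest: ~{seconds:.1f} s")
    print(f"implied throughput: ~{implied_throughput_mbits():.1f} Mbit/s")
```

Per Terry's replies, both constants depend on how many CPUs the control domain has, so they would need to be re-measured on any other configuration.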