----- Original Message -----
> From: "Francesco Romani" <from...@redhat.com>
> To: "vdsm-devel" <vdsm-devel@lists.fedorahosted.org>
> Sent: Wednesday, March 19, 2014 10:33:51 AM
> Subject: [vdsm] VDSM profiling results, round 1
>
> (sending again WITHOUT the attachments)
>
> Hi everyone,
>
> I'd like to share the first round of profiling results for VDSM and my
> next steps.
>
> Summary:
> - experimented with a couple of profiling approaches and found a good one
> - benchmarked http://gerrit.ovirt.org/#/c/25678/ : it is beneficial and
>   was merged
> - found a few low-hanging fruits which seem quite safe to merge and
>   beneficial to *all* flows
> - started an engagement with infra (see the other thread) to have common
>   and polished performance tools
> - the test roadmap is shaping up; wiki/ML will be updated in the coming days
>
> Please read through for a more detailed discussion. Every comment is
> welcome.
>
> Disclaimer:
> long mail, lots of content; please point out if something is missing, is
> not clear enough, or deserves more discussion.
>
> +++
>
> == First round results ==
>
> The first round of profiling was a follow-up of what I showed during the
> VDSM gathering.
> The results file contains a full profile ordered by descending time.
> In a nutshell: parallel start of 32 tiny VMs using the engine REST API and
> a single hypervisor host.
>
> The VMs are tiny just because I want to cram as many VMs as I can into my
> mini-dell (16 GB RAM, 4 cores + HT).
>
> It is worth pointing out a few differences with respect to the *profile*
> (NOT the graphs) I showed during the gathering:
>
> - profile data is now collected using the profile decorator (see
>   http://www.ovirt.org/Profiling_Vdsm) just around Vm._startUnderlyingVm.

Profiling one function is misleading without the big picture. You should
profile the whole application before drilling down and checking one
function.
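Something like this is what I have in mind - an untested sketch, with the
hook names and the dump path made up, just to show the idea of one
process-wide profile instead of per-call profiles around Vm._startUnderlyingVm:

    # Untested sketch: one process-wide profiler for the whole vdsm run.
    # init_profiling/shutdown_profiling and the dump path are placeholders.
    import atexit
    import cProfile

    _profiler = cProfile.Profile()

    def init_profiling():
        """Call once, as early as possible in the daemon startup."""
        _profiler.enable()
        # Make sure the data is dumped even on a normal exit.
        atexit.register(shutdown_profiling)

    def shutdown_profiling():
        _profiler.disable()
        _profiler.dump_stats('/var/run/vdsm/vdsmd.prof')  # placeholder path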
> The gathering profile was obtained using the yappi application-wide
> profiler (see https://code.google.com/p/yappi/) and 40 VMs.

I don't see such a profile - can you put it somewhere?

> * why yappi?
>   I thought an application-wide profiler gathers more information and lets
>   us have a better picture.
>   I actually still think that, but I faced some yappi misbehaviour which I
>   want to fix later; a function-level profile is so far easier to collect
>   (just grab the data dumped to file).
> * why 40 VMs?
>   I started with 64 but exhausted my storage backing store :)
>   Will add more storage space in the next days; for the moment I stepped
>   back to 32.
>
> It is worth noting that while on one hand the numbers change a bit (if you
> remember the old profile data and the scary 80secs wasted on namedtuple),
> on the other hand the suspects are the same and the relative positions are
> roughly the same.
> So I believe our initial findings (namedtuple patch) and the plan are
> still valid.
>
> == how it was done ==
>
> I am still focusing just on the "monday morning" scenario (mass start of
> many VMs at the same time).
> Each run consisted of a parallel start of 32 VMs as described in the
> result data.
> VDSM was restarted between runs.
> engine was *NOT* restarted between runs.
> Individual profiles were gathered after all the runs, and the profile was
> extracted from their aggregation.

How did you aggregate the profiles?

> Profile dumps are available to everyone; just drop me a note and I'll put
> the tarball somewhere.
>
> Please find attached the profile data in txt format. For easier
> consumption, they are also available on pastebin:
>
> baseline      : http://paste.fedoraproject.org/86318/
> namedtuple fix: http://paste.fedoraproject.org/86378/
> pickle fix    : http://paste.fedoraproject.org/86600/ (see below)

Please limit the number of calls in print_stats to 20 or 30. Showing
everything just makes it harder to compare, and leads to optimizing
unimportant stuff. We want to optimize only the hotspots.

> == hotspots ==
>
> the baseline profile data highlights five major areas and hotspots:
>
> 1. internal concurrency (possible patch: http://gerrit.ovirt.org/#/c/25857/
>    - see below)

If you also add print_callers(30) to the profiles, we may learn more about
these wait calls.

> 2. libvirt

This is the main issue - we should invest time in this.

> 3. XML processing (initial patch: http://gerrit.ovirt.org/#/c/17694/)

Consider using cElementTree: http://effbot.org/zone/celementtree.htm

> 4. namedtuple (patch: http://gerrit.ovirt.org/#/c/25678/ - fixed, merged)
> 5. pickling (patch: http://gerrit.ovirt.org/#/c/25860/ - see below)

These are much more expensive than pickle:

    452311   39.222    0.000   41.883    0.000 /usr/lib64/python2.6/StringIO.py:208(write)
      7299   38.946    0.005   38.946    0.005 {built-in method acquire}

StringIO should be easily replaced with cStringIO (assuming that we write
only byte arrays - no unicode).

acquire - probably will be hard to fix - showing callers may be interesting.
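For the StringIO part, the swap should be mechanical - a minimal sketch,
assuming we really only write byte strings:

    # cStringIO is a drop-in replacement for StringIO as long as we only
    # write byte strings (it does not accept non-ASCII unicode).
    try:
        from cStringIO import StringIO
    except ImportError:
        from StringIO import StringIO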
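For the acquire calls, and to keep the reports easier to compare between
runs, the pstats post-processing I have in mind is roughly this (the dump
file names are made up):

    # Aggregate the per-run dumps, show only the top entries, and list who
    # is calling the lock acquire. File names here are placeholders.
    import pstats

    stats = pstats.Stats('run-00.prof')
    for dump in ('run-01.prof', 'run-02.prof'):
        stats.add(dump)

    stats.sort_stats('time')
    stats.print_stats(30)           # top 30 only, easier to compare
    stats.print_callers('acquire')  # who is spending all that time on locks?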
> #4 is beneficial in the iSCSI path and it was already merged.
> #1 shows some potential but it needs to be carefully evaluated to avoid
>    performance regressions in different scenarios (e.g. bigger machines
>    than mine :))
> #2 is basically outside of our control but it needs to be watched.
> #3 and #5 are beneficial for all flows and scenarios and are safe to merge.
>    #5 is almost a no-brainer IMO.
>
> == Note about the third profile ==
>
> When profiling the cPickle patch (http://paste.fedoraproject.org/86600/),
> the tests turned out actually *slower* with respect to the second profile,
> which has just the namedtuple patch.
>
> The hotspots seem to be around concurrency and libvirt:
>
> location                                    profile2(s)  profile3(s)  diff(s)
> pthread.py:129(wait)                           1230.640     1377.992  +147.28  (BAD)
> virDomainCreateXML                              155.171      175.681   +20.51  (BAD)
> 'select.epoll' objects                           52.523       53.635    +1.112 (negligible)
> expatbuilder.py:743(start_element_handler)       28.172       33.975    +5.803 (BAD?)
> virDomainGetXMLDesc                              23.947       23.217    -0.73  (negligible)

Looking at the number of calls, it is clear that you are not comparing the
same thing:

baseline:

     ncalls      tottime  percall  cumtime  percall filename:lineno(function)
        330     1258.791    3.815 1258.793    3.815 /usr/lib/python2.6/site-packages/pthread.py:129(wait)
        157      240.347    1.531  240.347    1.531 {libvirtmod.virDomainCreateXML}
       6153       89.501    0.015   89.501    0.015 {method 'poll' of 'select.epoll' objects}
     259014       63.736    0.000  113.172    0.000 /usr/lib64/python2.6/xml/dom/expatbuilder.py:743(start_element_handler)
        700       50.530    0.072   50.530    0.072 {libvirtmod.virDomainGetXMLDesc}
      21831       43.144    0.002  108.839    0.005 /usr/lib64/python2.6/collections.py:223(namedtuple)
    1659132       39.388    0.000   39.651    0.000 /usr/lib64/python2.6/collections.py:253(<genexpr>)
     452311       39.222    0.000   41.883    0.000 /usr/lib64/python2.6/StringIO.py:208(write)
       7299       38.946    0.005   38.946    0.005 {built-in method acquire}

namedtuple + pickle:

     ncalls      tottime  percall  cumtime  percall filename:lineno(function)
        179     1377.992    7.698 1377.993    7.698 /usr/lib/python2.6/site-packages/pthread.py:129(wait)
         96      175.681    1.830  175.681    1.830 {libvirtmod.virDomainCreateXML}
       3697       53.635    0.015   53.635    0.015 {method 'poll' of 'select.epoll' objects}
     141120       33.975    0.000   59.975    0.000 /usr/lib64/python2.6/xml/dom/expatbuilder.py:743(start_element_handler)
        384       23.217    0.060   23.217    0.060 {libvirtmod.virDomainGetXMLDesc}
     249397       22.812    0.000   24.503    0.000 /usr/lib64/python2.6/StringIO.py:208(write)
         96       18.541    0.193   18.541    0.193 {libvirtmod.virDomainSetSchedulerParameters}
547584/9408       18.108    0.000   18.112    0.002 /usr/lib64/python2.6/xml/dom/minidom.py:305(_get_elements_by_tagName_helper)
     348288       13.614    0.000   13.723    0.000 /usr/lib64/python2.6/xml/dom/minidom.py:281(_append_child)
     199584       13.472    0.000   20.935    0.000 /usr/lib64/python2.6/xml/dom/expatbuilder.py:274(character_data_handler_cdata)
     254400       13.368    0.000   13.470    0.000 /usr/lib64/python2.6/xml/dom/minidom.py:349(__init__)
       4213       11.543    0.003   11.543    0.003 {built-in method acquire}

Note: 157 calls to libvirtmod.virDomainCreateXML in the first profile vs 96
in the second, and 700 calls to virDomainGetXMLDesc in the first vs 384 in
the second.

You should start the profiler before you start the test, and stop it when
the test is done. The approach of profiling one function run concurrently
on different threads and aggregating the profiles seems to give wrong
results.

I would use the profiler to find the hotspots and compare the performance
by timing the whole operation without the profiler. You should do multiple
runs (at least 10) and report the average and variance for the whole
operation.
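To be concrete about the measurement, a minimal sketch of what I mean -
start_all_vms here is just a placeholder for whatever drives the engine
REST API in your test:

    # Time the whole "start 32 VMs" operation with no profiler involved,
    # repeat it, and report mean and variance.
    import time

    def benchmark(operation, runs=10):
        samples = []
        for _ in range(runs):
            start = time.time()
            operation()
            samples.append(time.time() - start)
        mean = sum(samples) / len(samples)
        variance = sum((s - mean) ** 2 for s in samples) / len(samples)
        return mean, variance

    # mean, variance = benchmark(start_all_vms, runs=10)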
> I'm OK with some variance (it is expected), but this is also a warning
> sign to be extra careful in tuning the concurrency patch (bullet point #1
> above). We should definitely evaluate more scenarios before merging it.
>
> If we set aside those diffs, we see the cPickle patch has the (small)
> benefits we expect, and I think it is 100% safe to merge. I already did
> some minimal extra verification just in case.
>
> == Next steps ==
>
> For the near term (the coming days/next weeks):
> * benchmark the remaining easy fixes which are beneficial for all flows
>   and quite safe to merge (XML processing being first), and work to have
>   them merged
> * polish the scripts and benchmarking code, start to submit them to infra
>   for review
> * continue the investigation about our (in)famous BoundedSemaphore
>   (http://gerrit.ovirt.org/#/c/25857/) to see if dropping it causes
>   regressions or other bad effects
> * find other test scenarios
>
> I have also noted all the suggestions received so far and I am planning
> more test cases just for this scenario.
>
> For example:
> 1. just start N QEMUs to obtain our lower bound (we cannot get faster
>    than this)
> 2. run with different storage (NFS)
> 3. run with no storage
> 4. run with a Guest OS installed on the disks
>
> And of course we need more scenarios.
> Let me just repeat myself: these are just the first steps of a long
> journey.
>
> --
> Francesco Romani
> RedHat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani

_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel