> -----Original Message-----
> From: Qemu-devel
> [mailto:qemu-devel-bounces+kuhn.chenqun=huawei....@nongnu.org] On
> Behalf Of Lukáš Doktor
> Sent: Thursday, November 26, 2020 4:10 PM
> To: QEMU Developers <qemu-devel@nongnu.org>
> Cc: Charles Shih <che...@redhat.com>; Aleksandar Markovic
> <aleksandar.qemu.de...@gmail.com>; Stefan Hajnoczi
> <stefa...@redhat.com>
> Subject: Proposal for a regular upstream performance testing
>
> Hello guys,
>
> I had been around qemu on the Avocado-vt side for quite some time and a
> while ago I shifted my focus on performance testing. Currently I am not
> aware of any upstream CI that would continuously monitor the upstream qemu
> performance and I'd like to change that. There is a lot to cover so please
> bear with me.
>
> Goal
> ====
>
> The goal of this initiative is to detect system-wide performance
> regressions as well as improvements early, ideally pin-point the
> individual commits and notify people that they should fix things. All in
> upstream and ideally with least human interaction possible.
>
> Unlike the recent work of Ahmed Karaman's
> https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/ my aim is on the
> system-wide performance inside the guest (like fio, uperf, ...)
>
> Tools
> =====
>
> In house we have several different tools used by various teams and I bet
> there are tons of other tools out there that can do that. I can not speak
> for all teams but over the time many teams at Red Hat have come to like
> pbench https://distributed-system-analysis.github.io/pbench/ to run the
> tests and produce machine readable results and use other tools (Ansible,
> scripts, ...) to provision the systems and to generate the comparisons.
>
> As for myself I used python for PoC and over the last year I pushed hard
> to turn it into a usable and sensible tool which I'd like to offer:
> https://run-perf.readthedocs.io/en/latest/ anyway I am open to suggestions
> and comparisons.
> As I am using it downstream to watch regressions I do plan to keep
> developing the tool as well as the pipelines (unless a better tool is
> found that would replace it or its parts).
>
> How
> ===
>
> This is a tough question. Ideally this should be a standalone service that
> would only notify the author of the patch that caused the change with a
> bunch of useful data so they can either address the issue or just be aware
> of this change and mark it as expected.
>
> Ideally the community should have a way to also issue their custom builds
> in order to verify their patches so they can debug and address issues
> better than just commit to qemu-master.
>
> The problem with those is that we can not simply use travis/gitlab/...
> machines for running those tests, because we are measuring in-guest actual
> performance. We can't just stop the time when the machine decides to
> schedule another container/vm. I briefly checked the public bare-metal
> offerings like rackspace but these are most probably not sufficient either
> because (unless I'm wrong) they only give you a machine but it is not
> guaranteed that it will be the same machine the next time. If we are to
> compare the results we don't need just the same model, we really need the
> very same machine. Any change to the machine might lead to a significant
> difference (disk replacement, even firmware update...).

Hi Lukáš,

It's nice to see a discussion of QEMU performance.
If you need a CI platform and physical machine environments, compass-ci may
be able to help.

Compass-ci is an open CI platform of the openEuler community and is growing.

Here's a brief readme:
https://gitee.com/wu_fengguang/compass-ci/blob/master/README.en.md

Thanks,
Chen Qun

>
> Solution 1
> ----------
>
> Doing this for downstream builds I can start doing this for upstream as
> well. At this point I can offer a single pipeline watching only changes in
> qemu (downstream we are checking distro/kernel changes as well but that
> would require too much time at this point) on a single x86_64 machine. I
> can not offer a public access to the testing machine, not even checking
> custom builds (unless someone provides me a publicly available machine(s)
> that I would use for this). What I can offer is running the checks on the
> latest qemu master, publishing the reports, bisecting issues and notifying
> people about the changes. An example of a report can be found here:
> https://drive.google.com/file/d/1V2w7QpSuybNusUaGxnyT5zTUvtZDOfsb/view?usp=sharing
> a documentation of the format is here:
> https://run-perf.readthedocs.io/en/latest/scripts.html#html-results I can
> also attach the raw pbench results if needed (as well as details about the
> tests that were executed and the params and other details).
>
> Currently the covered scenarios would be a default libvirt machine with
> qcow2 storage and tuned libvirt machine (cpus, hugepages, numa, raw
> disk...) running fio, uperf and linpack on the latest GA RHEL. In the
> future I can add/tweak the scenarios as well as tests selection based on
> your feedback.
>
> Solution 2
> ----------
>
> I can offer a documentation:
> https://run-perf.readthedocs.io/en/latest/jenkins.html and someone can
> fork/inspire by it and setup the pipelines on their system, making it
> available to the outside world, add your custom scenarios and variants.
> Note the setup does not require Jenkins, it's just an example and could
> be easily turned into a cronjob or whatever you choose.
>
> Solution 3
> ----------
>
> You name it. I bet there are many other ways to perform system-wide
> performance testing.
>
> Regards,
> Lukáš
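
For readers following the thread, here is a minimal sketch of the kind of comparison the proposal describes: results from the same machine for a baseline build and a newer build are compared per test, and a change beyond a tolerance is flagged as a regression or an improvement. This is not how run-perf or pbench work internally. The function names, the 5% tolerance, the assumption that higher numbers are better, and the sample values are all made up for illustration.

```python
from statistics import mean


def classify(baseline, current, tolerance=0.05):
    """Compare the mean of `current` samples against the `baseline` mean.

    Assumes higher values are better (e.g. IOPS, throughput).
    Returns (verdict, relative_change).
    """
    base = mean(baseline)
    change = (mean(current) - base) / base
    if change < -tolerance:
        return "regression", change
    if change > tolerance:
        return "improvement", change
    return "unchanged", change


def compare_runs(baseline_run, current_run, tolerance=0.05):
    """Classify every test that is present in both runs."""
    report = {}
    for name in sorted(baseline_run.keys() & current_run.keys()):
        report[name] = classify(baseline_run[name], current_run[name], tolerance)
    return report


if __name__ == "__main__":
    # Hypothetical sample values, not real measurements.
    baseline = {"fio-read-iops": [1000.0, 1010.0, 990.0],
                "uperf-gbps": [9.4, 9.5, 9.3]}
    current = {"fio-read-iops": [900.0, 910.0, 890.0],
               "uperf-gbps": [9.4, 9.45, 9.35]}
    for name, (verdict, change) in compare_runs(baseline, current).items():
        print(f"{name}: {verdict} ({change:+.1%})")
```

A fixed tolerance only makes sense when both runs come from the very same machine, as the proposal stresses. Across different hardware the machine-to-machine difference can easily exceed any reasonable threshold.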