On 2020/11/26 4:10 PM, Lukáš Doktor wrote:
Hello guys,
I have been around qemu on the Avocado-vt side for quite some time, and
a while ago I shifted my focus to performance testing. Currently I am
not aware of any upstream CI that would continuously monitor the
upstream qemu performance and I'd like to change that. There is a lot
to cover so please bear with me.
Goal
====
The goal of this initiative is to detect system-wide performance
regressions as well as improvements early, ideally pin-point the
individual commits and notify people that they should fix things. All
in upstream and ideally with as little human interaction as possible.
Unlike Ahmed Karaman's recent work
https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/ my focus is on
system-wide performance inside the guest (fio, uperf, ...).
Tools
=====
In house we have several different tools used by various teams, and I
bet there are tons of other tools out there that can do that. I cannot
speak for all teams, but over time many teams at Red Hat have
come to like pbench
https://distributed-system-analysis.github.io/pbench/ to run the tests
and produce machine-readable results, and use other tools (Ansible,
scripts, ...) to provision the systems and to generate the comparisons.
As for myself, I used Python for a PoC and over the last year I pushed
hard to turn it into a usable and sensible tool, which I'd like to
offer: https://run-perf.readthedocs.io/en/latest/ Anyway, I am open to
suggestions and comparisons. As I am using it downstream to watch for
regressions, I do plan to keep developing the tool as well as the
pipelines (unless a better tool is found that would replace it or its
parts).
FYI, Intel has invested a lot in the 0-day Linux kernel automated
performance regression testing: https://01.org/lkp. It's being actively
developed upstream.
It's powerful, and tons of regressions have been reported (and bisected).
I think it can use qemu somehow, but I'm not sure. Maybe we can give it a try.
Thanks
How
===
This is a tough question. Ideally this should be a standalone service
that would only notify the author of the patch that caused the change
with a bunch of useful data so they can either address the issue or
just be aware of this change and mark it as expected.
Ideally the community should also have a way to submit their own custom
builds in order to verify their patches, so they can debug and address
issues better than by just committing to qemu master.
The problem with both is that we cannot simply use travis/gitlab/...
machines for running those tests, because we are measuring actual
in-guest performance. We can't just stop the clock when the machine
decides to schedule another container/vm. I briefly checked the public
bare-metal offerings like Rackspace, but these are most probably not
sufficient either because (unless I'm wrong) they only give you a
machine, with no guarantee that it will be the same machine the
next time. If we are to compare the results, the same model is not
enough; we really need the very same machine. Any change to the
machine might lead to a significant difference (disk replacement, even
a firmware update...).
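To make the noise concern above concrete, here is a minimal Python sketch of how a new result could be judged against repeated baseline runs taken on the very same machine. It is not taken from run-perf or pbench; the function name `classify`, the 3-sigma threshold, and the sample numbers are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def classify(baseline_runs, new_value, sigmas=3.0, higher_is_better=True):
    """Compare new_value with repeated baseline runs from the same machine.

    Hypothetical helper: the threshold (sigmas) is an assumed default,
    not a value used by any real tool.
    """
    if len(baseline_runs) < 2:
        raise ValueError("need at least two baseline runs to estimate noise")
    avg = mean(baseline_runs)
    noise = stdev(baseline_runs)  # run-to-run noise on this machine
    delta = new_value - avg
    if abs(delta) <= sigmas * noise:
        return "unchanged"  # within the noise band, nothing to report
    improved = (delta > 0) == higher_is_better
    return "improvement" if improved else "regression"


# Made-up throughput numbers (MB/s), not real measurements.
baseline = [1000.0, 1010.0, 990.0, 1005.0, 995.0]
print(classify(baseline, 1002.0))
print(classify(baseline, 900.0))
print(classify(baseline, 1100.0))
```

If the baseline came from a different machine (or the same machine after a disk or firmware change), the mean and the noise band would no longer describe the system under test, so the classification would be meaningless.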
Solution 1
----------
As I already do this for downstream builds, I can start doing it for
upstream as well. At this point I can offer a single pipeline watching only
changes in qemu (downstream we are checking distro/kernel changes as
well, but that would require too much time at this point) on a single
x86_64 machine. I cannot offer public access to the testing
machine, nor checking of custom builds (unless someone provides me with
publicly available machine(s) that I could use for this). What I can
offer is running the checks on the latest qemu master, publishing the
reports, bisecting issues and notifying people about the changes. An
example of a report can be found here:
https://drive.google.com/file/d/1V2w7QpSuybNusUaGxnyT5zTUvtZDOfsb/view?usp=sharing
and documentation of the format is here:
https://run-perf.readthedocs.io/en/latest/scripts.html#html-results I
can also attach the raw pbench results if needed (as well as details
about the tests that were executed, their params and other details).
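As an illustration of the bisecting step mentioned above, the following is a rough Python sketch of a binary search over an ordered list of commits. It is not how run-perf does it; `first_bad_commit` and the `is_regressed` callback are hypothetical names, and the callback stands in for "build this commit, run the benchmark, compare with the baseline". It also assumes the regression persists in every commit after the one that introduced it.

```python
def first_bad_commit(commits, is_regressed):
    """Return the first regressed commit.

    commits: list ordered oldest to newest, where commits[0] is known
             good and commits[-1] is known bad (assumption).
    is_regressed: hypothetical callback that builds and benchmarks one
                  commit and returns True if it shows the regression.
    """
    if len(commits) < 2:
        raise ValueError("need a known-good and a known-bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_regressed(commits[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression was introduced after mid
    return commits[bad]


# Made-up history: the regression is pretended to start at "c5".
history = ["c%d" % i for i in range(8)]
print(first_bad_commit(history, lambda c: int(c[1:]) >= 5))
```

Each probe costs a full build plus benchmark run, so the logarithmic number of steps matters; in practice `git bisect run` with a script that exits non-zero on regression does the same search directly on the repository.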
Currently the covered scenarios would be a default libvirt machine
with qcow2 storage and a tuned libvirt machine (cpus, hugepages, numa,
raw disk...) running fio, uperf and linpack on the latest GA RHEL. In
the future I can add/tweak the scenarios as well as the test selection
based on your feedback.
Solution 2
----------
I can offer documentation:
https://run-perf.readthedocs.io/en/latest/jenkins.html and someone can
fork it or take inspiration from it and set up the pipelines on their own
system, making it available to the outside world and adding custom
scenarios and variants. Note that the setup does not require Jenkins; it's
just an example and could easily be turned into a cronjob or whatever you choose.
Solution 3
----------
You name it. I bet there are many other ways to perform system-wide
performance testing.
Regards,
Lukáš