RE: Proposal for a regular upstream performance testing

Chenqun (kuhn) Wed, 02 Dec 2020 01:57:37 -0800

> -----Original Message-----
> From: Qemu-devel
> [mailto:qemu-devel-bounces+kuhn.chenqun=huawei....@nongnu.org] On
> Behalf Of Lukáš Doktor
> Sent: Thursday, November 26, 2020 4:10 PM
> To: QEMU Developers <qemu-devel@nongnu.org>
> Cc: Charles Shih <che...@redhat.com>; Aleksandar Markovic
> <aleksandar.qemu.de...@gmail.com>; Stefan Hajnoczi
> <stefa...@redhat.com>
> Subject: Proposal for a regular upstream performance testing
> 
> Hello guys,
> 
> I had been around qemu on the Avocado-vt side for quite some time and a while
> ago I shifted my focus on performance testing. Currently I am not aware of any
> upstream CI that would continuously monitor the upstream qemu performance
> and I'd like to change that. There is a lot to cover so please bear with me.
> 
> Goal
> ====
> 
> The goal of this initiative is to detect system-wide performance regressions
> as well as improvements early, ideally pin-point the individual commits and
> notify people that they should fix things. All in upstream and ideally with
> least human interaction possible.
> 
> Unlike the recent work of Ahmed Karaman's
> https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/ my aim is on the
> system-wide performance inside the guest (like fio, uperf, ...)
> 
> Tools
> =====
> 
> In house we have several different tools used by various teams and I bet there
> are tons of other tools out there that can do that. I can not speak for all
> teams but over the time many teams at Red Hat have come to like pbench
> https://distributed-system-analysis.github.io/pbench/ to run the tests and
> produce machine readable results and use other tools (Ansible, scripts, ...)
> to provision the systems and to generate the comparisons.
> 
> As for myself I used python for PoC and over the last year I pushed hard to
> turn it into a usable and sensible tool which I'd like to offer:
> https://run-perf.readthedocs.io/en/latest/ anyway I am open to suggestions
> and comparisons. As I am using it downstream to watch regressions I do plan
> on keep developing the tool as well as the pipelines (unless a better tool is
> found that would replace it or it's parts).
> 
> How
> ===
> 
> This is a tough question. Ideally this should be a standalone service that
> would only notify the author of the patch that caused the change with a bunch
> of useful data so they can either address the issue or just be aware of this
> change and mark it as expected.
> 
> Ideally the community should have a way to also issue their custom builds in
> order to verify their patches so they can debug and address issues better than
> just commit to qemu-master.
> 
> The problem with those is that we can not simply use travis/gitlab/...
> machines for running those tests, because we are measuring in-guest actual
> performance. We can't just stop the time when the machine decides to
> schedule another container/vm. I briefly checked the public bare-metal
> offerings like rackspace but these are most probably not sufficient either
> because (unless I'm wrong) they only give you a machine but it is not
> guaranteed that it will be the same machine the next time. If we are to
> compare the results we don't need just the same model, we really need the
> very same machine. Any change to the machine might lead to a significant
> difference (disk replacement, even firmware update...).


Hi Lukáš,

  It's nice to see QEMU performance being discussed.
If you need a CI platform and physical machine environments, maybe
compass-ci can help you.

Compass-ci is an open CI platform of the openEuler community and is growing.

Here's a brief README:
https://gitee.com/wu_fengguang/compass-ci/blob/master/README.en.md


Thanks,
Chen Qun
> 
> Solution 1
> ----------
> 
> Doing this for downstream builds I can start doing this for upstream as well.
> At this point I can offer a single pipeline watching only changes in qemu
> (downstream we are checking distro/kernel changes as well but that would
> require too much time at this point) on a single x86_64 machine. I can not
> offer a public access to the testing machine, not even checking custom builds
> (unless someone provides me a publicly available machine(s) that I would use
> for this).
> What I can offer is running the checks on the latest qemu master, publishing
> the reports, bisecting issues and notifying people about the changes. An
> example of a report can be found here:
> https://drive.google.com/file/d/1V2w7QpSuybNusUaGxnyT5zTUvtZDOfsb/view?usp=sharing
> a documentation of the format is here:
> https://run-perf.readthedocs.io/en/latest/scripts.html#html-results I can also
> attach the raw pbench results if needed (as well as details about the tests
> that were executed and the params and other details).
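
To make the "compare and notify" step above concrete, here is a minimal
Python sketch of flagging results that moved beyond a tolerance relative
to a baseline run. It is only an illustration and not how run-perf or
pbench do it: the function name find_changes, the result layout (test
name mapped to a list of samples), the sample numbers and the 5%
tolerance are all hypothetical.

```python
from statistics import mean


def find_changes(baseline, current, tolerance=0.05):
    """Return {test: relative_change} for tests whose mean moved
    by more than `tolerance` (hypothetical 5% default)."""
    changes = {}
    for test, base_samples in baseline.items():
        if test not in current:
            continue  # test was not executed in the current run
        base = mean(base_samples)
        new = mean(current[test])
        rel = (new - base) / base
        if abs(rel) > tolerance:
            changes[test] = rel
    return changes


# Made-up throughput samples, for illustration only.
baseline = {
    "fio-read": [1000.0, 1010.0, 990.0],
    "uperf-tcp": [500.0, 505.0, 495.0],
}
current = {
    "fio-read": [900.0, 910.0, 890.0],
    "uperf-tcp": [502.0, 498.0, 500.0],
}

changes = find_changes(baseline, current)
for test, rel in changes.items():
    print(f"{test}: {rel:+.1%} vs. baseline")
```

Both improvements and regressions are reported, since either may deserve a
look. A real pipeline would also need to account for run-to-run noise,
which is why the same physical machine matters so much.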
> 
> Currently the covered scenarios would be a default libvirt machine with qcow2
> storage and tuned libvirt machine (cpus, hugepages, numa, raw disk...) running
> fio, uperf and linpack on the latest GA RHEL. In the future I can add/tweak
> the scenarios as well as tests selection based on your feedback.
> 
> Solution 2
> ----------
> 
> I can offer a documentation:
> https://run-perf.readthedocs.io/en/latest/jenkins.html and someone can
> fork/inspire by it and setup the pipelines on their system, making it
> available to the outside world, add your custom scenarios and variants. Note
> the setup does not require Jenkins, it's just an example and could be easily
> turned into a cronjob or whatever you chose.
> 
> Solution 3
> ----------
> 
> You name it. I bet there are many other ways to perform system-wide
> performance testing.
> 
> Regards,
> Lukáš
>
