* Claudio Fontana (cfont...@suse.de) wrote:
> On 3/7/22 11:32 AM, Dr. David Alan Gilbert wrote:
> > * Claudio Fontana (cfont...@suse.de) wrote:
> >> On 3/5/22 2:20 PM, Claudio Fontana wrote:
> >>>
> >>> Hello all,
> >>>
> >>> I have been looking at some reports of bad qemu savevm performance in
> >>> large VMs (around 20+ Gb),
> >>> when used in libvirt commands like:
> >>>
> >>>
> >>> virsh save domain /dev/null
> >>>
> >>>
> >>>
> >>> I have written a simple test to run in a Linux centos7-minimal-2009
> >>> guest, which allocates and touches 20G mem.
> >>>
> >>> With any qemu version since around 2020, I am not seeing more than 580
> >>> Mb/Sec even in the most ideal of situations.
> >>>
> >>> This drops to around 122 Mb/sec after commit:
> >>> cbde7be900d2a2279cbc4becb91d1ddd6a014def .
> >>>
> >>> Here is the bisection for this particular drop in throughput:
> >>>
> >>> commit cbde7be900d2a2279cbc4becb91d1ddd6a014def (HEAD, refs/bisect/bad)
> >>> Author: Daniel P. Berrangé <berra...@redhat.com>
> >>> Date: Fri Feb 19 18:40:12 2021 +0000
> >>>
> >>>     migrate: remove QMP/HMP commands for speed, downtime and cache size
> >>>
> >>>     The generic 'migrate_set_parameters' command handle all types of param.
> >>>
> >>>     Only the QMP commands were documented in the deprecations page, but the
> >>>     rationale for deprecating applies equally to HMP, and the replacements
> >>>     exist. Furthermore the HMP commands are just shims to the QMP commands,
> >>>     so removing the latter breaks the former unless they get re-implemented.
> >>>
> >>>     Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> >>>     Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
> >>>
> >>>
> >>> git bisect start
> >>> # bad: [5c8463886d50eeb0337bd121ab877cf692731e36] Merge remote-tracking branch 'remotes/kraxel/tags/kraxel-20220304-pull-request' into staging
> >>> git bisect bad 5c8463886d50eeb0337bd121ab877cf692731e36
> >>> # good: [6cdf8c4efa073eac7d5f9894329e2d07743c2955] Update version for 4.2.1 release
> >>> git bisect good 6cdf8c4efa073eac7d5f9894329e2d07743c2955
> >>> # good: [b0ca999a43a22b38158a222233d3f5881648bb4f] Update version for v4.2.0 release
> >>> git bisect good b0ca999a43a22b38158a222233d3f5881648bb4f
> >>> # skip: [e2665f314d80d7edbfe7f8275abed7e2c93c0ddc] target/mips: Alias MSA vector registers on FPU scalar registers
> >>> git bisect skip e2665f314d80d7edbfe7f8275abed7e2c93c0ddc
> >>> # good: [4762c82cbda22b1036ce9dd2c5e951ac0ed0a7d3] tests/docker: Install static libc package in CentOS 7
> >>> git bisect good 4762c82cbda22b1036ce9dd2c5e951ac0ed0a7d3
> >>> # bad: [d4127349e316b5c78645f95dba5922196ac4cc23] Merge remote-tracking branch 'remotes/berrange-gitlab/tags/crypto-and-more-pull-request' into staging
> >>> git bisect bad d4127349e316b5c78645f95dba5922196ac4cc23
> >>> # bad: [d90f154867ec0ec22fd719164b88716e8fd48672] Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210504' into staging
> >>> git bisect bad d90f154867ec0ec22fd719164b88716e8fd48672
> >>> # good: [dd5af6ece9b101d29895851a7441d848b7ccdbff] tests/docker: add a test-tcg for building then running check-tcg
> >>> git bisect good dd5af6ece9b101d29895851a7441d848b7ccdbff
> >>> # bad: [90ec1cff768fcbe1fa2870d2018f378376f4f744] target/riscv: Adjust privilege level for HLV(X)/HSV instructions
> >>> git bisect bad 90ec1cff768fcbe1fa2870d2018f378376f4f744
> >>> # good: [373969507a3dc7de2d291da7e1bd03acf46ec643] migration: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD
> >>> git bisect good 373969507a3dc7de2d291da7e1bd03acf46ec643
> >>> # good: [4083904bc9fe5da580f7ca397b1e828fbc322732] Merge remote-tracking branch 'remotes/rth-gitlab/tags/pull-tcg-20210317' into staging
> >>> git bisect good 4083904bc9fe5da580f7ca397b1e828fbc322732
> >>> # bad: [009ff89328b1da3ea8ba316bf2be2125bc9937c5] vl: allow passing JSON to -object
> >>> git bisect bad 009ff89328b1da3ea8ba316bf2be2125bc9937c5
> >>> # bad: [50243407457a9fb0ed17b9a9ba9fc9aee09495b1] qapi/qom: Drop deprecated 'props' from object-add
> >>> git bisect bad 50243407457a9fb0ed17b9a9ba9fc9aee09495b1
> >>> # bad: [1b507e55f8199eaad99744613823f6929e4d57c6] Merge remote-tracking branch 'remotes/berrange-gitlab/tags/dep-many-pull-request' into staging
> >>> git bisect bad 1b507e55f8199eaad99744613823f6929e4d57c6
> >>> # bad: [24e13a4dc1eb1630eceffc7ab334145d902e763d] chardev: reject use of 'wait' flag for socket client chardevs
> >>> git bisect bad 24e13a4dc1eb1630eceffc7ab334145d902e763d
> >>> # good: [8becb36063fb14df1e3ae4916215667e2cb65fa2] monitor: remove 'query-events' QMP command
> >>> git bisect good 8becb36063fb14df1e3ae4916215667e2cb65fa2
> >>> # bad: [8af54b9172ff3b9bbdbb3191ed84994d275a0d81] machine: remove 'query-cpus' QMP command
> >>> git bisect bad 8af54b9172ff3b9bbdbb3191ed84994d275a0d81
> >>> # bad: [cbde7be900d2a2279cbc4becb91d1ddd6a014def] migrate: remove QMP/HMP commands for speed, downtime and cache size
> >>> git bisect bad cbde7be900d2a2279cbc4becb91d1ddd6a014def
> >>> # first bad commit: [cbde7be900d2a2279cbc4becb91d1ddd6a014def] migrate: remove QMP/HMP commands for speed, downtime and cache size
> >>>
> >>>
> >>> Are there some obvious settings / options I am missing to regain the
> >>> savevm performance after this commit?
> >>
> >> Answering myself:
> >
> > <oops we seem to have split this thread into two>
> >
> >> this seems to be due to a resulting different default xbzrle cache size
> >> (probably interactions between libvirt/qemu versions?).
> >>
> >> When forcing the xbzrle cache size to a larger value, the performance is
> >> back.
> >
> > That's weird that 'virsh save' is ending up using xbzrle.
>
> virsh save (or qemu savevm..) seems to me like it uses a subset of the
> migration code and migration parameters but not all..
>
> >
> >>>
> >>> I have seen projects attempting to improve other aspects of performance
> >>> (snapshot performance, etc), is there something going on to improve the
> >>> transfer of RAM in savevm too?
> >>
> >>
> >> Still I would think that we should be able to do better than 600ish Mb/s ,
> >> any ideas, prior work on this,
> >> to improve savevm performance, especially looking at RAM regions transfer
> >> speed?
> >
> > My normal feeling is ~10Gbps for a live migrate over the wire; I rarely
> > try virsh save though.
> > If you're using xbzrle that might explain it; it's known to eat cpu -
> > but I'd never expect it to have been used with 'virsh save'.
>
> some valgrind shows it among the top cpu eaters;
>
> I wonder why we are able to do more than 2x better for actual live migration,
> compared with virsh save /dev/null ...

What speed do you get if you force xbzrle off?

Dave

> Thanks,
>
> Claudio
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK