On Wed, 26 Aug 2020 11:55:49 +0200 Joerg Jaspert <jo...@debian.org> wrote:
> using Ganeti 2.16 and qemu 1:5.0-14~bpo10+1 I tried setting
> migration_caps for the cluster. But no matter which value i use it
> breaks migration.
>
> Migration then "goes"
> - Setup disks and prepare target node
> - starting memore transfer
> - "Migration failed, aborting"
> - Closing disks
>
> That is entirely independent on which value I put into the caps.
>
> Unsetting migration_caps and retrying the migration - it fails again.

I have tested Ganeti-3.0 from master on Buster with Qemu-5.0 from
backports. The HV code should be the same as in Ganeti-2.16.

The first migration works on freshly started instances (empty
migration_caps). I have also successfully tested the following migration
capabilities (two migrations): auto-converge, zero-blocks, xbzrle.

The only capability known to be broken is postcopy-ram [1]. So I assume
Joerg is/was using postcopy-ram?

If an instance was previously migrated with postcopy-ram, the running
qemu process "remembers" this setting. If migration_caps is then unset
(empty), the capability must be switched off on the running instances as
well. On the instance's node, run:

  echo "migrate_set_capability postcopy-ram off" | socat STDIO UNIX-CONNECT:/var/run/ganeti/kvm-hypervisor/ctrl/XXXXX.monitor

> Looking more in detail it appears that its setting the caps on the
> source side. But possibly forgets to set them on the target side so qemu
> hates doing migration?!

That is true for postcopy-ram. It worked before Qemu-2.11, but is now
broken (it should also be broken with qemu-3.1, the default Buster
version).

> And when it breaks, it forgets to unset the capabilities, so next
> migrations break too. One has to manually connect to the monitor and
> unset them, before migration works again.

That sounds exactly like what I described.

[1] https://github.com/ganeti/ganeti/issues/950#issuecomment-506266808
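
For anyone who has to repeat the monitor command above on several instances, here is a minimal sketch that wraps it in shell functions. It assumes socat is installed and that the monitor sockets live in the directory used above. The function names and the CTRL_DIR variable are made up for illustration, and XXXXX stays a placeholder for the socket's base name. "info migrate_capabilities" is the QEMU monitor command that lists the current capability state.

```shell
#!/bin/sh
# Sketch only: helper names and CTRL_DIR are illustrative, not part of Ganeti.
# Assumes socat is installed and the monitor sockets live in CTRL_DIR.

CTRL_DIR=/var/run/ganeti/kvm-hypervisor/ctrl

# Print the monitor command that switches one capability off.
cap_off_cmd() {
    printf 'migrate_set_capability %s off\n' "$1"
}

# Send one monitor command (read from stdin) to the socket of an instance.
# $1 is the base name of the socket file, i.e. the XXXXX part above.
send_monitor() {
    socat STDIO "UNIX-CONNECT:${CTRL_DIR}/$1.monitor"
}

# Switch a capability off on a running instance.
reset_cap() {
    cap_off_cmd "$2" | send_monitor "$1"
}

# Show which capabilities the running qemu process currently has set.
show_caps() {
    echo "info migrate_capabilities" | send_monitor "$1"
}
```

After sourcing the file, "show_caps XXXXX" should show whether postcopy-ram is still on, and "reset_cap XXXXX postcopy-ram" switches it off before the next migration attempt.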