** Description changed:

+ [Impact]
+
+  * History: Xenial's qemu was once released with a machine type that was
+    very broken. This was later fixed in bug 1621042, but for
+    compatibility reasons we need to carry the broken type as well (e.g. to
+    allow migrations from guests started back then). In bug 1829868 we
+    realized that and "fixed the type to be as bad as it was originally".
+
+  * New Issue: Between Bionic and Focal the qemu code changed (again)
+    the way compat features are stored and assigned. While forward porting
+    our delta the wily type became "too non bad", meaning it is more
+    "normal" in comparison to e.g. a proper qemu 2.3/2.4 type, but that is
+    not what we need. We need it to be exactly the same mix&match of
+    2.3/2.4 features it was from the beginning.
+
+  * This bug has identified an issue due to that difference; the fix shall
+    again get this type in sync.
+
+ [Test Case]
+
+  * The balloon driver of Windows guests can be affected by this change of
+    attributes. So if you have started a windows guest with the wily
+    machine type on xenial and migrate it to focal it will fail as reported
+    by the bug opener below. Migrating such a machine is a valid test and
+    was done on the PPA in comment 17.
+
+  * These types carry more than just what failed in that windows guest. To
+    get the full list of compat attributes, comments #12 & #13 show how to
+    get those from gdb in 4.2 and 2.11 respectively. The list should match
+    what bionic had (without the fix the one of Focal is different).
+
+ [Where problems could occur]
+
+  * We are changing a type meant for compatibility with very old machines.
+    So I'd expect potential problems in migration (or save/restore) of those
+    very old guests.
+    Gladly that type hasn't been the default for more than 4 years now and
+    has been discouraged since like forever - and the changes are isolated
+    to this type.
+    Furthermore even if there are guests with that old type out it likely
+    is on very old xenial systems, but we only change >=Focal to be able to
+    receive those correctly - yet on >Focal there should be (hopefully)
+    next to none of these super old machine types.
+
+ [Other Info]
+
+  * To be clear, we are trying to keep an older and older compat base alive
+    here. But if possible anyone affected should consider upgrading the
+    guest machine types whenever there are major host OS upgrades. That
+    needs a guest restart, so only doable on scheduled downtimes.
+    https://wiki.ubuntu.com/QemuKVMMigration#Upgrade_machine_type
+
+
+ --- --- ---
+
  We have several thousands of virtual machines with pc-i440fx-wily machine type. Hypervisors run on ubuntu 16.04 and ubuntu 18.04.
-
- We have several problems when we try to migrate those machines to hypervisors with ubuntu 20.04.
+ We have several problems when we try to migrate those machines to
+ hypervisors with ubuntu 20.04.

  * linux guests migrate OK, but for some weird reason windows guests (with the same XML domain definition) do not. We have the following error:
  ---
  qemu-system-x86_64: Features 0x8000002 unsupported. Allowed features: 0x71000002
  qemu-system-x86_64: Failed to load virtio-console:virtio
  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-console
  ---

  I tried to investigate this issue and discovered following things:

  - missing feature is VIRTIO_F_ANY_LAYOUT for some of virtio devices
  - on xenial and bionic VIRTIO_F_ANY_LAYOUT is enabled for pc-i440fx-wily guests, observe:
  ---
- # virsh qemu-monitor-command some-guest --hmp info qtree | grep any_layout
- any_layout = true
- any_layout = true
- any_layout = false
- any_layout = true
+ # virsh qemu-monitor-command some-guest --hmp info qtree | grep any_layout
+ any_layout = true
+ any_layout = true
+ any_layout = false
+ any_layout = true
  ---
  - on focal it is disabled
  ---
  # virsh qemu-monitor-command some-guest2 --hmp info qtree | grep any_layout
- any_layout = false
- any_layout = true
- any_layout = false
- any_layout = false
+ any_layout = false
+ any_layout = true
+ any_layout = false
+ any_layout = false
  ---

  I tried (helplessly) to compare source code for bionic and focal branches of qemu.
  Looks like this block code is included for the pc-i440fx-wily in focal branch and this is where any_layout is disabled:
  ---
  GlobalProperty hw_compat_2_3[] = {
-     { "virtio-blk-pci", "any_layout", "off" },
-     { "virtio-balloon-pci", "any_layout", "off" },
-     { "virtio-serial-pci", "any_layout", "off" },
-     { "virtio-9p-pci", "any_layout", "off" },
-     { "virtio-rng-pci", "any_layout", "off" },
-     { TYPE_PCI_DEVICE, "x-pcie-lnksta-dllla", "off" },
-     { "migration", "send-configuration", "off" },
-     { "migration", "send-section-footer", "off" },
-     { "migration", "store-global-state", "off" },
+     { "virtio-blk-pci", "any_layout", "off" },
+     { "virtio-balloon-pci", "any_layout", "off" },
+     { "virtio-serial-pci", "any_layout", "off" },
+     { "virtio-9p-pci", "any_layout", "off" },
+     { "virtio-rng-pci", "any_layout", "off" },
+     { TYPE_PCI_DEVICE, "x-pcie-lnksta-dllla", "off" },
+     { "migration", "send-configuration", "off" },
+     { "migration", "send-section-footer", "off" },
+     { "migration", "store-global-state", "off" },
  };
  ---

  * also we have another problem that *might* be linked to broken definition of pc-i440fx-wily. I am not sure so I'll just mention it (maybe it will be obvious for someone familiar with source code that this problem is also due to broken definition of pc-i440fx-wily in focal and hence part of the same issue)

  So even if migration bionic → focal succeeds, it's impossible to migrate guest back (focal → bionic). The problem is:
  ---
  operation failed: guest CPU doesn't match specification: extra features: arat
  ---
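
For readers who want to check the claim that the missing feature is VIRTIO_F_ANY_LAYOUT, the two masks in the qemu error message can be compared bit by bit. The following is only a sketch: the mask values are copied from the error above, the bit numbers are the generic feature bits from the virtio specification, and the names OFFERED, ALLOWED, set_bits and describe are made up for this illustration. Treating the first mask as "what the incoming stream carries" is an assumption about the message, not something confirmed in this report.

```python
# Masks copied from the qemu error message in the report.
OFFERED = 0x8000002   # assumed: features carried by the incoming migration stream
ALLOWED = 0x71000002  # features the destination device accepts

# Generic virtio feature bit numbers (virtio specification).
# Lower bits are device-specific and are left unnamed here.
VIRTIO_BITS = {
    24: "VIRTIO_F_NOTIFY_ON_EMPTY",
    27: "VIRTIO_F_ANY_LAYOUT",
    28: "VIRTIO_RING_F_INDIRECT_DESC",
    29: "VIRTIO_RING_F_EVENT_IDX",
    30: "VIRTIO_F_BAD_FEATURE",
}


def set_bits(mask):
    """Return the positions of all bits set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def describe(mask):
    """Map each set bit to a feature name where one is known."""
    return [VIRTIO_BITS.get(bit, f"bit {bit} (device-specific)")
            for bit in set_bits(mask)]


# Bits present in the stream but not accepted by the destination.
missing = OFFERED & ~ALLOWED
print(f"missing mask: {missing:#x}")
print("missing features:", describe(missing))
```

The only bit left over after masking is bit 27, which matches the any_layout difference seen in the "info qtree" output on focal.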
-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1902654

Title:
  failure to migrate virtual machines with pc-i440fx-wily type to ubuntu 20.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1902654/+subscriptions
