** Description changed:

+ [Impact]
+ 
+  * History: Xenial's qemu was once released with a machine type that was
+    very broken. This was later fixed in bug 1621042, but for
+    compatibility reasons we need to carry the broken type as well (e.g. to
+    allow migrations of guests started back then). In bug 1829868 we
+    realized that and "fixed the type to be as bad as it was originally".
+    
+  * New Issue: Between Bionic and Focal the qemu code changed (again)
+    the way compat features are stored and assigned. While forward porting
+    our delta the wily type became "too non bad", meaning it is now closer
+    to a proper qemu 2.3/2.4 type, but that is not what we need. We need
+    it to be exactly the same mix&match of 2.3/2.4 features it was from
+    the beginning.
+ 
+  * This bug identified an issue caused by that difference; the fix shall
+    bring this type back in sync.
+ 
+ [Test Case]
+ 
+  * The balloon driver of Windows guests can be affected by this change of
+    attributes. So if you have started a Windows guest with the wily
+    machine type on xenial and migrate it to focal, it will fail as reported
+    by the bug opener below. Migrating such a machine is a valid test and
+    was done on the PPA in comment 17.
+ 
+  * These types carry more than just what failed in that Windows guest. To
+    get the full list of compat attributes, comments #12 & #13 show how to
+    get those from gdb in 4.2 and 2.11 respectively. The list should match
+    what bionic had (without the fix the one of Focal is different).
+ 
+ [Where problems could occur]
+ 
+  * We are changing a type meant for compatibility with very old machines.
+    So I'd expect potential problems in migration (or save/restore) of
+    those very old guests.
+    Fortunately that type has not been the default for more than 4 years
+    and has been discouraged for even longer, and the changes are isolated
+    to this type.
+    Furthermore, even if there are guests with that old type out there,
+    they are likely on very old xenial systems, but we only change >=Focal
+    to be able to receive those correctly - yet on >Focal there should be
+    (hopefully) next to none of these super old machine types.
+ 
+ [Other Info]
+  
+  * To be clear, we are trying to keep an older and older compat base alive 
+    here. But if possible anyone affected should consider upgrading the 
+    guest machine types whenever there are major host OS upgrades. That 
+    needs a guest restart, so only doable on scheduled downtimes.
+    https://wiki.ubuntu.com/QemuKVMMigration#Upgrade_machine_type
+ 
+ 
+ --- --- ---
+ 
  We have several thousands of virtual machines with pc-i440fx-wily
  machine type. Hypervisors run on ubuntu 16.04 and ubuntu 18.04.
  
- 
- We have several problems when we try to migrate those machines to hypervisors with ubuntu 20.04.
+ We have several problems when we try to migrate those machines to
+ hypervisors with ubuntu 20.04.
  
  * linux guests migrate OK, but for some weird reason windows guests
  (with the same XML domain definition) do not. We have the following
  error:
  ---
  qemu-system-x86_64: Features 0x8000002 unsupported. Allowed features: 0x71000002
  qemu-system-x86_64: Failed to load virtio-console:virtio
  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-console
  ---
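
  The two masks in the error can be compared bit by bit. The following
  is only a minimal sketch: the bit names are taken from the virtio
  feature-bit definitions as I understand them (bit 1 is device-specific
  and is assumed here to be the virtio-console multiport bit), and the
  helper names are made up for illustration.

```python
# Feature masks copied from the error message above.
requested = 0x8000002   # features carried by the incoming migration stream
allowed = 0x71000002    # features the destination device accepts

# Assumed bit names (virtio feature bits; bit 1 is device-specific and
# is assumed to be the virtio-console multiport feature).
BIT_NAMES = {
    1: "VIRTIO_CONSOLE_F_MULTIPORT",
    24: "VIRTIO_F_NOTIFY_ON_EMPTY",
    27: "VIRTIO_F_ANY_LAYOUT",
    28: "VIRTIO_RING_F_INDIRECT_DESC",
    29: "VIRTIO_RING_F_EVENT_IDX",
    30: "VIRTIO_F_BAD_FEATURE",
}


def set_bits(mask):
    """Return the positions of all bits set in mask, lowest first."""
    return [b for b in range(mask.bit_length()) if (mask >> b) & 1]


def describe(mask):
    """Translate a mask into feature names, falling back to 'bit N'."""
    return [BIT_NAMES.get(b, f"bit {b}") for b in set_bits(mask)]


# Bits the source asks for that the destination does not allow.
missing = requested & ~allowed
print(hex(missing), describe(missing))
# prints: 0x8000000 ['VIRTIO_F_ANY_LAYOUT']
```

  Only bit 27 is requested but not allowed, which is consistent with the
  any_layout finding described next.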
  I tried to investigate this issue and discovered the following things:
  - the missing feature is VIRTIO_F_ANY_LAYOUT for some of the virtio devices
  - on xenial and bionic VIRTIO_F_ANY_LAYOUT is enabled for pc-i440fx-wily guests, observe:
  ---
- # virsh qemu-monitor-command some-guest --hmp info qtree | grep any_layout 
-             any_layout = true
-             any_layout = true
-             any_layout = false
-             any_layout = true
+ # virsh qemu-monitor-command some-guest --hmp info qtree | grep any_layout
+             any_layout = true
+             any_layout = true
+             any_layout = false
+             any_layout = true
  ---
  - on focal it is disabled
  ---
  # virsh qemu-monitor-command some-guest2 --hmp info qtree | grep any_layout
-             any_layout = false
-             any_layout = true
-             any_layout = false
-             any_layout = false
+             any_layout = false
+             any_layout = true
+             any_layout = false
+             any_layout = false
  ---
  I tried (helplessly) to compare the source code of the bionic and focal
  branches of qemu. It looks like this block of code is included for
  pc-i440fx-wily in the focal branch and this is where any_layout is
  disabled:
  ---
  GlobalProperty hw_compat_2_3[] = {
-     { "virtio-blk-pci", "any_layout", "off" },
-     { "virtio-balloon-pci", "any_layout", "off" },
-     { "virtio-serial-pci", "any_layout", "off" },
-     { "virtio-9p-pci", "any_layout", "off" },
-     { "virtio-rng-pci", "any_layout", "off" },
-     { TYPE_PCI_DEVICE, "x-pcie-lnksta-dllla", "off" },
-     { "migration", "send-configuration", "off" },
-     { "migration", "send-section-footer", "off" },
-     { "migration", "store-global-state", "off" },
+     { "virtio-blk-pci", "any_layout", "off" },
+     { "virtio-balloon-pci", "any_layout", "off" },
+     { "virtio-serial-pci", "any_layout", "off" },
+     { "virtio-9p-pci", "any_layout", "off" },
+     { "virtio-rng-pci", "any_layout", "off" },
+     { TYPE_PCI_DEVICE, "x-pcie-lnksta-dllla", "off" },
+     { "migration", "send-configuration", "off" },
+     { "migration", "send-section-footer", "off" },
+     { "migration", "store-global-state", "off" },
  };
  ---
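
  To illustrate why including that table matters, here is a toy model of
  how compat entries override a device default. This is not qemu's
  implementation: the default value "on", the exact-name matching, the
  function name and the virtio-net-pci example are all assumptions made
  for the sketch. Only the five any_layout entries are copied from the
  block above.

```python
# Assumed default for illustration only: any_layout starts as "on".
DEFAULTS = {"any_layout": "on"}

# The five any_layout entries of hw_compat_2_3 quoted above.
HW_COMPAT_2_3 = [
    ("virtio-blk-pci", "any_layout", "off"),
    ("virtio-balloon-pci", "any_layout", "off"),
    ("virtio-serial-pci", "any_layout", "off"),
    ("virtio-9p-pci", "any_layout", "off"),
    ("virtio-rng-pci", "any_layout", "off"),
]


def effective_properties(driver, compat):
    """Apply (driver, property, value) compat entries on top of DEFAULTS.

    Hypothetical helper: matches on the exact driver name only.
    """
    props = dict(DEFAULTS)
    for compat_driver, name, value in compat:
        if compat_driver == driver:
            props[name] = value
    return props


# A device listed in the table loses any_layout, an unlisted one keeps it.
for driver in ("virtio-serial-pci", "virtio-net-pci"):
    print(driver, effective_properties(driver, HW_COMPAT_2_3))
```

  Under these assumptions, a machine type that pulls in hw_compat_2_3
  turns any_layout off for the listed devices, while a type that does
  not include the table leaves it on.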
  
  * We also have another problem that *might* be linked to the broken
  definition of pc-i440fx-wily. I am not sure, so I'll just mention it
  (maybe it will be obvious to someone familiar with the source code that
  this problem is also due to the broken definition of pc-i440fx-wily in
  focal and hence part of the same issue).
  Even if the migration bionic → focal succeeds, it's impossible to
  migrate the guest back (focal → bionic). The problem is:
  ---
  operation failed: guest CPU doesn't match specification: extra features: arat
  ---

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1902654

Title:
  failure to migrate virtual machines with pc-i440fx-wily type to ubuntu
  20.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1902654/+subscriptions
