Public bug reported:

Microsoft would like to request the following commit in all supported releases 
that run on Azure:
915cff7f38c5 (“PCI: hv: Fix hibernation in case interrupts are not re-created”)

Commit details:
    pci_restore_msi_state() directly writes the MSI/MSI-X related registers
    via MMIO. On a physical machine, this works perfectly. For a Linux VM
    running on a hypervisor, which typically enables IOMMU interrupt remapping,
    the hypervisor is usually expected to trap and emulate the MMIO accesses
    in order to re-create the necessary interrupt remapping table entries in
    the IOMMU; otherwise the interrupts cannot work in the VM after hibernation.

    Hyper-V is different from other hypervisors in that it does not trap and
    emulate the MMIO accesses. Instead it uses a para-virtualized method,
    which requires the VM to call hv_compose_msi_msg() to pass the hypervisor
    the information it would otherwise obtain through the trap-and-emulate
    method. This is not an issue for many PCI device drivers, which destroy
    and re-create the interrupts across hibernation, so hv_compose_msi_msg()
    is called automatically. However, some PCI device drivers (e.g. the
    in-tree GPU driver nouveau and the out-of-tree Nvidia proprietary GPU
    driver) do not destroy and re-create MSI/MSI-X interrupts across
    hibernation, so hv_pci_resume() has to call hv_compose_msi_msg();
    otherwise those PCI device drivers can no longer receive interrupts after
    the VM resumes from hibernation.
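
The paragraph above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_irq and every model_* function are hypothetical names invented for illustration, and model_compose_msi_msg() merely stands in for the notification that hv_compose_msi_msg() performs. It shows why an interrupt that the driver does not re-create stays dead after resume unless the resume path notifies the hypervisor again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one MSI/MSI-X interrupt; not a kernel structure. */
struct model_irq {
    bool hv_mapping_valid; /* does the modelled hypervisor know this IRQ? */
    int compose_calls;     /* how often the hypervisor was notified */
};

/* Stand-in for hv_compose_msi_msg(): tell the hypervisor about the IRQ. */
static void model_compose_msi_msg(struct model_irq *irq)
{
    assert(irq != NULL);
    irq->hv_mapping_valid = true;
    irq->compose_calls++;
}

/* Assumption: hibernation drops all hypervisor-side interrupt state. */
static void model_hibernate(struct model_irq *irqs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        irqs[i].hv_mapping_valid = false;
}

/* Models a resume path that re-notifies the hypervisor for every
 * interrupt that still exists, whether or not the driver re-created it. */
static void model_resume(struct model_irq *irqs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        model_compose_msi_msg(&irqs[i]);
}

/* Counts interrupts that the modelled hypervisor can currently deliver. */
static size_t model_count_working(const struct model_irq *irqs, size_t n)
{
    size_t working = 0;
    for (size_t i = 0; i < n; i++)
        if (irqs[i].hv_mapping_valid)
            working++;
    return working;
}
```

In this model, calling model_hibernate() leaves zero working interrupts, and only the explicit model_resume() pass brings them back, which mirrors the role the commit message assigns to hv_pci_resume().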

    Hyper-V is also different in that chip->irq_unmask() may fail in a
    Linux VM running on Hyper-V (on a physical machine, chip->irq_unmask()
    cannot fail because unmasking an MSI/MSI-X register just means an MMIO
    write): during hibernation, when a CPU is offlined, the kernel tries
    to move the interrupt to the remaining CPUs that haven't been offlined
    yet. In this case, hv_irq_unmask() -> hv_do_hypercall() always fails
    because the vmbus channel has been closed. The early "return" in
    hv_irq_unmask() then means that pci_msi_unmask_irq() is not called, i.e.
    desc->masked remains "true", so after hibernation the MSI interrupt
    always remains masked, which is incorrect. Refer to cpu_disable_common()
    -> fixup_irqs() -> irq_migrate_all_off_this_cpu() -> migrate_one_irq():
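
The stuck-mask behaviour described in this paragraph can also be modelled in userspace. Again this is a hedged sketch and not the kernel's code or the commit's change: struct model_desc, model_channel_open, model_hypercall() and model_irq_unmask() are hypothetical names, and the model only reproduces the early-return pattern the text attributes to hv_irq_unmask().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-interrupt descriptor state. */
struct model_desc {
    bool masked;
};

/* Assumption: models whether the vmbus channel is still open. */
static bool model_channel_open = true;

/* Stand-in for the hypercall: fails once the channel has been closed. */
static int model_hypercall(void)
{
    return model_channel_open ? 0 : -1;
}

/* Models the early-return pattern: if the hypercall fails, the function
 * returns before the mask bit is cleared, so the IRQ stays masked. */
static void model_irq_unmask(struct model_desc *desc)
{
    assert(desc != NULL);
    if (model_hypercall() != 0)
        return;           /* early return: desc->masked is left untouched */
    desc->masked = false; /* only reached when the hypercall succeeds */
}
```

With model_channel_open set to false, a descriptor that starts out masked is still masked after model_irq_unmask() returns, which is the state the commit message calls incorrect.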

** Affects: linux-azure (Ubuntu)
     Importance: Undecided
         Status: New

** Summary changed:

- Fix hibernation in case interrupts are not re-created
+ [linux-azure] Fix hibernation in case interrupts are not re-created

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1904463

Title:
  [linux-azure] Fix hibernation in case interrupts are not re-created

Status in linux-azure package in Ubuntu:
  New
