For Groovy, the proposed fix has already been applied to the generic groovy/linux kernel as part of "Groovy update: v5.8.17 upstream stable release" (bug 1902137). As a result, the patch applied to the linux-azure branch was dropped during the rebase, so the BugLink to this bug report is missing. Because of that, this bug will not be closed automatically when the package is released.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1894893

Title:
  [linux-azure][hibernation] GPU device no longer working after resume
  from hibernation in NV6 VM size

Status in linux-azure package in Ubuntu:
  Invalid
Status in linux-azure source package in Focal:
  Fix Committed
Status in linux-azure source package in Groovy:
  Fix Committed

Bug description:
  [Impact]

  There are failure logs after resume from hibernation in an NV6 (GPU
  passthrough size) VM in Azure:

  [ 1432.153730] hv_pci 47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5
  [ 1432.167910] hv_pci 47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5

  This happens with the latest stable release of the linux-azure
  5.4.0-1023.23 kernel and with the latest mainline Linux kernel.

  [Test Case]

  How reproducible:
  100%

  Steps to Reproduce:
  1. Start a Standard_NV6 VM in Azure and enable hibernation properly
  (please refer to
  https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1880032/comments/14 ).
  E.g. here I create a Generation-1 Ubuntu 20.04 Standard NV6_Promo
  (6 vcpus, 56 GiB memory) VM in East US 2.

  2. Make sure the in-kernel open-source nouveau driver is loaded, or
  blacklist the nouveau driver and install the official Nvidia GPU
  driver (please follow
  https://docs.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup :
  "Install GRID drivers on NV or NVv3-series VMs" -- the most important
  step is to run "./NVIDIA-Linux-x86_64-grid.run".)

  3. Run hibernation from the serial console
  # systemctl hibernate

  4. After hibernation finishes, start the VM and check dmesg
  # dmesg|grep fail

  Actual results:
  [ 1432.153730] hv_pci 47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5
  [ 1432.167910] hv_pci 47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5

  And /proc/interrupts shows that the GPU interrupts are no longer
  happening.
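
  For anyone who wants to script the dmesg check in step 4, here is a
  minimal sketch in Python. It only parses text shaped like the two
  failure lines quoted above; the helper name find_unmask_failures and
  the regular expression are illustrative assumptions, not part of the
  fix or of any kernel tooling, and the log format is inferred from this
  report alone.

```python
import re

# Pattern inferred from the two sample lines in this report (an assumption,
# not a guaranteed kernel log format): "[ <ts>] hv_pci <dev>: hv_irq_unmask() failed: <hex>"
FAIL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+hv_pci\s+(?P<dev>[0-9a-fA-F-]+):\s+"
    r"hv_irq_unmask\(\) failed:\s+(?P<code>0x[0-9a-fA-F]+)"
)


def find_unmask_failures(dmesg_text):
    """Return a (timestamp, device, status) tuple per hv_irq_unmask() failure line."""
    return [
        (float(m.group("ts")), m.group("dev"), int(m.group("code"), 16))
        for m in FAIL_RE.finditer(dmesg_text)
    ]


# Sample input copied from the "Actual results" in this report.
SAMPLE = """\
[ 1432.153730] hv_pci 47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5
[ 1432.167910] hv_pci 47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5
"""

if __name__ == "__main__":
    for ts, dev, status in find_unmask_failures(SAMPLE):
        print(f"{ts:.6f} {dev} status={status:#x}")
```

  On a real VM the input could be the captured output of dmesg instead
  of SAMPLE; an empty result would match the expected behaviour with the
  patch applied.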
  Expected results:
  No failure logs, and the GPU interrupts should still happen after
  hibernation.

  [Regression Potential]

  The fix touches the pci-hyperv driver and could affect the Hyper-V
  guest drivers. However, the change focuses on the execution path used
  for hibernation, which is still not officially supported.

  [Other info]

  BUG FIX:
  I made a fix here: https://lkml.org/lkml/2020/9/4/1268.

  Without the patch, we see the error "hv_pci
  47505500-0001-0000-3130-444531334632: hv_irq_unmask() failed: 0x5"
  during hibernation when the VM has the Nvidia GPU driver loaded, and
  after hibernation the GPU driver can no longer receive any MSI/MSI-X
  interrupts when we check /proc/interrupts.

  With the patch, we should no longer see the error, and the GPU driver
  should still receive interrupts after hibernation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1894893/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp