This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
  apport-collect 1961968

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1961968

Title:
  Broken network on some AWS instances with focal/impish kernels

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Focal:
  In Progress
Status in linux source package in Impish:
  In Progress

Bug description:
  [Impact]

  With the latest focal/linux (5.4.0-101.114) and impish/linux (5.13.0-31.34) kernels built for SRU cycle 2022.02.21, some AWS instances fail to boot. This mostly impacts the instance types c4.large, c3.xlarge and x1e.xlarge. However, not all instances deployed on those types fail. The most affected type is c4.large, which fails in about 80-90% of all deployments.

  This was traced to the network interface failing to come up. The following console log snippets from 5.4.0-101-generic on a c4.large show some hints of what's going on:

  [...]
  [    3.990368] unchecked MSR access error: RDMSR from 0xc90 at rIP: 0xffffffff8ea733c8 (native_read_msr+0x8/0x40)
  [    3.998463] Call Trace:
  [    4.001164]  ? set_rdt_options+0x91/0x91
  [    4.004864]  resctrl_late_init+0x592/0x63c
  [    4.008711]  ? set_rdt_options+0x91/0x91
  [    4.012452]  do_one_initcall+0x4a/0x200
  [    4.016115]  kernel_init_freeable+0x1c0/0x263
  [    4.020402]  ? rest_init+0xb0/0xb0
  [    4.024889]  kernel_init+0xe/0x110
  [    4.029245]  ret_from_fork+0x35/0x40
  [...]
  [    7.718268] ena: The ena device sent a completion but the driver didn't receive a MSI-X interrupt (cmd 8), autopolling mode is OFF
  [    7.727036] ena: Failed to submit get_feature command 12 error: -62
  [    7.731691] ena 0000:00:03.0: Cannot init indirect table
  [    7.735636] ena 0000:00:03.0: Cannot init RSS rc: -62
  [    7.740700] ena: probe of 0000:00:03.0 failed with error -62
  [...]

  [Fix]

  Reverting the following upstream stable commit fixes the issue:

  83dbf898a2d4 PCI/MSI: Mask MSI-X vectors only on success

  [Test Case]

  Boot an affected AWS instance type with the focal/linux (5.4.0-101.114) and impish/linux (5.13.0-31.34) kernels with the mentioned patch reverted, then boot with the original kernels. The instance should boot successfully with the patch reverted and fail with the original kernels.

  [Regression Potential]

  The patch description mentions fixing an MSI-X issue with a Marvell NVMe device that does not appear to follow the PCIe specification. Reverting this commit leaves that issue unfixed on systems with that particular NVMe device. There is currently no follow-up fix for this commit upstream; we may need to watch for changes and re-apply the commit if a fix is found.
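
When repeating the test case across many deployments, it may help to check each instance's console log for the ena failure signature quoted in the bug description instead of reading the log by hand. The following is only a minimal sketch: the function name find_ena_failures and the returned dictionary layout are hypothetical, the two patterns are copied from the log lines above, and it assumes the console log has already been fetched as text by whatever means is available.

```python
import re

# Signatures copied from the console log quoted in the bug description.
ENA_PROBE_FAILED = re.compile(r"ena: probe of (\S+) failed with error (-?\d+)")
MSIX_MISSED = re.compile(r"didn't receive a MSI-X interrupt")


def find_ena_failures(console_log: str) -> dict:
    """Summarize ena probe failures found in a console log.

    Hypothetical helper: returns the (PCI address, error code) pairs of
    failed ena probes and whether a missed MSI-X interrupt was reported.
    """
    probes = [
        (m.group(1), int(m.group(2)))
        for m in ENA_PROBE_FAILED.finditer(console_log)
    ]
    return {
        "failed_probes": probes,
        "msix_missed": bool(MSIX_MISSED.search(console_log)),
    }
```

An empty failed_probes list only means this particular signature was not logged; it does not by itself prove that the network came up.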