** Changed in: linux (Ubuntu Focal)
       Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu Impish)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1961968

Title:
  Broken network on some AWS instances with focal/impish kernels

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Impish:
  Fix Committed

Bug description:
  [Impact]
  With the latest focal/linux (5.4.0-101.114) and impish/linux (5.13.0-31.34) 
kernels built for SRU cycle 2022.02.21 some AWS instances fail to boot. This 
impacts mostly the instance types: c4.large, c3.xlarge and x1e.xlarge. However, 
not all instances deployed on those types will fail. This is affecting mostly 
c4.large which fails about 80-90% of all deployments.

  This was traced to be caused by the network interface failing to come
  up. The following console log snippets from 5.4.0-101-generic on a
  c4.large show some hints of what's going on:

  [...]
  [    3.990368] unchecked MSR access error: RDMSR from 0xc90 at rIP: 
0xffffffff8ea733c8 (native_read_msr+0x8/0x40)
  [    3.998463] Call Trace:
  [    4.001164]  ? set_rdt_options+0x91/0x91
  [    4.004864]  resctrl_late_init+0x592/0x63c
  [    4.008711]  ? set_rdt_options+0x91/0x91
  [    4.012452]  do_one_initcall+0x4a/0x200
  [    4.016115]  kernel_init_freeable+0x1c0/0x263
  [    4.020402]  ? rest_init+0xb0/0xb0
  [    4.024889]  kernel_init+0xe/0x110
  [    4.029245]  ret_from_fork+0x35/0x40
  [...]
  [    7.718268] ena: The ena device sent a completion but the driver didn't 
receive a MSI-X interrupt (cmd 8), autopolling mode is OFF
  [    7.727036] ena: Failed to submit get_feature command 12 error: -62
  [    7.731691] ena 0000:00:03.0: Cannot init indirect table
  [    7.735636] ena 0000:00:03.0: Cannot init RSS rc: -62
  [    7.740700] ena: probe of 0000:00:03.0 failed with error -62
  [...]

  [Fix]
  Reverting the following upstream stable commit fixes the issue:

  83dbf898a2d4 PCI/MSI: Mask MSI-X vectors only on success

  [Test Case]
  Boot an affected AWS instance type with focal/linux (5.4.0-101.114) and 
impish/linux (5.13.0-31.34) kernels with the mentioned patch reverted. Then 
boot with the original kernels. It should boot successfully with the reverted 
patch but fail with the original kernels.

  [Regression Potential]
  The patch description mentions fixing a MSI-X issue with a Marvell NVME 
device, which doesn't seem to be following the PCI-E specification. Reverting 
this commit will keep the issue on systems with that particular NVME device 
unfixed.
  As of now there is no follow-up fix for this commit upstream, we might need 
to keep an eye on any change and re-apply it in case a fix is found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1961968/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to