** Changed in: linux (Ubuntu Jammy)
Status: New => Triaged
** Changed in: linux (Ubuntu Jammy)
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2109537
Title:
Jammy generic-64k fails to initialize gVNIC devices
Status in linux package in Ubuntu:
New
Status in linux source package in Jammy:
Triaged
Bug description:
[Impact]
During startup on one of Google Compute Engine's C4A machines, the gVNIC will
fail to initialize:
[ 1.071899] gvnic 0000:00:00.0: enabling device (0010 -> 0012)
[ 1.076631] ACPI: \_SB_.PCI0.GSI2: Enabled at IRQ 37
[ 1.078075] nvme nvme0: pci function 0000:00:02.0
[ 1.093687] nvme nvme0: 4/0/0 default/read/poll queues
[ 1.097563] nvme0n1: p1 p15
[ 3.886472] gvnic 0000:00:00.0: AQ commands timed out, need to reset AQ
[ 3.888151] gvnic 0000:00:00.0: Could not get device information: err=-131
[ 3.891458] gvnic: probe of 0000:00:00.0 failed with error -131
Because this is a cloud instance, network failure means the instance
is unusable.
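A minimal sketch of how the failure above could be detected from a captured
kernel log (for example, the serial console output of the instance). The
function name and the regular expression are illustrative assumptions, not
part of any existing tool; the sample text is copied from the log excerpt
above.

```python
import re

# Matches lines such as:
#   [ 3.891458] gvnic: probe of 0000:00:00.0 failed with error -131
# The pattern is an assumption based on the log excerpt in this report.
PROBE_FAILURE = re.compile(
    r"gvnic: probe of (?P<bdf>[0-9a-f:.]+) failed with error (?P<err>-?\d+)"
)


def find_gvnic_probe_failures(log_text):
    """Return a list of (pci_address, error_code) for failed gvnic probes."""
    failures = []
    for line in log_text.splitlines():
        match = PROBE_FAILURE.search(line)
        if match:
            failures.append((match.group("bdf"), int(match.group("err"))))
    return failures


# Log lines taken from the bug description.
SAMPLE_LOG = """\
[ 3.886472] gvnic 0000:00:00.0: AQ commands timed out, need to reset AQ
[ 3.888151] gvnic 0000:00:00.0: Could not get device information: err=-131
[ 3.891458] gvnic: probe of 0000:00:00.0 failed with error -131
"""

if __name__ == "__main__":
    for bdf, err in find_gvnic_probe_failures(SAMPLE_LOG):
        print(f"gvnic probe failed on {bdf} with error {err}")
```

An empty result on a log from the patched kernel would be consistent with
the device having probed successfully, although it does not prove that the
network is usable.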
[Fix]
A patchset that makes the GVE driver work on both 64k and 4k page size
kernels was applied in Linux 6.8, so Noble and later kernels are not
affected. Backporting the patchset to 5.15 appears to fix the issue: I was
able to boot and connect to the machine using the patched kernel.
Patchset link:
https://lore.kernel.org/all/[email protected]/
Hashes:
955f4d3bf0a45 ("gve: Perform adminq allocations through a dma_pool.")
8ae980d24195f ("gve: Deprecate adminq_pfn for pci revision 0x1.")
ce260cb114bbf ("gve: Remove obsolete checks that rely on page size.")
513072fb4bf81 ("gve: Add page size register to the register_page_list command.")
da7d4b42caf1b ("gve: Remove dependency on 4k page size.")
[Test plan]
Boot the 64k flavor of the patched kernel on a C4A Google Compute
Engine instance, and verify that you can ssh to it.
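As a supplementary check once logged in, the following sketch confirms that
the running kernel really uses 64k pages, so that the test exercises the
affected flavor. It is an assumption about how one might check this, not
part of the test plan itself; the helper names are hypothetical.

```python
import os


def page_size_bytes():
    """Return the page size of the running kernel in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


def is_64k_page_kernel(page_size):
    """True if the given page size corresponds to a 64k page kernel."""
    return page_size == 64 * 1024


if __name__ == "__main__":
    size = page_size_bytes()
    flavor = "64k" if is_64k_page_kernel(size) else "not 64k"
    print(f"page size: {size} bytes ({flavor})")
```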
[Regression potential]
Of the applied patches, "gve: Remove dependency on 4k page size." was
the only one to have conflicts. It's possible that there are uses of
the native PAGE_SIZE definition that aren't covered by the backport of
the patch. This patchset is being backported without the other major GVE
driver patchsets that had been applied before it in mainline.
Since the patches are isolated to the GVE driver, and since
generic-64k previously didn't work on gVNIC instances at all, any
failure would be limited to configurations that were already not
working, and so would not be a regression.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2109537/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp