Public bug reported:
[Impact]
MAAS deploys to the wrong NVMe device
[Description]
Since [0], which came with the introduction of NVMe multipath, the way a
namespace's "identity" is calculated has changed. As a result, the nvme device
names can be mismatched with their corresponding namespaces, as in the
situation below:
lrwxrwxrwx 1 root root 0 Jul 18 13:25 /sys/block/nvme0n1/device -> ../../nvme1
lrwxrwxrwx 1 root root 0 Jul 18 13:25 /sys/block/nvme1n1/device -> ../../nvme0
This can cause MAAS/curtin to deploy to the wrong NVMe device, as it
currently uses device names that are subject to change between reboots.
The problem can be worked around with the nvme_core.multipath=0 kernel
parameter, but ideally MAAS/curtin should not rely on the device numbers.
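
As an illustration of the mismatch shown above, the following is a minimal
Python sketch (not part of MAAS or curtin) that walks /sys/block and reports
NVMe namespaces whose "device" symlink points at a differently numbered
device. The function name and the sysfs_root parameter are hypothetical, and
the sketch assumes the symlink layout from the listing above; on some kernels
the link may point elsewhere (for example at a subsystem entry), which this
sketch would also report as a mismatch.

```python
import os
import re

# Matches namespace block devices such as "nvme0n1"; group 1 is "nvme0".
NVME_NS = re.compile(r"^(nvme\d+)n\d+$")


def find_nvme_name_mismatches(sysfs_root="/sys"):
    """Return (namespace, name_prefix, link_target) for every mismatch.

    name_prefix is the device implied by the block device name, and
    link_target is the basename of the /sys/block/<namespace>/device link.
    """
    block_dir = os.path.join(sysfs_root, "block")
    mismatches = []
    for name in sorted(os.listdir(block_dir)):
        match = NVME_NS.match(name)
        if not match:
            continue  # not an NVMe namespace (e.g. sda, loop0)
        link = os.path.join(block_dir, name, "device")
        if not os.path.islink(link):
            continue
        target = os.path.basename(os.readlink(link))
        if target != match.group(1):
            mismatches.append((name, match.group(1), target))
    return mismatches


if __name__ == "__main__":
    for namespace, expected, actual in find_nvme_name_mismatches():
        print(f"{namespace}: name implies {expected}, sysfs points to {actual}")
```

An empty result only means the names happen to line up on the current boot;
it does not guarantee the same ordering after a reboot.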
A possible solution for this is to ensure that NVMe devices are referred
to by their device ID, as that should keep things consistent between
reboots.
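
To make the idea of referring to devices by ID concrete, here is a hedged
Python sketch that maps each kernel device name to the persistent symlinks
udev typically creates under /dev/disk/by-id. It is not how MAAS or curtin
implement this; the function name, the by_id_dir parameter, and the "nvme-"
prefix filter are assumptions made for illustration.

```python
import os
from collections import defaultdict


def stable_ids_by_kernel_name(by_id_dir="/dev/disk/by-id", prefix="nvme-"):
    """Map kernel names (e.g. "nvme0n1") to their by-id symlink paths."""
    ids = defaultdict(list)
    for entry in sorted(os.listdir(by_id_dir)):
        if not entry.startswith(prefix):
            continue  # only consider links that udev prefixed for NVMe
        link = os.path.join(by_id_dir, entry)
        if not os.path.islink(link):
            continue
        # Resolve the link to find which kernel name it currently points at.
        kernel_name = os.path.basename(os.path.realpath(link))
        ids[kernel_name].append(link)
    return dict(ids)


if __name__ == "__main__":
    for kernel_name, links in sorted(stable_ids_by_kernel_name().items()):
        print(kernel_name)
        for link in links:
            print(f"  {link}")
```

The by-id paths stay attached to the same physical device even if the kernel
name it resolves to differs on the next boot, which is why they are a better
key for storage configuration than nvme0n1 or nvme1n1.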
[0] ed754e5dee ("nvme: track shared namespaces")
https://git.kernel.org/linus/ed754e5dee
[Test Case]
On a system with multiple NVMe devices, deploy a Custom OS image with MAAS. As
the change to device names is not completely deterministic, below are some
reports of possible symptoms:
- deployed OS ends up on the wrong drive
- disk management presented Disk 0 as uninitialized and Disk 1 with installed OS
- OS fails to boot if only primary drive was listed in boot order
[Regression Potential]
The regression potential for this change should be low, considering that
MAAS/curtin already have the necessary support for referring to storage devices
by their ID. A regression could cause deployments to fail consistently if the
NVMe devices end up being indexed by the wrong IDs.
** Affects: curtin
Importance: Undecided
Status: New
** Affects: maas
Importance: Undecided
Status: New
** Affects: maas (Ubuntu)
Importance: Undecided
Status: Incomplete
** Tags: sts
https://bugs.launchpad.net/bugs/1849320
Title:
MAAS assigns wrong multipath NVMe device