Public bug reported:

[Impact]
MAAS deploys to the wrong NVMe device

[Description]
Since [0], the introduction of NVMe multipath has changed the way a namespace's "identity" is calculated. It's possible to have a "mismatch" between the NVMe device names and their corresponding namespaces, similar to the situation below:

lrwxrwxrwx 1 root root 0 Jul 18 13:25 /sys/block/nvme0n1/device -> ../../nvme1
lrwxrwxrwx 1 root root 0 Jul 18 13:25 /sys/block/nvme1n1/device -> ../../nvme0

This can cause MAAS/curtin to deploy to the wrong NVMe device, as it's currently using device names that are subject to change between reboots.

It can be alleviated by using the nvme_core.multipath=0 parameter, but ideally we should not have MAAS/curtin rely on the device numbers.

A possible solution for this is to ensure that NVMe devices are referred to by their device ID, as that should keep things consistent between reboots.

[0] ed754e5dee ("nvme: track shared namespaces")
https://git.kernel.org/linus/ed754e5dee

[Test Case]
On a system with multiple NVMe devices, deploy a Custom OS image with MAAS. As the change to device names is not completely deterministic, below are some reports of possible symptoms:

 - deployed OS ends up on the wrong drive
 - disk management presented Disk 0 as uninitialized and Disk 1 with installed OS
 - OS fails to boot if only primary drive was listed in boot order

[Regression Potential]
The regression potential for this change should be low, considering that MAAS/curtin already have the necessary support for referring to storage devices by their ID. A regression could cause deployments to fail consistently, if the NVMe devices end up being indexed by wrong IDs.

** Affects: curtin
     Importance: Undecided
         Status: New

** Affects: maas
     Importance: Undecided
         Status: New

** Affects: maas (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Tags: sts

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1849320

Title:
  MAAS assigns wrong multipath NVMe device

To manage notifications about this bug go to:
https://bugs.launchpad.net/curtin/+bug/1849320/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
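
For anyone trying to check whether a machine is affected, the mismatch from the [Description] can be spotted with a short script. This is only a rough sketch and not what curtin or MAAS actually do. The function names find_mismatches and by_id_links are made up for illustration, and it assumes the sysfs layout from the listing in the report, where /sys/block/<namespace>/device resolves to a controller directory such as nvme1, and that udev has populated /dev/disk/by-id.

```python
import os
import re

# Namespace block devices look like nvme<controller>n<namespace>, e.g. nvme0n1.
NS_RE = re.compile(r"^(nvme\d+)n\d+$")


def find_mismatches(sys_block="/sys/block"):
    """Return {namespace: controller} for namespaces whose name prefix
    differs from the controller their 'device' link resolves to.

    Assumption: the 'device' link points at a controller directory,
    as in the listing from the bug description.
    """
    mismatches = {}
    if not os.path.isdir(sys_block):
        return mismatches
    for name in sorted(os.listdir(sys_block)):
        match = NS_RE.match(name)
        if not match:
            continue  # not an NVMe namespace block device
        link = os.path.join(sys_block, name, "device")
        controller = os.path.basename(os.path.realpath(link))
        if controller != match.group(1):
            mismatches[name] = controller
    return mismatches


def by_id_links(dev_path, by_id_dir="/dev/disk/by-id"):
    """Return the by-id symlinks that resolve to dev_path.

    These names are derived from device identifiers rather than probe
    order, so they should stay the same between reboots.
    """
    if not os.path.isdir(by_id_dir):
        return []
    target = os.path.realpath(dev_path)
    links = []
    for entry in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, entry)
        if os.path.realpath(path) == target:
            links.append(path)
    return links
```

On an affected system, find_mismatches() would be expected to return a non-empty dict, and by_id_links("/dev/nvme0n1") lists the persistent names that could be used in place of the kernel device name.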