Public bug reported:

[Impact]
During repeated namespace (NS) map/unmap operations in ONTAP (which trigger NS
attribute changed AENs), where new namespaces get mapped reusing an old NSID,
the Ubuntu 24.04 NVMe/TCP host occasionally ends up with inconsistent devices:
the NVMe block device (i.e. /dev/nvmeXnY) is available, but the corresponding
NVMe generic char device (i.e. /dev/ngXnY) is not. The issue is not seen when
the same NS is remapped on the same NSID; it is only hit when a new NS is
mapped to an NSID that was previously used by a different NS.
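
One way to spot the inconsistency described above is to compare the block and
generic char node names under /dev. The following is only a minimal sketch: the
function name missing_generic_nodes is hypothetical, and it assumes the
/dev/nvmeXnY to /dev/ngXnY naming given in this report.

```python
import re

# Matches NVMe namespace block nodes such as "nvme0n1", but not partitions
# ("nvme0n1p1") or controller nodes ("nvme0").
_BLOCK_RE = re.compile(r"^nvme(\d+)n(\d+)$")


def missing_generic_nodes(dev_entries):
    """Return block node names whose ngXnY counterpart is absent.

    dev_entries is an iterable of bare names as found in /dev.
    """
    names = set(dev_entries)
    missing = []
    for name in sorted(names):
        match = _BLOCK_RE.match(name)
        if match and f"ng{match.group(1)}n{match.group(2)}" not in names:
            missing.append(name)
    return missing
```

On an affected host this could be called as
missing_generic_nodes(os.listdir("/dev")) after importing os; an empty list
means every namespace block device has its generic char device.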

The following error entries are seen in the messages file during this
device inconsistency scenario:

...
kernel: [267011.744167][ T2016] nvme nvme6: rescanning namespaces.
kernel: [267011.744347][T46805] nvme nvme2: rescanning namespaces.
kernel: [267011.750418][ T7876] nvme nvme1: rescanning namespaces.
kernel: [267011.784466][ T2016] nvme nvme6: IDs don't match for shared namespace 1
kernel: [267011.784791][T46805] nvme nvme2: IDs don't match for shared namespace 1
kernel: [267011.790843][ T7876] nvme nvme1: IDs don't match for shared namespace 1
kernel: [267011.804852][ T2016] nvme nvme6: IDs don't match for shared namespace 2
kernel: [267011.804867][T46805] nvme nvme2: IDs don't match for shared namespace 2
kernel: [267011.810788][ T7876] nvme nvme1: IDs don't match for shared namespace 2
kernel: [267011.824600][ T2016] nvme nvme6: IDs don't match for shared namespace 3
kernel: [267011.825114][T46805] nvme nvme2: IDs don't match for shared namespace 3
kernel: [267011.830982][ T7876] nvme nvme1: IDs don't match for shared namespace 3
kernel: [267011.844712][ T2016] nvme nvme6: duplicate IDs in subsystem for nsid 4
kernel: [267011.845161][T46805] nvme nvme2: duplicate IDs in subsystem for nsid 4
kernel: [267011.851060][ T7876] nvme nvme1: duplicate IDs in subsystem for nsid 4
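
To see at a glance which controllers and NSIDs are hit, the two error messages
above can be collected from a log. This is a rough sketch, not an official
tool: the function name summarize_nvme_id_errors is hypothetical, the message
strings are taken verbatim from this report, and each kernel message is assumed
to sit on a single line, as it does in the raw log.

```python
import re
from collections import defaultdict

# Message formats copied from the kernel log excerpt in this report.
_PATTERNS = {
    "id_mismatch": re.compile(
        r"nvme (nvme\d+): IDs don't match for shared namespace (\d+)"
    ),
    "duplicate_ids": re.compile(
        r"nvme (nvme\d+): duplicate IDs in subsystem for nsid (\d+)"
    ),
}


def summarize_nvme_id_errors(lines):
    """Map error kind -> controller name -> sorted list of affected NSIDs."""
    found = {kind: defaultdict(set) for kind in _PATTERNS}
    for line in lines:
        for kind, pattern in _PATTERNS.items():
            match = pattern.search(line)
            if match:
                found[kind][match.group(1)].add(int(match.group(2)))
    return {
        kind: {ctrl: sorted(nsids) for ctrl, nsids in sorted(ctrls.items())}
        for kind, ctrls in found.items()
    }
```

Feeding it the lines of the messages file (or of dmesg output) yields one entry
per controller for each of the two error kinds.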

[Fix]
The following upstream commits are required:

  62baf70c3274 nvme: re-read ANA log page after ns scan completes
  9546ad1a9bda nvme: requeue namespace scan on missed AENs
  1f021341eef4 nvme-multipath: defer partition scanning
  3b97f5a05cfc nvme-multipath: avoid hang on inaccessible namespaces
  63bcf9014e95 nvme-multipath: system fails to create generic nvme device

$ git describe --contains 3b97f5a05cfc 63bcf9014e95 1f021341eef4 62baf70c3274 9546ad1a9bda
v6.12-rc1~47^2^2~2
v6.12-rc1~47^2^2~3
v6.12-rc4~20^2~1^2~4
v6.15-rc2~11^2~1^2~10
v6.15-rc2~11^2~1^2~11

The first three patches in the git describe output above (the v6.12 ones) are
already present in the 6.8-based kernels, and the two v6.15 follow-up commits
have already been sent to the upstream stable trees. Since 6.8 is not an
upstream stable tree, we should pick up those two follow-up patches for the
Ubuntu kernels as well.
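
The split between the v6.12 patches and the v6.15 follow-ups can be read off
the git describe output by stripping the ancestry suffix from each name. A
small sketch, assuming the output lines are in the same order as the commits on
the command line; the helper names containing_tag and group_by_tag are
hypothetical.

```python
import re


def containing_tag(described):
    """Strip the ~N / ^N ancestry suffix from a `git describe --contains` name."""
    return re.split(r"[~^]", described, maxsplit=1)[0]


def group_by_tag(commits, described_names):
    """Group commit IDs by the first tag that contains them."""
    if len(commits) != len(described_names):
        raise ValueError("need exactly one describe name per commit")
    groups = {}
    for commit, name in zip(commits, described_names):
        groups.setdefault(containing_tag(name), []).append(commit)
    return groups


# Values copied from the command and output in this report.
commits = ["3b97f5a05cfc", "63bcf9014e95", "1f021341eef4",
           "62baf70c3274", "9546ad1a9bda"]
described = ["v6.12-rc1~47^2^2~2", "v6.12-rc1~47^2^2~3",
             "v6.12-rc4~20^2~1^2~4", "v6.15-rc2~11^2~1^2~10",
             "v6.15-rc2~11^2~1^2~11"]
groups = group_by_tag(commits, described)
```

With the values from this report, groups holds three commits under the v6.12
release candidates and two under v6.15-rc2.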

[Test Case]


[Where Problems Could Occur]

** Affects: linux (Ubuntu)
     Importance: Medium
         Status: Fix Released

** Affects: linux (Ubuntu Noble)
     Importance: Medium
     Assignee: Heitor Alves de Siqueira (halves)
         Status: In Progress

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu)
       Status: New => Fix Released

** Changed in: linux (Ubuntu Noble)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Noble)
       Status: New => In Progress

** Changed in: linux (Ubuntu Noble)
     Assignee: (unassigned) => Heitor Alves de Siqueira (halves)

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

https://bugs.launchpad.net/bugs/2115209

Title:
  NVMe namespace ID mismatch on repeated map/unmap
