Public bug reported:

I have a system containing two identical nvme devices.  When booting a
trusty PXE image with kernel 4.4.0-38-generic both devices are detected
and available:

# nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid     : 0x8086
ssvid   : 0x8086
sn      : BTHH82250N1X1P0E    
mn      : INTEL SSDPEKKF010T8L
fr      : L08P    
...

# nvme id-ctrl /dev/nvme1
NVME Identify Controller:
vid     : 0x8086
ssvid   : 0x8086
sn      : BTHH82250N261P0E    
mn      : INTEL SSDPEKKF010T8L
fr      : L08P    
...


# dmesg | grep nvme
[    5.106516]  nvme0n1: p1 p2 p3 p4
[    5.106615]  nvme1n1: p1 p2


After booting a bionic PXE image based on 4.15.0-38-generic, only the
first nvme device is enabled. The second is detected but disabled
because both devices report the same nqn:

nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2017-12.org.nvmeexpress:uuid:11111111-2222-3333-4444-555555555555).
nvme nvme1: Removing after probe failure status: -22
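
The rejection in the log can be illustrated with a small sketch. This is
not the kernel's code: the Controller type and probe_all helper are
hypothetical, and the model simply assumes that the first controller to
register an nqn wins and that any later controller reporting the same nqn
is refused with -EINVAL (errno 22, which matches the -22 status above).

```python
import errno
from dataclasses import dataclass


@dataclass
class Controller:
    # Hypothetical container for the fields quoted in this report.
    name: str
    serial: str
    subnqn: str


def probe_all(controllers):
    """Return {name: status}: 0 on success, -EINVAL for a duplicate subnqn."""
    seen = {}      # subnqn -> name of the controller that registered it first
    status = {}
    for ctrl in controllers:
        if ctrl.subnqn in seen:
            # Same nqn as an earlier controller: refuse this one.
            status[ctrl.name] = -errno.EINVAL
        else:
            seen[ctrl.subnqn] = ctrl.name
            status[ctrl.name] = 0
    return status


# The nqn both devices report, copied from the kernel log.
NQN = "nqn.2017-12.org.nvmeexpress:uuid:11111111-2222-3333-4444-555555555555"

result = probe_all([
    Controller("nvme0", "BTHH82250N1X1P0E", NQN),
    Controller("nvme1", "BTHH82250N261P0E", NQN),
])
```

Under these assumptions nvme0 probes successfully and nvme1 is rejected,
even though the two serial numbers differ.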


The nqn string comes from the device firmware rather than being generated by
Linux, but there does not seem to be an operation in nvme-cli to change it.
(It is also questionable whether the firmware value is correct according to
section 7.9 of
https://nvmexpress.org/wp-content/uploads/NVM-Express-1_3a-20171024_ratified.pdf.
My reading of the specification is that the string should start with
nqn.2014-08.org.nvmexpress:uuid: followed by a random UUID, and I assume a
random UUID per device.)
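
As a rough check of that reading, the sketch below tests a string against
the UUID-based form. It assumes the prefix as the specification spells it
(nqn.2014-08.org.nvmexpress:uuid:, with "nvmexpress" as in the nvmexpress.org
domain) followed by a UUID in the usual 8-4-4-4-12 hexadecimal layout. The
helper name is_spec_uuid_nqn is hypothetical, and the check does not cover
the other nqn form the specification allows.

```python
import re

# Assumed UUID-based nqn layout: fixed prefix plus an 8-4-4-4-12 hex UUID.
UUID_NQN = re.compile(
    r"nqn\.2014-08\.org\.nvmexpress:uuid:"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_spec_uuid_nqn(nqn):
    """True if the whole string matches the assumed UUID-based nqn layout."""
    return UUID_NQN.fullmatch(nqn) is not None


# The value reported by the device firmware, copied from the kernel log.
reported = "nqn.2017-12.org.nvmeexpress:uuid:11111111-2222-3333-4444-555555555555"
reported_ok = is_spec_uuid_nqn(reported)
```

With this pattern the firmware value does not match, because both the date
field and the domain differ from the assumed prefix.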

The Windows 10 installation provided on the system did not have any
problems operating with both devices.

The kernel nvme driver history suggests that 4.4 did not check or
validate the nqn, but now that it does there is a problem.

Our typical installation is a zpool mirror across two devices, and this
is preventing us from moving from trusty to bionic.

This is a report of a similar issue:
https://ask.fedoraproject.org/en/question/128422/one-of-two-identical-m2-nvme-drives-disabling-due-to-same-nqn/

It may be worth noting that if the nvme device does not provide an nqn,
one seems to be generated based on the device serial number, so a
system with two Samsung MZVLB256HAHQ devices works fine.
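
A minimal sketch of why a generated name avoids the clash, assuming the
fallback concatenates the vendor IDs, serial number and model number. The
fallback_nqn helper and its format string are assumptions made for
illustration, and the kernel's actual layout may differ. The inputs are the
identify values quoted earlier in this report, used as if the firmware had
supplied no nqn.

```python
def fallback_nqn(vid, ssvid, serial, model):
    """Build a generated nqn from identify data (assumed format, see above)."""
    return "nqn.2014.08.org.nvmexpress:%04x%04x%s%s" % (vid, ssvid, serial, model)


# Identify data for the two devices, taken from the nvme id-ctrl output.
nqn0 = fallback_nqn(0x8086, 0x8086, "BTHH82250N1X1P0E", "INTEL SSDPEKKF010T8L")
nqn1 = fallback_nqn(0x8086, 0x8086, "BTHH82250N261P0E", "INTEL SSDPEKKF010T8L")

# Identical model, different serial numbers: the generated names differ.
names_differ = nqn0 != nqn1
```

Because the serial number is part of the generated string, two otherwise
identical devices end up with distinct names.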

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete


** Tags: xenial

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1803692

Title:
  bionic 4.15 nvme regression from trusty 4.4 with two identical devices

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1803692/+subscriptions

