Okay, it seems that both CONFIG_HOTPLUG_PCI_ACPI and CONFIG_HOTPLUG_PCI
are enabled for this kernel:

grep "CONFIG_HOTPLUG_PCI_ACPI=" /boot/config-`uname -r`
CONFIG_HOTPLUG_PCI_ACPI=y
grep "CONFIG_HOTPLUG_PCI=" /boot/config-`uname -r`
CONFIG_HOTPLUG_PCI=y

But hotplug is still not working.
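For completeness, native PCIe hotplug (the pciehp driver) is governed by a separate option, CONFIG_HOTPLUG_PCI_PCIE, which may also be worth checking. A small read-only sketch covering all three — hotplug_config is just a hypothetical wrapper, and KCONFIG only exists so the check can be pointed at an arbitrary config file:

```shell
# Read-only sketch: grep all three hotplug-related options, including
# CONFIG_HOTPLUG_PCI_PCIE (pciehp, the native PCIe hotplug driver).
# KCONFIG is a hypothetical override; it defaults to the running kernel's config.
hotplug_config() {
    cfg="${KCONFIG:-/boot/config-$(uname -r)}"
    grep -E '^CONFIG_HOTPLUG_PCI(_PCIE|_ACPI)?=' "$cfg"
}
```

If CONFIG_HOTPLUG_PCI_PCIE is not set, the slots would only ever be handled by the ACPI path.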

The current situation: I unplugged a correctly recognized Intel NVMe
and swapped it for a Micron NVMe; a second Intel NVMe is still
installed.

nvme list shows the Intel NVMe that remained in the system:

nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     PHLN130100QJ1P6AGN   INTEL SSDPE2KE016T8                      1           1.60  TB /   1.60  TB    512   B +  0 B   VDV10184

But when I look into /sys/, I still find remnants of the other Intel
NVMe that was just unplugged:

find /sys/devices | egrep "nvme[0-9][0-9]?$"
/sys/devices/pci0000:40/0000:40:01.1/0000:41:00.0/nvme/nvme0
/sys/devices/virtual/nvme-subsystem/nvme-subsys1/nvme1
/sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0

Also, nvme list-subsys throws an error, presumably because the old
subsystem nvme-subsys1 no longer exists but is still referenced:

nvme list-subsys
free(): double free detected in tcache 2
Aborted

Googling the above error leads me to this bug report:
https://github.com/linux-nvme/nvme-cli/issues/1707
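Since the double free is in nvme-cli itself and the linked issue indicates it was fixed upstream, it may be worth checking which version is installed before blaming the kernel. nvme_cli_version below is just a hypothetical convenience wrapper:

```shell
# Hypothetical wrapper: report the installed nvme-cli version, or say so
# cleanly if the tool is missing, so the crash can be compared against
# the fixed release mentioned in the upstream issue.
nvme_cli_version() {
    if command -v nvme >/dev/null 2>&1; then
        nvme version
    else
        echo "nvme-cli not installed"
        return 1
    fi
}
```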

When rescanning the PCI bus via:

echo 1 > /sys/bus/pci/rescan

I get the following messages in dmesg:

[Wed Feb  8 10:20:49 2023] pci 0000:c3:00.0: PCI bridge to [bus c4]
[Wed Feb  8 10:20:49 2023] pci 0000:c3:00.0:   bridge window [io  0xf000-0xffff]
[Wed Feb  8 10:20:49 2023] pci 0000:c3:00.0:   bridge window [mem 0xb8000000-0xb90fffff]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.1: bridge window [io  0x1000-0x0fff] to [bus 41] add_size 1000
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.2: bridge window [io  0x1000-0x0fff] to [bus 42] add_size 1000
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.1: BAR 13: no space for [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.1: BAR 13: failed to assign [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.2: BAR 13: no space for [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.2: BAR 13: failed to assign [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.2: BAR 13: no space for [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.2: BAR 13: failed to assign [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.1: BAR 13: no space for [io  size 0x1000]
[Wed Feb  8 10:20:49 2023] pcieport 0000:40:01.1: BAR 13: failed to assign [io  size 0x1000]

as already stated in the original bug report.
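The "no space for [io  size 0x1000]" lines suggest the root ports have no I/O window left to hand out when the rescan tries to assign one. One experiment (not a confirmed fix for this bug) is to ask the kernel to reserve hotplug resources at boot via the documented pci= parameters; the sizes below are guesses and may need tuning:

```shell
# /etc/default/grub -- experiment: pre-reserve hotplug I/O and memory
# windows at boot. realloc, hpiosize and hpmemsize are documented pci=
# kernel parameters; the exact sizes here are assumptions.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=realloc,hpiosize=4K,hpmemsize=128M"
# then: sudo update-grub && sudo reboot
```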

How can I get rid of the now-defunct NVMe subsystem? Is there anything
else I could try?
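For reference, the sysfs remove-then-rescan sequence that is sometimes suggested for stale device state can be wrapped like this. pci_replug is a hypothetical helper, SYSFS_PCI exists only so the logic can be dry-run against a fake tree, and which address to pass (e.g. the downstream port 0000:40:01.1 from the logs above) depends on the actual topology:

```shell
# Hypothetical helper: detach a PCI node via its sysfs "remove" file,
# then trigger a full bus rescan. Needs root on a real system.
# SYSFS_PCI is only there so the logic can be exercised safely.
pci_replug() {
    root="${SYSFS_PCI:-/sys/bus/pci}"
    dev="$root/devices/$1"
    if [ -w "$dev/remove" ]; then
        echo 1 > "$dev/remove"   # detach the (stale) device node
        echo 1 > "$root/rescan"  # re-enumerate the bus
        echo "rescanned after removing $1"
    else
        echo "no writable remove node for $1" >&2
        return 1
    fi
}
# e.g. pci_replug 0000:40:01.1   # downstream port taken from the logs above
```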

** Bug watch added: github.com/linux-nvme/nvme-cli/issues #1707
   https://github.com/linux-nvme/nvme-cli/issues/1707

** Summary changed:

- pcie hotplug not working in linux-generic-hwe-18.04 5.4.0.107.121~18
+ pcie hotplug not working in linux-generic-hwe-18.04 5.4.0.135.152~18

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1998224

Title:
  pcie hotplug not working in linux-generic-hwe-18.04 5.4.0.135.152~18

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  situation: Ubuntu 18.04 server install on a Supermicro x86_64 host.

  hot-plug an NVMe SSD into an NVMe U.2 hot-swap slot.

  problem: hot plug does not work; the NVMe is not recognised.

  how to test:

  echo 1 > /sys/bus/pci/rescan

  dmesg output:

  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.1: bridge window [io  0x1000-0x0fff] to [bus 41] add_size 1000
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.2: bridge window [io  0x1000-0x0fff] to [bus 42] add_size 1000
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.1: BAR 13: no space for [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.1: BAR 13: failed to assign [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.2: BAR 13: no space for [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.2: BAR 13: failed to assign [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.2: BAR 13: no space for [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.2: BAR 13: failed to assign [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.1: BAR 13: no space for [io  size 0x1000]
  [Mon Nov 28 15:46:33 2022] pcieport 0000:40:01.1: BAR 13: failed to assign [io  size 0x1000]

  Kernel Version:

  5.4.0-107-generic #121~18.04.1-Ubuntu SMP Thu Mar 24 17:21:33 UTC 2022
  x86_64 x86_64 x86_64 GNU/Linux

  hardware information: tried with Micron and Intel NVMe drives, e.g.:

  INTEL SSDPE2KE016T8

  after a reboot, the NVMe is recognised, so there is no hardware
  problem.

  if you need additional debug information, feel free to ask.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1998224/+subscriptions

