Public bug reported:

[Impact]
On Dell systems (CID: 202501-36199), when two Thunderbolt (TBT) storage
devices are connected (both through a Thunderbolt dock, one through a dock
and one through a daisy-chained TBT monitor, or both directly to the
system), the second TBT storage device is not recognized. Unplugging and
re-plugging a TBT storage device also leaves that device undetected. The
issue is specific to TBT storage devices; TBT monitors are not affected.

The kernel log shows spurious hotplug events during tunnel activation,
which prevent pciehp from detecting the second device:

  thunderbolt: acking hot unplug event on 702:2
  thunderbolt: PCIe Up path activation complete
  thunderbolt: hotplug event for upstream port 702:2 (unplug: 0)
  thunderbolt: hotplug event for upstream port 702:2 (unplug: 1)

The fix is to schedule a delayed pci_rescan_bus() after tunnel activation.
However, on the current kernel, pci_rescan_bus() alone does not bring the
missing device back because pci_enable_resources() refuses to enable a PCI
bridge when any of its bridge window resources are unassigned. Since not
all bridge windows are always needed, this incorrectly blocks the bridge
from being enabled, and downstream devices behind it remain inaccessible.
A set of upstream PCI resource handling fixes is required to make the
rescan path work correctly.
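
The bridge enablement decision described above can be illustrated with a
small user-space model. This is only a sketch, not kernel code: the
Resource class, the can_enable() helper and the example window list are
hypothetical names made up for illustration, and the assumption that
unassigned regular BARs still block enablement is ours, not something
stated in this report.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for one PCI resource of a device."""
    name: str
    is_bridge_window: bool
    assigned: bool


def can_enable(resources, tolerate_unassigned_windows):
    """Return True if the modelled device may be enabled.

    tolerate_unassigned_windows=False models the behaviour described
    above: any unassigned resource blocks enablement.
    tolerate_unassigned_windows=True models the fixed behaviour: an
    unassigned bridge window is skipped (assumption: other unassigned
    resources still block enablement).
    """
    for res in resources:
        if res.assigned:
            continue
        if tolerate_unassigned_windows and res.is_bridge_window:
            continue
        return False
    return True


# Example bridge: two windows assigned, the I/O window left unassigned.
bridge = [
    Resource("mem window", True, True),
    Resource("prefetchable mem window", True, True),
    Resource("io window", True, False),
]

print(can_enable(bridge, tolerate_unassigned_windows=False))
print(can_enable(bridge, tolerate_unassigned_windows=True))
```

In this model the bridge with an unused, unassigned I/O window is rejected
under the old rule and accepted under the new one, which is why devices
behind it become reachable after a rescan.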

Affected hardware: Dell systems (CID: 202501-36199)
Failure rate: random, high on affected hardware

[Fix]
Two groups of patches work together to fix this issue:

1. PCI bridge window resource handling fixes (upstream in v6.18):

   A series of 6 patches from Ilpo Järvinen that fix the PCI resource
   assignment and bridge enablement path. The critical change is allowing
   bridges to be enabled even when not all bridge window resources are
   assigned, since not all windows are always needed. Without these fixes,
   pci_rescan_bus() discovers the missing device but cannot enable its
   parent bridge, so the device remains inaccessible. The remaining
   patches are prerequisites that ensure bridge window resource flags are
   properly preserved and managed throughout the lifecycle.

   Upstream commits (all in v6.18):
   2ee33aa14d3f PCI: Always claim bridge window before its setup
   b15f45ab65e2 PCI: Disable non-claimed bridge window
   3baeae36039a PCI: Use pci_release_resource() instead of release_resource()
   1cdffa51ecc4 PCI: Enable bridge even if bridge window fails to assign
   ff77c5219747 PCI: Fix pdev_resources_assignable() disparity
   8278c6914306 PCI: Preserve bridge window resource type flags

2. Thunderbolt PCIe enumeration fix (SAUCE patch, under review upstream):

   Schedule a delayed pci_rescan_bus() (300ms) after tunnel activation to
   catch devices that pciehp missed due to spurious hotplug events. Since
   pci_rescan_bus() is idempotent, it is safe to call unconditionally.

   Patch:
   https://lore.kernel.org/lkml/[email protected]/T/#u
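
To check by hand whether a delayed rescan recovers a missed device, the
idea can be approximated from user space with the standard sysfs trigger
/sys/bus/pci/rescan. The sketch below is not the SAUCE patch: the
rescan_after() helper is a hypothetical name, the sysfs write rescans all
PCI buses rather than only the Thunderbolt hierarchy, and it must run as
root. The 300 ms default mirrors the delay mentioned above.

```python
import time

# Standard sysfs trigger; writing "1" asks the kernel to rescan all PCI buses.
PCI_RESCAN_PATH = "/sys/bus/pci/rescan"


def rescan_after(delay_s=0.3, path=PCI_RESCAN_PATH):
    """Wait delay_s seconds, then write "1" to the rescan trigger.

    The path parameter exists only so the helper can be pointed at a
    different file, for example when trying it out without root.
    """
    time.sleep(delay_s)
    with open(path, "w") as trigger:
        trigger.write("1")
```

Calling rescan_after() as root shortly after plugging in the second
device mimics the timing of the fix. On a kernel without the PCI
resource handling fixes, the description above suggests the rescan alone
will not make the device usable.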

[Test Plan]
1. Connect a TBT storage device to the Thunderbolt dock
2. Connect the dock to the system and boot into the OS
3. Verify that the first TBT storage device is recognized:
   $ lsblk
4. Plug a second TBT storage device into another TBT port on the dock
5. Check that the second TBT storage device is recognized:
   $ lsblk
6. Unplug and re-plug one of the TBT storage devices
7. Check that the re-plugged storage device is recognized:
   $ lsblk
8. Repeat steps 4-7 at least 10 times

Without the patches: the second or re-plugged TBT storage device is not
detected
With the patches: both TBT storage devices are recognized reliably
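
Because the steps are repeated at least 10 times, the lsblk check can be
scripted instead of read by eye. The following is a minimal sketch that
parses the JSON output of lsblk; the helper names and the device names in
SAMPLE (nvme0n1, nvme1n1) are hypothetical and need to be replaced with
the names the storage devices actually get on the test system.

```python
import json
import subprocess


def read_lsblk():
    """Return lsblk output as JSON text, top-level devices only."""
    result = subprocess.run(
        ["lsblk", "-J", "-d", "-o", "NAME,TYPE"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def disk_names(lsblk_json):
    """Return the sorted names of all devices of type "disk"."""
    data = json.loads(lsblk_json)
    return sorted(
        dev["name"]
        for dev in data.get("blockdevices", [])
        if dev.get("type") == "disk"
    )


def missing_disks(lsblk_json, expected):
    """Return the expected disk names that lsblk did not report."""
    return sorted(set(expected) - set(disk_names(lsblk_json)))


# Hypothetical sample: only the first of two expected disks is present.
SAMPLE = '{"blockdevices": [{"name": "nvme0n1", "type": "disk"}]}'

print(missing_disks(SAMPLE, ["nvme0n1", "nvme1n1"]))
```

On the test system, missing_disks(read_lsblk(), [...]) returning an empty
list after steps 5 and 7 corresponds to a pass for that iteration.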

[Where problems could occur]
The PCI bridge window resource handling changes are a significant rework
touching core PCI subsystem code (setup-bus.c, setup-res.c, probe.c).
These changes alter how bridge window resources track their type flags and
how bridge enablement decisions are made. If the new IORESOURCE_UNSET /
IORESOURCE_DISABLED flag semantics are not handled correctly in all code
paths, PCI devices behind bridges could fail to be assigned resources or
bridges might not be enabled, resulting in devices not being detected at
boot or after hotplug.

The Thunderbolt SAUCE patch adds an unconditional delayed pci_rescan_bus()
after tunnel activation. While pci_rescan_bus() is idempotent, if the
timing interacts poorly with pciehp's own enumeration on some hardware
configurations, it could theoretically cause duplicate enumeration attempts
or lock contention between pci_lock_rescan_remove() and pciehp's own
locking.

** Affects: linux-oem-6.17 (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux-oem-6.17 (Ubuntu Noble)
     Importance: Undecided
     Assignee: AceLan Kao (acelankao)
         Status: In Progress

** Also affects: linux-oem-6.17 (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux-oem-6.17 (Ubuntu)
       Status: In Progress => Invalid

** Changed in: linux-oem-6.17 (Ubuntu Noble)
       Status: New => In Progress

** Changed in: linux-oem-6.17 (Ubuntu Noble)
     Assignee: (unassigned) => AceLan Kao (acelankao)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2139572

Title:
  The second tbt storage plugged on the dock will not be recognized

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-oem-6.17/+bug/2139572/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
