The XHCI controller presents two USB buses to the system - one for USB2
and one for USB3. The hub init code (hub_port_init) is reentrant but
only locks one bus per thread, leading to a race condition failure when
two threads attempt to simultaneously initialise a USB2 and USB3 device:
[8.034843] xhci_hcd :00:14.0: Timeout while waiting for setup device
command
[ 13.183701] usb 3-3: device descriptor read/all, error -110
On a test system this failure occurred on 6% of all boots.
The call traces at the point of failure are:
Call Trace:
[] schedule+0x37/0x90
[] usb_kill_urb+0x8d/0xd0
[] ? wake_up_atomic_t+0x30/0x30
[] usb_start_wait_urb+0xbe/0x150
[] usb_control_msg+0xbc/0xf0
[] hub_port_init+0x51e/0xb70
[] hub_event+0x817/0x1570
[] process_one_work+0x1ff/0x620
[] ? process_one_work+0x15f/0x620
[] worker_thread+0x64/0x4b0
[] ? rescuer_thread+0x390/0x390
[] kthread+0x105/0x120
[] ? kthread_create_on_node+0x200/0x200
[] ret_from_fork+0x3f/0x70
[] ? kthread_create_on_node+0x200/0x200
Call Trace:
[] xhci_setup_device+0x53d/0xa40
[] xhci_address_device+0xe/0x10
[] hub_port_init+0x1bf/0xb70
[] ? trace_hardirqs_on+0xd/0x10
[] hub_event+0x817/0x1570
[] process_one_work+0x1ff/0x620
[] ? process_one_work+0x15f/0x620
[] worker_thread+0x64/0x4b0
[] ? rescuer_thread+0x390/0x390
[] kthread+0x105/0x120
[] ? kthread_create_on_node+0x200/0x200
[] ret_from_fork+0x3f/0x70
[] ? kthread_create_on_node+0x200/0x200
Which results from the two call chains:
hub_port_init
usb_get_device_descriptor
usb_get_descriptor
usb_control_msg
usb_internal_control_msg
usb_start_wait_urb
usb_submit_urb / wait_for_completion_timeout / usb_kill_urb
hub_port_init
hub_set_address
xhci_address_device
xhci_setup_device
Mathias Nyman explains the current behaviour violates the XHCI spec:
hub_port_reset() will end up moving the corresponding xhci device slot
to default state.
As hub_port_reset() is called several times in hub_port_init() it
sounds reasonable that we could end up with two threads having their
xhci device slots in default state at the same time, which according to
xhci 4.5.3 specs still is a big no no:
"Note: Software shall not transition more than one Device Slot to the
Default State at a time"
So both threads fail at their next task after this.
One fails to read the descriptor, and the other fails addressing the
device.
Fix this in hub_port_init by locking the USB controller (instead of an
individual bus) to prevent simultaneous initialisation of both buses.
Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in
parallel")
Link: https://lkml.org/lkml/2016/2/8/312
Link: https://lkml.org/lkml/2016/2/4/748
Signed-off-by: Chris Bainbridge
---
This is technically the same as the last patch version, but with a reworked
commit message to better explain the cause of the issue.
---
drivers/usb/core/hcd.c | 15 +--
drivers/usb/core/hub.c | 8
include/linux/usb.h | 3 +--
include/linux/usb/hcd.h | 1 +
4 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 2ca2cef7f681..980fc5774151 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -994,7 +994,7 @@ static void usb_bus_init (struct usb_bus *bus)
bus->bandwidth_allocated = 0;
bus->bandwidth_int_reqs = 0;
bus->bandwidth_isoc_reqs = 0;
- mutex_init(&bus->usb_address0_mutex);
+ mutex_init(&bus->devnum_next_mutex);
}
/*-*/
@@ -2521,6 +2521,14 @@ struct usb_hcd *usb_create_shared_hcd(const struct
hc_driver *driver,
return NULL;
}
if (primary_hcd == NULL) {
+ hcd->address0_mutex = kmalloc(sizeof(*hcd->address0_mutex),
+ GFP_KERNEL);
+ if (!hcd->address0_mutex) {
+ kfree(hcd);
+ dev_dbg(dev, "hcd address0 mutex alloc failed\n");
+ return NULL;
+ }
+ mutex_init(hcd->address0_mutex);
hcd->bandwidth_mutex = kmalloc(sizeof(*hcd->bandwidth_mutex),
GFP_KERNEL);
if (!hcd->bandwidth_mutex) {
@@ -2532,6 +2540,7 @@ struct usb_hcd *usb_create_shared_hcd(const struct
hc_driver *driver,
dev_set_drvdata(dev, hcd);
} else {
mutex_lock(&usb_port_peer_mutex);
+ hcd->address0_mutex = primary_hcd->address0_mutex;
hcd->bandwidth_mutex = primary_hcd->bandwidth_mutex;
hcd->primary_hcd = primary_hcd;
primary_hcd->primary_hcd = primary_hcd;
@@ -2598,8 +2607,10 @@ static void hcd_release(struct kref *kref)
struct usb_hcd *hcd = container_of (kref, struct usb_hcd, kref);
mutex_lock(&usb_port_peer_mutex);
- if (usb_hcd_is_primary_hcd(hc