Re: usb-storage URB use-after-free
On Thu, 29 Jan 2015 11:42:18 -0500 Alan Stern wrote: > On Wed, 28 Jan 2015, Joe Lawrence wrote: > > > This one should have gone over to linux-usb. > > > > -- Joe > > > > On 01/28/2015 05:04 PM, Joe Lawrence wrote: > > > Hello linux-usb, > > > > > > We've hit a USB use-after-free on Stratus HW during device removal tests. > > > We're running fio disk I/O to a scsi disk hanging off USB when the USB > > > controller is hotplug removed. This crash is very consistent (usually the > > > first device pull during testing). Without I/O, it may take days to > > > encounter. > > > > > > general protection fault: [#1] SMP > > > ... > > > CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF W > > > O-- 3.10.0-210.el7.dev02.x86_64 #1 > > > Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 > > > 11/12/2014 > > > Workqueue: scsi_tmf_872 scmd_eh_abort_handler > > > task: 88031bd91960 ti: 880981318000 task.ti: 880981318000 > > > RIP: 0010:[] [] kobject_put+0xd/0x60 > > > RSP: 0018:88098131bd28 EFLAGS: 00010002 > > > RAX: RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009 > > > RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b > > > RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008 > > > R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248 > > > R13: ff98 R14: 0086 R15: 0c20 > > > FS: () GS:88085fce() > > > knlGS: > > > CS: 0010 DS: ES: CR0: 80050033 > > > CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0 > > > DR0: DR1: DR2: > > > DR3: DR6: 0ff0 DR7: 0400 > > > Stack: > > > 88098131bd40 813cd327 88098131bd50 > > > 8140a65a 88098131bd78 81416795 > > > 880414791968 880414791978 88098131bd88 814175f6 > > > Call Trace: > > > [] put_device+0x17/0x20 > > > [] usb_put_dev+0x1a/0x20 > > > [] usb_hcd_unlink_urb+0x65/0xe0 > > > [] usb_unlink_urb+0x26/0x50 > > > [] usb_sg_cancel+0x82/0xe0 > > > [] usb_stor_stop_transport+0x31/0x50 [usb_storage] > > > [] command_abort+0xcd/0xe0 [usb_storage] > > > [] scmd_eh_abort_handler+0xbf/0x480 > > > [] process_one_work+0x17b/0x470 > > > [] worker_thread+0x11b/0x400 > > > [] ? rescuer_thread+0x400/0x400 > > > [] kthread+0xcf/0xe0 > > > [] ? kthread_create_on_node+0x140/0x140 > > > [] ret_from_fork+0x7c/0xb0 > > > [] ? kthread_create_on_node+0x140/0x140 > > > > > > from another crash, we know that the URB itself has been freed: > > > > but the underlying usb_device->device is still present: > > > > It looks like usb_hcd_unlink_urb takes a reference out on the underlying > > > device but not the URB, which in our test case consistently gets freed > > > just before usb_hcd_unlink_urb tries to return a reference on > > > urb->device. > > The analysis looks correct. > > > > Maybe the URB is a reference count short when usb_hcd_unlink_urb calls > > > unlink1? Somewhere usb-storage missed taking out a ref? > > No; it's a simple use-after-free error. > > > > A tourniquet hack-patch follows (probably inside out, as the URB should > > > stay > > > intact while usb_hcd_unlink_urb does its thing) and has been running a > > > little over 30 surprise device removals under I/O without incident. > > > > > > Any better suggestions? > > Please try this instead. > > Alan Stern > > > > Index: usb-3.19/drivers/usb/core/hcd.c > === > --- usb-3.19.orig/drivers/usb/core/hcd.c > +++ usb-3.19/drivers/usb/core/hcd.c > @@ -1618,6 +1618,7 @@ static int unlink1(struct usb_hcd *hcd, > int usb_hcd_unlink_urb (struct urb *urb, int status) > { > struct usb_hcd *hcd; > + struct usb_device *udev = urb->dev; > int retval = -EIDRM; > unsigned long flags; > > @@ -1629,20 +1630,19 @@ int usb_hcd_unlink_urb (struct urb *urb, > spin_lock_irqsave(&hcd_urb_unlink_lock, flags); > if (atomic_read(&urb->use_count) > 0) { > retval = 0; > - usb_get_dev(urb->dev); > + usb_get_dev(udev); > } > spin_unlock_irqrestore(&hcd_urb_unlink_lock, flags); > if (retval == 0) { > hcd = bus_to_hcd(urb->dev->bus); > retval = unlink1(hcd, urb, status); > - usb_put_dev(urb->dev); > + if (retval == 0) > + retval = -EINPROGRESS; > + else if (retval != -EIDRM && retval != -EBUSY) > + dev_dbg(&udev->dev, "hcd_unlink_urb %p fail %d\n", > + urb, retval); > + usb_put_dev(udev); > } > - > - if (retval == 0) > - retval = -EINPROGRESS; > - els
Re: usb-storage URB use-after-free
On Wed, 28 Jan 2015, Joe Lawrence wrote: > This one should have gone over to linux-usb. > > -- Joe > > On 01/28/2015 05:04 PM, Joe Lawrence wrote: > > Hello linux-usb, > > > > We've hit a USB use-after-free on Stratus HW during device removal tests. > > We're running fio disk I/O to a scsi disk hanging off USB when the USB > > controller is hotplug removed. This crash is very consistent (usually the > > first device pull during testing). Without I/O, it may take days to > > encounter. > > > > general protection fault: [#1] SMP > > ... > > CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF W O-- > > 3.10.0-210.el7.dev02.x86_64 #1 > > Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 > > 11/12/2014 > > Workqueue: scsi_tmf_872 scmd_eh_abort_handler > > task: 88031bd91960 ti: 880981318000 task.ti: 880981318000 > > RIP: 0010:[] [] kobject_put+0xd/0x60 > > RSP: 0018:88098131bd28 EFLAGS: 00010002 > > RAX: RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009 > > RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b > > RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008 > > R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248 > > R13: ff98 R14: 0086 R15: 0c20 > > FS: () GS:88085fce() knlGS: > > CS: 0010 DS: ES: CR0: 80050033 > > CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0 > > DR0: DR1: DR2: > > DR3: DR6: 0ff0 DR7: 0400 > > Stack: > > 88098131bd40 813cd327 88098131bd50 > > 8140a65a 88098131bd78 81416795 > > 880414791968 880414791978 88098131bd88 814175f6 > > Call Trace: > > [] put_device+0x17/0x20 > > [] usb_put_dev+0x1a/0x20 > > [] usb_hcd_unlink_urb+0x65/0xe0 > > [] usb_unlink_urb+0x26/0x50 > > [] usb_sg_cancel+0x82/0xe0 > > [] usb_stor_stop_transport+0x31/0x50 [usb_storage] > > [] command_abort+0xcd/0xe0 [usb_storage] > > [] scmd_eh_abort_handler+0xbf/0x480 > > [] process_one_work+0x17b/0x470 > > [] worker_thread+0x11b/0x400 > > [] ? rescuer_thread+0x400/0x400 > > [] kthread+0xcf/0xe0 > > [] ? kthread_create_on_node+0x140/0x140 > > [] ret_from_fork+0x7c/0xb0 > > [] ? kthread_create_on_node+0x140/0x140 > > > > from another crash, we know that the URB itself has been freed: > > but the underlying usb_device->device is still present: > > It looks like usb_hcd_unlink_urb takes a reference out on the underlying > > device but not the URB, which in our test case consistently gets freed > > just before usb_hcd_unlink_urb tries to return a reference on urb->device. The analysis looks correct. > > Maybe the URB is a reference count short when usb_hcd_unlink_urb calls > > unlink1? Somewhere usb-storage missed taking out a ref? No; it's a simple use-after-free error. > > A tourniquet hack-patch follows (probably inside out, as the URB should stay > > intact while usb_hcd_unlink_urb does its thing) and has been running a > > little over 30 surprise device removals under I/O without incident. > > > > Any better suggestions? Please try this instead. Alan Stern Index: usb-3.19/drivers/usb/core/hcd.c === --- usb-3.19.orig/drivers/usb/core/hcd.c +++ usb-3.19/drivers/usb/core/hcd.c @@ -1618,6 +1618,7 @@ static int unlink1(struct usb_hcd *hcd, int usb_hcd_unlink_urb (struct urb *urb, int status) { struct usb_hcd *hcd; + struct usb_device *udev = urb->dev; int retval = -EIDRM; unsigned long flags; @@ -1629,20 +1630,19 @@ int usb_hcd_unlink_urb (struct urb *urb, spin_lock_irqsave(&hcd_urb_unlink_lock, flags); if (atomic_read(&urb->use_count) > 0) { retval = 0; - usb_get_dev(urb->dev); + usb_get_dev(udev); } spin_unlock_irqrestore(&hcd_urb_unlink_lock, flags); if (retval == 0) { hcd = bus_to_hcd(urb->dev->bus); retval = unlink1(hcd, urb, status); - usb_put_dev(urb->dev); + if (retval == 0) + retval = -EINPROGRESS; + else if (retval != -EIDRM && retval != -EBUSY) + dev_dbg(&udev->dev, "hcd_unlink_urb %p fail %d\n", + urb, retval); + usb_put_dev(udev); } - - if (retval == 0) - retval = -EINPROGRESS; - else if (retval != -EIDRM && retval != -EBUSY) - dev_dbg(&urb->dev->dev, "hcd_unlink_urb %p fail %d\n", - urb, retval); return retval; } -- To unsubscribe from this list: send the line "unsubscribe linux-scsi"
Re: usb-storage URB use-after-free
This one should have gone over to linux-usb. -- Joe On 01/28/2015 05:04 PM, Joe Lawrence wrote: > Hello linux-usb, > > We've hit a USB use-after-free on Stratus HW during device removal tests. > We're running fio disk I/O to a scsi disk hanging off USB when the USB > controller is hotplug removed. This crash is very consistent (usually the > first device pull during testing). Without I/O, it may take days to > encounter. > > general protection fault: [#1] SMP > ... > CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF W O-- > 3.10.0-210.el7.dev02.x86_64 #1 > Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 > 11/12/2014 > Workqueue: scsi_tmf_872 scmd_eh_abort_handler > task: 88031bd91960 ti: 880981318000 task.ti: 880981318000 > RIP: 0010:[] [] kobject_put+0xd/0x60 > RSP: 0018:88098131bd28 EFLAGS: 00010002 > RAX: RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009 > RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b > RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008 > R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248 > R13: ff98 R14: 0086 R15: 0c20 > FS: () GS:88085fce() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0 > DR0: DR1: DR2: > DR3: DR6: 0ff0 DR7: 0400 > Stack: > 88098131bd40 813cd327 88098131bd50 > 8140a65a 88098131bd78 81416795 > 880414791968 880414791978 88098131bd88 814175f6 > Call Trace: > [] put_device+0x17/0x20 > [] usb_put_dev+0x1a/0x20 > [] usb_hcd_unlink_urb+0x65/0xe0 > [] usb_unlink_urb+0x26/0x50 > [] usb_sg_cancel+0x82/0xe0 > [] usb_stor_stop_transport+0x31/0x50 [usb_storage] > [] command_abort+0xcd/0xe0 [usb_storage] > [] scmd_eh_abort_handler+0xbf/0x480 > [] process_one_work+0x17b/0x470 > [] worker_thread+0x11b/0x400 > [] ? rescuer_thread+0x400/0x400 > [] kthread+0xcf/0xe0 > [] ? kthread_create_on_node+0x140/0x140 > [] ret_from_fork+0x7c/0xb0 > [] ? kthread_create_on_node+0x140/0x140 > > from another crash, we know that the URB itself has been freed: > > crash> struct -o urb | grep SIZE > SIZE: 0xc0 > crash> p 0xc0/8 > $2 = 0x18 > crash> rd 880846df9248 0x18 > 880846df9248: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df9258: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df9268: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df9278: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df9288: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df9298: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df92a8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df92b8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df92c8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df92d8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df92e8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b > 880846df92f8: 6b6b6b6b6b6b6b6b a56b6b6b6b6b6b6b kkk. > > but the underlying usb_device->device is still present: > > crash> struct usb_device 88084fd22e68 > ... > dev = { > parent = 0x881050c3d418, > p = 0x88084ff27ae8, > kobj = { > name = 0x88084ff071e0 "6-1", > entry = { > next = 0x88084ff09568, > prev = 0x88084ff09000 > }, > parent = 0x881050c3d428, > kset = 0x8808540dc9f0, > ktype = 0x819cc220 , > sd = 0x88084ff1b8b8, > kref = { > refcount = { > counter = 0x8 > } > }, > state_initialized = 0x1, > state_in_sysfs = 0x1, > state_add_uevent_sent = 0x1, > state_remove_uevent_sent = 0x0, > uevent_suppress = 0x0 > }, > ... > > I added trace printing to usb_{alloc_urb,get_urb,free_urb} and urb_destroy > showing the URB and the calling function, then crashed again: > > ... snip ... > usb-storage-687 [000] 350.013212: bprint: > usb_alloc_urb : JL: usb_sg_init+0xe1 -> usb_alloc_urb(0x8808489b0a28) > usb-storage-687 [000] 350.013212: bprint: usb_get_urb > : JL: usb_hcd_submit_urb+0x3b -> usb_get_urb(0x8808489b0a28) >kworker/u97:2-703 [006] 380.656791: bprint: usb_free_urb > : JL: __usb_hcd_giveback_urb -> usb_free_urb(0x8808489b0a28) > usb-storage-687 [000] 380.656793: bprint: usb_free_urb > : JL: sg_clean+0x33 -> usb_free_urb(0x8808489b0a28) > usb-storage-687 [000] 380.656794: bprint: urb_destroy
usb-storage URB use-after-free
Hello linux-usb, We've hit a USB use-after-free on Stratus HW during device removal tests. We're running fio disk I/O to a scsi disk hanging off USB when the USB controller is hotplug removed. This crash is very consistent (usually the first device pull during testing). Without I/O, it may take days to encounter. general protection fault: [#1] SMP ... CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF W O-- 3.10.0-210.el7.dev02.x86_64 #1 Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 11/12/2014 Workqueue: scsi_tmf_872 scmd_eh_abort_handler task: 88031bd91960 ti: 880981318000 task.ti: 880981318000 RIP: 0010:[] [] kobject_put+0xd/0x60 RSP: 0018:88098131bd28 EFLAGS: 00010002 RAX: RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009 RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008 R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248 R13: ff98 R14: 0086 R15: 0c20 FS: () GS:88085fce() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Stack: 88098131bd40 813cd327 88098131bd50 8140a65a 88098131bd78 81416795 880414791968 880414791978 88098131bd88 814175f6 Call Trace: [] put_device+0x17/0x20 [] usb_put_dev+0x1a/0x20 [] usb_hcd_unlink_urb+0x65/0xe0 [] usb_unlink_urb+0x26/0x50 [] usb_sg_cancel+0x82/0xe0 [] usb_stor_stop_transport+0x31/0x50 [usb_storage] [] command_abort+0xcd/0xe0 [usb_storage] [] scmd_eh_abort_handler+0xbf/0x480 [] process_one_work+0x17b/0x470 [] worker_thread+0x11b/0x400 [] ? rescuer_thread+0x400/0x400 [] kthread+0xcf/0xe0 [] ? kthread_create_on_node+0x140/0x140 [] ret_from_fork+0x7c/0xb0 [] ? kthread_create_on_node+0x140/0x140 from another crash, we know that the URB itself has been freed: crash> struct -o urb | grep SIZE SIZE: 0xc0 crash> p 0xc0/8 $2 = 0x18 crash> rd 880846df9248 0x18 880846df9248: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df9258: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df9268: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df9278: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df9288: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df9298: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df92a8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df92b8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df92c8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df92d8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df92e8: 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 880846df92f8: 6b6b6b6b6b6b6b6b a56b6b6b6b6b6b6b kkk. but the underlying usb_device->device is still present: crash> struct usb_device 88084fd22e68 ... dev = { parent = 0x881050c3d418, p = 0x88084ff27ae8, kobj = { name = 0x88084ff071e0 "6-1", entry = { next = 0x88084ff09568, prev = 0x88084ff09000 }, parent = 0x881050c3d428, kset = 0x8808540dc9f0, ktype = 0x819cc220 , sd = 0x88084ff1b8b8, kref = { refcount = { counter = 0x8 } }, state_initialized = 0x1, state_in_sysfs = 0x1, state_add_uevent_sent = 0x1, state_remove_uevent_sent = 0x0, uevent_suppress = 0x0 }, ... I added trace printing to usb_{alloc_urb,get_urb,free_urb} and urb_destroy showing the URB and the calling function, then crashed again: ... snip ... usb-storage-687 [000] 350.013212: bprint: usb_alloc_urb : JL: usb_sg_init+0xe1 -> usb_alloc_urb(0x8808489b0a28) usb-storage-687 [000] 350.013212: bprint: usb_get_urb : JL: usb_hcd_submit_urb+0x3b -> usb_get_urb(0x8808489b0a28) kworker/u97:2-703 [006] 380.656791: bprint: usb_free_urb : JL: __usb_hcd_giveback_urb -> usb_free_urb(0x8808489b0a28) usb-storage-687 [000] 380.656793: bprint: usb_free_urb : JL: sg_clean+0x33 -> usb_free_urb(0x8808489b0a28) usb-storage-687 [000] 380.656794: bprint: urb_destroy : JL: usb_free_urb+0x40 -> urb_destroy(0x8808489b0a28) crash> bt 703 PID: 703TASK: 8808497f8000 CPU: 6 COMMAND: "kworker/u97:2" #0 [8808497f7c10] die at 8101736b #1 [8808497f7c40] do_general_protection at 8160be4e #2 [8808497f7c70] general_protection at 8160b768 [