Re: usb-storage URB use-after-free

2015-01-30 Thread Joe Lawrence
On Thu, 29 Jan 2015 11:42:18 -0500
Alan Stern  wrote:

> On Wed, 28 Jan 2015, Joe Lawrence wrote:
> 
> > This one should have gone over to linux-usb.
> > 
> > -- Joe
> > 
> > On 01/28/2015 05:04 PM, Joe Lawrence wrote:
> > > Hello linux-usb,
> > > 
> > > We've hit a USB use-after-free on Stratus HW during device removal tests.
> > > We're running fio disk I/O to a scsi disk hanging off USB when the USB
> > > controller is hotplug removed.  This crash is very consistent (usually the
> > > first device pull during testing).  Without I/O, it may take days to
> > > encounter.
> > > 
> > > general protection fault:  [#1] SMP
> > > ...
> > > CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  
> > > O--   3.10.0-210.el7.dev02.x86_64 #1
> > > Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 
> > > 11/12/2014
> > > Workqueue: scsi_tmf_872 scmd_eh_abort_handler
> > > task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
> > > RIP: 0010:[]  [] kobject_put+0xd/0x60
> > > RSP: 0018:88098131bd28  EFLAGS: 00010002
> > > RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
> > > RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
> > > RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
> > > R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
> > > R13: ff98 R14: 0086 R15: 0c20
> > > FS:  () GS:88085fce() 
> > > knlGS:
> > > CS:  0010 DS:  ES:  CR0: 80050033
> > > CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
> > > DR0:  DR1:  DR2: 
> > > DR3:  DR6: 0ff0 DR7: 0400
> > > Stack:
> > >   88098131bd40 813cd327 88098131bd50
> > >  8140a65a 88098131bd78 81416795 
> > >  880414791968 880414791978 88098131bd88 814175f6
> > > Call Trace:
> > >  [] put_device+0x17/0x20
> > >  [] usb_put_dev+0x1a/0x20
> > >  [] usb_hcd_unlink_urb+0x65/0xe0
> > >  [] usb_unlink_urb+0x26/0x50
> > >  [] usb_sg_cancel+0x82/0xe0
> > >  [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
> > >  [] command_abort+0xcd/0xe0 [usb_storage]
> > >  [] scmd_eh_abort_handler+0xbf/0x480
> > >  [] process_one_work+0x17b/0x470
> > >  [] worker_thread+0x11b/0x400
> > >  [] ? rescuer_thread+0x400/0x400
> > >  [] kthread+0xcf/0xe0
> > >  [] ? kthread_create_on_node+0x140/0x140
> > >  [] ret_from_fork+0x7c/0xb0
> > >  [] ? kthread_create_on_node+0x140/0x140
> > > 
> > > from another crash, we know that the URB itself has been freed:
> 
> > > but the underlying usb_device->device is still present:
> 
> > > It looks like usb_hcd_unlink_urb takes a reference out on the underlying
> > > device but not the URB, which in our test case consistently gets freed
> > > just before usb_hcd_unlink_urb tries to return a reference on 
> > > urb->device. 
> 
> The analysis looks correct.
> 
> > > Maybe the URB is a reference count short when usb_hcd_unlink_urb calls
> > > unlink1?  Somewhere usb-storage missed taking out a ref?
> 
> No; it's a simple use-after-free error.
> 
> > > A tourniquet hack-patch follows (probably inside out, as the URB should 
> > > stay
> > > intact while usb_hcd_unlink_urb does its thing) and has been running a
> > > little over 30 surprise device removals under I/O without incident.
> > > 
> > > Any better suggestions?
> 
> Please try this instead.
> 
> Alan Stern
> 
> 
> 
> Index: usb-3.19/drivers/usb/core/hcd.c
> ===
> --- usb-3.19.orig/drivers/usb/core/hcd.c
> +++ usb-3.19/drivers/usb/core/hcd.c
> @@ -1618,6 +1618,7 @@ static int unlink1(struct usb_hcd *hcd,
>  int usb_hcd_unlink_urb (struct urb *urb, int status)
>  {
>   struct usb_hcd  *hcd;
> + struct usb_device   *udev = urb->dev;
>   int retval = -EIDRM;
>   unsigned long   flags;
>  
> @@ -1629,20 +1630,19 @@ int usb_hcd_unlink_urb (struct urb *urb,
>   spin_lock_irqsave(&hcd_urb_unlink_lock, flags);
>   if (atomic_read(&urb->use_count) > 0) {
>   retval = 0;
> - usb_get_dev(urb->dev);
> + usb_get_dev(udev);
>   }
>   spin_unlock_irqrestore(&hcd_urb_unlink_lock, flags);
>   if (retval == 0) {
>   hcd = bus_to_hcd(urb->dev->bus);
>   retval = unlink1(hcd, urb, status);
> - usb_put_dev(urb->dev);
> + if (retval == 0)
> + retval = -EINPROGRESS;
> + else if (retval != -EIDRM && retval != -EBUSY)
> + dev_dbg(&udev->dev, "hcd_unlink_urb %p fail %d\n",
> + urb, retval);
> + usb_put_dev(udev);
>   }
> -
> - if (retval == 0)
> - retval = -EINPROGRESS;
> - els

Re: usb-storage URB use-after-free

2015-01-29 Thread Alan Stern
On Wed, 28 Jan 2015, Joe Lawrence wrote:

> This one should have gone over to linux-usb.
> 
> -- Joe
> 
> On 01/28/2015 05:04 PM, Joe Lawrence wrote:
> > Hello linux-usb,
> > 
> > We've hit a USB use-after-free on Stratus HW during device removal tests.
> > We're running fio disk I/O to a scsi disk hanging off USB when the USB
> > controller is hotplug removed.  This crash is very consistent (usually the
> > first device pull during testing).  Without I/O, it may take days to
> > encounter.
> > 
> > general protection fault:  [#1] SMP
> > ...
> > CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  O-- 
> >   3.10.0-210.el7.dev02.x86_64 #1
> > Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 
> > 11/12/2014
> > Workqueue: scsi_tmf_872 scmd_eh_abort_handler
> > task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
> > RIP: 0010:[]  [] kobject_put+0xd/0x60
> > RSP: 0018:88098131bd28  EFLAGS: 00010002
> > RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
> > RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
> > RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
> > R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
> > R13: ff98 R14: 0086 R15: 0c20
> > FS:  () GS:88085fce() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: 0ff0 DR7: 0400
> > Stack:
> >   88098131bd40 813cd327 88098131bd50
> >  8140a65a 88098131bd78 81416795 
> >  880414791968 880414791978 88098131bd88 814175f6
> > Call Trace:
> >  [] put_device+0x17/0x20
> >  [] usb_put_dev+0x1a/0x20
> >  [] usb_hcd_unlink_urb+0x65/0xe0
> >  [] usb_unlink_urb+0x26/0x50
> >  [] usb_sg_cancel+0x82/0xe0
> >  [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
> >  [] command_abort+0xcd/0xe0 [usb_storage]
> >  [] scmd_eh_abort_handler+0xbf/0x480
> >  [] process_one_work+0x17b/0x470
> >  [] worker_thread+0x11b/0x400
> >  [] ? rescuer_thread+0x400/0x400
> >  [] kthread+0xcf/0xe0
> >  [] ? kthread_create_on_node+0x140/0x140
> >  [] ret_from_fork+0x7c/0xb0
> >  [] ? kthread_create_on_node+0x140/0x140
> > 
> > from another crash, we know that the URB itself has been freed:

> > but the underlying usb_device->device is still present:

> > It looks like usb_hcd_unlink_urb takes a reference out on the underlying
> > device but not the URB, which in our test case consistently gets freed
> > just before usb_hcd_unlink_urb tries to return a reference on urb->device. 

The analysis looks correct.

> > Maybe the URB is a reference count short when usb_hcd_unlink_urb calls
> > unlink1?  Somewhere usb-storage missed taking out a ref?

No; it's a simple use-after-free error.

> > A tourniquet hack-patch follows (probably inside out, as the URB should stay
> > intact while usb_hcd_unlink_urb does its thing) and has been running a
> > little over 30 surprise device removals under I/O without incident.
> > 
> > Any better suggestions?

Please try this instead.

Alan Stern



Index: usb-3.19/drivers/usb/core/hcd.c
===
--- usb-3.19.orig/drivers/usb/core/hcd.c
+++ usb-3.19/drivers/usb/core/hcd.c
@@ -1618,6 +1618,7 @@ static int unlink1(struct usb_hcd *hcd,
 int usb_hcd_unlink_urb (struct urb *urb, int status)
 {
struct usb_hcd  *hcd;
+   struct usb_device   *udev = urb->dev;
int retval = -EIDRM;
unsigned long   flags;
 
@@ -1629,20 +1630,19 @@ int usb_hcd_unlink_urb (struct urb *urb,
spin_lock_irqsave(&hcd_urb_unlink_lock, flags);
if (atomic_read(&urb->use_count) > 0) {
retval = 0;
-   usb_get_dev(urb->dev);
+   usb_get_dev(udev);
}
spin_unlock_irqrestore(&hcd_urb_unlink_lock, flags);
if (retval == 0) {
hcd = bus_to_hcd(urb->dev->bus);
retval = unlink1(hcd, urb, status);
-   usb_put_dev(urb->dev);
+   if (retval == 0)
+   retval = -EINPROGRESS;
+   else if (retval != -EIDRM && retval != -EBUSY)
+   dev_dbg(&udev->dev, "hcd_unlink_urb %p fail %d\n",
+   urb, retval);
+   usb_put_dev(udev);
}
-
-   if (retval == 0)
-   retval = -EINPROGRESS;
-   else if (retval != -EIDRM && retval != -EBUSY)
-   dev_dbg(&urb->dev->dev, "hcd_unlink_urb %p fail %d\n",
-   urb, retval);
return retval;
 }
 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi"

Re: usb-storage URB use-after-free

2015-01-28 Thread Joe Lawrence
This one should have gone over to linux-usb.

-- Joe

On 01/28/2015 05:04 PM, Joe Lawrence wrote:
> Hello linux-usb,
> 
> We've hit a USB use-after-free on Stratus HW during device removal tests.
> We're running fio disk I/O to a scsi disk hanging off USB when the USB
> controller is hotplug removed.  This crash is very consistent (usually the
> first device pull during testing).  Without I/O, it may take days to
> encounter.
> 
> general protection fault:  [#1] SMP
> ...
> CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  O--   
> 3.10.0-210.el7.dev02.x86_64 #1
> Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 
> 11/12/2014
> Workqueue: scsi_tmf_872 scmd_eh_abort_handler
> task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
> RIP: 0010:[]  [] kobject_put+0xd/0x60
> RSP: 0018:88098131bd28  EFLAGS: 00010002
> RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
> RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
> RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
> R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
> R13: ff98 R14: 0086 R15: 0c20
> FS:  () GS:88085fce() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Stack:
>   88098131bd40 813cd327 88098131bd50
>  8140a65a 88098131bd78 81416795 
>  880414791968 880414791978 88098131bd88 814175f6
> Call Trace:
>  [] put_device+0x17/0x20
>  [] usb_put_dev+0x1a/0x20
>  [] usb_hcd_unlink_urb+0x65/0xe0
>  [] usb_unlink_urb+0x26/0x50
>  [] usb_sg_cancel+0x82/0xe0
>  [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
>  [] command_abort+0xcd/0xe0 [usb_storage]
>  [] scmd_eh_abort_handler+0xbf/0x480
>  [] process_one_work+0x17b/0x470
>  [] worker_thread+0x11b/0x400
>  [] ? rescuer_thread+0x400/0x400
>  [] kthread+0xcf/0xe0
>  [] ? kthread_create_on_node+0x140/0x140
>  [] ret_from_fork+0x7c/0xb0
>  [] ? kthread_create_on_node+0x140/0x140
> 
> from another crash, we know that the URB itself has been freed:
> 
> crash> struct -o urb | grep SIZE
> SIZE: 0xc0
> crash> p 0xc0/8
> $2 = 0x18
> crash> rd 880846df9248 0x18
> 880846df9248:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9258:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9268:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9278:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9288:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df9298:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92a8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92b8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92c8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92d8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92e8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
> 880846df92f8:  6b6b6b6b6b6b6b6b a56b6b6b6b6b6b6b   kkk.
> 
> but the underlying usb_device->device is still present:
> 
> crash> struct usb_device 88084fd22e68
> ...
>   dev = {
> parent = 0x881050c3d418, 
> p = 0x88084ff27ae8, 
> kobj = {
>   name = 0x88084ff071e0 "6-1", 
>   entry = {
> next = 0x88084ff09568, 
> prev = 0x88084ff09000
>   }, 
>   parent = 0x881050c3d428, 
>   kset = 0x8808540dc9f0, 
>   ktype = 0x819cc220 , 
>   sd = 0x88084ff1b8b8, 
>   kref = {
> refcount = {
>   counter = 0x8
> }
>   }, 
>   state_initialized = 0x1, 
>   state_in_sysfs = 0x1, 
>   state_add_uevent_sent = 0x1, 
>   state_remove_uevent_sent = 0x0, 
>   uevent_suppress = 0x0
> },
> ...
> 
> I added trace printing to usb_{alloc_urb,get_urb,free_urb} and urb_destroy
> showing the URB and the calling function, then crashed again:
> 
> ... snip ...
>  usb-storage-687   [000]   350.013212: bprint:   
> usb_alloc_urb : JL: usb_sg_init+0xe1 -> usb_alloc_urb(0x8808489b0a28)
>  usb-storage-687   [000]   350.013212: bprint:   usb_get_urb 
> : JL: usb_hcd_submit_urb+0x3b -> usb_get_urb(0x8808489b0a28)
>kworker/u97:2-703   [006]   380.656791: bprint:   usb_free_urb 
> : JL: __usb_hcd_giveback_urb -> usb_free_urb(0x8808489b0a28)
>  usb-storage-687   [000]   380.656793: bprint:   usb_free_urb 
> : JL: sg_clean+0x33 -> usb_free_urb(0x8808489b0a28)
>  usb-storage-687   [000]   380.656794: bprint:   urb_destroy 

usb-storage URB use-after-free

2015-01-28 Thread Joe Lawrence
Hello linux-usb,

We've hit a USB use-after-free on Stratus HW during device removal tests.
We're running fio disk I/O to a scsi disk hanging off USB when the USB
controller is hotplug removed.  This crash is very consistent (usually the
first device pull during testing).  Without I/O, it may take days to
encounter.

general protection fault:  [#1] SMP
...
CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF   W  O--   
3.10.0-210.el7.dev02.x86_64 #1
Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 11/12/2014
Workqueue: scsi_tmf_872 scmd_eh_abort_handler
task: 88031bd91960 ti: 880981318000 task.ti: 880981318000
RIP: 0010:[]  [] kobject_put+0xd/0x60
RSP: 0018:88098131bd28  EFLAGS: 00010002
RAX:  RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009
RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b
RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008
R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248
R13: ff98 R14: 0086 R15: 0c20
FS:  () GS:88085fce() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Stack:
  88098131bd40 813cd327 88098131bd50
 8140a65a 88098131bd78 81416795 
 880414791968 880414791978 88098131bd88 814175f6
Call Trace:
 [] put_device+0x17/0x20
 [] usb_put_dev+0x1a/0x20
 [] usb_hcd_unlink_urb+0x65/0xe0
 [] usb_unlink_urb+0x26/0x50
 [] usb_sg_cancel+0x82/0xe0
 [] usb_stor_stop_transport+0x31/0x50 [usb_storage]
 [] command_abort+0xcd/0xe0 [usb_storage]
 [] scmd_eh_abort_handler+0xbf/0x480
 [] process_one_work+0x17b/0x470
 [] worker_thread+0x11b/0x400
 [] ? rescuer_thread+0x400/0x400
 [] kthread+0xcf/0xe0
 [] ? kthread_create_on_node+0x140/0x140
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_create_on_node+0x140/0x140

from another crash, we know that the URB itself has been freed:

crash> struct -o urb | grep SIZE
SIZE: 0xc0
crash> p 0xc0/8
$2 = 0x18
crash> rd 880846df9248 0x18
880846df9248:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9258:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9268:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9278:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9288:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df9298:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92a8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92b8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92c8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92d8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92e8:  6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b   
880846df92f8:  6b6b6b6b6b6b6b6b a56b6b6b6b6b6b6b   kkk.

but the underlying usb_device->device is still present:

crash> struct usb_device 88084fd22e68
...
  dev = {
parent = 0x881050c3d418, 
p = 0x88084ff27ae8, 
kobj = {
  name = 0x88084ff071e0 "6-1", 
  entry = {
next = 0x88084ff09568, 
prev = 0x88084ff09000
  }, 
  parent = 0x881050c3d428, 
  kset = 0x8808540dc9f0, 
  ktype = 0x819cc220 , 
  sd = 0x88084ff1b8b8, 
  kref = {
refcount = {
  counter = 0x8
}
  }, 
  state_initialized = 0x1, 
  state_in_sysfs = 0x1, 
  state_add_uevent_sent = 0x1, 
  state_remove_uevent_sent = 0x0, 
  uevent_suppress = 0x0
},
...

I added trace printing to usb_{alloc_urb,get_urb,free_urb} and urb_destroy
showing the URB and the calling function, then crashed again:

... snip ...
 usb-storage-687   [000]   350.013212: bprint:   usb_alloc_urb 
: JL: usb_sg_init+0xe1 -> usb_alloc_urb(0x8808489b0a28)
 usb-storage-687   [000]   350.013212: bprint:   usb_get_urb : 
JL: usb_hcd_submit_urb+0x3b -> usb_get_urb(0x8808489b0a28)
   kworker/u97:2-703   [006]   380.656791: bprint:   usb_free_urb : 
JL: __usb_hcd_giveback_urb -> usb_free_urb(0x8808489b0a28)
 usb-storage-687   [000]   380.656793: bprint:   usb_free_urb : 
JL: sg_clean+0x33 -> usb_free_urb(0x8808489b0a28)
 usb-storage-687   [000]   380.656794: bprint:   urb_destroy : 
JL: usb_free_urb+0x40 -> urb_destroy(0x8808489b0a28)

crash> bt 703
PID: 703TASK: 8808497f8000  CPU: 6   COMMAND: "kworker/u97:2"
 #0 [8808497f7c10] die at 8101736b
 #1 [8808497f7c40] do_general_protection at 8160be4e
 #2 [8808497f7c70] general_protection at 8160b768
[