Re: [PATCH] usb: hub: fix panic caused by NULL bos pointer during reset device

2016-04-27 Thread Tony Battersby
On 04/26/2016 10:53 PM, Du, Changbin wrote:
>> On Tue, Mar 08, 2016 at 05:15:17PM +0800, changbin...@intel.com wrote:
>>> From: "Du, Changbin" <changbin...@intel.com>
>>>
>>> This is a reworked patch based on reverted commit d8f00cd685f5 ("usb:
>>> hub: do not clear BOS field during reset device").
>>>
>>> The previous one caused a double free if it ran to the re_enumerate label.
>>> The new patch title is changed to distinguish it from the old one. I have
>>> tested it with memory debugging options.
>>>
>>> In function usb_reset_and_verify_device, the old BOS descriptor may
>>> still be used before allocating a new one. (usb_disable_lpm function
>>> uses it under the situation that it fails at usb_disable_link_state.)
>>> So we cannot set udev->bos to NULL before that; just keep what it
>>> was. It will be overwritten when allocating a new one.
>>>
>>> How to reproduce:
>>> 1. connect one usb3 hub to xhci port.
>>> 2. connect several lpm-capable super-speed usb disks to the hub.
>>> 3. copy big files to the usb disks.
>>> 4. disconnect the hub and repeat steps 1-4.
>>>
>>> Crash log:
>>> BUG: unable to handle kernel NULL pointer dereference at
>>> 0010
>>> IP: [] usb_enable_link_state+0x2d/0x2f0
>>> Call Trace:
>>> [] ? usb_set_lpm_timeout+0x12b/0x140
>>> [] usb_enable_lpm+0x81/0xa0
>>> [] usb_disable_lpm+0xa8/0xc0
>>> [] usb_unlocked_disable_lpm+0x2c/0x50
>>> [] usb_reset_and_verify_device+0xc3/0x710
>>> [] ? usb_sg_wait+0x13d/0x190
>>> [] usb_reset_device+0x133/0x280
>>> [] usb_stor_port_reset+0x61/0x70
>>> [] usb_stor_invoke_transport+0x88/0x520
>>>
>>> Signed-off-by: Du, Changbin <changbin...@intel.com>
>>> ---
>>>  drivers/usb/core/hub.c | 14 +-
>>>  1 file changed, 9 insertions(+), 5 deletions(-)
>> Is this patch still needed?  I thought we had some other fix in this
>> area...
>>
>> confused,
>>
>> greg k-h
>>
> Hi, Greg k-h,
> Sorry for the confusion. This patch is still needed. It is the same fix as
> the previous commit d8f00cd685f5 ("usb: hub: do not clear BOS field
> during reset device"). But d8f00cd685f5 was buggy and has been reverted.
> This new patch should be the final fix.
>
> Best Regards,
> Du, Changbin
>

I think Greg is referring to commit 464ad8c43a9e ("usb: core : hub: Fix
BOS 'NULL pointer' kernel panic"), which has already been applied
upstream.  It looks to me like that patch might have fixed the same
problem in a different way, in which case Changbin's patch is not
needed.  But I haven't been involved in developing or testing that
patch, so I can't say for sure.  At the very least, 464ad8c43a9e
conflicts with Changbin's patch.

Changbin, can you take a look at 464ad8c43a9e and see if that fixes the
same problem that your patch did?
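
For reference, here is a minimal user-space sketch of the ordering problem described in the quoted patch. It only models the description above ("we cannot set udev->bos to NULL before that"); the model_* names are hypothetical stand-ins, not kernel code, and -1 marks the point where the kernel would hit the NULL dereference. It does not attempt to show what 464ad8c43a9e does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BOS descriptor and the USB device. */
struct model_bos { int lpm_capable; };
struct model_udev { struct model_bos *bos; };

/* Models an LPM helper that reads udev->bos during reset.
 * Returns -1 where the kernel would dereference a NULL pointer. */
static int model_lpm_path(const struct model_udev *udev)
{
	if (udev->bos == NULL)
		return -1;
	return udev->bos->lpm_capable;
}

/* Models the start of a device reset. If clear_bos_first is nonzero,
 * udev->bos is set to NULL before the LPM path runs (the crashing
 * order); otherwise the old descriptor is kept until a new one
 * would be allocated. */
int model_reset(int clear_bos_first)
{
	struct model_bos old_bos = { 1 };
	struct model_udev udev = { &old_bos };

	assert(udev.bos != NULL);	/* device starts with a valid BOS */

	if (clear_bos_first)
		udev.bos = NULL;

	return model_lpm_path(&udev);
}
```

In this model, model_reset(1) returns -1 (the crashing order) and model_reset(0) returns 1 (the old descriptor is still readable).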

Thanks,
Tony Battersby

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: probably missing patch to stable?

2016-04-04 Thread Tony Battersby
Just to recap:

The description of the original patch d8f00cd685f5 ("usb: hub: do not
clear BOS field during reset device") indicates that it fixes an oops,
but it also had a bug that introduced a different oops (reported by
me).  That patch has now been reverted in mainline, fixing the new oops
that I reported but AFAIK re-introducing the original oops.  Changbin
has also posted an updated patch that fixes both the original oops and
the new oops ("usb: hub: fix panic caused by NULL bos pointer during
reset device"), but that patch has not yet been merged into mainline. 
So perhaps it would be better to merge Changbin's new patch into
mainline and backport that to -stable also, so that both oopses get fixed.

As far as testing goes, Changbin posted a small patch in the thread "Re:
USB oops regression caused by -stable patch", which I tested and it
fixed the oops that I found.  But that small patch was before the
original patch d8f00cd685f5 was reverted.  Changbin's new patch ("usb:
hub: fix panic caused by NULL bos pointer during reset device") is
equivalent to un-reverting d8f00cd685f5 and applying the small patch
that I already tested.  So my testing also applies to Changbin's new patch.

Tony Battersby
Cybernetics

On 04/04/2016 05:26 AM, Roger Quadros wrote:
> Hi Greg,
>
> This commit [1] mentions that it affects certain stable versions but
> I didn't see cc: stable in it nor could find it in any mailing list.
>
> Just wanted to bring to your attention. Thanks.
>
> cheers,
> -roger
>
> [1]
> commit e5bdfd50d6f76077bf8441d130c606229e100d40 
> Author: Greg Kroah-Hartman <gre...@linuxfoundation.org>
>
> Revert "usb: hub: do not clear BOS field during reset device"
>
> This reverts commit d8f00cd685f5c8e0def8593e520a7fef12c22407.
> 
> Tony writes:
> 
> This upstream commit is causing an oops:
> d8f00cd685f5 ("usb: hub: do not clear BOS field during reset device")
> 
> This patch has already been included in several -stable kernels.  Here
> are the affected kernels:
> 4.5.0-rc4 (current git)
> 4.4.2
> 4.3.6 (currently in review)
> 4.1.18
> 3.18.27
> 3.14.61
> 
> How to reproduce the problem:
> Boot kernel with slub debugging enabled (otherwise memory corruption
> will cause random oopses later instead of immediately)
> Plug in USB 3.0 disk to xhci USB 3.0 port
> dd if=/dev/sdc of=/dev/null bs=65536
> (where /dev/sdc is the USB 3.0 disk)
> Unplug USB cable while dd is still going
> Oops is immediate:
> 
> Reported-by: Tony Battersby <to...@cybernetics.com>
> Cc: Du, Changbin <changbin...@intel.com>
> Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
>



Re: USB oops regression caused by -stable patch

2016-02-22 Thread Tony Battersby
Thanks, that fixes it.  Tested on 4.5.0-rc5 and 3.18.27.  Just to be
clear, I tested it *without* reverting d8f00cd685f5.  So this patch is
in addition to d8f00cd685f5 instead of replacing it.
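
The double free that the fix quoted below guards against can be sketched in user space as follows. This is a model under stated assumptions, not the kernel code: the model_* names are hypothetical, and a release counter stands in for kfree() so that the second release can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; release_count replaces a real kfree(). */
struct model_bos { int release_count; };
struct model_udev { struct model_bos *bos; };

/* Models usb_release_bos_descriptor(): release and clear udev->bos. */
static void model_release_bos(struct model_udev *udev)
{
	if (udev->bos) {
		udev->bos->release_count++;
		udev->bos = NULL;
	}
}

/* re_enumerate without the guard: always release, then restore. */
static void re_enumerate_unguarded(struct model_udev *udev,
				   struct model_bos *saved)
{
	model_release_bos(udev);
	udev->bos = saved;	/* may now point at a released descriptor */
}

/* re_enumerate with the guard from the quoted fix. */
static void re_enumerate_guarded(struct model_udev *udev,
				 struct model_bos *saved)
{
	if (udev->bos != saved) {
		model_release_bos(udev);
		udev->bos = saved;
	}
}

/* Reset fails before a new BOS is read, then the device is torn down.
 * Returns how many times the old descriptor was released. */
int releases_after_disconnect(int guarded)
{
	struct model_bos old_bos = { 0 };
	struct model_udev udev = { &old_bos };
	struct model_bos *saved = udev.bos;

	assert(saved != NULL);

	if (guarded)
		re_enumerate_guarded(&udev, saved);
	else
		re_enumerate_unguarded(&udev, saved);

	model_release_bos(&udev);	/* teardown, as in usb_release_dev */
	return old_bos.release_count;
}
```

In this model, releases_after_disconnect(0) returns 2 (the descriptor is released twice) and releases_after_disconnect(1) returns 1.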

Tested-by: Tony Battersby <to...@cybernetics.com>

On 02/21/2016 09:27 PM, Du, Changbin wrote:
> Thanks for reporting, Tony. It was remiss of me.
> There is another BOS free operation at the re_enumerate label. This causes
> a double free of the BOS.
> USB2 doesn't have a BOS descriptor, so you cannot reproduce it there.
>
> I am traveling. I would appreciate it if you could try the fix below.
>
> Hi Greg, I will submit a final patch once I return from travel.
>
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -5501,8 +5501,10 @@ done:
>  	return 0;
>  
>  re_enumerate:
> -	usb_release_bos_descriptor(udev);
> -	udev->bos = bos;
> +	if (udev->bos != bos) {
> +		usb_release_bos_descriptor(udev);
> +		udev->bos = bos;
> +	}
>
> Best Regards,
> Du, Changbin
>
>> On Fri, Feb 19, 2016 at 09:39:57AM -0500, Tony Battersby wrote:
>>> This upstream commit is causing an oops:
>>> d8f00cd685f5 ("usb: hub: do not clear BOS field during reset device")
>>>
>>> This patch has already been included in several -stable kernels.  Here
>>> are the affected kernels:
>>> 4.5.0-rc4 (current git)
>>> 4.4.2
>>> 4.3.6 (currently in review)
>>> 4.1.18
>>> 3.18.27
>>> 3.14.61
>>>
>>> How to reproduce the problem:
>>> Boot kernel with slub debugging enabled (otherwise memory corruption
>>> will cause random oopses later instead of immediately)
>>> Plug in USB 3.0 disk to xhci USB 3.0 port
>>> dd if=/dev/sdc of=/dev/null bs=65536
>>> (where /dev/sdc is the USB 3.0 disk)
>>> Unplug USB cable while dd is still going
>>> Oops is immediate:
>> Not good, thanks for letting us know.  I've now reverted this and will
>> get the fix into 4.5-rc6.
>>
>> greg k-h



USB oops regression caused by -stable patch

2016-02-19 Thread Tony Battersby
tion+0x11b/0x140
 [] usb_release_bos_descriptor+0x20/0x40
 [] usb_release_dev+0x2c/0x70
 [] device_release+0x33/0xa0
 [] kobject_release+0x47/0x90
 [] kobject_put+0x2c/0x60
 [] put_device+0x12/0x20
 [] usb_disconnect+0x1cb/0x220
 [] hub_event+0x46a/0x1070
 [] ? dequeue_task_fair+0x73a/0x820
 [] ? next_zone+0x25/0x30
 [] ? pick_next_task_fair+0xa9/0x850
 [] process_one_work+0x151/0x3c0
 [] ? mod_timer+0xe9/0x160
 [] ? lock_timer_base+0x55/0x70
 [] ? schedule+0x3b/0xa0
 [] worker_thread+0x158/0x6b0
 [] ? __schedule+0x27a/0x6e0
 [] ? default_wake_function+0xd/0x10
 [] ? __wake_up_common+0x51/0x80
 [] ? schedule+0x3b/0xa0
 [] ? process_one_work+0x3c0/0x3c0
 [] kthread+0xc7/0xf0
 [] ? kthread_parkme+0x20/0x20
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_parkme+0x20/0x20
Code: 25 00 ac 00 00 48 8b 80 e8 03 00 00 48 8b 40 c8 c9 48 d1 e8 83 e0 01 c3 
0f 1f 84 00 00 00 00 00 55 48 8b 87 e8 03 00 00 48 89 e5 <48> 8b 40 d8 c9 c3 66 
66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 
RIP  [] kthread_data+0xb/0x20
 RSP 
CR2: ffd8
---[ end trace a3bcfa253dbef568 ]---
Fixing recursive fault but reboot is needed!

With the patch reverted, everything works fine.

So far I have been unable to reproduce the problem using EHCI (USB 2.0).

Tony Battersby
Cybernetics
