RE: [PATCH v1] usb: dwc3: gadget: sanity check for usb request complete function in ep_enqueue and giveback function.
Hi, (please don't top-post ;-) "Tang, Jianqiang" writes: > Hi Balbi, > > Sorry late response due to some other issues. > 1> Yes, I agree this is one gadget driver bug. > > Current gadget framework do not have any check about this > usb_request->complete pointer per my understanding. yes, and that's by design. > I think we should add some check in dwc3 OR gadget API like > usb_ep_queue() in include/linux/usb/gadget.h. I'd rather we don't. ->complete() is a required API and if there's any gadget driver not providing it, I want that gadget driver to Oops, open pandora's box, Summon Chtulhu, whatever's the worst. The reason is that lacking a ->complete() is a pretty big mistake and we better catch it as soon as possible. > I saw in 4.5-rc6 there is some sanity check code in usb_ep_queue. I > can move the check from dwc3 to gadget.h. thanks, but no thanks ;-) > 2> My current kernel is not vanilla kernel from Linus, I am using > one old/modified kernel based on 3.1x. then, I'm sorry, you're on your own here :-) > But as this bug is point to usb_request->complete is NULL, I think the > latest kernel also have the risk to happen. right, and we will keep it that way so we can be forced at looking for _that_ bug. Would you be willing to look for this bug in RNDIS driver and fix it ? > My platform is x86 platform. > And I am not able to reproduce this issue also. and how can you fix somethign you can't reproduce ? > From my analysis of call trace, I suspect there is RNDIS gadget > function is running with data transfer, also disconnect happen when > kernel panic. > > As I am not able to reproduce this issue until now, I am using my > supposed way to reproduce this issue: this is no good. If you can't reproduce the issue there's no way you can understand it well enough to fix it properly, right ? This patch should have been an RFC, considering you haven't found the root cause of the problem. > Connect device with RNDIS enabled. > Run RNDIS transfer using iperf with one Host machine as server and > client. > Disconnect device when iperf is running. and you couldn't reproduce with these steps ? What makes you think this is really the scenario which would cause the problem ? > To be clear, this is my supposed way, I am not able to reproduce this > issue also. you shouldn't have sent this patch, actually. An RFC would be fine, but the truth is that you didn't really understand the problem yet if you can't reliably reproduce it. > 3> So what is your opinion about how to fix this issue? see above -- balbi signature.asc Description: PGP signature
RE: [PATCH v1] usb: dwc3: gadget: sanity check for usb request complete function in ep_enqueue and giveback function.
Hi Balbi, Sorry late response due to some other issues. 1> Yes, I agree this is one gadget driver bug. Current gadget framework do not have any check about this usb_request->complete pointer per my understanding. I think we should add some check in dwc3 OR gadget API like usb_ep_queue() in include/linux/usb/gadget.h. I saw in 4.5-rc6 there is some sanity check code in usb_ep_queue. I can move the check from dwc3 to gadget.h. 2> My current kernel is not vanilla kernel from Linus, I am using one old/modified kernel based on 3.1x. But as this bug is point to usb_request->complete is NULL, I think the latest kernel also have the risk to happen. My platform is x86 platform. And I am not able to reproduce this issue also. From my analysis of call trace, I suspect there is RNDIS gadget function is running with data transfer, also disconnect happen when kernel panic. As I am not able to reproduce this issue until now, I am using my supposed way to reproduce this issue: Connect device with RNDIS enabled. Run RNDIS transfer using iperf with one Host machine as server and client. Disconnect device when iperf is running. To be clear, this is my supposed way, I am not able to reproduce this issue also. 3> So what is your opinion about how to fix this issue? Thanks! -Original Message- From: Felipe Balbi [mailto:ba...@kernel.org] Sent: Friday, February 26, 2016 3:58 PM To: Tang, Jianqiang ; Greg Kroah-Hartman Cc: linux-usb@vger.kernel.org; Tang, Jianqiang Subject: Re: [PATCH v1] usb: dwc3: gadget: sanity check for usb request complete function in ep_enqueue and giveback function. hi, Tang Jianqiang writes: > From: Jianqiang Tang > > Do sanity check for usb request complete function as we hit random > null pointer kernel panic in giveback function. > > From the call trace, show the complete function should be null. > So we add the sanity check before every usb request queue to dwc3 also > before dwc3 giveback the usb request. > > Logs: > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [< (null)>] (null) > Call Trace: > ? dwc3_gadget_giveback+0xa5/0x130 > ? vsnprintf+0x166/0x3d0 > dwc3_remove_requests+0x57/0x70 > __dwc3_gadget_ep_disable+0x18/0x80 > dwc3_gadget_ep_disable+0x79/0x1a0 > linkwatch_fire_event+0x4c/0x90 > gether_disconnect+0x45/0x1b0 > ? wake_up_klogd+0x49/0x70 > console_unlock+0x295/0x4c0 > rndis_disable+0x3d/0x90 > preempt_count_add+0x55/0xa0 > reset_config+0x3b/0x90 > _raw_spin_lock_irqsave+0x25/0x30 > composite_disconnect+0x2f/0x50 > dwc3_gadget_disconnect_interrupt+0x62/0x90 > > Signed-off-by: Jianqiang Tang well, this is a gadget driver bug, not a dwc3 bug. This gadget driver deserves to oops so we fix it. Care to provide information on how to reproduce this ? Which kernel did you use ? Which platform ? Are you using a vanilla kernel from Linus ? cheers -- balbi -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1] usb: dwc3: gadget: sanity check for usb request complete function in ep_enqueue and giveback function.
hi, Tang Jianqiang writes: > From: Jianqiang Tang > > Do sanity check for usb request complete function as we hit random > null pointer kernel panic in giveback function. > > From the call trace, show the complete function should be null. > So we add the sanity check before every usb request queue to dwc3 > also before dwc3 giveback the usb request. > > Logs: > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [< (null)>] (null) > Call Trace: > ? dwc3_gadget_giveback+0xa5/0x130 > ? vsnprintf+0x166/0x3d0 > dwc3_remove_requests+0x57/0x70 > __dwc3_gadget_ep_disable+0x18/0x80 > dwc3_gadget_ep_disable+0x79/0x1a0 > linkwatch_fire_event+0x4c/0x90 > gether_disconnect+0x45/0x1b0 > ? wake_up_klogd+0x49/0x70 > console_unlock+0x295/0x4c0 > rndis_disable+0x3d/0x90 > preempt_count_add+0x55/0xa0 > reset_config+0x3b/0x90 > _raw_spin_lock_irqsave+0x25/0x30 > composite_disconnect+0x2f/0x50 > dwc3_gadget_disconnect_interrupt+0x62/0x90 > > Signed-off-by: Jianqiang Tang well, this is a gadget driver bug, not a dwc3 bug. This gadget driver deserves to oops so we fix it. Care to provide information on how to reproduce this ? Which kernel did you use ? Which platform ? Are you using a vanilla kernel from Linus ? cheers -- balbi signature.asc Description: PGP signature