Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
Du, ChangbinX changbinx...@intel.com writes: From: Bjørn Mork [mailto:bj...@mork.no] Sent: Tuesday, October 29, 2013 4:41 PM To: Du, ChangbinX Cc: oli...@neukum.org; linux-usb@vger.kernel.org; net...@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change Du, ChangbinX changbinx...@intel.com writes: From: Du, Changbin changbinx...@intel.com In cdc_ncm_bind() function, it call cdc_ncm_bind_common() to setup usb. But cdc_ncm_bind_common() may meet error and cause usbnet_disconnect() be called which calls free_netdev(net). I am sure you are right, but I really don't see how that can happen. AFAICS, we ensure that the intfdata is set to NULL before calling usb_driver_release_interface() in the error cleanup in cdc_ncm_bind_common(): error2: usb_set_intfdata(ctx-control, NULL); usb_set_intfdata(ctx-data, NULL); if (ctx-data != ctx-control) usb_driver_release_interface(driver, ctx-data); error: cdc_ncm_free((struct cdc_ncm_ctx *)dev-data[0]); dev-data[0] = 0; dev_info(dev-udev-dev, bind() failure\n); return -ENODEV; Thus we hit the test in disconnect, and usbnet_disconnect() is never called: static void cdc_ncm_disconnect(struct usb_interface *intf) { struct usbnet *dev = usb_get_intfdata(intf); if (dev == NULL) return; /* already disconnected */ usbnet_disconnect(intf); } So we should really be OK, but we are not I would appreciate it if anyone took the time to feed me why, with a very small spoon... Yes, you are right. It should not happen if cdc_ncm_disconnect is not called. But this really happened. It produced on kernel 3.10. Yes, I do not doubt it. I just don't understand it. There is no contradiction there. I believe lots of stuff I don't understand :-) The version this happened is very useful, although I don't think there are any relevant changes after v3.10. Here is the full panic message for you to refer. I will get a method try to reproduce it. That would be great if you were able to. Otherwise it will be difficult to verify that the proposed fix works. In any case, as I said: I believe a fix which avoids the call to usbnet_link_change() in case of bind failure is correct and appropriate. But I suggest that you do it by removing the call and comment from cdc_ncm_bind() and instead set the FLAG_LINK_INTR in the driver_info. That's how this should have been implemented from the beginning. [ 15.635727] BUG: Bad page state in process khubd pfn:31994 [ 15.641989] page:f3ad9280 count:0 mapcount:0 mapping:0020 index:0x0 [ 15.649384] page flags: 0x4000() [ 15.670096] BUG: unable to handle kernel NULL pointer dereference at 0078 [ 15.678078] IP: [c24b7f5e] usbnet_link_change+0x1e/0x80 [ 15.684120] *pde = [ 15.687339] Oops: [#1] SMP [ 15.690953] Modules linked in: atmel_mxt_ts vxd392 videobuf_vmalloc videobuf_core bcm_bt_lpm bcm432) [ 15.702658] CPU: 0 PID: 573 Comm: khubd Tainted: GB W O 3.10.1+ #1 [ 15.710241] task: f27e8f30 ti: f2084000 task.ti: f2084000 [ 15.716270] EIP: 0060:[c24b7f5e] EFLAGS: 00010246 CPU: 0 [ 15.722396] EIP is at usbnet_link_change+0x1e/0x80 [ 15.727747] EAX: f18524c0 EBX: ECX: f18524c0 EDX: [ 15.734746] ESI: f18524c0 EDI: EBP: f2085b7c ESP: f2085b70 [ 15.741746] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 [ 15.747775] CR0: 8005003b CR2: 0078 CR3: 02c3b000 CR4: 001007d0 [ 15.754774] DR0: DR1: DR2: DR3: [ 15.761774] DR6: 0ff0 DR7: 0400 [ 15.766054] Stack: [ 15.768295] f18524c0 c2a03818 f2085b8c c24bc69a f1852000 f1f52e00 f2085be0 [ 15.776991] c24b8d32 0001 c2167f2c f2085bb4 c2821253 f2085bec [ 15.785687] 0246 0246 f18d8088 f1f52e1c f1fce464 f1fce400 f18524c0 c28a06e0 [ 15.794376] Call Trace: [ 15.797109] [c24bc69a] cdc_ncm_bind+0x3a/0x50 [ 15.802267] [c24b8d32] usbnet_probe+0x282/0x7d0 [ 15.807621] [c2167f2c] ? sysfs_new_dirent+0x6c/0x100 [ 15.813460] [c2821253] ? mutex_lock+0x13/0x40 [ 15.818618] [c24bb278] cdc_ncm_probe+0x8/0x10 [ 15.823777] [c24e0ef7] usb_probe_interface+0x187/0x2c0 I still do not understand how this happens, but usbnet_link_change will schedule some delayed work which absolutely is not appropriate if usbnet_probe fails. There are no cancel_work calls in the usbnet_probe exit path So the call should definitely be avoided if bind fails. Bjørn -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
David Miller da...@davemloft.net writes: The problem is in cdc_ncm_bind_common(). It seems to leave dangling interface data pointers in some cases, and then branches just to error so that they don't get cleared back out. Sorry, but I fail to see this as well. I see one return and two goto error, but all are well before any intfdata pointer is set: 364 ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); 365 if (!ctx) 366 return -ENOMEM; 367 [..] 453 /* check if we got everything */ 454 if ((ctx-control == NULL) || (ctx-data == NULL) || 455 ((!ctx-mbim_desc) ((ctx-ether_desc == NULL) || (ctx-control != intf 456 goto error; 457 458 /* claim data interface, if different from control */ 459 if (ctx-data != ctx-control) { 460 temp = usb_driver_claim_interface(driver, ctx-data, dev); 461 if (temp) 462 goto error; 463 } [..] 490 usb_set_intfdata(ctx-data, dev); 491 usb_set_intfdata(ctx-control, dev); 492 usb_set_intfdata(ctx-intf, dev); [..] 510 return 0; 511 512 error2: 513 usb_set_intfdata(ctx-control, NULL); 514 usb_set_intfdata(ctx-data, NULL); 515 if (ctx-data != ctx-control) 516 usb_driver_release_interface(driver, ctx-data); 517 error: 518 cdc_ncm_free((struct cdc_ncm_ctx *)dev-data[0]); 519 dev-data[0] = 0; 520 dev_info(dev-udev-dev, bind() failure\n); 521 return -ENODEV; 522 } This could (and given this thread, probably should) certainly be refactored and cleaned up a bit. But I do not see how it leaves any dangling pointers. The pointers are only set near the end of the function, and the only exit points after that are either success or through the error2 label. Side note: It is definitely confusing that we set 3 pointers, but only clean up 2. The reason is that there are never more than 2 interfaces involved here. We always have ctx-intf == ctx-control. I'd really like to get rid of that redundant ctx-intf pointer. That's one issue to cleanup throughout this driver. Bjørn -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
From: Bjørn Mork [mailto:bj...@mork.no] Sent: Tuesday, October 29, 2013 4:41 PM To: Du, ChangbinX Cc: oli...@neukum.org; linux-usb@vger.kernel.org; net...@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change Du, ChangbinX changbinx...@intel.com writes: From: Du, Changbin changbinx...@intel.com In cdc_ncm_bind() function, it call cdc_ncm_bind_common() to setup usb. But cdc_ncm_bind_common() may meet error and cause usbnet_disconnect() be called which calls free_netdev(net). I am sure you are right, but I really don't see how that can happen. AFAICS, we ensure that the intfdata is set to NULL before calling usb_driver_release_interface() in the error cleanup in cdc_ncm_bind_common(): error2: usb_set_intfdata(ctx-control, NULL); usb_set_intfdata(ctx-data, NULL); if (ctx-data != ctx-control) usb_driver_release_interface(driver, ctx-data); error: cdc_ncm_free((struct cdc_ncm_ctx *)dev-data[0]); dev-data[0] = 0; dev_info(dev-udev-dev, bind() failure\n); return -ENODEV; Thus we hit the test in disconnect, and usbnet_disconnect() is never called: static void cdc_ncm_disconnect(struct usb_interface *intf) { struct usbnet *dev = usb_get_intfdata(intf); if (dev == NULL) return; /* already disconnected */ usbnet_disconnect(intf); } So we should really be OK, but we are not I would appreciate it if anyone took the time to feed me why, with a very small spoon... Yes, you are right. It should not happen if cdc_ncm_disconnect is not called. But this really happened. It produced on kernel 3.10. Here is the full panic message for you to refer. I will get a method try to reproduce it. [ 15.635727] BUG: Bad page state in process khubd pfn:31994 [ 15.641989] page:f3ad9280 count:0 mapcount:0 mapping:0020 index:0x0 [ 15.649384] page flags: 0x4000() [ 15.670096] BUG: unable to handle kernel NULL pointer dereference at 0078 [ 15.678078] IP: [c24b7f5e] usbnet_link_change+0x1e/0x80 [ 15.684120] *pde = [ 15.687339] Oops: [#1] SMP [ 15.690953] Modules linked in: atmel_mxt_ts vxd392 videobuf_vmalloc videobuf_core bcm_bt_lpm bcm432) [ 15.702658] CPU: 0 PID: 573 Comm: khubd Tainted: GB W O 3.10.1+ #1 [ 15.710241] task: f27e8f30 ti: f2084000 task.ti: f2084000 [ 15.716270] EIP: 0060:[c24b7f5e] EFLAGS: 00010246 CPU: 0 [ 15.722396] EIP is at usbnet_link_change+0x1e/0x80 [ 15.727747] EAX: f18524c0 EBX: ECX: f18524c0 EDX: [ 15.734746] ESI: f18524c0 EDI: EBP: f2085b7c ESP: f2085b70 [ 15.741746] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 [ 15.747775] CR0: 8005003b CR2: 0078 CR3: 02c3b000 CR4: 001007d0 [ 15.754774] DR0: DR1: DR2: DR3: [ 15.761774] DR6: 0ff0 DR7: 0400 [ 15.766054] Stack: [ 15.768295] f18524c0 c2a03818 f2085b8c c24bc69a f1852000 f1f52e00 f2085be0 [ 15.776991] c24b8d32 0001 c2167f2c f2085bb4 c2821253 f2085bec [ 15.785687] 0246 0246 f18d8088 f1f52e1c f1fce464 f1fce400 f18524c0 c28a06e0 [ 15.794376] Call Trace: [ 15.797109] [c24bc69a] cdc_ncm_bind+0x3a/0x50 [ 15.802267] [c24b8d32] usbnet_probe+0x282/0x7d0 [ 15.807621] [c2167f2c] ? sysfs_new_dirent+0x6c/0x100 [ 15.813460] [c2821253] ? mutex_lock+0x13/0x40 [ 15.818618] [c24bb278] cdc_ncm_probe+0x8/0x10 [ 15.823777] [c24e0ef7] usb_probe_interface+0x187/0x2c0 [ 15.829811] [c23caa8a] ? driver_sysfs_add+0x6a/0x90 [ 15.835550] [c23cb290] ? __driver_attach+0x90/0x90 [ 15.841192] [c23caf14] driver_probe_device+0x74/0x360 [ 15.847127] [c24e07b1] ? usb_match_id+0x41/0x60 [ 15.852479] [c24e081e] ? usb_device_match+0x4e/0x90 [ 15.858219] [c23cb290] ? __driver_attach+0x90/0x90 [ 15.863861] [c23cb2c9] __device_attach+0x39/0x50 [ 15.869311] [c23c93f4] bus_for_each_drv+0x34/0x70 [ 15.874856] [c23cae2b] device_attach+0x7b/0x90 [ 15.880109] [c23cb290] ? __driver_attach+0x90/0x90 [ 15.885752] [c23ca38f] bus_probe_device+0x6f/0x90 [ 15.891298] [c23c8a08] device_add+0x558/0x630 [ 15.896457] [c24e4821] ? usb_create_ep_devs+0x71/0xd0 [ 15.902391] [c24dd0db] ? create_intf_ep_devs+0x4b/0x70 [ 15.908422] [c24df2bf] usb_set_configuration+0x4bf/0x800 [ 15.914648] [c23cb290] ? __driver_attach+0x90/0x90 [ 15.920292] [c24e809b] generic_probe+0x2b/0x90 [ 15.925546] [c24e105c] usb_probe_device+0x2c/0x70 [ 15.931091] [c23caf14] driver_probe_device+0x74/0x360 [ 15.943345] [c2272102] ? kobject_uevent_env+0x182/0x620 [ 15.949473] [c2272102] ? kobject_uevent_env+0x182/0x620 [ 15.955601] [c23cb290] ? __driver_attach+0x90/0x90 [ 15.961242] [c23cb2c9] __device_attach+0x39/0x50 [ 15.966692] [c23c93f4] bus_for_each_drv+0x34/0x70 [ 15.972238] [c23cae2b] device_attach+0x7b/0x90 [ 15.977492
Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
Du, ChangbinX changbinx...@intel.com writes: From: Du, Changbin changbinx...@intel.com In cdc_ncm_bind() function, it call cdc_ncm_bind_common() to setup usb. But cdc_ncm_bind_common() may meet error and cause usbnet_disconnect() be called which calls free_netdev(net). I am sure you are right, but I really don't see how that can happen. AFAICS, we ensure that the intfdata is set to NULL before calling usb_driver_release_interface() in the error cleanup in cdc_ncm_bind_common(): error2: usb_set_intfdata(ctx-control, NULL); usb_set_intfdata(ctx-data, NULL); if (ctx-data != ctx-control) usb_driver_release_interface(driver, ctx-data); error: cdc_ncm_free((struct cdc_ncm_ctx *)dev-data[0]); dev-data[0] = 0; dev_info(dev-udev-dev, bind() failure\n); return -ENODEV; Thus we hit the test in disconnect, and usbnet_disconnect() is never called: static void cdc_ncm_disconnect(struct usb_interface *intf) { struct usbnet *dev = usb_get_intfdata(intf); if (dev == NULL) return; /* already disconnected */ usbnet_disconnect(intf); } So we should really be OK, but we are not I would appreciate it if anyone took the time to feed me why, with a very small spoon... diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index 43afde8..af37ecf 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -603,14 +603,15 @@ static int cdc_ncm_bind(struct usbnet *dev, struct usb_interface *intf) /* NCM data altsetting is always 1 */ ret = cdc_ncm_bind_common(dev, intf, 1); - - /* - * We should get an event when network connection is connected or - * disconnected. Set network connection in disconnected state - * (carrier is OFF) during attach, so the IP network stack does not - * start IPv6 negotiation and more. - */ - usbnet_link_change(dev, 0, 0); + if (!ret) { + /* + * We should get an event when network connection is connected + * or disconnected. Set network connection in disconnected + * state (carrier is OFF) during attach, so the IP network stack + * does not start IPv6 negotiation and more. + */ + usbnet_link_change(dev, 0, 0); + } return ret; } This change does of course make sense in any case, so no objections there. But maybe you could modify cdc_ncm to set the new FLAG_LINK_INTR instead, and completely drop the usbnet_link_change() from this driver? This will make usbnet_probe() handle the call, and it will avoid it if cdc_ncm_bind() failed. Bjørn -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
From: Du, ChangbinX changbinx...@intel.com Date: Tue, 29 Oct 2013 03:30:42 + In cdc_ncm_bind() function, it call cdc_ncm_bind_common() to setup usb. But cdc_ncm_bind_common() may meet error and cause usbnet_disconnect() be called which calls free_netdev(net). Thus usbnet structure(alloced with net_device structure) will be freed,too. So we cannot call usbnet_link_change() if cdc_ncm_bind_common() return error. This is not the bug. The problem is in cdc_ncm_bind_common(). It seems to leave dangling interface data pointers in some cases, and then branches just to error so that they don't get cleared back out. This bypasses the protection present in cdc_ncm_disconnect() meant to avoid this problem. -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net/cdc_ncm: fix null pointer panic at usbnet_link_change
From: Du, Changbin changbinx...@intel.com In cdc_ncm_bind() function, it call cdc_ncm_bind_common() to setup usb. But cdc_ncm_bind_common() may meet error and cause usbnet_disconnect() be called which calls free_netdev(net). Thus usbnet structure(alloced with net_device structure) will be freed,too. So we cannot call usbnet_link_change() if cdc_ncm_bind_common() return error. BUG: unable to handle kernel NULL pointer dereference at 0078 EIP is at usbnet_link_change+0x1e/0x80 Call Trace: [c24bc69a] cdc_ncm_bind+0x3a/0x50 [c24b8d32] usbnet_probe+0x282/0x7d0 [c2167f2c] ? sysfs_new_dirent+0x6c/0x100 [c2821253] ? mutex_lock+0x13/0x40 [c24bb278] cdc_ncm_probe+0x8/0x10 [c24e0ef7] usb_probe_interface+0x187/0x2c0 [c23caa8a] ? driver_sysfs_add+0x6a/0x90 [c23cb290] ? __driver_attach+0x90/0x90 [c23caf14] driver_probe_device+0x74/0x360 [c24e07b1] ? usb_match_id+0x41/0x60 [c24e081e] ? usb_device_match+0x4e/0x90 [c23cb290] ? __driver_attach+0x90/0x90 [c23cb2c9] __device_attach+0x39/0x50 [c23c93f4] bus_for_each_drv+0x34/0x70 [c23cae2b] device_attach+0x7b/0x90 [c23cb290] ? __driver_attach+0x90/0x90 [c23ca38f] bus_probe_device+0x6f/0x90 [c23c8a08] device_add+0x558/0x630 [c24e4821] ? usb_create_ep_devs+0x71/0xd0 [c24dd0db] ? create_intf_ep_devs+0x4b/0x70 [c24df2bf] usb_set_configuration+0x4bf/0x800 [c23cb290] ? __driver_attach+0x90/0x90 [c24e809b] generic_probe+0x2b/0x90 [c24e105c] usb_probe_device+0x2c/0x70 [c23caf14] driver_probe_device+0x74/0x360 Signed-off-by: Du, Changbin changbinx...@intel.com --- drivers/net/usb/cdc_ncm.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index 43afde8..af37ecf 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -603,14 +603,15 @@ static int cdc_ncm_bind(struct usbnet *dev, struct usb_interface *intf) /* NCM data altsetting is always 1 */ ret = cdc_ncm_bind_common(dev, intf, 1); - - /* -* We should get an event when network connection is connected or -* disconnected. Set network connection in disconnected state -* (carrier is OFF) during attach, so the IP network stack does not -* start IPv6 negotiation and more. -*/ - usbnet_link_change(dev, 0, 0); + if (!ret) { + /* +* We should get an event when network connection is connected +* or disconnected. Set network connection in disconnected +* state (carrier is OFF) during attach, so the IP network stack +* does not start IPv6 negotiation and more. +*/ + usbnet_link_change(dev, 0, 0); + } return ret; } -- 1.8.3.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html