Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
On Sat, Mar 15, 2014 at 9:11 PM, Johannes Berg wrote:
> On Sat, 2014-03-15 at 21:03 +0530, Krishna Chaitanya wrote:
>
>> > > what RC are u using? Default should be minstrel, i dont see
>> > > a reason for rc alloc to fail (remote reason kmalloc failure),
>> > > so did you disable RC completely? No prints either w.r.t RC either in
>> > > dmesg?
>> >
>> > Pay attention to the .config.
>> >
>> Missed the attachment, thanks for pointing.
>> As guessed the rate control is empty causing the registration fail.
>
> It still shouldn't crash though. Looks like there's a fix in this
> thread, can somebody verify & post it?
>
Yes, it should not crash.

The change suggested by Martin is not correct: there is no double free,
because the list he mentioned will be empty (a radio is added to the
list only after successful registration).

The problem here is that platform_driver_unregister() internally calls
driver_unregister(), which tries to get the kobject through
get_device(), but we have already freed the kobject using
device_unregister() (which calls device_del(), which frees the kobject).

In the other failure cases we use mac80211_hwsim_free() and return, so
the call to platform_driver_unregister() is not there, hence no crash.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
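Editorial sketch of the ordering problem described in the message above. This is a userspace toy model, not kernel code: every name prefixed with toy_ is hypothetical, and the "ordered" variant is only an assumption about one way to avoid the stale pointer, not a fix proposed in this thread. It counts how often a driver-side walk would touch a device that has already been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct device/driver. */
struct toy_device {
    bool bound;     /* still referenced from the driver's device list */
    bool released;  /* backing object has already been given up */
};

struct toy_driver {
    struct toy_device *devs[4];
    size_t ndevs;
    int stale_hits; /* how often the walk met a released device */
};

/* Buggy order: the device is released, but the driver keeps its pointer. */
static void toy_device_unregister_buggy(struct toy_device *dev)
{
    dev->released = true;
}

/* Assumed alternative: drop the driver's pointer first, then release. */
static void toy_device_unregister_ordered(struct toy_driver *drv,
                                          struct toy_device *dev)
{
    for (size_t i = 0; i < drv->ndevs; i++) {
        if (drv->devs[i] == dev) {
            drv->devs[i] = drv->devs[--drv->ndevs];
            break;
        }
    }
    dev->bound = false;
    dev->released = true;
}

/* Walks the driver's list like a detach loop and counts stale entries. */
static void toy_driver_unregister(struct toy_driver *drv)
{
    assert(drv->ndevs <= 4);
    for (size_t i = 0; i < drv->ndevs; i++) {
        if (drv->devs[i]->released)
            drv->stale_hits++; /* real code would touch freed memory here */
        drv->devs[i]->bound = false;
    }
    drv->ndevs = 0;
}

/* Models a failed init path; returns the number of stale accesses. */
int toy_failed_init(bool ordered)
{
    struct toy_device dev = { .bound = true, .released = false };
    struct toy_driver drv = { .devs = { &dev }, .ndevs = 1, .stale_hits = 0 };

    if (ordered)
        toy_device_unregister_ordered(&drv, &dev);
    else
        toy_device_unregister_buggy(&dev);

    toy_driver_unregister(&drv);
    return drv.stale_hits;
}
```

Calling toy_failed_init(false) models the failing path (release first, then driver teardown), while toy_failed_init(true) models a path in which the driver no longer sees the device by the time it is torn down.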
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
On Sat, 2014-03-15 at 21:03 +0530, Krishna Chaitanya wrote:

> > > what RC are u using? Default should be minstrel, i dont see
> > > a reason for rc alloc to fail (remote reason kmalloc failure),
> > > so did you disable RC completely? No prints either w.r.t RC either in
> > > dmesg?
> >
> > Pay attention to the .config.
> >
> Missed the attachment, thanks for pointing.
> As guessed the rate control is empty causing the registration fail.

It still shouldn't crash though. Looks like there's a fix in this
thread, can somebody verify & post it?

Thx,
johannes
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
On Sat, Mar 15, 2014 at 8:50 PM, Johannes Berg wrote:
>
> On Thu, 2014-03-13 at 02:15 +0530, Krishna Chaitanya wrote:
>
> > From the logs it looks like "rate_control_alloc" is failed,
> > causing ieee80211_register_hw to fail triggering the crash.
>
> Yes.
>
> > what RC are u using? Default should be minstrel, i dont see
> > a reason for rc alloc to fail (remote reason kmalloc failure),
> > so did you disable RC completely? No prints either w.r.t RC either in
> > dmesg?
>
> Pay attention to the .config.
>
Missed the attachment, thanks for pointing it out.
As guessed, the rate control is empty, causing the registration to fail:

CONFIG_MAC80211=y
# CONFIG_MAC80211_RC_PID is not set
# CONFIG_MAC80211_RC_MINSTREL is not set
CONFIG_MAC80211_RC_DEFAULT=""
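Editorial sketch of why an empty rate control configuration can surface as error -2. On Linux, errno value 2 is ENOENT, which is consistent with the "ieee80211_register_hw failed (-2)" line in the report. The code below is a userspace toy, not mac80211 code: the registry type and the toy_rc_* functions are hypothetical, and the lookup order (requested name first, then configured default) is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the set of rate control algorithms built in. */
struct toy_rc_registry {
    const char *names[4];
    size_t count;
};

/* Returns the registered name matching `name`, or NULL if absent/empty. */
static const char *toy_rc_find(const struct toy_rc_registry *reg,
                               const char *name)
{
    if (!name || !*name)
        return NULL;
    for (size_t i = 0; i < reg->count; i++)
        if (strcmp(reg->names[i], name) == 0)
            return reg->names[i];
    return NULL;
}

/*
 * Tries the requested algorithm, then the configured default.
 * Returns 0 and sets *chosen on success, -ENOENT if nothing resolves.
 */
int toy_rc_select(const struct toy_rc_registry *reg, const char *requested,
                  const char *config_default, const char **chosen)
{
    const char *alg = toy_rc_find(reg, requested);

    if (!alg)
        alg = toy_rc_find(reg, config_default);
    if (!alg)
        return -ENOENT;

    *chosen = alg;
    return 0;
}
```

With an empty registry and an empty default string, which mirrors the .config quoted above, toy_rc_select() has nothing to fall back on and reports -ENOENT; with "minstrel" registered and set as the default it succeeds.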
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
On Thu, 2014-03-13 at 02:15 +0530, Krishna Chaitanya wrote:

> From the logs it looks like "rate_control_alloc" is failed,
> causing ieee80211_register_hw to fail triggering the crash.

Yes.

> what RC are u using? Default should be minstrel, i dont see
> a reason for rc alloc to fail (remote reason kmalloc failure),
> so did you disable RC completely? No prints either w.r.t RC either in
> dmesg?

Pay attention to the .config.

johannes
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
Hi Martin,

On Wed, Mar 12, 2014 at 06:08:06PM +0100, Martin Pitt wrote:
> Hey Fengguang,
>
> Fengguang Wu [2014-03-05 21:23 +0800]:
> > git bisect start v3.10 v3.9 --
> [snip]
> > git bisect bad d8efcf38b13df3e9e889cf7cc214cb85dc53600c  # 04:30  0-  3  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
> > git bisect bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:30  0-  19  Add linux-next specific files for 20140228
>
> I noticed that this bisect list didn't include Sasha Levin's followup fix:
>
>   http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/mac80211_hwsim.c?id=c07fe5ae06
>
> That was supposed to fix problems at registration. Is that actually
> included in your bisection? I suppose it is, but it's quite clear that
> with only 9ea927 you'd get crashes with building it statically.

The "git bisect start v3.10 v3.9" range itself may not include that
fixing commit; however, the bisect script checks the latest mainline and
linux-next kernels at the end, and this line shows that linux-next on
that day is still bad:

> > git bisect bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:30  0-  19  Add linux-next specific files for 20140228

I confirmed that b148a42ba7823e34971cd4e5b05a5c74fa3311ed does contain
the fixing commit c07fe5ae06.

Thanks,
Fengguang
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
On Wed, Mar 12, 2014 at 11:04 PM, Martin Pitt wrote:
>
> Fengguang Wu [2014-03-08 20:11 +0800]:
> > [4.429993] mac80211_hwsim: ieee80211_register_hw failed (-2)
> > [...]
> > [4.431924] [<c12377de>] get_device+0xf/0x17
> > [4.431924] [<c123a165>] driver_detach+0x38/0x8f
> > [4.431924] [<c1239433>] bus_remove_driver+0x53/0x66
> > [4.431924] [<c123a535>] driver_unregister+0x38/0x3d
> > [4.431924] [<c123b3aa>] platform_driver_unregister+0xb/0xd
> > [4.431924] [<c1c4ac9f>] init_mac80211_hwsim+0x3a5/0x3b6
>
> So that first message is from
>
>   http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/mac80211_hwsim.c?id=9ea927748#n2428
>
> At this point we registered the platform driver and the class, and it
> should have created two devices (at least for the default radios=2).
> What's odd is that I don't see this printk in your kernel log:
>
>   mac80211_hwsim: Initializing radio %d
>
> If for some reasons "radio" is 0, it would not show this and not
> initialize data->dev, but then you shouldn't get to
> ieee80211_register_hw() either as it's in the same loop. So that's a
> bit of a mystery to me.
>
> On failure, above ieee80211_register_hw() jumps to the cleanup:
>
> | failed_hw:
> |         device_unregister(data->dev);
> | failed_drvdata:
> |         ieee80211_free_hw(hw);
> | failed:
> |         mac80211_hwsim_free();
> | failed_unregister_driver:
> |         driver_unregister(&mac80211_hwsim_driver);
> |         return err;
> | }
>
> The mac80211_hwsim_free() function again calls
> device_unregister(data->dev) for a list (not sure which, I'm not
> certain how to interpret
>
>   list_for_each_entry_safe(data, tmpdata, &tmplist, list)
>
> ) Could that be the double free causing the memory corruption?
>
> If you are in a position to do quick builds and tests, does the crash
> go away with this?
>
>                         printk(KERN_DEBUG "mac80211_hwsim: device_bind_driver failed (%d)\n",
>                                err);
> -                       goto failed_hw;
> +                       goto failed_drvdata;
>                 }
>
> (I'm not claiming that this is correct, just taking a stab at
> understanding what happens) If not, does it go away with changing the
> goto to failed_unregister_driver()?
>
From the logs it looks like rate_control_alloc() failed, causing
ieee80211_register_hw() to fail and triggering the crash.

Which RC are you using? The default should be minstrel; I don't see
a reason for the RC alloc to fail (other than the remote possibility of
a kmalloc failure), so did you disable RC completely? Are there no
prints w.r.t. RC in dmesg either?
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
Fengguang Wu [2014-03-08 20:11 +0800]:
> [4.429993] mac80211_hwsim: ieee80211_register_hw failed (-2)
> [...]
> [4.431924] [<c12377de>] get_device+0xf/0x17
> [4.431924] [<c123a165>] driver_detach+0x38/0x8f
> [4.431924] [<c1239433>] bus_remove_driver+0x53/0x66
> [4.431924] [<c123a535>] driver_unregister+0x38/0x3d
> [4.431924] [<c123b3aa>] platform_driver_unregister+0xb/0xd
> [4.431924] [<c1c4ac9f>] init_mac80211_hwsim+0x3a5/0x3b6

So that first message is from

  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/mac80211_hwsim.c?id=9ea927748#n2428

At this point we registered the platform driver and the class, and it
should have created two devices (at least for the default radios=2).
What's odd is that I don't see this printk in your kernel log:

  mac80211_hwsim: Initializing radio %d

If for some reason "radio" is 0, it would not show this and not
initialize data->dev, but then you shouldn't get to
ieee80211_register_hw() either, as it's in the same loop. So that's a
bit of a mystery to me.

On failure, the ieee80211_register_hw() call above jumps to the cleanup:

| failed_hw:
|         device_unregister(data->dev);
| failed_drvdata:
|         ieee80211_free_hw(hw);
| failed:
|         mac80211_hwsim_free();
| failed_unregister_driver:
|         driver_unregister(&mac80211_hwsim_driver);
|         return err;
| }

The mac80211_hwsim_free() function again calls
device_unregister(data->dev) for a list (not sure which, I'm not
certain how to interpret

  list_for_each_entry_safe(data, tmpdata, &tmplist, list)

). Could that be the double free causing the memory corruption?

If you are in a position to do quick builds and tests, does the crash
go away with this?

                        printk(KERN_DEBUG "mac80211_hwsim: device_bind_driver failed (%d)\n",
                               err);
-                       goto failed_hw;
+                       goto failed_drvdata;
                }

(I'm not claiming that this is correct, just taking a stab at
understanding what happens.) If not, does it go away with changing the
goto to failed_unregister_driver?

Thanks,

Martin
--
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
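Editorial sketch of how the goto cleanup ladder quoted in the message above falls through. It is a userspace toy: the enum, toy_note() and toy_cleanup_trace() are hypothetical helpers, and the step names are just strings copied from the quoted excerpt. It records which cleanup steps run for each entry label, which makes it easy to see what changing a `goto failed_hw` into `goto failed_drvdata` would skip.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical labels for the four entry points of the cleanup ladder. */
enum toy_fail { FAIL_HW, FAIL_DRVDATA, FAIL_GENERIC, FAIL_UNREGISTER_DRIVER };

/* Appends one step name to a comma-separated trace in buf. */
static void toy_note(char *buf, size_t len, const char *step)
{
    size_t used = strlen(buf);

    snprintf(buf + used, len - used, "%s%s", used ? "," : "", step);
}

/* Enters the ladder at `where` and records every step that falls through. */
void toy_cleanup_trace(enum toy_fail where, char *buf, size_t len)
{
    assert(len > 0);
    buf[0] = '\0';

    switch (where) {
    case FAIL_HW:                goto failed_hw;
    case FAIL_DRVDATA:           goto failed_drvdata;
    case FAIL_GENERIC:           goto failed;
    case FAIL_UNREGISTER_DRIVER: goto failed_unregister_driver;
    }

failed_hw:
    toy_note(buf, len, "device_unregister");
failed_drvdata:
    toy_note(buf, len, "ieee80211_free_hw");
failed:
    toy_note(buf, len, "mac80211_hwsim_free");
failed_unregister_driver:
    toy_note(buf, len, "driver_unregister");
}
```

Entering at FAIL_HW records all four steps in order; entering at FAIL_DRVDATA records the same sequence minus the leading device_unregister, and in both cases the final driver_unregister step still runs.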
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
Hey Fengguang,

Fengguang Wu [2014-03-05 21:23 +0800]:
> git bisect start v3.10 v3.9 --
> git bisect bad ff89acc563a0bd49965674f56552ad6620415fe2  # 03:01  0-  2  Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
> git bisect bad 24d0c2542b38963ae4d5171ecc0a2c1326c656bc  # 03:03  0-  2  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
> git bisect good 151173e8ce9b95d7eedb9035cfaffbdb7cb2  # 03:11  20+  20  Merge tag 'for-v3.10' of git://git.infradead.org/battery-2.6
> git bisect bad e95893004104054d49406fd108fefa3ddc054366  # 03:13  0-  4  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
> git bisect good 8a72f3820c4d14b27ad5336aed00063a7a7f1bef  # 03:19  20+  20  Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
> git bisect bad 600fe9751aeb6f6b72de84076a05c5b8c04152c0  # 03:21  0-  5  ipc_schedule_free() can do vfree() directly now
> git bisect bad 126de6b20bfb82cc19012d5048f11f339ae5a021  # 03:23  0-  1  linkage.h: fix build breakage due to symbol prefix handling
> git bisect good 251df49db3327c64bf917bfdba94491fde2b4ee0  # 03:27  20+  20  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> git bisect bad 73287a43cc79ca06629a88d1a199cd283f42456a  # 03:29  0-  2  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> git bisect good 953c96e0d85615d1ab1f100e525d376053294dc2  # 03:37  20+  0  tg3: Use bool not int
> git bisect bad 4fc4118cdb29ab946b8a586fc766ebb6ae1e1c90  # 03:39  0-  2  wil6210: more Rx descriptor accessor functions
> git bisect good e73dcfbf061b524fe9aaef56cf3c2e234a45ec19  # 03:46  20+  7  Bluetooth: hidp: fix sending output reports on intr channel
> git bisect good d5590bba37f3c7d496195648532d5313abb43891  # 03:50  20+  0  NFC: pn533: Re-group fields in struct pn533
> git bisect bad 06d961a8e210035bff7e82f466107f9ab4a8fd94  # 03:52  0-  7  mac80211/minstrel: use the new rate control API
> git bisect bad 97990a060e6757f48b931a3946b17c1c4362c3fb  # 03:54  0-  3  nl80211: allow using wdev identifiers to get scan results
> git bisect bad 85220d71bf3ca1ba9129e0744247ae5f61bec559  # 03:56  1-  5  mac80211: support secondary channel offset in CSA
> git bisect bad 0ca54f6c5fd4ce58aa044d1fc7f00d7f6cf2801c  # 03:57  1-  6  mac80211: provide SSID in IBSS mode
> git bisect bad 3088f7d2db42925808c4b43a6258647ee4d1dd5f  # 03:59  3-  8  mac80211: stringify another plink state
> git bisect bad 9d6d6f4924133567a108a862d9cf949cd03f71cb  # 04:02  0-  2  mac80211: unset FC retry bit in mesh fwding path
> git bisect bad 9ea927748ced4953f1e9a0f1fa1fdeacd1018b4e  # 04:05  0-  9  mac80211_hwsim: Register and bind to driver
> # first bad commit: [9ea927748ced4953f1e9a0f1fa1fdeacd1018b4e] mac80211_hwsim: Register and bind to driver
> git bisect good ddc4db2e3d5393ede7a9222bb3b7522a603a4678  # 04:27  69+  18  mac80211: make ieee802_11_parse_elems an inline
> git bisect bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:27  0-  19  Add linux-next specific files for 20140228
> git bisect bad d8efcf38b13df3e9e889cf7cc214cb85dc53600c  # 04:30  0-  3  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
> git bisect bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:30  0-  19  Add linux-next specific files for 20140228

I noticed that this bisect list didn't include Sasha Levin's followup fix:

  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/mac80211_hwsim.c?id=c07fe5ae06

That was supposed to fix problems at registration. Is that actually
included in your bisection? I suppose it is, but it's quite clear that
with only 9ea927 you'd get crashes with building it statically.

I have no real idea why ieee80211_register_hw() would fail (but then
again, I'm a kernel n00b). It's apparently related to doing
platform_driver_register() before, so perhaps it's missing another
field in the struct.

"unable to handle kernel paging request", is that a normal error
message if a driver fails to initialize? I would have assumed that the
kernel wouldn't panic if a single driver fails to initialize. Or does
that mean it actually tries to access uninit'ed memory?

I can certainly check out linux-next, build the whole thing statically,
and try to twiddle things a bit, but I'm afraid that's just going to
take a while (both because this is all new to me and I'm working on
other things ATM), so please don't hold your breath. :-)

Thanks,

Martin
--
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404
Martin,

This is a long-standing bug and it still shows up in linux-next.
It can be reliably reproduced with the KVM script below.

#!/bin/bash

kvm=(
	qemu-system-x86_64 -cpu kvm64 -enable-kvm
	-kernel $1
	-smp 2
	-m 256M
	-net nic,vlan=0,macaddr=00:00:00:00:00:00,model=virtio
	-net user,vlan=0
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-display none
	-serial stdio
	-monitor null
)

append=(
	debug
	sched_debug
	apic=debug
	ignore_loglevel
	earlyprintk=ttyS0,115200
	sysrq_always_enabled
	panic=10
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rcupdate.rcu_cpu_stall_timeout=100
	rw
)

"${kvm[@]}" --append "${append[*]}"

% kvm-0day.sh /kernel/i386-randconfig-c5-03010009/b148a42ba7823e34971cd4e5b05a5c74fa3311ed/vmlinuz-3.14.0-rc4-next-20140228-05738-gb148a42
early console in setup code
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.14.0-rc4-next-20140228-05738-gb148a42 (kbuild@cairo) (gcc version 4.8.1 (Debian 4.8.1-8) ) #8 SMP PREEMPT Sat Mar 1 01:05:21 CST 2014
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00]   Transmeta GenuineTMx86
[0.00]   Transmeta TransmetaCPU
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x0fffdfff] usable
[0.00] BIOS-e820: [mem 0x0fffe000-0x0fff] reserved
[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
[0.00] debug: ignoring loglevel setting.
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.4 present.
[0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2011
[0.00] Hypervisor detected: KVM
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0xfffe max_arch_pfn = 0x100
[0.00] MTRR default type: write-back
[0.00] MTRR fixed ranges enabled:
[0.00] 0-9 write-back
[0.00] A-B uncachable
[0.00] C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00] 0 base 008000 mask FF8000 uncachable
[0.00] 1 disabled
[0.00] 2 disabled
[0.00] 3 disabled
[0.00] 4 disabled
[0.00] 5 disabled
[0.00] 6 disabled
[0.00] 7 disabled
[0.00] Scan for SMP in [mem 0x-0x03ff]
[0.00] Scan for SMP in [mem 0x0009fc00-0x0009]
[0.00] Scan for SMP in [mem 0x000f-0x000f]
[0.00] found SMP MP-table at [mem 0x000f1840-0x000f184f] mapped at [c00f1840]
[0.00] mpc: f1850-f193c
[0.00] Scanning 1 areas for low memory corruption
[0.00] initial memory mapped: [mem 0x-0x027f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x0fc0-0x0fdf]
[0.00] [mem 0x0fc0-0x0fdf] page 4k
[0.00] BRK [0x023ac000, 0x023acfff] PGTABLE
[0.00] init_memory_mapping: [mem 0x0c00-0x0fbf]
[0.00] [mem 0x0c00-0x0fbf] page 4k
[0.00] BRK [0x023ad000, 0x023adfff] PGTABLE
[0.00] BRK [0x023ae000, 0x023aefff] PGTABLE
[0.00] BRK [0x023af000, 0x023a] PGTABLE
[0.00] BRK [0x023b, 0x023b0fff] PGTABLE
[0.00] BRK [0x023b1000, 0x023b1fff] PGTABLE
[0.00] init_memory_mapping: [mem 0x0010-0x0bff]
[0.00] [mem 0x0010-0x0bff] page 4k
[0.00] init_memory_mapping: [mem 0x0fe0-0x0fffdfff]
[0.00] [mem 0x0fe0-0x0fffdfff] page 4k
[0.00] ACPI: RSDP 0x000F16B0 14 (v00 BOCHS )
[0.00] ACPI: RSDT 0x0FFFE3F0 34 (v01 BOCHS BXPCRSDT 0001 BXPC 0001)
[0.00] ACPI: FACP 0x0F80 74 (v01 BOCHS BXPCFACP 0001 BXPC 0001)
[0.00] ACPI: DSDT 0x0FFFE430 001137 (v01 BXPC BXDSDT 0001 INTL 20100528)
[0.00] ACPI: FACS 0x0F40 40
[0.00] ACPI: SSDT 0x06A0 000899 (v01 BOCHS BXPCSSDT 0001 BXPC 0001)
[0.00] ACPI: APIC 0x05B0 80 (v01 BOCHS BXPCAPIC 0001 BXPC 0001)
[0.00]
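Because the script routes the guest's serial console to stdio, the boot output can be saved to a file and checked for the oops automatically instead of being read by eye. What follows is a rough sketch under that assumption: crash_seen is a hypothetical helper name, and the pattern is simply the BUG message from the subject of this thread.

```shell
#!/bin/bash

# Return success if a saved console log contains the paging-request BUG
# reported in this thread.
# (crash_seen is a hypothetical helper name used only for illustration.)
crash_seen() {
    local log=$1
    grep -q 'BUG: unable to handle kernel paging request' "$log"
}
```

A run could then look like `kvm-0day.sh vmlinuz | tee boot.log` followed by `crash_seen boot.log && echo reproduced`, where vmlinuz and boot.log stand in for whatever kernel image and log path are actually in use.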