Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-15 Thread Krishna Chaitanya
On Sat, Mar 15, 2014 at 9:11 PM, Johannes Berg wrote:
> On Sat, 2014-03-15 at 21:03 +0530, Krishna Chaitanya wrote:
>
>> > > what RC are u using? Default should be minstrel, i dont see
>> > > a reason for rc alloc to fail (remote reason kmalloc failure),
>> > > so did you disable RC completely? No prints either w.r.t RC either in
>> > > dmesg?
>> >
>> > Pay attention to the .config.
>> >
>> Missed the attachment, thanks for pointing.
>> As guessed the rate control is empty causing the registration fail.
>
> It still shouldn't crash though. Looks like there's a fix in this
> thread, can somebody verify & post it?
>
Yes, it should not crash. The change suggested by Martin is not correct:
there is no double free, because the list he mentioned will be empty
(we add the radio to the list only after a successful registration).

The problem here is that platform_driver_unregister internally calls
driver_unregister, which tries to get the kobject through get_device,
but we have already freed the kobject using device_unregister (which
calls device_del, which frees the kobject).

In the other failure cases we call mac80211_hwsim_free() and return, so
platform_driver_unregister is never called, hence no crash.
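
The ordering problem described above can be modelled in user space. What
follows is only a sketch under stated assumptions, not the driver's code:
fake_device, fake_driver and every function name are hypothetical
stand-ins, and "unbind before unregister" is one possible remedy assumed
here for illustration, not a fix confirmed in this thread. The model
counts how often driver teardown visits a device that has already been
released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct fake_device {
    bool registered;               /* true between register and unregister */
};

struct fake_driver {
    struct fake_device *bound_dev; /* device the driver visits on teardown */
    int stale_accesses;            /* teardown visits to a released device */
};

void fake_device_register(struct fake_device *dev)
{
    assert(dev != NULL);
    dev->registered = true;
}

void fake_bind(struct fake_driver *drv, struct fake_device *dev)
{
    drv->bound_dev = dev;
}

/* Drop the driver's reference so teardown no longer visits the device. */
void fake_release_driver(struct fake_driver *drv, struct fake_device *dev)
{
    if (drv->bound_dev == dev)
        drv->bound_dev = NULL;
}

void fake_device_unregister(struct fake_device *dev)
{
    dev->registered = false;       /* models the object being released */
}

/* Driver teardown visits the device it still believes is bound. */
void fake_driver_unregister(struct fake_driver *drv)
{
    if (drv->bound_dev != NULL && !drv->bound_dev->registered)
        drv->stale_accesses++;     /* the kernel would fault around here */
    drv->bound_dev = NULL;
}

/*
 * Simulate an init path whose final step fails after the device was bound.
 * unbind_first selects whether the error path unbinds before unregistering.
 * Returns how many stale accesses driver teardown performed.
 */
int simulate_failed_init(bool unbind_first)
{
    struct fake_device dev = { false };
    struct fake_driver drv = { NULL, 0 };

    fake_device_register(&dev);
    fake_bind(&drv, &dev);

    /* ...pretend the last registration step failed here... */
    if (unbind_first)
        fake_release_driver(&drv, &dev);
    fake_device_unregister(&dev);
    fake_driver_unregister(&drv);

    return drv.stale_accesses;
}
```

Calling simulate_failed_init(false) models the crashing order (device
released while the driver still refers to it); simulate_failed_init(true)
models an error path that detaches the driver first.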
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-15 Thread Johannes Berg
On Sat, 2014-03-15 at 21:03 +0530, Krishna Chaitanya wrote:

> > > what RC are u using? Default should be minstrel, i dont see
> > > a reason for rc alloc to fail (remote reason kmalloc failure),
> > > so did you disable RC completely? No prints either w.r.t RC either in
> > > dmesg?
> >
> > Pay attention to the .config.
> >
> Missed the attachment, thanks for pointing.
> As guessed the rate control is empty causing the registration fail.

It still shouldn't crash though. Looks like there's a fix in this
thread, can somebody verify & post it?

Thx,
johannes



Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-15 Thread Krishna Chaitanya
On Sat, Mar 15, 2014 at 8:50 PM, Johannes Berg wrote:
>
> On Thu, 2014-03-13 at 02:15 +0530, Krishna Chaitanya wrote:
>
> > From the logs it looks like "rate_control_alloc" is failed,
> > causing ieee80211_register_hw to fail triggering the crash.
>
> Yes.
>
> > what RC are u using? Default should be minstrel, i dont see
> > a reason for rc alloc to fail (remote reason kmalloc failure),
> > so did you disable RC completely? No prints either w.r.t RC either in
> > dmesg?
>
> Pay attention to the .config.
>
I missed the attachment, thanks for pointing it out.
As guessed, the rate control is empty, which causes the registration to fail.

CONFIG_MAC80211=y
# CONFIG_MAC80211_RC_PID is not set
# CONFIG_MAC80211_RC_MINSTREL is not set
CONFIG_MAC80211_RC_DEFAULT=""
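
To illustrate why an empty rate-control configuration makes registration
fail, here is a small user-space sketch. It is not mac80211 code: rc_algo,
rc_find and rc_select are hypothetical names, and the sketch assumes the
failure is reported as -ENOENT. ENOENT is errno 2 on Linux, which would be
consistent with the "ieee80211_register_hw failed (-2)" line in the
reported log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical registry entry. An empty table models a kernel built with
 * neither CONFIG_MAC80211_RC_PID nor CONFIG_MAC80211_RC_MINSTREL.
 */
struct rc_algo {
    const char *name;
};

/* Look up an algorithm by name; NULL or "" means no name was given. */
const struct rc_algo *rc_find(const struct rc_algo *table, size_t n,
                              const char *name)
{
    if (name == NULL || name[0] == '\0')
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/*
 * Try the requested name first, then the configured default.
 * Returns 0 on success or -ENOENT when nothing can be selected.
 */
int rc_select(const struct rc_algo *table, size_t n,
              const char *requested, const char *config_default,
              const struct rc_algo **out)
{
    const struct rc_algo *algo;

    assert(out != NULL);
    algo = rc_find(table, n, requested);
    if (algo == NULL)
        algo = rc_find(table, n, config_default);
    if (algo == NULL)
        return -ENOENT;

    *out = algo;
    return 0;
}
```

With an empty table and an empty default string, as in the .config above,
rc_select() has nothing to fall back on and returns the error.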


Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-15 Thread Johannes Berg
On Thu, 2014-03-13 at 02:15 +0530, Krishna Chaitanya wrote:

> From the logs it looks like "rate_control_alloc" is failed,
> causing ieee80211_register_hw to fail triggering the crash.

Yes.

> what RC are u using? Default should be minstrel, i dont see
> a reason for rc alloc to fail (remote reason kmalloc failure),
> so did you disable RC completely? No prints either w.r.t RC either in
> dmesg?

Pay attention to the .config.

johannes



Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-14 Thread Fengguang Wu
Hi Martin,

On Wed, Mar 12, 2014 at 06:08:06PM +0100, Martin Pitt wrote:
> Hey Fengguang,
> 
> Fengguang Wu [2014-03-05 21:23 +0800]:
> > git bisect start v3.10 v3.9 --
[snip]
> > git bisect  bad d8efcf38b13df3e9e889cf7cc214cb85dc53600c  # 04:30  0-  3  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
> > git bisect  bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:30  0- 19  Add linux-next specific files for 20140228
> 
> I noticed that this bisect list didn't include Sasha Levin's followup fix:
> 
>   
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/mac80211_hwsim.c?id=c07fe5ae06
> 
> That was supposed to fix problems at registration. Is that actually
> inclued in your bisection? I suppose it is, but it's quite clear that
> with only 9ea927 you'd get crashes with building it statically.

The "git bisect start v3.10 v3.9" range itself may not include that
fixing commit; however, the bisect script checks the latest mainline and
linux-next kernels at the end, and this line shows that linux-next on
that day was still bad:

> > git bisect  bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:30  0- 19  Add linux-next specific files for 20140228

I confirmed that b148a42ba7823e34971cd4e5b05a5c74fa3311ed does contain
the fixing commit c07fe5ae06.

Thanks,
Fengguang


Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-12 Thread Krishna Chaitanya
On Wed, Mar 12, 2014 at 11:04 PM, Martin Pitt wrote:
>
> Fengguang Wu [2014-03-08 20:11 +0800]:
> > [4.429993] mac80211_hwsim: ieee80211_register_hw failed (-2)
> > [...]
> > [4.431924]  [<c12377de>] get_device+0xf/0x17
> > [4.431924]  [<c123a165>] driver_detach+0x38/0x8f
> > [4.431924]  [<c1239433>] bus_remove_driver+0x53/0x66
> > [4.431924]  [<c123a535>] driver_unregister+0x38/0x3d
> > [4.431924]  [<c123b3aa>] platform_driver_unregister+0xb/0xd
> > [4.431924]  [<c1c4ac9f>] init_mac80211_hwsim+0x3a5/0x3b6
>
>
> So that first message is from
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/mac80211_hwsim.c?id=9ea927748#n2428
>
> At this point we registered the platform driver and the class, and it
> should have created two devices (at least for the default radios=2).
> What's odd is that I don't see this printk in your kernel log:
>
>   mac80211_hwsim: Initializing radio %d
>
> If for some reasons "radio" is 0, it would not show this and not
> initialize data->dev, but then you shouldn't get to
> ieee80211_register_hw() either as it's in the same loop. So that's a
> bit of a mystery to me.
>
> On failure, above ieee80211_register_hw() jumps to the cleanup:
>
> | failed_hw:
> |   device_unregister(data->dev);
> | failed_drvdata:
> |   ieee80211_free_hw(hw);
> | failed:
> |   mac80211_hwsim_free();
> | failed_unregister_driver:
> |   driver_unregister(&mac80211_hwsim_driver);
> |   return err;
> | }
>
>
> The mac80211_hwsim_free() function again calls
> device_unregister(data->dev) for a list (not sure which, I'm not
> certain how to interpret
>
>   list_for_each_entry_safe(data, tmpdata, &tmplist, list)
>
> ) Could that be the double free causing the memory corruption?
>
> If you are in a position to do quick builds and tests, does the crash
> go away with this?
>
>     printk(KERN_DEBUG "mac80211_hwsim: device_bind_driver failed (%d)\n",
>            err);
> -   goto failed_hw;
> +   goto failed_drvdata;
> }
>
> (I'm not claiming that this is correct, just taking a stab at
> understanding what happens) If not, does it go away with changing the
> goto to failed_unregister_driver()?
>
From the logs it looks like rate_control_alloc failed, causing
ieee80211_register_hw to fail and triggering the crash.
Which RC are you using? The default should be minstrel; I don't see
a reason for the RC alloc to fail (other than the remote possibility of
a kmalloc failure), so did you disable RC completely? Are there no
RC-related prints in dmesg either?


Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-12 Thread Martin Pitt
Fengguang Wu [2014-03-08 20:11 +0800]:
> [4.429993] mac80211_hwsim: ieee80211_register_hw failed (-2)
> [...]
> [4.431924]  [<c12377de>] get_device+0xf/0x17
> [4.431924]  [<c123a165>] driver_detach+0x38/0x8f
> [4.431924]  [<c1239433>] bus_remove_driver+0x53/0x66
> [4.431924]  [<c123a535>] driver_unregister+0x38/0x3d
> [4.431924]  [<c123b3aa>] platform_driver_unregister+0xb/0xd
> [4.431924]  [<c1c4ac9f>] init_mac80211_hwsim+0x3a5/0x3b6


So that first message is from
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/mac80211_hwsim.c?id=9ea927748#n2428

At this point we registered the platform driver and the class, and it
should have created two devices (at least for the default radios=2).
What's odd is that I don't see this printk in your kernel log:

  mac80211_hwsim: Initializing radio %d

If for some reason "radios" is 0, it would not show this and not
initialize data->dev, but then you shouldn't get to
ieee80211_register_hw() either, as it's in the same loop. So that's a
bit of a mystery to me.

On failure, the ieee80211_register_hw() call above jumps to the cleanup:

| failed_hw:
|   device_unregister(data->dev);
| failed_drvdata:
|   ieee80211_free_hw(hw);
| failed:
|   mac80211_hwsim_free();
| failed_unregister_driver:
|   driver_unregister(&mac80211_hwsim_driver);
|   return err;
| }


The mac80211_hwsim_free() function again calls
device_unregister(data->dev) for a list (not sure which, I'm not
certain how to interpret

  list_for_each_entry_safe(data, tmpdata, &tmplist, list)

) Could that be the double free causing the memory corruption?
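
On reading that iterator: in the kernel's list API the four arguments of
list_for_each_entry_safe() are the entry cursor, a second cursor that
holds the next entry, a pointer to the list head, and the name of the
list_head field inside the entry struct. The following user-space sketch
imitates that pattern; it is a simplified stand-in rather than the kernel
macros, and struct radio, add_radio and free_all_radios are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified user-space imitation of the kernel's intrusive list. */
struct list_node {
    struct list_node *next, *prev;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
#define ENTRY_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail_node(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

void list_del_node(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node->prev = NULL;
}

/* Hypothetical payload standing in for the per-radio data. */
struct radio {
    int id;
    struct list_node list;   /* the "member" argument names this field */
};

/* Append a new radio; returns 0 on success, -1 on allocation failure. */
int add_radio(struct list_node *head, int id)
{
    struct radio *r = malloc(sizeof(*r));

    if (r == NULL)
        return -1;
    r->id = id;
    list_add_tail_node(&r->list, head);
    return 0;
}

/*
 * Free every radio on the list and return how many were freed.
 * The successor is saved before the current entry is freed, which is the
 * job of the second cursor in the "_safe" iterator.
 */
int free_all_radios(struct list_node *head)
{
    int freed = 0;
    struct list_node *cur;

    assert(head != NULL);
    cur = head->next;
    while (cur != head) {
        struct list_node *next = cur->next;   /* plays the tmpdata role */
        struct radio *data = ENTRY_OF(cur, struct radio, list);

        list_del_node(cur);
        free(data);
        freed++;
        cur = next;
    }
    return freed;
}
```

On an empty list the loop body never runs, so a walk over a list that
nothing has been added to frees nothing.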

If you are in a position to do quick builds and tests, does the crash
go away with this?

    printk(KERN_DEBUG "mac80211_hwsim: device_bind_driver failed (%d)\n",
           err);
-   goto failed_hw;
+   goto failed_drvdata;
}

(I'm not claiming that this is correct, just taking a stab at
understanding what happens.) If not, does it go away when you change
the goto to failed_unregister_driver?

Thanks,

Martin
-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)




Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-12 Thread Martin Pitt
Hey Fengguang,

Fengguang Wu [2014-03-05 21:23 +0800]:
> git bisect start v3.10 v3.9 --
> git bisect  bad ff89acc563a0bd49965674f56552ad6620415fe2  # 03:01  0-  2  Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
> git bisect  bad 24d0c2542b38963ae4d5171ecc0a2c1326c656bc  # 03:03  0-  2  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
> git bisect good 151173e8ce9b95d7eedb9035cfaffbdb7cb2  # 03:11 20+ 20  Merge tag 'for-v3.10' of git://git.infradead.org/battery-2.6
> git bisect  bad e95893004104054d49406fd108fefa3ddc054366  # 03:13  0-  4  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
> git bisect good 8a72f3820c4d14b27ad5336aed00063a7a7f1bef  # 03:19 20+ 20  Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
> git bisect  bad 600fe9751aeb6f6b72de84076a05c5b8c04152c0  # 03:21  0-  5  ipc_schedule_free() can do vfree() directly now
> git bisect  bad 126de6b20bfb82cc19012d5048f11f339ae5a021  # 03:23  0-  1  linkage.h: fix build breakage due to symbol prefix handling
> git bisect good 251df49db3327c64bf917bfdba94491fde2b4ee0  # 03:27 20+ 20  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> git bisect  bad 73287a43cc79ca06629a88d1a199cd283f42456a  # 03:29  0-  2  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> git bisect good 953c96e0d85615d1ab1f100e525d376053294dc2  # 03:37 20+  0  tg3: Use bool not int
> git bisect  bad 4fc4118cdb29ab946b8a586fc766ebb6ae1e1c90  # 03:39  0-  2  wil6210: more Rx descriptor accessor functions
> git bisect good e73dcfbf061b524fe9aaef56cf3c2e234a45ec19  # 03:46 20+  7  Bluetooth: hidp: fix sending output reports on intr channel
> git bisect good d5590bba37f3c7d496195648532d5313abb43891  # 03:50 20+  0  NFC: pn533: Re-group fields in struct pn533
> git bisect  bad 06d961a8e210035bff7e82f466107f9ab4a8fd94  # 03:52  0-  7  mac80211/minstrel: use the new rate control API
> git bisect  bad 97990a060e6757f48b931a3946b17c1c4362c3fb  # 03:54  0-  3  nl80211: allow using wdev identifiers to get scan results
> git bisect  bad 85220d71bf3ca1ba9129e0744247ae5f61bec559  # 03:56  1-  5  mac80211: support secondary channel offset in CSA
> git bisect  bad 0ca54f6c5fd4ce58aa044d1fc7f00d7f6cf2801c  # 03:57  1-  6  mac80211: provide SSID in IBSS mode
> git bisect  bad 3088f7d2db42925808c4b43a6258647ee4d1dd5f  # 03:59  3-  8  mac80211: stringify another plink state
> git bisect  bad 9d6d6f4924133567a108a862d9cf949cd03f71cb  # 04:02  0-  2  mac80211: unset FC retry bit in mesh fwding path
> git bisect  bad 9ea927748ced4953f1e9a0f1fa1fdeacd1018b4e  # 04:05  0-  9  mac80211_hwsim: Register and bind to driver
> # first bad commit: [9ea927748ced4953f1e9a0f1fa1fdeacd1018b4e] mac80211_hwsim: Register and bind to driver
> git bisect good ddc4db2e3d5393ede7a9222bb3b7522a603a4678  # 04:27 69+ 18  mac80211: make ieee802_11_parse_elems an inline
> git bisect  bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:27  0- 19  Add linux-next specific files for 20140228
> git bisect  bad d8efcf38b13df3e9e889cf7cc214cb85dc53600c  # 04:30  0-  3  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
> git bisect  bad b148a42ba7823e34971cd4e5b05a5c74fa3311ed  # 04:30  0- 19  Add linux-next specific files for 20140228

I noticed that this bisect list didn't include Sasha Levin's followup fix:

  
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/mac80211_hwsim.c?id=c07fe5ae06

That was supposed to fix problems at registration. Is that actually
included in your bisection? I suppose it is, but it's quite clear that
with only 9ea927 you'd get crashes when building it statically.

I have no real idea why ieee80211_register_hw() would fail (but then
again, I'm a kernel n00b). It's apparently related to calling
platform_driver_register() beforehand, so perhaps another field is
missing in the struct. Is "unable to handle kernel paging request" a
normal error message if a driver fails to initialize? I would have
assumed that the kernel wouldn't panic if a single driver fails to
initialize. Or does that mean it actually tries to access uninitialized
memory?

I can certainly check out linux-next, build the whole thing
statically, and try to twiddle things a bit, but I'm afraid that's
just going to take a while (both because this is all new to me and I'm
working on other things ATM), so please don't hold your breath. :-)

Thanks,

Martin
-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)




Re: [mac80211_hwsim] BUG: unable to handle kernel paging request at ce1db404

2014-03-08 Thread Fengguang Wu
Martin,

This is a long-standing bug and it is still showing up in linux-next.
It can be reliably reproduced with the KVM script below.

#!/bin/bash

kvm=(
qemu-system-x86_64 -cpu kvm64 -enable-kvm
-kernel $1
-smp 2
-m 256M
-net nic,vlan=0,macaddr=00:00:00:00:00:00,model=virtio
-net user,vlan=0
-net nic,vlan=1,model=e1000
-net user,vlan=1
-boot order=nc
-no-reboot
-watchdog i6300esb
-display none
-serial stdio
-monitor null
)

append=(
debug
sched_debug
apic=debug
ignore_loglevel
earlyprintk=ttyS0,115200
sysrq_always_enabled
panic=10
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rcupdate.rcu_cpu_stall_timeout=100
rw
)

"${kvm[@]}" --append "${append[*]}"


% kvm-0day.sh 
/kernel/i386-randconfig-c5-03010009/b148a42ba7823e34971cd4e5b05a5c74fa3311ed/vmlinuz-3.14.0-rc4-next-20140228-05738-gb148a42
early console in setup code
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.14.0-rc4-next-20140228-05738-gb148a42 
(kbuild@cairo) (gcc version 4.8.1 (Debian 4.8.1-8) ) #8 SMP PREEMPT Sat Mar 1 
01:05:21 CST 2014
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00]   Transmeta GenuineTMx86
[0.00]   Transmeta TransmetaCPU
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x0fffdfff] usable
[0.00] BIOS-e820: [mem 0x0fffe000-0x0fff] reserved
[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
[0.00] debug: ignoring loglevel setting.
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.4 present.
[0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2011
[0.00] Hypervisor detected: KVM
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0xfffe max_arch_pfn = 0x100
[0.00] MTRR default type: write-back
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 008000 mask FF8000 uncachable
[0.00]   1 disabled
[0.00]   2 disabled
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] Scan for SMP in [mem 0x-0x03ff]
[0.00] Scan for SMP in [mem 0x0009fc00-0x0009]
[0.00] Scan for SMP in [mem 0x000f-0x000f]
[0.00] found SMP MP-table at [mem 0x000f1840-0x000f184f] mapped at 
[c00f1840]
[0.00]   mpc: f1850-f193c
[0.00] Scanning 1 areas for low memory corruption
[0.00] initial memory mapped: [mem 0x-0x027f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x0fc0-0x0fdf]
[0.00]  [mem 0x0fc0-0x0fdf] page 4k
[0.00] BRK [0x023ac000, 0x023acfff] PGTABLE
[0.00] init_memory_mapping: [mem 0x0c00-0x0fbf]
[0.00]  [mem 0x0c00-0x0fbf] page 4k
[0.00] BRK [0x023ad000, 0x023adfff] PGTABLE
[0.00] BRK [0x023ae000, 0x023aefff] PGTABLE
[0.00] BRK [0x023af000, 0x023a] PGTABLE
[0.00] BRK [0x023b, 0x023b0fff] PGTABLE
[0.00] BRK [0x023b1000, 0x023b1fff] PGTABLE
[0.00] init_memory_mapping: [mem 0x0010-0x0bff]
[0.00]  [mem 0x0010-0x0bff] page 4k
[0.00] init_memory_mapping: [mem 0x0fe0-0x0fffdfff]
[0.00]  [mem 0x0fe0-0x0fffdfff] page 4k
[0.00] ACPI: RSDP 0x000F16B0 14 (v00 BOCHS )
[0.00] ACPI: RSDT 0x0FFFE3F0 34 (v01 BOCHS  BXPCRSDT 0001 BXPC 
0001)
[0.00] ACPI: FACP 0x0F80 74 (v01 BOCHS  BXPCFACP 0001 BXPC 
0001)
[0.00] ACPI: DSDT 0x0FFFE430 001137 (v01 BXPC   BXDSDT   0001 INTL 
20100528)
[0.00] ACPI: FACS 0x0F40 40
[0.00] ACPI: SSDT 0x06A0 000899 (v01 BOCHS  BXPCSSDT 0001 BXPC 
0001)
[0.00] ACPI: APIC 0x05B0 80 (v01 BOCHS  BXPCAPIC 0001 BXPC 
0001)
[0.00] 
