Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-09 Thread Fengguang Wu
On Mon, Oct 09, 2017 at 12:50:55PM +0200, Peter Zijlstra wrote: Fengguang, if you're still listening, could you please rerun the tests on top of ce07a9415f26, with the attached patches also applied? Ping!? it would be very good to get feedback on this asap. Sorry for the delay! From

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-09 Thread Peter Zijlstra
> Fengguang, if you're still listening, could you please rerun the tests > on top of ce07a9415f26, with the attached patches also applied? Ping!? it would be very good to get feedback on this asap. > From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001 > Message-Id: >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-09 Thread Peter Zijlstra
> Fengguang, if you're still listening, could you please rerun the tests > on top of ce07a9415f26, with the attached patches also applied? Ping!? it would be very good to get feedback on this asap. > From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001 > Message-Id: > > From:

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-05 Thread Josh Poimboeuf
some existing > 32-bit unwinder/GCC/frame pointer bugs in the process. > > So I just wanted to clarify that crossrelease seems to be innocent in > all this. Sorry for the confusion! Ok, I may have spoken too soon :-) There were so many issues here that it's been hard for me to untan

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-05 Thread Josh Poimboeuf
/GCC/frame pointer bugs in the process. > > So I just wanted to clarify that crossrelease seems to be innocent in > all this. Sorry for the confusion! Ok, I may have spoken too soon :-) There were so many issues here that it's been hard for me to untangle them all. There's one panic w

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-05 Thread Josh Poimboeuf
On Thu, Oct 05, 2017 at 08:02:33PM +0900, Tetsuo Handa wrote: > Josh Poimboeuf wrote: > > On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote: > > > Josh Poimboeuf wrote: > > > > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote: > > > > > There are two bugs: > > > > > > > >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-05 Thread Josh Poimboeuf
On Tue, Oct 03, 2017 at 09:54:31AM -0700, Linus Torvalds wrote: > On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote: > > > > This patch triggers a NULL-dereference bug at update_stack_state(). > > Although its parent commit also has a NULL-dereference bug, however > > the

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-05 Thread Josh Poimboeuf
On Tue, Oct 03, 2017 at 09:54:31AM -0700, Linus Torvalds wrote: > On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote: > > > > This patch triggers a NULL-dereference bug at update_stack_state(). > > Although its parent commit also has a NULL-dereference bug, however > > the call stack looks rather

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-05 Thread Tetsuo Handa
Josh Poimboeuf wrote: > On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote: > > Josh Poimboeuf wrote: > > > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote: > > > > There are two bugs: > > > > > > > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Josh Poimboeuf
On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote: > Josh Poimboeuf wrote: > > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote: > > > There are two bugs: > > > > > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the > > >lockdep people to look

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Josh Poimboeuf
On Wed, Oct 04, 2017 at 02:30:42PM -0700, Linus Torvalds wrote: > On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf wrote: > > > > I compiled the same kernel with a similar version of GCC. It turns out > > that GCC *does* create unaligned stacks with frame pointers enabled: >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Josh Poimboeuf
On Wed, Oct 04, 2017 at 02:30:42PM -0700, Linus Torvalds wrote: > On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf wrote: > > > > I compiled the same kernel with a similar version of GCC. It turns out > > that GCC *does* create unaligned stacks with frame pointers enabled: > > Christ. What a

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Linus Torvalds
On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf wrote: > > I compiled the same kernel with a similar version of GCC. It turns out > that GCC *does* create unaligned stacks with frame pointers enabled: Christ. What a piece of crap. It doesn't even seem to make any sense.

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Linus Torvalds
On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf wrote: > > I compiled the same kernel with a similar version of GCC. It turns out > that GCC *does* create unaligned stacks with frame pointers enabled: Christ. What a piece of crap. It doesn't even seem to make any sense. Spill room for the "u16

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Josh Poimboeuf
On Wed, Oct 04, 2017 at 06:44:50AM +0900, Tetsuo Handa wrote: > Josh Poimboeuf wrote: > > On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote: > > > There are two bugs: > > > > > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the > > >lockdep people to look

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Josh Poimboeuf
On Wed, Oct 04, 2017 at 11:20:52AM +0200, Peter Zijlstra wrote: > On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote: > > Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release, > > because it breaks the 'owner task owns the lock' model. > > Still, you can get real

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Ingo Molnar
* Peter Zijlstra wrote: > On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote: > > Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release, > > because it breaks the 'owner task owns the lock' model. > > Still, you can get real deadlocks with

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Ingo Molnar
* Peter Zijlstra wrote: > On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote: > > Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release, > > because it breaks the 'owner task owns the lock' model. > > Still, you can get real deadlocks with completions... > > >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Peter Zijlstra
On Tue, Oct 03, 2017 at 07:18:24PM +0200, Ingo Molnar wrote: > Yes, I'll do that tomorrow. I was always a bit unhappy about cross-release, > because it breaks the 'owner task owns the lock' model. Still, you can get real deadlocks with completions... > Plus I don't think we found that many real

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-04 Thread Peter Zijlstra
On Tue, Oct 03, 2017 at 10:05:38AM -0500, Josh Poimboeuf wrote: > I don't know the lockdep code, but one more comment from the peanut > gallery. This code looks suspect to me: > > > /* >* Stop saving stack_trace if save_trace() was >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Tetsuo Handa
006d0 [0.789000] DR0: 0000 DR1: 0000 DR2: 0000 DR3: 0000 [0.789000] DR6: DR7: [0.789000] Call Trace: [0.789000] BUG: unable to handle kernel NULL pointer dereference at (null) [0.789000] IP: unwind_next_frame.part.5+0x168/0x1f0 [0.789000] *pde =

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Josh Poimboeuf
On Tue, Oct 03, 2017 at 11:28:15AM -0500, Josh Poimboeuf wrote: > There are two bugs: > > 1) Somebody -- presumably lockdep -- is corrupting the stack. Need the >lockdep people to look at that. > > 2) The 32-bit FP unwinder isn't handling the corrupt stack very well, >It's blindly

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Ingo Molnar
* Linus Torvalds wrote: > On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote: > > > > This patch triggers a NULL-dereference bug at update_stack_state(). > > Although its parent commit also has a NULL-dereference bug, however > > the call

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Ingo Molnar
* Linus Torvalds wrote: > On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote: > > > > This patch triggers a NULL-dereference bug at update_stack_state(). > > Although its parent commit also has a NULL-dereference bug, however > > the call stack looks rather different. Both dmesg files are

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Linus Torvalds
On Tue, Oct 3, 2017 at 9:54 AM, Linus Torvalds wrote: > > Can we consider just reverting the crossrelease thing? > > The apparent stack corruption really worries me [...] Side note: I also think the thing is just broken. Any actual cross-releaser should be way

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Linus Torvalds
On Tue, Oct 3, 2017 at 9:54 AM, Linus Torvalds wrote: > > Can we consider just reverting the crossrelease thing? > > The apparent stack corruption really worries me [...] Side note: I also think the thing is just broken. Any actual cross-releaser should be way more annotated than just "set

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Linus Torvalds
On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote: > > This patch triggers a NULL-dereference bug at update_stack_state(). > Although its parent commit also has a NULL-dereference bug, however > the call stack looks rather different. Both dmesg files are attached. > > It

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Linus Torvalds
On Tue, Oct 3, 2017 at 7:06 AM, Fengguang Wu wrote: > > This patch triggers a NULL-dereference bug at update_stack_state(). > Although its parent commit also has a NULL-dereference bug, however > the call stack looks rather different. Both dmesg files are attached. > > It also triggers this

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Josh Poimboeuf
nitialized stack_trace struct to the > dependency list. > > I could be wrong, but it's at least something the lockdep folks might > want to look at. [ Different manifestations of this bug have been discussed in several different threads. Bringing participants from those threads onto CC

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Josh Poimboeuf
On Tue, Oct 03, 2017 at 09:41:36AM -0500, Josh Poimboeuf wrote: > On Tue, Oct 03, 2017 at 09:31:47AM -0500, Josh Poimboeuf wrote: > > On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote: > > > Hi Byungchul, > > > > > > This patch triggers a NULL-dereference bug at update_stack_state(). >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Josh Poimboeuf
On Tue, Oct 03, 2017 at 09:31:47AM -0500, Josh Poimboeuf wrote: > On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote: > > Hi Byungchul, > > > > This patch triggers a NULL-dereference bug at update_stack_state(). > > Although its parent commit also has a NULL-dereference bug, however > >

Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

2017-10-03 Thread Josh Poimboeuf
On Tue, Oct 03, 2017 at 10:06:34PM +0800, Fengguang Wu wrote: > Hi Byungchul, > > This patch triggers a NULL-dereference bug at update_stack_state(). > Although its parent commit also has a NULL-dereference bug, however > the call stack looks rather different. Both dmesg files are attached. > >

Re: ce07a9415f ("locking/lockdep: Make check_prev_add() able to .."): BUG: unable to handle kernel NULL pointer dereference at 00000020

2017-10-03 Thread Ingo Molnar
-++++---+ > > procd: Instance odhcpd::instance1 s in a crash loop 6 crashes, 0 seconds > since last crash > procd: Instance uhttpd::instance1 s in a crash loop 6 crashes, 0 seconds >

Re: ce07a9415f ("locking/lockdep: Make check_prev_add() able to .."): BUG: unable to handle kernel NULL pointer dereference at 00000020

2017-10-03 Thread Ingo Molnar
n a crash loop 6 crashes, 0 seconds > since last crash > procd: Instance dnsmasq::instance1 s in a crash loop 6 crashes, 0 seconds > since last crash > [ 187.661000] Writes: Total: 2 Max/Min: 0/0 Fail: 0 > procd: - shutdown - > [ 220.353842] BUG: unable to handle kernel NU

ce07a9415f ("locking/lockdep: Make check_prev_add() able to .."): BUG: unable to handle kernel NULL pointer dereference at 00000020

2017-09-30 Thread kernel test robot
hes, 0 seconds since last crash procd: Instance dnsmasq::instance1 s in a crash loop 6 crashes, 0 seconds since last crash [ 187.661000] Writes: Total: 2 Max/Min: 0/0 Fail: 0 procd: - shutdown - [ 220.353842] BUG: unable to handle kernel NULL pointer dereference at 00000020 [ 220.354946] IP:

ce07a9415f ("locking/lockdep: Make check_prev_add() able to .."): BUG: unable to handle kernel NULL pointer dereference at 00000020

2017-09-30 Thread kernel test robot
to handle kernel NULL pointer dereference at 00000020 [ 220.354946] IP: iput+0x544/0x650 [ 220.355441] *pde = [ 220.355444] [ 220.356100] Oops: [#1] PREEMPT SMP [ 220.356647] CPU: 0 PID: 29697 Comm: umount Not tainted 4.13.0-rc4-00169-gce07a941 #627 [ 220.357790] Hardware name

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Linus Torvalds
On Sat, Sep 16, 2017 at 2:47 PM, Thomas Gleixner wrote: > > Yes and no. We get more code which uses cpumasks to store state, just like > I did, and while a lot of the cpumask functions just work as expected a > subset including for_each_cpu does not. That's confusing at best

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Linus Torvalds
On Sat, Sep 16, 2017 at 2:47 PM, Thomas Gleixner wrote: > > Yes and no. We get more code which uses cpumasks to store state, just like > I did, and while a lot of the cpumask functions just work as expected a > subset including for_each_cpu does not. That's confusing at best and I > rather avoid

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Thomas Gleixner
On Sat, 16 Sep 2017, Linus Torvalds wrote: > On Sat, Sep 16, 2017 at 11:12 AM, Thomas Gleixner wrote: > >> > >> So I suspect your perf fix is the right one, and maybe we could/should > >> just make people more aware of the empty cpumask issue with UP. > > > > Right, I just got

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Thomas Gleixner
On Sat, 16 Sep 2017, Linus Torvalds wrote: > On Sat, Sep 16, 2017 at 11:12 AM, Thomas Gleixner wrote: > >> > >> So I suspect your perf fix is the right one, and maybe we could/should > >> just make people more aware of the empty cpumask issue with UP. > > > > Right, I just got a bit frightened as

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Linus Torvalds
On Sat, Sep 16, 2017 at 11:12 AM, Thomas Gleixner wrote: >> >> So I suspect your perf fix is the right one, and maybe we could/should >> just make people more aware of the empty cpumask issue with UP. > > Right, I just got a bit frightened as I really was not aware about that

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Linus Torvalds
On Sat, Sep 16, 2017 at 11:12 AM, Thomas Gleixner wrote: >> >> So I suspect your perf fix is the right one, and maybe we could/should >> just make people more aware of the empty cpumask issue with UP. > > Right, I just got a bit frightened as I really was not aware about that > 'optimization'

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Thomas Gleixner
On Sat, 16 Sep 2017, Linus Torvalds wrote: > On Sat, Sep 16, 2017 at 10:35 AM, Thomas Gleixner wrote: > > > > Don't bother. I found it already. On UP we have: > > > > #define for_each_cpu(cpu, mask) \ > > for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) >

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Thomas Gleixner
On Sat, 16 Sep 2017, Linus Torvalds wrote: > On Sat, Sep 16, 2017 at 10:35 AM, Thomas Gleixner wrote: > > > > Don't bother. I found it already. On UP we have: > > > > #define for_each_cpu(cpu, mask) \ > > for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) > > > > which is a

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Linus Torvalds
On Sat, Sep 16, 2017 at 10:35 AM, Thomas Gleixner wrote: > > Don't bother. I found it already. On UP we have: > > #define for_each_cpu(cpu, mask) \ > for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) > > which is a total fail as it breaks any code which

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Linus Torvalds
On Sat, Sep 16, 2017 at 10:35 AM, Thomas Gleixner wrote: > > Don't bother. I found it already. On UP we have: > > #define for_each_cpu(cpu, mask) \ > for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) > > which is a total fail as it breaks any code which uses for_each_cpu() or
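The UP pitfall quoted above can be reproduced outside the kernel. What follows is only a minimal user-space sketch, not kernel code: `struct cpumask` is reduced to a single word, and the names `for_each_cpu_up`, `for_each_cpu_checked`, `count_cpus_up` and `count_cpus_checked` are made up for illustration. Only the body of the first macro is taken from the quoted message. It shows that the UP variant runs the loop body once even when the mask is empty, while a variant that actually tests the bit does not.

```c
/* Simplified stand-in for the kernel's struct cpumask (assumption:
 * one bit per CPU in a single word is enough for this illustration). */
struct cpumask {
	unsigned long bits;
};

/* Same shape as the UP macro quoted in the thread: the mask is
 * evaluated but never consulted, so the body always runs for CPU 0. */
#define for_each_cpu_up(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

/* Hypothetical mask-respecting variant, for comparison only. The bare
 * `if` makes it unsafe next to a caller's `else`; good enough here. */
#define for_each_cpu_checked(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++) \
		if ((mask)->bits & (1UL << (cpu)))

/* Count loop iterations with the UP-style macro. */
int count_cpus_up(const struct cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_up(cpu, mask)
		n++;
	return n;
}

/* Count loop iterations with the mask-respecting variant. */
int count_cpus_checked(const struct cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_checked(cpu, mask)
		n++;
	return n;
}
```

With an all-zero mask, `count_cpus_up()` returns 1 and `count_cpus_checked()` returns 0, which is the surprise for code that keeps state in a cpumask and expects an empty mask to mean "no iterations".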

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Thomas Gleixner
On Sat, 16 Sep 2017, Fengguang Wu wrote: > > > [0.038086] Performance Events: unsupported p6 CPU model 61 no PMU > > > driver, software events only. > > What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy > Bridge machines with the attached script. My bad. I booted the

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-16 Thread Fengguang Wu
On Fri, Sep 15, 2017 at 06:24:20PM +0200, Thomas Gleixner wrote: On Fri, 15 Sep 2017, Thomas Gleixner wrote: On Fri, 15 Sep 2017, Thomas Gleixner wrote: > On Fri, 15 Sep 2017, kernel test robot wrote: > > [0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1)

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-15 Thread Thomas Gleixner
On Fri, 15 Sep 2017, Thomas Gleixner wrote: > On Fri, 15 Sep 2017, Thomas Gleixner wrote: > > > On Fri, 15 Sep 2017, kernel test robot wrote: > > > [0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, > > > stepping: 0x1) > > > [0.042302] Performance Events: unsupported

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-15 Thread Thomas Gleixner
On Fri, 15 Sep 2017, Thomas Gleixner wrote: > On Fri, 15 Sep 2017, kernel test robot wrote: > > [0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, > > stepping: 0x1) > > [0.042302] Performance Events: unsupported Netburst CPU model 6 no PMU > > driver, software events

Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-15 Thread Thomas Gleixner
but for some unknown reason the lockup detector can create an event, otherwise the perf availability check in lockup_detector_init() would fail Peter??? > [ 0.051650] BUG: unable to handle kernel NULL pointer dereference at > 0000000000000208 > [0.052000] IP: perf_event_rele

d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-14 Thread kernel test robot
iTLB entries: 4KB 0, 2MB 0, 4MB 0 [0.034018] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 [0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1) [0.042302] Performance Events: unsupported Netburst CPU model 6 no PMU driver, software eve

d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

2017-09-14 Thread kernel test robot
level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 [0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1) [0.042302] Performance Events: unsupported Netburst CPU model 6 no PMU driver, software events only. [0.051650] BUG: unable to handle kernel NULL pointer

74310e06be ("android: binder: Move buffer out of area shared .."): BUG: unable to handle kernel NULL pointer dereference at 00000014

2017-08-29 Thread kernel test robot
know what you are doing. [5.21] init: networking main process (377) terminated with status 1 [ 13.636567] sock: process `trinity-main' is using obsolete setsockopt SO_BSDCOMPAT [ 14.977100] BUG: unable to handle kernel NULL pointer dereference at 00000014 [ 14.979193] IP: binder_alloc_def

74310e06be ("android: binder: Move buffer out of area shared .."): BUG: unable to handle kernel NULL pointer dereference at 00000014

2017-08-29 Thread kernel test robot
`trinity-main' is using obsolete setsockopt SO_BSDCOMPAT [ 14.977100] BUG: unable to handle kernel NULL pointer dereference at 00000014 [ 14.979193] IP: binder_alloc_deferred_release+0xd3/0x270 [ 14.980697] *pde = [ 14.980698] [ 14.981969] Oops: [#1] DEBUG_PAGEALLOC

Re: "BUG: unable to handle kernel NULL pointer dereference" in swapping shmem

2017-07-31 Thread Naoya Horiguchi
> [ 112.690842] ===> testcase 'mm/shmem_swap' start > [ 112.788440] Adding 40956k swap on > /mnt/tests/examples/regression/kernel/mm_regression/mm_regression/work/swapfile. > Priority:-2 extents:1 across:40956k FS > [ 112.815903] bash (17346): drop_caches: 3 > [ 112.975713] BUG:

Re: "BUG: unable to handle kernel NULL pointer dereference" in swapping shmem

2017-07-31 Thread Naoya Horiguchi
12.788440] Adding 40956k swap on > /mnt/tests/examples/regression/kernel/mm_regression/mm_regression/work/swapfile. > Priority:-2 extents:1 across:40956k FS > [ 112.815903] bash (17346): drop_caches: 3 > [ 112.975713] BUG: unable to handle kernel NULL pointer dereference at > 00

Re: "BUG: unable to handle kernel NULL pointer dereference" in swapping shmem

2017-07-31 Thread Huang, Ying
nts:1 across:40956k FS > [ 112.815903] bash (17346): drop_caches: 3 > [ 112.975713] BUG: unable to handle kernel NULL pointer dereference at > 0007 > [ 112.984464] IP: swap_page_trans_huge_swapped+0x49/0xd0 > [ 112.990202] PGD 805e62067 > [ 112.990202] P4

Re: "BUG: unable to handle kernel NULL pointer dereference" in swapping shmem

2017-07-31 Thread Huang, Ying
> [ 112.815903] bash (17346): drop_caches: 3 > [ 112.975713] BUG: unable to handle kernel NULL pointer dereference at > 0007 > [ 112.984464] IP: swap_page_trans_huge_swapped+0x49/0xd0 > [ 112.990202] PGD 805e62067 > [ 112.990202] P4D 805e62067 > [ 11

"BUG: unable to handle kernel NULL pointer dereference" in swapping shmem

2017-07-31 Thread Naoya Horiguchi
'mm/shmem_swap' start [ 112.788440] Adding 40956k swap on /mnt/tests/examples/regression/kernel/mm_regression/mm_regression/work/swapfile. Priority:-2 extents:1 across:40956k FS [ 112.815903] bash (17346): drop_caches: 3 [ 112.975713] BUG: unable to handle kernel NULL pointer derefere

RE: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-28 Thread Feng Feng24 Liu
l.org; linux-rt-us...@vger.kernel.org >Subject: Re: BUG: unable to handle kernel NULL pointer dereference at >0038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027 > >On Tue, Jun 27, 2017 at 05:47:41AM +, Feng Feng24 Liu wrote: >> Hi, Julia >> Thanks

Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-27 Thread Julia Cartwright
On Tue, Jun 27, 2017 at 05:47:41AM +, Feng Feng24 Liu wrote: > Hi, Julia > Thanks for your kindly hit. I will try this patch > The problem is accidental. I will try to reproduce it. > BTW, could you help to give the link about the emails which > discuss about " nsfs:

RE: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Feng Feng24 Liu
l.org; t...@hp.com >Subject: Re: BUG: unable to handle kernel NULL pointer dereference at >0038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027 > >On Mon, Jun 26, 2017 at 04:54:36PM +0200, Sebastian Andrzej Siewior wrote: >> On 2017-06-26 10:24:18 [-0400], Steven Ros

RE: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Feng Feng24 Liu
h; linux-kernel@vger.kernel.org; >linux-rt-us...@vger.kernel.org; t...@hp.com >Subject: Re: BUG: unable to handle kernel NULL pointer dereference at >0038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027 > >On 2017-06-26 10:24:18 [-0400], Steven Rostedt wrote: >> > CP

RE: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Feng Feng24 Liu
...@vger.kernel.org; t...@hp.com >Subject: Re: BUG: unable to handle kernel NULL pointer dereference at >0038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027 > >On Mon, 26 Jun 2017 06:33:29 + >Feng Feng24 Liu <liufen...@lenovo.com> wrote: > >> Hi, dear R

RE: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Feng Feng24 Liu
...@vger.kernel.org; t...@hp.com >Subject: Re: BUG: unable to handle kernel NULL pointer dereference at >0038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027 > >On Mon, 26 Jun 2017 06:33:29 + >Feng Feng24 Liu wrote: > >> Hi, dear RT experts >> Than

Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Julia Cartwright
On Mon, Jun 26, 2017 at 04:54:36PM +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-26 10:24:18 [-0400], Steven Rostedt wrote: > > > CPU: 17 PID: 1738811 Comm: ip Not tainted 4.4.70-thinkcloud-nfv #1 > > > Hardware name: LENOVO System x3650 M5: -[8871AC1]-/01GR174, BIOS > > >

Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Sebastian Andrzej Siewior
On 2017-06-26 10:24:18 [-0400], Steven Rostedt wrote: > > CPU: 17 PID: 1738811 Comm: ip Not tainted 4.4.70-thinkcloud-nfv #1 > > Hardware name: LENOVO System x3650 M5: -[8871AC1]-/01GR174, BIOS > > -[TCE124M-2.10]- 06/23/2016 > > task: 881cda2c27c0 ti: 881ea0538000 task.ti:

Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Steven Rostedt
> But I found there is another BUG in 4.4.70-rt83, which can cause the > system hang-up > The BUG is: "BUG: unable to handle kernel NULL pointer dereference at > 003

Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Steven Rostedt
ere is another BUG in 4.4.70-rt83, which can cause the > system hang-up > The BUG is: "BUG: unable to handle kernel NULL pointer dereference at > 0038"

[__try_to_take_rt_mutex] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Feng Feng24 Liu
ound there is another BUG in 4.4.70-rt83, which can cause the system hang-up The BUG is: "BUG: unable to handle kernel NULL pointer dereference at 0038" Following is

BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 !//RE: kernel BUG at kernel/locking/rtmutex.c:1027

2017-06-26 Thread Feng Feng24 Liu
is: "BUG: unable to handle kernel NULL pointer dereference at 0038" Following is the kernel log --- <4>Jun 23 21:54:53 node-1 kerne

Re: [Merge tag 'pci-v4.12-changes' of git] 857f864014: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8

2017-06-14 Thread Logan Gunthorpe
Hi Linus, On 14/06/17 03:59 AM, Linus Walleij wrote: > I started to take a stab at it at one point and incorporated some feedback > from Torvalds etc, it's here: > https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git/commit/?h=chrdev-warn=65e5b1e9eb3f777ab7535b74b490e882eeec79d7