Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-15 Thread Peter Zijlstra
On Sat, Jun 13, 2020 at 04:40:30PM -0700, Paul E. McKenney wrote:

> So Peter's patch is fully in the clear:
> 
> Tested-by: Paul E. McKenney 

Awesome! Now I get to explain how the lack of that leads to the
observed NULL pointer :-)



Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-13 Thread Paul E. McKenney
On Sat, Jun 13, 2020 at 07:57:19AM -0700, Paul E. McKenney wrote:
> On Sat, Jun 13, 2020 at 09:26:40AM +0200, Thomas Gleixner wrote:
> > "Paul E. McKenney"  writes:
> > > And an update based on your patch (https://paste.debian.net/1151802/)
> > > against 44ebe016df3a ("Merge branch 'proc-linus' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace").
> > 
> > I'm running this patch since midnight on top of x86/entry. Still no NULL
> > pointer deref.
> > 
> > The cross-check with plain x86/entry has triggered it on all instances
> > by now.
> 
> That is consistent with my experience.  I have not yet seen a NULL pointer
> dereference with Peter's patch.  As I said earlier, tests thus far
> at my end give 95% confidence that it is a fix for the NULL pointer
> problem.
> 
> I have seen two other problems, but I haven't yet seen them often enough
> to have any confidence as to what they are related to.  The RCU CPU
> stall warning happened only once, so it might have been introduced in
> mainline sometime in the last few days.  The BUG was with Peter's patch
> on an intermediate state of x86/entry, so it might be specific to that
> intermediate state.  Or to my commit/patch confusion, perhaps.
> 
> > So it looks like you're onto something here.
> 
> Let's recap.
> 
> I ran 140 hours each of TREE04 and TREE05 with Peter's patch on top of
> x86/entry in -tip with no complaints of any kind.  So that is good,
> and it means we have a good fix for the too-short grace periods.
> I already verified TASKS03 yesterday (not to be confused with TREE03).
> So we have a clean bill of health for x86/entry from my end with respect
> to too-short grace periods with insanely high confidence.
> 
> I have started 28*TREE03 for a few hours with Peter's patch on top
> of x86/entry in -tip, which I expect will reproduce your result of
> no NULL pointer.  If so (as I fully expect it to), I will join you in
> proclaiming Peter's patch to be a fix for the NULL pointer problem.

It did pass, so I hereby join you in proclaiming Peter's patch to be
a fix for the NULL pointer problem.  ;-)

And a big "Thank You" to you guys for tracking this one down.  It was
not at all straightforward!

> Then I follow up on https://paste.debian.net/1151842 and also on
> https://paste.debian.net/1151809.
> 
> First, I run TREE03 longer on 44ebe016df3a ("Merge branch 'proc-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace")
> in mainline without Peter's patch ignoring any occurrences of the NULL
> pointer problem to see what happens.  If that reproduces the RCU CPU
> stall in https://paste.debian.net/1151842 or the BUG on line 1046 of
> kernel/sched/rt.c in https://paste.debian.net/1151809, I will attempt
> to bisect those in mainline.

And the run on mainline without Peter's patch did in fact reproduce the
RCU CPU stall warning.  So this is a mainline bug that I will track down
separately.  This appears to be a failure to awaken RCU's grace-period
kthread, with the kthread remaining in 0x402 sleeping state for more
than 21 seconds, which is a bit excessive for a three-jiffy sleep. On
the other hand, many of the other CPUs seem to be stuck in stop-machine.
The stall persists.

This happened one time in 112 hours of TREE03 rcutorture, so bisection
will take some time, assuming that it works at all in this case.  ;-)
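
For a rough sense of why a bisection would be slow, here is a minimal sketch that assumes failures arrive as a Poisson process at the observed rate of one per 112 hours, and that a bisection step is called "good" only once the chance of a buggy kernel staying silent falls to 5% or below. Both assumptions, and the helper names `exp_neg()` and `hours_per_step()`, are illustrative and not from the thread. The small series-based `exp_neg()` is there only so the sketch builds without linking libm.

```c
#include <stdio.h>

/* exp(-x) for x >= 0: halve x until it is small, sum a short Taylor
 * series, then square the result back up once per halving. */
static double exp_neg(double x)
{
	int halvings = 0;
	double term = 1.0, sum = 1.0;

	while (x > 0.5) {
		x /= 2.0;
		halvings++;
	}
	for (int n = 1; n <= 12; n++) {
		term *= -x / n;
		sum += term;
	}
	while (halvings-- > 0)
		sum *= sum;
	return sum;
}

/* Smallest whole number of failure-free hours after which the chance that
 * a still-buggy kernel would have stayed silent is max_miss or below.
 * Returns -1 for inputs that would never terminate. */
int hours_per_step(double failures, double observed_hours, double max_miss)
{
	double rate;
	int hours = 0;

	if (failures <= 0.0 || observed_hours <= 0.0 || max_miss <= 0.0)
		return -1;
	rate = failures / observed_hours;
	while (exp_neg(rate * hours) > max_miss)
		hours++;
	return hours;
}
```

Under those assumptions, `hours_per_step(1.0, 112.0, 0.05)` comes out at 336 hours of clean running per "good" verdict, which is why a failure this rare is expensive to bisect even when the test hours are spread over many parallel instances.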

So Peter's patch is fully in the clear:

Tested-by: Paul E. McKenney 

Thanx, Paul

> If neither of those two reproduce, on to other things.
> 
> Seem reasonable?
> 
>   Thanx, Paul


Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-13 Thread Paul E. McKenney
On Sat, Jun 13, 2020 at 09:26:40AM +0200, Thomas Gleixner wrote:
> "Paul E. McKenney"  writes:
> > And an update based on your patch (https://paste.debian.net/1151802/)
> > against 44ebe016df3a ("Merge branch 'proc-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace").
> 
> I'm running this patch since midnight on top of x86/entry. Still no NULL
> pointer deref.
> 
> The cross-check with plain x86/entry has triggered it on all instances
> by now.

That is consistent with my experience.  I have not yet seen a NULL pointer
dereference with Peter's patch.  As I said earlier, tests thus far
at my end give 95% confidence that it is a fix for the NULL pointer
problem.

I have seen two other problems, but I haven't yet seen them often enough
to have any confidence as to what they are related to.  The RCU CPU
stall warning happened only once, so it might have been introduced in
mainline sometime in the last few days.  The BUG was with Peter's patch
on an intermediate state of x86/entry, so it might be specific to that
intermediate state.  Or to my commit/patch confusion, perhaps.

> So it looks like you're onto something here.

Let's recap.

I ran 140 hours each of TREE04 and TREE05 with Peter's patch on top of
x86/entry in -tip with no complaints of any kind.  So that is good,
and it means we have a good fix for the too-short grace periods.
I already verified TASKS03 yesterday (not to be confused with TREE03).
So we have a clean bill of health for x86/entry from my end with respect
to too-short grace periods with insanely high confidence.

I have started 28*TREE03 for a few hours with Peter's patch on top
of x86/entry in -tip, which I expect will reproduce your result of
no NULL pointer.  If so (as I fully expect it to), I will join you in
proclaiming Peter's patch to be a fix for the NULL pointer problem.

Then I follow up on https://paste.debian.net/1151842 and also on
https://paste.debian.net/1151809.

First, I run TREE03 longer on 44ebe016df3a ("Merge branch 'proc-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace")
in mainline without Peter's patch ignoring any occurrences of the NULL
pointer problem to see what happens.  If that reproduces the RCU CPU
stall in https://paste.debian.net/1151842 or the BUG on line 1046 of
kernel/sched/rt.c in https://paste.debian.net/1151809, I will attempt
to bisect those in mainline.

If neither of those two reproduce, on to other things.

Seem reasonable?

Thanx, Paul


Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-13 Thread Thomas Gleixner
"Paul E. McKenney"  writes:
> And an update based on your patch (https://paste.debian.net/1151802/)
> against 44ebe016df3a ("Merge branch 'proc-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace").

I'm running this patch since midnight on top of x86/entry. Still no NULL
pointer deref.

The cross-check with plain x86/entry has triggered it on all instances
by now.

So it looks like you're onto something here.

Thanks,

Thomas


Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-12 Thread Paul E. McKenney
On Tue, Jun 09, 2020 at 08:40:16AM -0700, Paul E. McKenney wrote:
> On Sun, Jun 07, 2020 at 11:57:32AM -0700, Paul E. McKenney wrote:
> > On Sat, Jun 06, 2020 at 10:29:42AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 05, 2020 at 05:51:26PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 05, 2020 at 11:41:59AM -0700, Paul E. McKenney wrote:
> > > > > On Fri, Jun 05, 2020 at 07:16:07AM -0700, Paul E. McKenney wrote:
> > > > > 
> > > > > And in case it is helpful, here is the output of "git bisect view",
> > > > > > which lists rather more commits than "git bisect run" claims, but there
> > > > > are only a few scheduler commits below.  I don't see anything that
> > > > > I can prove can cause this problem, but there are some that are at
> > > > > least related to this code path.
> > > > > 
> > > > > Is there anything that is actually relevant?
> > > > 
> > > > And the run with the WARN and tracing did hit two failures, and the
> > > > corresponding console logs are in the attached tarball.  Both of them
> > > > triggered a warning as well as the failure.
> > > 
> > > And the current state of the bisection, for whatever it is worth.
> > 
> > The bisection finished, finally!
> > 
> > 90b5363acd47 ("sched: Clean up scheduler_ipi()")
> > 
> > I don't see anything immediately obvious, but then again, I do not
> > claim to understand this code.  If you have additional diagnostics,
> > please let me know.
> 
> But lockdep just might have spotted something useful.
> This was running the rcutorture SRCU-P scenario on
> mainline commit abfbb29297c2 ("Merge tag 'rproc-v5.8' of
> git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc").
> Unlike TREE03, SRCU-P enables lockdep.
> 
> This splat features a couple of lockdep_assert_held() splats just before
> the mysterious NULL pointer dereference.

And an update based on your patch (https://paste.debian.net/1151802/)
against 44ebe016df3a ("Merge branch 'proc-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace").

Without your patch, 28 hours of rcutorture scenario TREE03 gets three NULL
pointer dereferences.  With it, there are no NULL pointer dereferences,
but I did see one of these:  https://paste.debian.net/1151842.
(Also shown below.)

Related or not, who knows?  More as I learn more.

There is only a 5% chance of the result with your patch being a
false negative, so looking positive.
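
One way to arrive at a figure like this is sketched below. It assumes that failures arrive as a Poisson process and that the patched run lasted about as long as the 28-hour unpatched run; neither is stated explicitly above. The names `exp_neg()` and `false_negative_probability()` are illustrative, and the series-based `exp_neg()` is used only so the sketch builds without linking libm.

```c
#include <stdio.h>

/* exp(-x) for x >= 0: halve x until it is small, sum a short Taylor
 * series, then square the result back up once per halving. */
static double exp_neg(double x)
{
	int halvings = 0;
	double term = 1.0, sum = 1.0;

	while (x > 0.5) {
		x /= 2.0;
		halvings++;
	}
	for (int n = 1; n <= 12; n++) {
		term *= -x / n;
		sum += term;
	}
	while (halvings-- > 0)
		sum *= sum;
	return sum;
}

/* Probability of seeing zero failures in patched_hours if the bug were
 * still present at the rate measured on the unpatched kernel. */
double false_negative_probability(double failures, double baseline_hours,
				  double patched_hours)
{
	double rate = failures / baseline_hours;	/* failures per hour */

	return exp_neg(rate * patched_hours);
}
```

With three failures in 28 unpatched hours and 28 clean patched hours, `false_negative_probability(3.0, 28.0, 28.0)` evaluates exp(-3), which is just under 0.05 and so consistent with the 5% quoted above.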

Thanx, Paul



[ 1669.614123] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 1669.615634] rcu: 13-...!: (20999 ticks this GP) idle=fda/1/0x4002 softirq=234177/234177 fqs=0
[ 1669.618350]  (t=21004 jiffies g=874585 q=4817)
[ 1669.619395] rcu: rcu_preempt kthread starved for 21005 jiffies! g874585 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[ 1669.621920] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 1669.624060] rcu: RCU grace-period kthread stack dump:
[ 1669.625393] rcu_preempt I1505611  2 0x4000
[ 1669.626899] Call Trace:
[ 1669.627697]  __schedule+0x25d/0x5d0
[ 1669.628475]  ? _raw_spin_lock_irqsave+0x12/0x40
[ 1669.629620]  schedule+0x37/0xe0
[ 1669.630404]  schedule_timeout+0x109/0x210
[ 1669.631145]  ? trace_raw_output_hrtimer_start+0x70/0x70
[ 1669.632069]  rcu_gp_kthread+0x8e1/0x1260
[ 1669.632995]  ? call_rcu+0x2d0/0x2d0
[ 1669.633881]  kthread+0x138/0x160
[ 1669.634558]  ? kthread_create_on_node+0x60/0x60
[ 1669.635536]  ret_from_fork+0x22/0x30
[ 1669.636307] NMI backtrace for cpu 13
[ 1669.637257] CPU: 13 PID: 93 Comm: migration/13 Not tainted 5.7.0+ #18
[ 1669.638624] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 04/01/2014
[ 1669.640360] Call Trace:
[ 1669.640913]  
[ 1669.641371]  dump_stack+0x57/0x70
[ 1669.642090]  nmi_cpu_backtrace.cold.6+0x13/0x51
[ 1669.643066]  ? lapic_can_unplug_cpu.cold.30+0x3e/0x3e
[ 1669.644298]  nmi_trigger_cpumask_backtrace+0xc4/0xcd
[ 1669.645197]  rcu_dump_cpu_stacks+0x96/0xc2
[ 1669.645884]  rcu_sched_clock_irq.cold.86+0x118/0x506
[ 1669.646955]  ? perf_event_task_tick+0x5f/0x280
[ 1669.648053]  ? sched_clock+0x5/0x10
[ 1669.648788]  ? cpuacct_account_field+0x14/0x70
[ 1669.649961]  ? tick_switch_to_oneshot.cold.2+0x74/0x74
[ 1669.651599]  update_process_times+0x1f/0x50
[ 1669.652862]  tick_sched_timer+0x55/0x170
[ 1669.653685]  __hrtimer_run_queues+0xfb/0x2c0
[ 1669.654669]  hrtimer_interrupt+0x105/0x220
[ 1669.655696]  smp_apic_timer_interrupt+0x7f/0x190
[ 1669.656700]  apic_timer_interrupt+0xf/0x20
[ 1669.657384]  
[ 1669.657831] RIP: 0010:stop_machine_yield+0x2/0x10
[ 1669.658554] Code: 0c 25 28 00 00 00 75 10 48 8d 65 f0 5b 41 5c 5d c3 b8 fe 
ff ff ff eb e0 e8 ab c0 f4 ff 90 66 2e 0f 1f 84 00 00 00 00 00 f3 90  0f 1f 
00 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54
[ 1669.662249] RSP: :a7860039fe60 EFLAGS: 0246 ORIG_RAX: 

Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-09 Thread Paul E. McKenney
On Sun, Jun 07, 2020 at 11:57:32AM -0700, Paul E. McKenney wrote:
> On Sat, Jun 06, 2020 at 10:29:42AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 05, 2020 at 05:51:26PM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 05, 2020 at 11:41:59AM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 05, 2020 at 07:16:07AM -0700, Paul E. McKenney wrote:
> > > > 
> > > > And in case it is helpful, here is the output of "git bisect view",
> > > > which lists rather more commits than "git bisect run" claims, but there
> > > > are only a few scheduler commits below.  I don't see anything that
> > > > I can prove can cause this problem, but there are some that are at
> > > > least related to this code path.
> > > > 
> > > > Is there anything that is actually relevant?
> > > 
> > > And the run with the WARN and tracing did hit two failures, and the
> > > corresponding console logs are in the attached tarball.  Both of them
> > > triggered a warning as well as the failure.
> > 
> > And the current state of the bisection, for whatever it is worth.
> 
> The bisection finished, finally!
> 
> 90b5363acd47 ("sched: Clean up scheduler_ipi()")
> 
> I don't see anything immediately obvious, but then again, I do not
> claim to understand this code.  If you have additional diagnostics,
> please let me know.

But lockdep just might have spotted something useful.
This was running the rcutorture SRCU-P scenario on
mainline commit abfbb29297c2 ("Merge tag 'rproc-v5.8' of
git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc").
Unlike TREE03, SRCU-P enables lockdep.

This splat features a couple of lockdep_assert_held() splats just before
the mysterious NULL pointer dereference.

Thanx, Paul


[16741.334139] ------------[ cut here ]------------
[16741.335393] WARNING: CPU: 2 PID: 159 at kernel/sched/sched.h:1132 update_curr+0xc6/0x390
[16741.336800] Modules linked in:
[16741.337426] CPU: 2 PID: 159 Comm: kworker/2:3 Not tainted 5.7.0+ #4
[16741.338315] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 04/01/2014
[16741.339516] Workqueue: rcu_gp process_srcu
[16741.340100] RIP: 0010:update_curr+0xc6/0x390
[16741.340710] Code: c0 00 00 00 48 89 83 c0 00 00 00 eb a7 4d 01 7c 24 20 eb 
a9 48 8d 78 18 be ff ff ff ff e8 52 92 b7 00 85 c0 0f 85 66 ff ff ff <0f> 0b e9 
5f ff ff ff 4c 8d 6b 80 0f 1f 44 00 00 65 8b 05 53 51 b6
[16741.343660] RSP: :9d7180108c30 EFLAGS: 00010046
[16741.344504] RAX:  RBX: 8ca19f3d44c0 RCX: 8ca19dd55b00
[16741.345511] RDX:  RSI: 8ca19f42bfd8 RDI: 8ca19dd56470
[16741.346540] RBP: 9d7180108c58 R08: 10012d27c58c R09: 0008
[16741.347554] R10: 0001 R11:  R12: 8ca19f42c080
[16741.348561] R13: 8ca19f3d44c0 R14: 8ca19f42bfc0 R15: 
[16741.349567] FS:  () GS:8ca19f48() 
knlGS:
[16741.350704] CS:  0010 DS:  ES:  CR0: 80050033
[16741.351522] CR2:  CR3: 0f62 CR4: 06e0
[16741.352547] DR0:  DR1:  DR2: 
[16741.353580] DR3:  DR6: fffe0ff0 DR7: 0400
[16741.354597] Call Trace:
[16741.354947]  
[16741.355254]  enqueue_task_fair+0x25f/0xb60
[16741.355845]  activate_task+0x41/0xb0
[16741.356377]  ttwu_do_activate+0x49/0x80
[16741.356928]  sched_ttwu_pending+0x94/0xe0
[16741.357573]  smp_call_function_single_interrupt+0x44/0x1e0
[16741.358378]  call_function_single_interrupt+0xf/0x20
[16741.359076] RIP: 0010:_raw_spin_unlock_irqrestore+0x49/0x60
[16741.359873] Code: c7 02 75 1f 53 9d e8 d6 e2 53 ff bf 01 00 00 00 e8 ac 1c 
47 ff 65 8b 05 7d 94 fe 7b 85 c0 74 0c 5b 5d c3 e8 c9 e1 53 ff 53 9d  df e8 
90 34 3d ff eb ed 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00
[16741.362526] RSP: :9d7180108e00 EFLAGS: 0202 ORIG_RAX: 
ff04
[16741.363590] RAX: 0c8b8f92 RBX: 0202 RCX: 0002
[16741.364625] RDX:  RSI:  RDI: 8402ea97
[16741.365778] RBP: 8ca19f3d4c38 R08:  R09: 
[16741.366793] R10: 0001 R11:  R12: 
[16741.367818] R13: 8ca19f3d4c38 R14: 0202 R15: 8ca19f42bfc0
[16741.368850]  ? call_function_single_interrupt+0xa/0x20
[16741.369737]  ? _raw_spin_unlock_irqrestore+0x47/0x60
[16741.370590]  try_to_wake_up+0x25f/0x7e0
[16741.371137]  ? __next_timer_interrupt+0xc0/0xc0
[16741.371890]  call_timer_fn+0xa0/0x2f0
[16741.372586]  ? __next_timer_interrupt+0xc0/0xc0
[16741.373291]  run_timer_softirq+0x1cc/0x550
[16741.373911]  __do_softirq+0xe5/0x497
[16741.374443]  irq_exit+0xa9/0xc0
[16741.374894]  smp_apic_timer_interrupt+0xb7/0x280
[16741.375568]  apic_timer_interrupt+0xf/0x20
[16741.376150]  

Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-07 Thread Paul E. McKenney
On Sat, Jun 06, 2020 at 10:29:42AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 05, 2020 at 05:51:26PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 05, 2020 at 11:41:59AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 05, 2020 at 07:16:07AM -0700, Paul E. McKenney wrote:
> > > 
> > > And in case it is helpful, here is the output of "git bisect view",
> > > which lists rather more commits than "git bisect run" claims, but there
> > > are only a few scheduler commits below.  I don't see anything that
> > > I can prove can cause this problem, but there are some that are at
> > > least related to this code path.
> > > 
> > > Is there anything that is actually relevant?
> > 
> > And the run with the WARN and tracing did hit two failures, and the
> > corresponding console logs are in the attached tarball.  Both of them
> > triggered a warning as well as the failure.
> 
> And the current state of the bisection, for whatever it is worth.

The bisection finished, finally!

90b5363acd47 ("sched: Clean up scheduler_ipi()")

I don't see anything immediately obvious, but then again, I do not
claim to understand this code.  If you have additional diagnostics,
please let me know.

Thanx, Paul



commit 90b5363acd4739769c3f38c1aff16171bd133e8c
Author: Peter Zijlstra (Intel) 
Date:   Fri Mar 27 11:44:56 2020 +0100

sched: Clean up scheduler_ipi()

The scheduler IPI has grown weird and wonderful over the years, time
for spring cleaning.

Move all the non-trivial stuff out of it and into a regular smp function
call IPI. This then reduces the schedule_ipi() to most of it's former NOP
glory and ensures to keep the interrupt vector lean and mean.

Aside of that avoiding the full irq_enter() in the x86 IPI implementation
is incorrect as scheduler_ipi() can be instrumented. To work around that
scheduler_ipi() had an irq_enter/exit() hack when heavy work was
pending. This is gone now.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Alexandre Chartre 
Link: https://lkml.kernel.org/r/20200505134058.361859...@linutronix.de

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b58efb1..cd2070d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -219,6 +219,13 @@ void update_rq_clock(struct rq *rq)
update_rq_clock_task(rq, delta);
 }
 
+static inline void
+rq_csd_init(struct rq *rq, call_single_data_t *csd, smp_call_func_t func)
+{
+   csd->flags = 0;
+   csd->func = func;
+   csd->info = rq;
+}
 
 #ifdef CONFIG_SCHED_HRTICK
 /*
@@ -314,16 +321,14 @@ void hrtick_start(struct rq *rq, u64 delay)
hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
  HRTIMER_MODE_REL_PINNED_HARD);
 }
+
 #endif /* CONFIG_SMP */
 
 static void hrtick_rq_init(struct rq *rq)
 {
 #ifdef CONFIG_SMP
-   rq->hrtick_csd.flags = 0;
-   rq->hrtick_csd.func = __hrtick_start;
-   rq->hrtick_csd.info = rq;
+   rq_csd_init(rq, &rq->hrtick_csd, __hrtick_start);
 #endif
-
hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
rq->hrtick_timer.function = hrtick;
 }
@@ -650,6 +655,16 @@ static inline bool got_nohz_idle_kick(void)
return false;
 }
 
+static void nohz_csd_func(void *info)
+{
+   struct rq *rq = info;
+
+   if (got_nohz_idle_kick()) {
+   rq->idle_balance = 1;
+   raise_softirq_irqoff(SCHED_SOFTIRQ);
+   }
+}
+
 #else /* CONFIG_NO_HZ_COMMON */
 
 static inline bool got_nohz_idle_kick(void)
@@ -2292,6 +2307,11 @@ void sched_ttwu_pending(void)
rq_unlock_irqrestore(rq, &rf);
 }
 
+static void wake_csd_func(void *info)
+{
+   sched_ttwu_pending();
+}
+
 void scheduler_ipi(void)
 {
/*
@@ -2300,34 +2320,6 @@ void scheduler_ipi(void)
 * this IPI.
 */
preempt_fold_need_resched();
-
-   if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
-   return;
-
-   /*
-* Not all reschedule IPI handlers call irq_enter/irq_exit, since
-* traditionally all their work was done from the interrupt return
-* path. Now that we actually do some work, we need to make sure
-* we do call them.
-*
-* Some archs already do call them, luckily irq_enter/exit nest
-* properly.
-*
-* Arguably we should visit all archs and update all handlers,
-* however a fair share of IPIs are still resched only so this would
-* somewhat pessimize the simple resched case.
-*/
-   irq_enter();
-   sched_ttwu_pending();
-
-   /*
-* Check if someone kicked us for doing the nohz idle load balance.
-*/
-   if (unlikely(got_nohz_idle_kick())) {
-   this_rq()->idle_balance = 1;
-   raise_softirq_irqoff(SCHED_SOFTIRQ);
-   }

Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-06 Thread Paul E. McKenney
On Fri, Jun 05, 2020 at 05:51:26PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 05, 2020 at 11:41:59AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 05, 2020 at 07:16:07AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 05, 2020 at 03:14:51PM +0200, Peter Zijlstra wrote:
> > > 
> > > No KCSAN.  GCC 8.2.1.  No cgroups unless the kernel creates some.
> > > No userspace other than a C-language binary named "init" that
> > > sleeps in an infinite loop.
> > > 
> > > .config attached.
> > 
> > And in case it is helpful, here is the output of "git bisect view",
> > which lists rather more commits than "git bisect run" claims, but there
> > are only a few scheduler commits below.  I don't see anything that
> > I can prove can cause this problem, but there are some that are at
> > least related to this code path.
> > 
> > Is there anything that is actually relevant?
> 
> And the run with the WARN and tracing did hit two failures, and the
> corresponding console logs are in the attached tarball.  Both of them
> triggered a warning as well as the failure.

And the current state of the bisection, for whatever it is worth.

Thanx, Paul



dbe9337109c2 sched/cpuacct: Fix charge cpuacct.usage_sys
04f5c362ec6d sched/fair: Replace zero-length array with flexible-array
95d685935a2e sched/pelt: Sync util/runnable_sum with PELT window when 
propagating
12aa2587388d sched/cpuacct: Use __this_cpu_add() instead of this_cpu_ptr()
7d148be69e3a sched/fair: Optimize enqueue_task_fair()
9013196a467e Merge branch 'sched/urgent'
2a0a24ebb499 sched: Make scheduler_ipi inline
90b5363acd47 sched: Clean up scheduler_ipi()
b1d1779e5ef7 sched/core: Simplify sched_init()
12ac6782a40a sched/swait: Reword some of the main description
17c891ab3491 sched/fair: Use __this_cpu_read() in wake_wide()
bf2c59fce407 sched/core: Fix illegal RCU from offline CPUs
f38f12d1e081 sched/fair: Mark sched_init_granularity __init
5a6d6a6ccb5f sched/fair: Refill bandwidth before scaling
457d1f465778 sched: Extract the task putting code from pick_next_task()
d91cecc15662 sched: Make newidle_balance() static again
36c5bdc43870 sched/topology: Kill SD_LOAD_BALANCE
e669ac8ab952 sched: Remove checks against SD_LOAD_BALANCE
9818427c6270 sched/debug: Make sd->flags sysctl read-only
45da27732b0b sched/fair: find_idlest_group(): Remove unused sd_flag parameter
586b58cac8b4 exit: Move preemption fixup up, move blocking operations down
64297f2b03cc sched/fair: Simplify the code of should_we_balance()
ab93a4bc955b sched/fair: Remove distribute_running from CFS bandwidth
e98fa02c4f2e sched/fair: Eliminate bandwidth race between throttling and 
distribution
f080d93e1d41 sched/debug: Fix trival print_task() format


Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-05 Thread Paul E. McKenney
On Fri, Jun 05, 2020 at 07:16:07AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 05, 2020 at 03:14:51PM +0200, Peter Zijlstra wrote:
> 
> No KCSAN.  GCC 8.2.1.  No cgroups unless the kernel creates some.
> No userspace other than a C-language binary named "init" that
> sleeps in an infinite loop.
> 
> .config attached.

And in case it is helpful, here is the output of "git bisect view",
which lists rather more commits than "git bisect run" claims, but there
are only a few scheduler commits below.  I don't see anything that
I can prove can cause this problem, but there are some that are at
least related to this code path.

Is there anything that is actually relevant?

Thanx, Paul


Semi-plausible to my admittedly untrained eye:

a148866489fbe243c936fe43e4525d8dbfa0318f sched: Replace rq::wake_list
126c2092e5c8b28623cb890cd2930aa292410676 sched: Add rq::ttwu_pending
2ebb17717550607bcd85fb8cf7d24ac870e9d762 sched/core: Offload wakee task activation if it the wakee is descheduling
c6e7bd7afaeb3af55ffac122828035f1c01d1d7b sched/core: Optimize ttwu() spinning on p->on_cpu
7d148be69e3a0eaa9d029a3c51b545e322116a99 sched/fair: Optimize enqueue_task_fair()


Full list, which includes quite a few additional sched-related commits:

e8f4abf8fd1a2beb94983cb95ed713df75b3d135 Merge branch 'uaccess.csum' of 
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
174e1ea8a2f6140078b6c61068b478cf3c4aa74f fix a braino in ia64 uaccess csum 
changes
e7c93cbfe9cb4b0a47633099e78c455b1f79bbac Merge tag 'threads-v5.8' of 
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
d479c5a1919b4e569dcd3ae9c84ed74a675d0b94 Merge tag 'sched-core-2020-06-02' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
f6aee505c71bbb035dde146caf5a6abbf3ccbe47 Merge tag 'x86-timers-2020-06-03' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
dabc4df27c628866ede130a09121f255ca894d8c Merge tag 'timers-core-2020-06-02' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
f6606d0c0010953e4c28c8662623662b5108b4ce Merge tag 'irq-core-2020-06-02' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
d6f9469a03d832dcd17041ed67774ffb5f3e73b3 Merge tag 'erofs-for-5.8-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
cadf32234b6f6dd96a0892bf915e3ee8c438cf07 Merge tag 'jfs-5.8' of 
git://github.com/kleikamp/linux-shaggy
f3cdc8ae116e27d84e1f33c7a2995960cebb73ac Merge tag 'for-5.8-tag' of 
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
8eeae5bae1239c030ba0b34cac97ebd5e7ec1886 Merge tag 'vfs-5.8-merge-2' of 
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
96ed320d527eb071389f69cbd6772440805af7d7 Merge tag 'vfs-5.8-merge-1' of 
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
16d91548d1057691979de4686693f0ff92f46000 Merge tag 'xfs-5.8-merge-8' of 
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
25de110d148666752dc0e0da7a0b69de31cd7098 irq_work: Define irq_work_single() on 
!CONFIG_IRQ_WORK too
d77aeb5d403d379ff458e04fc07b5b86700270f2 irqchip: Fix "Loongson HyperTransport 
Vector support" driver build on all non-MIPS platforms
76fe06c1e68b8f8dfb63d5b268623830dcb16ed0 Merge tag 'irqchip-5.8' of 
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
34f853b849eb6a509eb8f40f2f5946ebb1f62739 erofs: suppress false positive 
last_block warning
f57a3fe44995a3820192e0cf7c3ebdecedd9586e erofs: convert to use the new mount 
fs_context api
da10a4b626657387845f32d37141fc7d48ebbdb3 dt-bindings: interrupt-controller: Add 
Loongson PCH MSI
632dcc2c75ef6de3272aa4ddd8f19da1f1ace323 irqchip: Add Loongson PCH MSI 
controller
b6e4bc125fc517969f97d901b1845ebf47bbea26 dt-bindings: interrupt-controller: Add 
Loongson PCH PIC
ef8c01eb64ca6719da449dab0aa9424e13c58bd0 irqchip: Add Loongson PCH PIC 
controller
6c2832c3c6edc38ab58bad29731b4951c0a90cf8 dt-bindings: interrupt-controller: Add 
Loongson HTVEC
818e915fbac518e8c78e1877a0048d92d4965e5a irqchip: Add Loongson HyperTransport 
Vector support
1d0326f352bb094771df17f045bdbadff89a43e6 genirq: Check irq_data_get_irq_chip() 
return value before use
2166e5edce9ac1edf3b113d6091ef72fcac2d6c4 btrfs: fix space_info bytes_may_use 
underflow during space cache writeout
467dc47ea99c56e966e99d09dae54869850abeeb btrfs: fix space_info bytes_may_use 
underflow after nocow buffered write
e2c8e92d1140754073ad3799eb6620c76bab2078 btrfs: fix wrong file range cleanup 
after an error filling dealloc range
213ff4b72a9c7509dd85979db64c66774f4f26c1 btrfs: remove redundant local variable 
in read_block_for_search
995e9a166b6909c9bb4af8f51b9502f8b8c18291 btrfs: open code key_search
d8f3e73587ce574f7a9bc165e0db69b0b148f6f8 btrfs: split btrfs_direct_IO to read 

Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-05 Thread Paul E. McKenney
On Fri, Jun 05, 2020 at 03:14:51PM +0200, Peter Zijlstra wrote:

No KCSAN.  GCC 8.2.1.  No cgroups unless the kernel creates some.
No userspace other than a C-language binary named "init" that
sleeps in an infinite loop.

.config attached.

> On Thu, Jun 04, 2020 at 03:54:45PM -0700, Paul E. McKenney wrote:
> 
> > BUG: kernel NULL pointer dereference, address: 0150
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x) - not-present page
> > PGD 0 P4D 0 
> > Oops:  [#1] PREEMPT SMP PTI
> > CPU: 9 PID: 196 Comm: rcu_torture_rea Not tainted 5.7.0+ #3923
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 
> > 04/01/2014
> > RIP: 0010:check_preempt_wakeup+0xb1/0x180
> > Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 
> > 48 85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 
> > 00 00 48 39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94
> 
> That is:
> 
> All code
> 
>    0: 83 ea 01                sub    $0x1,%edx
>    3: 48 8b 9b 48 01 00 00    mov    0x148(%rbx),%rbx
>    a: 39 d0                   cmp    %edx,%eax
>    c: 75 f2                   jne    0x0
>    e: 48 39 bb 50 01 00 00    cmp    %rdi,0x150(%rbx)
>   15: 75 05                   jne    0x1c
>   17: 48 85 ff                test   %rdi,%rdi
>   1a: 75 29                   jne    0x45
>   1c: 48 8b ad 48 01 00 00    mov    0x148(%rbp),%rbp
>   23: 48 8b 9b 48 01 00 00    mov    0x148(%rbx),%rbx
>   2a:*48 8b bd 50 01 00 00    mov    0x150(%rbp),%rdi    <-- trapping instruction
>   31: 48 39 bb 50 01 00 00    cmp    %rdi,0x150(%rbx)
>   38: 0f 95 c2                setne  %dl
>   3b: 48 85 ff                test   %rdi,%rdi
>   3e: 0f                      .byte 0xf
>   3f: 94                      xchg   %eax,%esp
> 
> > RSP: 0018:accdc02ecd38 EFLAGS: 00010006
> > RAX:  RBX:  RCX: afa0bc20
> > RDX:  RSI: 946b5df5 RDI: 946b5f469340
> > RBP:  R08: 946b5df80d00 R09: 0001
> 
> And you have RBP == NULL and RBX == NULL
> 
> Now, my compiler generates very similar code for this function and tells
> me this is:
> 
>   check_preempt_wakeup()
> find_matching_se()
>   is_same_group()
> if (se->cfs_rq == pse->cfs_rq) <-- *BOOM*
> 
> and pahole gives us (for struct sched_entity):
> 
>   struct sched_entity *  parent;  /* 0x148   0x8 */
>   struct cfs_rq *        cfs_rq;  /* 0x150   0x8 */
> 
> apparently both your @se and @pse are NULL.
> 
> Which shouldn't be possible..
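
The offset arithmetic in this analysis can be checked with a small userspace sketch. It assumes a 64-bit build (8-byte pointers); `struct sched_entity_model` and `cfs_rq_load_address()` are hypothetical stand-ins that reproduce only the two fields and offsets from the pahole output, not the kernel's real structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of struct sched_entity: padding followed by the two
 * fields named in the pahole output, so that they land at 0x148 and 0x150
 * when pointers are 8 bytes wide. */
struct sched_entity_model {
	char pad[0x148];
	struct sched_entity_model *parent;	/* 0x148 */
	void *cfs_rq;				/* 0x150 */
};

/* Address that a load of se->cfs_rq would touch, computed from the
 * pointer value alone without dereferencing it. */
uintptr_t cfs_rq_load_address(const struct sched_entity_model *se)
{
	return (uintptr_t)se + offsetof(struct sched_entity_model, cfs_rq);
}
```

For a NULL entity, `cfs_rq_load_address(NULL)` is 0x150, which matches both the displacement in the trapping `mov 0x150(%rbp),%rdi` and the faulting address in the oops. The preceding `mov 0x148(%rbp),%rbp` is the load of `parent`, so a NULL there is what the next instruction then dereferences.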
> 
> I also don't see a relation to my recent changes here:
> 
> > Call Trace:
> >  
> >  check_preempt_curr+0x5d/0x90
> >  ttwu_do_wakeup.isra.93+0xf/0x100
> >  sched_ttwu_pending+0x66/0x90
> >  smp_call_function_single_interrupt+0x2d/0xf0
> >  call_function_single_interrupt+0xf/0x20
> 
> Since I would expect that to blow up much earlier, like @p == NULL or
> something along those lines.
> 
> Could you perhaps try something like the below ? It would splat when we
> run out of hierarchy to ascend, which is the only semi sane scenario for
> ending up where you are.
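
The idea of complaining when the walk runs out of hierarchy can be illustrated with a userspace model. This is a sketch only and is not the patch from this thread (which is truncated in this archive); `struct entity` and `find_matching_checked()` are hypothetical names, and the kernel's WARN is replaced by a message on stderr and an error return.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a scheduling entity in a group hierarchy. */
struct entity {
	int depth;		/* 0 for a top-level entity */
	struct entity *parent;	/* NULL at the top of the hierarchy */
	const void *cfs_rq;	/* queue this entity is enqueued on */
};

/* Walk *se and *pse up their hierarchies until both sit on the same queue.
 * The deeper entity ascends first; at equal depth both ascend together.
 * Returns 0 on success, or -1 if either walk ran off the top, which is the
 * case that would otherwise end in a NULL pointer dereference. */
int find_matching_checked(struct entity **se, struct entity **pse)
{
	while (*se && *pse && (*se)->cfs_rq != (*pse)->cfs_rq) {
		int se_depth = (*se)->depth;
		int pse_depth = (*pse)->depth;

		if (se_depth >= pse_depth)
			*se = (*se)->parent;
		if (pse_depth >= se_depth)
			*pse = (*pse)->parent;
	}

	if (!*se || !*pse) {
		fprintf(stderr, "ran out of hierarchy to ascend\n");
		return -1;
	}
	return 0;
}
```

In this model, two entities whose top-level ancestors sit on the same queue always meet, so the error path is reached only when the two entities belong to different top-level queues or when the hierarchy has been corrupted.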

Let me interrupt my bisection and give your patch a try.  This will
take some time because although this reproduces, it does so slowly.

> Are you actively using cgroups or just whatever systemd decides to gift
> you? Can you perhaps dump /proc/sched_debug while your test is running?

I have no userspace, so no /proc.  Is there somewhere I can plant a
printk() to get you the information you need?

Thanx, Paul

> ---
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 35f4cc024dcfc..7aace0a5921e9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -416,8 +416,6 @@ static inline struct sched_entity *parent_entity(struct sched_entity *se)
>  static void
>  find_matching_se(struct sched_entity **se, struct sched_entity **pse)
>  {
> -	int se_depth, pse_depth;
> -
>  	/*
>  	 * preemption test can be made between sibling entities who are in the
>  	 * same cfs_rq i.e who have a common parent. Walk up the hierarchy of
> @@ -425,23 +423,22 @@ find_matching_se(struct sched_entity **se, struct sched_entity **pse)
>  	 * parent.
>  	 */
>  
> -	/* First walk up until both entities are at same depth */
> -	se_depth = (*se)->depth;
> -	pse_depth = (*pse)->depth;
> -
> -	while (se_depth > pse_depth) {
> -		se_depth--;
> -		*se = parent_entity(*se);
> -	}
> -
> -	while (pse_depth > se_depth) {
> -		pse_depth--;
> -		*pse = parent_entity(*pse);
> -	}
> -
>  	while (!is_same_group(*se, *pse)) {
> -		*se = parent_entity(*se);
> -		*pse = parent_entity(*pse);
> +		int se_depth = 

Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-05 Thread Peter Zijlstra
On Thu, Jun 04, 2020 at 03:54:45PM -0700, Paul E. McKenney wrote:

> BUG: kernel NULL pointer dereference, address: 0150
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x) - not-present page
> PGD 0 P4D 0 
> Oops:  [#1] PREEMPT SMP PTI
> CPU: 9 PID: 196 Comm: rcu_torture_rea Not tainted 5.7.0+ #3923
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 
> 04/01/2014
> RIP: 0010:check_preempt_wakeup+0xb1/0x180
> Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48 
> 85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 
> 48 39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94

That is:

All code

   0:   83 ea 01                sub    $0x1,%edx
   3:   48 8b 9b 48 01 00 00    mov    0x148(%rbx),%rbx
   a:   39 d0                   cmp    %edx,%eax
   c:   75 f2                   jne    0x0
   e:   48 39 bb 50 01 00 00    cmp    %rdi,0x150(%rbx)
  15:   75 05                   jne    0x1c
  17:   48 85 ff                test   %rdi,%rdi
  1a:   75 29                   jne    0x45
  1c:   48 8b ad 48 01 00 00    mov    0x148(%rbp),%rbp
  23:   48 8b 9b 48 01 00 00    mov    0x148(%rbx),%rbx
  2a:*  48 8b bd 50 01 00 00    mov    0x150(%rbp),%rdi    <-- trapping instruction
  31:   48 39 bb 50 01 00 00    cmp    %rdi,0x150(%rbx)
  38:   0f 95 c2                setne  %dl
  3b:   48 85 ff                test   %rdi,%rdi
  3e:   0f                      .byte 0xf
  3f:   94                      xchg   %eax,%esp

> RSP: 0018:accdc02ecd38 EFLAGS: 00010006
> RAX:  RBX:  RCX: afa0bc20
> RDX:  RSI: 946b5df5 RDI: 946b5f469340
> RBP:  R08: 946b5df80d00 R09: 0001

And you have RBP == NULL and RBX == NULL

Now, my compiler generates very similar code for this function and tells
me this is:

  check_preempt_wakeup()
    find_matching_se()
      is_same_group()
        if (se->cfs_rq == pse->cfs_rq) <-- *BOOM*

and pahole gives us (for struct sched_entity):

  struct sched_entity *  parent;    /* 0x148   0x8 */
  struct cfs_rq *        cfs_rq;    /* 0x150   0x8 */

apparently both your @se and @pse are NULL.

Which shouldn't be possible..

I also don't see a relation to my recent changes here:

> Call Trace:
>  
>  check_preempt_curr+0x5d/0x90
>  ttwu_do_wakeup.isra.93+0xf/0x100
>  sched_ttwu_pending+0x66/0x90
>  smp_call_function_single_interrupt+0x2d/0xf0
>  call_function_single_interrupt+0xf/0x20

Since I would expect that to blow up much earlier, like @p == NULL or
something along those lines.

Could you perhaps try something like the below? It would splat when we
run out of hierarchy to ascend, which is the only semi-sane scenario for
ending up where you are.

Are you actively using cgroups or just whatever systemd decides to gift
you? Can you perhaps dump /proc/sched_debug while your test is running?

---

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 35f4cc024dcfc..7aace0a5921e9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -416,8 +416,6 @@ static inline struct sched_entity *parent_entity(struct sched_entity *se)
 static void
 find_matching_se(struct sched_entity **se, struct sched_entity **pse)
 {
-	int se_depth, pse_depth;
-
 	/*
 	 * preemption test can be made between sibling entities who are in the
 	 * same cfs_rq i.e who have a common parent. Walk up the hierarchy of
@@ -425,23 +423,22 @@ find_matching_se(struct sched_entity **se, struct sched_entity **pse)
 	 * parent.
 	 */
 
-	/* First walk up until both entities are at same depth */
-	se_depth = (*se)->depth;
-	pse_depth = (*pse)->depth;
-
-	while (se_depth > pse_depth) {
-		se_depth--;
-		*se = parent_entity(*se);
-	}
-
-	while (pse_depth > se_depth) {
-		pse_depth--;
-		*pse = parent_entity(*pse);
-	}
-
 	while (!is_same_group(*se, *pse)) {
-		*se = parent_entity(*se);
-		*pse = parent_entity(*pse);
+		int se_depth = (*se)->depth;
+		int pse_depth = (*pse)->depth;
+
+		if (se_depth <= pse_depth) {
+			struct sched_entity *parent = parent_entity(*pse);
+			if (WARN_ON_ONCE(!parent))
+				return;
+			*pse = parent;
+		}
+		if (se_depth >= pse_depth) {
+			struct sched_entity *parent = parent_entity(*se);
+			if (WARN_ON_ONCE(!parent))
+				return;
+			*se = parent_entity(*se);
+		}
 	}
 }
 


Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-05 Thread Peter Zijlstra
On Fri, Jun 05, 2020 at 12:38:59PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 04, 2020 at 03:54:45PM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > I get the splat below at a rate of roughly two per thirty hours when
> > running rcutorture scenario TREE03 on x86 at the June 3rd mainline commit:
> > 
> > cb8e59cc8720 ("Merge 
> > git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
> > 
> > Running 140 hours of this same scenario at the following June 2nd mainline
> > commit shows no errors:
> > 
> > d9afbb350990 ("Merge branch 'next-general' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security")
> > 
> > I have started a bisection, but it is likely to take several days to
> > complete.  I am looking at ways of speeding this up, but in the meantime,
> > I figured that I should check to see if others are also encountering this.
> > 
> > Thoughts?
> 
> I think this shows there's a boo-boo with the IPI patches. I've not
> managed to reproduce, but I'll give them another hard look.
> 
> > Would you have a .config for me? My compiler's check_preempt_wakeup
> > isn't anywhere near 0x180 bytes long. I'm thinking you have
> > instrumentation enabled, KCSAN?

n/m, I was looking at the wrong function.. let me go puzzle.


Re: BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-05 Thread Peter Zijlstra
On Thu, Jun 04, 2020 at 03:54:45PM -0700, Paul E. McKenney wrote:
> Hello!
> 
> I get the splat below at a rate of roughly two per thirty hours when
> running rcutorture scenario TREE03 on x86 at the June 3rd mainline commit:
> 
> cb8e59cc8720 ("Merge 
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")
> 
> Running 140 hours of this same scenario at the following June 2nd mainline
> commit shows no errors:
> 
> d9afbb350990 ("Merge branch 'next-general' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security")
> 
> I have started a bisection, but it is likely to take several days to
> complete.  I am looking at ways of speeding this up, but in the meantime,
> I figured that I should check to see if others are also encountering this.
> 
> Thoughts?

I think this shows there's a boo-boo with the IPI patches. I've not
managed to reproduce, but I'll give them another hard look.

Would you have a .config for me? My compiler's check_preempt_wakeup
isn't anywhere near 0x180 bytes long. I'm thinking you have
instrumentation enabled, KCSAN?

> BUG: kernel NULL pointer dereference, address: 0150
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x) - not-present page
> PGD 0 P4D 0 
> Oops:  [#1] PREEMPT SMP PTI
> CPU: 9 PID: 196 Comm: rcu_torture_rea Not tainted 5.7.0+ #3923
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 
> 04/01/2014
> RIP: 0010:check_preempt_wakeup+0xb1/0x180
> Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48 
> 85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 
> 48 39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94
> RSP: 0018:accdc02ecd38 EFLAGS: 00010006
> RAX:  RBX:  RCX: afa0bc20
> RDX:  RSI: 946b5df5 RDI: 946b5f469340
> RBP:  R08: 946b5df80d00 R09: 0001
> R10:  R11:  R12: 946b5f469300
> R13: 0001 R14: 946b5df80d00 R15: 
> FS:  () GS:946b5f44() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0150 CR3: 16e0a000 CR4: 06e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  
>  check_preempt_curr+0x5d/0x90
>  ttwu_do_wakeup.isra.93+0xf/0x100
>  sched_ttwu_pending+0x66/0x90
>  smp_call_function_single_interrupt+0x2d/0xf0
>  call_function_single_interrupt+0xf/0x20

Right, so I frobbed at that recently, see:

a148866489fbe243c936fe43e4525d8dbfa0318f...19a1f5ec699954d21be10f74ff71c2a7079e99ad



BUG: kernel NULL pointer dereference from check_preempt_wakeup()

2020-06-04 Thread Paul E. McKenney
Hello!

I get the splat below at a rate of roughly two per thirty hours when
running rcutorture scenario TREE03 on x86 at the June 3rd mainline commit:

cb8e59cc8720 ("Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

Running 140 hours of this same scenario at the following June 2nd mainline
commit shows no errors:

d9afbb350990 ("Merge branch 'next-general' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security")

I have started a bisection, but it is likely to take several days to
complete.  I am looking at ways of speeding this up, but in the meantime,
I figured that I should check to see if others are also encountering this.

Thoughts?

Thanx, Paul



BUG: kernel NULL pointer dereference, address: 0150
#PF: supervisor read access in kernel mode
#PF: error_code(0x) - not-present page
PGD 0 P4D 0 
Oops:  [#1] PREEMPT SMP PTI
CPU: 9 PID: 196 Comm: rcu_torture_rea Not tainted 5.7.0+ #3923
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:check_preempt_wakeup+0xb1/0x180
Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48 
85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 48 
39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94
RSP: 0018:accdc02ecd38 EFLAGS: 00010006
RAX:  RBX:  RCX: afa0bc20
RDX:  RSI: 946b5df5 RDI: 946b5f469340
RBP:  R08: 946b5df80d00 R09: 0001
R10:  R11:  R12: 946b5f469300
R13: 0001 R14: 946b5df80d00 R15: 
FS:  () GS:946b5f44() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0150 CR3: 16e0a000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 
 check_preempt_curr+0x5d/0x90
 ttwu_do_wakeup.isra.93+0xf/0x100
 sched_ttwu_pending+0x66/0x90
 smp_call_function_single_interrupt+0x2d/0xf0
 call_function_single_interrupt+0xf/0x20
RIP: 0010:_raw_spin_unlock_irqrestore+0x5/0x30
Code: 81 4e ff c3 90 c6 07 00 fb bf 01 00 00 00 e8 62 b0 57 ff 65 8b 05 f3 e0 
af 50 85 c0 74 01 c3 e8 a1 81 4e ff c3 c6 07 00 56 9d  01 00 00 00 e8 41 b0 
57 ff 65 8b 05 d2 e0 af 50 85 c0 74 01 c3
RSP: 0018:accdc02ece80 EFLAGS: 0287 ORIG_RAX: ff04
RAX: 0001 RBX: 946b5df5 RCX: 946b5ed62700
RDX: 00fb RSI: 0287 RDI: 946b5df50784
RBP: 0009 R08: 166489fdaf46 R09: 946b5f45cf28
R10: accdc02ecf18 R11: accdc06a3c40 R12: 
R13: 946b5df50784 R14: 0287 R15: 946b5f5e9300
 ? call_function_single_interrupt+0xa/0x20
 try_to_wake_up+0x205/0x510
 ? trace_raw_output_hrtimer_start+0x70/0x70
 ? trace_raw_output_hrtimer_start+0x70/0x70
 call_timer_fn+0x28/0x150
 run_timer_softirq+0x17b/0x220
 ? kvm_clock_read+0x14/0x30
 ? ktime_get+0x31/0x90
 ? hpet_assign_irq+0x90/0x90
 ? lapic_next_event+0x17/0x20
 __do_softirq+0xf7/0x322
 do_softirq_own_stack+0x2a/0x40
 
 do_softirq.part.15+0x32/0x40
 __local_bh_enable_ip+0x6b/0x80
 rcutorture_one_extend+0x1a1/0x2e0
 rcu_torture_one_read+0x186/0x3a0
 ? rcu_torture_one_read+0x3a0/0x3a0
 rcu_torture_reader+0x9d/0x1d0
 ? rcu_torture_stats+0x50/0x50
 kthread+0x134/0x160
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x22/0x30
Modules linked in:
CR2: 0150
---[ end trace 04d9c8a56ef5df54 ]---
RIP: 0010:check_preempt_wakeup+0xb1/0x180
Code: 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 75 05 48 
85 ff 75 29 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 48 
39 bb 50 01 00 00 0f 95 c2 48 85 ff 0f 94
RSP: 0018:accdc02ecd38 EFLAGS: 00010006
RAX:  RBX:  RCX: afa0bc20
RDX:  RSI: 946b5df5 RDI: 946b5f469340
RBP:  R08: 946b5df80d00 R09: 0001
R10:  R11:  R12: 946b5f469300
R13: 0001 R14: 946b5df80d00 R15: 
FS:  () GS:946b5f44() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0150 CR3: 16e0a000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x2da0 from 0x8100 (relocation range: 
0x8000-0xbfff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---