On Tue, May 31, 2016 at 04:07:33PM -0300, Daniel Bristot de Oliveira wrote:
> Currently, a schedule while atomic error prints the stack trace to the
> kernel log and the system continue running.
>
> Although it is possible to collect the kernel log messages and analyse
> it, often more information are needed. Furthermore, keep the system
> running is not always the best choice. For example, when the preempt
> count underflows the system will not stop to complain about scheduling
> while atomic, so the kernel log can wraparound overwriting the first
> stack trace, tuning the analysis even more difficult.
>
> This patch implements the kernel.panic_on_sched_in_atomic sysctl to
> help out on these more complex situations.
>
> When kernel.panic_on_sched_in_atomic is set to 1, the kernel will
> panic() in the schedule while atomic detection.
>
> The default value for the sysctl is 0, maintaining the current behavior.
>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Reviewed-by: Arnaldo Carvalho de Melo <[email protected]>
> Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>

>  Documentation/sysctl/kernel.txt | 13 +++++++++++++
>  include/linux/kernel.h          |  1 +
>  kernel/sched/core.c             |  7 +++++++
>  kernel/sysctl.c                 |  9 +++++++++
>  4 files changed, 30 insertions(+)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index 29ec4bb..0b176bb 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -59,6 +59,7 @@ show up in /proc/sys/kernel:
>  - panic_on_unrecovered_nmi
>  - panic_on_warn
>  - panic_on_rcu_stall
> +- panic_on_sched_in_atomic
>  - perf_cpu_time_max_percent
>  - perf_event_paranoid
>  - perf_event_max_stack
> @@ -630,6 +631,18 @@ is useful to define the root cause of RCU stalls using kdump.
>
>  ==============================================================
>
> +panic_on_sched_in_atomic:
> +
> +When set to 1, calls panic() in the schedule while in atomic detection.
> +This is useful get a vmcore to inspect the root cause of a schedule()
> +call in the atomic context.
> +
> +0: do not panic() on scheduling while in atomic, default behavior.
> +
> +1: panic() after printing schedule while in atomic messages.
> +
> +==============================================================
> +
>  perf_cpu_time_max_percent:
>
>  Hints to the kernel how much CPU time it should be allowed to
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index c420821..e4d9804 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -452,6 +452,7 @@ extern int panic_on_unrecovered_nmi;
>  extern int panic_on_io_nmi;
>  extern int panic_on_warn;
>  extern int sysctl_panic_on_rcu_stall;
> +extern int sysctl_panic_on_sched_in_atomic;
>  extern int sysctl_panic_on_stackoverflow;
>
>  extern bool crash_kexec_post_notifiers;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7f2cae4..602f978 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -152,6 +152,11 @@ __read_mostly int scheduler_running;
>   */
>  int sysctl_sched_rt_runtime = 950000;
>
> +/*
> + * panic on scheduling while atomic
> + */
> +__read_mostly int sysctl_panic_on_sched_in_atomic = 0;
> +
>  /* cpus with isolated domains */
>  cpumask_var_t cpu_isolated_map;
>
> @@ -3146,6 +3151,8 @@ static noinline void __schedule_bug(struct task_struct *prev)
>  		pr_cont("\n");
>  	}
>  #endif
> +	if (sysctl_panic_on_sched_in_atomic)
> +		panic("scheduling while atomic\n");
>  	dump_stack();
>  	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
>  }
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index d3b93aa..f0a984c 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1216,6 +1216,15 @@ static struct ctl_table kern_table[] = {
>  		.extra2		= &one,
>  	},
>  #endif
> +	{
> +		.procname	= "panic_on_sched_in_atomic",
> +		.data		= &sysctl_panic_on_sched_in_atomic,
> +		.maxlen		= sizeof(sysctl_panic_on_sched_in_atomic),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &zero,
> +		.extra2		= &one,
> +	},
>  	{ }
>  };
>
> --
> 2.5.5
>


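For anyone who wants to exercise the new knob from userspace, here is a minimal sketch of toggling it. It assumes a kernel with this patch applied, so that /proc/sys/kernel/panic_on_sched_in_atomic exists. The helper names write_sysctl_flag() and read_sysctl_flag() and the path macro are hypothetical and only for illustration. The early range check mirrors the extra1 = &zero / extra2 = &one bounds in the ctl_table entry, which make proc_dointvec_minmax reject anything other than 0 or 1.

```c
#include <errno.h>
#include <stdio.h>

/* Assumed path: only present on a kernel built with the patch above. */
#define PANIC_ON_SCHED_IN_ATOMIC_PATH \
	"/proc/sys/kernel/panic_on_sched_in_atomic"

/*
 * Hypothetical helper: write a 0/1 flag to a sysctl file.
 * Returns 0 on success or a negative errno value on failure.
 */
int write_sysctl_flag(const char *path, int value)
{
	FILE *f;
	int err = 0;

	/* Reject out-of-range values early, matching the 0..1 bounds. */
	if (value < 0 || value > 1)
		return -EINVAL;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;

	/* The data is flushed here, so a kernel-side rejection shows up now. */
	if (fclose(f) != 0 && !err)
		err = -errno;

	return err;
}

/*
 * Hypothetical helper: read the flag back.
 * Returns the stored value or a negative errno value on failure.
 */
int read_sysctl_flag(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -errno;

	if (fscanf(f, "%d", &value) != 1) {
		fclose(f);
		return -EINVAL;
	}

	fclose(f);
	return value;
}
```

Calling write_sysctl_flag(PANIC_ON_SCHED_IN_ATOMIC_PATH, 1) as root on a patched kernel should arm the panic. On a kernel without the patch the open fails and the helper returns a negative errno value.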