On Mon, 8 Feb 2016 21:35:03 +0100 Denys Vlasenko <[email protected]> wrote:
> This patch is reported to make affected user's machine survive. Would be nice to have a test case for this. Make a test module to reproduce the issue? > > Signed-off-by: Denys Vlasenko <[email protected]> > CC: [email protected] > CC: [email protected] > CC: Steven Rostedt <[email protected]> > CC: Tejun Heo <[email protected]> > CC: Peter Hurley <[email protected]> > --- > kernel/printk/printk.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index c963ba5..ca4f9d55 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -2235,6 +2235,7 @@ void console_unlock(void) > unsigned long flags; > bool wake_klogd = false; > bool do_cond_resched, retry; > + unsigned cnt; > > if (console_suspended) { > up_console_sem(); > @@ -2257,6 +2258,7 @@ void console_unlock(void) > /* flush buffered message fragment immediately to console */ > console_cont_flush(text, sizeof(text)); > again: > + cnt = 5; > for (;;) { > struct printk_log *msg; > size_t ext_len = 0; > @@ -2284,6 +2286,9 @@ skip: > if (console_seq == log_next_seq) > break; > > + if (--cnt == 0) > + break; /* Someone else printk's like crazy */ > + > msg = log_from_idx(console_idx); > if (msg->flags & LOG_NOCONS) { > /* > @@ -2350,6 +2355,26 @@ skip: > if (retry && console_trylock()) > goto again; > > + if (cnt == 0) { > + /* > + * Other CPU(s) printk like crazy, filling log_buf[]. > + * Try to get rid of the "honor" of servicing their data: > + * give _them_ time to grab console_sem and start working. > + */ > + cnt = 9999; I'll ignore that this looks very hacky. > + while (--cnt != 0) { > + cpu_relax(); > + if (console_seq == log_next_seq) { First, console_seq needs logbuf_lock protection. On some archs, this may hit 9999 every time as the console_seq is most likely in cache and isn't updating. Not to mention the race of another task moving log_next_seq too and this could have been on another CPU changing both console_seq and log_next_seq. Perhaps just save off console_seq and see if it changes at all. > + /* Good, other CPU entered "for(;;)" loop */ > + goto out; > + } > + } > + /* No one seems to be willing to take it... */ > + if (console_trylock()) > + goto again; /* we took it */ Perhaps add a few loops to the taking of the console sem. But again, this just sounds like playing with heuristics, and I hate heuristics. There's gotta be a better solution. -- Steve > + /* Nope, someone else holds console_sem! Good */ > + } > +out: > if (wake_klogd) > wake_up_klogd(); > }

