Hello, On (04/06/17 19:14), Pavel Machek wrote: [..] > > @@ -1765,17 +1803,40 @@ asmlinkage int vprintk_emit(int facility, int level, > > > > printed_len += log_output(facility, level, lflags, dict, dictlen, text, > > text_len); > > > > + /* > > + * Emergency level indicates that the system is unstable and, thus, > > + * we better stop relying on wake_up(printk_kthread) and try to do > > + * a direct printing. > > + */ > > + if (level == LOGLEVEL_EMERG) > > + printk_kthread_disabled = true; > > + > > + set_bit(PRINTK_PENDING_OUTPUT, &printk_pending); > > Messages lower then _EMERG may be important, too.. and usually are, > for debugging. > > And you keep both code paths, anyway, so they have to work. So you did > not really "fix" issues you are pointing out -- they still remain > there for _EMERG and above.
we don't drop messages of lower levels. we just print then from a schedulable context. once the things go off the rails, and EMERG is a good hint, I think, we stop being optimismitcs and switch to a "best effort" mode. that is sort of reasonable. if there is a flood of EMERG messages that are not actually important and, basically, are result of a coding error, then, I think, the we must fix that coding error. I mean, there should be some common sense, and doing while (1) printk(KERN_EMERG "hello\n"); is probably not. > I agree that printing too much is a problem. Could you just print > "(messages delayed)" in that case, then wake a kernel thread to [rint > the rest? sorry, but what difference would it make? it's really unclear at what point we should offload printing if we begin that "we will offload sometimes". for example, I've seen many spin-lock lockups where printk was involved. CPU0 CPU1 CPU2 CPU3 spin_lock(&lock) spin_lock(&lock) spin_lock(&lock) printk("foo") // grabs the console_sem printk("bar") printk("a") printk("b") printk("c") ... printk("z") spin_dump() spin_dump() call_console_drivers() printk() printk() serial_driver_write() printk() printk() spin_lock_irqsave(port->lock) ... ... uart_console_write(...) trigger_all_cpu_backtrace() trigger_all_cpu_backtrace() serial_driver_putchar() while (!txrdy(...)) cpu_relax() spin_dump() and trigger_all_cpu_backtrace() result in a bunch of additional printk()-s so CPU0 has even more job to do in console_unlock(), while it still holds the contended spin_lock. and so on; there are many other examples. so should we declare a "we can spend only 2 seconds in direct printk() and then must offload printing" rule? I don't think it's much better than a simpler "we always offload, as long as we think it's safe". -ss