Re: linux.git: printk() problem
On Mon 2016-10-24 19:22:59, Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds >wrote: > > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > > wrote: > >> > >> I think cont_flush() should grab the logbuf_lock lock, because > >> it does log_store() and touches the cont.len. so something like > >> this perhaps > > > > Absolutely. Good catch. > > Actually, you can't do it the way you did (inside cont_flush), because > "cont_flush()" is already called with logbuf_lock held in most cases > (see "cont_add()"). > > So it's really just the timer function that needs to take the > logbuf_lock before it calls cont_flush(). > > So here's a new version. How does this look to you? > > Again, this still tests "cont.len" outside the lock (not just in > console_unlock(), but also in deferred_cont_flush()). And it's fine: > even if it sees the "wrong" value due to some race, it does so either > because cont.len was just set to non-zero (and whoever set it will > force the re-check anyway), or it got cleared just as it was tested > (and at worst you end up with an extra timer invocation). This patch really seems to reduce the number of too-early flushed continuous lines. It reduces the scattered output. But I am not sure if we want to add a timer code into the printk calls at this stage (for 4.9-rc5). Well, the patch looks fine, except that we call cont_flush() without poking console. It is not a regression because only newlines triggered console in the past and they still do but... I would suggest to revert the commit bfd8d3f23b51018388be ("printk: make reading the kernel log flush pending lines") for 4.9. Then we could test/fix it properly for 4.10. Me and Sergey would happily help with it. Just in case, you still want to commit this patch. I would suggest to apply the one below on top. >From 7a0ad7ce2347346fc81872fe42a95ad5dfba4098 Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Tue, 25 Oct 2016 15:23:13 +0200 Subject: [PATCH] printk: Poke console and other loggers when cont buffer is flushed The commit bfd8d3f23b51018388be041 ("printk: make reading the kernel log flush pending lines") allows to add new message into the log buffer without flushing it to the console and waking other loggers. This patch adds wake_up_console() and calls it when the cont buffer is flushed. The function name will make more sense once we switch to the async printk. Note that it is enough to poke console. The other loggers are waken from console_unlock() when there is a new message. The patch also renames PRINTK_PENDING flags and wake_up_klogd_work* stuff to reduce confusion. First, there are more possible log daemons, e.g. klogd, syslogd. Second, the word "console" is more descriptive than "output". This patch is based on top of the fix proposed at https://lkml.kernel.org/r/CA+55aFwKYnrMJr_vSE+GfDGszeUGyd=cpud15-zz8ywqw61...@mail.gmail.com Signed-off-by: Petr Mladek --- kernel/printk/printk.c | 60 ++ 1 file changed, 41 insertions(+), 19 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index e63aa679614e..f0e72de6ddbc 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -780,6 +780,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) } static void deferred_cont_flush(void); +static void __wake_up_console(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -1620,13 +1621,11 @@ static bool cont_flush(void) static void flush_timer(unsigned long data) { unsigned long flags; - bool did_flush; raw_spin_lock_irqsave(_lock, flags); - did_flush = cont_flush(); + if (cont_flush()) + __wake_up_console(); raw_spin_unlock_irqrestore(_lock, flags); - if (did_flush) - wake_up_klogd(); } static void deferred_cont_flush(void) @@ -2730,40 +2729,57 @@ static int __init printk_late_init(void) #if defined CONFIG_PRINTK /* - * Delayed printk version, for scheduler-internal messages: + * Handle console and wakeup loggers via IRQ work so that it can be done + * from any context. */ -#define PRINTK_PENDING_WAKEUP 0x01 -#define PRINTK_PENDING_OUTPUT 0x02 +#define PRINTK_PENDING_LOGGERS 0x01 +#define PRINTK_PENDING_CONSOLE 0x02 static DEFINE_PER_CPU(int, printk_pending); -static void wake_up_klogd_work_func(struct irq_work *irq_work) +static void printk_pending_func(struct irq_work *irq_work) { int pending = __this_cpu_xchg(printk_pending, 0); - if (pending & PRINTK_PENDING_OUTPUT) { + if (pending & PRINTK_PENDING_CONSOLE) { /* If trylock fails, someone else is doing the printing */ if (console_trylock()) console_unlock(); } - if
Re: linux.git: printk() problem
On Mon 2016-10-24 19:22:59, Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds > wrote: > > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > > wrote: > >> > >> I think cont_flush() should grab the logbuf_lock lock, because > >> it does log_store() and touches the cont.len. so something like > >> this perhaps > > > > Absolutely. Good catch. > > Actually, you can't do it the way you did (inside cont_flush), because > "cont_flush()" is already called with logbuf_lock held in most cases > (see "cont_add()"). > > So it's really just the timer function that needs to take the > logbuf_lock before it calls cont_flush(). > > So here's a new version. How does this look to you? > > Again, this still tests "cont.len" outside the lock (not just in > console_unlock(), but also in deferred_cont_flush()). And it's fine: > even if it sees the "wrong" value due to some race, it does so either > because cont.len was just set to non-zero (and whoever set it will > force the re-check anyway), or it got cleared just as it was tested > (and at worst you end up with an extra timer invocation). This patch really seems to reduce the number of too-early flushed continuous lines. It reduces the scattered output. But I am not sure if we want to add a timer code into the printk calls at this stage (for 4.9-rc5). Well, the patch looks fine, except that we call cont_flush() without poking console. It is not a regression because only newlines triggered console in the past and they still do but... I would suggest to revert the commit bfd8d3f23b51018388be ("printk: make reading the kernel log flush pending lines") for 4.9. Then we could test/fix it properly for 4.10. Me and Sergey would happily help with it. Just in case, you still want to commit this patch. I would suggest to apply the one below on top. >From 7a0ad7ce2347346fc81872fe42a95ad5dfba4098 Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Tue, 25 Oct 2016 15:23:13 +0200 Subject: [PATCH] printk: Poke console and other loggers when cont buffer is flushed The commit bfd8d3f23b51018388be041 ("printk: make reading the kernel log flush pending lines") allows to add new message into the log buffer without flushing it to the console and waking other loggers. This patch adds wake_up_console() and calls it when the cont buffer is flushed. The function name will make more sense once we switch to the async printk. Note that it is enough to poke console. The other loggers are waken from console_unlock() when there is a new message. The patch also renames PRINTK_PENDING flags and wake_up_klogd_work* stuff to reduce confusion. First, there are more possible log daemons, e.g. klogd, syslogd. Second, the word "console" is more descriptive than "output". This patch is based on top of the fix proposed at https://lkml.kernel.org/r/CA+55aFwKYnrMJr_vSE+GfDGszeUGyd=cpud15-zz8ywqw61...@mail.gmail.com Signed-off-by: Petr Mladek --- kernel/printk/printk.c | 60 ++ 1 file changed, 41 insertions(+), 19 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index e63aa679614e..f0e72de6ddbc 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -780,6 +780,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) } static void deferred_cont_flush(void); +static void __wake_up_console(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -1620,13 +1621,11 @@ static bool cont_flush(void) static void flush_timer(unsigned long data) { unsigned long flags; - bool did_flush; raw_spin_lock_irqsave(_lock, flags); - did_flush = cont_flush(); + if (cont_flush()) + __wake_up_console(); raw_spin_unlock_irqrestore(_lock, flags); - if (did_flush) - wake_up_klogd(); } static void deferred_cont_flush(void) @@ -2730,40 +2729,57 @@ static int __init printk_late_init(void) #if defined CONFIG_PRINTK /* - * Delayed printk version, for scheduler-internal messages: + * Handle console and wakeup loggers via IRQ work so that it can be done + * from any context. */ -#define PRINTK_PENDING_WAKEUP 0x01 -#define PRINTK_PENDING_OUTPUT 0x02 +#define PRINTK_PENDING_LOGGERS 0x01 +#define PRINTK_PENDING_CONSOLE 0x02 static DEFINE_PER_CPU(int, printk_pending); -static void wake_up_klogd_work_func(struct irq_work *irq_work) +static void printk_pending_func(struct irq_work *irq_work) { int pending = __this_cpu_xchg(printk_pending, 0); - if (pending & PRINTK_PENDING_OUTPUT) { + if (pending & PRINTK_PENDING_CONSOLE) { /* If trylock fails, someone else is doing the printing */ if (console_trylock()) console_unlock(); } - if (pending & PRINTK_PENDING_WAKEUP) + if (pending & PRINTK_PENDING_LOGGERS)
Re: linux.git: printk() problem
On Mon 2016-10-24 19:22:59, Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds >wrote: > > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > > wrote: > >> > >> I think cont_flush() should grab the logbuf_lock lock, because > >> it does log_store() and touches the cont.len. so something like > >> this perhaps > > > > Absolutely. Good catch. > > Actually, you can't do it the way you did (inside cont_flush), because > "cont_flush()" is already called with logbuf_lock held in most cases > (see "cont_add()"). > > So it's really just the timer function that needs to take the > logbuf_lock before it calls cont_flush(). > > So here's a new version. How does this look to you? > > Again, this still tests "cont.len" outside the lock (not just in > console_unlock(), but also in deferred_cont_flush()). And it's fine: > even if it sees the "wrong" value due to some race, it does so either > because cont.len was just set to non-zero (and whoever set it will > force the re-check anyway), or it got cleared just as it was tested > (and at worst you end up with an extra timer invocation). > > +static void flush_timer(unsigned long data) > +{ > + unsigned long flags; > + bool did_flush; > + > + raw_spin_lock_irqsave(_lock, flags); > + did_flush = cont_flush(); > + raw_spin_unlock_irqrestore(_lock, flags); > + if (did_flush) > + wake_up_klogd(); We wake only klogd and syslog but not console here. Same problem is also with some other recently added cont_flush() calls, e.g. in kmsg_dump(), kmsg_dump_get_line(). BTW: It should be safe to call wake_up_klogd() in the locked area. It just modifies a per-CPU variable and queues IRQ work. I have played with it a bit and the result is below. Feel free to just use the pieces or the idea if you like a part of it. >From 226c4bfbe9f81bdd3b91b6b0ff3e268b1c09085f Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Tue, 25 Oct 2016 15:23:13 +0200 Subject: [PATCH] printk: Poke console when cont buffer is flushed The commit bfd8d3f23b51018388be041 ("printk: make reading the kernel log flush pending lines") allows to add new message into the log buffer without flushing it to the console. This patch reworks wake_up_klogd() to a generic printk_poke(). It pokes the loggers or console or both where necessary. I have renamed some some variables and functions. I am not if it is better readable now. Anyway, the old names came from times before printk_deferred() when the irq work handled only klogd. I am not super happy the new state because the irq work is called from many locations and there are cycles. The cycle breaks when cont.len is zero but it might be worth some clean up. For example, deferred_cont_flush() call in printk_poke_func() looks superfluous. But I wanted to be rather conservative for the moment. Also I wonder if we really need per-CPU variable for the printk_pending/printk_need_poke flag. A global variable might help to reduce some calls and it would not complicate the code. This patch is based on top of the fix proposed at https://lkml.kernel.org/r/CA+55aFwKYnrMJr_vSE+GfDGszeUGyd=cpud15-zz8ywqw61...@mail.gmail.com Signed-off-by: Petr Mladek --- include/linux/printk.h | 7 -- kernel/printk/printk.c | 67 -- lib/bust_spinlocks.c | 2 +- 3 files changed, 44 insertions(+), 32 deletions(-) diff --git a/include/linux/printk.h b/include/linux/printk.h index eac1af8502bb..6fdced13d2c9 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -31,6 +31,9 @@ static inline const char *printk_skip_level(const char *buffer) return buffer; } +#define PRINTK_POKE_CONSOLE0x01 +#define PRINTK_POKE_LOGGERS0x02 + #define CONSOLE_EXT_LOG_MAX8192 /* printk's without a loglevel use this.. */ @@ -185,7 +188,7 @@ extern bool printk_timed_ratelimit(unsigned long *caller_jiffies, devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void __user *buf, size_t *lenp, loff_t *ppos); -extern void wake_up_klogd(void); +extern void printk_poke(int poke); char *log_buf_addr_get(void); u32 log_buf_len_get(void); @@ -220,7 +223,7 @@ static inline bool printk_timed_ratelimit(unsigned long *caller_jiffies, return false; } -static inline void wake_up_klogd(void) +static inline void printk_poke(int poke) { } diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index e63aa679614e..2c231bb73744 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1620,13 +1620,11 @@ static bool cont_flush(void) static void flush_timer(unsigned long data) { unsigned long flags; - bool did_flush; raw_spin_lock_irqsave(_lock, flags); - did_flush = cont_flush(); + if (cont_flush()) + printk_poke(PRINTK_POKE_CONSOLE | PRINTK_POKE_LOGGERS);
Re: linux.git: printk() problem
On Mon 2016-10-24 19:22:59, Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds > wrote: > > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > > wrote: > >> > >> I think cont_flush() should grab the logbuf_lock lock, because > >> it does log_store() and touches the cont.len. so something like > >> this perhaps > > > > Absolutely. Good catch. > > Actually, you can't do it the way you did (inside cont_flush), because > "cont_flush()" is already called with logbuf_lock held in most cases > (see "cont_add()"). > > So it's really just the timer function that needs to take the > logbuf_lock before it calls cont_flush(). > > So here's a new version. How does this look to you? > > Again, this still tests "cont.len" outside the lock (not just in > console_unlock(), but also in deferred_cont_flush()). And it's fine: > even if it sees the "wrong" value due to some race, it does so either > because cont.len was just set to non-zero (and whoever set it will > force the re-check anyway), or it got cleared just as it was tested > (and at worst you end up with an extra timer invocation). > > +static void flush_timer(unsigned long data) > +{ > + unsigned long flags; > + bool did_flush; > + > + raw_spin_lock_irqsave(_lock, flags); > + did_flush = cont_flush(); > + raw_spin_unlock_irqrestore(_lock, flags); > + if (did_flush) > + wake_up_klogd(); We wake only klogd and syslog but not console here. Same problem is also with some other recently added cont_flush() calls, e.g. in kmsg_dump(), kmsg_dump_get_line(). BTW: It should be safe to call wake_up_klogd() in the locked area. It just modifies a per-CPU variable and queues IRQ work. I have played with it a bit and the result is below. Feel free to just use the pieces or the idea if you like a part of it. >From 226c4bfbe9f81bdd3b91b6b0ff3e268b1c09085f Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Tue, 25 Oct 2016 15:23:13 +0200 Subject: [PATCH] printk: Poke console when cont buffer is flushed The commit bfd8d3f23b51018388be041 ("printk: make reading the kernel log flush pending lines") allows to add new message into the log buffer without flushing it to the console. This patch reworks wake_up_klogd() to a generic printk_poke(). It pokes the loggers or console or both where necessary. I have renamed some some variables and functions. I am not if it is better readable now. Anyway, the old names came from times before printk_deferred() when the irq work handled only klogd. I am not super happy the new state because the irq work is called from many locations and there are cycles. The cycle breaks when cont.len is zero but it might be worth some clean up. For example, deferred_cont_flush() call in printk_poke_func() looks superfluous. But I wanted to be rather conservative for the moment. Also I wonder if we really need per-CPU variable for the printk_pending/printk_need_poke flag. A global variable might help to reduce some calls and it would not complicate the code. This patch is based on top of the fix proposed at https://lkml.kernel.org/r/CA+55aFwKYnrMJr_vSE+GfDGszeUGyd=cpud15-zz8ywqw61...@mail.gmail.com Signed-off-by: Petr Mladek --- include/linux/printk.h | 7 -- kernel/printk/printk.c | 67 -- lib/bust_spinlocks.c | 2 +- 3 files changed, 44 insertions(+), 32 deletions(-) diff --git a/include/linux/printk.h b/include/linux/printk.h index eac1af8502bb..6fdced13d2c9 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -31,6 +31,9 @@ static inline const char *printk_skip_level(const char *buffer) return buffer; } +#define PRINTK_POKE_CONSOLE0x01 +#define PRINTK_POKE_LOGGERS0x02 + #define CONSOLE_EXT_LOG_MAX8192 /* printk's without a loglevel use this.. */ @@ -185,7 +188,7 @@ extern bool printk_timed_ratelimit(unsigned long *caller_jiffies, devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void __user *buf, size_t *lenp, loff_t *ppos); -extern void wake_up_klogd(void); +extern void printk_poke(int poke); char *log_buf_addr_get(void); u32 log_buf_len_get(void); @@ -220,7 +223,7 @@ static inline bool printk_timed_ratelimit(unsigned long *caller_jiffies, return false; } -static inline void wake_up_klogd(void) +static inline void printk_poke(int poke) { } diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index e63aa679614e..2c231bb73744 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1620,13 +1620,11 @@ static bool cont_flush(void) static void flush_timer(unsigned long data) { unsigned long flags; - bool did_flush; raw_spin_lock_irqsave(_lock, flags); - did_flush = cont_flush(); + if (cont_flush()) + printk_poke(PRINTK_POKE_CONSOLE | PRINTK_POKE_LOGGERS); raw_spin_unlock_irqrestore(_lock, flags); - if (did_flush) -
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 11:11:18AM -0700, Linus Torvalds wrote: > > For example, one of the really historical uses for partial lines is this: > >pr_info("Checking 'hlt' instruction... "); > >if (!boot_cpu_data.hlt_works_ok) { >pr_cont("disabled\n"); >return; >} >halt(); >halt(); >halt(); >halt(); >pr_cont("OK\n"); > > and the point was that there used to be some really old i386 machines > that hung on the "hlt" instruction (probably not because of a CPU bug, > but because of either power supply issues or some DMA issues). > > To support that, we really *had* to print out the continuation lines > even when they were partial. And that complicates the printk logic a > lot. Note, my ftrace start up tests still does exactly this (e.g.): pr_info("Testing dynamic ftrace ops: #%d: ", cnt); [ do lots of testing ] printk(KERN_CONT "PASSED\n"); Previously a change was made to buffer lines without \n, and I wasted an entire day debugging why a crash happened because I was looking at the wrong test. -- Steve > > Now, that "hlt" case is long long gone, and maybe we should just say > "screw that". It would be really quite easy to say "we don't print out > continuation lines immediately, we just buffer them for 0.1s instead, > and KERN_CONT only works for things that really happen more or less > immediately". > > Maybe that really is the right answer. Because the original cause of > us having to bend over backwards in this case is really no longer > there. And it would simplify printk a *lot*. > > Let me whip up a minimal patch for you to try.
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 11:11:18AM -0700, Linus Torvalds wrote: > > For example, one of the really historical uses for partial lines is this: > >pr_info("Checking 'hlt' instruction... "); > >if (!boot_cpu_data.hlt_works_ok) { >pr_cont("disabled\n"); >return; >} >halt(); >halt(); >halt(); >halt(); >pr_cont("OK\n"); > > and the point was that there used to be some really old i386 machines > that hung on the "hlt" instruction (probably not because of a CPU bug, > but because of either power supply issues or some DMA issues). > > To support that, we really *had* to print out the continuation lines > even when they were partial. And that complicates the printk logic a > lot. Note, my ftrace start up tests still does exactly this (e.g.): pr_info("Testing dynamic ftrace ops: #%d: ", cnt); [ do lots of testing ] printk(KERN_CONT "PASSED\n"); Previously a change was made to buffer lines without \n, and I wasted an entire day debugging why a crash happened because I was looking at the wrong test. -- Steve > > Now, that "hlt" case is long long gone, and maybe we should just say > "screw that". It would be really quite easy to say "we don't print out > continuation lines immediately, we just buffer them for 0.1s instead, > and KERN_CONT only works for things that really happen more or less > immediately". > > Maybe that really is the right answer. Because the original cause of > us having to bend over backwards in this case is really no longer > there. And it would simplify printk a *lot*. > > Let me whip up a minimal patch for you to try.
Re: linux.git: printk() problem
On (10/24/16 21:15), Linus Torvalds wrote: [..] > No. Most cont lines never hit the delay, because when the line is > completed, it is flushed (and then printed synchronously, assuming it > can get the console lock). > > So the timeout only ever comes into effect if the line isn't completed > in time at all. Which is actually very rare, and never happens for the > "let's print things out in multiple chinks because we're using a > loop". > > Similarly, if a new printk() happens due to interleaving, the previous > buffered line is always flushed first, so buffering never causes > out-of-order behavior. thanks. the patch works fine on my x86 box. > Basically, the only time the timer actually does anything is if > something just does a printk() without a newline, and no other > printouts happen for the next 0.1s. ok. perhaps, like slow serial console. will test on arm board later. -ss
Re: linux.git: printk() problem
On (10/24/16 21:15), Linus Torvalds wrote: [..] > No. Most cont lines never hit the delay, because when the line is > completed, it is flushed (and then printed synchronously, assuming it > can get the console lock). > > So the timeout only ever comes into effect if the line isn't completed > in time at all. Which is actually very rare, and never happens for the > "let's print things out in multiple chinks because we're using a > loop". > > Similarly, if a new printk() happens due to interleaving, the previous > buffered line is always flushed first, so buffering never causes > out-of-order behavior. thanks. the patch works fine on my x86 box. > Basically, the only time the timer actually does anything is if > something just does a printk() without a newline, and no other > printouts happen for the next 0.1s. ok. perhaps, like slow serial console. will test on arm board later. -ss
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 9:06 PM, Sergey Senozhatskywrote: > > 1) the way we dumpstack on x86 (at least on x86) is a spaghetti of > printk() and pr_cont() calls. for instance, arch/x86/kernel/dumpstack_64.c > show_regs() does pr_cont() to print out the registers, while the stack and > backtrace are printed with printk(). so, I assume, the backtrace now will > look a bit upside-down, because cont lines are printed with the delay. > correct? No. Most cont lines never hit the delay, because when the line is completed, it is flushed (and then printed synchronously, assuming it can get the console lock). So the timeout only ever comes into effect if the line isn't completed in time at all. Which is actually very rare, and never happens for the "let's print things out in multiple chinks because we're using a loop". Similarly, if a new printk() happens due to interleaving, the previous buffered line is always flushed first, so buffering never causes out-of-order behavior. Basically, the only time the timer actually does anything is if something just does a printk() without a newline, and no other printouts happen for the next 0.1s. > 2) flush on oops. Again, really not an issue for the exact reason above: nothing is ever buffered when something new is printed. And all you need to guarantee that last line of the oops itself is printed is that it has a newline. So again, the timer only matters for the exceptional case, not for the normal situation. It's literally there to guarantee basic timeliness. Linus
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 9:06 PM, Sergey Senozhatsky wrote: > > 1) the way we dumpstack on x86 (at least on x86) is a spaghetti of > printk() and pr_cont() calls. for instance, arch/x86/kernel/dumpstack_64.c > show_regs() does pr_cont() to print out the registers, while the stack and > backtrace are printed with printk(). so, I assume, the backtrace now will > look a bit upside-down, because cont lines are printed with the delay. > correct? No. Most cont lines never hit the delay, because when the line is completed, it is flushed (and then printed synchronously, assuming it can get the console lock). So the timeout only ever comes into effect if the line isn't completed in time at all. Which is actually very rare, and never happens for the "let's print things out in multiple chinks because we're using a loop". Similarly, if a new printk() happens due to interleaving, the previous buffered line is always flushed first, so buffering never causes out-of-order behavior. Basically, the only time the timer actually does anything is if something just does a printk() without a newline, and no other printouts happen for the next 0.1s. > 2) flush on oops. Again, really not an issue for the exact reason above: nothing is ever buffered when something new is printed. And all you need to guarantee that last line of the oops itself is printed is that it has a newline. So again, the timer only matters for the exceptional case, not for the normal situation. It's literally there to guarantee basic timeliness. Linus
Re: linux.git: printk() problem
On Tue, 2016-10-25 at 13:06 +0900, Sergey Senozhatsky wrote: > so how about skipping mod_timer in deferred_cont_flush() and just > cont_flush() when we are in oops? here is probably one more thing we > need to "fix" first. oops_in_progress is unreliable. x86 oops_end() > does bust_spinlocks(0) before it calls panic(). panic() increments > oops_in_progress but decrements it back to 0 (bust_spinlocks(0)) before > it does console_flush_on_panic(). so there is (almost) no way > console_flush_on_panic() can see oops_in_progress != 0. I think Linus' insistence on not requiring EOL/newline is misguided. It'd be simpler to avoid the flushes and have a require EOL/newline.
Re: linux.git: printk() problem
On Tue, 2016-10-25 at 13:06 +0900, Sergey Senozhatsky wrote: > so how about skipping mod_timer in deferred_cont_flush() and just > cont_flush() when we are in oops? here is probably one more thing we > need to "fix" first. oops_in_progress is unreliable. x86 oops_end() > does bust_spinlocks(0) before it calls panic(). panic() increments > oops_in_progress but decrements it back to 0 (bust_spinlocks(0)) before > it does console_flush_on_panic(). so there is (almost) no way > console_flush_on_panic() can see oops_in_progress != 0. I think Linus' insistence on not requiring EOL/newline is misguided. It'd be simpler to avoid the flushes and have a require EOL/newline.
Re: linux.git: printk() problem
On (10/24/16 19:22), Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds >wrote: > > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > > wrote: > >> > >> I think cont_flush() should grab the logbuf_lock lock, because > >> it does log_store() and touches the cont.len. so something like > >> this perhaps > > > > Absolutely. Good catch. > > Actually, you can't do it the way you did (inside cont_flush), because > "cont_flush()" is already called with logbuf_lock held in most cases > (see "cont_add()"). right. my bad, realized too late. > So it's really just the timer function that needs to take the > logbuf_lock before it calls cont_flush(). yes. > So here's a new version. How does this look to you? ok, looks much better. there are several things that I want to mention here, just to make sure we don't miss anything (just my 5 cents). 1) the way we dumpstack on x86 (at least on x86) is a spaghetti of printk() and pr_cont() calls. for instance, arch/x86/kernel/dumpstack_64.c show_regs() does pr_cont() to print out the registers, while the stack and backtrace are printed with printk(). so, I assume, the backtrace now will look a bit upside-down, because cont lines are printed with the delay. correct? 2) flush on oops. not that panic printk is deadlock proof (not at all) but: a) rather unlikely, but what if BUG_ON() or panic() happens under lock_timer_base()? b) what if timer event never happens? (we are in panic, who knows) so how about skipping mod_timer in deferred_cont_flush() and just cont_flush() when we are in oops? here is probably one more thing we need to "fix" first. oops_in_progress is unreliable. x86 oops_end() does bust_spinlocks(0) before it calls panic(). panic() increments oops_in_progress but decrements it back to 0 (bust_spinlocks(0)) before it does console_flush_on_panic(). so there is (almost) no way console_flush_on_panic() can see oops_in_progress != 0. diff --git a/kernel/panic.c b/kernel/panic.c index e6480e2..8e17540 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -228,7 +228,6 @@ void panic(const char *fmt, ...) if (_crash_kexec_post_notifiers) __crash_kexec(NULL); - bust_spinlocks(0); /* * We may have ended up stopping the CPU holding the lock (in @@ -240,6 +239,7 @@ void panic(const char *fmt, ...) */ debug_locks_off(); console_flush_on_panic(); + bust_spinlocks(0); if (!panic_blink) panic_blink = no_blink; --- -ss
Re: linux.git: printk() problem
On (10/24/16 19:22), Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds > wrote: > > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > > wrote: > >> > >> I think cont_flush() should grab the logbuf_lock lock, because > >> it does log_store() and touches the cont.len. so something like > >> this perhaps > > > > Absolutely. Good catch. > > Actually, you can't do it the way you did (inside cont_flush), because > "cont_flush()" is already called with logbuf_lock held in most cases > (see "cont_add()"). right. my bad, realized too late. > So it's really just the timer function that needs to take the > logbuf_lock before it calls cont_flush(). yes. > So here's a new version. How does this look to you? ok, looks much better. there are several things that I want to mention here, just to make sure we don't miss anything (just my 5 cents). 1) the way we dumpstack on x86 (at least on x86) is a spaghetti of printk() and pr_cont() calls. for instance, arch/x86/kernel/dumpstack_64.c show_regs() does pr_cont() to print out the registers, while the stack and backtrace are printed with printk(). so, I assume, the backtrace now will look a bit upside-down, because cont lines are printed with the delay. correct? 2) flush on oops. not that panic printk is deadlock proof (not at all) but: a) rather unlikely, but what if BUG_ON() or panic() happens under lock_timer_base()? b) what if timer event never happens? (we are in panic, who knows) so how about skipping mod_timer in deferred_cont_flush() and just cont_flush() when we are in oops? here is probably one more thing we need to "fix" first. oops_in_progress is unreliable. x86 oops_end() does bust_spinlocks(0) before it calls panic(). panic() increments oops_in_progress but decrements it back to 0 (bust_spinlocks(0)) before it does console_flush_on_panic(). so there is (almost) no way console_flush_on_panic() can see oops_in_progress != 0. diff --git a/kernel/panic.c b/kernel/panic.c index e6480e2..8e17540 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -228,7 +228,6 @@ void panic(const char *fmt, ...) if (_crash_kexec_post_notifiers) __crash_kexec(NULL); - bust_spinlocks(0); /* * We may have ended up stopping the CPU holding the lock (in @@ -240,6 +239,7 @@ void panic(const char *fmt, ...) */ debug_locks_off(); console_flush_on_panic(); + bust_spinlocks(0); if (!panic_blink) panic_blink = no_blink; --- -ss
Re: linux.git: printk() problem
On (10/25/16 10:55), Sergey Senozhatsky wrote: > I think cont_flush() should grab the logbuf_lock lock, because > it does log_store() and touches the cont.len. so something like > this perhaps > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index c7f490f..47f887c 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1608,13 +1608,20 @@ static struct cont { > > static bool cont_flush(void) > { > + unsigned long flags; > + bool flushed = false; > + > + raw_spin_lock_irqsave(_lock, flags); > if (!cont.len) > - return false; > + goto out; > > log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec, > NULL, 0, cont.buf, cont.len); > cont.len = 0; > - return true; > + flushed = true; > +out: > + raw_spin_unlock_irqrestore(_lock, flags); > + return flushed; > } ... clearly, wasn't tested at all! sorry about that. what I meant was cont_flush() from deferred_cont_flush()->flush_timer(). diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index c7f490f..bd7841c 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1619,7 +1619,14 @@ static bool cont_flush(void) static void flush_timer(unsigned long data) { - if (cont_flush()) + unsigned long flags; + bool flushed; + + raw_spin_lock_irqsave(_lock, flags); + flushed = cont_flush(); + raw_spin_unlock_irqrestore(_lock, flags); + + if (flushed) wake_up_klogd(); } -ss
Re: linux.git: printk() problem
On (10/25/16 10:55), Sergey Senozhatsky wrote: > I think cont_flush() should grab the logbuf_lock lock, because > it does log_store() and touches the cont.len. so something like > this perhaps > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index c7f490f..47f887c 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1608,13 +1608,20 @@ static struct cont { > > static bool cont_flush(void) > { > + unsigned long flags; > + bool flushed = false; > + > + raw_spin_lock_irqsave(_lock, flags); > if (!cont.len) > - return false; > + goto out; > > log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec, > NULL, 0, cont.buf, cont.len); > cont.len = 0; > - return true; > + flushed = true; > +out: > + raw_spin_unlock_irqrestore(_lock, flags); > + return flushed; > } ... clearly, wasn't tested at all! sorry about that. what I meant was cont_flush() from deferred_cont_flush()->flush_timer(). diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index c7f490f..bd7841c 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1619,7 +1619,14 @@ static bool cont_flush(void) static void flush_timer(unsigned long data) { - if (cont_flush()) + unsigned long flags; + bool flushed; + + raw_spin_lock_irqsave(_lock, flags); + flushed = cont_flush(); + raw_spin_unlock_irqrestore(_lock, flags); + + if (flushed) wake_up_klogd(); } -ss
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvaldswrote: > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > wrote: >> >> I think cont_flush() should grab the logbuf_lock lock, because >> it does log_store() and touches the cont.len. so something like >> this perhaps > > Absolutely. Good catch. Actually, you can't do it the way you did (inside cont_flush), because "cont_flush()" is already called with logbuf_lock held in most cases (see "cont_add()"). So it's really just the timer function that needs to take the logbuf_lock before it calls cont_flush(). So here's a new version. How does this look to you? Again, this still tests "cont.len" outside the lock (not just in console_unlock(), but also in deferred_cont_flush()). And it's fine: even if it sees the "wrong" value due to some race, it does so either because cont.len was just set to non-zero (and whoever set it will force the re-check anyway), or it got cleared just as it was tested (and at worst you end up with an extra timer invocation). Linus kernel/printk/printk.c | 255 +++-- 1 file changed, 58 insertions(+), 197 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index de08fc90baaf..e63aa679614e 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -367,7 +367,6 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait); /* the next printk record to read by syslog(READ) or /proc/kmsg */ static u64 syslog_seq; static u32 syslog_idx; -static enum log_flags syslog_prev; static size_t syslog_partial; /* index and sequence number of the first record stored in the buffer */ @@ -381,7 +380,6 @@ static u32 log_next_idx; /* the next printk record to write to the console */ static u64 console_seq; static u32 console_idx; -static enum log_flags console_prev; /* the next printk record to read after the last 'clear' command */ static u64 clear_seq; @@ -650,27 +648,15 @@ static void append_char(char **pp, char *e, char c) } static ssize_t msg_print_ext_header(char *buf, size_t size, - struct printk_log *msg, u64 seq, - enum log_flags prev_flags) + struct printk_log *msg, u64 seq) { u64 ts_usec = msg->ts_nsec; - char cont = '-'; do_div(ts_usec, 1000); - /* -* If we couldn't merge continuation line fragments during the print, -* export the stored flags to allow an optional external merge of the -* records. Merging the records isn't always neccessarily correct, like -* when we hit a race during printing. In most cases though, it produces -* better readable output. 'c' in the record flags mark the first -* fragment of a line, '+' the following. -*/ - if (msg->flags & LOG_CONT) - cont = (prev_flags & LOG_CONT) ? '+' : 'c'; - return scnprintf(buf, size, "%u,%llu,%llu,%c;", - (msg->facility << 3) | msg->level, seq, ts_usec, cont); + (msg->facility << 3) | msg->level, seq, ts_usec, + msg->flags & LOG_CONT ? 'c' : '-'); } static ssize_t msg_print_ext_body(char *buf, size_t size, @@ -725,7 +711,6 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, struct devkmsg_user { u64 seq; u32 idx; - enum log_flags prev; struct ratelimit_state rs; struct mutex lock; char buf[CONSOLE_EXT_LOG_MAX]; @@ -794,7 +779,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) return ret; } -static void cont_flush(void); +static void deferred_cont_flush(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -811,7 +796,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, if (ret) return ret; raw_spin_lock_irq(_lock); - cont_flush(); while (user->seq == log_next_seq) { if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; @@ -838,12 +822,11 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, msg = log_from_idx(user->idx); len = msg_print_ext_header(user->buf, sizeof(user->buf), - msg, user->seq, user->prev); + msg, user->seq); len += msg_print_ext_body(user->buf + len, sizeof(user->buf) - len, log_dict(msg), msg->dict_len, log_text(msg), msg->text_len); - user->prev = msg->flags; user->idx = log_next(user->idx); user->seq++; raw_spin_unlock_irq(_lock); @@ -860,6 +843,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, ret = len; out:
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 7:06 PM, Linus Torvalds wrote: > On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky > wrote: >> >> I think cont_flush() should grab the logbuf_lock lock, because >> it does log_store() and touches the cont.len. so something like >> this perhaps > > Absolutely. Good catch. Actually, you can't do it the way you did (inside cont_flush), because "cont_flush()" is already called with logbuf_lock held in most cases (see "cont_add()"). So it's really just the timer function that needs to take the logbuf_lock before it calls cont_flush(). So here's a new version. How does this look to you? Again, this still tests "cont.len" outside the lock (not just in console_unlock(), but also in deferred_cont_flush()). And it's fine: even if it sees the "wrong" value due to some race, it does so either because cont.len was just set to non-zero (and whoever set it will force the re-check anyway), or it got cleared just as it was tested (and at worst you end up with an extra timer invocation). Linus kernel/printk/printk.c | 255 +++-- 1 file changed, 58 insertions(+), 197 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index de08fc90baaf..e63aa679614e 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -367,7 +367,6 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait); /* the next printk record to read by syslog(READ) or /proc/kmsg */ static u64 syslog_seq; static u32 syslog_idx; -static enum log_flags syslog_prev; static size_t syslog_partial; /* index and sequence number of the first record stored in the buffer */ @@ -381,7 +380,6 @@ static u32 log_next_idx; /* the next printk record to write to the console */ static u64 console_seq; static u32 console_idx; -static enum log_flags console_prev; /* the next printk record to read after the last 'clear' command */ static u64 clear_seq; @@ -650,27 +648,15 @@ static void append_char(char **pp, char *e, char c) } static ssize_t msg_print_ext_header(char *buf, size_t size, - struct printk_log *msg, u64 seq, - enum log_flags prev_flags) + struct printk_log *msg, u64 seq) { u64 ts_usec = msg->ts_nsec; - char cont = '-'; do_div(ts_usec, 1000); - /* -* If we couldn't merge continuation line fragments during the print, -* export the stored flags to allow an optional external merge of the -* records. Merging the records isn't always neccessarily correct, like -* when we hit a race during printing. In most cases though, it produces -* better readable output. 'c' in the record flags mark the first -* fragment of a line, '+' the following. -*/ - if (msg->flags & LOG_CONT) - cont = (prev_flags & LOG_CONT) ? '+' : 'c'; - return scnprintf(buf, size, "%u,%llu,%llu,%c;", - (msg->facility << 3) | msg->level, seq, ts_usec, cont); + (msg->facility << 3) | msg->level, seq, ts_usec, + msg->flags & LOG_CONT ? 'c' : '-'); } static ssize_t msg_print_ext_body(char *buf, size_t size, @@ -725,7 +711,6 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, struct devkmsg_user { u64 seq; u32 idx; - enum log_flags prev; struct ratelimit_state rs; struct mutex lock; char buf[CONSOLE_EXT_LOG_MAX]; @@ -794,7 +779,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) return ret; } -static void cont_flush(void); +static void deferred_cont_flush(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -811,7 +796,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, if (ret) return ret; raw_spin_lock_irq(_lock); - cont_flush(); while (user->seq == log_next_seq) { if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; @@ -838,12 +822,11 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, msg = log_from_idx(user->idx); len = msg_print_ext_header(user->buf, sizeof(user->buf), - msg, user->seq, user->prev); + msg, user->seq); len += msg_print_ext_body(user->buf + len, sizeof(user->buf) - len, log_dict(msg), msg->dict_len, log_text(msg), msg->text_len); - user->prev = msg->flags; user->idx = log_next(user->idx); user->seq++; raw_spin_unlock_irq(_lock); @@ -860,6 +843,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, ret = len; out: mutex_unlock(>lock); + deferred_cont_flush(); return ret; }
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatskywrote: > > I think cont_flush() should grab the logbuf_lock lock, because > it does log_store() and touches the cont.len. so something like > this perhaps Absolutely. Good catch. >> - if (wake_klogd) >> + if (wake_klogd || cont.len) > ^^ > this _technically_ can result in additional spurious wakeups - cont.len > check is done outside of console_sem && logbuf_lock - but I don't think > this is a huge problem. Yes. We could easily just do it inside the spinlock and make it part of the wake_klogd logic, but as you say, it doesn't actually matter. And yes, it would be lovely if people who use the network console (which I think is the only user of the whole extended console thing) would test this too. It really makes things potentially much simpler. It's not just that it removes 150 lines, it's 150 lines of really nasty complex special case crud. I'm obviously not going to apply it for 4.9, but it might be "do it for 4.10 and mark it for stable if nobody notices any issues", because apart from the code simplification it also gets more cases "right". Right now the console flushing often means that we end up not merging continuation lines as much as we could. Linus
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 6:55 PM, Sergey Senozhatsky wrote: > > I think cont_flush() should grab the logbuf_lock lock, because > it does log_store() and touches the cont.len. so something like > this perhaps Absolutely. Good catch. >> - if (wake_klogd) >> + if (wake_klogd || cont.len) > ^^ > this _technically_ can result in additional spurious wakeups - cont.len > check is done outside of console_sem && logbuf_lock - but I don't think > this is a huge problem. Yes. We could easily just do it inside the spinlock and make it part of the wake_klogd logic, but as you say, it doesn't actually matter. And yes, it would be lovely if people who use the network console (which I think is the only user of the whole extended console thing) would test this too. It really makes things potentially much simpler. It's not just that it removes 150 lines, it's 150 lines of really nasty complex special case crud. I'm obviously not going to apply it for 4.9, but it might be "do it for 4.10 and mark it for stable if nobody notices any issues", because apart from the code simplification it also gets more cases "right". Right now the console flushing often means that we end up not merging continuation lines as much as we could. Linus
Re: linux.git: printk() problem
Hello, Cc more people report: https://marc.info/?l=linux-kernel=147721454506634=2 patch: https://marc.info/?l=linux-kernel=147733173800859 FB is using ext header a lot (afaik), so may be Tejun or Calvin will also be interested. On (10/24/16 10:55), Linus Torvalds wrote: > > Note the "totally untested" part. It compiles for me. But it may do > > unspeakably stupid things. Caveat applior. > > Well, it is hard to apply a patch that I didn't even attach. Blush. [..] > -static void cont_flush(void) > +static bool cont_flush(void) > { > - if (cont.flushed) > - return; > - if (cont.len == 0) > + if (!cont.len) > + return false; > + > + log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec, > + NULL, 0, cont.buf, cont.len); > + cont.len = 0; > + return true; > +} I think cont_flush() should grab the logbuf_lock lock, because it does log_store() and touches the cont.len. so something like this perhaps diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index c7f490f..47f887c 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1608,13 +1608,20 @@ static struct cont { static bool cont_flush(void) { + unsigned long flags; + bool flushed = false; + + raw_spin_lock_irqsave(_lock, flags); if (!cont.len) - return false; + goto out; log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec, NULL, 0, cont.buf, cont.len); cont.len = 0; - return true; + flushed = true; +out: + raw_spin_unlock_irqrestore(_lock, flags); + return flushed; } [..] > @@ -2449,7 +2311,6 @@ void console_unlock(void) > } > console_idx = log_next(console_idx); > console_seq++; > - console_prev = msg->flags; > raw_spin_unlock(_lock); > > stop_critical_timings();/* don't trace print latency */ > @@ -2483,7 +2344,7 @@ void console_unlock(void) > if (retry && console_trylock()) > goto again; > > - if (wake_klogd) > + if (wake_klogd || cont.len) ^^ this _technically_ can result in additional spurious wakeups - cont.len check is done outside of console_sem && logbuf_lock - but I don't think this is a huge problem. -ss
Re: linux.git: printk() problem
Hello, Cc more people report: https://marc.info/?l=linux-kernel=147721454506634=2 patch: https://marc.info/?l=linux-kernel=147733173800859 FB is using ext header a lot (afaik), so may be Tejun or Calvin will also be interested. On (10/24/16 10:55), Linus Torvalds wrote: > > Note the "totally untested" part. It compiles for me. But it may do > > unspeakably stupid things. Caveat applior. > > Well, it is hard to apply a patch that I didn't even attach. Blush. [..] > -static void cont_flush(void) > +static bool cont_flush(void) > { > - if (cont.flushed) > - return; > - if (cont.len == 0) > + if (!cont.len) > + return false; > + > + log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec, > + NULL, 0, cont.buf, cont.len); > + cont.len = 0; > + return true; > +} I think cont_flush() should grab the logbuf_lock lock, because it does log_store() and touches the cont.len. so something like this perhaps diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index c7f490f..47f887c 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1608,13 +1608,20 @@ static struct cont { static bool cont_flush(void) { + unsigned long flags; + bool flushed = false; + + raw_spin_lock_irqsave(_lock, flags); if (!cont.len) - return false; + goto out; log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec, NULL, 0, cont.buf, cont.len); cont.len = 0; - return true; + flushed = true; +out: + raw_spin_unlock_irqrestore(_lock, flags); + return flushed; } [..] > @@ -2449,7 +2311,6 @@ void console_unlock(void) > } > console_idx = log_next(console_idx); > console_seq++; > - console_prev = msg->flags; > raw_spin_unlock(_lock); > > stop_critical_timings();/* don't trace print latency */ > @@ -2483,7 +2344,7 @@ void console_unlock(void) > if (retry && console_trylock()) > goto again; > > - if (wake_klogd) > + if (wake_klogd || cont.len) ^^ this _technically_ can result in additional spurious wakeups - cont.len check is done outside of console_sem && logbuf_lock - but I don't think this is a huge problem. -ss
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 10:54 AM, Linus Torvaldswrote: > > Note the "totally untested" part. It compiles for me. But it may do > unspeakably stupid things. Caveat applior. Well, it is hard to apply a patch that I didn't even attach. Blush. Linus kernel/printk/printk.c | 249 +++-- 1 file changed, 52 insertions(+), 197 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index de08fc90baaf..6df4d3f6c276 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -367,7 +367,6 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait); /* the next printk record to read by syslog(READ) or /proc/kmsg */ static u64 syslog_seq; static u32 syslog_idx; -static enum log_flags syslog_prev; static size_t syslog_partial; /* index and sequence number of the first record stored in the buffer */ @@ -381,7 +380,6 @@ static u32 log_next_idx; /* the next printk record to write to the console */ static u64 console_seq; static u32 console_idx; -static enum log_flags console_prev; /* the next printk record to read after the last 'clear' command */ static u64 clear_seq; @@ -650,27 +648,15 @@ static void append_char(char **pp, char *e, char c) } static ssize_t msg_print_ext_header(char *buf, size_t size, - struct printk_log *msg, u64 seq, - enum log_flags prev_flags) + struct printk_log *msg, u64 seq) { u64 ts_usec = msg->ts_nsec; - char cont = '-'; do_div(ts_usec, 1000); - /* -* If we couldn't merge continuation line fragments during the print, -* export the stored flags to allow an optional external merge of the -* records. Merging the records isn't always neccessarily correct, like -* when we hit a race during printing. In most cases though, it produces -* better readable output. 'c' in the record flags mark the first -* fragment of a line, '+' the following. -*/ - if (msg->flags & LOG_CONT) - cont = (prev_flags & LOG_CONT) ? '+' : 'c'; - return scnprintf(buf, size, "%u,%llu,%llu,%c;", - (msg->facility << 3) | msg->level, seq, ts_usec, cont); + (msg->facility << 3) | msg->level, seq, ts_usec, + msg->flags & LOG_CONT ? 'c' : '-'); } static ssize_t msg_print_ext_body(char *buf, size_t size, @@ -725,7 +711,6 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, struct devkmsg_user { u64 seq; u32 idx; - enum log_flags prev; struct ratelimit_state rs; struct mutex lock; char buf[CONSOLE_EXT_LOG_MAX]; @@ -794,7 +779,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) return ret; } -static void cont_flush(void); +static void deferred_cont_flush(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -811,7 +796,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, if (ret) return ret; raw_spin_lock_irq(_lock); - cont_flush(); while (user->seq == log_next_seq) { if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; @@ -838,12 +822,11 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, msg = log_from_idx(user->idx); len = msg_print_ext_header(user->buf, sizeof(user->buf), - msg, user->seq, user->prev); + msg, user->seq); len += msg_print_ext_body(user->buf + len, sizeof(user->buf) - len, log_dict(msg), msg->dict_len, log_text(msg), msg->text_len); - user->prev = msg->flags; user->idx = log_next(user->idx); user->seq++; raw_spin_unlock_irq(_lock); @@ -860,6 +843,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, ret = len; out: mutex_unlock(>lock); + deferred_cont_flush(); return ret; } @@ -874,7 +858,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) return -ESPIPE; raw_spin_lock_irq(_lock); - cont_flush(); switch (whence) { case SEEK_SET: /* the first record */ @@ -913,7 +896,6 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) poll_wait(file, _wait, wait); raw_spin_lock_irq(_lock); - cont_flush(); if (user->seq < log_next_seq) { /* return error when data has vanished underneath us */ if (user->seq < log_first_seq) @@ -922,6 +904,7 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) ret =
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 10:54 AM, Linus Torvalds wrote: > > Note the "totally untested" part. It compiles for me. But it may do > unspeakably stupid things. Caveat applior. Well, it is hard to apply a patch that I didn't even attach. Blush. Linus kernel/printk/printk.c | 249 +++-- 1 file changed, 52 insertions(+), 197 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index de08fc90baaf..6df4d3f6c276 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -367,7 +367,6 @@ DECLARE_WAIT_QUEUE_HEAD(log_wait); /* the next printk record to read by syslog(READ) or /proc/kmsg */ static u64 syslog_seq; static u32 syslog_idx; -static enum log_flags syslog_prev; static size_t syslog_partial; /* index and sequence number of the first record stored in the buffer */ @@ -381,7 +380,6 @@ static u32 log_next_idx; /* the next printk record to write to the console */ static u64 console_seq; static u32 console_idx; -static enum log_flags console_prev; /* the next printk record to read after the last 'clear' command */ static u64 clear_seq; @@ -650,27 +648,15 @@ static void append_char(char **pp, char *e, char c) } static ssize_t msg_print_ext_header(char *buf, size_t size, - struct printk_log *msg, u64 seq, - enum log_flags prev_flags) + struct printk_log *msg, u64 seq) { u64 ts_usec = msg->ts_nsec; - char cont = '-'; do_div(ts_usec, 1000); - /* -* If we couldn't merge continuation line fragments during the print, -* export the stored flags to allow an optional external merge of the -* records. Merging the records isn't always neccessarily correct, like -* when we hit a race during printing. In most cases though, it produces -* better readable output. 'c' in the record flags mark the first -* fragment of a line, '+' the following. -*/ - if (msg->flags & LOG_CONT) - cont = (prev_flags & LOG_CONT) ? '+' : 'c'; - return scnprintf(buf, size, "%u,%llu,%llu,%c;", - (msg->facility << 3) | msg->level, seq, ts_usec, cont); + (msg->facility << 3) | msg->level, seq, ts_usec, + msg->flags & LOG_CONT ? 'c' : '-'); } static ssize_t msg_print_ext_body(char *buf, size_t size, @@ -725,7 +711,6 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, struct devkmsg_user { u64 seq; u32 idx; - enum log_flags prev; struct ratelimit_state rs; struct mutex lock; char buf[CONSOLE_EXT_LOG_MAX]; @@ -794,7 +779,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) return ret; } -static void cont_flush(void); +static void deferred_cont_flush(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -811,7 +796,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, if (ret) return ret; raw_spin_lock_irq(_lock); - cont_flush(); while (user->seq == log_next_seq) { if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; @@ -838,12 +822,11 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, msg = log_from_idx(user->idx); len = msg_print_ext_header(user->buf, sizeof(user->buf), - msg, user->seq, user->prev); + msg, user->seq); len += msg_print_ext_body(user->buf + len, sizeof(user->buf) - len, log_dict(msg), msg->dict_len, log_text(msg), msg->text_len); - user->prev = msg->flags; user->idx = log_next(user->idx); user->seq++; raw_spin_unlock_irq(_lock); @@ -860,6 +843,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, ret = len; out: mutex_unlock(>lock); + deferred_cont_flush(); return ret; } @@ -874,7 +858,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) return -ESPIPE; raw_spin_lock_irq(_lock); - cont_flush(); switch (whence) { case SEEK_SET: /* the first record */ @@ -913,7 +896,6 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) poll_wait(file, _wait, wait); raw_spin_lock_irq(_lock); - cont_flush(); if (user->seq < log_next_seq) { /* return error when data has vanished underneath us */ if (user->seq < log_first_seq) @@ -922,6 +904,7 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) ret = POLLIN|POLLRDNORM; }
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 7:08 AM, Sergey Senozhatskywrote: > > is mod_timer() safe enough to rely on/call from > panic()->console_flush_on_panic()->console_unlock() ? I don't think that's a big issue: the whole "panic()" siotuation is very much about best effort. > shouldn't deferred_cont_flush() be called every time we jump > to `again' label in console_unlock()? actually, I think we should just do it at the end. And your "spinlock bad magic" reports says to me that we should actually delay it even more, and just do it in klogd, to avoid the whole issue with "printk() is used very _very_ early during boot". Because if klogd isn't even running yet, then we clearly don't care about timely flushing to the logs. So here's a new version - TOTALLY UNTESTED! - that removes even more redundant code (all the "test previous record flags" crud - it is not sensible to think that the previous record was what a continuation was all about, since the whole reason the continuation didn't get merged was that something else happened in between). Note the "totally untested" part. It compiles for me. But it may do unspeakably stupid things. Caveat applior. Linus
Re: linux.git: printk() problem
On Mon, Oct 24, 2016 at 7:08 AM, Sergey Senozhatsky wrote: > > is mod_timer() safe enough to rely on/call from > panic()->console_flush_on_panic()->console_unlock() ? I don't think that's a big issue: the whole "panic()" siotuation is very much about best effort. > shouldn't deferred_cont_flush() be called every time we jump > to `again' label in console_unlock()? actually, I think we should just do it at the end. And your "spinlock bad magic" reports says to me that we should actually delay it even more, and just do it in klogd, to avoid the whole issue with "printk() is used very _very_ early during boot". Because if klogd isn't even running yet, then we clearly don't care about timely flushing to the logs. So here's a new version - TOTALLY UNTESTED! - that removes even more redundant code (all the "test previous record flags" crud - it is not sensible to think that the previous record was what a continuation was all about, since the whole reason the continuation didn't get merged was that something else happened in between). Note the "totally untested" part. It compiles for me. But it may do unspeakably stupid things. Caveat applior. Linus
Re: linux.git: printk() problem
> On (10/23/16 12:46), Linus Torvalds wrote: > > +static void deferred_cont_flush(void) > > +{ > > + static DEFINE_TIMER(timer, flush_timer, 0, 0); > > + > > + if (!cont.len) > > return; > > + mod_timer(, jiffies + HZ/10); > > } > > [..] > > > @@ -2360,6 +2285,8 @@ void console_unlock(void) > > return; > > } > > > > + deferred_cont_flush(); > > + > [0.00] BUG: spinlock bad magic on CPU#0, swapper/0 [0.00] lock: __init_begin+0x1cebc0/0x1da000, .magic: , .owner: swapper/0, .owner_cpu: 0 [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-next-20161024-dbg-dirty #150 [0.00] 81803c18 81221216 8180a540 81ac7bc0 [0.00] 81803c38 81081414 81ac7bc0 0046 [0.00] 81803c58 81081444 81ac7bc0 81737e85 [0.00] Call Trace: [0.00] [] dump_stack+0x4f/0x65 [0.00] [] spin_dump+0x8a/0x8f [0.00] [] spin_bug+0x2b/0x2d [0.00] [] do_raw_spin_unlock+0x22/0x81 [0.00] [] _raw_spin_unlock_irqrestore+0x2c/0x4e [0.00] [] mod_timer+0x266/0x2a0 [0.00] [] deferred_cont_flush+0x2a/0x2c [0.00] [] console_unlock+0x28/0x45c [0.00] [] ? vprintk_emit+0x3a6/0x3c5 [0.00] [] vprintk_emit+0x3af/0x3c5 [0.00] [] vprintk_default+0x1d/0x1f [0.00] [] printk+0x48/0x50 [0.00] [] ? vprintk_default+0x1d/0x1f [0.00] [] e820_print_map+0x4a/0x70 [0.00] [] ? 0x8100 [0.00] [] setup_memory_map+0x37/0x3a [0.00] [] setup_arch+0x16d/0x890 [0.00] [] start_kernel+0x59/0x40c [0.00] [] x86_64_start_reservations+0x2a/0x2c [0.00] [] x86_64_start_kernel+0x17a/0x18d [0.00] [] start_cpu+0x5/0x14 [0.00] [] ? start_cpu+0x5/0x14 [0.00] usable -ss
Re: linux.git: printk() problem
> On (10/23/16 12:46), Linus Torvalds wrote: > > +static void deferred_cont_flush(void) > > +{ > > + static DEFINE_TIMER(timer, flush_timer, 0, 0); > > + > > + if (!cont.len) > > return; > > + mod_timer(, jiffies + HZ/10); > > } > > [..] > > > @@ -2360,6 +2285,8 @@ void console_unlock(void) > > return; > > } > > > > + deferred_cont_flush(); > > + > [0.00] BUG: spinlock bad magic on CPU#0, swapper/0 [0.00] lock: __init_begin+0x1cebc0/0x1da000, .magic: , .owner: swapper/0, .owner_cpu: 0 [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-next-20161024-dbg-dirty #150 [0.00] 81803c18 81221216 8180a540 81ac7bc0 [0.00] 81803c38 81081414 81ac7bc0 0046 [0.00] 81803c58 81081444 81ac7bc0 81737e85 [0.00] Call Trace: [0.00] [] dump_stack+0x4f/0x65 [0.00] [] spin_dump+0x8a/0x8f [0.00] [] spin_bug+0x2b/0x2d [0.00] [] do_raw_spin_unlock+0x22/0x81 [0.00] [] _raw_spin_unlock_irqrestore+0x2c/0x4e [0.00] [] mod_timer+0x266/0x2a0 [0.00] [] deferred_cont_flush+0x2a/0x2c [0.00] [] console_unlock+0x28/0x45c [0.00] [] ? vprintk_emit+0x3a6/0x3c5 [0.00] [] vprintk_emit+0x3af/0x3c5 [0.00] [] vprintk_default+0x1d/0x1f [0.00] [] printk+0x48/0x50 [0.00] [] ? vprintk_default+0x1d/0x1f [0.00] [] e820_print_map+0x4a/0x70 [0.00] [] ? 0x8100 [0.00] [] setup_memory_map+0x37/0x3a [0.00] [] setup_arch+0x16d/0x890 [0.00] [] start_kernel+0x59/0x40c [0.00] [] x86_64_start_reservations+0x2a/0x2c [0.00] [] x86_64_start_kernel+0x17a/0x18d [0.00] [] start_cpu+0x5/0x14 [0.00] [] ? start_cpu+0x5/0x14 [0.00] usable -ss
Re: linux.git: printk() problem
Hello, thanks for Cc-ing. On (10/23/16 12:46), Linus Torvalds wrote: > +static void deferred_cont_flush(void) > +{ > + static DEFINE_TIMER(timer, flush_timer, 0, 0); > + > + if (!cont.len) > return; > + mod_timer(, jiffies + HZ/10); > } [..] > @@ -2360,6 +2285,8 @@ void console_unlock(void) > return; > } > > + deferred_cont_flush(); > + is mod_timer() safe enough to rely on/call from panic()->console_flush_on_panic()->console_unlock() ? shouldn't deferred_cont_flush() be called every time we jump to `again' label in console_unlock()? timer has debug object support, which probably can printk(), but that shouldn't cause any troubles, I suppose. -ss
Re: linux.git: printk() problem
Hello, thanks for Cc-ing. On (10/23/16 12:46), Linus Torvalds wrote: > +static void deferred_cont_flush(void) > +{ > + static DEFINE_TIMER(timer, flush_timer, 0, 0); > + > + if (!cont.len) > return; > + mod_timer(, jiffies + HZ/10); > } [..] > @@ -2360,6 +2285,8 @@ void console_unlock(void) > return; > } > > + deferred_cont_flush(); > + is mod_timer() safe enough to rely on/call from panic()->console_flush_on_panic()->console_unlock() ? shouldn't deferred_cont_flush() be called every time we jump to `again' label in console_unlock()? timer has debug object support, which probably can printk(), but that shouldn't cause any troubles, I suppose. -ss
Re: linux.git: printk() problem
Hi Linus, On Sun, Oct 23, 2016 at 9:46 PM, Linus Torvaldswrote: > On Sun, Oct 23, 2016 at 12:32 PM, Linus Torvalds > wrote: >> >> No, the real complexity comes from that interaction with the console >> output, which is done outside the core log locks, and which currently >> has the added thing where we have a "has this line fragment been >> flushed or not". > > Ok, so here's the stupid patch that removes all the partial line flushing. > > NOTE! It still leaves all the games with LOG_NEWLINE and LOG_NOCONS > that are pretty much pointless with it. So there's room for more > simplification here. > > In particular, the games with LOG_NEWLINE is what Geert's "console and > dmesg output looks different" at least partially comes from. What > happens is that "dmesg" always shows the records as one line (so it > effectively ignores LOG_NEWLINE), but the console output (in > msg_print_text() still has that LOG_NEWLINE logic. > > In particular, msg_print_text() looks at the *previous* logged line to > decide whether it should do newlines etc, which is why Geert gets that > odd "two continuations per line" pattern on the console, but "one > continuation per line" in dmesg. That comes from the interaction with > flushing to the console and LOG_NEWLINE and just general complexity. Thanks, Linux kernel output is again in sync with dmesg output. Tested-by: Geert Uytterhoeven Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: linux.git: printk() problem
Hi Linus, On Sun, Oct 23, 2016 at 9:46 PM, Linus Torvalds wrote: > On Sun, Oct 23, 2016 at 12:32 PM, Linus Torvalds > wrote: >> >> No, the real complexity comes from that interaction with the console >> output, which is done outside the core log locks, and which currently >> has the added thing where we have a "has this line fragment been >> flushed or not". > > Ok, so here's the stupid patch that removes all the partial line flushing. > > NOTE! It still leaves all the games with LOG_NEWLINE and LOG_NOCONS > that are pretty much pointless with it. So there's room for more > simplification here. > > In particular, the games with LOG_NEWLINE is what Geert's "console and > dmesg output looks different" at least partially comes from. What > happens is that "dmesg" always shows the records as one line (so it > effectively ignores LOG_NEWLINE), but the console output (in > msg_print_text() still has that LOG_NEWLINE logic. > > In particular, msg_print_text() looks at the *previous* logged line to > decide whether it should do newlines etc, which is why Geert gets that > odd "two continuations per line" pattern on the console, but "one > continuation per line" in dmesg. That comes from the interaction with > flushing to the console and LOG_NEWLINE and just general complexity. Thanks, Linux kernel output is again in sync with dmesg output. Tested-by: Geert Uytterhoeven Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 1:33 PM, Joe Percheswrote: > > Perhaps it could be a pool of active thread > continuation buffers. Yes, we could probably do with just a couple of entries. Even just two would probably catch almost all cases. That said, having dealt with that code, I'd hate to make it more complex again for something where the preferred solution is "don't do it". Linus
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 1:33 PM, Joe Perches wrote: > > Perhaps it could be a pool of active thread > continuation buffers. Yes, we could probably do with just a couple of entries. Even just two would probably catch almost all cases. That said, having dealt with that code, I'd hate to make it more complex again for something where the preferred solution is "don't do it". Linus
Re: linux.git: printk() problem
On Sun, 2016-10-23 at 12:32 -0700, Linus Torvalds wrote: > On Sun, Oct 23, 2016 at 12:06 PM, Joe Percheswrote: > > On Sun, 2016-10-23 at 11:11 -0700, Linus Torvalds wrote: > > > > > > And those two per se sound fairly easy to handle ("KERN_CONT means > > > append to the line buffer, otherwise flush the line buffer and move to > > > the record buffer"). > > > > > > But what complicates things more is then the "console output", which > > > has two issues: > > > > > > - it is done outside the locking regime for the line buffer and the > > > record buffer. > > > > > > - it is done on _partial_ line buffers. > > > > > > EOL KERN_ and thread interleaving still exists. > > > Note that the thread interleaving is still trivial: it's easily done > at the point where we decide "can we append to the line buffer or > not". That's pretty simple. Just flush the record when the thread > changes. > > So the interleaving will never go away, it's very fundamental - unless > we make the line buffer just be a per-thread thing. And yes, that > would be the cleanest solution, but it's also an extra buffer for each > thread, so realistically it's just not going to happen. I doubt there are cases where more than a few of these interleaving threads are simultaneous. Perhaps it could be a pool of active thread continuation buffers. > End result: I'm not worried about the interleaving. It will cause ugly > output, but we've always had that, and the solution to it is "if you > absolutely don't want interleaving, then don't try to print partial > lines!". > The classic "don't do that then" response, in other world. Yup, best solution.
Re: linux.git: printk() problem
On Sun, 2016-10-23 at 12:32 -0700, Linus Torvalds wrote: > On Sun, Oct 23, 2016 at 12:06 PM, Joe Perches wrote: > > On Sun, 2016-10-23 at 11:11 -0700, Linus Torvalds wrote: > > > > > > And those two per se sound fairly easy to handle ("KERN_CONT means > > > append to the line buffer, otherwise flush the line buffer and move to > > > the record buffer"). > > > > > > But what complicates things more is then the "console output", which > > > has two issues: > > > > > > - it is done outside the locking regime for the line buffer and the > > > record buffer. > > > > > > - it is done on _partial_ line buffers. > > > > > > EOL KERN_ and thread interleaving still exists. > > > Note that the thread interleaving is still trivial: it's easily done > at the point where we decide "can we append to the line buffer or > not". That's pretty simple. Just flush the record when the thread > changes. > > So the interleaving will never go away, it's very fundamental - unless > we make the line buffer just be a per-thread thing. And yes, that > would be the cleanest solution, but it's also an extra buffer for each > thread, so realistically it's just not going to happen. I doubt there are cases where more than a few of these interleaving threads are simultaneous. Perhaps it could be a pool of active thread continuation buffers. > End result: I'm not worried about the interleaving. It will cause ugly > output, but we've always had that, and the solution to it is "if you > absolutely don't want interleaving, then don't try to print partial > lines!". > The classic "don't do that then" response, in other world. Yup, best solution.
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 12:32 PM, Linus Torvaldswrote: > > No, the real complexity comes from that interaction with the console > output, which is done outside the core log locks, and which currently > has the added thing where we have a "has this line fragment been > flushed or not". Ok, so here's the stupid patch that removes all the partial line flushing. NOTE! It still leaves all the games with LOG_NEWLINE and LOG_NOCONS that are pretty much pointless with it. So there's room for more simplification here. In particular, the games with LOG_NEWLINE is what Geert's "console and dmesg output looks different" at least partially comes from. What happens is that "dmesg" always shows the records as one line (so it effectively ignores LOG_NEWLINE), but the console output (in msg_print_text() still has that LOG_NEWLINE logic. In particular, msg_print_text() looks at the *previous* logged line to decide whether it should do newlines etc, which is why Geert gets that odd "two continuations per line" pattern on the console, but "one continuation per line" in dmesg. That comes from the interaction with flushing to the console and LOG_NEWLINE and just general complexity. All of that LOG_NEWLINE code could be removed. But again, this patch doesn't do that removal. It just removes the partial console flushing and simplifies that part of the code. (This patch removes way more lines than it adds, but the *real* advantage is that it removes complexity. The rules for console_cont_flush() really were _very_ hard to grok, it has subtle interactions with cont_add() and cont_flush() through that "cont.cons" and "cont.flushed" logic that is all removed by this patch). Linus kernel/printk/printk.c | 132 +++-- 1 file changed, 28 insertions(+), 104 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index de08fc90baaf..ed1bfa774427 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -794,7 +794,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) return ret; } -static void cont_flush(void); +static void deferred_cont_flush(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -811,7 +811,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, if (ret) return ret; raw_spin_lock_irq(_lock); - cont_flush(); while (user->seq == log_next_seq) { if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; @@ -860,6 +859,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, ret = len; out: mutex_unlock(>lock); + deferred_cont_flush(); return ret; } @@ -874,7 +874,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) return -ESPIPE; raw_spin_lock_irq(_lock); - cont_flush(); switch (whence) { case SEEK_SET: /* the first record */ @@ -913,7 +912,6 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) poll_wait(file, _wait, wait); raw_spin_lock_irq(_lock); - cont_flush(); if (user->seq < log_next_seq) { /* return error when data has vanished underneath us */ if (user->seq < log_first_seq) @@ -922,6 +920,7 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) ret = POLLIN|POLLRDNORM; } raw_spin_unlock_irq(_lock); + deferred_cont_flush(); return ret; } @@ -1300,7 +1299,6 @@ static int syslog_print(char __user *buf, int size) size_t skip; raw_spin_lock_irq(_lock); - cont_flush(); if (syslog_seq < log_first_seq) { /* messages are gone, move to first one */ syslog_seq = log_first_seq; @@ -1360,7 +1358,6 @@ static int syslog_print_all(char __user *buf, int size, bool clear) return -ENOMEM; raw_spin_lock_irq(_lock); - cont_flush(); if (buf) { u64 next_seq; u64 seq; @@ -1522,7 +1519,6 @@ int do_syslog(int type, char __user *buf, int len, int source) /* Number of chars in the log buffer */ case SYSLOG_ACTION_SIZE_UNREAD: raw_spin_lock_irq(_lock); - cont_flush(); if (syslog_seq < log_first_seq) { /* messages are gone, move to first one */ syslog_seq = log_first_seq; @@ -1563,6 +1559,7 @@ int do_syslog(int type, char __user *buf, int len, int source) error = -EINVAL; break; } + deferred_cont_flush(); out: return error; } @@ -1650,46 +1647,41 @@ static inline void
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 12:32 PM, Linus Torvalds wrote: > > No, the real complexity comes from that interaction with the console > output, which is done outside the core log locks, and which currently > has the added thing where we have a "has this line fragment been > flushed or not". Ok, so here's the stupid patch that removes all the partial line flushing. NOTE! It still leaves all the games with LOG_NEWLINE and LOG_NOCONS that are pretty much pointless with it. So there's room for more simplification here. In particular, the games with LOG_NEWLINE is what Geert's "console and dmesg output looks different" at least partially comes from. What happens is that "dmesg" always shows the records as one line (so it effectively ignores LOG_NEWLINE), but the console output (in msg_print_text() still has that LOG_NEWLINE logic. In particular, msg_print_text() looks at the *previous* logged line to decide whether it should do newlines etc, which is why Geert gets that odd "two continuations per line" pattern on the console, but "one continuation per line" in dmesg. That comes from the interaction with flushing to the console and LOG_NEWLINE and just general complexity. All of that LOG_NEWLINE code could be removed. But again, this patch doesn't do that removal. It just removes the partial console flushing and simplifies that part of the code. (This patch removes way more lines than it adds, but the *real* advantage is that it removes complexity. The rules for console_cont_flush() really were _very_ hard to grok, it has subtle interactions with cont_add() and cont_flush() through that "cont.cons" and "cont.flushed" logic that is all removed by this patch). Linus kernel/printk/printk.c | 132 +++-- 1 file changed, 28 insertions(+), 104 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index de08fc90baaf..ed1bfa774427 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -794,7 +794,7 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from) return ret; } -static void cont_flush(void); +static void deferred_cont_flush(void); static ssize_t devkmsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -811,7 +811,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, if (ret) return ret; raw_spin_lock_irq(_lock); - cont_flush(); while (user->seq == log_next_seq) { if (file->f_flags & O_NONBLOCK) { ret = -EAGAIN; @@ -860,6 +859,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, ret = len; out: mutex_unlock(>lock); + deferred_cont_flush(); return ret; } @@ -874,7 +874,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) return -ESPIPE; raw_spin_lock_irq(_lock); - cont_flush(); switch (whence) { case SEEK_SET: /* the first record */ @@ -913,7 +912,6 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) poll_wait(file, _wait, wait); raw_spin_lock_irq(_lock); - cont_flush(); if (user->seq < log_next_seq) { /* return error when data has vanished underneath us */ if (user->seq < log_first_seq) @@ -922,6 +920,7 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait) ret = POLLIN|POLLRDNORM; } raw_spin_unlock_irq(_lock); + deferred_cont_flush(); return ret; } @@ -1300,7 +1299,6 @@ static int syslog_print(char __user *buf, int size) size_t skip; raw_spin_lock_irq(_lock); - cont_flush(); if (syslog_seq < log_first_seq) { /* messages are gone, move to first one */ syslog_seq = log_first_seq; @@ -1360,7 +1358,6 @@ static int syslog_print_all(char __user *buf, int size, bool clear) return -ENOMEM; raw_spin_lock_irq(_lock); - cont_flush(); if (buf) { u64 next_seq; u64 seq; @@ -1522,7 +1519,6 @@ int do_syslog(int type, char __user *buf, int len, int source) /* Number of chars in the log buffer */ case SYSLOG_ACTION_SIZE_UNREAD: raw_spin_lock_irq(_lock); - cont_flush(); if (syslog_seq < log_first_seq) { /* messages are gone, move to first one */ syslog_seq = log_first_seq; @@ -1563,6 +1559,7 @@ int do_syslog(int type, char __user *buf, int len, int source) error = -EINVAL; break; } + deferred_cont_flush(); out: return error; } @@ -1650,46 +1647,41 @@ static inline void printk_delay(void) static struct
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 12:06 PM, Joe Percheswrote: > On Sun, 2016-10-23 at 11:11 -0700, Linus Torvalds wrote: >> >> And those two per se sound fairly easy to handle ("KERN_CONT means >> append to the line buffer, otherwise flush the line buffer and move to >> the record buffer"). >> >> But what complicates things more is then the "console output", which >> has two issues: >> >> - it is done outside the locking regime for the line buffer and the >> record buffer. >> >> - it is done on _partial_ line buffers. > > EOL KERN_ and thread interleaving still exists. Note that the thread interleaving is still trivial: it's easily done at the point where we decide "can we append to the line buffer or not". That's pretty simple. Just flush the record when the thread changes. So the interleaving will never go away, it's very fundamental - unless we make the line buffer just be a per-thread thing. And yes, that would be the cleanest solution, but it's also an extra buffer for each thread, so realistically it's just not going to happen. End result: I'm not worried about the interleaving. It will cause ugly output, but we've always had that, and the solution to it is "if you absolutely don't want interleaving, then don't try to print partial lines!". The classic "don't do that then" response, in other world. No, the real complexity comes from that interaction with the console output, which is done outside the core log locks, and which currently has the added thing where we have a "has this line fragment been flushed or not". That "has this line fragment been flushed or not" is particularly painful, because we may have flushed it *partially*. That "cont.cons" thing is a counter of how many bytes have been flushed, and we can be in the situation where we have had multiple continuations added to the line buffer, and only *some* of them have been flushed to the console. (Reasons for not flushing: we couldn't get the console lock because another process held it due to logging or whatever). And then the interface to the actual record logging only has a "all or nothing was flushed" flag (LOG_NOCONS) to avoid flushing things twice. So when we actually log the record, we lose the "this line was only partially printed". That whole "we've flushed part of the line to the console" thing is why it would make things so much easier to just log full records to the console. Getting rid of that gets rid of a lot of ugly and hard-to-read crap. Yes, the line buffer would still remain, and yes, you'd still see the interleaving with threads, but that's not "complexity", that's just "visually ugly output". Linus
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 12:06 PM, Joe Perches wrote: > On Sun, 2016-10-23 at 11:11 -0700, Linus Torvalds wrote: >> >> And those two per se sound fairly easy to handle ("KERN_CONT means >> append to the line buffer, otherwise flush the line buffer and move to >> the record buffer"). >> >> But what complicates things more is then the "console output", which >> has two issues: >> >> - it is done outside the locking regime for the line buffer and the >> record buffer. >> >> - it is done on _partial_ line buffers. > > EOL KERN_ and thread interleaving still exists. Note that the thread interleaving is still trivial: it's easily done at the point where we decide "can we append to the line buffer or not". That's pretty simple. Just flush the record when the thread changes. So the interleaving will never go away, it's very fundamental - unless we make the line buffer just be a per-thread thing. And yes, that would be the cleanest solution, but it's also an extra buffer for each thread, so realistically it's just not going to happen. End result: I'm not worried about the interleaving. It will cause ugly output, but we've always had that, and the solution to it is "if you absolutely don't want interleaving, then don't try to print partial lines!". The classic "don't do that then" response, in other world. No, the real complexity comes from that interaction with the console output, which is done outside the core log locks, and which currently has the added thing where we have a "has this line fragment been flushed or not". That "has this line fragment been flushed or not" is particularly painful, because we may have flushed it *partially*. That "cont.cons" thing is a counter of how many bytes have been flushed, and we can be in the situation where we have had multiple continuations added to the line buffer, and only *some* of them have been flushed to the console. (Reasons for not flushing: we couldn't get the console lock because another process held it due to logging or whatever). And then the interface to the actual record logging only has a "all or nothing was flushed" flag (LOG_NOCONS) to avoid flushing things twice. So when we actually log the record, we lose the "this line was only partially printed". That whole "we've flushed part of the line to the console" thing is why it would make things so much easier to just log full records to the console. Getting rid of that gets rid of a lot of ugly and hard-to-read crap. Yes, the line buffer would still remain, and yes, you'd still see the interleaving with threads, but that's not "complexity", that's just "visually ugly output". Linus
Re: linux.git: printk() problem
On Sun, 2016-10-23 at 11:11 -0700, Linus Torvalds wrote: > On Sun, Oct 23, 2016 at 2:22 AM, Geert Uytterhoeven >wrote: > > > > These changes have an interesting side-effect on sequences of printk()s that > > lack proper continuation: they introduced a discrepancy between dmesg output > > and the actual kernel output. > > Yes. > > So the "print vs log" handling is really really horrible. I cleaned up > some of it, but left the really fundamental problems. I wanted to just > rewrite it all, but didn't quite have the heart for it. > > The best solution by far would be to just not support KERN_CONT at > all, but there's too many "silly details" things that keep it from > being possible. > > The basic issue is that we have the line buffer that is used for > continuations, and then the record buffer that is used for logging. > > And those two per se sound fairly easy to handle ("KERN_CONT means > append to the line buffer, otherwise flush the line buffer and move to > the record buffer"). > > But what complicates things more is then the "console output", which > has two issues: > > - it is done outside the locking regime for the line buffer and the > record buffer. > > - it is done on _partial_ line buffers. EOL KERN_ and thread interleaving still exists. > It would be really quite easy to say "we don't print out > continuation lines immediately, we just buffer them for 0.1s instead, > and KERN_CONT only works for things that really happen more or less > immediately". Or use to a start/stop buffer (maybe via KERN_ and \n) with PID/TIDs added to /dev/kmsg and that short-term timer to reassemble. > Maybe that really is the right answer. Because the original cause of > us having to bend over backwards in this case is really no longer > there. And it would simplify printk a *lot*. A timer might be a good idea, but perhaps Sergey and Petr might have some interest in that too. (added to cc's)
Re: linux.git: printk() problem
On Sun, 2016-10-23 at 11:11 -0700, Linus Torvalds wrote: > On Sun, Oct 23, 2016 at 2:22 AM, Geert Uytterhoeven > wrote: > > > > These changes have an interesting side-effect on sequences of printk()s that > > lack proper continuation: they introduced a discrepancy between dmesg output > > and the actual kernel output. > > Yes. > > So the "print vs log" handling is really really horrible. I cleaned up > some of it, but left the really fundamental problems. I wanted to just > rewrite it all, but didn't quite have the heart for it. > > The best solution by far would be to just not support KERN_CONT at > all, but there's too many "silly details" things that keep it from > being possible. > > The basic issue is that we have the line buffer that is used for > continuations, and then the record buffer that is used for logging. > > And those two per se sound fairly easy to handle ("KERN_CONT means > append to the line buffer, otherwise flush the line buffer and move to > the record buffer"). > > But what complicates things more is then the "console output", which > has two issues: > > - it is done outside the locking regime for the line buffer and the > record buffer. > > - it is done on _partial_ line buffers. EOL KERN_ and thread interleaving still exists. > It would be really quite easy to say "we don't print out > continuation lines immediately, we just buffer them for 0.1s instead, > and KERN_CONT only works for things that really happen more or less > immediately". Or use to a start/stop buffer (maybe via KERN_ and \n) with PID/TIDs added to /dev/kmsg and that short-term timer to reassemble. > Maybe that really is the right answer. Because the original cause of > us having to bend over backwards in this case is really no longer > there. And it would simplify printk a *lot*. A timer might be a good idea, but perhaps Sergey and Petr might have some interest in that too. (added to cc's)
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 2:22 AM, Geert Uytterhoevenwrote: > > These changes have an interesting side-effect on sequences of printk()s that > lack proper continuation: they introduced a discrepancy between dmesg output > and the actual kernel output. Yes. So the "print vs log" handling is really really horrible. I cleaned up some of it, but left the really fundamental problems. I wanted to just rewrite it all, but didn't quite have the heart for it. The best solution by far would be to just not support KERN_CONT at all, but there's too many "silly details" things that keep it from being possible. The basic issue is that we have the line buffer that is used for continuations, and then the record buffer that is used for logging. And those two per se sound fairly easy to handle ("KERN_CONT means append to the line buffer, otherwise flush the line buffer and move to the record buffer"). But what complicates things more is then the "console output", which has two issues: - it is done outside the locking regime for the line buffer and the record buffer. - it is done on _partial_ line buffers. Again, this would be absolutely trivial if we just said "we only print the record buffer". Easy, and solves all the problems. Except for _one_ problem: - if a hang occurs in the middle of a continuation, we historically *really* want that continuation to have been printed out. For example, one of the really historical uses for partial lines is this: pr_info("Checking 'hlt' instruction... "); if (!boot_cpu_data.hlt_works_ok) { pr_cont("disabled\n"); return; } halt(); halt(); halt(); halt(); pr_cont("OK\n"); and the point was that there used to be some really old i386 machines that hung on the "hlt" instruction (probably not because of a CPU bug, but because of either power supply issues or some DMA issues). To support that, we really *had* to print out the continuation lines even when they were partial. And that complicates the printk logic a lot. Now, that "hlt" case is long long gone, and maybe we should just say "screw that". It would be really quite easy to say "we don't print out continuation lines immediately, we just buffer them for 0.1s instead, and KERN_CONT only works for things that really happen more or less immediately". Maybe that really is the right answer. Because the original cause of us having to bend over backwards in this case is really no longer there. And it would simplify printk a *lot*. Let me whip up a minimal patch for you to try. Linus
Re: linux.git: printk() problem
On Sun, Oct 23, 2016 at 2:22 AM, Geert Uytterhoeven wrote: > > These changes have an interesting side-effect on sequences of printk()s that > lack proper continuation: they introduced a discrepancy between dmesg output > and the actual kernel output. Yes. So the "print vs log" handling is really really horrible. I cleaned up some of it, but left the really fundamental problems. I wanted to just rewrite it all, but didn't quite have the heart for it. The best solution by far would be to just not support KERN_CONT at all, but there's too many "silly details" things that keep it from being possible. The basic issue is that we have the line buffer that is used for continuations, and then the record buffer that is used for logging. And those two per se sound fairly easy to handle ("KERN_CONT means append to the line buffer, otherwise flush the line buffer and move to the record buffer"). But what complicates things more is then the "console output", which has two issues: - it is done outside the locking regime for the line buffer and the record buffer. - it is done on _partial_ line buffers. Again, this would be absolutely trivial if we just said "we only print the record buffer". Easy, and solves all the problems. Except for _one_ problem: - if a hang occurs in the middle of a continuation, we historically *really* want that continuation to have been printed out. For example, one of the really historical uses for partial lines is this: pr_info("Checking 'hlt' instruction... "); if (!boot_cpu_data.hlt_works_ok) { pr_cont("disabled\n"); return; } halt(); halt(); halt(); halt(); pr_cont("OK\n"); and the point was that there used to be some really old i386 machines that hung on the "hlt" instruction (probably not because of a CPU bug, but because of either power supply issues or some DMA issues). To support that, we really *had* to print out the continuation lines even when they were partial. And that complicates the printk logic a lot. Now, that "hlt" case is long long gone, and maybe we should just say "screw that". It would be really quite easy to say "we don't print out continuation lines immediately, we just buffer them for 0.1s instead, and KERN_CONT only works for things that really happen more or less immediately". Maybe that really is the right answer. Because the original cause of us having to bend over backwards in this case is really no longer there. And it would simplify printk a *lot*. Let me whip up a minimal patch for you to try. Linus
Re: linux.git: printk() problem
Hi Linus, On Wed, Oct 12, 2016 at 5:47 PM, Linus Torvaldswrote: > On Wed, Oct 12, 2016 at 6:30 AM, Tetsuo Handa > wrote: >> >> I noticed that current linux.git generates hardly readable console output >> due to KERN_CONT changes. Are you suggesting developers that output like >> this be fixed? > > Yes. Needing to add a few KERN_CONT markers was pretty much expected, > since it's about 5 years since we enfroced it and new code won't > necessarily have it (and even back then I don't think we _always_ got > it right). > > That said, looking at the printk's in the lowmem code, I think it > could be useful if there was some effort to see if the code could > somehow avoid the multi-printk thing. This is actually one area where > > (a) the problem actually happens while the system is running, rather > than booting > > (b) I've seen line mixing in the past > > but the short term fix is to just add KERN_CONT markers to the lines > that are continuations. > > NOTE! The reason I mention that (a) thing that it has traditionally > made it much messier to do logging of continuation lines in the first > place (because more things are going on and often one problem leads to > another and then the mixing is much more likely), but I actually > intentionally made it more likely to trigger the flushing issue in > commit bfd8d3f23b51 ("printk: make reading the kernel log flush > pending lines"). > > So if there is an active system logger that is reading messages *when* > one of those "one line in multiple printk's" things happen, that log > reader will now potentially cause the logging to be broken up because > the act of reading will flush the pending lines. These changes have an interesting side-effect on sequences of printk()s that lack proper continuation: they introduced a discrepancy between dmesg output and the actual kernel output. Before: Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY BLITTER IDE TT_CLK FDC_SPEED Output of "dmesg" after: Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY BLITTER IDE TT_CLK FDC_SPEED Actual kernel output after: Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY BLITTER IDE TT_CLK FDC_SPEED Note that the above is really early in the boot process, right after the debug console is enabled, and before any system log consumer is running, Of course I'm converting this code to use pr_cont() anyway... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: linux.git: printk() problem
Hi Linus, On Wed, Oct 12, 2016 at 5:47 PM, Linus Torvalds wrote: > On Wed, Oct 12, 2016 at 6:30 AM, Tetsuo Handa > wrote: >> >> I noticed that current linux.git generates hardly readable console output >> due to KERN_CONT changes. Are you suggesting developers that output like >> this be fixed? > > Yes. Needing to add a few KERN_CONT markers was pretty much expected, > since it's about 5 years since we enfroced it and new code won't > necessarily have it (and even back then I don't think we _always_ got > it right). > > That said, looking at the printk's in the lowmem code, I think it > could be useful if there was some effort to see if the code could > somehow avoid the multi-printk thing. This is actually one area where > > (a) the problem actually happens while the system is running, rather > than booting > > (b) I've seen line mixing in the past > > but the short term fix is to just add KERN_CONT markers to the lines > that are continuations. > > NOTE! The reason I mention that (a) thing that it has traditionally > made it much messier to do logging of continuation lines in the first > place (because more things are going on and often one problem leads to > another and then the mixing is much more likely), but I actually > intentionally made it more likely to trigger the flushing issue in > commit bfd8d3f23b51 ("printk: make reading the kernel log flush > pending lines"). > > So if there is an active system logger that is reading messages *when* > one of those "one line in multiple printk's" things happen, that log > reader will now potentially cause the logging to be broken up because > the act of reading will flush the pending lines. These changes have an interesting side-effect on sequences of printk()s that lack proper continuation: they introduced a discrepancy between dmesg output and the actual kernel output. Before: Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY BLITTER IDE TT_CLK FDC_SPEED Output of "dmesg" after: Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY BLITTER IDE TT_CLK FDC_SPEED Actual kernel output after: Atari hardware found: VIDEL STDMA-SCSI ST_MFP YM2149 PCM CODEC DSP56K SCC ANALOG_JOY BLITTER IDE TT_CLK FDC_SPEED Note that the above is really early in the boot process, right after the debug console is enabled, and before any system log consumer is running, Of course I'm converting this code to use pr_cont() anyway... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: linux.git: printk() problem
On Thu 13-10-16 03:20:50, Joe Perches wrote: > On Thu, 2016-10-13 at 12:04 +0200, Michal Hocko wrote: [...] > > I would be really surprised if we really had that many continuation > > lines. They should be avoided as much as possible. Hundreds of thousands > > just sounds more than over exaggerated... > > Hey Michal. > > "Hundreds _to_ thousands" of instances. Not "hundreds _of_ thousands". my bad I misread your words. Sorry about that -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Thu 13-10-16 03:20:50, Joe Perches wrote: > On Thu, 2016-10-13 at 12:04 +0200, Michal Hocko wrote: [...] > > I would be really surprised if we really had that many continuation > > lines. They should be avoided as much as possible. Hundreds of thousands > > just sounds more than over exaggerated... > > Hey Michal. > > "Hundreds _to_ thousands" of instances. Not "hundreds _of_ thousands". my bad I misread your words. Sorry about that -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Thu, 2016-10-13 at 12:04 +0200, Michal Hocko wrote: > On Thu 13-10-16 02:29:46, Joe Perches wrote: > > On Thu, 2016-10-13 at 08:26 +0200, Michal Hocko wrote: > > [...] > > > I think they are not critical and can be fix once somebody notices. > > > > > > As do I, but Linus objected to applying a patch when Colin Ian King > > noticed one. > > > > I think the 250,000 or so uses with newlines are enough of a > > precedence to keep using newlines everywhere. > > > or simply fix missing KERN_CONTs and simply do not add any new missing \n > > > Now we'll have to have patches adding hundreds to thousands of the > > missing KERN_CONTs for continuation lines that weren't previously a > > problem in logging output but are now. > > > I would be really surprised if we really had that many continuation > lines. They should be avoided as much as possible. Hundreds of thousands > just sounds more than over exaggerated... Hey Michal. "Hundreds _to_ thousands" of instances. Not "hundreds _of_ thousands". > Not requiring \n at the end of strings just makes a lot of sense if we > have a KERN_CONT with a well defined semantic. True enough. And I am not at all arguing against having a well defined KERN_CONT semantic. But using KERN_CONT alone is not enough information to be able to perfectly reassemble message fragments post hoc given multiple threads possibly interleaving KERN_CONT. I do think the inconsistency of mixing styles with and without newlines not particularly good. cheers, Joe
Re: linux.git: printk() problem
On Thu, 2016-10-13 at 12:04 +0200, Michal Hocko wrote: > On Thu 13-10-16 02:29:46, Joe Perches wrote: > > On Thu, 2016-10-13 at 08:26 +0200, Michal Hocko wrote: > > [...] > > > I think they are not critical and can be fix once somebody notices. > > > > > > As do I, but Linus objected to applying a patch when Colin Ian King > > noticed one. > > > > I think the 250,000 or so uses with newlines are enough of a > > precedence to keep using newlines everywhere. > > > or simply fix missing KERN_CONTs and simply do not add any new missing \n > > > Now we'll have to have patches adding hundreds to thousands of the > > missing KERN_CONTs for continuation lines that weren't previously a > > problem in logging output but are now. > > > I would be really surprised if we really had that many continuation > lines. They should be avoided as much as possible. Hundreds of thousands > just sounds more than over exaggerated... Hey Michal. "Hundreds _to_ thousands" of instances. Not "hundreds _of_ thousands". > Not requiring \n at the end of strings just makes a lot of sense if we > have a KERN_CONT with a well defined semantic. True enough. And I am not at all arguing against having a well defined KERN_CONT semantic. But using KERN_CONT alone is not enough information to be able to perfectly reassemble message fragments post hoc given multiple threads possibly interleaving KERN_CONT. I do think the inconsistency of mixing styles with and without newlines not particularly good. cheers, Joe
Re: linux.git: printk() problem
On Thu 13-10-16 02:29:46, Joe Perches wrote: > On Thu, 2016-10-13 at 08:26 +0200, Michal Hocko wrote: [...] > > I think they are not critical and can be fix once somebody notices. > > As do I, but Linus objected to applying a patch when Colin Ian King > noticed one. > > I think the 250,000 or so uses with newlines are enough of a > precedence to keep using newlines everywhere. or simply fix missing KERN_CONTs and simply do not add any new missing \n > Now we'll have to have patches adding hundreds to thousands of the > missing KERN_CONTs for continuation lines that weren't previously a > problem in logging output but are now. I would be really surprised if we really had that many continuation lines. They should be avoided as much as possible. Hundreds of thousands just sounds more than over exaggerated... I think you are just making much bigger deal from this than necessary. Not requiring \n at the end of strings just makes a lot of sense if we have a KERN_CONT with a well defined semantic. Which was the whole point of the patch from Linus AFAIU. If there are some left overs, so what, we can fix them as soon as somebody notices. The worst thing we will get is a messy output but no information should be lost. We used to have a messy output in the past regardless and we could live with it... That being said I would be happier to know about this change before I had to scratch my head when seeing this for the first time so a heads up would be more than appreciated but fixing these issues is trivial not not worth making a lot of noise about. -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Thu 13-10-16 02:29:46, Joe Perches wrote: > On Thu, 2016-10-13 at 08:26 +0200, Michal Hocko wrote: [...] > > I think they are not critical and can be fix once somebody notices. > > As do I, but Linus objected to applying a patch when Colin Ian King > noticed one. > > I think the 250,000 or so uses with newlines are enough of a > precedence to keep using newlines everywhere. or simply fix missing KERN_CONTs and simply do not add any new missing \n > Now we'll have to have patches adding hundreds to thousands of the > missing KERN_CONTs for continuation lines that weren't previously a > problem in logging output but are now. I would be really surprised if we really had that many continuation lines. They should be avoided as much as possible. Hundreds of thousands just sounds more than over exaggerated... I think you are just making much bigger deal from this than necessary. Not requiring \n at the end of strings just makes a lot of sense if we have a KERN_CONT with a well defined semantic. Which was the whole point of the patch from Linus AFAIU. If there are some left overs, so what, we can fix them as soon as somebody notices. The worst thing we will get is a messy output but no information should be lost. We used to have a messy output in the past regardless and we could live with it... That being said I would be happier to know about this change before I had to scratch my head when seeing this for the first time so a heads up would be more than appreciated but fixing these issues is trivial not not worth making a lot of noise about. -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Thu, 2016-10-13 at 08:26 +0200, Michal Hocko wrote: > On Wed 12-10-16 09:08:58, Joe Perches wrote: > > On Wed, 2016-10-12 at 16:35 +0200, Michal Hocko wrote: > > > On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > > > > Hello. > > > > > > > > I noticed that current linux.git generates hardly readable console > > > > output > > > > due to KERN_CONT changes. Are you suggesting developers that output like > > > > this be fixed? > > > > > > > > > > > > Joe has already posted a patch > > > http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com > > > > > > And in general I think that adding those KERN_CONT is a good thing to > > > do. > > > > > > As do I, but there are about a quarter _million_ uses of > > printk/logging messages in the kernel tree with newlines. > > > > Maybe 100 or so of the existing messages lack terminating > > newlines. > > > I think they are not critical and can be fix once somebody notices. As do I, but Linus objected to applying a patch when Colin Ian King noticed one. I think the 250,000 or so uses with newlines are enough of a precedence to keep using newlines everywhere. Now we'll have to have patches adding hundreds to thousands of the missing KERN_CONTs for continuation lines that weren't previously a problem in logging output but are now.
Re: linux.git: printk() problem
On Thu, 2016-10-13 at 08:26 +0200, Michal Hocko wrote: > On Wed 12-10-16 09:08:58, Joe Perches wrote: > > On Wed, 2016-10-12 at 16:35 +0200, Michal Hocko wrote: > > > On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > > > > Hello. > > > > > > > > I noticed that current linux.git generates hardly readable console > > > > output > > > > due to KERN_CONT changes. Are you suggesting developers that output like > > > > this be fixed? > > > > > > > > > > > > Joe has already posted a patch > > > http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com > > > > > > And in general I think that adding those KERN_CONT is a good thing to > > > do. > > > > > > As do I, but there are about a quarter _million_ uses of > > printk/logging messages in the kernel tree with newlines. > > > > Maybe 100 or so of the existing messages lack terminating > > newlines. > > > I think they are not critical and can be fix once somebody notices. As do I, but Linus objected to applying a patch when Colin Ian King noticed one. I think the 250,000 or so uses with newlines are enough of a precedence to keep using newlines everywhere. Now we'll have to have patches adding hundreds to thousands of the missing KERN_CONTs for continuation lines that weren't previously a problem in logging output but are now.
Re: linux.git: printk() problem
On Wed 12-10-16 09:08:58, Joe Perches wrote: > On Wed, 2016-10-12 at 16:35 +0200, Michal Hocko wrote: > > On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > > > Hello. > > > > > > I noticed that current linux.git generates hardly readable console output > > > due to KERN_CONT changes. Are you suggesting developers that output like > > > this be fixed? > > > > > > Joe has already posted a patch > > http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com > > > > And in general I think that adding those KERN_CONT is a good thing to > > do. > > As do I, but there are about a quarter _million_ uses of > printk/logging messages in the kernel tree with newlines. > > Maybe 100 or so of the existing messages lack terminating > newlines. I think they are not critical and can be fix once somebody notices. -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Wed 12-10-16 09:08:58, Joe Perches wrote: > On Wed, 2016-10-12 at 16:35 +0200, Michal Hocko wrote: > > On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > > > Hello. > > > > > > I noticed that current linux.git generates hardly readable console output > > > due to KERN_CONT changes. Are you suggesting developers that output like > > > this be fixed? > > > > > > Joe has already posted a patch > > http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com > > > > And in general I think that adding those KERN_CONT is a good thing to > > do. > > As do I, but there are about a quarter _million_ uses of > printk/logging messages in the kernel tree with newlines. > > Maybe 100 or so of the existing messages lack terminating > newlines. I think they are not critical and can be fix once somebody notices. -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Wed, Oct 12, 2016 at 9:16 AM, Joe Percheswrote: >> (but with the new world order you actually *can* combine KERN_CONT >> with a loglevel, so that if the beginning od the line got flushed, the >> continuation can still be printed with the right log level). > > I think that might not be a good idea. > > Anything that uses a KERN_CONT with a new log level > might as well be converted into multiple printks. Generally yes. The immediate reason for it was some really really nasty and horrible indirection in the ACPI layer, which goes through something like fifteen million different abstraction layers before it actually hits "printk()", and some of them do vsprintf() in the middle. It was something like ACPI_INFO() -> acpi_os_printf() -> acpi_os_vprintf() which is completely broken and *always* has that KERN_CONT marker at the beginning, but earlier phases can have the level marker in it already.. In other words, it was a terminally broken piece of nasty code, but there was no way I was going to touch the ACPI "OS independent" layers, so I just said "screw this, if you want to mix KERN_CONT with a level marker, you can". So I agree, you normally shouldn't need to. In fact, the fewer KERN_CONT's I see in the kernel, the better off we are. But _sometimes_ KERN_CONT makes sense (and the OOM code is likely one of the few places where it really might be the best of a number of bad options: lots of information that needs to be pretty dense, and we obviously can't afford to allocate a buffer for it dynamically, and doing so statically isn't a great option either..) Linus
Re: linux.git: printk() problem
On Wed, Oct 12, 2016 at 9:16 AM, Joe Perches wrote: >> (but with the new world order you actually *can* combine KERN_CONT >> with a loglevel, so that if the beginning od the line got flushed, the >> continuation can still be printed with the right log level). > > I think that might not be a good idea. > > Anything that uses a KERN_CONT with a new log level > might as well be converted into multiple printks. Generally yes. The immediate reason for it was some really really nasty and horrible indirection in the ACPI layer, which goes through something like fifteen million different abstraction layers before it actually hits "printk()", and some of them do vsprintf() in the middle. It was something like ACPI_INFO() -> acpi_os_printf() -> acpi_os_vprintf() which is completely broken and *always* has that KERN_CONT marker at the beginning, but earlier phases can have the level marker in it already.. In other words, it was a terminally broken piece of nasty code, but there was no way I was going to touch the ACPI "OS independent" layers, so I just said "screw this, if you want to mix KERN_CONT with a level marker, you can". So I agree, you normally shouldn't need to. In fact, the fewer KERN_CONT's I see in the kernel, the better off we are. But _sometimes_ KERN_CONT makes sense (and the OOM code is likely one of the few places where it really might be the best of a number of bad options: lots of information that needs to be pretty dense, and we obviously can't afford to allocate a buffer for it dynamically, and doing so statically isn't a great option either..) Linus
Re: linux.git: printk() problem
On Wed, 2016-10-12 at 08:47 -0700, Linus Torvalds wrote: < We'll see. But the other issues are easily fixed by just adding > KERN_CONT where appropriate. It was actually very much what you were > supposed to do before too, if only as a marker to others that "yes, > I'm actually doing this, and no, I'm not supposed to have a log level" > (but with the new world order you actually *can* combine KERN_CONT > with a loglevel, so that if the beginning od the line got flushed, the > continuation can still be printed with the right log level). I think that might not be a good idea. Anything that uses a KERN_CONT with a new log level might as well be converted into multiple printks.
Re: linux.git: printk() problem
On Wed, 2016-10-12 at 08:47 -0700, Linus Torvalds wrote: < We'll see. But the other issues are easily fixed by just adding > KERN_CONT where appropriate. It was actually very much what you were > supposed to do before too, if only as a marker to others that "yes, > I'm actually doing this, and no, I'm not supposed to have a log level" > (but with the new world order you actually *can* combine KERN_CONT > with a loglevel, so that if the beginning od the line got flushed, the > continuation can still be printed with the right log level). I think that might not be a good idea. Anything that uses a KERN_CONT with a new log level might as well be converted into multiple printks.
Re: linux.git: printk() problem
On Wed, 2016-10-12 at 16:35 +0200, Michal Hocko wrote: > On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > > Hello. > > > > I noticed that current linux.git generates hardly readable console output > > due to KERN_CONT changes. Are you suggesting developers that output like > > this be fixed? > > > Joe has already posted a patch > http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com > > And in general I think that adding those KERN_CONT is a good thing to > do. As do I, but there are about a quarter _million_ uses of printk/logging messages in the kernel tree with newlines. Maybe 100 or so of the existing messages lack terminating newlines. There are about 1800 uses of KERN_CONT/pr_cont today. There are multiple thousands of uses of bare printks used as printk line continuations that would need updating. Most of those multiple thousands are in drivers of ancient and out of manufacture devices or are subsystems that are old, crufty and effectively unmaintained. These might not need to be updated. Still, there are probably hundreds to low thousands of actively used device drivers that will need KERN?CONT or pr_cont update to avoid logging output defects.
Re: linux.git: printk() problem
On Wed, 2016-10-12 at 16:35 +0200, Michal Hocko wrote: > On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > > Hello. > > > > I noticed that current linux.git generates hardly readable console output > > due to KERN_CONT changes. Are you suggesting developers that output like > > this be fixed? > > > Joe has already posted a patch > http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com > > And in general I think that adding those KERN_CONT is a good thing to > do. As do I, but there are about a quarter _million_ uses of printk/logging messages in the kernel tree with newlines. Maybe 100 or so of the existing messages lack terminating newlines. There are about 1800 uses of KERN_CONT/pr_cont today. There are multiple thousands of uses of bare printks used as printk line continuations that would need updating. Most of those multiple thousands are in drivers of ancient and out of manufacture devices or are subsystems that are old, crufty and effectively unmaintained. These might not need to be updated. Still, there are probably hundreds to low thousands of actively used device drivers that will need KERN?CONT or pr_cont update to avoid logging output defects.
Re: linux.git: printk() problem
On Wed, Oct 12, 2016 at 6:30 AM, Tetsuo Handawrote: > > I noticed that current linux.git generates hardly readable console output > due to KERN_CONT changes. Are you suggesting developers that output like > this be fixed? Yes. Needing to add a few KERN_CONT markers was pretty much expected, since it's about 5 years since we enfroced it and new code won't necessarily have it (and even back then I don't think we _always_ got it right). That said, looking at the printk's in the lowmem code, I think it could be useful if there was some effort to see if the code could somehow avoid the multi-printk thing. This is actually one area where (a) the problem actually happens while the system is running, rather than booting (b) I've seen line mixing in the past but the short term fix is to just add KERN_CONT markers to the lines that are continuations. NOTE! The reason I mention that (a) thing that it has traditionally made it much messier to do logging of continuation lines in the first place (because more things are going on and often one problem leads to another and then the mixing is much more likely), but I actually intentionally made it more likely to trigger the flushing issue in commit bfd8d3f23b51 ("printk: make reading the kernel log flush pending lines"). So if there is an active system logger that is reading messages *when* one of those "one line in multiple printk's" things happen, that log reader will now potentially cause the logging to be broken up because the act of reading will flush the pending lines. Now, honestly, that is something that we may end up reverting, but I'd _like_ to try not to. Because without that flushing, there might be one last partial line that the logger never sees. So it was me trying to be aggressive about those partial lines, and the *hope* is that we can just keep it, and that we can look at areas that have problems with it. We'll see. But the other issues are easily fixed by just adding KERN_CONT where appropriate. It was actually very much what you were supposed to do before too, if only as a marker to others that "yes, I'm actually doing this, and no, I'm not supposed to have a log level" (but with the new world order you actually *can* combine KERN_CONT with a loglevel, so that if the beginning od the line got flushed, the continuation can still be printed with the right log level). Linus
Re: linux.git: printk() problem
On Wed, Oct 12, 2016 at 6:30 AM, Tetsuo Handa wrote: > > I noticed that current linux.git generates hardly readable console output > due to KERN_CONT changes. Are you suggesting developers that output like > this be fixed? Yes. Needing to add a few KERN_CONT markers was pretty much expected, since it's about 5 years since we enfroced it and new code won't necessarily have it (and even back then I don't think we _always_ got it right). That said, looking at the printk's in the lowmem code, I think it could be useful if there was some effort to see if the code could somehow avoid the multi-printk thing. This is actually one area where (a) the problem actually happens while the system is running, rather than booting (b) I've seen line mixing in the past but the short term fix is to just add KERN_CONT markers to the lines that are continuations. NOTE! The reason I mention that (a) thing that it has traditionally made it much messier to do logging of continuation lines in the first place (because more things are going on and often one problem leads to another and then the mixing is much more likely), but I actually intentionally made it more likely to trigger the flushing issue in commit bfd8d3f23b51 ("printk: make reading the kernel log flush pending lines"). So if there is an active system logger that is reading messages *when* one of those "one line in multiple printk's" things happen, that log reader will now potentially cause the logging to be broken up because the act of reading will flush the pending lines. Now, honestly, that is something that we may end up reverting, but I'd _like_ to try not to. Because without that flushing, there might be one last partial line that the logger never sees. So it was me trying to be aggressive about those partial lines, and the *hope* is that we can just keep it, and that we can look at areas that have problems with it. We'll see. But the other issues are easily fixed by just adding KERN_CONT where appropriate. It was actually very much what you were supposed to do before too, if only as a marker to others that "yes, I'm actually doing this, and no, I'm not supposed to have a log level" (but with the new world order you actually *can* combine KERN_CONT with a loglevel, so that if the beginning od the line got flushed, the continuation can still be printed with the right log level). Linus
Re: linux.git: printk() problem
On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > Hello. > > I noticed that current linux.git generates hardly readable console output > due to KERN_CONT changes. Are you suggesting developers that output like > this be fixed? Joe has already posted a patch http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com And in general I think that adding those KERN_CONT is a good thing to do. -- Michal Hocko SUSE Labs
Re: linux.git: printk() problem
On Wed 12-10-16 22:30:03, Tetsuo Handa wrote: > Hello. > > I noticed that current linux.git generates hardly readable console output > due to KERN_CONT changes. Are you suggesting developers that output like > this be fixed? Joe has already posted a patch http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git@perches.com And in general I think that adding those KERN_CONT is a good thing to do. -- Michal Hocko SUSE Labs
linux.git: printk() problem
Hello. I noticed that current linux.git generates hardly readable console output due to KERN_CONT changes. Are you suggesting developers that output like this be fixed? [ 93.723582] Mem-Info: [ 93.725919] active_anon:380970 inactive_anon:2098 isolated_anon:0 [ 93.725919] active_file:242 inactive_file:275 isolated_file:0 [ 93.725919] unevictable:0 dirty:0 writeback:65 unstable:0 [ 93.725919] slab_reclaimable:3825 slab_unreclaimable:12527 [ 93.725919] mapped:760 shmem:2164 pagetables:9297 bounce:0 [ 93.725919] free:14046 free_pcp:74 free_cma:0 [ 93.735626] Node 0 active_anon:1523880kB inactive_anon:8392kB active_file:968kB inactive_file:1100kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:3040kB dirty:0kB writeback:244kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 1056768kB anon_thp: 8656kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no [ 93.742894] Node 0 [ 93.743319] DMA free:7156kB min:408kB low:508kB high:608kB active_anon:8672kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:36kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 93.750951] lowmem_reserve[]:[ 93.751597] 0 1699[ 93.752678] 1699 1699[ 93.753910] [ 93.754822] Node 0 [ 93.755216] DMA32 free:49028kB min:44644kB low:55804kB high:66964kB active_anon:1515208kB inactive_anon:8392kB active_file:968kB inactive_file:960kB unevictable:0kB writepending:232kB present:2080640kB managed:1760736kB mlocked:0kB slab_reclaimable:15300kB slab_unreclaimable:50068kB kernel_stack:20320kB pagetables:37152kB bounce:0kB free_pcp:436kB local_pcp:240kB free_cma:0kB [ 93.763709] lowmem_reserve[]:[ 93.764235] 0 0[ 93.765630] 0 0[ 93.767544] [ 93.768676] Node 0 [ 93.769135] DMA: 1*4kB [ 93.770932] (U) 0*8kB [ 93.772192] 1*16kB (U) [ 93.773438] 1*32kB (M) [ 93.774574] 1*64kB (U) [ 93.775997] 1*128kB (U) [ 93.777665] 1*256kB (U) [ 93.778861] 1*512kB (M) [ 93.780004] 2*1024kB (UM) [ 93.781225] 0*2048kB 1*4096kB [ 93.782422] (M) [ 93.783423] = 7156kB [ 93.784445] Node 0 [ 93.784844] DMA32: 1265*4kB [ 93.786204] (UMEH) 790*8kB [ 93.787412] (UMEH) 335*16kB [ 93.788634] (UEH) 265*32kB [ 93.790037] (UE) 89*64kB [ 93.791254] (UEH) 23*128kB [ 93.792541] (UME) 9*256kB [ 93.793787] (ME) 13*512kB [ 93.794956] (UME) 6*1024kB [ 93.796094] (UME) 0*2048kB [ 93.797435] 0*4096kB [ 93.798413] = 48964kB [ 93.799357] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [ 93.802221] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 93.804476] 2700 total pagecache pages [ 93.805949] 0 pages in swap cache [ 93.807557] Swap cache stats: add 0, delete 0, find 0/0 [ 93.809417] Free swap = 0kB [ 93.810573] Total swap = 0kB [ 93.811788] 524157 pages RAM [ 93.812903] 0 pages HighMem/MovableOnly [ 93.814216] 79997 pages reserved [ 93.815292] 0 pages hwpoisoned [ 185.497015] Showing all locks held in the system: [ 185.59] 1 lock held by systemd/1: [ 185.501377] #0: [ 185.501717] ( &(>i_mmaplock)->mr_lock[ 185.503347] ){++} , at: [ 185.504566] [] xfs_ilock+0xd0/0xe0 [ 185.506157] 2 locks held by khungtaskd/46: [ 185.507553] #0: [ 185.507896] ( rcu_read_lock[ 185.509091] ){..} , at: [ 185.510277] [] watchdog+0x9e/0x480 [ 185.512072] #1: [ 185.512413] ( tasklist_lock[ 185.513602] ){.+.+..} , at: [ 185.514802] [] debug_show_all_locks+0x3f/0x1b0
linux.git: printk() problem
Hello. I noticed that current linux.git generates hardly readable console output due to KERN_CONT changes. Are you suggesting developers that output like this be fixed? [ 93.723582] Mem-Info: [ 93.725919] active_anon:380970 inactive_anon:2098 isolated_anon:0 [ 93.725919] active_file:242 inactive_file:275 isolated_file:0 [ 93.725919] unevictable:0 dirty:0 writeback:65 unstable:0 [ 93.725919] slab_reclaimable:3825 slab_unreclaimable:12527 [ 93.725919] mapped:760 shmem:2164 pagetables:9297 bounce:0 [ 93.725919] free:14046 free_pcp:74 free_cma:0 [ 93.735626] Node 0 active_anon:1523880kB inactive_anon:8392kB active_file:968kB inactive_file:1100kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:3040kB dirty:0kB writeback:244kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 1056768kB anon_thp: 8656kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no [ 93.742894] Node 0 [ 93.743319] DMA free:7156kB min:408kB low:508kB high:608kB active_anon:8672kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:36kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [ 93.750951] lowmem_reserve[]:[ 93.751597] 0 1699[ 93.752678] 1699 1699[ 93.753910] [ 93.754822] Node 0 [ 93.755216] DMA32 free:49028kB min:44644kB low:55804kB high:66964kB active_anon:1515208kB inactive_anon:8392kB active_file:968kB inactive_file:960kB unevictable:0kB writepending:232kB present:2080640kB managed:1760736kB mlocked:0kB slab_reclaimable:15300kB slab_unreclaimable:50068kB kernel_stack:20320kB pagetables:37152kB bounce:0kB free_pcp:436kB local_pcp:240kB free_cma:0kB [ 93.763709] lowmem_reserve[]:[ 93.764235] 0 0[ 93.765630] 0 0[ 93.767544] [ 93.768676] Node 0 [ 93.769135] DMA: 1*4kB [ 93.770932] (U) 0*8kB [ 93.772192] 1*16kB (U) [ 93.773438] 1*32kB (M) [ 93.774574] 1*64kB (U) [ 93.775997] 1*128kB (U) [ 93.777665] 1*256kB (U) [ 93.778861] 1*512kB (M) [ 93.780004] 2*1024kB (UM) [ 93.781225] 0*2048kB 1*4096kB [ 93.782422] (M) [ 93.783423] = 7156kB [ 93.784445] Node 0 [ 93.784844] DMA32: 1265*4kB [ 93.786204] (UMEH) 790*8kB [ 93.787412] (UMEH) 335*16kB [ 93.788634] (UEH) 265*32kB [ 93.790037] (UE) 89*64kB [ 93.791254] (UEH) 23*128kB [ 93.792541] (UME) 9*256kB [ 93.793787] (ME) 13*512kB [ 93.794956] (UME) 6*1024kB [ 93.796094] (UME) 0*2048kB [ 93.797435] 0*4096kB [ 93.798413] = 48964kB [ 93.799357] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [ 93.802221] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 93.804476] 2700 total pagecache pages [ 93.805949] 0 pages in swap cache [ 93.807557] Swap cache stats: add 0, delete 0, find 0/0 [ 93.809417] Free swap = 0kB [ 93.810573] Total swap = 0kB [ 93.811788] 524157 pages RAM [ 93.812903] 0 pages HighMem/MovableOnly [ 93.814216] 79997 pages reserved [ 93.815292] 0 pages hwpoisoned [ 185.497015] Showing all locks held in the system: [ 185.59] 1 lock held by systemd/1: [ 185.501377] #0: [ 185.501717] ( &(>i_mmaplock)->mr_lock[ 185.503347] ){++} , at: [ 185.504566] [] xfs_ilock+0xd0/0xe0 [ 185.506157] 2 locks held by khungtaskd/46: [ 185.507553] #0: [ 185.507896] ( rcu_read_lock[ 185.509091] ){..} , at: [ 185.510277] [] watchdog+0x9e/0x480 [ 185.512072] #1: [ 185.512413] ( tasklist_lock[ 185.513602] ){.+.+..} , at: [ 185.514802] [] debug_show_all_locks+0x3f/0x1b0