Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
On Thu 2016-12-01 11:10:42, Sergey Senozhatsky wrote: > On (11/25/16 12:07), Petr Mladek wrote: > [..] > > > +static void report_message_lost(atomic_t *num_lost, char *fmt) > > > +{ > > > + int lost = atomic_xchg(num_lost, 0); > > > + > > > + if (lost) { > > > + char msg[56]; > > > I would really like to avoid a hard coded buffer size. Such things > > are likely to bite us in the future. > > why would scnprintf() overflow. Sure, it will not overflow. But still, such a small hard coded buffer size looks too hacky to me. > > I thought about reshuffling a lot of logic, adding more wrappers, > > ... But the solution might be easy in the end, see below. > > > > > + > > > + scnprintf(msg, sizeof(msg), fmt, lost); > > > + > > > + printk_safe_flush_line(msg, strlen(msg)); > > > > This made my brain spin a lot. I wondered if it did what we wanted > > and it was safe. > > > > On one hand, it is supposed to work because use exactly this > > function in __printk_safe_flush() where you call this from. > > > > One question is if it does what we want in different contexts. > > Let's look at it: > > > > 1. It calls printk_deferred() in NMI context. There is a risk > >of a deadlock. But it is called only from > >printk_safe_flush_on_panic() which is the last resort. Therefore > >it does exactly what we want. > > > > 2. It calls printk()->printk_func()->vprintk_emit() in normal context. > >It is what we want in normal context. > > > > 3. It calls printk()->printk_func()->v printk_safe() in printk_safe > >context. This does not look correct. IMHO, this might happen > >only printk_safe_flush_on_panic() and we want to use > >printk_deferred() here as well. > [..] > > The easiest solution would be to simply call printk_deferred() > > here. Everything will be deferred after the async printk() patchset > > anyway. > > > > I would even use printk_deferred() in printk_safe_flush_line() > > for each context. It is not optimal but it works very well > > and it makes the code much more straightforward. > > yes, good point. > we can call deferred printk for anything there; or replace that in_nmi() > check with the `printk_safe_context != 0' one, and then route the message > via printk or printk_deferred. Yup, it might be an option and sounds good. Anyway, I would use printk_deferred() to print the warnings about lost messages. It is perfectly fine and you will not need the hard coded temporary buffer. Best regards, Petr
Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
On Thu 2016-12-01 11:10:42, Sergey Senozhatsky wrote: > On (11/25/16 12:07), Petr Mladek wrote: > [..] > > > +static void report_message_lost(atomic_t *num_lost, char *fmt) > > > +{ > > > + int lost = atomic_xchg(num_lost, 0); > > > + > > > + if (lost) { > > > + char msg[56]; > > > I would really like to avoid a hard coded buffer size. Such things > > are likely to bite us in the future. > > why would scnprintf() overflow. Sure, it will not overflow. But still, such a small hard coded buffer size looks too hacky to me. > > I thought about reshuffling a lot of logic, adding more wrappers, > > ... But the solution might be easy in the end, see below. > > > > > + > > > + scnprintf(msg, sizeof(msg), fmt, lost); > > > + > > > + printk_safe_flush_line(msg, strlen(msg)); > > > > This made my brain spin a lot. I wondered if it did what we wanted > > and it was safe. > > > > On one hand, it is supposed to work because use exactly this > > function in __printk_safe_flush() where you call this from. > > > > One question is if it does what we want in different contexts. > > Let's look at it: > > > > 1. It calls printk_deferred() in NMI context. There is a risk > >of a deadlock. But it is called only from > >printk_safe_flush_on_panic() which is the last resort. Therefore > >it does exactly what we want. > > > > 2. It calls printk()->printk_func()->vprintk_emit() in normal context. > >It is what we want in normal context. > > > > 3. It calls printk()->printk_func()->v printk_safe() in printk_safe > >context. This does not look correct. IMHO, this might happen > >only printk_safe_flush_on_panic() and we want to use > >printk_deferred() here as well. > [..] > > The easiest solution would be to simply call printk_deferred() > > here. Everything will be deferred after the async printk() patchset > > anyway. > > > > I would even use printk_deferred() in printk_safe_flush_line() > > for each context. It is not optimal but it works very well > > and it makes the code much more straightforward. > > yes, good point. > we can call deferred printk for anything there; or replace that in_nmi() > check with the `printk_safe_context != 0' one, and then route the message > via printk or printk_deferred. Yup, it might be an option and sounds good. Anyway, I would use printk_deferred() to print the warnings about lost messages. It is perfectly fine and you will not need the hard coded temporary buffer. Best regards, Petr
Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
On (11/25/16 12:07), Petr Mladek wrote: [..] > > +static void report_message_lost(atomic_t *num_lost, char *fmt) > > +{ > > + int lost = atomic_xchg(num_lost, 0); > > + > > + if (lost) { > > + char msg[56]; > > I would really like to avoid a hard coded buffer size. Such things > are likely to bite us in the future. why would scnprintf() overflow. > I thought about reshuffling a lot of logic, adding more wrappers, > ... But the solution might be easy in the end, see below. > > > + > > + scnprintf(msg, sizeof(msg), fmt, lost); > > + > > + printk_safe_flush_line(msg, strlen(msg)); > > This made my brain spin a lot. I wondered if it did what we wanted > and it was safe. > > On one hand, it is supposed to work because use exactly this > function in __printk_safe_flush() where you call this from. > > One question is if it does what we want in different contexts. > Let's look at it: > > 1. It calls printk_deferred() in NMI context. There is a risk >of a deadlock. But it is called only from >printk_safe_flush_on_panic() which is the last resort. Therefore >it does exactly what we want. > > 2. It calls printk()->printk_func()->vprintk_emit() in normal context. >It is what we want in normal context. > > 3. It calls printk()->printk_func()->v printk_safe() in printk_safe >context. This does not look correct. IMHO, this might happen >only printk_safe_flush_on_panic() and we want to use >printk_deferred() here as well. [..] > The easiest solution would be to simply call printk_deferred() > here. Everything will be deferred after the async printk() patchset > anyway. > > I would even use printk_deferred() in printk_safe_flush_line() > for each context. It is not optimal but it works very well > and it makes the code much more straightforward. yes, good point. we can call deferred printk for anything there; or replace that in_nmi() check with the `printk_safe_context != 0' one, and then route the message via printk or printk_deferred. [..] > > * Flush data from the associated per-CPU buffer. The function > > * can be called either via IRQ work or independently. > > @@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work) > > > > i = 0; > > more: > > + report_nmi_message_lost(); > > + report_safe_message_lost(); > > Please, move this at the end of this function. ok. -ss
Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
On (11/25/16 12:07), Petr Mladek wrote: [..] > > +static void report_message_lost(atomic_t *num_lost, char *fmt) > > +{ > > + int lost = atomic_xchg(num_lost, 0); > > + > > + if (lost) { > > + char msg[56]; > > I would really like to avoid a hard coded buffer size. Such things > are likely to bite us in the future. why would scnprintf() overflow. > I thought about reshuffling a lot of logic, adding more wrappers, > ... But the solution might be easy in the end, see below. > > > + > > + scnprintf(msg, sizeof(msg), fmt, lost); > > + > > + printk_safe_flush_line(msg, strlen(msg)); > > This made my brain spin a lot. I wondered if it did what we wanted > and it was safe. > > On one hand, it is supposed to work because use exactly this > function in __printk_safe_flush() where you call this from. > > One question is if it does what we want in different contexts. > Let's look at it: > > 1. It calls printk_deferred() in NMI context. There is a risk >of a deadlock. But it is called only from >printk_safe_flush_on_panic() which is the last resort. Therefore >it does exactly what we want. > > 2. It calls printk()->printk_func()->vprintk_emit() in normal context. >It is what we want in normal context. > > 3. It calls printk()->printk_func()->v printk_safe() in printk_safe >context. This does not look correct. IMHO, this might happen >only printk_safe_flush_on_panic() and we want to use >printk_deferred() here as well. [..] > The easiest solution would be to simply call printk_deferred() > here. Everything will be deferred after the async printk() patchset > anyway. > > I would even use printk_deferred() in printk_safe_flush_line() > for each context. It is not optimal but it works very well > and it makes the code much more straightforward. yes, good point. we can call deferred printk for anything there; or replace that in_nmi() check with the `printk_safe_context != 0' one, and then route the message via printk or printk_deferred. [..] > > * Flush data from the associated per-CPU buffer. The function > > * can be called either via IRQ work or independently. > > @@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work) > > > > i = 0; > > more: > > + report_nmi_message_lost(); > > + report_safe_message_lost(); > > Please, move this at the end of this function. ok. -ss
Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
On Fri 2016-10-28 00:49:31, Sergey Senozhatsky wrote: > Account lost messages in pritk-safe and printk-safe-nmi > contexts and report those numbers during printk_safe_flush(). > > Signed-off-by: Sergey Senozhatsky> --- > kernel/printk/internal.h| 17 --- > kernel/printk/printk.c | 10 - > kernel/printk/printk_safe.c | 50 > +++-- > 3 files changed, 48 insertions(+), 29 deletions(-) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 1878b2d..4675b8d 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1795,7 +1795,6 @@ asmlinkage int vprintk_emit(int facility, int level, > unsigned long flags; > int this_cpu; > int printed_len = 0; > - int nmi_message_lost; > bool in_sched = false; > /* cpu currently holding logbuf_lock in this function */ > static unsigned int logbuf_cpu = UINT_MAX; > @@ -1846,15 +1845,6 @@ asmlinkage int vprintk_emit(int facility, int level, >strlen(recursion_msg)); > } > > - nmi_message_lost = get_nmi_message_lost(); > - if (unlikely(nmi_message_lost)) { > - text_len = scnprintf(textbuf, sizeof(textbuf), > - "BAD LUCK: lost %d message(s) from NMI > context!", > - nmi_message_lost); > - printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0, > - NULL, 0, textbuf, text_len); > - } I really like that we are moving this out of vprintk_emit() but... > - > /* >* The printf needs to come first; we need the syslog >* prefix which might be passed-in as a parameter. > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c > index af74d9c..721b091 100644 > --- a/kernel/printk/printk_safe.c > +++ b/kernel/printk/printk_safe.c > @@ -52,10 +52,11 @@ struct printk_safe_seq_buf { > > static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq); > static DEFINE_PER_CPU(int, printk_safe_context); > +static atomic_t safe_message_lost; > > #ifdef CONFIG_PRINTK_NMI > static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq); > -atomic_t nmi_message_lost; > +static atomic_t nmi_message_lost; > #endif > > static int printk_safe_log_store(struct printk_safe_seq_buf *s, > @@ -122,6 +123,41 @@ static void printk_safe_flush_seq_line(struct > printk_safe_seq_buf *s, > printk_safe_flush_line(buf, (end - start) + 1); > } > > +static void report_message_lost(atomic_t *num_lost, char *fmt) > +{ > + int lost = atomic_xchg(num_lost, 0); > + > + if (lost) { > + char msg[56]; I would really like to avoid a hard coded buffer size. Such things are likely to bite us in the future. I thought about reshuffling a lot of logic, adding more wrappers, ... But the solution might be easy in the end, see below. > + > + scnprintf(msg, sizeof(msg), fmt, lost); > + > + printk_safe_flush_line(msg, strlen(msg)); This made my brain spin a lot. I wondered if it did what we wanted and it was safe. On one hand, it is supposed to work because use exactly this function in __printk_safe_flush() where you call this from. One question is if it does what we want in different contexts. Let's look at it: 1. It calls printk_deferred() in NMI context. There is a risk of a deadlock. But it is called only from printk_safe_flush_on_panic() which is the last resort. Therefore it does exactly what we want. 2. It calls printk()->printk_func()->vprintk_emit() in normal context. It is what we want in normal context. 3. It calls printk()->printk_func()->v printk_safe() in printk_safe context. This does not look correct. IMHO, this might happen only printk_safe_flush_on_panic() and we want to use printk_deferred() here as well. Also note that printk_safe buffer might still be full at this point. It means that the warning will not be stored. There also might be a risk of an infinite loop. Both problems are solved by printk_deferred(). The easiest solution would be to simply call printk_deferred() here. Everything will be deferred after the async printk() patchset anyway. I would even use printk_deferred() in printk_safe_flush_line() for each context. It is not optimal but it works very well and it makes the code much more straightforward. > + } > +} > + > +#ifdef CONFIG_PRINTK_NMI > + > +static void report_nmi_message_lost(void) > +{ > + report_message_lost(_message_lost, > + "Lost %d message(s) from NMI context!"); > +} > + > +#else > + > +static void report_nmi_message_lost(void) > +{ > +} > + > +#endif /* CONFIG_PRINTK_NMI */ > + > +static void report_safe_message_lost(void) > +{ > + report_message_lost(_message_lost, > + "Lost %d message(s) from printk-safe context!"); > +}
Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
On Fri 2016-10-28 00:49:31, Sergey Senozhatsky wrote: > Account lost messages in pritk-safe and printk-safe-nmi > contexts and report those numbers during printk_safe_flush(). > > Signed-off-by: Sergey Senozhatsky > --- > kernel/printk/internal.h| 17 --- > kernel/printk/printk.c | 10 - > kernel/printk/printk_safe.c | 50 > +++-- > 3 files changed, 48 insertions(+), 29 deletions(-) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 1878b2d..4675b8d 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1795,7 +1795,6 @@ asmlinkage int vprintk_emit(int facility, int level, > unsigned long flags; > int this_cpu; > int printed_len = 0; > - int nmi_message_lost; > bool in_sched = false; > /* cpu currently holding logbuf_lock in this function */ > static unsigned int logbuf_cpu = UINT_MAX; > @@ -1846,15 +1845,6 @@ asmlinkage int vprintk_emit(int facility, int level, >strlen(recursion_msg)); > } > > - nmi_message_lost = get_nmi_message_lost(); > - if (unlikely(nmi_message_lost)) { > - text_len = scnprintf(textbuf, sizeof(textbuf), > - "BAD LUCK: lost %d message(s) from NMI > context!", > - nmi_message_lost); > - printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0, > - NULL, 0, textbuf, text_len); > - } I really like that we are moving this out of vprintk_emit() but... > - > /* >* The printf needs to come first; we need the syslog >* prefix which might be passed-in as a parameter. > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c > index af74d9c..721b091 100644 > --- a/kernel/printk/printk_safe.c > +++ b/kernel/printk/printk_safe.c > @@ -52,10 +52,11 @@ struct printk_safe_seq_buf { > > static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq); > static DEFINE_PER_CPU(int, printk_safe_context); > +static atomic_t safe_message_lost; > > #ifdef CONFIG_PRINTK_NMI > static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq); > -atomic_t nmi_message_lost; > +static atomic_t nmi_message_lost; > #endif > > static int printk_safe_log_store(struct printk_safe_seq_buf *s, > @@ -122,6 +123,41 @@ static void printk_safe_flush_seq_line(struct > printk_safe_seq_buf *s, > printk_safe_flush_line(buf, (end - start) + 1); > } > > +static void report_message_lost(atomic_t *num_lost, char *fmt) > +{ > + int lost = atomic_xchg(num_lost, 0); > + > + if (lost) { > + char msg[56]; I would really like to avoid a hard coded buffer size. Such things are likely to bite us in the future. I thought about reshuffling a lot of logic, adding more wrappers, ... But the solution might be easy in the end, see below. > + > + scnprintf(msg, sizeof(msg), fmt, lost); > + > + printk_safe_flush_line(msg, strlen(msg)); This made my brain spin a lot. I wondered if it did what we wanted and it was safe. On one hand, it is supposed to work because use exactly this function in __printk_safe_flush() where you call this from. One question is if it does what we want in different contexts. Let's look at it: 1. It calls printk_deferred() in NMI context. There is a risk of a deadlock. But it is called only from printk_safe_flush_on_panic() which is the last resort. Therefore it does exactly what we want. 2. It calls printk()->printk_func()->vprintk_emit() in normal context. It is what we want in normal context. 3. It calls printk()->printk_func()->v printk_safe() in printk_safe context. This does not look correct. IMHO, this might happen only printk_safe_flush_on_panic() and we want to use printk_deferred() here as well. Also note that printk_safe buffer might still be full at this point. It means that the warning will not be stored. There also might be a risk of an infinite loop. Both problems are solved by printk_deferred(). The easiest solution would be to simply call printk_deferred() here. Everything will be deferred after the async printk() patchset anyway. I would even use printk_deferred() in printk_safe_flush_line() for each context. It is not optimal but it works very well and it makes the code much more straightforward. > + } > +} > + > +#ifdef CONFIG_PRINTK_NMI > + > +static void report_nmi_message_lost(void) > +{ > + report_message_lost(_message_lost, > + "Lost %d message(s) from NMI context!"); > +} > + > +#else > + > +static void report_nmi_message_lost(void) > +{ > +} > + > +#endif /* CONFIG_PRINTK_NMI */ > + > +static void report_safe_message_lost(void) > +{ > + report_message_lost(_message_lost, > + "Lost %d message(s) from printk-safe context!"); > +} > + > /* > * Flush data