Re: [PATCH printk v3 3/6] printk: remove safe buffers

2021-06-25 Thread Petr Mladek
On Thu 2021-06-24 17:41:56, John Ogness wrote:
> I would prefer a v4 with these fixes:
> 
> - wrap @console_owner_lock with printk_safe usage
> 
> - remove unnecessary printk_safe usage from printk_safe.c
> 
> - update commit message to say that safe context tracking is left in
>   place for both the console and console_owner locks

Sounds good to me.

Best Regards,
Petr

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH printk v3 3/6] printk: remove safe buffers

2021-06-24 Thread John Ogness
On 2021-06-24, Petr Mladek  wrote:
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -1852,7 +1839,7 @@ static int console_trylock_spinning(void)
>>  if (console_trylock())
>>  return 1;
>>  
>> -printk_safe_enter_irqsave(flags);
>> +local_irq_save(flags);
>>  
>>  raw_spin_lock(&console_owner_lock);
>
> This spin_lock is in the printk() path. We must make sure that
> it does not cause deadlock.
>
> printk_safe_enter_irqsave(flags) prevented the recursion because
> it deferred the console handling.
>
> One danger might be a lockdep report triggered by
> raw_spin_lock(&console_owner_lock) itself. But it should be safe.
> lockdep is checked before the lock is actually taken
> and lockdep should disable itself before printing anything.
>
> Another danger might be any printk() called under the lock.
> The code just compares and assigns values to some variables
> (static, on stack) so we should be on the safe side.
>
> Well, I would feel more comfortable if we add printk_safe_enter_irqsave()
> back around the sections guarded by this lock. It can be done
> in a separate patch. The code looks safe at the moment.

You are correct. printk_safe should also be wrapping @console_owner_lock
locking.

>> @@ -2716,19 +2700,22 @@ void console_unlock(void)
>>   * were to occur on another CPU, it may wait for this one to
>>   * finish. This task can not be preempted if there is a
>>   * waiter waiting to take over.
>> + *
>> + * Interrupts are disabled because the hand over to a waiter
>> + * must not be interrupted until the hand over is completed
>> + * (@console_waiter is cleared).
>>   */
>> +local_irq_save(flags);
>>  console_lock_spinning_enable();
>
> Same here. console_lock_spinning_enable() takes console_owner_lock.
> I would feel more comfortable if we added printk_safe_enter_irqsave(flags)
> inside console_lock_spinning_enable() around the locked code. The code
> is safe at the moment but...

Agreed.

>>  stop_critical_timings();/* don't trace print latency */
>>  call_console_drivers(ext_text, ext_len, text, len);
>>  start_critical_timings();
>>  
>> -if (console_lock_spinning_disable_and_check()) {
>> -printk_safe_exit_irqrestore(flags);
>> +handover = console_lock_spinning_disable_and_check();
>
> Same here. Also console_lock_spinning_disable_and_check() takes
> console_owner_lock. It looks safe at the moment but...

Agreed.

>> --- a/kernel/printk/printk_safe.c
>> +++ b/kernel/printk/printk_safe.c
>> @@ -369,7 +70,10 @@ asmlinkage int vprintk(const char *fmt, va_list args)
>>   * Use the main logbuf even in NMI. But avoid calling console
>>   * drivers that might have their own locks.
>>   */
>> -if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
>> +if (this_cpu_read(printk_context) &
>> +(PRINTK_NMI_DIRECT_CONTEXT_MASK |
>> + PRINTK_NMI_CONTEXT_MASK |
>> + PRINTK_SAFE_CONTEXT_MASK)) {
>>  unsigned long flags;
>>  int len;
>>  
>
> There is the following code right below:
>
>   printk_safe_enter_irqsave(flags);
>   len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
>   printk_safe_exit_irqrestore(flags);
>   defer_console_output();
>   return len;
>
> printk_safe_enter_irqsave(flags) is not needed here. Any nested
> printk() ends here as well.

Ah, I missed that one. Good eye!

> Against this can be done in a separate patch. Well, the commit message
> mentions that the printk_safe context is removed everywhere except
> for the code manipulating console lock. But is it just a detail.

I would prefer a v4 with these fixes:

- wrap @console_owner_lock with printk_safe usage

- remove unnecessary printk_safe usage from printk_safe.c

- update commit message to say that safe context tracking is left in
  place for both the console and console_owner locks

John Ogness

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH printk v3 3/6] printk: remove safe buffers

2021-06-24 Thread Petr Mladek
On Thu 2021-06-24 13:17:45, John Ogness wrote:
> With @logbuf_lock removed, the high level printk functions for
> storing messages are lockless. Messages can be stored from any
> context, so there is no need for the NMI and safe buffers anymore.
> Remove the NMI and safe buffers.
> 
> Although the safe buffers are removed, the NMI and safe context
> tracking is still in place. In these contexts, store the message
> immediately but still use irq_work to defer the console printing.
> 
> Since printk recursion tracking is in place, safe context tracking
> for most of printk is not needed. Remove it. Only safe context
> tracking relating to the console lock is left in place. This is
> because the console lock is needed for the actual printing.

Feel free to use:

Reviewed-by: Petr Mladek 

There are some comments below.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1852,7 +1839,7 @@ static int console_trylock_spinning(void)
>   if (console_trylock())
>   return 1;
>  
> - printk_safe_enter_irqsave(flags);
> + local_irq_save(flags);
>  
>   raw_spin_lock(&console_owner_lock);

This spin_lock is in the printk() path. We must make sure that
it does not cause deadlock.

printk_safe_enter_irqsave(flags) prevented the recursion because
it deferred the console handling.

One danger might be a lockdep report triggered by
raw_spin_lock(&console_owner_lock) itself. But it should be safe.
lockdep is checked before the lock is actually taken
and lockdep should disable itself before printing anything.

Another danger might be any printk() called under the lock.
The code just compares and assigns values to some variables
(static, on stack) so we should be on the safe side.

Well, I would feel more comfortable if we add printk_safe_enter_irqsave()
back around the sections guarded by this lock. It can be done
in a separate patch. The code looks safe at the moment.


> @@ -2664,9 +2648,9 @@ void console_unlock(void)
>  
>   for (;;) {
>   size_t ext_len = 0;
> + int handover;
>   size_t len;
>  
> - printk_safe_enter_irqsave(flags);
>  skip:
>   if (!prb_read_valid(prb, console_seq, &r))
>   break;
> @@ -2716,19 +2700,22 @@ void console_unlock(void)
>* were to occur on another CPU, it may wait for this one to
>* finish. This task can not be preempted if there is a
>* waiter waiting to take over.
> +  *
> +  * Interrupts are disabled because the hand over to a waiter
> +  * must not be interrupted until the hand over is completed
> +  * (@console_waiter is cleared).
>*/
> + local_irq_save(flags);
>   console_lock_spinning_enable();

Same here. console_lock_spinning_enable() takes console_owner_lock.
I would feel more comfortable if we added printk_safe_enter_irqsave(flags)
inside console_lock_spinning_enable() around the locked code. The code
is safe at the moment but...

>  
>   stop_critical_timings();/* don't trace print latency */
>   call_console_drivers(ext_text, ext_len, text, len);
>   start_critical_timings();
>  
> - if (console_lock_spinning_disable_and_check()) {
> - printk_safe_exit_irqrestore(flags);
> + handover = console_lock_spinning_disable_and_check();

Same here. Also console_lock_spinning_disable_and_check() takes
console_owner_lock. It looks safe at the moment but...


> + local_irq_restore(flags);
> + if (handover)
>   return;
> - }
> -
> - printk_safe_exit_irqrestore(flags);
>  
>   if (do_cond_resched)
>   cond_resched();

> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -369,7 +70,10 @@ asmlinkage int vprintk(const char *fmt, va_list args)
>* Use the main logbuf even in NMI. But avoid calling console
>* drivers that might have their own locks.
>*/
> - if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
> + if (this_cpu_read(printk_context) &
> + (PRINTK_NMI_DIRECT_CONTEXT_MASK |
> +  PRINTK_NMI_CONTEXT_MASK |
> +  PRINTK_SAFE_CONTEXT_MASK)) {
>   unsigned long flags;
>   int len;
>  

There is the following code right below:

printk_safe_enter_irqsave(flags);
len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
printk_safe_exit_irqrestore(flags);
defer_console_output();
return len;

printk_safe_enter_irqsave(flags) is not needed here. Any nested
printk() ends here as well.

Against this can be done in a separate patch. Well, the commit message
mentions that the printk_safe context is removed everywhere except
for the code manipulating console

[PATCH printk v3 3/6] printk: remove safe buffers

2021-06-24 Thread John Ogness
With @logbuf_lock removed, the high level printk functions for
storing messages are lockless. Messages can be stored from any
context, so there is no need for the NMI and safe buffers anymore.
Remove the NMI and safe buffers.

Although the safe buffers are removed, the NMI and safe context
tracking is still in place. In these contexts, store the message
immediately but still use irq_work to defer the console printing.

Since printk recursion tracking is in place, safe context tracking
for most of printk is not needed. Remove it. Only safe context
tracking relating to the console lock is left in place. This is
because the console lock is needed for the actual printing.

Signed-off-by: John Ogness 
---
 arch/powerpc/kernel/traps.c|   1 -
 arch/powerpc/kernel/watchdog.c |   5 -
 include/linux/printk.h |  10 -
 kernel/kexec_core.c|   1 -
 kernel/panic.c |   3 -
 kernel/printk/internal.h   |  17 --
 kernel/printk/printk.c | 126 +
 kernel/printk/printk_safe.c| 332 +
 lib/nmi_backtrace.c|   6 -
 9 files changed, 51 insertions(+), 450 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a44a30b0688c..5828c83eaca6 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -171,7 +171,6 @@ extern void panic_flush_kmsg_start(void)
 
 extern void panic_flush_kmsg_end(void)
 {
-   printk_safe_flush_on_panic();
kmsg_dump(KMSG_DUMP_PANIC);
bust_spinlocks(0);
debug_locks_off();
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index c9a8f4781a10..dc17d8903d4f 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -183,11 +183,6 @@ static void watchdog_smp_panic(int cpu, u64 tb)
 
wd_smp_unlock(&flags);
 
-   printk_safe_flush();
-   /*
-* printk_safe_flush() seems to require another print
-* before anything actually goes out to console.
-*/
if (sysctl_hardlockup_all_cpu_backtrace)
trigger_allbutself_cpu_backtrace();
 
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 1790a5521fd9..664612f75dac 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -207,8 +207,6 @@ __printf(1, 2) void dump_stack_set_arch_desc(const char 
*fmt, ...);
 void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
 extern asmlinkage void dump_stack(void) __cold;
-extern void printk_safe_flush(void);
-extern void printk_safe_flush_on_panic(void);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -272,14 +270,6 @@ static inline void show_regs_print_info(const char 
*log_lvl)
 static inline void dump_stack(void)
 {
 }
-
-static inline void printk_safe_flush(void)
-{
-}
-
-static inline void printk_safe_flush_on_panic(void)
-{
-}
 #endif
 
 #ifdef CONFIG_SMP
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index a0b6780740c8..480d5f77ef4f 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -977,7 +977,6 @@ void crash_kexec(struct pt_regs *regs)
old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);
if (old_cpu == PANIC_CPU_INVALID) {
/* This is the 1st CPU which comes here, so go ahead. */
-   printk_safe_flush_on_panic();
__crash_kexec(regs);
 
/*
diff --git a/kernel/panic.c b/kernel/panic.c
index 332736a72a58..1f0df42f8d0c 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -247,7 +247,6 @@ void panic(const char *fmt, ...)
 * Bypass the panic_cpu check and call __crash_kexec directly.
 */
if (!_crash_kexec_post_notifiers) {
-   printk_safe_flush_on_panic();
__crash_kexec(NULL);
 
/*
@@ -271,8 +270,6 @@ void panic(const char *fmt, ...)
 */
atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
-   /* Call flush even twice. It tries harder with a single online CPU */
-   printk_safe_flush_on_panic();
kmsg_dump(KMSG_DUMP_PANIC);
 
/*
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 51615c909b2f..6cc35c5de890 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -22,7 +22,6 @@ __printf(1, 0) int vprintk_deferred(const char *fmt, va_list 
args);
 void __printk_safe_enter(void);
 void __printk_safe_exit(void);
 
-void printk_safe_init(void);
 bool printk_percpu_data_ready(void);
 
 #define printk_safe_enter_irqsave(flags)   \
@@ -37,18 +36,6 @@ bool printk_percpu_data_ready(void);
local_irq_restore(flags);   \
} while (0)
 
-#define printk_safe_enter_irq()\
-   do {\
-   local_irq_disable();\
-   __printk_safe_enter();  \
-   } while (0)
-
-#define printk_sa