Re: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
On (06/20/18 14:30), Andrew Morton wrote: > > From: Hoeun Ryu > > > > Many console device drivers hold the uart_port->lock spinlock with irq > > enabled > > (using spin_lock()) while the device drivers are writing characters to > > their devices, > > but the device drivers just try to hold the spin lock (using > > spin_trylock()) if > > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > > > There is a case ocurring a deadlock related to the lock and > > oops_in_progress. A CPU > > could be stopped by smp_send_stop() while it was holding the port lock > > because irq was > > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the > > lock stays > > locked forever. > > > > console_flush_on_panic() is called during panic() and it eventually holds > > the uart > > lock but the lock is held by another stopped CPU and it is a deadlock. By > > moving > > bust_spinlocks(0) after console_flush_on_panic(), let the console device > > drivers > > think the Oops is still in progress to call spin_trylock() instead of > > spin_lock() to > > avoid the deadlock. > > hm. Sergey, is this at all related to the UART printk deadlock change > which you're presently discussing in > http://lkml.kernel.org/r/20180615093919.559-1-sergey.senozhat...@gmail.com? Hi Andrew, Not exactly. The change I'm discussing is a little different - it's about re-entrant UART [+ circular locking in TTY], when UART deadlocks us because of printk()-s issued by MM/tty/WQ/sched/other core kernel stuff/etc Example: IRQ -> uart -> tty -> WQ -> printk -> uart -ss
Re: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
On (06/20/18 14:30), Andrew Morton wrote: > > From: Hoeun Ryu > > > > Many console device drivers hold the uart_port->lock spinlock with irq > > enabled > > (using spin_lock()) while the device drivers are writing characters to > > their devices, > > but the device drivers just try to hold the spin lock (using > > spin_trylock()) if > > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > > > There is a case ocurring a deadlock related to the lock and > > oops_in_progress. A CPU > > could be stopped by smp_send_stop() while it was holding the port lock > > because irq was > > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the > > lock stays > > locked forever. > > > > console_flush_on_panic() is called during panic() and it eventually holds > > the uart > > lock but the lock is held by another stopped CPU and it is a deadlock. By > > moving > > bust_spinlocks(0) after console_flush_on_panic(), let the console device > > drivers > > think the Oops is still in progress to call spin_trylock() instead of > > spin_lock() to > > avoid the deadlock. > > hm. Sergey, is this at all related to the UART printk deadlock change > which you're presently discussing in > http://lkml.kernel.org/r/20180615093919.559-1-sergey.senozhat...@gmail.com? Hi Andrew, Not exactly. The change I'm discussing is a little different - it's about re-entrant UART [+ circular locking in TTY], when UART deadlocks us because of printk()-s issued by MM/tty/WQ/sched/other core kernel stuff/etc Example: IRQ -> uart -> tty -> WQ -> printk -> uart -ss
Re: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
On Mon, 4 Jun 2018 14:45:57 +0900 Hoeun Ryu wrote: > From: Hoeun Ryu > > Many console device drivers hold the uart_port->lock spinlock with irq > enabled > (using spin_lock()) while the device drivers are writing characters to their > devices, > but the device drivers just try to hold the spin lock (using spin_trylock()) > if > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > There is a case ocurring a deadlock related to the lock and > oops_in_progress. A CPU > could be stopped by smp_send_stop() while it was holding the port lock > because irq was > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the lock > stays > locked forever. > > console_flush_on_panic() is called during panic() and it eventually holds > the uart > lock but the lock is held by another stopped CPU and it is a deadlock. By > moving > bust_spinlocks(0) after console_flush_on_panic(), let the console device > drivers > think the Oops is still in progress to call spin_trylock() instead of > spin_lock() to > avoid the deadlock. hm. Sergey, is this at all related to the UART printk deadlock change which you're presently discussing in http://lkml.kernel.org/r/20180615093919.559-1-sergey.senozhat...@gmail.com? > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) > if (_crash_kexec_post_notifiers) > __crash_kexec(NULL); > > - bust_spinlocks(0); > - > /* >* We may have ended up stopping the CPU holding the lock (in >* smp_send_stop()) while still having some valuable data in the console > @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) > debug_locks_off(); > console_flush_on_panic(); > > + bust_spinlocks(0); > + > if (!panic_blink) > panic_blink = no_blink; >
Re: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
On Mon, 4 Jun 2018 14:45:57 +0900 Hoeun Ryu wrote: > From: Hoeun Ryu > > Many console device drivers hold the uart_port->lock spinlock with irq > enabled > (using spin_lock()) while the device drivers are writing characters to their > devices, > but the device drivers just try to hold the spin lock (using spin_trylock()) > if > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > There is a case ocurring a deadlock related to the lock and > oops_in_progress. A CPU > could be stopped by smp_send_stop() while it was holding the port lock > because irq was > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the lock > stays > locked forever. > > console_flush_on_panic() is called during panic() and it eventually holds > the uart > lock but the lock is held by another stopped CPU and it is a deadlock. By > moving > bust_spinlocks(0) after console_flush_on_panic(), let the console device > drivers > think the Oops is still in progress to call spin_trylock() instead of > spin_lock() to > avoid the deadlock. hm. Sergey, is this at all related to the UART printk deadlock change which you're presently discussing in http://lkml.kernel.org/r/20180615093919.559-1-sergey.senozhat...@gmail.com? > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) > if (_crash_kexec_post_notifiers) > __crash_kexec(NULL); > > - bust_spinlocks(0); > - > /* >* We may have ended up stopping the CPU holding the lock (in >* smp_send_stop()) while still having some valuable data in the console > @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) > debug_locks_off(); > console_flush_on_panic(); > > + bust_spinlocks(0); > + > if (!panic_blink) > panic_blink = no_blink; >
RE: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
I misunderstood the cause of a deadlock. I sent v2 fixing the commit message about the reason of the deadlock. Please ignore this and review v2. Thank you. > -Original Message- > From: Steven Rostedt [mailto:rost...@goodmis.org] > Sent: Tuesday, June 05, 2018 10:44 AM > To: Hoeun Ryu > Cc: Andrew Morton ; Kees Cook > ; Borislav Petkov ; Andi Kleen > ; Josh Poimboeuf ; Hoeun Ryu > ; linux-kernel@vger.kernel.org; Tejun Heo > ; Vitaly Kuznetsov > Subject: Re: [PATCH] panic: move bust_spinlocks(0) after > console_flush_on_panic() to avoid deadlocks > > On Mon, 4 Jun 2018 14:45:57 +0900 > Hoeun Ryu wrote: > > > From: Hoeun Ryu > > > > Many console device drivers hold the uart_port->lock spinlock with irq > enabled > > (using spin_lock()) while the device drivers are writing characters to > their devices, > > but the device drivers just try to hold the spin lock (using > spin_trylock()) if > > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > > > There is a case ocurring a deadlock related to the lock and > oops_in_progress. A CPU > > could be stopped by smp_send_stop() while it was holding the port lock > because irq was > > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the > lock stays > > locked forever. > > > > console_flush_on_panic() is called during panic() and it eventually > holds the uart > > lock but the lock is held by another stopped CPU and it is a deadlock. > By moving > > bust_spinlocks(0) after console_flush_on_panic(), let the console device > drivers > > think the Oops is still in progress to call spin_trylock() instead of > spin_lock() to > > avoid the deadlock. > > > > Signed-off-by: Hoeun Ryu > > --- > > kernel/panic.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/panic.c b/kernel/panic.c > > index 42e4874..b4063b6 100644 > > --- a/kernel/panic.c > > +++ b/kernel/panic.c > > @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) > > if (_crash_kexec_post_notifiers) > > __crash_kexec(NULL); > > > > - bust_spinlocks(0); > > - > > /* > > * We may have ended up stopping the CPU holding the lock (in > > * smp_send_stop()) while still having some valuable data in the > console > > @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) > > debug_locks_off(); > > console_flush_on_panic(); > > > > + bust_spinlocks(0); > > Added a few more to Cc. This looks like it could have subtle > side-effects. I'd like those that have been touching the code around > here to have a look. > > -- Steve > > > > + > > if (!panic_blink) > > panic_blink = no_blink; > >
RE: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
I misunderstood the cause of a deadlock. I sent v2 fixing the commit message about the reason of the deadlock. Please ignore this and review v2. Thank you. > -Original Message- > From: Steven Rostedt [mailto:rost...@goodmis.org] > Sent: Tuesday, June 05, 2018 10:44 AM > To: Hoeun Ryu > Cc: Andrew Morton ; Kees Cook > ; Borislav Petkov ; Andi Kleen > ; Josh Poimboeuf ; Hoeun Ryu > ; linux-kernel@vger.kernel.org; Tejun Heo > ; Vitaly Kuznetsov > Subject: Re: [PATCH] panic: move bust_spinlocks(0) after > console_flush_on_panic() to avoid deadlocks > > On Mon, 4 Jun 2018 14:45:57 +0900 > Hoeun Ryu wrote: > > > From: Hoeun Ryu > > > > Many console device drivers hold the uart_port->lock spinlock with irq > enabled > > (using spin_lock()) while the device drivers are writing characters to > their devices, > > but the device drivers just try to hold the spin lock (using > spin_trylock()) if > > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > > > There is a case ocurring a deadlock related to the lock and > oops_in_progress. A CPU > > could be stopped by smp_send_stop() while it was holding the port lock > because irq was > > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the > lock stays > > locked forever. > > > > console_flush_on_panic() is called during panic() and it eventually > holds the uart > > lock but the lock is held by another stopped CPU and it is a deadlock. > By moving > > bust_spinlocks(0) after console_flush_on_panic(), let the console device > drivers > > think the Oops is still in progress to call spin_trylock() instead of > spin_lock() to > > avoid the deadlock. > > > > Signed-off-by: Hoeun Ryu > > --- > > kernel/panic.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/panic.c b/kernel/panic.c > > index 42e4874..b4063b6 100644 > > --- a/kernel/panic.c > > +++ b/kernel/panic.c > > @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) > > if (_crash_kexec_post_notifiers) > > __crash_kexec(NULL); > > > > - bust_spinlocks(0); > > - > > /* > > * We may have ended up stopping the CPU holding the lock (in > > * smp_send_stop()) while still having some valuable data in the > console > > @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) > > debug_locks_off(); > > console_flush_on_panic(); > > > > + bust_spinlocks(0); > > Added a few more to Cc. This looks like it could have subtle > side-effects. I'd like those that have been touching the code around > here to have a look. > > -- Steve > > > > + > > if (!panic_blink) > > panic_blink = no_blink; > >
Re: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
On Mon, 4 Jun 2018 14:45:57 +0900 Hoeun Ryu wrote: > From: Hoeun Ryu > > Many console device drivers hold the uart_port->lock spinlock with irq > enabled > (using spin_lock()) while the device drivers are writing characters to their > devices, > but the device drivers just try to hold the spin lock (using spin_trylock()) > if > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > There is a case ocurring a deadlock related to the lock and > oops_in_progress. A CPU > could be stopped by smp_send_stop() while it was holding the port lock > because irq was > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the lock > stays > locked forever. > > console_flush_on_panic() is called during panic() and it eventually holds > the uart > lock but the lock is held by another stopped CPU and it is a deadlock. By > moving > bust_spinlocks(0) after console_flush_on_panic(), let the console device > drivers > think the Oops is still in progress to call spin_trylock() instead of > spin_lock() to > avoid the deadlock. > > Signed-off-by: Hoeun Ryu > --- > kernel/panic.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/panic.c b/kernel/panic.c > index 42e4874..b4063b6 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) > if (_crash_kexec_post_notifiers) > __crash_kexec(NULL); > > - bust_spinlocks(0); > - > /* >* We may have ended up stopping the CPU holding the lock (in >* smp_send_stop()) while still having some valuable data in the console > @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) > debug_locks_off(); > console_flush_on_panic(); > > + bust_spinlocks(0); Added a few more to Cc. This looks like it could have subtle side-effects. I'd like those that have been touching the code around here to have a look. -- Steve > + > if (!panic_blink) > panic_blink = no_blink; >
Re: [PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
On Mon, 4 Jun 2018 14:45:57 +0900 Hoeun Ryu wrote: > From: Hoeun Ryu > > Many console device drivers hold the uart_port->lock spinlock with irq > enabled > (using spin_lock()) while the device drivers are writing characters to their > devices, > but the device drivers just try to hold the spin lock (using spin_trylock()) > if > "oops_in_progress" is equal or greater than 1 to avoid deadlocks. > > There is a case ocurring a deadlock related to the lock and > oops_in_progress. A CPU > could be stopped by smp_send_stop() while it was holding the port lock > because irq was > enabled. Once a CPU stops, it doesn't respond interrupts anymore and the lock > stays > locked forever. > > console_flush_on_panic() is called during panic() and it eventually holds > the uart > lock but the lock is held by another stopped CPU and it is a deadlock. By > moving > bust_spinlocks(0) after console_flush_on_panic(), let the console device > drivers > think the Oops is still in progress to call spin_trylock() instead of > spin_lock() to > avoid the deadlock. > > Signed-off-by: Hoeun Ryu > --- > kernel/panic.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/panic.c b/kernel/panic.c > index 42e4874..b4063b6 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) > if (_crash_kexec_post_notifiers) > __crash_kexec(NULL); > > - bust_spinlocks(0); > - > /* >* We may have ended up stopping the CPU holding the lock (in >* smp_send_stop()) while still having some valuable data in the console > @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) > debug_locks_off(); > console_flush_on_panic(); > > + bust_spinlocks(0); Added a few more to Cc. This looks like it could have subtle side-effects. I'd like those that have been touching the code around here to have a look. -- Steve > + > if (!panic_blink) > panic_blink = no_blink; >
[PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
From: Hoeun Ryu Many console device drivers hold the uart_port->lock spinlock with irq enabled (using spin_lock()) while the device drivers are writing characters to their devices, but the device drivers just try to hold the spin lock (using spin_trylock()) if "oops_in_progress" is equal or greater than 1 to avoid deadlocks. There is a case ocurring a deadlock related to the lock and oops_in_progress. A CPU could be stopped by smp_send_stop() while it was holding the port lock because irq was enabled. Once a CPU stops, it doesn't respond interrupts anymore and the lock stays locked forever. console_flush_on_panic() is called during panic() and it eventually holds the uart lock but the lock is held by another stopped CPU and it is a deadlock. By moving bust_spinlocks(0) after console_flush_on_panic(), let the console device drivers think the Oops is still in progress to call spin_trylock() instead of spin_lock() to avoid the deadlock. Signed-off-by: Hoeun Ryu --- kernel/panic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/panic.c b/kernel/panic.c index 42e4874..b4063b6 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) if (_crash_kexec_post_notifiers) __crash_kexec(NULL); - bust_spinlocks(0); - /* * We may have ended up stopping the CPU holding the lock (in * smp_send_stop()) while still having some valuable data in the console @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) debug_locks_off(); console_flush_on_panic(); + bust_spinlocks(0); + if (!panic_blink) panic_blink = no_blink; -- 2.1.4
[PATCH] panic: move bust_spinlocks(0) after console_flush_on_panic() to avoid deadlocks
From: Hoeun Ryu Many console device drivers hold the uart_port->lock spinlock with irq enabled (using spin_lock()) while the device drivers are writing characters to their devices, but the device drivers just try to hold the spin lock (using spin_trylock()) if "oops_in_progress" is equal or greater than 1 to avoid deadlocks. There is a case ocurring a deadlock related to the lock and oops_in_progress. A CPU could be stopped by smp_send_stop() while it was holding the port lock because irq was enabled. Once a CPU stops, it doesn't respond interrupts anymore and the lock stays locked forever. console_flush_on_panic() is called during panic() and it eventually holds the uart lock but the lock is held by another stopped CPU and it is a deadlock. By moving bust_spinlocks(0) after console_flush_on_panic(), let the console device drivers think the Oops is still in progress to call spin_trylock() instead of spin_lock() to avoid the deadlock. Signed-off-by: Hoeun Ryu --- kernel/panic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/panic.c b/kernel/panic.c index 42e4874..b4063b6 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -233,8 +233,6 @@ void panic(const char *fmt, ...) if (_crash_kexec_post_notifiers) __crash_kexec(NULL); - bust_spinlocks(0); - /* * We may have ended up stopping the CPU holding the lock (in * smp_send_stop()) while still having some valuable data in the console @@ -246,6 +244,8 @@ void panic(const char *fmt, ...) debug_locks_off(); console_flush_on_panic(); + bust_spinlocks(0); + if (!panic_blink) panic_blink = no_blink; -- 2.1.4