Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On Fri 2016-09-30 09:48:32, Sergey Senozhatsky wrote: > On (09/29/16 13:28), Petr Mladek wrote: > > On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote: > > > On (09/27/16 18:02), Petr Mladek wrote: > > > > The main trick is that we replace the per-CPU function pointer > > > > by a preempt_count-like variable that could track the printk context. > > > > > > > > I know that Sergey has another ideas in this area. But I wanted to see > > > > how this approach would look like. > > > > > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I > > > think, the maintenance cost of that solution is just too high: > > > > > > a) every existing WARN_* in sched/timekeeping/who knows where else > > >must be evaluated to ensure that in can't be called from printk() > > >path. if `false' - then the corresponding macro must be replaced > > >with _DEFERRED flavor. > > > > > > b) any patch that adds new WARN_* usages must be additionally checked > > >to ensure that each of new WARN_* macros cannot be called from printk > > >path. if `false' -- the corresponding macro must be replaced with > > >_DEFERRED flavor. > > > > > > c) any patch that refactors the code or moves some function calls around > > >etc. must be additionally checked for any accidental WARN_* from printk > > >path. even though if none of the patches added any new WARN_* to the > > > code. > > > > > > b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not > > >necessarily newly added (see 'c'). > > > > > > > > > that's too much. > > > > > > it takes a lot of additional effort, because both reviewer and contributor > > > must consider printk() internals. and, what's worse, if something goes > > > unnoticed we end up having a printk() deadlock again. > > > > > > so I decided to address some of printk() issues in printk.c, not in > > > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. I do not longer see how this might be achieved. If a printk()/WARN() in the scheduler/timekeeping code can be reached from printk() then it might too be reached outside printk. In this case, printk() will not know about it and will happily call the scheduler/timekeeping code recursively. This might still cause deadlock. > > I see the point. > > well, just my 5 cents. > > > Your approach (alt buffer) adds some complexity to the printk code > > it does. > the other thing is that there are several ways to deadlock printk(). > alt_printk is addressing deadlocks that were caused by printk() > recursion only. > >printk() > acquire_lock() >printk() > acquire_lock() This looks theoretical. The recursion in printk() is not easily possible at the moment. It is prevented by logbuf_cpu check when logbug_lock is taken. It is prevented by console_trylock() when console_sem is taken. > which is a sub-set of all of the printk() deadlock scenarios. all of > the locks that printk() acquires can be taken outside of printk() path. > > for example, cat /proc/console locks the console_lock() for seq output. > thus we can have something like > > console_unlock() // lock >lock > up() > activate_task() > WARN_ON() > printk() > console_trylock() // lock >lock The WARN_ON() here is called under >pi_lock that is taken by try_to_wake_up(). This WARN_ON() can be triggered also outside printk()/console_unlock(). Therefore it needs to get replaced by WARN_DEFERRED() anyway. > DEFERRED_WARN is a good thing; it's just quite hard to keep everything > working, given that any of those "9 patches per hour" can break something > with just one WARN_ON(). > > > I assume that doing something like this > > #define WARN_ON(condition, format...) ({ \ > printk_deferred_enter();\ > WARN(condition, ##format); \ > printk_deferred_exit(); \ > }) > > is less than exciting because WARN_ON from irq won't immediately print > the backtrace anymore. Yup, we might need WARN_ON_DEFERRED() variant. Best Regards, Petr
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On Fri 2016-09-30 09:48:32, Sergey Senozhatsky wrote: > On (09/29/16 13:28), Petr Mladek wrote: > > On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote: > > > On (09/27/16 18:02), Petr Mladek wrote: > > > > The main trick is that we replace the per-CPU function pointer > > > > by a preempt_count-like variable that could track the printk context. > > > > > > > > I know that Sergey has another ideas in this area. But I wanted to see > > > > how this approach would look like. > > > > > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I > > > think, the maintenance cost of that solution is just too high: > > > > > > a) every existing WARN_* in sched/timekeeping/who knows where else > > >must be evaluated to ensure that in can't be called from printk() > > >path. if `false' - then the corresponding macro must be replaced > > >with _DEFERRED flavor. > > > > > > b) any patch that adds new WARN_* usages must be additionally checked > > >to ensure that each of new WARN_* macros cannot be called from printk > > >path. if `false' -- the corresponding macro must be replaced with > > >_DEFERRED flavor. > > > > > > c) any patch that refactors the code or moves some function calls around > > >etc. must be additionally checked for any accidental WARN_* from printk > > >path. even though if none of the patches added any new WARN_* to the > > > code. > > > > > > b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not > > >necessarily newly added (see 'c'). > > > > > > > > > that's too much. > > > > > > it takes a lot of additional effort, because both reviewer and contributor > > > must consider printk() internals. and, what's worse, if something goes > > > unnoticed we end up having a printk() deadlock again. > > > > > > so I decided to address some of printk() issues in printk.c, not in > > > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. I do not longer see how this might be achieved. If a printk()/WARN() in the scheduler/timekeeping code can be reached from printk() then it might too be reached outside printk. In this case, printk() will not know about it and will happily call the scheduler/timekeeping code recursively. This might still cause deadlock. > > I see the point. > > well, just my 5 cents. > > > Your approach (alt buffer) adds some complexity to the printk code > > it does. > the other thing is that there are several ways to deadlock printk(). > alt_printk is addressing deadlocks that were caused by printk() > recursion only. > >printk() > acquire_lock() >printk() > acquire_lock() This looks theoretical. The recursion in printk() is not easily possible at the moment. It is prevented by logbuf_cpu check when logbug_lock is taken. It is prevented by console_trylock() when console_sem is taken. > which is a sub-set of all of the printk() deadlock scenarios. all of > the locks that printk() acquires can be taken outside of printk() path. > > for example, cat /proc/console locks the console_lock() for seq output. > thus we can have something like > > console_unlock() // lock >lock > up() > activate_task() > WARN_ON() > printk() > console_trylock() // lock >lock The WARN_ON() here is called under >pi_lock that is taken by try_to_wake_up(). This WARN_ON() can be triggered also outside printk()/console_unlock(). Therefore it needs to get replaced by WARN_DEFERRED() anyway. > DEFERRED_WARN is a good thing; it's just quite hard to keep everything > working, given that any of those "9 patches per hour" can break something > with just one WARN_ON(). > > > I assume that doing something like this > > #define WARN_ON(condition, format...) ({ \ > printk_deferred_enter();\ > WARN(condition, ##format); \ > printk_deferred_exit(); \ > }) > > is less than exciting because WARN_ON from irq won't immediately print > the backtrace anymore. Yup, we might need WARN_ON_DEFERRED() variant. Best Regards, Petr
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On (09/29/16 13:28), Petr Mladek wrote: > On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote: > > On (09/27/16 18:02), Petr Mladek wrote: > > > The main trick is that we replace the per-CPU function pointer > > > by a preempt_count-like variable that could track the printk context. > > > > > > I know that Sergey has another ideas in this area. But I wanted to see > > > how this approach would look like. > > > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I > > think, the maintenance cost of that solution is just too high: > > > > a) every existing WARN_* in sched/timekeeping/who knows where else > >must be evaluated to ensure that in can't be called from printk() > >path. if `false' - then the corresponding macro must be replaced > >with _DEFERRED flavor. > > > > b) any patch that adds new WARN_* usages must be additionally checked > >to ensure that each of new WARN_* macros cannot be called from printk > >path. if `false' -- the corresponding macro must be replaced with > >_DEFERRED flavor. > > > > c) any patch that refactors the code or moves some function calls around > >etc. must be additionally checked for any accidental WARN_* from printk > >path. even though if none of the patches added any new WARN_* to the > > code. > > > > b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not > >necessarily newly added (see 'c'). > > > > > > that's too much. > > > > it takes a lot of additional effort, because both reviewer and contributor > > must consider printk() internals. and, what's worse, if something goes > > unnoticed we end up having a printk() deadlock again. > > > > so I decided to address some of printk() issues in printk.c, not in > > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. > > I see the point. well, just my 5 cents. > Your approach (alt buffer) adds some complexity to the printk code it does. the other thing is that there are several ways to deadlock printk(). alt_printk is addressing deadlocks that were caused by printk() recursion only. printk() acquire_lock() printk() acquire_lock() which is a sub-set of all of the printk() deadlock scenarios. all of the locks that printk() acquires can be taken outside of printk() path. for example, cat /proc/console locks the console_lock() for seq output. thus we can have something like console_unlock()// lock >lock up() activate_task() WARN_ON() printk() console_trylock() // lock >lock DEFERRED_WARN is a good thing; it's just quite hard to keep everything working, given that any of those "9 patches per hour" can break something with just one WARN_ON(). I assume that doing something like this #define WARN_ON(condition, format...) ({\ printk_deferred_enter();\ WARN(condition, ##format); \ printk_deferred_exit(); \ }) is less than exciting because WARN_ON from irq won't immediately print the backtrace anymore. thoughts? > but it allows to remove printk_deferred()/WARN_DEFERRED() and all > the risk of it. at some point we even can drop the entire deferred_printk() thing. but alt_printk needs some love and care first. > I am going to look closely on it. thanks. -ss
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On (09/29/16 13:28), Petr Mladek wrote: > On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote: > > On (09/27/16 18:02), Petr Mladek wrote: > > > The main trick is that we replace the per-CPU function pointer > > > by a preempt_count-like variable that could track the printk context. > > > > > > I know that Sergey has another ideas in this area. But I wanted to see > > > how this approach would look like. > > > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I > > think, the maintenance cost of that solution is just too high: > > > > a) every existing WARN_* in sched/timekeeping/who knows where else > >must be evaluated to ensure that in can't be called from printk() > >path. if `false' - then the corresponding macro must be replaced > >with _DEFERRED flavor. > > > > b) any patch that adds new WARN_* usages must be additionally checked > >to ensure that each of new WARN_* macros cannot be called from printk > >path. if `false' -- the corresponding macro must be replaced with > >_DEFERRED flavor. > > > > c) any patch that refactors the code or moves some function calls around > >etc. must be additionally checked for any accidental WARN_* from printk > >path. even though if none of the patches added any new WARN_* to the > > code. > > > > b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not > >necessarily newly added (see 'c'). > > > > > > that's too much. > > > > it takes a lot of additional effort, because both reviewer and contributor > > must consider printk() internals. and, what's worse, if something goes > > unnoticed we end up having a printk() deadlock again. > > > > so I decided to address some of printk() issues in printk.c, not in > > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. > > I see the point. well, just my 5 cents. > Your approach (alt buffer) adds some complexity to the printk code it does. the other thing is that there are several ways to deadlock printk(). alt_printk is addressing deadlocks that were caused by printk() recursion only. printk() acquire_lock() printk() acquire_lock() which is a sub-set of all of the printk() deadlock scenarios. all of the locks that printk() acquires can be taken outside of printk() path. for example, cat /proc/console locks the console_lock() for seq output. thus we can have something like console_unlock()// lock >lock up() activate_task() WARN_ON() printk() console_trylock() // lock >lock DEFERRED_WARN is a good thing; it's just quite hard to keep everything working, given that any of those "9 patches per hour" can break something with just one WARN_ON(). I assume that doing something like this #define WARN_ON(condition, format...) ({\ printk_deferred_enter();\ WARN(condition, ##format); \ printk_deferred_exit(); \ }) is less than exciting because WARN_ON from irq won't immediately print the backtrace anymore. thoughts? > but it allows to remove printk_deferred()/WARN_DEFERRED() and all > the risk of it. at some point we even can drop the entire deferred_printk() thing. but alt_printk needs some love and care first. > I am going to look closely on it. thanks. -ss
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote: > On (09/27/16 18:02), Petr Mladek wrote: > > The main trick is that we replace the per-CPU function pointer > > by a preempt_count-like variable that could track the printk context. > > > > I know that Sergey has another ideas in this area. But I wanted to see > > how this approach would look like. > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I > think, the maintenance cost of that solution is just too high: > > a) every existing WARN_* in sched/timekeeping/who knows where else >must be evaluated to ensure that in can't be called from printk() >path. if `false' - then the corresponding macro must be replaced >with _DEFERRED flavor. > > b) any patch that adds new WARN_* usages must be additionally checked >to ensure that each of new WARN_* macros cannot be called from printk >path. if `false' -- the corresponding macro must be replaced with >_DEFERRED flavor. > > c) any patch that refactors the code or moves some function calls around >etc. must be additionally checked for any accidental WARN_* from printk >path. even though if none of the patches added any new WARN_* to the code. > > b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not >necessarily newly added (see 'c'). > > > that's too much. > > it takes a lot of additional effort, because both reviewer and contributor > must consider printk() internals. and, what's worse, if something goes > unnoticed we end up having a printk() deadlock again. > > so I decided to address some of printk() issues in printk.c, not in > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. I see the point. Your approach (alt buffer) adds some complexity to the printk code but it allows to remove printk_deferred()/WARN_DEFERRED() and all the risk of it. I am going to look closely on it. Best Regards, Petr
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On Wed 2016-09-28 10:18:45, Sergey Senozhatsky wrote: > On (09/27/16 18:02), Petr Mladek wrote: > > The main trick is that we replace the per-CPU function pointer > > by a preempt_count-like variable that could track the printk context. > > > > I know that Sergey has another ideas in this area. But I wanted to see > > how this approach would look like. > > well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I > think, the maintenance cost of that solution is just too high: > > a) every existing WARN_* in sched/timekeeping/who knows where else >must be evaluated to ensure that in can't be called from printk() >path. if `false' - then the corresponding macro must be replaced >with _DEFERRED flavor. > > b) any patch that adds new WARN_* usages must be additionally checked >to ensure that each of new WARN_* macros cannot be called from printk >path. if `false' -- the corresponding macro must be replaced with >_DEFERRED flavor. > > c) any patch that refactors the code or moves some function calls around >etc. must be additionally checked for any accidental WARN_* from printk >path. even though if none of the patches added any new WARN_* to the code. > > b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not >necessarily newly added (see 'c'). > > > that's too much. > > it takes a lot of additional effort, because both reviewer and contributor > must consider printk() internals. and, what's worse, if something goes > unnoticed we end up having a printk() deadlock again. > > so I decided to address some of printk() issues in printk.c, not in > kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. I see the point. Your approach (alt buffer) adds some complexity to the printk code but it allows to remove printk_deferred()/WARN_DEFERRED() and all the risk of it. I am going to look closely on it. Best Regards, Petr
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On (09/27/16 18:02), Petr Mladek wrote: > The main trick is that we replace the per-CPU function pointer > by a preempt_count-like variable that could track the printk context. > > I know that Sergey has another ideas in this area. But I wanted to see > how this approach would look like. well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I think, the maintenance cost of that solution is just too high: a) every existing WARN_* in sched/timekeeping/who knows where else must be evaluated to ensure that in can't be called from printk() path. if `false' - then the corresponding macro must be replaced with _DEFERRED flavor. b) any patch that adds new WARN_* usages must be additionally checked to ensure that each of new WARN_* macros cannot be called from printk path. if `false' -- the corresponding macro must be replaced with _DEFERRED flavor. c) any patch that refactors the code or moves some function calls around etc. must be additionally checked for any accidental WARN_* from printk path. even though if none of the patches added any new WARN_* to the code. b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not necessarily newly added (see 'c'). that's too much. for example [not blaming anyone], a recent patch [2] that added a reasonable WARN_ON_ONCE to assert_clock_updated() which, however, can result in a possible printk() deadlock scenario that you, Petr, outlined [3]: :+ printk() : + vprintk_func -> vprintk_default() :+ vprinkt_emit() : + console_unlock() :+ up_console_sem() : + up()# takes >lock :+ __up() : + wake_up_process() :+ try_to_wake_up() : + ttwu_queue() :+ ttwu_do_activate() : + ttwu_do_wakeup() :+ rq_clock() : + lockdep_assert_held() :+ WARN_ON_ONCE() : + printk() :+ vprintk_func -> vprintk_default() : + vprintk_emit() :+ console_try_lock() : + down_trylock_console_sem() :+ __down_trylock_console_sem() : + down_trylock() it takes a lot of additional effort, because both reviewer and contributor must consider printk() internals. and, what's worse, if something goes unnoticed we end up having a printk() deadlock again. so I decided to address some of printk() issues in printk.c, not in kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. > Mid-air collision: > > I have just realized that Sergey sent another patchset that was > more generic, complicated, and had some similarities, see > https://lkml.kernel.org/r/20160927142237.5539-1-sergey.senozhat...@gmail.com yeah, I should have Cc-ed a wider audience. do I need to resend the patch set with the `extended' Cc list? [1] https://marc.info/?l=linux-kernel=147158843319944 [2] https://marc.info/?l=linux-kernel=147446511924573 [3] https://marc.info/?l=linux-kernel=147447352127741 -ss
Re: [RFC 0/5] printk: Implement WARN_*DEFERRED()
On (09/27/16 18:02), Petr Mladek wrote: > The main trick is that we replace the per-CPU function pointer > by a preempt_count-like variable that could track the printk context. > > I know that Sergey has another ideas in this area. But I wanted to see > how this approach would look like. well, yes. I was looking at WARN_*_DEFERRED [1] for some time, and, I think, the maintenance cost of that solution is just too high: a) every existing WARN_* in sched/timekeeping/who knows where else must be evaluated to ensure that in can't be called from printk() path. if `false' - then the corresponding macro must be replaced with _DEFERRED flavor. b) any patch that adds new WARN_* usages must be additionally checked to ensure that each of new WARN_* macros cannot be called from printk path. if `false' -- the corresponding macro must be replaced with _DEFERRED flavor. c) any patch that refactors the code or moves some function calls around etc. must be additionally checked for any accidental WARN_* from printk path. even though if none of the patches added any new WARN_* to the code. b) apart from WARN_* there can be `accidental' pr_err/pr_debug/etc. not necessarily newly added (see 'c'). that's too much. for example [not blaming anyone], a recent patch [2] that added a reasonable WARN_ON_ONCE to assert_clock_updated() which, however, can result in a possible printk() deadlock scenario that you, Petr, outlined [3]: :+ printk() : + vprintk_func -> vprintk_default() :+ vprinkt_emit() : + console_unlock() :+ up_console_sem() : + up()# takes >lock :+ __up() : + wake_up_process() :+ try_to_wake_up() : + ttwu_queue() :+ ttwu_do_activate() : + ttwu_do_wakeup() :+ rq_clock() : + lockdep_assert_held() :+ WARN_ON_ONCE() : + printk() :+ vprintk_func -> vprintk_default() : + vprintk_emit() :+ console_try_lock() : + down_trylock_console_sem() :+ __down_trylock_console_sem() : + down_trylock() it takes a lot of additional effort, because both reviewer and contributor must consider printk() internals. and, what's worse, if something goes unnoticed we end up having a printk() deadlock again. so I decided to address some of printk() issues in printk.c, not in kernel/time/timekeeping.c or kernel/sched/core.c or anywhere else. > Mid-air collision: > > I have just realized that Sergey sent another patchset that was > more generic, complicated, and had some similarities, see > https://lkml.kernel.org/r/20160927142237.5539-1-sergey.senozhat...@gmail.com yeah, I should have Cc-ed a wider audience. do I need to resend the patch set with the `extended' Cc list? [1] https://marc.info/?l=linux-kernel=147158843319944 [2] https://marc.info/?l=linux-kernel=147446511924573 [3] https://marc.info/?l=linux-kernel=147447352127741 -ss