Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-21 Thread Daniel Wang
Just got back from vacation. Thanks for the continued discussion. Just so I understand the current state. Looks like we've got a pretty good explanation of what's going on (though not completely sure), and backporting Steven's patches is still the way to go? I see that Sergey had sent an RFC

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-11-01 Thread Daniel Wang
On Mon, Oct 22, 2018 at 3:10 AM Sergey Senozhatsky wrote: > Another deadlock scenario could be the following one: > > printk() > console_trylock() > down_trylock() > raw_spin_lock_irqsave(&sem->lock, flags) > > panic() >

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek wrote: > > On Tue 2018-10-02 21:23:27, Steven Rostedt wrote: > > I don't see the big deal of backporting this. The biggest complaints > > about backports are from fixes that were added to late -rc releases > > where the fixes didn't get much testing.

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
On Wed, Oct 3, 2018 at 10:37 AM Steven Rostedt wrote: > Just so I understand correctly. Does the panic hit with and without the > suggested backport patch? The only difference is that you get the full > output with the patch and limited output without it? When `softlockup_panic` is set (which is

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-02 Thread Daniel Wang
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek wrote: > Well, I still wonder why it helped and why you do not see it with 4.4. > I have a feeling that the console owner switch helped only by chance. So do I. I don't think Steven had the deadlock in mind when working on that patch, but with that

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-02 Thread Daniel Wang
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek wrote: > > Well, I still wonder why it helped and why you do not see it with 4.4. > I have a feeling that the console owner switch helped only by chance. > In fact, you might be affected by a race in > printk_safe_flush_on_panic() that was fixed by the

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
I wanted to let you know that I am leaving for a two-week vacation. So if you don't hear from me during that period assume bad network connectivity and not lack of enthusiasm. :) Feel free to go with the backports if we reach an agreement here. Otherwise I'll do it when I get back. Thank you all!

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-01 Thread Daniel Wang
On Mon, Oct 1, 2018 at 12:23 PM Steven Rostedt wrote: > > > Serial console logs leading up to the deadlock. As can be seen the stack > > trace > > was incomplete because the printing path hit a timeout. > > I'm fine with having this backported. Thanks. I can send the cherrypicks your way. Do

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-01 Thread Daniel Wang
On Mon, Oct 1, 2018 at 1:23 PM Vlastimil Babka wrote: > > On 10/1/18 10:13 PM, Pavel Machek wrote: > > > > Dunno. Is the patch perhaps a bit too complex? This is not exactly > > trivial bugfix. > > > > pavel@duo:/data/l/clean-cg$ git show dbdda842fe96f | diffstat > > printk.c | 108 > >

4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-09-27 Thread Daniel Wang
Prior to this change, the combination of `softlockup_panic=1` and `softlockup_all_cpu_backtrace=1` may result in a deadlock when the reboot path is trying to grab the console lock that is held by the stack trace printing path. What seems to be happening is that while there are multiple CPUs, only

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
On Wed, Dec 12, 2018 at 9:43 AM Sasha Levin wrote: > > On Wed, Dec 12, 2018 at 10:59:39PM +0900, Sergey Senozhatsky wrote: > >On (12/12/18 14:36), Petr Mladek wrote: > >> > OK, really didn't know that! I wasn't Cc-ed on that AUTOSEL email, > >> > and I wasn't Cc-ed on this whole discussion and

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
Thanks for the clarification. So I guess I don't need to start another thread for it? What are the next steps? On Wed, Dec 12, 2018 at 1:43 PM Sasha Levin wrote: > > On Wed, Dec 12, 2018 at 12:11:29PM -0800, Daniel Wang wrote: > >On Wed, Dec 12, 2018 at 9:43 AM Sasha

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
Thank you! On Wed, Dec 12, 2018 at 1:52 PM Sasha Levin wrote: > > On Wed, Dec 12, 2018 at 01:49:25PM -0800, Daniel Wang wrote: > >Thanks for the clarification. So I guess I don't need to start another > >thread for it? What are the next steps? > > Nothing here, I'll queu

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
klogd when passing console_lock owner. On Wed, Dec 12, 2018 at 1:56 PM Daniel Wang wrote: > > Thank you! > > On Wed, Dec 12, 2018 at 1:52 PM Sasha Levin wrote: > > > > On Wed, Dec 12, 2018 at 01:49:25PM -0800, Daniel Wang wrote: > > >Thanks for the clarification.

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
On Wed, Dec 12, 2018 at 6:27 PM Sergey Senozhatsky wrote: > > On (12/12/18 16:40), Daniel Wang wrote: > > In case this was buried in previous messages, the commit I'd like to > > get backported to 4.14 is dbdda842fe96f: printk: Add console owner and > > waiter logic to loa

Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

2018-12-11 Thread Daniel Wang
Is it okay to tag this commit with `Cc: sta...@vger.kernel.org` so that it'll get applied to the stable trees once merged into Linus's tree, if it's not too late? Otherwise I'll follow up on the stable merges separately. Thanks for making the changes anyway. On Thu, Nov 22, 2018 at 5:12 AM Petr

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-11 Thread Daniel Wang
ID and subject, which are both mentioned in this email. Should I send another one? What's the process like? Thanks! On Thu, Nov 8, 2018 at 10:47 PM Sergey Senozhatsky wrote: > > On (11/01/18 09:05), Daniel Wang wrote: > > > Another deadlock scenario could be the following one: > >

Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

2018-12-11 Thread Daniel Wang
No worries. I will follow up. You would recommend that all four patches in this set to be backported though, right? On Tue, Dec 11, 2018 at 9:23 PM Sergey Senozhatsky wrote: > > On (12/11/18 16:53), Daniel Wang wrote: > > Is it okay to tag this commit with `Cc: sta...@vger.ke

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-11 Thread Daniel Wang
2 patch sets to choose from for -stable, then -stable guys can pick up the one that requires less effort: 1 two-liner patch vs. 3 or 4 bigger patches. Which two sets are you referring to specifically? On Tue, Dec 11, 2018 at 9:21 PM Sergey Senozhatsky wrote: > > On (12/11/18 17:16), Daniel Wang

Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

2018-12-11 Thread Daniel Wang
Got it. Thank you. On Tue, Dec 11, 2018 at 10:06 PM Sergey Senozhatsky wrote: > > On (12/11/18 21:59), Daniel Wang wrote: > > No worries. I will follow up. You would recommend that all four > > patches in this set to be backported though, right? > > Just the last one,

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-28 Thread Daniel Wang
Thanks. I was able to confirm that commit c7c3f05e341a9a2bd alone fixed the problem for me. As expected, all 16 CPUs' stacktrace was printed, before a final panic stack dump and a successful reboot. [ 24.035044] Hogging a CPU now [ 48.200258] watchdog: BUG: soft lockup - CPU#3 stuck for 22s!
