Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-28 Thread Daniel Wang
Thanks. I was able to confirm that commit c7c3f05e341a9a2bd alone
fixed the problem for me. As expected, all 16 CPUs' stacktrace was
printed, before a final panic stack dump and a successful reboot.

[   24.035044] Hogging a CPU now
[   48.200258] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lockme:1102]
[   48.207371] Modules linked in: lockme(O) ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 xt_addrtype nf_nat
br_netfilter ip6table_filter ip6_tables aesni_intel aes_x86_64
crypto_simd cryptd glue_helper
[   48.226613] CPU: 3 PID: 1102 Comm: lockme Tainted: G   O
4.14.79 #33
[   48.234057] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   48.243388] task: a3da1bd7 task.stack: c04e077e
[   48.249425] RIP: 0010:hog_thread+0x13/0x1000 [lockme]
[   48.255197] RSP: 0018:c04e077e3f10 EFLAGS: 0282 ORIG_RAX:
ff10
[   48.262879] RAX: 0011 RBX: a3da362ffa80 RCX: 
[   48.270131] RDX: a3da432dd740 RSI: a3da432d54f8 RDI: a3da432d54f8
[   48.277382] RBP: c04e077e3f48 R08: 0030 R09: 
[   48.284629] R10: 0358 R11:  R12: a3da33f7c940
[   48.291881] R13: c04e079b7c58 R14:  R15: a3da362ffac8
[   48.299134] FS:  () GS:a3da432c()
knlGS:
[   48.307338] CS:  0010 DS:  ES:  CR0: 80050033
[   48.313200] CR2: 7f0142c77e5d CR3: 000b10e12002 CR4: 003606a0
[   48.320455] DR0:  DR1:  DR2: 
[   48.327705] DR3:  DR6: fffe0ff0 DR7: 0400
[   48.334955] Call Trace:
[   48.337534]  kthread+0x127/0x160
[   48.340878]  ? 0xc04bc000
[   48.344315]  ? kthread_create_on_node+0x40/0x40
[   48.348962]  ret_from_fork+0x35/0x40
[   48.352655] Code:  fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00
[   48.360712] Sending NMI from CPU 3 to CPUs 0-2,4-15:
[   48.365891] NMI backtrace for cpu 5
[   48.365892] CPU: 5 PID: 963 Comm: dd Tainted: G   O4.14.79 #33
[   48.365892] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   48.365893] task: a3da2e769c80 task.stack: c04e072dc000
[   48.365894] RIP: 0010:chacha20_block+0x203/0x350
[   48.365894] RSP: 0018:c04e072dfd08 EFLAGS: 0086
[   48.365895] RAX: 430aa37c RBX: 8849a559 RCX: 1e380d02
[   48.365896] RDX: f37255aa RSI: 430aa37c RDI: d39ce109
[   48.365896] RBP: 242dad92 R08: 942a2b36 R09: 6df44375
[   48.365897] R10: 7f47d158 R11: 80fde9af R12: 92e47c5e
[   48.365897] R13: ed09aada R14: c6fd956d R15: 1bb4deeb
[   48.365898] FS:  7f074a4a6700() GS:a3da4334()
knlGS:
[   48.365899] CS:  0010 DS:  ES:  CR0: 80050033
[   48.365899] CR2: 55a35c5b0520 CR3: 000edc93 CR4: 003606a0
[   48.365900] DR0:  DR1:  DR2: 
[   48.365900] DR3:  DR6: fffe0ff0 DR7: 0400
[   48.365900] Call Trace:
[   48.365901]  _extract_crng+0xdb/0x130
[   48.365901]  crng_backtrack_protect+0xb3/0xc0
[   48.365902]  urandom_read+0x13b/0x2c0
[   48.365902]  vfs_read+0xad/0x170
[   48.365903]  SyS_read+0x4b/0xa0
[   48.365903]  ? __audit_syscall_exit+0x21e/0x2c0
[   48.365904]  do_syscall_64+0x70/0x200
[   48.365904]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   48.365904] RIP: 0033:0x7f0749e7c410
[   48.365905] RSP: 002b:7ffd69532b18 EFLAGS: 0246 ORIG_RAX:

[   48.365906] RAX: ffda RBX: 00024ab3 RCX: 7f0749e7c410
[   48.365906] RDX: 0400 RSI: 55e440267000 RDI: 
[   48.365907] RBP: 7ffd69532b40 R08:  R09: 000d
[   48.365907] R10: f000 R11: 0246 R12: 
[   48.365908] R13: 7f074a4a6690 R14: 0400 R15: 55e440267000
[   48.365908] Code: c0 10 31 f0 01 d3 89 74 24 08 41 89 c7 8b 44 24
0c 41 31 dc 41 c1 c4 0c 46 8d 0c 1f 45 89 eb 41 c1 c7 0c 44 01 c0 45
31 cb 89 c6 <89> e8 41 c1 c3 08 89 74 24 0c 31 f0 41 8d 34 0c 8b 4c 24
10 c1
[   48.365921] NMI backtrace for cpu 13
[   48.365923] CPU: 13 PID: 967 Comm: dd Tainted: G   O4.14.79 #33
[   48.365924] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   48.365924] task: a3da1c8b task.stack: c04e07798000
[   48.365925] RIP: 0010:native_queued_spin_lock_slowpath+0xce/0x1b0
[   48.365925] RSP: 0018:c04e0779bda8 EFLAGS: 0002
[   48.365926] RAX: 0001 RBX: adecd3c8 RCX: 
[   48.365926] RDX: 0001 RSI: 0001 RDI: adecd3c8
[   48.365927] RBP: 

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
> Additionally, for dbdda842fe96f to work as expected we really
need fd5f7cde1b85d4c. Otherwise printk() can schedule under
console_sem and console_owner, which will deactivate the "load
balance" logic.

It looks like fd5f7cde1b85d4c got into 4.14.82 that was released last month.

On Wed, Dec 12, 2018 at 6:27 PM Sergey Senozhatsky
 wrote:
>
> On (12/12/18 16:40), Daniel Wang wrote:
> > In case this was buried in previous messages, the commit I'd like to
> > get backported to 4.14 is dbdda842fe96f: printk: Add console owner and
> > waiter logic to load balance console writes. But another followup
> > patch that fixes a bug in that patch is also required. That is
> > c14376de3a1b: printk: Wake klogd when passing console_lock owner.
>
> Additionally, for dbdda842fe96f to work as expected we really
> need fd5f7cde1b85d4c. Otherwise printk() can schedule under
> console_sem and console_owner, which will deactivate the "load
> balance" logic.
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
In case this was buried in previous messages, the commit I'd like to
get backported to 4.14 is dbdda842fe96f: printk: Add console owner and
waiter logic to load balance console writes. But another followup
patch that fixes a bug in that patch is also required. That is
c14376de3a1b: printk: Wake klogd when passing console_lock owner.

On Wed, Dec 12, 2018 at 1:56 PM Daniel Wang  wrote:
>
> Thank you!
>
> On Wed, Dec 12, 2018 at 1:52 PM Sasha Levin  wrote:
> >
> > On Wed, Dec 12, 2018 at 01:49:25PM -0800, Daniel Wang wrote:
> > >Thanks for the clarification. So I guess I don't need to start another
> > >thread for it? What are the next steps?
> >
> > Nothing here, I'll queue it once Sergey or Petr clarify if they wanted
> > additional information in the -stable commit message.
> >
> > --
> > Thanks,
> > Sasha
> >
>
>
> --
> Best,
> Daniel



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
Thank you!

On Wed, Dec 12, 2018 at 1:52 PM Sasha Levin  wrote:
>
> On Wed, Dec 12, 2018 at 01:49:25PM -0800, Daniel Wang wrote:
> >Thanks for the clarification. So I guess I don't need to start another
> >thread for it? What are the next steps?
>
> Nothing here, I'll queue it once Sergey or Petr clarify if they wanted
> additional information in the -stable commit message.
>
> --
> Thanks,
> Sasha
>


-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
Thanks for the clarification. So I guess I don't need to start another
thread for it? What are the next steps?

On Wed, Dec 12, 2018 at 1:43 PM Sasha Levin  wrote:
>
> On Wed, Dec 12, 2018 at 12:11:29PM -0800, Daniel Wang wrote:
> >On Wed, Dec 12, 2018 at 9:43 AM Sasha Levin  wrote:
> >>
> >> On Wed, Dec 12, 2018 at 10:59:39PM +0900, Sergey Senozhatsky wrote:
> >> >On (12/12/18 14:36), Petr Mladek wrote:
> >> >> > OK, really didn't know that! I wasn't Cc-ed on that AUTOSEL email,
> >> >> > and I wasn't Cc-ed on this whole discussion and found it purely
> >> >> > accidentally while browsing linux-mm list.
> >> >>
> >> >> I am sorry that I did not CC you. There were so many people in CC.
> >> >> I expected that all people mentioned in the related commit message
> >> >> were included by default.
> >> >
> >> >No worries! I'm not blaming anyone.
> >> >
> >> >> > So if you are willing to backport this set to -stable, then I wouldn't
> >> >> > mind, probably would be more correct if we don't advertise this as a
> >> >> > "panic() deadlock fix"
> >> >>
> >> >> This should not be a problem. I guess that stable does not modify
> >> >> the original commit messages unless there is a change.
> >> >
> >> >Agreed.
> >>
> >> I'll be happy to add anything you want to the commit message. Do you
> >> have a blurb you want to use?
> >
> >If we still get to amend the commit message, I'd like to add "Cc:
> >sta...@vger.kernel.org" in the sign-off area. According to
> >https://www.kernel.org/doc/html/v4.12/process/stable-kernel-rules.html#option-1
> >patches with that tag will be automatically applied to -stable trees.
> >It's unclear though if it'll get applied to ALL -stable trees. For my
> >request, I care at least about 4.19 and 4.14. So maybe we can add two
> >lines, "Cc:  # 4.14.x" and "Cc:
> > # 4.19.x".
>
> We can't change the original commit message (but that's fine, the
> purpose of that tag is to let us know that this commit should go in
> stable - and no we do :) ).
>
> I was under the impression that Sergey or Petr wanted to add more
> information about the purpose of this patch and the issue it solves.
>
> --
> Thanks,
> Sasha



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-12 Thread Daniel Wang
On Wed, Dec 12, 2018 at 9:43 AM Sasha Levin  wrote:
>
> On Wed, Dec 12, 2018 at 10:59:39PM +0900, Sergey Senozhatsky wrote:
> >On (12/12/18 14:36), Petr Mladek wrote:
> >> > OK, really didn't know that! I wasn't Cc-ed on that AUTOSEL email,
> >> > and I wasn't Cc-ed on this whole discussion and found it purely
> >> > accidentally while browsing linux-mm list.
> >>
> >> I am sorry that I did not CC you. There were so many people in CC.
> >> I expected that all people mentioned in the related commit message
> >> were included by default.
> >
> >No worries! I'm not blaming anyone.
> >
> >> > So if you are willing to backport this set to -stable, then I wouldn't
> >> > mind, probably would be more correct if we don't advertise this as a
> >> > "panic() deadlock fix"
> >>
> >> This should not be a problem. I guess that stable does not modify
> >> the original commit messages unless there is a change.
> >
> >Agreed.
>
> I'll be happy to add anything you want to the commit message. Do you
> have a blurb you want to use?

If we still get to amend the commit message, I'd like to add "Cc:
sta...@vger.kernel.org" in the sign-off area. According to
https://www.kernel.org/doc/html/v4.12/process/stable-kernel-rules.html#option-1
patches with that tag will be automatically applied to -stable trees.
It's unclear though if it'll get applied to ALL -stable trees. For my
request, I care at least about 4.19 and 4.14. So maybe we can add two
lines, "Cc:  # 4.14.x" and "Cc:
 # 4.19.x".

>
> --
> Thanks,
> Sasha



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

2018-12-11 Thread Daniel Wang
Got it. Thank you.

On Tue, Dec 11, 2018 at 10:06 PM Sergey Senozhatsky
 wrote:
>
> On (12/11/18 21:59), Daniel Wang wrote:
> > No worries. I will follow up. You would recommend that all four
> > patches in this set to be backported though, right?
>
> Just the last one, which makes consoles re-entrant
>
> panic: avoid deadlocks in re-entrant console drivers
>
> Because only this one was applied so far. The reset got stuck.
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-11 Thread Daniel Wang
> So... did my patch address the deadlock you are seeing or it didn't?

I've been meaning to try it but kept getting distracted by other
things. I'll try to find some time for it this week or next. Right now
my intent is to get Steven's patch into 4.14 stable as it evidently
fixed the particular issue I was seeing, and as Steven said it has
been in upstream since 4.16 so it's not like backporting it will raise
any red flags. I will start another thread on -stable for it.

> I guess we still don't have a really clear understanding of what exactly
is going in your system

I would also like to get to the bottom of it. Unfortunately I haven't
got the expertise in this area nor the time to do it yet. Hence the
intent to take a step back and backport Steven's patch to fix the
issue that has resurfaced in our production recently.

> If it's uart_port->lock and there will be 2 patch sets to choose from
for -stable, then -stable guys can pick up the one that requires less
effort: 1 two-liner patch vs. 3 or 4 bigger patches.

Which two sets are you referring to specifically?

On Tue, Dec 11, 2018 at 9:21 PM Sergey Senozhatsky
 wrote:
>
> On (12/11/18 17:16), Daniel Wang wrote:
> > > Let's first figure out if it works.
> >
> > I would still like to try applying your patches that went into
> > printk.git, but for now I wonder if we can get Steven's patch into
> > 4.14 first, for at least we know it mitigated the issue if not
> > fundamentally addressed it, and we've agreed it's an innocuous change
> > that doesn't risk breaking stable.
>
> So... did my patch address the deadlock you are seeing or it didn't?
>
> > I haven't done this before so I'll need your help. What's the next
> > step to actually get Steven's patch *in* linux-4.14.y? According to
> > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > I am supposed to send an email with the patch ID and subject, which
> > are both mentioned in this email. Should I send another one? What's
> > the process like? Thanks!
>
> I'm not doing any -stable releases, so can't really answer, sorry.
> Probably would be better to re-address this question to 4.14 -stable
> maintainers.
>
>
> ---
> I guess we still don't have a really clear understanding of what exactly
> is going in your system. We don't even know for sure which one of the locks
> is deadlocking the system. And why exactly Steven's patch helps. If it
> is uart_port->lock, then it's one thing; if it's console_sem ->lock then
> it's another thing. But those two are just theories, not supported by any
> logs/backtraces from your systems.
>
> If it's uart_port->lock and there will be 2 patch sets to choose from
> for -stable, then -stable guys can pick up the one that requires less
> effort: 1 two-liner patch vs. 3 or 4 bigger patches.
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

2018-12-11 Thread Daniel Wang
No worries. I will follow up. You would recommend that all four
patches in this set to be backported though, right?

On Tue, Dec 11, 2018 at 9:23 PM Sergey Senozhatsky
 wrote:
>
> On (12/11/18 16:53), Daniel Wang wrote:
> > Is it okay to tag this commit with `Cc: sta...@vger.kernel.org` so
> > that it'll get applied to the stable trees once merged into Linux's
> > tree, if it's not too late? Otherwise I'll follow up on the stable
> > merges separately. Thanks for making the changes anyway.
>
> My bad! I should have Cc-ed stable on that one, sorry. I guess now it
> would be easier to wait and follow up once it gets merged.
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-12-11 Thread Daniel Wang
> Let's first figure out if it works.

I would still like to try applying your patches that went into
printk.git, but for now I wonder if we can get Steven's patch into
4.14 first, for at least we know it mitigated the issue if not
fundamentally addressed it, and we've agreed it's an innocuous change
that doesn't risk breaking stable.

I haven't done this before so I'll need your help. What's the next
step to actually get Steven's patch *in* linux-4.14.y? According to
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
I am supposed to send an email with the patch ID and subject, which
are both mentioned in this email. Should I send another one? What's
the process like? Thanks!

On Thu, Nov 8, 2018 at 10:47 PM Sergey Senozhatsky
 wrote:
>
> On (11/01/18 09:05), Daniel Wang wrote:
> > > Another deadlock scenario could be the following one:
> > >
> > > printk()
> > >  console_trylock()
> > >   down_trylock()
> > >raw_spin_lock_irqsave(>lock, flags)
> > > 
> > >  panic()
> > >   console_flush_on_panic()
> > >console_trylock()
> > > raw_spin_lock_irqsave(>lock, flags)// 
> > > deadlock
> > >
> > > There are no patches addressing this one at the moment. And it's
> > > unclear if you are hitting this scenario.
> >
> > I am not sure, but Steven's patches did make the deadlock I saw go away...
>
> You certainly can find cases when "busy spin on console_sem owner" logic
> can reduce some possibilities.
>
> But spin_lock(); NMI; spin_lock(); code is still in the kernel.
>
> > A little swamped by other things lately but I'll run a test with it.
> > If it works, would you recommend taking your patch alone
>
> Let's first figure out if it works.
>
> -ss



--
Best,
Daniel

On Thu, Nov 8, 2018 at 10:47 PM Sergey Senozhatsky
 wrote:
>
> On (11/01/18 09:05), Daniel Wang wrote:
> > > Another deadlock scenario could be the following one:
> > >
> > > printk()
> > >  console_trylock()
> > >   down_trylock()
> > >raw_spin_lock_irqsave(>lock, flags)
> > > 
> > >  panic()
> > >   console_flush_on_panic()
> > >console_trylock()
> > > raw_spin_lock_irqsave(>lock, flags)// 
> > > deadlock
> > >
> > > There are no patches addressing this one at the moment. And it's
> > > unclear if you are hitting this scenario.
> >
> > I am not sure, but Steven's patches did make the deadlock I saw go away...
>
> You certainly can find cases when "busy spin on console_sem owner" logic
> can reduce some possibilities.
>
> But spin_lock(); NMI; spin_lock(); code is still in the kernel.
>
> > A little swamped by other things lately but I'll run a test with it.
> > If it works, would you recommend taking your patch alone
>
> Let's first figure out if it works.
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PATCHv3] panic: avoid deadlocks in re-entrant console drivers

2018-12-11 Thread Daniel Wang
Is it okay to tag this commit with `Cc: sta...@vger.kernel.org` so
that it'll get applied to the stable trees once merged into Linux's
tree, if it's not too late? Otherwise I'll follow up on the stable
merges separately. Thanks for making the changes anyway.

On Thu, Nov 22, 2018 at 5:12 AM Petr Mladek  wrote:
>
> On Thu 2018-11-01 09:08:08, Petr Mladek wrote:
> > On Thu 2018-11-01 10:48:21, Sergey Senozhatsky wrote:
> > > On (10/31/18 13:27), Petr Mladek wrote:
> > > > >
> > > > > Signed-off-by: Sergey Senozhatsky 
> > > >
> > > > The patch makes sense to me. The locks should stay busted also for
> > > > console_flush_on_panic().
> > > >
> > > > With the added #include :
> > > >
> > > > Reviewed-by: Petr Mladek 
> > >
> > > Thanks!
> > >
> > > Since there are no objections - how shall we route it? Via printk tree?
> >
> > Good question. OK, I am going to put it into printk.git unless I hear
> > complains withing next couple of days.
>
> I have pushed this into printk.git, branch for-4.21.
>
> Best Regards,
> Petr



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-11-01 Thread Daniel Wang
On Mon, Oct 22, 2018 at 3:10 AM Sergey Senozhatsky
 wrote:

> Another deadlock scenario could be the following one:
>
> printk()
>  console_trylock()
>   down_trylock()
>raw_spin_lock_irqsave(>lock, flags)
> 
>  panic()
>   console_flush_on_panic()
>console_trylock()
> raw_spin_lock_irqsave(>lock, flags)// deadlock
>
> There are no patches addressing this one at the moment. And it's
> unclear if you are hitting this scenario.

I am not sure, but Steven's patches did make the deadlock I saw go away...

>
>
> > I see that Sergey had sent an RFC series for similar things. Are those
> > trying to solve the deadlock problem in a different way?
>
> Umm, I wouldn't call it "another way". It turns non-reentrant serial
> consoles to re-entrable ones. Did you test patch [1] from that series
> on you environment, by the way?

A little swamped by other things lately but I'll run a test with it.
If it works, would you recommend taking your patch alone and not
bother taking Steven's (since as you mentioned above even with his
patches there's still possibility for other deadlocks) ?

>
> [1] lkml.kernel.org/r/20181016050428.17966-2-sergey.senozhat...@gmail.com
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-11-01 Thread Daniel Wang
On Mon, Oct 22, 2018 at 3:10 AM Sergey Senozhatsky
 wrote:

> Another deadlock scenario could be the following one:
>
> printk()
>  console_trylock()
>   down_trylock()
>raw_spin_lock_irqsave(>lock, flags)
> 
>  panic()
>   console_flush_on_panic()
>console_trylock()
> raw_spin_lock_irqsave(>lock, flags)// deadlock
>
> There are no patches addressing this one at the moment. And it's
> unclear if you are hitting this scenario.

I am not sure, but Steven's patches did make the deadlock I saw go away...

>
>
> > I see that Sergey had sent an RFC series for similar things. Are those
> > trying to solve the deadlock problem in a different way?
>
> Umm, I wouldn't call it "another way". It turns non-reentrant serial
> consoles to re-entrable ones. Did you test patch [1] from that series
> on you environment, by the way?

A little swamped by other things lately but I'll run a test with it.
If it works, would you recommend taking your patch alone and not
bother taking Steven's (since as you mentioned above even with his
patches there's still possibility for other deadlocks) ?

>
> [1] lkml.kernel.org/r/20181016050428.17966-2-sergey.senozhat...@gmail.com
>
> -ss



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-21 Thread Daniel Wang
Just got back from vacation. Thanks for the continued discussion. Just so
I understand the current state. Looks like we've got a pretty good explanation
of what's going on (though not completely sure), and backporting Steven's
patches is still the way to go? I see that Sergey had sent an RFC series
for similar things. Are those trying to solve the deadlock problem in a
different way?On Thu, Oct 4, 2018 at 1:55 AM Sergey Senozhatsky
 wrote:
>
> On (10/04/18 10:36), Petr Mladek wrote:
> >
> > This looks like a reasonable explanation of what is happening here.
> > It also explains why the console owner logic helped.
>
> Well, I'm still a bit puzzled, frankly speaking. I've two theories.
>
> Theory #1 [most likely]
>
>   Steven is a wizard and his code cures whatever problem we throw it at.
>
> Theory #2
>
>   console_sem hand over actually spreads print out, so we don't have one CPU
> doing all the printing job. Instead every CPU prints its backtrace, while the
> CPU which issued all_cpus_backtrace() waits for them. So all_cpus_backtrace()
> still has to wait for NR_CPUS * strlen(bakctrace), which still probably
> truggers NMI panic on it at some point. The panic CPU send out stop IPI, then
> it waits for foreign CPUs to ACK stop IPI request - for 10 seconds. So each
> CPU prints its backtrace, then ACK stop IPI. So when panic CPU proceeds with
> flush_on_panic() and emergency_reboot() uart_port->lock is unlocked. Without
> the patch we probably declare NMI panic on the CPU which does all the printing
> work, and panic sometimes jumps in when that CPU is in busy in
> serial8250_console_write(), holding the uart_port->lock. So we can't re-enter
> the 8250 driver from panic CPU and we can't reboot the system. In other
> words... Steven is a wizard.
>
> > > serial8250_console_write()
> > > {
> > > if (port->sysrq)
> > > locked = 0;
> > > else if (oops_in_progress)
> > > locked = spin_trylock_irqsave(>lock, flags);
> > > else
> > > spin_lock_irqsave(>lock, flags);
> > >
> > > ...
> > > uart_console_write(port, s, count, serial8250_console_putchar);
> > > ...
> > >
> > > if (locked)
> > > spin_unlock_irqrestore(>lock, flags);
> > > }
> > >
> > > Now... the problem. A theory, in fact.
> > > panic() sets oops_in_progress back to zero - bust_spinlocks(0) -  too 
> > > soon.
> >
> > I see your point. I am just a bit scared of this way. Ignoring locks
> > is a dangerous and painful approach in general.
>
> Well, I agree. But 8250 is not the only console which does ignore
> uart_port lock state sometimes. Otherwise sysrq would be totally unreliable,
> including emergency reboot. So it's sort of how it has been for quite some
> time, I guess. We are in panic(), it's over, so we probably can ignore
> uart_port->lock at this point.
>
> -ss



--
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-21 Thread Daniel Wang
Just got back from vacation. Thanks for the continued discussion. Just so
I understand the current state. Looks like we've got a pretty good explanation
of what's going on (though not completely sure), and backporting Steven's
patches is still the way to go? I see that Sergey had sent an RFC series
for similar things. Are those trying to solve the deadlock problem in a
different way?On Thu, Oct 4, 2018 at 1:55 AM Sergey Senozhatsky
 wrote:
>
> On (10/04/18 10:36), Petr Mladek wrote:
> >
> > This looks like a reasonable explanation of what is happening here.
> > It also explains why the console owner logic helped.
>
> Well, I'm still a bit puzzled, frankly speaking. I've two theories.
>
> Theory #1 [most likely]
>
>   Steven is a wizard and his code cures whatever problem we throw it at.
>
> Theory #2
>
>   console_sem hand over actually spreads print out, so we don't have one CPU
> doing all the printing job. Instead every CPU prints its backtrace, while the
> CPU which issued all_cpus_backtrace() waits for them. So all_cpus_backtrace()
> still has to wait for NR_CPUS * strlen(bakctrace), which still probably
> truggers NMI panic on it at some point. The panic CPU send out stop IPI, then
> it waits for foreign CPUs to ACK stop IPI request - for 10 seconds. So each
> CPU prints its backtrace, then ACK stop IPI. So when panic CPU proceeds with
> flush_on_panic() and emergency_reboot() uart_port->lock is unlocked. Without
> the patch we probably declare NMI panic on the CPU which does all the printing
> work, and panic sometimes jumps in when that CPU is in busy in
> serial8250_console_write(), holding the uart_port->lock. So we can't re-enter
> the 8250 driver from panic CPU and we can't reboot the system. In other
> words... Steven is a wizard.
>
> > > serial8250_console_write()
> > > {
> > > if (port->sysrq)
> > > locked = 0;
> > > else if (oops_in_progress)
> > > locked = spin_trylock_irqsave(>lock, flags);
> > > else
> > > spin_lock_irqsave(>lock, flags);
> > >
> > > ...
> > > uart_console_write(port, s, count, serial8250_console_putchar);
> > > ...
> > >
> > > if (locked)
> > > spin_unlock_irqrestore(>lock, flags);
> > > }
> > >
> > > Now... the problem. A theory, in fact.
> > > panic() sets oops_in_progress back to zero - bust_spinlocks(0) -  too 
> > > soon.
> >
> > I see your point. I am just a bit scared of this way. Ignoring locks
> > is a dangerous and painful approach in general.
>
> Well, I agree. But 8250 is not the only console which does ignore
> uart_port lock state sometimes. Otherwise sysrq would be totally unreliable,
> including emergency reboot. So it's sort of how it has been for quite some
> time, I guess. We are in panic(), it's over, so we probably can ignore
> uart_port->lock at this point.
>
> -ss



--
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
I wanted to let you know that I am leaving for a two-week vacation. So
if you don't hear from me during that period assume bad network
connectivity and not lack of enthusiasm. :) Feel free to go with the
backports if we reach an agreement here. Otherwise I'll do it when I get
back. Thank you all!


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
I wanted to let you know that I am leaving for a two-week vacation. So
if you don't hear from me during that period assume bad network
connectivity and not lack of enthusiasm. :) Feel free to go with the
backports if we reach an agreement here. Otherwise I'll do it when I get
back. Thank you all!


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
On Wed, Oct 3, 2018 at 10:37 AM Steven Rostedt  wrote:
> Just so I understand correctly. Does the panic hit with and without the
> suggested backport patch? The only difference is that you get the full
> output with the patch and limited output without it?

When `softlockup_panic` is set (which is what my original repro had and
what we use in production), without the backport patch, the expected panic
would hit a seemingly deadlock. So even when the machine is configured
to reboot immediately after the panic (kernel.panic=-1), it just hangs there
with an incomplete backtrace. With your patch, the deadlock doesn't happen
and the machine reboots successfully.

This was and still is the issue this thread is trying to fix. The last
log snippet
was from an "experiment" that I did in order to understand what's really
happening. So far the speculation has been that the panic path was trying
to get a lock held by a backtrace dumping thread, but there is not enough
evidence which thread is holding the lock and how it uses it. So I set
`softlockup_panic` to 0, to get panic out of the equation. Then I saw that one
CPU was indeed holding the console lock, trying to write something out. If
the panic was to hit while it's doing that, we might get a deadlock.

>
> -- Steve
>


-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
On Wed, Oct 3, 2018 at 10:37 AM Steven Rostedt  wrote:
> Just so I understand correctly. Does the panic hit with and without the
> suggested backport patch? The only difference is that you get the full
> output with the patch and limited output without it?

When `softlockup_panic` is set (which is what my original repro had and
what we use in production), without the backport patch, the expected panic
would hit a seemingly deadlock. So even when the machine is configured
to reboot immediately after the panic (kernel.panic=-1), it just hangs there
with an incomplete backtrace. With your patch, the deadlock doesn't happen
and the machine reboots successfully.

This was and still is the issue this thread is trying to fix. The last
log snippet
was from an "experiment" that I did in order to understand what's really
happening. So far the speculation has been that the panic path was trying
to get a lock held by a backtrace dumping thread, but there is not enough
evidence which thread is holding the lock and how it uses it. So I set
`softlockup_panic` to 0, to get panic out of the equation. Then I saw that one
CPU was indeed holding the console lock, trying to write something out. If
the panic was to hit while it's doing that, we might get a deadlock.

>
> -- Steve
>


-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek  wrote:
>
> On Tue 2018-10-02 21:23:27, Steven Rostedt wrote:
> > I don't see the big deal of backporting this. The biggest complaints
> > about backports are from fixes that were added to late -rc releases
> > where the fixes didn't get much testing. This commit was added in 4.16,
> > and hasn't had any issues due to the design. Although a fix has been
> > added:
> >
> > c14376de3a1 ("printk: Wake klogd when passing console_lock owner")
>
> As I said, I am fine with backporting the console_lock owner stuff
> into the stable release.
>
> I just wonder (like Sergey) what the real problem is. The console_lock
> owner handshake is not fully reliable. It is might be good enough
> to prevent softlockup. But we should not relay on it to prevent
> a deadlock.

Yes. I myself was curious too. :)

>
> My new theory ;-)
>
> printk_safe_flush() is called in nmi_trigger_cpumask_backtrace().
> => watchdog_timer_fn() is blocked until all backtraces are printed.
>
> Now, the original report complained that the system rebooted before
> all backtraces were printed. It means that panic() was called
> on another CPU. My guess is that it is from the hardlockup detector.
> And the panic() was not able to flush the console because it was
> not able to take console_lock.
>
> IMHO, there was not a real deadlock. The console_lock owner
> handshake jsut helped to get console_lock in panic() and
> flush all messages before reboot => it is reasonable
> and acceptable fix.

I had the same speculation. Tried to capture a lockdep snippet with
CONFIG_PROVE_LOCKING turned on but didn't get anything. But
maybe I was doing it wrong.

>
> Just to be sure. Daniel, could you please send a log with
> the console_lock owner stuff backported? There we would see
> who called the panic() and why it rebooted early.

Sure. Here is one. It's a bit long but complete. I attached another log
snippet below it which is what I got when `softlockup_panic` was turned
off. The log was from the IRQ task that was flushing the printk buffer. I
will be taking a closer look at it too but in case you'll find it helpful.

lockup-test-16-2 login: [   89.277372] LoadPin: kernel-module
pinning-ignored obj="/tmp/release/hog.ko" pid=1992 cmdline="insmod
hog.ko"
[   89.280029] hog: loading out-of-tree module taints kernel.
[   89.294559] Hogging a CPU now
[   92.619688] watchdog: BUG: soft lockup - CPU#6 stuck for 3s! [hog:1993]
[   92.626490] Modules linked in: hog(O) ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 xt_addrtype nf_nat
br_netfilter ip6table_filter ip6_tables aesni_intel aes_x86_64
crypto_simd cryptd glue_helper
[   92.645567] CPU: 6 PID: 1993 Comm: hog Tainted: G   O 4.15.0+ #12
[   92.652899] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   92.662245] RIP: 0010:hog_thread+0x13/0x1000 [hog]
[   92.667164] RSP: 0018:b489c741ff10 EFLAGS: 0282 ORIG_RAX:
ff11
[   92.675139] RAX: 0011 RBX: 9f5c75a88900 RCX: 
[   92.682474] RDX: 9f5c8339d840 RSI: 9f5c833954b8 RDI: 9f5c833954b8
[   92.689727] RBP: b489c741ff48 R08: 0030 R09: 
[   92.696985] R10: 03a8 R11:  R12: 9f5c7959e080
[   92.704251] R13: b489c7f2bc70 R14:  R15: 9f5c75a88948
[   92.711498] FS:  () GS:9f5c8338()
knlGS:
[   92.719699] CS:  0010 DS:  ES:  CR0: 80050033
[   92.725556] CR2: 558184c9b89c CR3: 000499e12006 CR4: 003606a0
[   92.732976] DR0:  DR1:  DR2: 
[   92.740231] DR3:  DR6: fffe0ff0 DR7: 0400
[   92.747487] Call Trace:
[   92.750054]  kthread+0x120/0x160
[   92.753419]  ? 0xc030d000
[   92.756859]  ? kthread_stop+0x120/0x120
[   92.760819]  ret_from_fork+0x35/0x40
[   92.764594] Code:  fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00
[   92.772656] Sending NMI from CPU 6 to CPUs 0-5,7-15:
[   92.43] NMI backtrace for cpu 0
[   92.46] CPU: 0 PID: 1844 Comm: dd Tainted: G   O 4.15.0+ #12
[   92.47] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   92.55] RIP: 0010:native_queued_spin_lock_slowpath+0x18/0x1b0
[   92.56] RSP: 0018:b489c7dcbdb0 EFLAGS: 0002
[   92.57] RAX: 0001 RBX: 9f5c82ca5a68 RCX: 
[   92.58] RDX: 0001 RSI: 0001 RDI: 9f5c82ca5a68
[   92.59] RBP: b489c7dcbde0 R08: f9f8f56c R09: 4c55ba96
[   92.60] R10: 84f6cd57 R11: 41f66b45 R12: 9f5c82ca5a68
[   92.61] R13: b489c7dcbe38 R14: b489c7dcbe38 R15: 0040
[   92.62] FS:  7f1e21116700() GS:9f5c8320()
knlGS:
[   92.63] CS:  0010 DS: 

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-03 Thread Daniel Wang
On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek  wrote:
>
> On Tue 2018-10-02 21:23:27, Steven Rostedt wrote:
> > I don't see the big deal of backporting this. The biggest complaints
> > about backports are from fixes that were added to late -rc releases
> > where the fixes didn't get much testing. This commit was added in 4.16,
> > and hasn't had any issues due to the design. Although a fix has been
> > added:
> >
> > c14376de3a1 ("printk: Wake klogd when passing console_lock owner")
>
> As I said, I am fine with backporting the console_lock owner stuff
> into the stable release.
>
> I just wonder (like Sergey) what the real problem is. The console_lock
> owner handshake is not fully reliable. It is might be good enough
> to prevent softlockup. But we should not relay on it to prevent
> a deadlock.

Yes. I myself was curious too. :)

>
> My new theory ;-)
>
> printk_safe_flush() is called in nmi_trigger_cpumask_backtrace().
> => watchdog_timer_fn() is blocked until all backtraces are printed.
>
> Now, the original report complained that the system rebooted before
> all backtraces were printed. It means that panic() was called
> on another CPU. My guess is that it is from the hardlockup detector.
> And the panic() was not able to flush the console because it was
> not able to take console_lock.
>
> IMHO, there was not a real deadlock. The console_lock owner
> handshake jsut helped to get console_lock in panic() and
> flush all messages before reboot => it is reasonable
> and acceptable fix.

I had the same speculation. Tried to capture a lockdep snippet with
CONFIG_PROVE_LOCKING turned on but didn't get anything. But
maybe I was doing it wrong.

>
> Just to be sure. Daniel, could you please send a log with
> the console_lock owner stuff backported? There we would see
> who called the panic() and why it rebooted early.

Sure. Here is one. It's a bit long but complete. I attached another log
snippet below it which is what I got when `softlockup_panic` was turned
off. The log was from the IRQ task that was flushing the printk buffer. I
will be taking a closer look at it too but in case you'll find it helpful.

lockup-test-16-2 login: [   89.277372] LoadPin: kernel-module
pinning-ignored obj="/tmp/release/hog.ko" pid=1992 cmdline="insmod
hog.ko"
[   89.280029] hog: loading out-of-tree module taints kernel.
[   89.294559] Hogging a CPU now
[   92.619688] watchdog: BUG: soft lockup - CPU#6 stuck for 3s! [hog:1993]
[   92.626490] Modules linked in: hog(O) ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 xt_addrtype nf_nat
br_netfilter ip6table_filter ip6_tables aesni_intel aes_x86_64
crypto_simd cryptd glue_helper
[   92.645567] CPU: 6 PID: 1993 Comm: hog Tainted: G   O 4.15.0+ #12
[   92.652899] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   92.662245] RIP: 0010:hog_thread+0x13/0x1000 [hog]
[   92.667164] RSP: 0018:b489c741ff10 EFLAGS: 0282 ORIG_RAX:
ff11
[   92.675139] RAX: 0011 RBX: 9f5c75a88900 RCX: 
[   92.682474] RDX: 9f5c8339d840 RSI: 9f5c833954b8 RDI: 9f5c833954b8
[   92.689727] RBP: b489c741ff48 R08: 0030 R09: 
[   92.696985] R10: 03a8 R11:  R12: 9f5c7959e080
[   92.704251] R13: b489c7f2bc70 R14:  R15: 9f5c75a88948
[   92.711498] FS:  () GS:9f5c8338()
knlGS:
[   92.719699] CS:  0010 DS:  ES:  CR0: 80050033
[   92.725556] CR2: 558184c9b89c CR3: 000499e12006 CR4: 003606a0
[   92.732976] DR0:  DR1:  DR2: 
[   92.740231] DR3:  DR6: fffe0ff0 DR7: 0400
[   92.747487] Call Trace:
[   92.750054]  kthread+0x120/0x160
[   92.753419]  ? 0xc030d000
[   92.756859]  ? kthread_stop+0x120/0x120
[   92.760819]  ret_from_fork+0x35/0x40
[   92.764594] Code:  fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00
[   92.772656] Sending NMI from CPU 6 to CPUs 0-5,7-15:
[   92.43] NMI backtrace for cpu 0
[   92.46] CPU: 0 PID: 1844 Comm: dd Tainted: G   O 4.15.0+ #12
[   92.47] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[   92.55] RIP: 0010:native_queued_spin_lock_slowpath+0x18/0x1b0
[   92.56] RSP: 0018:b489c7dcbdb0 EFLAGS: 0002
[   92.57] RAX: 0001 RBX: 9f5c82ca5a68 RCX: 
[   92.58] RDX: 0001 RSI: 0001 RDI: 9f5c82ca5a68
[   92.59] RBP: b489c7dcbde0 R08: f9f8f56c R09: 4c55ba96
[   92.60] R10: 84f6cd57 R11: 41f66b45 R12: 9f5c82ca5a68
[   92.61] R13: b489c7dcbe38 R14: b489c7dcbe38 R15: 0040
[   92.62] FS:  7f1e21116700() GS:9f5c8320()
knlGS:
[   92.63] CS:  0010 DS: 

Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-02 Thread Daniel Wang
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek  wrote:
>
> Well, I still wonder why it helped and why you do not see it with 4.4.
> I have a feeling that the console owner switch helped only by chance.
> In fact, you might be affected by a race in
> printk_safe_flush_on_panic() that was fixed by the commit:
>
> 554755be08fba31c7 printk: drop in_nmi check from printk_safe_flush_on_panic()
>
> The above one commit might be enough. Well, there was one more
> NMI-related race that was fixed by:
>
> ba552399954dde1b printk: Split the code for storing a message into the log 
> buffer
> a338f84dc196f44b printk: Create helper function to queue deferred console 
> handling
> 03fc7f9c99c1e7ae printk/nmi: Prevent deadlock when accessing the main log 
> buffer in NMI

All of these commits already exist in 4.14 stable, since 4.14.68. The deadlock
still exists even when built from 4.14.73 (latest tag) though. And cherrypicking
dbdda842fe96 fixes it.

>
> Best Regards,
> Petr



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-02 Thread Daniel Wang
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek  wrote:
>
> Well, I still wonder why it helped and why you do not see it with 4.4.
> I have a feeling that the console owner switch helped only by chance.
> In fact, you might be affected by a race in
> printk_safe_flush_on_panic() that was fixed by the commit:
>
> 554755be08fba31c7 printk: drop in_nmi check from printk_safe_flush_on_panic()
>
> The above one commit might be enough. Well, there was one more
> NMI-related race that was fixed by:
>
> ba552399954dde1b printk: Split the code for storing a message into the log 
> buffer
> a338f84dc196f44b printk: Create helper function to queue deferred console 
> handling
> 03fc7f9c99c1e7ae printk/nmi: Prevent deadlock when accessing the main log 
> buffer in NMI

All of these commits already exist in 4.14 stable, since 4.14.68. The deadlock
still exists even when built from 4.14.73 (latest tag) though. And cherrypicking
dbdda842fe96 fixes it.

>
> Best Regards,
> Petr



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-02 Thread Daniel Wang
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek  wrote:

> Well, I still wonder why it helped and why you do not see it with 4.4.
> I have a feeling that the console owner switch helped only by chance.

So do I. I don't think Steven had the deadlock in mind when working on
that patch, but with that patch and that patch alone, the deadlock
disappeared.

> In fact, you might be affected by a race in
> printk_safe_flush_on_panic() that was fixed by the commit:
>
> 554755be08fba31c7 printk: drop in_nmi check from printk_safe_flush_on_panic()
>
> The above one commit might be enough. Well, there was one more

Thanks for the pointer. Let me test this out.

> NMI-related race that was fixed by:
>
> ba552399954dde1b printk: Split the code for storing a message into the log 
> buffer
> a338f84dc196f44b printk: Create helper function to queue deferred console 
> handling
> 03fc7f9c99c1e7ae printk/nmi: Prevent deadlock when accessing the main log 
> buffer in NMI
>
> Best Regards,
> Petr


On Tue, Oct 2, 2018 at 1:17 AM Sergey Senozhatsky
 wrote:

> Sorry, I'm missing something here. Steven's patch deals with lockups and
> I understand why you want to backport the patch set; but console output
> deadlock on panic() is another thing.

Understood. But it did fix the deadlock for me and without it I hit the deadlock
pretty consistently. Let me test the patch Petr pointed out and will update
here.

-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-02 Thread Daniel Wang
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek  wrote:

> Well, I still wonder why it helped and why you do not see it with 4.4.
> I have a feeling that the console owner switch helped only by chance.

So do I. I don't think Steven had the deadlock in mind when working on
that patch, but with that patch and that patch alone, the deadlock
disappeared.

> In fact, you might be affected by a race in
> printk_safe_flush_on_panic() that was fixed by the commit:
>
> 554755be08fba31c7 printk: drop in_nmi check from printk_safe_flush_on_panic()
>
> The above one commit might be enough. Well, there was one more

Thanks for the pointer. Let me test this out.

> NMI-related race that was fixed by:
>
> ba552399954dde1b printk: Split the code for storing a message into the log 
> buffer
> a338f84dc196f44b printk: Create helper function to queue deferred console 
> handling
> 03fc7f9c99c1e7ae printk/nmi: Prevent deadlock when accessing the main log 
> buffer in NMI
>
> Best Regards,
> Petr


On Tue, Oct 2, 2018 at 1:17 AM Sergey Senozhatsky
 wrote:

> Sorry, I'm missing something here. Steven's patch deals with lockups and
> I understand why you want to backport the patch set; but console output
> deadlock on panic() is another thing.

Understood. But it did fix the deadlock for me and without it I hit the deadlock
pretty consistently. Let me test the patch Petr pointed out and will update
here.

-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-01 Thread Daniel Wang
On Mon, Oct 1, 2018 at 1:23 PM Vlastimil Babka  wrote:
>
> On 10/1/18 10:13 PM, Pavel Machek wrote:
> >
> > Dunno. Is the patch perhaps a bit too complex? This is not exactly
> > trivial bugfix.
> >
> > pavel@duo:/data/l/clean-cg$ git show dbdda842fe96f | diffstat
> >  printk.c |  108
> >  ++-
> >
> > I see that it is pretty critical to Daniel, but maybe kernel with
> > console locking redone should no longer be called 4.4?
>
> In that case it probably should no longer be called 4.4 since at least
> Meltdown/Spectre fixes :)

To clarify, I am requesting a backport to 4.14. This bug doesn't repro on 4.4.

>


-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-01 Thread Daniel Wang
On Mon, Oct 1, 2018 at 1:23 PM Vlastimil Babka  wrote:
>
> On 10/1/18 10:13 PM, Pavel Machek wrote:
> >
> > Dunno. Is the patch perhaps a bit too complex? This is not exactly
> > trivial bugfix.
> >
> > pavel@duo:/data/l/clean-cg$ git show dbdda842fe96f | diffstat
> >  printk.c |  108
> >  ++-
> >
> > I see that it is pretty critical to Daniel, but maybe kernel with
> > console locking redone should no longer be called 4.4?
>
> In that case it probably should no longer be called 4.4 since at least
> Meltdown/Spectre fixes :)

To clarify, I am requesting a backport to 4.14. This bug doesn't repro on 4.4.

>


-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-01 Thread Daniel Wang
On Mon, Oct 1, 2018 at 12:23 PM Steven Rostedt  wrote:
>
> > Serial console logs leading up to the deadlock. As can be seen the stack 
> > trace
> > was incomplete because the printing path hit a timeout.
>
> I'm fine with having this backported.

Thanks. I can send the cherrypicks your way. Do you recommend that I
include the three follow-up fixes though?

c14376de3a1b printk: Wake klogd when passing console_lock owner
fd5f7cde1b85 printk: Never set console_may_schedule in console_trylock()
c162d5b4338d printk: Hide console waiter logic into helpers
dbdda842fe96 printk: Add console owner and waiter logic to load
balance console writes

>
> -- Steve



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-10-01 Thread Daniel Wang
On Mon, Oct 1, 2018 at 12:23 PM Steven Rostedt  wrote:
>
> > Serial console logs leading up to the deadlock. As can be seen the stack 
> > trace
> > was incomplete because the printing path hit a timeout.
>
> I'm fine with having this backported.

Thanks. I can send the cherrypicks your way. Do you recommend that I
include the three follow-up fixes though?

c14376de3a1b printk: Wake klogd when passing console_lock owner
fd5f7cde1b85 printk: Never set console_may_schedule in console_trylock()
c162d5b4338d printk: Hide console waiter logic into helpers
dbdda842fe96 printk: Add console owner and waiter logic to load
balance console writes

>
> -- Steve



-- 
Best,
Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-09-27 Thread Daniel Wang
Prior to this change, the combination of `softlockup_panic=1` and
`softlockup_all_cpu_stacktrace=1` may result in a deadlock when the reboot path
is trying to grab the console lock that is held by the stack trace printing
path. What seems to be happening is that while there are multiple CPUs, only one
of them is tasked to print the back trace of all CPUs. On a machine with many
CPUs and a slow serial console (on Google Compute Engine for example), the stack
trace printing routine hits a timeout and the reboot path kicks in. The latter
then tries to print something else, but can't get the lock because it's still
held by earlier printing path. This is easily reproducible on a VM with 16+
vCPUs on Google Compute Engine - which is a very common scenario.

A quick repro is available at
https://github.com/wonderfly/printk-deadlock-repro. The system hangs 3 seconds
into executing repro.sh. Both deadlock analysis and repro are credits to Peter
Feiner.

Note that I have read previous discussions on backporting this to stable [1].
The argument for objecting the backport was that this is a non-trivial fix and
is supported to prevent hypothetical soft lockups. What we are hitting is a real
deadlock, in production, however. Hence this request.

[1] 
https://lore.kernel.org/lkml/20180409081535.dq7p5bfnpvd3x...@pathway.suse.cz/T/#u

Serial console logs leading up to the deadlock. As can be seen the stack trace
was incomplete because the printing path hit a timeout.

```
lockup-test-16-2 login: [  206.648060] LoadPin: kernel-module pinning-ignored 
obj="/tmp/release/hog.ko" pid=3003 cmdline="insmod hog.ko"
[  206.650851] hog: loading out-of-tree module taints kernel.
[  206.654761] Hogging a CPU now
[  209.577900] watchdog: BUG: soft lockup - CPU#13 stuck for 3s! [hog:3010]
[  209.584883] Modules linked in: hog(O) ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat nf_nat_ipv4 xt_addrtype nf_nat br_netfilter ip6table_filter 
ip6_tables aesni_intel aes_x86_64 crypto_simd cryptd glue_helper
[  209.603952] CPU: 13 PID: 3010 Comm: hog Tainted: G   O 4.14.0+ 
#11
[  209.611390] Hardware name: Google Google Compute Engine/Google Compute 
Engine, BIOS Google 01/01/2011
[  209.620733] task: 9501b8ca9d00 task.stack: b99c0732c000
[  209.626766] RIP: 0010:hog_thread+0x13/0x1000 [hog]
[  209.631763] RSP: 0018:b99c0732ff10 EFLAGS: 0282 ORIG_RAX: 
ff11
[  209.639466] RAX: 0011 RBX: 9501bc1af580 RCX: 
[  209.646818] RDX: 9501c3554d80 RSI: 9501c354cc38 RDI: 9501c354cc38
[  209.654087] RBP: b99c0732ff48 R08: 0030 R09: 
[  209.661510] R10: b99c08df3ce0 R11:  R12: 9501aeab8e80
[  209.668773] R13: b99c0803bc28 R14:  R15: 9501bc1af5c8
[  209.676089] FS:  () GS:9501c354() 
knlGS:
[  209.684292] CS:  0010 DS:  ES:  CR0: 80050033
[  209.690150] CR2: 7f146fd8aba0 CR3: 000b0ba11006 CR4: 003606a0
[  209.697571] DR0:  DR1:  DR2: 
[  209.704936] DR3:  DR6: fffe0ff0 DR7: 0400
[  209.712184] Call Trace:
[  209.714853]  kthread+0x120/0x160
[  209.718198]  ? 0xc0307000
[  209.721641]  ? kthread_stop+0x120/0x120
[  209.725591]  ? ret_from_fork+0x1f/0x30
[  209.729462] Code:  fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 
[  209.737518] Sending NMI from CPU 13 to CPUs 0-12,14-15:
[  209.742864] NMI backtrace for cpu 0
[  209.742868] CPU: 0 PID: 2866 Comm: dd Tainted: G   O 4.14.0+ #11
[  209.742868] Hardware name: Google Google Compute Engine/Google Compute 
Engine, BIOS Google 01/01/2011
[  209.742870] task: 95019d15 task.stack: b99c08d98000
[  209.742875] RIP: 0010:native_queued_spin_lock_slowpath+0x28/0x1b0
[  209.742876] RSP: 0018:b99c08d9bda8 EFLAGS: 0002
[  209.742877] RAX: 0001 RBX: 9501c2cdda68 RCX: 
[  209.742877] RDX: 0001 RSI: 0001 RDI: 9501c2cdda68
[  209.742878] RBP: b99c08d9bdd8 R08: 7f569b7d R09: 6930609f
[  209.742880] R10: 0a41d205 R11: 8b5d54b4 R12: 9501c2cdda68
[  209.742881] R13: b99c08d9be30 R14: b99c08d9be30 R15: 0040
[  209.742882] FS:  7f2605cd3700() GS:9501c320() 
knlGS:
[  209.742883] CS:  0010 DS:  ES:  CR0: 80050033
[  209.742884] CR2: 7fe4acdfe9c0 CR3: 000ef55fa001 CR4: 003606b0
[  209.742888] DR0:  DR1:  DR2: 
[  209.742889] DR3:  DR6: fffe0ff0 DR7: 0400
[  209.742890] Call Trace:
[  209.742895]  do_raw_spin_lock+0xa0/0xb0
[  209.742900]  _raw_spin_lock_irqsave+0x20/0x30
[  209.742905]  _extract_crng+0x45/0x120
[  209.742907]  ? urandom_read+0xfa/0x2a0
[  209.742910]  ? 

4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

2018-09-27 Thread Daniel Wang
Prior to this change, the combination of `softlockup_panic=1` and
`softlockup_all_cpu_stacktrace=1` may result in a deadlock when the reboot path
is trying to grab the console lock that is held by the stack trace printing
path. What seems to be happening is that while there are multiple CPUs, only one
of them is tasked to print the back trace of all CPUs. On a machine with many
CPUs and a slow serial console (on Google Compute Engine for example), the stack
trace printing routine hits a timeout and the reboot path kicks in. The latter
then tries to print something else, but can't get the lock because it's still
held by earlier printing path. This is easily reproducible on a VM with 16+
vCPUs on Google Compute Engine - which is a very common scenario.

A quick repro is available at
https://github.com/wonderfly/printk-deadlock-repro. The system hangs 3 seconds
into executing repro.sh. Both deadlock analysis and repro are credits to Peter
Feiner.

Note that I have read previous discussions on backporting this to stable [1].
The argument for objecting the backport was that this is a non-trivial fix and
is supported to prevent hypothetical soft lockups. What we are hitting is a real
deadlock, in production, however. Hence this request.

[1] 
https://lore.kernel.org/lkml/20180409081535.dq7p5bfnpvd3x...@pathway.suse.cz/T/#u

Serial console logs leading up to the deadlock. As can be seen the stack trace
was incomplete because the printing path hit a timeout.

```
lockup-test-16-2 login: [  206.648060] LoadPin: kernel-module pinning-ignored 
obj="/tmp/release/hog.ko" pid=3003 cmdline="insmod hog.ko"
[  206.650851] hog: loading out-of-tree module taints kernel.
[  206.654761] Hogging a CPU now
[  209.577900] watchdog: BUG: soft lockup - CPU#13 stuck for 3s! [hog:3010]
[  209.584883] Modules linked in: hog(O) ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat nf_nat_ipv4 xt_addrtype nf_nat br_netfilter ip6table_filter 
ip6_tables aesni_intel aes_x86_64 crypto_simd cryptd glue_helper
[  209.603952] CPU: 13 PID: 3010 Comm: hog Tainted: G   O 4.14.0+ 
#11
[  209.611390] Hardware name: Google Google Compute Engine/Google Compute 
Engine, BIOS Google 01/01/2011
[  209.620733] task: 9501b8ca9d00 task.stack: b99c0732c000
[  209.626766] RIP: 0010:hog_thread+0x13/0x1000 [hog]
[  209.631763] RSP: 0018:b99c0732ff10 EFLAGS: 0282 ORIG_RAX: 
ff11
[  209.639466] RAX: 0011 RBX: 9501bc1af580 RCX: 
[  209.646818] RDX: 9501c3554d80 RSI: 9501c354cc38 RDI: 9501c354cc38
[  209.654087] RBP: b99c0732ff48 R08: 0030 R09: 
[  209.661510] R10: b99c08df3ce0 R11:  R12: 9501aeab8e80
[  209.668773] R13: b99c0803bc28 R14:  R15: 9501bc1af5c8
[  209.676089] FS:  () GS:9501c354() 
knlGS:
[  209.684292] CS:  0010 DS:  ES:  CR0: 80050033
[  209.690150] CR2: 7f146fd8aba0 CR3: 000b0ba11006 CR4: 003606a0
[  209.697571] DR0:  DR1:  DR2: 
[  209.704936] DR3:  DR6: fffe0ff0 DR7: 0400
[  209.712184] Call Trace:
[  209.714853]  kthread+0x120/0x160
[  209.718198]  ? 0xc0307000
[  209.721641]  ? kthread_stop+0x120/0x120
[  209.725591]  ? ret_from_fork+0x1f/0x30
[  209.729462] Code:  fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 
[  209.737518] Sending NMI from CPU 13 to CPUs 0-12,14-15:
[  209.742864] NMI backtrace for cpu 0
[  209.742868] CPU: 0 PID: 2866 Comm: dd Tainted: G   O 4.14.0+ #11
[  209.742868] Hardware name: Google Google Compute Engine/Google Compute 
Engine, BIOS Google 01/01/2011
[  209.742870] task: 95019d15 task.stack: b99c08d98000
[  209.742875] RIP: 0010:native_queued_spin_lock_slowpath+0x28/0x1b0
[  209.742876] RSP: 0018:b99c08d9bda8 EFLAGS: 0002
[  209.742877] RAX: 0001 RBX: 9501c2cdda68 RCX: 
[  209.742877] RDX: 0001 RSI: 0001 RDI: 9501c2cdda68
[  209.742878] RBP: b99c08d9bdd8 R08: 7f569b7d R09: 6930609f
[  209.742880] R10: 0a41d205 R11: 8b5d54b4 R12: 9501c2cdda68
[  209.742881] R13: b99c08d9be30 R14: b99c08d9be30 R15: 0040
[  209.742882] FS:  7f2605cd3700() GS:9501c320() 
knlGS:
[  209.742883] CS:  0010 DS:  ES:  CR0: 80050033
[  209.742884] CR2: 7fe4acdfe9c0 CR3: 000ef55fa001 CR4: 003606b0
[  209.742888] DR0:  DR1:  DR2: 
[  209.742889] DR3:  DR6: fffe0ff0 DR7: 0400
[  209.742890] Call Trace:
[  209.742895]  do_raw_spin_lock+0xa0/0xb0
[  209.742900]  _raw_spin_lock_irqsave+0x20/0x30
[  209.742905]  _extract_crng+0x45/0x120
[  209.742907]  ? urandom_read+0xfa/0x2a0
[  209.742910]  ?