Just got back from vacation. Thanks for the continued discussion. Just so
I understand the current state: it looks like we've got a pretty good explanation
of what's going on (though we're not completely sure), and backporting Steven's
patches is still the way to go? I see that Sergey had sent an RFC
On Mon, Oct 22, 2018 at 3:10 AM Sergey Senozhatsky
wrote:
> Another deadlock scenario could be the following one:
>
> printk()
>  console_trylock()
>   down_trylock()
>    raw_spin_lock_irqsave(&sem->lock, flags)
>
> panic()
>
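The quoted scenario can be illustrated with a small user-space sketch. This is only a toy model, not kernel code: `toy_lock` and the `toy_*` helpers are hypothetical names standing in for the spinlock inside the console semaphore, and the "panic" is modeled as a plain function call made while the lock is still held. A blocking acquire at that point would spin forever, so the sketch uses a trylock to report the condition instead of hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the raw spinlock inside the console semaphore. */
struct toy_lock {
    atomic_flag held;
};

static void toy_lock_init(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

static bool toy_trylock(struct toy_lock *l)
{
    /* test_and_set returns the previous value: false means we acquired it. */
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_unlock(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Models panic() arriving on the same CPU. If the interrupted context
 * still holds the lock, a blocking acquire here could never succeed,
 * because the owner cannot run again until we return.
 */
static bool toy_panic_would_deadlock(struct toy_lock *l)
{
    if (toy_trylock(l)) {
        toy_unlock(l);
        return false;
    }
    return true;
}

/*
 * Models printk() -> console_trylock() -> down_trylock() being
 * interrupted by panic() inside the locked region.
 */
static bool toy_printk_interrupted_by_panic(struct toy_lock *l)
{
    bool deadlock;

    if (!toy_trylock(l))
        return true; /* lock already held by someone else */
    deadlock = toy_panic_would_deadlock(l);
    toy_unlock(l);
    return deadlock;
}
```

With a freshly initialized lock, `toy_panic_would_deadlock()` reports no problem, while `toy_printk_interrupted_by_panic()` reports the self-deadlock because the panic path re-enters while the lock is held.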
On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek wrote:
>
> On Tue 2018-10-02 21:23:27, Steven Rostedt wrote:
> > I don't see the big deal of backporting this. The biggest complaints
> > about backports are from fixes that were added to late -rc releases
> > where the fixes didn't get much testing.
On Wed, Oct 3, 2018 at 10:37 AM Steven Rostedt wrote:
> Just so I understand correctly. Does the panic hit with and without the
> suggested backport patch? The only difference is that you get the full
> output with the patch and limited output without it?
When `softlockup_panic` is set (which is
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek wrote:
> Well, I still wonder why it helped and why you do not see it with 4.4.
> I have a feeling that the console owner switch helped only by chance.
So do I. I don't think Steven had the deadlock in mind when working on
that patch, but with that
On Tue, Oct 2, 2018 at 1:42 AM Petr Mladek wrote:
>
> Well, I still wonder why it helped and why you do not see it with 4.4.
> I have a feeling that the console owner switch helped only by chance.
> In fact, you might be affected by a race in
> printk_safe_flush_on_panic() that was fixed by the
I wanted to let you know that I am leaving for a two-week vacation. So
if you don't hear from me during that period assume bad network
connectivity and not lack of enthusiasm. :) Feel free to go with the
backports if we reach an agreement here. Otherwise I'll do it when I get
back. Thank you all!
On Mon, Oct 1, 2018 at 12:23 PM Steven Rostedt wrote:
>
> > Serial console logs leading up to the deadlock. As can be seen the stack
> > trace was incomplete because the printing path hit a timeout.
>
> I'm fine with having this backported.
Thanks. I can send the cherrypicks your way. Do
On Mon, Oct 1, 2018 at 1:23 PM Vlastimil Babka wrote:
>
> On 10/1/18 10:13 PM, Pavel Machek wrote:
> >
> > Dunno. Is the patch perhaps a bit too complex? This is not exactly
> > trivial bugfix.
> >
> > pavel@duo:/data/l/clean-cg$ git show dbdda842fe96f | diffstat
> > printk.c | 108
> >
Prior to this change, the combination of `softlockup_panic=1` and
`softlockup_all_cpu_backtrace=1` may result in a deadlock when the reboot path
is trying to grab the console lock that is held by the stack trace printing
path. What seems to be happening is that while there are multiple CPUs, only
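To check whether a machine runs with the combination described here, something like the following sketch could be used. It assumes the two knobs are exposed as `/proc/sys/kernel/softlockup_panic` and `/proc/sys/kernel/softlockup_all_cpu_backtrace` (the all-CPU knob uses "backtrace" in its sysctl name); `read_knob`, `risky_combination` and `host_uses_risky_combination` are hypothetical helper names, not part of any kernel interface.

```c
#include <stdbool.h>
#include <stdio.h>

/* Read one integer from a /proc/sys file; returns -1 if it cannot be read. */
static int read_knob(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* True only when both knobs are enabled, i.e. the combination at issue. */
static bool risky_combination(int softlockup_panic, int all_cpu_backtrace)
{
    return softlockup_panic == 1 && all_cpu_backtrace == 1;
}

/* Assumed sysctl paths; a missing file is treated as "not enabled". */
static bool host_uses_risky_combination(void)
{
    return risky_combination(
        read_knob("/proc/sys/kernel/softlockup_panic"),
        read_knob("/proc/sys/kernel/softlockup_all_cpu_backtrace"));
}
```

A missing or unreadable file yields -1, so the check errs on the side of reporting "not enabled" on kernels built without the soft lockup detector.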
On Wed, Dec 12, 2018 at 9:43 AM Sasha Levin wrote:
>
> On Wed, Dec 12, 2018 at 10:59:39PM +0900, Sergey Senozhatsky wrote:
> >On (12/12/18 14:36), Petr Mladek wrote:
> >> > OK, really didn't know that! I wasn't Cc-ed on that AUTOSEL email,
> >> > and I wasn't Cc-ed on this whole discussion and
Thanks for the clarification. So I guess I don't need to start another
thread for it? What are the next steps?
On Wed, Dec 12, 2018 at 1:43 PM Sasha Levin wrote:
>
> On Wed, Dec 12, 2018 at 12:11:29PM -0800, Daniel Wang wrote:
> >On Wed, Dec 12, 2018 at 9:43 AM Sasha
Thank you!
On Wed, Dec 12, 2018 at 1:52 PM Sasha Levin wrote:
>
> On Wed, Dec 12, 2018 at 01:49:25PM -0800, Daniel Wang wrote:
> >Thanks for the clarification. So I guess I don't need to start another
> >thread for it? What are the next steps?
>
> Nothing here, I'll queu
klogd when passing console_lock owner.
On Wed, Dec 12, 2018 at 1:56 PM Daniel Wang wrote:
>
> Thank you!
>
> On Wed, Dec 12, 2018 at 1:52 PM Sasha Levin wrote:
> >
> > On Wed, Dec 12, 2018 at 01:49:25PM -0800, Daniel Wang wrote:
> > >Thanks for the clarification.
On Wed, Dec 12, 2018 at 6:27 PM Sergey Senozhatsky
wrote:
>
> On (12/12/18 16:40), Daniel Wang wrote:
> > In case this was buried in previous messages, the commit I'd like to
> > get backported to 4.14 is dbdda842fe96f: printk: Add console owner and
> > waiter logic to loa
Is it okay to tag this commit with `Cc: sta...@vger.kernel.org` so
that it'll get applied to the stable trees once merged into Linus's
tree, if it's not too late? Otherwise I'll follow up on the stable
merges separately. Thanks for making the changes anyway.
On Thu, Nov 22, 2018 at 5:12 AM Petr
ID and subject, which
are both mentioned in this email. Should I send another one? What's
the process like? Thanks!
On Thu, Nov 8, 2018 at 10:47 PM Sergey Senozhatsky
wrote:
>
> On (11/01/18 09:05), Daniel Wang wrote:
> > > Another deadlock scenario could be the following one:
> >
No worries. I will follow up. You would recommend that all four
patches in this set be backported though, right?
On Tue, Dec 11, 2018 at 9:23 PM Sergey Senozhatsky
wrote:
>
> On (12/11/18 16:53), Daniel Wang wrote:
> > Is it okay to tag this commit with `Cc: sta...@vger.ke
2 patch sets to choose from
for -stable, then -stable guys can pick up the one that requires less
effort: 1 two-liner patch vs. 3 or 4 bigger patches.
Which two sets are you referring to specifically?
On Tue, Dec 11, 2018 at 9:21 PM Sergey Senozhatsky
wrote:
>
> On (12/11/18 17:16), Daniel Wang
Got it. Thank you.
On Tue, Dec 11, 2018 at 10:06 PM Sergey Senozhatsky
wrote:
>
> On (12/11/18 21:59), Daniel Wang wrote:
> > No worries. I will follow up. You would recommend that all four
> > patches in this set to be backported though, right?
>
> Just the last one,
Thanks. I was able to confirm that commit c7c3f05e341a9a2bd alone
fixed the problem for me. As expected, the stack traces for all 16 CPUs
were printed, followed by a final panic stack dump and a successful reboot.
[ 24.035044] Hogging a CPU now
[ 48.200258] watchdog: BUG: soft lockup - CPU#3 stuck for 22s!