Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Ryan Stone
I have seen similar behaviour before. The problem is that every CPU receives an NMI concurrently. As I recall, one of them gets some kind of pseudo-spinlock and tries to stop the other CPUs with an NMI. However, because they are already in an NMI handler, they don't get the second NMI and don't

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Adrian Chadd
Ah, cool. I'll give it a whirl. I'm a little worried about having all of the other cores spinning in this case (mostly thermal; the machines get VERY LOUD when the CPUs are spinning..) -a On 21 August 2015 at 08:19, Eric van Gyzen vangy...@freebsd.org wrote: I mentioned this to Adrian, but

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Ian Lepore
On Fri, 2015-08-21 at 23:30 +0800, Julian Elischer wrote: On 8/21/15 11:25 PM, Adrian Chadd wrote: Ah, cool. I'll give it a whirl. I'm a little worried about having all of the other cores spinning in this case (mostly thermal; the machines get VERY LOUD when the CPUs are spinning..)

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Eric van Gyzen
On 08/21/2015 10:30, Julian Elischer wrote: On 8/21/15 11:25 PM, Adrian Chadd wrote: Ah, cool. I'll give it a whirl. I'm a little worried about having all of the other cores spinning in this case (mostly thermal; the machines get VERY LOUD when the CPUs are spinning..) make each spin with

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Julian Elischer
On 8/22/15 12:23 AM, Ian Lepore wrote: On Fri, 2015-08-21 at 23:30 +0800, Julian Elischer wrote: On 8/21/15 11:25 PM, Adrian Chadd wrote: Ah, cool. I'll give it a whirl. I'm a little worried about having all of the other cores spinning in this case (mostly thermal; the machines get VERY LOUD

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Eric van Gyzen
Spinning is probably the only safe option in NMI context, since the NMI could have arrived at literally any time in any context (e.g. holding a spin lock, interrupts disabled). :-/ Eric On 08/21/2015 10:25, Adrian Chadd wrote: Ah, cool. I'll give it a whirl. I'm a little worried about

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Scott Long
I might have a fix for this, I’ll check the Netflix repo and see if it’s something that is ready to go upstream to FreeBSD. Scott On Aug 21, 2015, at 4:19 PM, Eric van Gyzen vangy...@freebsd.org wrote: I mentioned this to Adrian, but I'll mention here for everyone else's benefit. Ryan is

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Eric van Gyzen
I mentioned this to Adrian, but I'll mention here for everyone else's benefit. Ryan is exactly right. There was a thread a while ago, with a proposed patch from Kostik: https://lists.freebsd.org/pipermail/freebsd-arch/2014-July/015584.html As I recall, Scott Long also ran into this a few

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Julian Elischer
On 8/21/15 11:25 PM, Adrian Chadd wrote: Ah, cool. I'll give it a whirl. I'm a little worried about having all of the other cores spinning in this case (mostly thermal; the machines get VERY LOUD when the CPUs are spinning..) make each spin with the pause instruction.. and for N seconds (N
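
To make the idea in this sub-thread concrete, here is a rough user-space sketch, not FreeBSD kernel code, of what "one CPU wins, the others spin with the pause instruction" could look like. Every name in it (nmi_owner, nmi_try_claim, nmi_spin_wait, cpu_relax) is hypothetical and made up for illustration. The bound is an iteration count, which is an assumption of this sketch; the time bound suggested in the message is cut off in the archive snippet and is not reproduced here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: id of the CPU that won the right to enter the debugger, -1 if none. */
static atomic_int nmi_owner = -1;

/* Hint to the core that we are in a spin loop; on x86 this emits PAUSE. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

/* Exactly one caller succeeds: the first CPU to swap -1 for its own id. */
bool nmi_try_claim(int cpu)
{
    int expected = -1;
    return atomic_compare_exchange_strong(&nmi_owner, &expected, cpu);
}

/*
 * Losers spin here instead of waiting for a second NMI that will never
 * be delivered while they are still inside the NMI handler.
 * Returns true if released, false if the iteration bound ran out.
 */
bool nmi_spin_wait(atomic_bool *release, unsigned long max_iters)
{
    for (unsigned long i = 0; i < max_iters; i++) {
        if (atomic_load_explicit(release, memory_order_acquire))
            return true;
        cpu_relax();
    }
    return false;
}
```

The loop takes no locks and never sleeps, which matches the point made elsewhere in the thread that spinning is probably the only safe thing to do in NMI context. PAUSE lowers the power drawn by the spinning core somewhat, but the core is still busy, so the thermal concern does not go away entirely.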

Re: freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-21 Thread Konstantin Belousov
On Sat, Aug 22, 2015 at 12:53:15AM +0800, Julian Elischer wrote: On 8/22/15 12:23 AM, Ian Lepore wrote: On Fri, 2015-08-21 at 23:30 +0800, Julian Elischer wrote: On 8/21/15 11:25 PM, Adrian Chadd wrote: Ah, cool. I'll give it a whirl. I'm a little worried about having all of the other

freebsd-head: suddenly NMI panics lead to ddb being unable to stop CPUs?

2015-08-20 Thread Adrian Chadd
Hi! This has started happening on -HEAD recently. No, I don't have any more details yet than "recently". Whenever I get an NMI panic (and getting an NMI is a separate issue, sigh) I get a slew of "failed to stop cpu" messages, and all CPUs enter ddb. This is .. sub-optimal. Has anyone seen this?