I have seen similar behaviour before.  The problem is that every CPU
receives an NMI concurrently.  As I recall, one of them gets some kind of
pseudo-spinlock and tries to stop the other CPUs with an NMI.  However,
because they are already in an NMI handler, they don't get the second NMI
and don't stop properly.

The case that I saw actually had to do with a panic triggered by an NMI,
not entering the debugger, but I believe that both cases use
stop_cpus_hard() under the hood and have a similar issue.

(I also recall seeing the exact situation that you describe while
originally developing SR-IOV on an alpha version of the Fortville hardware
and firmware with a very buggy SR-IOV implementation.  I've never seen it
on ixgbe before, although I haven't used SR-IOV there very much at all)


On Thu, Aug 20, 2015 at 6:15 PM, Adrian Chadd <adr...@freebsd.org> wrote:

> Hi!
>
> This has started happening on -HEAD recently. No, I don't have any
> more details yet than "recently."
>
> Whenever I get an NMI panic (and getting an NMI is a separate issue,
> sigh) I get a slew of "failed to stop cpu" messages, and all CPUs
> enter ddb. This is .. sub-optimal. Has anyone seen this? Does anyone
> have any ideas?
>
>
> -adrian
> _______________________________________________
> freebsd-a...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscr...@freebsd.org"
>
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to