On Wed, May 12, 2021 at 07:08:39PM -0500, Scott Cheloha wrote:
> In a separate mail thread, bluhm@ mentioned that panic(9) does not
> cleanly handle multiple CPUs entering it simultaneously:
> 
> https://marc.info/?l=openbsd-tech&m=161908805925325&w=2
> 
> I'm unsure which part of panic(9) is causing the problem he mentions,
> but one obvious issue I see is that panicstr is not set atomically,
> so two CPUs entering panic(9) simultaneously may clobber panicbuf.
> 
> If we set panicstr atomically only one CPU will write panicbuf.

I think most of the clobbering is explained by more than one CPU writing
to the console at the same time. The vsnprintf() and setting of panicstr
usually happen quickly, so the kind of garbling occasionally seen with
nearly simultaneous panicking is not likely to arise there. Console I/O,
on the other hand, can be orders of magnitude slower. That, and the fact
that mutexes become no-ops once panicstr is set, create a slow phase
where multiple CPUs can easily be running panic() concurrently even if
the initial timings were not so close after all.

I feel that panic() should let only the first panicker run the panic
code and stop any other CPUs, like NetBSD does. Another option is to
serialize panic() in a more proper way. Or maybe secondary panickers
should just delay a little at the start of panic()...
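
To make the first option concrete, here is a rough userland sketch of
"first panicker wins", assuming C11 atomics. It is not the kernel code and
not what NetBSD does; sketch_panicstr, sketch_panicbuf and
sketch_panic_enter() are hypothetical names standing in for the real
panicstr/panicbuf handling.

```c
#include <stdarg.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's panicstr and panicbuf. */
static _Atomic(const char *) sketch_panicstr = NULL;
static char sketch_panicbuf[512];

/*
 * Returns 1 if the caller won the race and now owns the panic path,
 * 0 if some other caller claimed it first.
 */
static int
sketch_panic_enter(const char *fmt, ...)
{
	const char *expected = NULL;
	va_list ap;

	/* Claim ownership before touching the shared buffer. */
	if (!atomic_compare_exchange_strong(&sketch_panicstr, &expected, fmt))
		return 0;

	/* Only the winner formats the message, so the buffer is not clobbered. */
	va_start(ap, fmt);
	vsnprintf(sketch_panicbuf, sizeof(sketch_panicbuf), fmt, ap);
	va_end(ap);
	atomic_store(&sketch_panicstr, sketch_panicbuf);
	return 1;
}
```

A caller that gets 0 back would be the one to stop or spin instead of
going on to the console and ddb. What exactly the losers should do is the
open question; the sketch only shows the election.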
