On Sat, May 22, 2021 at 01:35:53AM +0200, Alexander Bluhm wrote:
> On Fri, May 21, 2021 at 02:00:54PM -0500, Scott Cheloha wrote:
> > Given all of this, would it be better if secondary CPUs spin in
> > panic(9) instead of trying to print anything?
> 
> The panic code should be as primitive as possible.  The garbled
> output also tells me something.  Two CPUs are failing simultaneosly.
> Please don't suppress that information.
> 
> The crash is the problem, not the ugly printing.

I get where you're coming from in principle (simpler is better) but I
think you're prioritizing a minor concern over the bigger picture.

I think it is strictly better for one CPU to run the panic code than
to have two CPUs doing the same.  If the panic code is primitive then
it probably isn't MP-safe.

Further, what about kernels without ddb(4)?  What happens if two or
more CPUs all simultaneously call reboot() at the end of panic(9)?  It
would vary by platform at a minimum.  Wouldn't deterministic behavior
be better in that case?

We can keep the garbled printing if you like, I can see how it works
as a diagnostic feature, but I think the losing CPUs should spin in
panic.  Fewer things can go wrong.

What about something like this?

Index: subr_prf.c
===================================================================
RCS file: /cvs/src/sys/kern/subr_prf.c,v
retrieving revision 1.103
diff -u -p -r1.103 subr_prf.c
--- subr_prf.c  16 May 2021 15:10:20 -0000      1.103
+++ subr_prf.c  25 May 2021 03:09:19 -0000
@@ -184,31 +184,50 @@ void
 panic(const char *fmt, ...)
 {
        static char panicbuf[512];
-       int bootopt;
+       static struct cpu_info *paniccpu;
+       struct cpu_info *ci;
+       int bootopt, recursive, spin;
        va_list ap;
 
-       /* do not trigger assertions, we know that we are inconsistent */
-       splassert_ctl = 0;
+       ci = curcpu();
+       recursive = 0;
+       spin = 0;
 
-       /* make sure we see kernel printf output */
-       printf_flags |= TOCONS;
+       if (atomic_cas_ptr(&paniccpu, NULL, ci) != NULL) {
+               if (paniccpu != ci)
+                       spin = 1;
+               else
+                       recursive = 1;
+       } else {
+               panicstr = panicbuf;
+
+               /* do not trigger assertions, we know that we are inconsistent 
*/
+               splassert_ctl = 0;
+
+               /* make sure we see kernel printf output */
+               printf_flags |= TOCONS;
+       }
 
        bootopt = RB_AUTOBOOT | RB_DUMP;
-       va_start(ap, fmt);
-       if (panicstr)
+
+       if (recursive || spin) {
                bootopt |= RB_NOSYNC;
-       else {
+               printf("panic: ");
+               va_start(ap, fmt);
+               vprintf(fmt, ap);
+               va_end(ap);
+               printf("\n");
+       } else {
+               va_start(ap, fmt);
                vsnprintf(panicbuf, sizeof panicbuf, fmt, ap);
-               panicstr = panicbuf;
+               va_end(ap);
+               printf("panic: %s\n", panicbuf);
        }
-       va_end(ap);
-
-       printf("panic: ");
-       va_start(ap, fmt);
-       vprintf(fmt, ap);
-       printf("\n");
-       va_end(ap);
 
+       if (spin) {
+               for (;;)
+                       CPU_BUSY_CYCLE();
+       }
 #ifdef DDB
        if (db_panic)
                db_enter();

Reply via email to