On Sat, 4 Jan 2003, Poul-Henning Kamp wrote:

> I pressed ctrl-T to see how far my compile had come and was rewarded
> with a LOR:
>
> llock order reversal
>  1st 0xc0347680 sched lock (sched lock) @ ../../../kern/tty.c:2386
>  2nd 0xc039f3e0 sio (sio) @ ../../../dev/sio/sio.c:3200
> Debugger("witness_lock")
> Stopped at      Debugger+0x54:  xchgl   %ebx,in_Debugger.0
> db> trace
> Debugger(c0308bc1,c039f3e0,c035cf40,c035cf40,c0330b72) at Debugger+0x54
> witness_lock(c039f3e0,8,c0330b72,c80,c0322f08) at witness_lock+0x667
> _mtx_lock_spin_flags(c039f3e0,0,c0330b72,c80,0) at _mtx_lock_spin_flags+0xd1
> siocnputc(c036fca0,63,5,d6985b68,0) at siocnputc+0x81
> cnputc(63,ffffffff,1,c032024c,225) at cnputc+0x5d
> putchar(63,d6985b68,d6985aa0,c01c84b4,c14f9800) at putchar+0xcd
> kvprintf(c032024b,c01e04f0,d6985b68,a,d6985b88) at kvprintf+0x8d
> printf(c032024b,2a51,108f5,c4b06868,c0322f08) at printf+0x57
> calcru(c4b06720,d6985c40,d6985c48,0,22a) at calcru+0x20b
> ttyinfo(c14f9800,0,c0324bb5,22a,c01b98bd) at ttyinfo+0x1f9
> ttyinput(14,c14f9800,c0330b72,647,c3fe2740) at ttyinput+0x7c3
> sioinput(c401b000,0,c0330b72,856,0) at sioinput+0x1d3
> siopoll(0,0,c031e372,216,c14fe720) at siopoll+0x116
> ithread_loop(c14fbe80,d6985d48,c031e1ef,361,0) at ithread_loop+0x182
> fork_exit(c01b0680,c14fbe80,d6985d48) at fork_exit+0xc4
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xd6985d7c, ebp = 0 ---
> db>

This is a variation on the old, unfixed bug that printf() is not safe
to call while sched_lock is held.  calcru() always holds sched_lock
but only calls printf() when the timecounter goes backwards.  Timecounters
were detected to go backwards much more often in mi_switch(), so the
corresponding message there was first turned off and then shot.

I thought that this bug didn't affect the sio console driver.  Console
drivers must not use normal locking for their low-level i/o for various
reasons, but the bug here seems to be just that witness doesn't
understand sio_lock.  I think sio_lock needs to be initialized with
MTX_NOWITNESS as well as or instead of MTX_QUIET.  I changed my version
to do this so long ago that I forget doing it:

%%%
Index: sio.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/sio/sio.c,v
retrieving revision 1.382
diff -u -2 -r1.382 sio.c
--- sio.c       11 Oct 2002 20:22:20 -0000      1.382
+++ sio.c       17 Oct 2002 20:18:53 -0000
@@ -512,7 +526,7 @@
        while (sio_inited != 2)
                if (atomic_cmpset_int(&sio_inited, 0, 1)) {
+                       /* XXX might need MTX_QUIET instead of MTX_NOWITNESS.*/
                        mtx_init(&sio_lock, sio_driver_name, NULL,
-                           (comconsole != -1) ?
-                           MTX_SPIN | MTX_QUIET : MTX_SPIN);
+                                MTX_SPIN | MTX_NOWITNESS);
                        atomic_store_rel_int(&sio_inited, 2);
                }
@@ -3197,6 +3295,6 @@
        s = spltty();
        need_unlock = 0;
-       if (sio_inited == 2 && !mtx_owned(&sio_lock)) {
-               mtx_lock_spin(&sio_lock);
+       if (!db_active && sio_inited == 2 && !mtx_owned(&sio_lock)) {
+               COM_LOCK();
                need_unlock = 1;
        }
%%%

Perhaps MTX_QUIET is still required.  I think initializing the lock
depending on whether comconsole != -1 is broken for at least the case
where consoleness is initialized later using a sysctl.  MTX_NOWITNESS
may make the sio entry in the lock order list bogus.  The patch also fixes
the style bug of using KNF indentation for continued lines; sio is not in
KNF.


The changes in the second hunk do the following:
- checking db_active stops ddb going near sio_lock to prevent problems
  when changes of sio_lock are traced.  When ddb is active, other CPUs
  are locked out by a ddb lock and interrupts on the current CPU are
  locked out by disable_intr() (modulo bugs).
- COM_LOCK() is mtx_lock_spin(&sio_lock) in the SMP case and null otherwise.
  Sio's locking is nothing like I want it to be in either version, but the
  macro hides some of the details.

Bruce


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Reply via email to