Hi Jason,
Thanks for sharing the core dumps.
I've root-caused this as a deadlock. It is caused by using a spin lock
for the cv_timedwait_sig()/cv_signal() calls in
rtSemEventMultiWait()/RTSemEventSignal() in VirtualBox's vboxdrv
module. You may want to try the latest VirtualBox to see whether the
issue is still there. See the analysis below for details:
> ::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc36f80  1b    1    2  -1   no    no t-5    ffffff0007805c60 (idle)
                       |    |    |
            RUNNING <--+    |    +--> PIL THREAD
              READY         |           2 ffffff00078cbc60
             EXISTS         |
             ENABLE         +--> PRI THREAD           PROC
                                  60 ffffff0007e11c60 sched

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 ffffff01cbee7000  1f    0    0 168   no    no t-6968 ffffff0007c49c60 sched
                       |    |
            RUNNING <--+    +--> PIL THREAD
              READY                9 ffffff0007c49c60
           QUIESCED                6 ffffff0007c4fc60
             EXISTS                - ffffff01d6d55740 VirtualBox
             ENABLE
Here we see that ffffff01d6d55740 is preempted by the e1000g interrupt
(PIL 6), and the e1000g interrupt is in turn preempted by the nvidia
graphics interrupt.
> ffffff0007c49c60::findstack -v
stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
ffffff0007c49bb0 rm_isr+0xaa()
ffffff0007c49c00 av_dispatch_autovect+0x7c(10)
ffffff0007c49c40 dispatch_hardint+0x33(10, 6)
ffffff0007c4f450 switch_sp_and_call+0x13()
ffffff0007c4f4a0 do_interrupt+0x9e(ffffff0007c4f4b0, b)
ffffff0007c4f4b0 _interrupt+0xba()
ffffff0007c4f5c0 default_lock_delay+0x8c(4)
ffffff0007c4f630 lock_set_spl_spin+0xc2(ffffff01e4840d99, b, ffffff01e4840d9c, 6)
ffffff0007c4f690 mutex_vector_enter+0x45e(ffffff01e4840d98)
ffffff0007c4f6c0 RTSemEventSignal+0x6a()
ffffff0007c4f740 0xfffffffff836c57b()
ffffff0007c4f770 0xfffffffff836d73a()
ffffff0007c4f830 vboxNetFltSolarisRecv+0x331()
ffffff0007c4f880 VBoxNetFltSolarisModReadPut+0x107()
ffffff0007c4f8f0 putnext+0x21e(ffffff01e425ed30, ffffff01ccb1a0a0)
ffffff0007c4f950 dld_str_rx_raw+0xb3(ffffff01ce74a6c8, 0, ffffff01ccb1a0a0, ffffff0007c4f980)
ffffff0007c4fa10 dls_rx_promisc+0x179(ffffff01ce74a6c8, 0, ffffff01ccb1a0a0, 0)
ffffff0007c4fa50 mac_promisc_dispatch_one+0x5f(ffffff01df15ee78, ffffff01d20f7640, 0)
ffffff0007c4fac0 mac_promisc_dispatch+0x105(ffffff01cdc5f788, ffffff01d20f7640, 0)
ffffff0007c4fb10 mac_rx+0x3e(ffffff01cdc5f788, ffffff01cdb86ee8, ffffff01d20f7640)
ffffff0007c4fb50 mac_rx_ring+0x4c(ffffff01cdc5f788, ffffff01cdb86ee8, ffffff01d20f7640, 1)
ffffff0007c4fbb0 e1000g_intr+0x17e(ffffff01cd3ec000)
ffffff0007c4fc00 av_dispatch_autovect+0x7c(30)
ffffff0007c4fc40 dispatch_hardint+0x33(30, 0)
ffffff0008b8e7d0 switch_sp_and_call+0x13()
ffffff0008b8e820 do_interrupt+0x9e(ffffff0008b8e830, 0)
ffffff0008b8e830 _interrupt+0xba()
ffffff0008b8e960 untimeout_generic+0xaf(17e3e41c8, 0)
ffffff0008b8e980 untimeout_default+0x15(17e3e41c8, 0)
ffffff0008b8ea00 cv_timedwait_sig_internal+0x1f8(ffffff01dd2a2910, ffffff01dd2a2908, 553c76, 0)
ffffff0008b8ea30 cv_timedwait_sig+0x1f(ffffff01dd2a2910, ffffff01dd2a2908, 553c76)
ffffff0008b8ea90 rtSemEventMultiWait+0xe2()
ffffff0008b8eae0 0xfffffffff8333a80()
ffffff0008b8eb10 0xfffffffff833e60c()
ffffff0008b8eb50 0xfffffffff833edeb()
ffffff0008b8ec30 supdrvIOCtl+0xd62()
ffffff0008b8ed00 VBoxDrvSolarisIOCtl+0x359()
ffffff0008b8ed40 cdev_ioctl+0x45(10a00000000, c0185687, fffffd7ffe55cda0, 202003, ffffff01dd4364b0, ffffff0008b8ee24)
ffffff0008b8ed80 spec_ioctl+0x83(ffffff01de3ca200, c0185687, fffffd7ffe55cda0, 202003, ffffff01dd4364b0, ffffff0008b8ee24, 0)
ffffff0008b8ee00 fop_ioctl+0x7b(ffffff01de3ca200, c0185687, fffffd7ffe55cda0, 202003, ffffff01dd4364b0, ffffff0008b8ee24, 0)
ffffff0008b8ef00 ioctl+0x18e(17, ffffffffc0185687, fffffd7ffe55cda0)
ffffff0008b8ef10 sys_syscall+0x17b()
>
Anyway, the e1000g interrupt thread (ffffff0007c4fc60, which preempted
ffffff01d6d55740) is spinning on a spin lock:
> ffffff01e4840d98::mutex
ADDR TYPE HELD MINSPL OLDSPL WAITERS
ffffff01e4840d98 spin yes 11 0 -
The question now is: who is holding the spin lock? Let's print out
all the threads related to vboxdrv:
> ::stacks -a -m vboxdrv
THREAD STATE SOBJTYPE COUNT
ffffff01ccad6e60 SLEEP MUTEX 1
swtch+0x147
turnstile_block+0x764
mutex_vector_enter+0x261
timeout_generic+0x87
cv_timedwait_sig_internal+0xef
cv_timedwait_sig+0x1f
rtSemEventWait+0xe2
0xfffffffff836d354
0xfffffffff833e6ae
0xfffffffff833edeb
supdrvIOCtl+0xd62
VBoxDrvSolarisIOCtl+0x359
cdev_ioctl+0x45
spec_ioctl+0x83
fop_ioctl+0x7b
ioctl+0x18e
THREAD STATE SOBJTYPE COUNT
ffffff0007c4fc60 ONPROC <NONE> 1
default_lock_delay+0x8c
lock_set_spl_spin+0xc2
mutex_vector_enter+0x45e
RTSemEventSignal+0x6a
0xfffffffff836c57b
0xfffffffff836d73a
vboxNetFltSolarisRecv+0x331
VBoxNetFltSolarisModReadPut+0x107
putnext+0x21e
dld_str_rx_raw+0xb3
dls_rx_promisc+0x179
mac_promisc_dispatch_one+0x5f
mac_promisc_dispatch+0x105
mac_rx+0x3e
mac_rx_ring+0x4c
e1000g_intr+0x17e
av_dispatch_autovect+0x7c
dispatch_hardint+0x33
THREAD STATE SOBJTYPE COUNT
ffffff01d6d55740 ONPROC <NONE> 1
untimeout_generic+0xaf
untimeout_default+0x15
cv_timedwait_sig_internal+0x1f8
cv_timedwait_sig+0x1f
rtSemEventMultiWait+0xe2
0xfffffffff8333a80
0xfffffffff833e60c
0xfffffffff833edeb
supdrvIOCtl+0xd62
VBoxDrvSolarisIOCtl+0x359
cdev_ioctl+0x45
spec_ioctl+0x83
fop_ioctl+0x7b
ioctl+0x18e
Now we can see the deadlock scenario:
- Thread A (ffffff01d6d55740) is cancelling a cv timeout while holding the
  callout table lock (taken at untimeout_generic+0x54).
- Thread B (ffffff01ccad6e60) is blocked, sleeping while trying to get the
  callout table lock in order to arm a timeout. However, it is holding the
  vboxdrv spin lock (taken at rtSemEventWait+0x8b) and has not yet had a
  chance to release it (that happens later, at cv_timedwait_sig_internal+0x138).
- The e1000g interrupt (thread ffffff0007c4fc60) preempted thread A and calls
  RTSemEventSignal(), which needs the vboxdrv spin lock, but that lock is held
  by thread B. This causes the e1000g interrupt thread to spin there forever
  on CPU1. (Only interrupts with higher priority than e1000g, i.e. > 6, can
  get scheduled on CPU1.)
So thread A stays pinned beneath the spinning interrupt thread and can never
release the callout table lock, thread B can never get the callout table lock
and release the spin lock, and the interrupt thread can never get the spin
lock: a cycle that never resolves.
To solve this issue, vboxdrv needs to be changed to use an adaptive
lock rather than a spin lock for the cv_timedwait_sig()/cv_signal()
calls in rtSemEventMultiWait()/RTSemEventSignal().
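To make that concrete, here is a rough sketch of the pattern I have in
mind, written against the Solaris DDI with made-up names (rtsemevent_t
and friends); it is not the actual IPRT/VirtualBox source, and the
remark about how the lock is created today is only my assumption:

/*
 * Minimal sketch only; illustrative names, not the real IPRT code.
 * The key point: the mutex paired with the condvar is adaptive, so
 * cv_timedwait_sig() can take the adaptive callout table lock
 * internally, and a signaller can block briefly instead of spinning
 * with the SPL raised.
 */
#include <sys/types.h>
#include <sys/ksynch.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

typedef struct rtsemevent {
    kmutex_t    mtx;
    kcondvar_t  cv;
    boolean_t   signaled;
} rtsemevent_t;

static void
rtsemevent_init(rtsemevent_t *sem)
{
    /*
     * Adaptive mutex: MUTEX_DRIVER with a NULL interrupt priority.
     * (My assumption is that the current code passes MUTEX_SPIN, or an
     * interrupt priority above LOCK_LEVEL, which is what makes it a
     * spin lock.)
     */
    mutex_init(&sem->mtx, NULL, MUTEX_DRIVER, NULL);
    cv_init(&sem->cv, NULL, CV_DRIVER, NULL);
    sem->signaled = B_FALSE;
}

/* Wait until signaled, interrupted, or the absolute deadline (in ticks). */
static int
rtsemevent_wait(rtsemevent_t *sem, clock_t deadline)
{
    int rc = 1;

    mutex_enter(&sem->mtx);
    while (!sem->signaled && rc > 0)
        rc = cv_timedwait_sig(&sem->cv, &sem->mtx, deadline);
    if (sem->signaled)
        sem->signaled = B_FALSE;
    mutex_exit(&sem->mtx);
    return (rc);    /* >0 woken, 0 interrupted by signal, -1 timed out */
}

/* Safe from the e1000g receive path: PIL 6 is at or below LOCK_LEVEL. */
static void
rtsemevent_signal(rtsemevent_t *sem)
{
    mutex_enter(&sem->mtx);
    sem->signaled = B_TRUE;
    cv_signal(&sem->cv);
    mutex_exit(&sem->mtx);
}

With an adaptive mutex, a signaller that finds the lock held by a
sleeping thread simply blocks instead of raising the SPL and spinning
on the CPU, and it is then harmless for the waiter to block on the
callout table lock inside cv_timedwait_sig().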
On the other hand, can we also improve the Solaris cv_* interfaces to
avoid this kind of issue? I have CCed Eric and Madhavan for comments.
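As a straw man (purely a sketch of the idea; as far as I know no such
check exists today), the cv_timedwait* paths could refuse a spin mutex
outright in DEBUG kernels, so this kind of misuse shows up as an
immediate assertion failure rather than as a rare deadlock under
interrupt load:

/*
 * Hypothetical DEBUG-only check; sketch only.  MUTEX_TYPE_SPIN() lives
 * in the private sys/mutex_impl.h, so a real change would want an
 * exported predicate instead.
 */
#include <sys/mutex.h>
#include <sys/mutex_impl.h>
#include <sys/debug.h>

static void
cv_assert_wait_mutex(kmutex_t *mp)
{
    /*
     * cv_timedwait*() may call timeout()/untimeout(), which take the
     * adaptive callout table lock; that must never happen while the
     * caller holds a spin lock with the SPL raised.
     */
    ASSERT(MUTEX_HELD(mp));
    ASSERT(!MUTEX_TYPE_SPIN((mutex_impl_t *)mp));
}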
Regards,
Kerry
Jason King wrote:
On Thu, Sep 17, 2009 at 11:55 AM, Dana H. Myers <[email protected]> wrote:
Kerry Shu wrote:
What you are looking for is 0x10, not 0x0a. It looks to me like you
have an IRQ# 16 interrupt (which might be hci1394#0, uhci#3, uhci#0, or
nvidia#0) preempting the e1000g#0 interrupt. I guess this situation
happened frequently, since you felt the system freeze. So are you
running something that keeps both e1000g0 and the other 4 driver
instances at IRQ# 16 busy? For example, are you putting a heavy load on
both the network and graphics?
I don't understand what the VirtualBox/bridged-Ethernet configuration
does here, but one comment about xHCIs - they tend to interrupt
regularly at a reasonable rate, IIRC, even without a heavy
load. Graphics interrupts periodically but not that often, I
believe tied to vertical retrace/frame rate, which is 50-100 per
second.
Could it be possible that it's stuck in the e1000g ISR and the IRQ
16 preemption is just a red herring?
I'm not really familiar with the OpenSolaris interrupt implementation,
so I'm making some assumptions here (corrections are welcome) ---
looking at the stack, I'm guessing the preemption by IRQ 16 starts at
the _interrupt call. Since xHCIs interrupt regularly, if the e1000g
ISR (or really the vboxfilt STREAMS module sitting on top of e1000g) is
deadlocked on a mutex, isn't that what you'd see? I'm assuming the
mutex call prior to _interrupt isn't part of the preemption process.
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss