I believe PIL 10 is used by the system timer. Probably more
investigation is required.
- Garrett
Jason King wrote:
On Thu, Sep 17, 2009 at 10:16 AM, Garrett D'Amore <[email protected]> wrote:
Look closely at the stack. You'll notice that a PIL9 interrupt
*interrupted* e1000g while it was servicing an interrupt. I don't think
e1000g is at fault here. Something else is doing it.
This probably reflects my lack of knowledge about how Solaris handles
interrupts, but after a little digging:
0xffffff0007c49c60::findstack -v
stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
ffffff0007c49bb0 rm_isr+0xaa()
ffffff0007c49c00 av_dispatch_autovect+0x7c(10)
ffffff0007c49c40 dispatch_hardint+0x33(10, 6)
ffffff0007c4f450 switch_sp_and_call+0x13()
ffffff0007c4f4a0 do_interrupt+0x9e(ffffff0007c4f4b0, b)
ffffff0007c4f4b0 _interrupt+0xba()
I'm assuming this portion of the stack dump is what you're talking
about. Looking at the function signature for dispatch_hardint, the
new vector is 10 and the old IPL is 6.
::interrupts -d
IRQ Vect IPL Bus Trg Type CPU Share APIC/INT# Driver Name(s)
3 0xb1 12 ISA Edg Fixed 0 1 0x0/0x3 asy#1
4 0xb0 12 ISA Edg Fixed 0 1 0x0/0x4 asy#0
6 0x41 5 ISA Edg Fixed 0 1 0x0/0x6 fdc#0
7 0x42 5 ISA Edg Fixed 1 1 0x0/0x7 ecpp#0
9 0x81 9 PCI Lvl Fixed 1 1 0x0/0x9 acpi_wrapper_isr
15 0x43 5 ISA Edg Fixed 0 1 0x0/0xf ata#1
16 0x83 9 PCI Lvl Fixed 1 4 0x0/0x10 hci1394#0, uhci#3, uhci#0, nvidia#0
17 0x87 8 PCI Lvl Fixed 0 1 0x0/0x11 audio810#0
18 0x86 9 PCI Lvl Fixed 1 1 0x0/0x12 pci-ide#1
19 0x85 9 PCI Lvl Fixed 0 1 0x0/0x13 uhci#1
23 0x84 9 PCI Lvl Fixed 1 1 0x0/0x17 ehci#0
26 0x40 5 PCI Lvl Fixed 1 1 0x1/0x2 aac#0
48 0x60 6 PCI Lvl Fixed 1 1 0x2/0x0 e1000g#0
72 0x82 7 PCI Edg MSI 0 1 - pcie_pci#0
73 0x30 4 PCI Edg MSI 0 1 - pcie_pci#2
74 0x44 5 PCI Edg MSI 0 1 - adpu320#0
160 0xa0 0 Edg IPI all 0 - poke_cpu
192 0xc0 13 Edg IPI all 1 - xc_serv
208 0xd0 14 Edg IPI all 1 - kcpc_hw_overflow_intr
209 0xd1 14 Edg IPI all 1 - cbe_fire
210 0xd3 14 Edg IPI all 1 - cbe_fire
240 0xe0 15 Edg IPI all 1 - xc_serv
241 0xe1 15 Edg IPI all 1 - apic_error_intr
That makes sense -- e1000g#0 is IPL 6. However, shouldn't there then be
an entry somewhere in there with a Vect value of 0x0a and an IPL of 9?
Or do I still have more learning to do?
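One possible reading of the numbers above, offered as a sketch rather than a diagnosis: mdb prints stack frame arguments in hexadecimal by default, so the 10 passed to av_dispatch_autovect would be 0x10, that is, decimal 16. Assuming that argument is the IRQ number (an assumption, not something confirmed in this thread), it can be matched against the IRQ column of the ::interrupts -d table rather than by searching for a Vect of 0x0a. The rows below are copied from the output above; INTERRUPTS and lookup_irq are hypothetical names used only for illustration.

```python
# A few rows copied from the ::interrupts -d output above:
# (IRQ in decimal, Vect, IPL, driver names).
INTERRUPTS = [
    (9, 0x81, 9, "acpi_wrapper_isr"),
    (16, 0x83, 9, "hci1394#0, uhci#3, uhci#0, nvidia#0"),
    (19, 0x85, 9, "uhci#1"),
    (48, 0x60, 6, "e1000g#0"),
]


def lookup_irq(mdb_arg):
    """Treat an mdb frame argument as hex and return the matching IRQ row.

    ASSUMPTION: the argument is an IRQ number printed in hexadecimal.
    Returns None when no row in INTERRUPTS has that IRQ.
    """
    irq = int(mdb_arg, 16)
    for row in INTERRUPTS:
        if row[0] == irq:
            return row
    return None


# The argument shown in av_dispatch_autovect+0x7c(10)
match = lookup_irq("10")
print(match)
```

Under these assumptions the lookup lands on the shared IRQ 16 line at IPL 9, which would be consistent with the PIL9 warning from scat.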
- Garrett
Jason King wrote:
I have a desktop that keeps freezing. After some work, I managed to
force a crash dump via kmdb. It's a dual-core Xeon desktop running
OpenSolaris 2009.06; in this case I'm running VirtualBox on it with a
bridged Ethernet connection.
My very rudimentary analysis is this:
zsh 3 % scat 0
Solaris[TM] CAT 5.2 for Solaris 11 64-bit x64
SV4990M, Aug 26 2009
Copyright © 2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Feedback regarding the tool should be sent to [email protected]
Visit the Solaris CAT blog at http://blogs.sun.com/SolarisCAT
opening unix.0 vmcore.0 ...dumphdr...symtab...core...done
loading core data: modules...symbols...CTF...done
core file: /var/crash/homer/vmcore.0
user: Jason King (jking:101)
release: 5.11 (64-bit)
version: snv_111b
machine: i86pc
node name: homer
system type: i86pc
hostid: 4eda84
dump_conflags: 0x10000 (DUMP_KERNEL) on /dev/zvol/dsk/rpool/dump(1.96G)
snooping: 0x1
boothowto: 0x22040 (DEBUG|VERBOSE|KMDB)
time of crash: Thu Sep 17 09:39:15 CDT 2009
age of system: 15 hours 32 minutes 10.05 seconds
panic CPU: 0 (2 CPUs, 3.93G memory)
panic string: BAD TRAP: type=e (#pf Page fault) rp=ffffff00078d1da0
addr=0 occurred in module "<unknown>" due to a NULL pointer
dereference
sanity checks: settings...vmem...
WARNING: CPU0 has cpu_intr_actv for 2
WARNING: CPU1 has cpu_intr_actv for 6 9
WARNING: last_swtch[1]: 0x553c75 (1 minutes 9.68 seconds earlier)
WARNING: PIL9 interrupt thread 0xffffff0007c49c60 on CPU1 pinning PIL6
interrupt thread 0xffffff0007c4fc60 pinning IA thread
0xffffff01d6d55740
sysent...clock...misc...
WARNING: 54 expired realtime (max -1m41.272660310s) and 27 expired
normal (max -11.562660310s) callouts
done
Does this mean that interrupt thread 0xffffff0007c4fc60 is taking too
long? It would explain why the box seems to hang.
That thread is:
% mdb -k unix.0 vmcore.0
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc
pcplusmp scsi_vhci zfs sd sockfs ip hook neti sctp arp usba uhci s1394
fctl md lofs audiosup fcip fcp cpc random crypto logindmux ptm ufs
nsmb sppp ipc ]
0xffffff0007c49c60::findstack
stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
ffffff0007c49bb0 rm_isr+0xaa()
ffffff0007c49c00 av_dispatch_autovect+0x7c()
ffffff0007c49c40 dispatch_hardint+0x33()
ffffff0007c4f450 switch_sp_and_call+0x13()
ffffff0007c4f4a0 do_interrupt+0x9e()
ffffff0007c4f4b0 _interrupt+0xba()
ffffff0007c4f5c0 default_lock_delay+0x8c()
ffffff0007c4f630 lock_set_spl_spin+0xc2()
ffffff0007c4f690 mutex_vector_enter+0x45e()
ffffff0007c4f6c0 RTSemEventSignal+0x6a()
ffffff0007c4f740 0xfffffffff836c57b()
ffffff0007c4f770 0xfffffffff836d73a()
ffffff0007c4f830 vboxNetFltSolarisRecv+0x331()
ffffff0007c4f880 VBoxNetFltSolarisModReadPut+0x107()
ffffff0007c4f8f0 putnext+0x21e()
ffffff0007c4f950 dld_str_rx_raw+0xb3()
ffffff0007c4fa10 dls_rx_promisc+0x179()
ffffff0007c4fa50 mac_promisc_dispatch_one+0x5f()
ffffff0007c4fac0 mac_promisc_dispatch+0x105()
ffffff0007c4fb10 mac_rx+0x3e()
ffffff0007c4fb50 mac_rx_ring+0x4c()
ffffff0007c4fbb0 e1000g_intr+0x17e()
Do I appear to be on the right track, and can anyone offer any
additional suggestions on where to go from here (or even recognize the
problem)?
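As a reading aid for the ::findstack output above, here is a minimal sketch that splits the frames at the _interrupt entry: frames listed above it belong to the interrupt that arrived later, and frames below it are the code that was running when it arrived. The frame list is copied from the output above; STACK and split_at_interrupt are hypothetical names, and the split rule is an assumption about how to read this particular stack, not a general mdb feature.

```python
STACK = """\
ffffff0007c49bb0 rm_isr+0xaa()
ffffff0007c49c00 av_dispatch_autovect+0x7c()
ffffff0007c49c40 dispatch_hardint+0x33()
ffffff0007c4f450 switch_sp_and_call+0x13()
ffffff0007c4f4a0 do_interrupt+0x9e()
ffffff0007c4f4b0 _interrupt+0xba()
ffffff0007c4f5c0 default_lock_delay+0x8c()
ffffff0007c4f630 lock_set_spl_spin+0xc2()
ffffff0007c4f690 mutex_vector_enter+0x45e()
ffffff0007c4f6c0 RTSemEventSignal+0x6a()
ffffff0007c4f740 0xfffffffff836c57b()
ffffff0007c4f770 0xfffffffff836d73a()
ffffff0007c4f830 vboxNetFltSolarisRecv+0x331()
ffffff0007c4f880 VBoxNetFltSolarisModReadPut+0x107()
ffffff0007c4f8f0 putnext+0x21e()
ffffff0007c4f950 dld_str_rx_raw+0xb3()
ffffff0007c4fa10 dls_rx_promisc+0x179()
ffffff0007c4fa50 mac_promisc_dispatch_one+0x5f()
ffffff0007c4fac0 mac_promisc_dispatch+0x105()
ffffff0007c4fb10 mac_rx+0x3e()
ffffff0007c4fb50 mac_rx_ring+0x4c()
ffffff0007c4fbb0 e1000g_intr+0x17e()
"""


def split_at_interrupt(stack_text, marker="_interrupt"):
    """Return (newer_frames, interrupted_frames) as lists of symbol names.

    The marker frame itself is kept with the newer frames. Raises
    ValueError if the marker does not appear in the stack.
    """
    names = []
    for line in stack_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            # "rm_isr+0xaa()" -> "rm_isr"; "0xffff...()" -> "0xffff..."
            names.append(parts[1].split("+")[0].rstrip("()"))
    idx = names.index(marker)  # exact match, so do_interrupt is not hit
    return names[: idx + 1], names[idx + 1 :]


newer, interrupted = split_at_interrupt(STACK)
print("arrived later:", newer)
print("was running:  ", interrupted)
```

Read this way, the interrupted side starts at e1000g_intr and was in default_lock_delay under lock_set_spl_spin, reached through RTSemEventSignal, when rm_isr was dispatched on top of it.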
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss