Hi,
I have one workstation (HP xw4300) running Solaris 10 (x86) with one Digi
Sync570i card.
The system may hang at any time, after anywhere from a few minutes to a couple
of hours, while the card is receiving data frames.
I suspect the hang is caused by the driver module for the Sync570; however,
the same driver works properly on a Solaris 8 system.
We used to install Solaris 8 on the HP xw4100, but now we have to install
Solaris 10 on the HP xw4300 (the HP xw4100 is no longer available on the market).
I boot the Solaris system with kmdb loaded. After the system hangs I can't ping
the host, and the keyboard and mouse do not respond.
I can get a crash dump by pressing "F1+A" and then entering "$<systemdump".
Analysing the crash dump, I can't find problems such as a mutex deadlock or a
bad trap.
I really don't know what to do next!
# The crashdump files can be downloaded from the following URLs :
# www.ras.com.cn/rivanwang/crash_4.tar.gz
# www.ras.com.cn/rivanwang/crash_8.tar.gz
# www.ras.com.cn/rivanwang/crash_7_nor.tar.gz
# "crash_7_nor.tar.gz" was generated before the system hang occurred.
I have some questions as follows.
Would you be so kind as to give me some suggestions?
[[ Q1 ]]
I can't find the kernel thread corresponding to the Sync570 module by using the
command "::threadlist -v".
But I can get the LOADADDR:
::modinfo !grep Sync
161 feba4340 cc60 1 dsync (Sync/570 Device Driver)
How can I find the address of the kernel thread corresponding to the Sync570 module?
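To make the LOADADDR usable when scanning stacks by hand, here is a minimal sketch that checks whether a given address falls inside the range reported by ::modinfo for dsync. It assumes the module occupies the single contiguous range [LOADADDR, LOADADDR + SIZE), which may not hold if text and data are loaded separately. The helper names in_module and module_end are my own, and the sample addresses are simply copied from the dumps in this post.

```python
# Values copied from the ::modinfo line for dsync.
DSYNC_LOADADDR = 0xFEBA4340
DSYNC_SIZE = 0xCC60


def module_end(base=DSYNC_LOADADDR, size=DSYNC_SIZE):
    """First address past the module, assuming one contiguous range."""
    return base + size


def in_module(addr, base=DSYNC_LOADADDR, size=DSYNC_SIZE):
    """Return True if addr lies in [base, base + size)."""
    return base <= addr < module_end(base, size)


if __name__ == "__main__":
    print("dsync range: %#x .. %#x" % (DSYNC_LOADADDR, module_end()))
    # Sample addresses taken from the stack output in this post.
    for pc in (0xD94C62BC, 0xFE81189A):
        print("%#x in dsync: %s" % (pc, in_module(pc)))
```

Any program counter seen in a thread's stack that passes this check would be a candidate for a thread executing in the driver.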
[[ Q2 ]]
::msgbuf
panic[cpu0]/thread=d2c84de0:
BAD TRAP: type=e (#pf Page fault) rp=d2c84cec addr=0 occurred in module
"<unknown>" due to a NULL pointer dereference
sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0x202, eflags=0x10002
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0 cr3: 4226000
gs: 1b0 fs: 0 es: 160 ds: 160
edi: d2f50a60 esi: fef4b2a8 ebp: d2c84d34 esp: d2c84d1c
ebx: d2f54180 edx: d2f541f8 ecx: 1f eax: fed6c870
trp: e err: 10 eip: 0 cs: 158
efl: 10002 usp: 202 ss: d2c84d3c
d2c84c4c unix:die+a7 (e, d2c84cec, 0, 0)
d2c84cd8 unix:trap+f56 (d2c84cec, 0, 0)
d2c84cec unix:cmntrap+83 ()
d2c84d34 0 (d2c84d44, fe81189a,)
d2c84d3c genunix:kdi_dvec_enter+a (d2c84d50, fe81183c,)
d2c84d44 unix:debug_enter+32 (0)
d2c84d50 unix:abort_sequence_enter+27 (0)
d2c84d64 kbtrans:kbtrans_streams_key+3e (d2f54180, 1f, 0)
d2c84d88 kb8042:kb8042_received_byte+b2 (fef4b1a8, 1e)
d2c84da0 kb8042:kb8042_intr+65 (fef4b1a8)
d2c84db8 i8042:i8042_intr+a4 (d2f50980)
----------------------------------------------------------------------------
::cpuinfo -v
ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD PROC
0 fec20ae4 1b 8 0 104 no no t-740847 d2c84de0 sched
| | |
RUNNING <--+ | +--> PIL THREAD
READY | 5 d2c84de0
EXISTS | 3 d2ca0de0
ENABLE | - d2c28de0 (idle)
|
+--> PRI THREAD PROC
99 d2c9ade0 sched
99 d2c97de0 sched
60 d3264a00 fsflush
60 d2e1ade0 sched
60 d2e37de0 sched
60 d4644de0 sched
60 d96dcde0 sched
59 d38e7400 Xsun
d2c84de0::thread
ADDR STATE FLG PFLG SFLG PRI EPRI PIL INTR DISPTIME BOUND PR
d2c84de0 onproc 809 0 3 104 0 5 d2ca0de0 0 -1 2
d2ca0de0::thread
ADDR STATE FLG PFLG SFLG PRI EPRI PIL INTR DISPTIME BOUND PR
d2ca0de0 onproc 9 0 3 102 0 3 d2c28de0 46a51 -1 1
d2ca0de0::findstack -v
stack pointer for thread d2ca0de0: d2ca0c2c
d2ca0de0 0xd94c62bc()
----------------------------------------------------------------------------
After I pressed "F1+A", the kernel created the thread "d2c84de0" to respond to
the keyboard interrupt (PIL = 5, PRI = 104),
but another thread, "d2ca0de0", was still running on the CPU at the same time
(PIL = 3, PRI = 102).
I guess some event caused the kernel to create thread d2ca0de0, after which the
kernel hung. When I pressed "F1+A", the kernel created another thread,
d2c84de0, and finally crashed.
I have no idea what caused the kernel to create thread d2ca0de0
(PRI=102, PIL=3).
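As a side note on the BAD TRAP registers in the ::msgbuf output above, here is a small sketch that decodes the page-fault error code (err: 10) and the interrupt flag in eflags (10002). The bit layout follows the standard x86 page-fault error code and EFLAGS definitions; the function names are my own and are not part of mdb.

```python
def decode_pf_err(err):
    """Decode an x86 page-fault error code into its flag bits."""
    return {
        "present": bool(err & 0x01),            # 0 = page not present
        "write": bool(err & 0x02),              # 0 = read access
        "user": bool(err & 0x04),               # 0 = supervisor mode
        "reserved_bit": bool(err & 0x08),       # reserved bit set in a PTE
        "instruction_fetch": bool(err & 0x10),  # fault on instruction fetch
    }


def interrupts_enabled(eflags):
    """Return True if the IF bit (bit 9) is set in EFLAGS."""
    return bool(eflags & 0x200)


if __name__ == "__main__":
    # Values copied from the register dump: "err: 10" and "efl: 10002".
    print(decode_pf_err(0x10))
    print("interrupts enabled:", interrupts_enabled(0x10002))
```

With these values the fault decodes as a supervisor-mode instruction fetch from a non-present page, which fits eip = 0 (a call through a NULL function pointer), and interrupts were disabled at the time of the trap.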
[[ Q3 ]]
::cpuinfo
ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD PROC
0 fec20ae4 1b 8 0 104 no no t-740847 d2c84de0 sched
::cycinfo -v
CPU CYC_CPU STATE NELEMS ROOT FIRE HANDLER
0 d9aabe00 online 4 d9aabd80 96b6b848e80 clock
2
|
+------------------+------------------+
0 1
| |
+---------+--------+ +---------+---------+
3
|
+----+----+
ADDR NDX HEAP LEVL PEND FIRE USECINT HANDLER
d9aabd80 0 1 high 0 96b6b848e80 10000 cbe_hres_tick
d9aabda0 1 2 low 741253 96b6b848e80 10000 apic_redistribute_compute
d9aabdc0 2 0 lock 406 96b6b848e80 10000 clock
d9aabde0 3 3 high 0 96b6d4e5200 1000000 deadman
-----------------------------------------------------------------------------------
The SWITCH value shown for thread d2c84de0 is t-740847.
The PEND value of apic_redistribute_compute is 741253.
The PEND value of clock is 406.
(741253 - 406) == 740847
What does this mean? Could you please explain it?
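To put these counters on a common scale, the sketch below converts them to seconds. It assumes that each PEND count stands for one missed firing of a cyclic with the 10000 microsecond interval shown in the USECINT column, and that SWITCH is expressed in clock ticks of the same length. Both are my assumptions, not something the dump states, and pend_seconds is a name I made up.

```python
TICK_USEC = 10_000  # USECINT of clock and apic_redistribute_compute

LOW_PEND = 741253      # PEND of apic_redistribute_compute (low level)
LOCK_PEND = 406        # PEND of clock (lock level)
SWITCH_TICKS = 740847  # from "t-740847" in ::cpuinfo


def pend_seconds(count, usecint=TICK_USEC):
    """Time represented by `count` intervals of `usecint` microseconds."""
    return count * usecint / 1_000_000


if __name__ == "__main__":
    # The relation noted in the post.
    print(LOW_PEND - LOCK_PEND == SWITCH_TICKS)
    for name, count in (("low-level PEND", LOW_PEND),
                        ("lock-level PEND", LOCK_PEND),
                        ("SWITCH", SWITCH_TICKS)):
        print("%-16s %8d intervals = %.2f s" % (name, count, pend_seconds(count)))
```

Under those assumptions the low-level cyclic had been pending for roughly two hours and the clock cyclic for about four seconds.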
[[ Q4 ]]
::ipcs
Message queues:
failed to read 'msq_svc'; module not present
Shared memory:
ADDR REF ID KEY MODE PRJID ZONEID OWNER GROUP CREAT CGRP
d4915f50 1 3 103 0666 3 0 1002 102 1002 102
d3f0b090 1 2 101 0666 3 0 0 0 0 0
d3f0b2c0 1 1 102 0666 3 0 1002 102 1002 102
d3f0bbf0 1 0 100 0666 3 0 1002 102 1002 102
Semaphores:
ADDR REF ID KEY MODE PRJID ZONEID OWNER GROUP CREAT CGRP
d4915ee0 3 3 103 0666 3 0 1002 102 1002 102
d3f0b1e0 3 2 101 0666 3 0 0 0 0 0
d3f0b250 4 1 102 0666 3 0 1002 102 1002 102
d3f0bb80 7 0 100 0666 3 0 1002 102 1002 102
>
-------------------------------------------
I don't know which threads are accessing the semaphore "d3f0b1e0".
How can I find these unknown threads?
::showrev
Hostname: cetc.a28.com
Release: 5.10
Kernel architecture: i86pc
Application architecture: i386
Kernel version: SunOS 5.10 i86pc Generic
Platform: i86pc
::msgbuf
MESSAGE
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],2 (uhci2): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],3 (uhci3): failed to attach
cpu0: x86 (GenuineIntel family 15 model 4 step 10 clock 3000 MHz)
cpu0: Intel(r) Pentium(r) 4 CPU 3.00GHz
NOTICE:
Broadcom NetXtreme Gigabit Ethernet Driver (32-bit) v8.3.1
PCI-device: pci8086,[EMAIL PROTECTED],5, pci_pci3
pci_pci3 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED],5
pcplusmp: pci14e4,1600 (bcme) instance 0 vector 0x11 ioapic 0x1 intin 0x11 is bound to cpu 0
NOTICE:
bcme0 : Broadcom NetXtreme Gigabit Ethernet BCM95752 (Copper) is detected
NOTICE: bcme0 : Firmware version 5752-v3.10
NOTICE: bcme0 : No Link
pcplusmp: pci14e4,1600 (bcme) instance 0 vector 0x11 ioapic 0x1 intin 0x11 is bound to cpu 0
PCI-device: pci103c,[EMAIL PROTECTED], bcme0
bcme0 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED],5/pci103c,[EMAIL
PROTECTED]
pcplusmp: pciclass,0c0300 (uhci) instance 0 vector 0x14 ioapic 0x1 intin 0x14 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED] (uhci0): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 1 vector 0x12 ioapic 0x1 intin 0x12 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],1 (uhci1): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 2 vector 0x15 ioapic 0x1 intin 0x15 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],2 (uhci2): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],3 (uhci3): failed to attach
UltraDMA mode 5 selected
dump on /dev/dsk/c1d0s1 size 2047 MB
NOTICE: bcme0 : Link is Up (100Mbps, Full Duplex, Rx & Tx Flow Control ON)
pseudo-device: pm0
pm0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: devinfo0
devinfo0 is /pseudo/[EMAIL PROTECTED]
xsvc0 at root
xsvc0 is /xsvc
pcplusmp: asy (asy) instance 0 vector 0x4 ioapic 0x1 intin 0x4 is bound to cpu 0
pcplusmp: asy (asy) instance 0 vector 0x4 ioapic 0x1 intin 0x4 is bound to cpu 0
ISA-device: asy0
asy0 is /isa/[EMAIL PROTECTED],3f8
pseudo-device: pool0
pool0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: vol0
vol0 is /pseudo/[EMAIL PROTECTED]
pcplusmp: ide (ata) instance 0 vector 0xe ioapic 0x1 intin 0xe is bound to cpu 0
pcplusmp: ide (ata) instance 0 vector 0xe ioapic 0x1 intin 0xe is bound to cpu 0
ATAPI device at targ 0, lun 0 lastlun 0x0
model LITE-ON DVD SOHD-16P9S
ATA/ATAPI-6 supported, majver 0x78 minver 0x0
PCI-device: [EMAIL PROTECTED], ata0
ata0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]
ATA DMA off: disabled. Control with "atapi-cd-dma-enabled" property
PIO mode 4 selected
ATA DMA off: disabled. Control with "atapi-cd-dma-enabled" property
PIO mode 4 selected
sd0 at ata0: target 0 lun 0
sd0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL
PROTECTED],0
device pciclass,[EMAIL PROTECTED](display#1) keeps up device [EMAIL
PROTECTED],0(sd#0), but the latter
is not power managed
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],3 (uhci3): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
bound to cpu 0
/[EMAIL PROTECTED],0/pci103c,[EMAIL PROTECTED],3 (uhci3): failed to attach
pcplusmp: fdc (fdc) instance 0 vector 0x6 ioapic 0x1 intin 0x6 is bound to cpu 0
pcplusmp: fdc (fdc) instance 0 vector 0x6 ioapic 0x1 intin 0x6 is bound to cpu 0
ISA-device: fdc0
fd0 at fdc0
fd0 is /isa/[EMAIL PROTECTED],3f0/[EMAIL PROTECTED],0
8042 device: [EMAIL PROTECTED], mouse8042 # 0
mouse80420 is /isa/[EMAIL PROTECTED],60/[EMAIL PROTECTED]
pseudo-device: pm0
pm0 is /pseudo/[EMAIL PROTECTED]
Pad8 attaching at 14:47:16, Jun 7 2001
pad81 at root: space 0 offset ee704
pad81 is /[EMAIL PROTECTED],ee704
Solaris x86 pad driver open.
Solaris x86 pad driver open.
pcplusmp: pci114f,5013 (dsync) instance 0 vector 0x12 ioapic 0x1 intin 0x12 is bound to cpu 0
pcplusmp: pci114f,5013 (dsync) instance 0 vector 0x12 ioapic 0x1 intin 0x12 is bound to cpu 0
WARNING: minor name <dsync1> is not compatible network driver instance <0>
WARNING: minor name <dsync2> is not compatible network driver instance <0>
WARNING: minor name <dsync3> is not compatible network driver instance <0>
NOTICE: bcme0 : No Link
NOTICE: bcme0 : Link is Up (10Mbps, Full Duplex, Rx & Tx Flow Control ON)
NOTICE: bcme0 : No Link
NOTICE: bcme0 : Link is Up (100Mbps, Full Duplex, Rx & Tx Flow Control ON)
NOTICE: bcme0 : No Link
NOTICE: bcme0 : Link is Up (100Mbps, Full Duplex, Rx & Tx Flow Control ON)
panic[cpu0]/thread=d2c84de0:
BAD TRAP: type=e (#pf Page fault) rp=d2c84cec addr=0 occurred in module "<unknown>" due to a NULL pointer dereference
sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0x202, eflags=0x10002
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0 cr3: 4226000
gs: 1b0 fs: 0 es: 160 ds: 160
edi: d2f50a60 esi: fef4b2a8 ebp: d2c84d34 esp: d2c84d1c
ebx: d2f54180 edx: d2f541f8 ecx: 1f eax: fed6c870
trp: e err: 10 eip: 0 cs: 158
efl: 10002 usp: 202 ss: d2c84d3c
d2c84c4c unix:die+a7 (e, d2c84cec, 0, 0)
d2c84cd8 unix:trap+f56 (d2c84cec, 0, 0)
d2c84cec unix:cmntrap+83 ()
d2c84d34 0 (d2c84d44, fe81189a,)
d2c84d3c genunix:kdi_dvec_enter+a (d2c84d50, fe81183c,)
d2c84d44 unix:debug_enter+32 (0)
d2c84d50 unix:abort_sequence_enter+27 (0)
d2c84d64 kbtrans:kbtrans_streams_key+3e (d2f54180, 1f, 0)
d2c84d88 kb8042:kb8042_received_byte+b2 (fef4b1a8, 1e)
d2c84da0 kb8042:kb8042_intr+65 (fef4b1a8)
d2c84db8 i8042:i8042_intr+a4 (d2f50980)
syncing file systems...
2
2
done
dumping to /dev/dsk/c1d0s1, offset 429391872, content: kernel
rivanwang
[EMAIL PROTECTED]
2007-03-28