Hi,

I have one workstation (HP xw4300) running Solaris 10 (x86) with one Digi
Sync570i card.
The system may hang at any time, from a few minutes to a couple of hours in,
while the card is receiving data frames.

I suspect the hang is caused by the driver module for the Sync570; however,
the same driver works properly on a Solaris 8 system.
We used to install Solaris 8 on the HP xw4100, but now we have to install
Solaris 10 on the HP xw4300 (the HP xw4100 is no longer available on the
market).

I boot the Solaris system with kmdb loaded. After the system hangs I can't
ping the host, and the keyboard and mouse do not respond.
I can get a crash dump by pressing "F1+A" and then entering "$<systemdump".

Analysing the crash dump, I can't find problems such as a 'mutex deadlock'
or a 'bad trap'.
I really don't know what to do next!

# crashdump files can be downloaded from the following URLs :
#     www.ras.com.cn/rivanwang/crash_4.tar.gz
#     www.ras.com.cn/rivanwang/crash_8.tar.gz
#     www.ras.com.cn/rivanwang/crash_7_nor.tar.gz
# "crash_7_nor.tar.gz" was generated before the system hang happened.

I have some questions as follows.
Would you be so kind as to give me some suggestions?



[[ Q1 ]]

I can't find the kernel thread corresponding to the Sync570 module using the
command "::threadlist -v".
But I can get the LOADADDR:
 ::modinfo !grep Sync
161 feba4340     cc60   1 dsync (Sync/570 Device Driver)

How can I find the address of the kernel thread corresponding to the Sync570
module?




[[ Q2 ]]
::msgbuf
panic[cpu0]/thread=d2c84de0:
BAD TRAP: type=e (#pf Page fault) rp=d2c84cec addr=0 occurred in module 
"<unknown>" due to a NULL pointer dereference

sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0x202, eflags=0x10002
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0 cr3: 4226000
         gs:      1b0  fs:        0  es:      160  ds:      160
        edi: d2f50a60 esi: fef4b2a8 ebp: d2c84d34 esp: d2c84d1c
        ebx: d2f54180 edx: d2f541f8 ecx:       1f eax: fed6c870
        trp:        e err:       10 eip:        0  cs:      158
        efl:    10002 usp:      202  ss: d2c84d3c
d2c84c4c unix:die+a7 (e, d2c84cec, 0, 0)
d2c84cd8 unix:trap+f56 (d2c84cec, 0, 0)
d2c84cec unix:cmntrap+83 ()
d2c84d34 0 (d2c84d44, fe81189a,)
d2c84d3c genunix:kdi_dvec_enter+a (d2c84d50, fe81183c,)
d2c84d44 unix:debug_enter+32 (0)
d2c84d50 unix:abort_sequence_enter+27 (0)
d2c84d64 kbtrans:kbtrans_streams_key+3e (d2f54180, 1f, 0)
d2c84d88 kb8042:kb8042_received_byte+b2 (fef4b1a8, 1e)
d2c84da0 kb8042:kb8042_intr+65 (fef4b1a8)
d2c84db8 i8042:i8042_intr+a4 (d2f50980)

----------------------------------------------------------------------------
 ::cpuinfo -v
ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  0 fec20ae4  1b    8    0 104   no    no t-740847 d2c84de0 sched
               |    |    |
    RUNNING <--+    |    +--> PIL THREAD
      READY         |           5 d2c84de0
     EXISTS         |           3 d2ca0de0
     ENABLE         |           - d2c28de0 (idle)
                    |
                    +-->  PRI THREAD   PROC
                           99 d2c9ade0 sched
                           99 d2c97de0 sched
                           60 d3264a00 fsflush
                           60 d2e1ade0 sched
                           60 d2e37de0 sched
                           60 d4644de0 sched
                           60 d96dcde0 sched
                           59 d38e7400 Xsun
  d2c84de0::thread
    ADDR    STATE  FLG PFLG SFLG   PRI  EPRI PIL     INTR DISPTIME BOUND PR
d2c84de0 onproc    809    0    3   104     0   5 d2ca0de0        0    -1  2
  d2ca0de0::thread
    ADDR    STATE  FLG PFLG SFLG   PRI  EPRI PIL     INTR DISPTIME BOUND PR
d2ca0de0 onproc      9    0    3   102     0   3 d2c28de0    46a51    -1  1
  d2ca0de0::findstack -v
stack pointer for thread d2ca0de0: d2ca0c2c
  d2ca0de0 0xd94c62bc()

----------------------------------------------------------------------------

  After I pressed "F1+A", the kernel created thread "d2c84de0" to handle the
keyboard interrupt (PIL = 5, PRI = 104),
but another thread, "d2ca0de0", was still running on the CPU at the same time
(PIL = 3, PRI = 102).
  I guess some event caused the kernel to create thread d2ca0de0 and then the
kernel hung; when I pressed "F1+A", the kernel created another thread,
d2c84de0, and finally crashed.

What could have caused the kernel to create thread d2ca0de0
(PRI=102, PIL=3)?
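
A small sanity check on the two threads, sketched in Python with the PRI and
PIL values copied from the ::thread output above. It assumes (this is my
assumption, not something the dump states) that an interrupt thread runs at a
fixed base priority plus its PIL. If that holds, both threads should give the
same base, which would suggest d2ca0de0 is also an interrupt thread, at PIL 3.

```python
# (thread address, PRI, PIL) copied from the ::thread output above.
threads = [
    ("d2c84de0", 104, 5),  # thread handling the keyboard interrupt
    ("d2ca0de0", 102, 3),  # thread that was already on the CPU
]

# Assumption: interrupt-thread priority = base priority + PIL,
# so PRI - PIL should be the same base for every interrupt thread.
bases = {addr: pri - pil for addr, pri, pil in threads}
same_base = len(set(bases.values())) == 1

print(bases)
print("same base for both threads:", same_base)
```

Both differences come out as 99, so under that assumption the question becomes
which device interrupts at PIL 3 on this machine.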






[[ Q3 ]]
  ::cpuinfo
 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  0 fec20ae4  1b    8    0 104   no    no t-740847 d2c84de0 sched
  ::cycinfo -v
CPU  CYC_CPU   STATE NELEMS     ROOT            FIRE HANDLER
  0 d9aabe00  online      4 d9aabd80     96b6b848e80 clock

                                       2
                                       |
                    +------------------+------------------+
                    0                                     1
                    |                                     |
          +---------+--------+                  +---------+---------+
          3
          |
     +----+----+

      ADDR NDX HEAP LEVL  PEND            FIRE USECINT HANDLER
  d9aabd80   0    1 high     0     96b6b848e80   10000 cbe_hres_tick
  d9aabda0   1    2  low 741253    96b6b848e80   10000 apic_redistribute_compute
  d9aabdc0   2    0 lock   406     96b6b848e80   10000 clock
  d9aabde0   3    3 high     0     96b6d4e5200 1000000 deadman

-----------------------------------------------------------------------------------
The SWITCH value of thread d2c84de0 is 740847;
the PEND value of apic_redistribute_compute is 741253;
the PEND value of clock is 406.
              (741253 - 406) == 740847
What does this mean? Could you please explain it?
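
The arithmetic above can be checked with a few lines of Python. The numbers
are copied from the ::cpuinfo and ::cycinfo -v output. The conversion to
seconds rests on two assumptions of mine: that USECINT is the cyclic interval
in microseconds, and that each PEND count stands for one firing that has not
been handled yet. This is only a sketch of the calculation, not an
explanation of why the values line up.

```python
# Values copied from the ::cpuinfo and ::cycinfo -v output above.
switch_ticks = 740847  # SWITCH column for thread d2c84de0 (shown as t-740847)
pend_apic = 741253     # PEND of apic_redistribute_compute (LEVL low)
pend_clock = 406       # PEND of clock (LEVL lock)
usecint = 10_000       # USECINT of both cyclics, assumed to be microseconds

# The relation noticed in the post.
difference = pend_apic - pend_clock
matches_switch = difference == switch_ticks

# Assumption: one pending count = one unhandled firing of the cyclic.
apic_backlog_seconds = pend_apic * usecint / 1_000_000
clock_backlog_seconds = pend_clock * usecint / 1_000_000

print(difference, matches_switch)
print(apic_backlog_seconds, clock_backlog_seconds)
```

Under those assumptions the low-level cyclic has a backlog of roughly two
hours and the lock-level clock cyclic one of a few seconds.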





[[ Q4 ]]
  
  ::ipcs
Message queues:
failed to read 'msq_svc'; module not present

Shared memory:
    ADDR   REF    ID      KEY  MODE PRJID ZONEID OWNER GROUP CREAT  CGRP
d4915f50     1     3      103  0666     3      0  1002   102  1002   102
d3f0b090     1     2      101  0666     3      0     0     0     0     0
d3f0b2c0     1     1      102  0666     3      0  1002   102  1002   102
d3f0bbf0     1     0      100  0666     3      0  1002   102  1002   102

Semaphores:
    ADDR   REF    ID      KEY  MODE PRJID ZONEID OWNER GROUP CREAT  CGRP
d4915ee0     3     3      103  0666     3      0  1002   102  1002   102
d3f0b1e0     3     2      101  0666     3      0     0     0     0     0
d3f0b250     4     1      102  0666     3      0  1002   102  1002   102
d3f0bb80     7     0      100  0666     3      0  1002   102  1002   102
> 
-------------------------------------------
I don't know which threads are accessing the semaphore "d3f0b1e0".
How can I find these unknown threads?






  ::showrev
Hostname: cetc.a28.com
Release: 5.10
Kernel architecture: i86pc
Application architecture: i386
Kernel version: SunOS 5.10 i86pc Generic
Platform: i86pc
 



  ::msgbuf
MESSAGE                                                               
/pci at 0,0/pci103c,3013 at 1d,2 (uhci2): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d,3 (uhci3): failed to attach
cpu0: x86 (GenuineIntel family 15 model 4 step 10 clock 3000 MHz)
cpu0: Intel(r) Pentium(r) 4 CPU 3.00GHz
NOTICE: 
Broadcom NetXtreme Gigabit Ethernet Driver (32-bit) v8.3.1
PCI-device: pci8086,27e2 at 1c,5, pci_pci3
pci_pci3 is /pci at 0,0/pci8086,27e2 at 1c,5
pcplusmp: pci14e4,1600 (bcme) instance 0 vector 0x11 ioapic 0x1 intin 0x11 is bo
und to cpu 0
NOTICE: 
bcme0 : Broadcom NetXtreme Gigabit Ethernet BCM95752 (Copper) is detected
NOTICE: bcme0 : Firmware version 5752-v3.10
NOTICE: bcme0 : No Link
pcplusmp: pci14e4,1600 (bcme) instance 0 vector 0x11 ioapic 0x1 intin 0x11 is bo
und to cpu 0
PCI-device: pci103c,3013 at 0, bcme0
bcme0 is /pci at 0,0/pci8086,27e2 at 1c,5/pci103c,3013 at 0
pcplusmp: pciclass,0c0300 (uhci) instance 0 vector 0x14 ioapic 0x1 intin 0x14 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d (uhci0): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 1 vector 0x12 ioapic 0x1 intin 0x12 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d,1 (uhci1): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 2 vector 0x15 ioapic 0x1 intin 0x15 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d,2 (uhci2): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d,3 (uhci3): failed to attach
        UltraDMA mode 5 selected
dump on /dev/dsk/c1d0s1 size 2047 MB
NOTICE: bcme0 : Link is Up (100Mbps, Full Duplex, Rx & Tx Flow Control ON)
pseudo-device: pm0
pm0 is /pseudo/pm at 0
pseudo-device: devinfo0
devinfo0 is /pseudo/devinfo at 0
xsvc0 at root
xsvc0 is /xsvc
pcplusmp: asy (asy) instance 0 vector 0x4 ioapic 0x1 intin 0x4 is bound to cpu 0
pcplusmp: asy (asy) instance 0 vector 0x4 ioapic 0x1 intin 0x4 is bound to cpu 0
ISA-device: asy0
asy0 is /isa/asy at 1,3f8
pseudo-device: pool0
pool0 is /pseudo/pool at 0
pseudo-device: vol0
vol0 is /pseudo/vol at 0
pcplusmp: ide (ata) instance 0 vector 0xe ioapic 0x1 intin 0xe is bound to cpu 0
pcplusmp: ide (ata) instance 0 vector 0xe ioapic 0x1 intin 0xe is bound to cpu 0
        ATAPI device at targ 0, lun 0 lastlun 0x0
        model LITE-ON DVD SOHD-16P9S
        ATA/ATAPI-6 supported, majver 0x78 minver 0x0
PCI-device: ide at 0, ata0
ata0 is /pci at 0,0/pci-ide at 1f,1/ide at 0
        ATA DMA off: disabled.  Control with "atapi-cd-dma-enabled" property
        PIO mode 4 selected
        ATA DMA off: disabled.  Control with "atapi-cd-dma-enabled" property
        PIO mode 4 selected
sd0 at ata0: target 0 lun 0
sd0 is /pci at 0,0/pci-ide at 1f,1/ide at 0/sd at 0,0
device pciclass,030000 at 0(display#1) keeps up device sd at 0,0(sd#0), but the 
latter
 is not power managed
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d,3 (uhci3): failed to attach
pcplusmp: pciclass,0c0300 (uhci) instance 3 vector 0x16 ioapic 0x1 intin 0x16 is
 bound to cpu 0
/pci at 0,0/pci103c,3013 at 1d,3 (uhci3): failed to attach
pcplusmp: fdc (fdc) instance 0 vector 0x6 ioapic 0x1 intin 0x6 is bound to cpu 0
pcplusmp: fdc (fdc) instance 0 vector 0x6 ioapic 0x1 intin 0x6 is bound to cpu 0
ISA-device: fdc0
fd0 at fdc0
fd0 is /isa/fdc at 1,3f0/fd at 0,0
8042 device:  mouse at 1, mouse8042 # 0
mouse80420 is /isa/i8042 at 1,60/mouse at 1
pseudo-device: pm0
pm0 is /pseudo/pm at 0
Pad8 attaching at 14:47:16, Jun  7 2001
pad81 at root: space 0 offset ee704
pad81 is /pad8 at 0,ee704
Solaris x86 pad driver open.
Solaris x86 pad driver open.
pcplusmp: pci114f,5013 (dsync) instance 0 vector 0x12 ioapic 0x1 intin 0x12 is b
ound to cpu 0
pcplusmp: pci114f,5013 (dsync) instance 0 vector 0x12 ioapic 0x1 intin 0x12 is b
ound to cpu 0
WARNING: minor name <dsync1> is not compatible network driver instance <0>
WARNING: minor name <dsync2> is not compatible network driver instance <0>
WARNING: minor name <dsync3> is not compatible network driver instance <0>
NOTICE: bcme0 : No Link
NOTICE: bcme0 : Link is Up (10Mbps, Full Duplex, Rx & Tx Flow Control ON)
NOTICE: bcme0 : No Link
NOTICE: bcme0 : Link is Up (100Mbps, Full Duplex, Rx & Tx Flow Control ON)
NOTICE: bcme0 : No Link
NOTICE: bcme0 : Link is Up (100Mbps, Full Duplex, Rx & Tx Flow Control ON)


panic[cpu0]/thread=d2c84de0: 
BAD TRAP: type=e (#pf Page fault) rp=d2c84cec addr=0 occurred in module "<unknow
n>" due to a NULL pointer dereference


sched: 
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0x202, eflags=0x10002
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0 cr3: 4226000
         gs:      1b0  fs:        0  es:      160  ds:      160
        edi: d2f50a60 esi: fef4b2a8 ebp: d2c84d34 esp: d2c84d1c
        ebx: d2f54180 edx: d2f541f8 ecx:       1f eax: fed6c870
        trp:        e err:       10 eip:        0  cs:      158
        efl:    10002 usp:      202  ss: d2c84d3c

d2c84c4c unix:die+a7 (e, d2c84cec, 0, 0)
d2c84cd8 unix:trap+f56 (d2c84cec, 0, 0)
d2c84cec unix:cmntrap+83 ()
d2c84d34 0 (d2c84d44, fe81189a,)
d2c84d3c genunix:kdi_dvec_enter+a (d2c84d50, fe81183c,)
d2c84d44 unix:debug_enter+32 (0)
d2c84d50 unix:abort_sequence_enter+27 (0)
d2c84d64 kbtrans:kbtrans_streams_key+3e (d2f54180, 1f, 0)
d2c84d88 kb8042:kb8042_received_byte+b2 (fef4b1a8, 1e)
d2c84da0 kb8042:kb8042_intr+65 (fef4b1a8)
d2c84db8 i8042:i8042_intr+a4 (d2f50980)

syncing file systems...
 2
 2
 done
dumping to /dev/dsk/c1d0s1, offset 429391872, content: kernel
  
 

rivanwang
rivan at vip.sina.com
2007-03-28
 
 
This message posted from opensolaris.org
