Hi Liujun,

Ok. Here is the way I would "attack" this problem. First, load kmdb:

    # mdb -K

Since this is sparc (based on addresses), you should be able to do this from your console. You may need to add "-F" to the mdb command line. This should work, as you have a second cpu that is not also hung. This will drop you into kmdb. There, put a breakpoint at ssfcp_handle_devices+44c (the location where ssfcp_outstanding_lun_cmds should return):

    ssfcp_handle_devices+44c:b

Then, continue:

    :c

If you hit the breakpoint, ssfcp_outstanding_lun_cmds is returning, so try the same thing with the next function up the stack. If you don't hit the breakpoint, the code is looping in ssfcp_outstanding_lun_cmds or in something it calls. In that case, put a breakpoint at ssfcp_outstanding_lun_cmds+18 and again continue. If you hit the breakpoint, start single stepping and, at the same time, look at the C code to see if there is a loop. Basically, if you are not switching, it implies that the stack trace you have is not returning out to user.

To single step, you can use :s or :e, or, better, just type '[' or ']'. (Then you don't have to type a carriage return.) '[' skips over functions, ']' steps into function calls. I suggest '[' to start with, or you will be there all day. (You may be there all day anyway, but you can save yourself a little time.)

If you need more detail, let me know. Also, this may be a known problem, so maybe the first thing is to look in the bug listings to see if it's already known.

I hope this helps.
max

liujun wrote:
> max,
>
> The stack for that thread is :
>
> > 2a100371cc0::findstack -v
> stack pointer for thread 2a100371cc0: 2a100370d61
> [ 000002a100370d61 ktl0+0x48() ]
>   000002a100370eb1 ssfcp_outstanding_lun_cmds+0x18(60001918fd8, 0, 60003081a40, 60003590228, 1, 100000)
>   000002a100370f61 ssfcp_handle_devices+0x444(12b9ff0, 60001918fd8, 2, 60005c977b8, 1, 60001918ff0)
>   000002a100371071 ssfcp_statec_callback+0x614(60005c977b8, 1606d, 2, 60000205000, 600019433a8, 404)
>   000002a100371141 fctl_ulp_statec_cb+0x250(1, 2, 6000015d1d8, 6000577d188, 60000106000, ff000000)
>   000002a100371201 taskq_thread+0x1a4(60000010f90, 60000010f38, 50000, 5208a1333c14, 2a100371aca, 2a100371ac8)
>   000002a1003712d1 thread_start+4(60000010f38, 0, 0, 0, 0, 0)
>
> > ssfcp_outstanding_lun_cmds+0x18::dis
> ssfcp_handle_ipkt_errors+0x2c4:   mov    %i0, %o0
> ssfcp_handle_ipkt_errors+0x2c8:   sra    %l0, 0, %i0
> ssfcp_handle_ipkt_errors+0x2cc:   ret
> ssfcp_handle_ipkt_errors+0x2d0:   restore
> ssfcp_outstanding_lun_cmds:       save   %sp, -0xb0, %sp
> ssfcp_outstanding_lun_cmds+4:     ldx    [%i0 + 0x20], %i2
> ssfcp_outstanding_lun_cmds+8:     cmp    %i2, 0
> ssfcp_outstanding_lun_cmds+0xc:   be,pn  %xcc, +0x7c <ssfcp_outstanding_lun_cmds+0x88>
> ssfcp_outstanding_lun_cmds+0x10:  clr    %i1
> ssfcp_outstanding_lun_cmds+0x14:  mov    %i2, %o0
> ssfcp_outstanding_lun_cmds+0x18:  call   -0x2891ac <mutex_enter>
> ssfcp_outstanding_lun_cmds+0x1c:  nop
> ssfcp_outstanding_lun_cmds+0x20:  ldx    [%i2 + 0x10], %i3
> ssfcp_outstanding_lun_cmds+0x24:  cmp    %i3, %i1
> ssfcp_outstanding_lun_cmds+0x28:  be,pn  %xcc, +0x48 <ssfcp_outstanding_lun_cmds+0x70>
> ssfcp_outstanding_lun_cmds+0x2c:  nop
> ssfcp_outstanding_lun_cmds+0x30:  ld     [%i3 + 0x54], %i4
> ssfcp_outstanding_lun_cmds+0x34:  cmp    %i4, 2
> ssfcp_outstanding_lun_cmds+0x38:  be,pn  %icc, +0x20 <ssfcp_outstanding_lun_cmds+0x58>
> ssfcp_outstanding_lun_cmds+0x3c:  nop
> ssfcp_outstanding_lun_cmds+0x40:  ldx    [%
>
> This message posted from opensolaris.org
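
A note on the breakpoint address used in the reply: the stack trace shows the call into ssfcp_outstanding_lun_cmds at ssfcp_handle_devices+0x444, while the breakpoint goes at +44c. On SPARC an ordinary callee returns to the call address plus 8, past the call instruction and its delay slot, and mdb reads offsets as hex by default. Here is a minimal sketch of that arithmetic in Python. The function names are made up for illustration and are not part of mdb or kmdb.

```python
# SPARC: `call` records its own address in %o7 (seen as %i7 by the callee
# after `save`), and `ret` is `jmpl %i7 + 8`, so execution resumes after
# the call instruction and the instruction in its delay slot.
SPARC_INSN_BYTES = 4


def return_offset(call_offset: int) -> int:
    """Offset in the caller where execution resumes after the callee returns."""
    return call_offset + 2 * SPARC_INSN_BYTES


def breakpoint_cmd(symbol: str, call_offset: int) -> str:
    """Build a kmdb breakpoint command for the return location (hypothetical helper)."""
    return f"{symbol}+{return_offset(call_offset):x}:b"


if __name__ == "__main__":
    # Call site taken from the ::findstack output in the quoted message.
    print(breakpoint_cmd("ssfcp_handle_devices", 0x444))
```

The same +8 rule should apply to the other frames in the trace if the breakpoint has to be moved further up the stack.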