Re: [storage-discuss] Re: Iscsi target and initiator on svn 55b x86_32

Ken Davis Fri, 30 Mar 2007 14:12:00 -0800

Nigel, Rick is currently on vacation and will return Monday, April 2nd. You can expect a response sometime next week. He will be buried with email, so it may take a while to get to this.


-Ken


On Mar 29, 2007, at 9:15 AM, Nigel Smith wrote:

Rafael, thanks for posting your mdb backtrace, which I think is quite interesting.
The warning you are seeing in '/var/adm/messages' from the
initiator, about 'received an unsupported opcode:0x41' is the
exact same as what occurred on my PC, which caused a core dump.

Also, the stack backtrace from the core file you have
is similar to the problem I was seeing.
(I was using snv_54 at that time.)

Ok, let's compare the two sets of stack backtraces:

In your case Rafael, the mdb trace shows:
(Deepest level shown first)

libc.so.1`_assert+0x6e(8075568, 8075558, 265)
t10_cmd_state_machine+0xcb(80954f8, 3)
trans_aioread+0x4c(80954f8, 809b5f0, 2000, 5000, 0, 80942b0)
raw_read+0x4a7(80954f8, 808fb98, 10)
raw_cmd+0x23(80954f8, 808fb98, 10)
lu_runner+0x79b(8092070)

In the case of my earlier core dump, I saw:

libc.so.1`_assert+0x6e(807675c, 807674c, 259)
t10_cmd_state_machine+0x25e(80d3950, 2)
t10_cmd_shoot_event+0x53(80d3950, 2)
trans_send_complete+0x6c(80d3950, 0)
spc_mselect_data+0x92(80d3950, 80b2ab0, 0, 80b2ab0, 20)
sbc_data+0x2b(80d3950, 80b2ab0, 0, 80b2ab0, 20)
trans_rqst_dataout+0x142(80d3950, 80b2ab0, 20, 0, 80b2ab0, 8069c98)
spc_mselect+0x54(80d3950, 80a1ad8, 10)
sbc_cmd+0x23(80d3950, 80a1ad8, 10)
lu_runner+0x79b(80afdd0)

Based on my (limited) understanding of the code, what is happening
is that the 't10_cmd_state_machine' routine is detecting a state
transition that should never, in theory, happen, and so it issues an
'assert', which aborts the program and creates the core dump.
(And then SMF restarts 'iscsitgtd', and it keeps looping as it
tries again to log in.)

Rick McNeal, the author of the code, in the case of my core dump, traced
the bug to the routine "spc_mselect_data()", which is line 501 in
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/iscsi/iscsitgtd/t10_spc.c
where he had a "break;" which should have been a "return;".
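
To make that class of bug concrete, here is a minimal sketch in C. It is
not the iscsitgtd source: the states, events and function names are all
hypothetical, and it only assumes the general pattern described above
(a handler that completes a command on one path, and a state machine that
asserts on an illegal transition).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states and events; the real iscsitgtd names differ. */
typedef enum { ST_FREE, ST_IN_PROGRESS, ST_DONE } cmd_state_t;
typedef enum { EV_START, EV_COMPLETE } cmd_event_t;

typedef struct { cmd_state_t state; } cmd_t;

/* Apply an event; return true only if the transition is legal. */
bool cmd_try_transition(cmd_t *c, cmd_event_t ev)
{
    switch (c->state) {
    case ST_FREE:
        if (ev == EV_START) { c->state = ST_IN_PROGRESS; return true; }
        break;
    case ST_IN_PROGRESS:
        if (ev == EV_COMPLETE) { c->state = ST_DONE; return true; }
        break;
    case ST_DONE:
        break; /* no event is legal once the command is done */
    }
    return false;
}

/* Asserting wrapper: an illegal transition aborts the process. */
void cmd_state_machine(cmd_t *c, cmd_event_t ev)
{
    bool ok = cmd_try_transition(c, ev);
    assert(ok && "illegal state transition");
    (void)ok; /* silence unused warning when NDEBUG is defined */
}

/*
 * Hypothetical data handler. Returns how many illegal transitions it
 * attempted, so the effect can be observed without aborting.
 */
int handle_data(cmd_t *c, int param, bool buggy)
{
    int illegal = 0;

    switch (param) {
    case 0: /* this path completes the command itself */
        if (!cmd_try_transition(c, EV_COMPLETE))
            illegal++;
        if (buggy)
            break;      /* bug: leaves the switch and runs the code below */
        return illegal; /* fix: the command is finished, stop here */
    default:
        break;
    }

    /* common path: complete the command */
    if (!cmd_try_transition(c, EV_COMPLETE))
        illegal++;
    return illegal;
}
```

With the "break", a command that starts in ST_IN_PROGRESS is completed
twice, and the second completion is an illegal transition. If the handler
called cmd_state_machine() instead of cmd_try_transition(), that second
call would trip the assert and abort, much as in the backtraces above.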

In your case Rafael, maybe there is a problem in "trans_aioread()",
which is line 1249 in
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/iscsi/iscsitgtd/t10_sam.c
or in "raw_read()",
which is line 282 in
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/iscsi/iscsitgtd/t10_raw_if.c
Rafael, the other useful thing you could try is to use the 'truss' command.
See my post here:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-February/000801.html

You can use 'truss' to trace the routines called by a running process - in
this case 'iscsitgtd'. (The parameter -p specifies the process id, and -o
specifies the output file name. And the parameter '-u a.out' means trace
'user-level' functions.)

In the truss output file, when iscsitgtd fails you should see the
"A s s e r t i o n", and the interesting part will be the lines leading up
to that, which should show the routines being called in the iscsitgt code.
This should further help to trace what is going wrong.

Of course, what we really need is for Rick to jump in here and give his view.
I guess maybe he is busy or on holiday at the moment.

And of course, all the above is really just to help Rick find the bug and
squash it. We would then need to ask him to release a new set of packages,
like he did here:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-January/000748.html

Ok, I look forward to seeing the truss output.
Thanks
Nigel Smith
http://nwsmith.blogspot.com/

Oh, by the way, I have found some good links concerning 'mdb',
which I have posted here:
http://del.icio.us/nwsmith/solaris-mdb


This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss


Ken Davis - Manager
Sun Microsystems
New Solaris Storage Group
work: 303.395.4168
cell: 720.837.5818



