Ok, I think I MAY have figured out what's going wrong.
Let's start with the dapptrace output, and clear out some of the 'noise'.
I've removed the lines containing these strings:
'queue_'
'util_title'
'_to_str'
And this is what we are left with:
(And I've numbered the lines from 01 to 42)
01 - 1009/22: -> conn_poller(0x4931D0, 0x0, 0x0)
02 - 1009/22: -> conn_state(0x4931D0, 0x1, 0xFFFFFFFF8717FCA0)
03 - 1009/22: <- conn_state = 221
04 - 1009/22: -> iscsi_handle_login_pkt(0x4931D0, 0x0, 0x0)
05 - 1009/22: -> session_alloc(0x4931D0, 0xFFFFFD7FFE1FDF18, 0x43)
06 - 1009/22: <- session_alloc = 327
07 - 1009/22: -> connection_parameters_default(0x4931D0, 0x1, 0x0)
08 - 1009/22: <- connection_parameters_default = 10
09 - 1009/22: -> parse_text(0x4931D0, 0x98, 0xFFFFFD7FFE1FDE80)
10 - 1009/22: -> find_target_node(0x47EDC9, 0x47EDC9, 0x0)
11 - 1009/22: <- find_target_node = 123
12 - 1009/22: -> check_access(0x46A930, 0x465D90, 0x0)
13 - 1009/22: <- check_access = 305
14 - 1009/22: -> add_text(0xFFFFFD7FFE1FDE80, 0xFFFFFD7FFE1FDE88,
0x42BD84)
15 - 1009/22: <- add_text = 184
16 - 1009/22: -> add_text(0xFFFFFD7FFE1FDE80, 0xFFFFFD7FFE1FDE88,
0x42BD90)
17 - 1009/22: <- add_text = 184
18 - 1009/22: -> add_text(0x493520, 0x493528, 0x47EE18)
19 - 1009/22: <- add_text = 184
20 - 1009/22: <- parse_text = 504
21 - 1009/22: -> add_text(0xFFFFFD7FFE1FDE80, 0xFFFFFD7FFE1FDE88, 0x42B2CF)
22 - 1009/22: <- add_text = 184
23 - 1009/22: -> session_validate(0x49D010, 0x42B2CE, 0x1)
24 - 1009/22: <- session_validate = 327
25 - 1009/22: -> send_iscsi_pkt(0x4931D0, 0x4686D0, 0x479970)
26 - 1009/22: <- send_iscsi_pkt = 117
27 - 1009/22: <- iscsi_handle_login_pkt = 168
28 - 1009/22: -> iscsi_handle_login_pkt(0x4931D0, 0x0, 0x0)
29 - 1009/22: -> session_alloc(0x4931D0, 0xFFFFFD7FFE1FDF18, 0x43)
30 - 1009/22: <- session_alloc = 51
31 - 1009/22: -> connection_parameters_default(0x4931D0,
0xFFFFFD7FFE1FDF18, 0x43)
32 - 1009/22: <- connection_parameters_default = 10
33 - 1009/22: -> conn_state(0x4931D0, 0x4, 0x0)
34 - 1009/22: <- conn_state = 221
35 - 1009/22: <- iscsi_handle_login_pkt = 316
36 - 1009/22: <- conn_poller = 227
37 - 1009/21: -> port_conn_remove(0x4931D0, 0x1, 0x0)
38 - 1009/21: <- port_conn_remove = 115
39 - 1009/21: <- conn_process = 422
40 - 1009/21: -> conn_process(0x4931D0, 0x0, 0x0)
41 - 1009/21: -> conn_state(0x4931D0, 0x0, 0x0)
42 - 1009/21: <- conn_state = 221
In line 04 we enter function "iscsi_handle_login_pkt()"
You can see the code for this here:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/iscsi/iscsitgtd/iscsi_login.c?r=3452
I'm using revision 3452 from early 2007, as that looks
to be around the time of the Solaris 10u4 version of
the target, which is what we are dealing with here.
If you follow through the code:
In code line 166, we call "session_alloc()"
In code line 171, we call "connection_parameters_default()"
In code line 239, we call "parse_text()"
In code line 268, we call "add_text()"
In code line 599, we call "session_validate()"
In code line 601, we call "send_iscsi_pkt()"
and then we exit out of "iscsi_handle_login_pkt()"
Now in line 28 we enter "iscsi_handle_login_pkt()" again,
presumably for the second login packet.
In code line 166, we call "session_alloc()"
In code line 171, we call "connection_parameters_default()"
And in code line 188, we call "conn_state()"
This means that in line 185, this statement must have been TRUE:
185 if (ntohs(lh.tsid) != 0) {
Just above that line, and just after are these comments:
181 /*
182 * Is this a new session or an attempt to add a connection to
183 * an existing session.
184 */
187 /* Multiple connections per session not handled right
now */
Variable 'lh' is defined as "iscsi_login_hdr_t lh;".
And if we look in this file:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/iscsi_protocol.h
... we find:
467 /* Login Header */
468 typedef struct _iscsi_login_hdr {
469 uint8_t opcode;
470 uint8_t flags;
471 uint8_t max_version; /* Max. version supported */
472 uint8_t min_version; /* Min. version supported */
473 uint8_t hlength;
474 uint8_t dlength[3];
475 uint8_t isid[6]; /* Initiator Session ID */
476 uint16_t tsid; /* Target Session ID */
477 uint32_t itt; /* Initiator Task Tag */
478 uint16_t cid;
479 uint16_t rsvd3;
480 uint32_t cmdsn;
481 uint32_t expstatsn;
482 uint8_t rsvd5[16];
483 } iscsi_login_hdr_t;
So 'lh.tsid' is referring to "Target Session ID"
Let's take a look in RFC-3720:
http://www3.tools.ietf.org/html/rfc3720
I quote:
" - TSIH (Target Session Identifying Handle): A target assigned tag for
a session with a specific named initiator. The target generates it
during session establishment. Its internal format and content are
not defined by this protocol, except for the value 0 that is
reserved and used by the initiator to indicate a new session. It
is given to the target during additional connection establishment
for the same session."
What the Solaris iscsi target calls TSID, is, I think, the same as what
the RFC calls TSIH.
In the trace that James Nord has provided, the first login packet
has a TSIH of 0, but the second login packet has a TSIH of 0x000c,
which is NOT ZERO, so I think that's why the Solaris iscsi target FIN's
the connection.
The value 0x000c is in fact the TSIH in the response from the target
to the first login packet.
But looking at my traces for various other initiator software, where there
is a two stage login, even in the second login packet the TSIH
is still zero.
I believe the initiator should use TSIH of zero, until the login
has been successfully completed.
Ok, that's my theory. I've rushed through documenting this theory
so others can pull it apart. I may have made some wrong assumptions
in the above argument, so I am open to comments.
If only we could get the debugging to work in the Sol10 target
it might prove or disprove the above theory!
Thanks
Nigel Smith
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss