Rick, I've now discovered why my trace was shorter than what you were reporting.
The iscsitgtd process was falling over and core dumping due to the same problem
as mentioned by Peter Tribble in thread "S10U3 iscsi initiator vs snv_54
target".
# tail -f /var/svc/log/system-iscsitgt\:default.log
[ Jan 9 21:58:04 Disabled. ]
[ Jan 9 21:58:04 Rereading configuration. ]
[ Jan 10 00:52:55 Enabled. ]
[ Jan 10 00:52:56 Executing start method ("/lib/svc/method/svc-iscsitgt start") ]
[ Jan 10 00:52:57 Method "start" exited with status 0 ]
[ Jan 10 01:06:12 Stopping because process dumped core. ]
[ Jan 10 01:06:13 Executing stop method ("/lib/svc/method/svc-iscsitgt stop 89") ]
[ Jan 10 01:06:13 Method "stop" exited with status 0 ]
(Although it says it's core dumping, I cannot find the core file.
Can anyone tell me if I need to enable something so the core file is created?)
Anyway, entering these commands to set a dummy base directory fixes the
problem:
# zfs set shareiscsi=off tank/target
# iscsitadm modify admin -d /export/home
# zfs set shareiscsi=on tank/target
Ok, so now the trace goes further, but still the Microsoft initiator does not
work with the Sun target!
I now see the two 'Login' commands, followed by 'Report LUNs', and then two
'Inquiry LUN'. It then fails, and it is now the Microsoft initiator that sends
the [RST, ACK] to terminate the TCP connection.
My trace file can be downloaded here:
http://www.nwsmith.net/solaris/sol-iscsi-tgt-fails3.cap
I also have a trace file of the Microsoft initiator working successfully
with the NetBSD iSCSI target, running on Solaris, here:
http://www.nwsmith.net/solaris/sol-iscsi-tgt-netbsd.cap
Comparing the two traces, one obvious difference is that the Sun target
splits its response over two packets, using TCP continuation packets. The
NetBSD target puts the response into one packet. It should not make any
difference to whether it works or not, but I think it is more efficient to
use one packet rather than two.
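
As a rough check on the split, the frame sizes from the trace summaries can be broken down into header and payload. This is only a sketch: it assumes plain Ethernet/IPv4/TCP framing with no options (14 + 20 + 20 = 54 bytes, which agrees with the "Len=" values Ethereal prints for the continuation packets) and the 48-byte iSCSI basic header segment. The names used here are just for illustration.

```python
# Assumption: Ethernet (14) + IPv4 (20) + TCP (20) header bytes, no options.
FRAME_OVERHEAD = 54
# Size of the iSCSI Basic Header Segment (BHS) in bytes.
ISCSI_BHS = 48


def tcp_payload(frame_size):
    """Return the TCP payload bytes carried by a frame of the given size."""
    return frame_size - FRAME_OVERHEAD


# Frame sizes copied from the trace summaries:
# (Sun first packet, Sun continuation packet, NetBSD single packet)
responses = {
    "Report LUNs": (102, 70, 118),
    "Inquiry": (102, 90, 138),
}

for name, (sun_first, sun_cont, netbsd) in responses.items():
    sun_total = tcp_payload(sun_first) + tcp_payload(sun_cont)
    print(name,
          "Sun:", tcp_payload(sun_first), "+", tcp_payload(sun_cont), "=", sun_total,
          "NetBSD:", tcp_payload(netbsd))
```

Under those assumptions the first Sun packet carries exactly the 48-byte header and the continuation carries the data, while the NetBSD target sends the same number of bytes in one packet.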
Otherwise the two traces match up very well. The only other difference I see
is just before the failure. The initiator sends the second 'Inquiry LUN' command
(0x12) with EVPD=1. The Sun target responds with just a 'Data In', with the
payload in a second continuation packet. The NetBSD target responds
with a 'Data In', with the payload in the same packet, but then immediately also
sends a 'Response LUN: 0x00 (Good)' packet. I'm not seeing the Sun target
send this 'Response LUN' packet, and it's at this stage that the initiator
drops the connection.
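
To make that comparison less dependent on eyeballing, the "Info" column can be tallied per trace. The sketch below is only an illustration: the rows are typed in by hand from the summaries in this post (first three commands of each trace), and the function name is made up. Note that Ethereal labels each Sun 'Data In' row '(Good)', which may mean the status is carried in the 'Data In' itself; the sketch only counts rows and does not decide that.

```python
def summarize(rows):
    """Count SCSI commands, Data In packets and Response packets.

    rows is a list of (source, info) tuples copied from a trace summary.
    """
    counts = {"command": 0, "data_in": 0, "response": 0}
    for _source, info in rows:
        if info.startswith("SCSI Data In"):
            counts["data_in"] += 1
        elif info.startswith("SCSI: Response"):
            counts["response"] += 1
        elif info.startswith("SCSI:"):
            # Any other "SCSI:" row is a command from the initiator.
            counts["command"] += 1
    return counts


# Sun target, packets 17 to 26 (TCP-only rows left out).
sun_rows = [
    ("172.16.8.3", "SCSI: Report LUNs LUN: 0x00"),
    ("172.16.8.150", "SCSI Data In (Good)"),
    ("172.16.8.3", "SCSI: Inquiry LUN: 0x00"),
    ("172.16.8.150", "SCSI Data In (Good)"),
    ("172.16.8.3", "SCSI: Inquiry LUN: 0x00"),
    ("172.16.8.150", "SCSI Data In (Good)"),
]

# NetBSD target, packets 28 to 38 (TCP-only rows left out).
netbsd_rows = [
    ("172.16.8.3", "SCSI: Report LUNs LUN: 0x00"),
    ("172.16.8.200", "SCSI Data In"),
    ("172.16.8.200", "SCSI: Response LUN: 0x00 (Good)"),
    ("172.16.8.3", "SCSI: Inquiry LUN: 0x00"),
    ("172.16.8.200", "SCSI Data In"),
    ("172.16.8.200", "SCSI: Response LUN: 0x00 (Good)"),
    ("172.16.8.3", "SCSI: Inquiry LUN: 0x00"),
    ("172.16.8.200", "SCSI Data In"),
    ("172.16.8.200", "SCSI: Response LUN: 0x00 (Good)"),
]

print("Sun:   ", summarize(sun_rows))
print("NetBSD:", summarize(netbsd_rows))
```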
I'm using Ethereal to view these traces. Because the Sun target uses
continuation packets, and Ethereal does not have a state machine to follow
them, it is a little difficult to see what's in the continuation packets.
Can anyone recommend anything better than Ethereal that can handle this?
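
For what it's worth, the kind of state machine I mean could be quite small. The following is a minimal sketch, not a tested tool: it assumes the TCP payloads have already been extracted in order for one direction of the connection, and that no header or data digests were negotiated. It relies on the 48-byte iSCSI basic header, where byte 4 is the additional header length in 4-byte words and bytes 5 to 7 are the data segment length in bytes. The class name is made up.

```python
BHS_LEN = 48  # iSCSI Basic Header Segment size in bytes


def pad4(n):
    """Round n up to a multiple of 4 (data segments are padded to 4 bytes)."""
    return (n + 3) & ~3


class PduReassembler:
    """Collect TCP payload bytes and hand back complete iSCSI PDUs.

    Assumption: no header or data digests are in use on the connection.
    """

    def __init__(self):
        self._buf = bytearray()

    def feed(self, segment):
        """Add one TCP payload; return a list of PDUs completed by it."""
        self._buf += segment
        pdus = []
        while len(self._buf) >= BHS_LEN:
            ahs_len = self._buf[4] * 4                        # TotalAHSLength, 4-byte words
            data_len = int.from_bytes(self._buf[5:8], "big")  # DataSegmentLength, bytes
            total = BHS_LEN + ahs_len + pad4(data_len)
            if len(self._buf) < total:
                break  # wait for the continuation packet
            pdus.append(bytes(self._buf[:total]))
            del self._buf[:total]
        return pdus


# Made-up example shaped like the Sun Report LUNs reply:
# a 48-byte header in one packet, then 16 bytes of data in the next.
header = bytearray(BHS_LEN)
header[0] = 0x25                      # SCSI Data-In opcode
header[5:8] = (16).to_bytes(3, "big")

reassembler = PduReassembler()
first = reassembler.feed(bytes(header))   # header only, PDU not complete yet
second = reassembler.feed(bytes(16))      # continuation completes the PDU
print(len(first), len(second), len(second[0]))
```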
Here are summaries of the traces:
First the Sun target failing:
No.  Time      Source        Destination   Protocol  Size  Info
4    0.000014  172.16.8.3    172.16.8.150  TCP       62    [SYN]
5    0.016638  172.16.8.150  172.16.8.3    TCP       64    [SYN, ACK]
6    0.000139  172.16.8.3    172.16.8.150  TCP       54    [ACK]
7    0.001312  172.16.8.3    172.16.8.150  iSCSI     254   Login Command
8    0.001567  172.16.8.150  172.16.8.3    TCP       64    [ACK]
9    0.002775  172.16.8.150  172.16.8.3    iSCSI     102   Login Response (Success)
10   0.174705  172.16.8.3    172.16.8.150  TCP       54    [ACK]
11   0.000332  172.16.8.150  172.16.8.3    TCP       118   [Continuation to #9] [PSH, ACK] Len=64
12   0.000634  172.16.8.3    172.16.8.150  iSCSI     402   Login Command
13   0.001476  172.16.8.150  172.16.8.3    TCP       64    [ACK]
14   0.000666  172.16.8.150  172.16.8.3    iSCSI     102   Login Response (Success)
15   0.197549  172.16.8.3    172.16.8.150  TCP       54    [ACK]
16   0.000389  172.16.8.150  172.16.8.3    TCP       342   [Continuation to #14] [PSH, ACK] Len=288
17   0.001415  172.16.8.3    172.16.8.150  iSCSI     102   SCSI: Report LUNs LUN: 0x00
18   0.029322  172.16.8.150  172.16.8.3    iSCSI     102   SCSI Data In (Good)
19   0.169899  172.16.8.3    172.16.8.150  TCP       54    [ACK]
20   0.000490  172.16.8.150  172.16.8.3    TCP       70    [Continuation to #18] [PSH, ACK] Len=16
21   0.000359  172.16.8.3    172.16.8.150  iSCSI     102   SCSI: Inquiry LUN: 0x00
22   0.033877  172.16.8.150  172.16.8.3    iSCSI     102   SCSI Data In (Good)
23   0.165697  172.16.8.3    172.16.8.150  TCP       54    [ACK]
24   0.000337  172.16.8.150  172.16.8.3    TCP       90    [Continuation to #22] [PSH, ACK] Len=36
25   0.000140  172.16.8.3    172.16.8.150  iSCSI     102   SCSI: Inquiry LUN: 0x00
26   0.000681  172.16.8.150  172.16.8.3    iSCSI     102   SCSI Data In (Good)
27   0.199448  172.16.8.3    172.16.8.150  TCP       54    [ACK]
28   0.000357  172.16.8.150  172.16.8.3    TCP       198   [Continuation to #26] [PSH, ACK] Len=144
29   0.000157  172.16.8.3    172.16.8.150  TCP       54    [RST, ACK]
And now the NetBSD target working:
No.  Time       Source        Destination   Protocol  Size  Info
19   57.226237  172.16.8.3    172.16.8.200  TCP       62    [SYN]
20   0.000157   172.16.8.200  172.16.8.3    TCP       62    [SYN, ACK]
21   0.000059   172.16.8.3    172.16.8.200  TCP       54    [ACK]
22   0.000945   172.16.8.3    172.16.8.200  iSCSI     238   Login Command
23   0.000243   172.16.8.200  172.16.8.3    TCP       60    [ACK]
24   0.000099   172.16.8.200  172.16.8.3    iSCSI     142   Login Response (Success)
25   0.000563   172.16.8.3    172.16.8.200  iSCSI     402   Login Command
26   0.000158   172.16.8.200  172.16.8.3    TCP       60    [ACK]
27   0.000377   172.16.8.200  172.16.8.3    iSCSI     386   Login Response (Success)
28   0.001539   172.16.8.3    172.16.8.200  iSCSI     102   SCSI: Report LUNs LUN: 0x00
29   0.000721   172.16.8.200  172.16.8.3    iSCSI     118   SCSI Data In
30   0.000062   172.16.8.200  172.16.8.3    iSCSI     102   SCSI: Response LUN: 0x00 (Good)
31   0.000067   172.16.8.3    172.16.8.200  TCP       54    [ACK]
32   0.000171   172.16.8.3    172.16.8.200  iSCSI     102   SCSI: Inquiry LUN: 0x00
33   0.000704   172.16.8.200  172.16.8.3    iSCSI     138   SCSI Data In
34   0.000044   172.16.8.200  172.16.8.3    iSCSI     102   SCSI: Response LUN: 0x00 (Good)
35   0.000056   172.16.8.3    172.16.8.200  TCP       54    [ACK]
36   0.000120   172.16.8.3    172.16.8.200  iSCSI     102   SCSI: Inquiry LUN: 0x00
37   0.001789   172.16.8.200  172.16.8.3    iSCSI     358   SCSI Data In
38   0.000065   172.16.8.200  172.16.8.3    iSCSI     102   SCSI: Response LUN: 0x00 (Good)
39   0.000068   172.16.8.3    172.16.8.200  TCP       54    [ACK]
40   0.000138   172.16.8.3    172.16.8.200  iSCSI     102   SCSI: Inquiry LUN: 0x00
41   0.000720   172.16.8.200  172.16.8.3    iSCSI     358   SCSI Data In
42   0.000038   172.16.8.200  172.16.8.3    iSCSI     102   SCSI: Response LUN: 0x00 (Good)
43   0.000051   172.16.8.3    172.16.8.200  TCP       54    [ACK]
44   0.181812   172.16.8.3    172.16.8.200  iSCSI     102   SCSI: Read Capacity(10) LUN: 0x00
45   0.000210   172.16.8.200  172.16.8.3    iSCSI     110   SCSI Data In
46   0.000052   172.16.8.200  172.16.8.3    iSCSI     102   SCSI: Response LUN: 0x00 (Good)
47   0.000067   172.16.8.3    172.16.8.200  TCP       54    [ACK]
48   0.000094   172.16.8.3    172.16.8.200  iSCSI     102   SCSI: Read(10) LUN: 0x00 (LBA: 0x00000000, Len: 1)
49   0.000710   172.16.8.200  172.16.8.3    iSCSI     614   SCSI Data In
50   0.000073   172.16.8.200  172.16.8.3    iSCSI     102   SCSI: Response LUN: 0x00 (Good)
51   0.000078   172.16.8.3    172.16.8.200  TCP       54    [ACK]
Ok, it would be interesting if other people could confirm whether they are
seeing different behaviour from the above.
Thanks
Nigel Smith
This message posted from opensolaris.org