Chris,

Many thanks for emailing the snoop capture file to me. The reason it was so large was that you had inadvertently captured a lot of SSH data. Once I had filtered that out, the capture file was a much more sensible size. You have captured the full iSCSI session, so that is good. I have made the capture file available here:
http://www.nwsmith.net/solaris/AppInit-SunTgt.cap

When I looked at the capture with Ethereal, I was surprised by how much negotiation the GlobalSAN initiator was doing compared to what we normally see with other initiators. There are numerous repeated SCSI Inquiries and Mode Senses. As I think these are not relevant to the cause of the problem, I've edited them out of this post and put them into a separate file, for those who are interested in looking at them, available here:
http://www.nwsmith.net/solaris/AppInit-SunTgt-Decode.txt

Ok, so here's my analysis of the capture:

GlobalSAN Initiator is 192.168.1.5
Solaris Target is 192.168.1.1

### Initiator Login (packet #6)
SessionType=Normal
InitialR2T=No
HeaderDigest=None
DataDigest=None
MaxConnections=1
ImmediateData=Yes
MaxOutstandingR2T=4
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
ErrorRecoveryLevel=0
TargetName=iqn.1986-03.com.sun:02:15665ce1-a3a0-e5bd-e9dd-d78783dec173.media
InitiatorName=iqn.2005-03.com.studionetworksolutions:mac-1093623522

### Target Response - Success (packet #8+9+10)
InitialR2T=Yes
HeaderDigest=None
DataDigest=None
MaxConnections=1
ImmediateData=Yes
MaxOutstandingR2T=4
DataPDUInOrder=Yes
DataSequenceInOrder=Yes
ErrorRecoveryLevel=0
TargetAlias=media
TargetPortalGroupTag=1

Init->Tgt (packet#14) SCSI: Test Unit Ready LUN: 0x00
Tgt->Init (packet#16) SCSI: Response LUN: 0x00 (Good)
<--snip-->
Init->Tgt (packet#185) SCSI: Read Capacity(10) LUN: 0x00
Tgt->Init (packet#186-187) SCSI: Data In LUN: 0x00 (Read Capacity(10) Response)
<--snip-->
Init->Tgt (packet#237) SCSI: Read(10) LUN: 0x00 (LBA: 0x00000000, Len: 48)

The Target ACKs this packet, then does nothing further.
After 4 seconds, the Initiator tries:

Init->Tgt (packet#239) SCSI: Test Unit Ready LUN: 0x00

Again, the Target ACKs this packet, then does nothing further.
In packet#241 the initiator sends a FIN to close the TCP session.

Basically, we are seeing that as soon as the initiator tries to do an actual READ from the storage, the target never responds.
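
For anyone who wants to cross-check the login exchange mechanically, here is a minimal Python sketch. It only uses the key/value pairs exactly as listed above for packet #6 and packets #8-10 (the listing may not be the complete set from the capture). The helper names (`parse_keys`, `changed_keys`, `defaulted_keys`) are my own, and the defaults table is my reading of the RFC 3720 login/text operational keys, so please verify it against the RFC before relying on it. It does not claim to identify which key, if any, triggers the problem.

```python
# Key/value pairs copied from the decode in this post.
INITIATOR_LOGIN = """
SessionType=Normal InitialR2T=No HeaderDigest=None DataDigest=None
MaxConnections=1 ImmediateData=Yes MaxOutstandingR2T=4
DataPDUInOrder=Yes DataSequenceInOrder=Yes ErrorRecoveryLevel=0
TargetName=iqn.1986-03.com.sun:02:15665ce1-a3a0-e5bd-e9dd-d78783dec173.media
InitiatorName=iqn.2005-03.com.studionetworksolutions:mac-1093623522
"""

TARGET_RESPONSE = """
InitialR2T=Yes HeaderDigest=None DataDigest=None MaxConnections=1
ImmediateData=Yes MaxOutstandingR2T=4 DataPDUInOrder=Yes
DataSequenceInOrder=Yes ErrorRecoveryLevel=0
TargetAlias=media TargetPortalGroupTag=1
"""

# Assumption: default values as given in RFC 3720 for operational keys
# that a side may leave out of the login.
RFC3720_DEFAULTS = {
    "HeaderDigest": "None",
    "DataDigest": "None",
    "MaxConnections": "1",
    "InitialR2T": "Yes",
    "ImmediateData": "Yes",
    "MaxRecvDataSegmentLength": "8192",
    "MaxBurstLength": "262144",
    "FirstBurstLength": "65536",
    "DefaultTime2Wait": "2",
    "DefaultTime2Retain": "20",
    "MaxOutstandingR2T": "1",
    "DataPDUInOrder": "Yes",
    "DataSequenceInOrder": "Yes",
    "ErrorRecoveryLevel": "0",
}


def parse_keys(text):
    """Turn whitespace-separated key=value tokens into a dict."""
    pairs = {}
    for token in text.split():
        key, _, value = token.partition("=")
        pairs[key] = value
    return pairs


def changed_keys(offer, response):
    """Keys present on both sides whose values differ: {key: (offered, answered)}."""
    return {
        key: (offer[key], response[key])
        for key in offer
        if key in response and offer[key] != response[key]
    }


def defaulted_keys(offer):
    """Keys that have an RFC 3720 default but were not sent by the initiator."""
    return {k: v for k, v in RFC3720_DEFAULTS.items() if k not in offer}


if __name__ == "__main__":
    offer = parse_keys(INITIATOR_LOGIN)
    response = parse_keys(TARGET_RESPONSE)
    print("Changed by target:", changed_keys(offer, response))
    print("Not sent by initiator (would fall back to defaults):")
    for key, value in sorted(defaulted_keys(offer).items()):
        print("  %s=%s" % (key, value))
```

On the listing above, the only key whose value differs between offer and response is InitialR2T (No offered, Yes returned), which is a legal outcome since the RFC's result function for that key is OR. The keys not sent by the initiator in this listing are the burst-length, segment-length and DefaultTime2Wait/Retain keys.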

The target going silent on the first READ is exactly what we saw with the gPXE initiator here:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-October/003552.html

So I think there is a good chance that the GlobalSAN initiator has the same issue. Basically, the Solaris iSCSI target expects to see certain key/value pairs from the initiator, and if they are not present, it assumes default values, which seem to be inappropriate.

I know Jim Dunham and his team are looking at this problem, and hopefully working on a solution. Bug ID 6619812 should track progress:
http://bugs.opensolaris.org/view_bug.do?bug_id=6619812

Jim and his team like to take a very rigorous approach to understanding, fixing and testing problems, so don't expect a quick fix. For a quick fix, the best hope is that Rick McNeal and Andrew Hettinger could make their 'fixed' version of the Solaris iSCSI target available to you.

More background on these issues here:
http://www.opensolaris.org/jive/thread.jspa?messageID=167293
http://www.opensolaris.org/jive/thread.jspa?messageID=165390

Chris, could you just confirm which build of OpenSolaris Nevada you are using?

Regards
Nigel Smith
