Hi Roman
Ok I'm now looking at the second set of files:
"winserver-jambo_on_tso_off.pcap" & "sun-jambo_on_tso_off.cap"

In your last post, you mentioned some errors reported by snoop
when capturing "sun-jambo_on_tso_off.cap".
Many thanks for noting those details, because this looks really interesting.

When I checked the size of packets in the snoop file, 
(by reading it into WireShark),
the maximum size I see is 9014, which is correct for an MTU of 9000,
EXCEPT there are four packets with unusually large size:
 
 Packet No: #42818 and #42820, #42829 and #42831
 
Note that those packets occur just after the packets mentioned by snoop:

  42817 (warning) packet length greater than MTU in buffer offset 0: length=26960
  42819 (warning) packet length greater than MTU in buffer offset 27048: length=18000
  42828 (warning) packet length greater than MTU in buffer offset 16728: length=27680
  42830 (warning) packet length greater than MTU in buffer offset 44496: length=13728
  
WireShark counts packets starting at #1; maybe snoop counts from #0?
That would explain why each warning is numbered one lower than the large packet.
So here are the sizes of those packets, as reported by WireShark from the
snoop file:

 #42818 - size 26934
 #42820 - size 17974
 #42829 - size 27654
 #42831 - size 13702
 
So the 'length' reported in each snoop warning is exactly 26 bytes larger
than the size reported by WireShark.
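
A quick sketch to double-check that offset, using only the numbers quoted
above. The variable names are mine, and the "+1" numbering shift is just the
guess about snoop counting from #0:

```python
# (snoop packet number, snoop warning length) as quoted in the warnings
snoop_warnings = [(42817, 26960), (42819, 18000), (42828, 27680), (42830, 13728)]
# (WireShark packet number, WireShark frame size) for the four large packets
wireshark_frames = [(42818, 26934), (42820, 17974), (42829, 27654), (42831, 13702)]

# Difference between the length snoop complains about and WireShark's size
length_diffs = [s_len - w_len
                for (_, s_len), (_, w_len) in zip(snoop_warnings, wireshark_frames)]

# Difference in packet numbering between the two tools
number_diffs = [w_no - s_no
                for (s_no, _), (w_no, _) in zip(snoop_warnings, wireshark_frames)]

print(length_diffs)   # [26, 26, 26, 26]
print(number_diffs)   # [1, 1, 1, 1]
```

So the 26-byte difference is constant across all four packets, and so is the
off-by-one in the packet numbers.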
 
Before looking at those weird packets in detail, let's look at some
normal activity, just before and just after those warning messages, so we
can compare normal with abnormal.

At packets #42784 and #42798, the Windows initiator does a 'Read(10)' with
a transfer length of 128 blocks. So that's 128 * 512 = 65536 bytes.
The OpenSolaris target replies with 8 packets of size 8294.
The TCP payload in each of those packets is 8240 bytes,
but that includes a 48-byte iscsi header. So the data payload is 8192.
Now 8 * 8192 = 65536 bytes, so that's fine.
After the eight 'SCSI Data In' packets, we have a separate iscsi status
response packet, with just a 48-byte iscsi header, corresponding to
102 bytes 'on the wire'.
Now we know Comstar iscsi target currently does NOT do 'phase collapse'
(See previous posts from Peter Dunlap & Peter Cudhea), so that is normal.
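
The arithmetic for the normal case can be sketched as below. I am assuming
14 + 20 + 20 bytes of Ethernet, IPv4 and TCP headers with no options, which
is what makes 8294 on the wire come out as 8240 of TCP payload:

```python
ETH_HDR, IP_HDR, TCP_HDR = 14, 20, 20   # assumed: no VLAN tag, no IP/TCP options
ISCSI_HDR = 48                          # iscsi header size quoted above
BLOCK_SIZE = 512

def tcp_payload(frame_size):
    """TCP payload bytes carried by a frame of the given on-the-wire size."""
    return frame_size - (ETH_HDR + IP_HDR + TCP_HDR)

read_bytes = 128 * BLOCK_SIZE                      # Read(10) of 128 blocks
data_per_packet = tcp_payload(8294) - ISCSI_HDR    # data in one 'SCSI Data In'
status_payload = tcp_payload(102)                  # separate status response

print(read_bytes)             # 65536
print(data_per_packet)        # 8192
print(8 * data_per_packet)    # 65536
print(status_payload)         # 48
```

The same function gives 4 * 8192 = 32768 bytes for a 64-block read.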

At packet #42834, the Windows initiator does a 'Read(10)' with
a transfer length of 64 blocks. This is handled very similarly to the above,
but with just four 'SCSI Data In' packets to deliver 32768 bytes, again
followed by a separate iscsi response packet.

Ok, now let's look at the 'strange' transfers.

In packet #42814, the Windows initiator does a 'Read(10)' with
a transfer length of 128 blocks.
The OpenSolaris target replies with two packets of size 8294 (#42815 & #42816),
but then packet #42818 with size 26934, #42820 with size 17974,
which look unusually large, and finally #42822 with size 4688.
And the iscsi status response, rather than being in a separate packet,
is the last 64 bytes of that final packet #42822.
Now if you go through the maths of that, accounting for the TCP headers
and the iscsi headers, it has still transferred 65536 bytes of data,
just in a strange way.
 
In packet #42824, the Windows initiator does another 'Read(10)' with
a transfer length of 128 blocks.
The OpenSolaris target replies with three packets of size 8294
(#42825, #42826 & #42827),
but then packet #42829 with size 27654 and #42831 with size 13702,
which again look unusually large.
Again the iscsi status response is the last 64 bytes of that final packet
#42831.
And again the 65536 bytes of data have been transferred in a strange way.
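
Here is a rough sketch of that accounting for both strange reads. It assumes
54 bytes of Ethernet/IPv4/TCP headers per packet, and that the target still
sends eight 48-byte 'Data In' headers plus one 48-byte response header per
read, as in the normal case. Both are my assumptions, not something I have
decoded from the capture:

```python
HDRS = 14 + 20 + 20     # assumed Ethernet + IPv4 + TCP headers, no options
ISCSI_HDR = 48

# 65536 data bytes + 8 Data-In headers + 1 response header (assumed layout)
expected_stream = 128 * 512 + 8 * ISCSI_HDR + ISCSI_HDR

def stream_bytes(frame_sizes):
    """Total TCP payload carried by a list of on-the-wire frame sizes."""
    return sum(size - HDRS for size in frame_sizes)

# First read, taking 4688 as a frame size like the other numbers
first_as_frame = stream_bytes([8294, 8294, 26934, 17974, 4688])
# First read, taking 4688 as the TCP payload of the final packet instead
first_as_payload = stream_bytes([8294, 8294, 26934, 17974]) + 4688
# Second read
second = stream_bytes([8294, 8294, 8294, 27654, 13702])

print(expected_stream)     # 65968
print(first_as_frame)      # 65914
print(first_as_payload)    # 65968
print(second)              # 65968
```

The second read balances exactly. The first read balances only if 4688 is the
TCP payload of #42822 rather than its frame size; read as a frame size it
comes up 54 bytes short, which is exactly one set of headers. That is an
inference from the arithmetic only.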

Ok, so how does the above look at the receiving end
in file "winserver-jambo_on_tso_off.pcap" ?

Obviously the packet numbers corresponding to the above Read(10) commands
are different in the capture file obtained at the Windows end.
To match up the two files, I used Tshark to extract a list of
LBA offsets for each Read(10), from each capture file.
From that I was able to work out and confirm which packets on the
OpenSolaris side corresponded with the ones on the Windows side.
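
For anyone wanting to repeat the matching, this is a minimal sketch of the
idea, not the exact commands I ran. The Tshark field names in the comment are
from memory and should be checked against your WireShark version, and the LBA
values in the example are placeholders, not taken from the captures:

```python
from collections import defaultdict, deque

# Assumed way to produce the input lists (field names may differ by version):
#   tshark -r <capture file> -Y "scsi_sbc.opcode == 0x28" \
#          -T fields -e frame.number -e scsi_sbc.rdwr10.lba

def match_by_lba(side_a, side_b):
    """Pair frame numbers from two captures by Read(10) LBA.

    Each argument is a list of (frame_number, lba) in capture order.
    Repeated LBAs are paired in order of appearance.
    """
    pending = defaultdict(deque)
    for frame, lba in side_b:
        pending[lba].append(frame)
    pairs = []
    for frame, lba in side_a:
        if pending[lba]:
            pairs.append((frame, pending[lba].popleft()))
    return pairs

# Frame numbers are from this post; the LBAs 1000 and 1128 are made up.
solaris_reads = [(42814, 1000), (42824, 1128)]
windows_reads = [(42713, 1000), (42726, 1128)]
print(match_by_lba(solaris_reads, windows_reads))
```

A Read(10) that appears in only one capture is simply left unpaired.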

In the Windows file the first strange Read(10) is packet #42713 and the second 
is #42726.

So what do those strange 'Data In' packets look like when captured by WireShark
on the Windows server? The answer is that, again, they look strange.

For the first read, the response data comes back as:
2 packets of size 8294 (#42714, #42715), then
5 packets of size 9014 (#42717, #42718, #42720, #42721, #42723), and
a final packet of size 4742 (#42724).

For the second read, the response data comes back as:
3 packets of size 8294 (#42727, #42728, #42730), then
3 packets of size 9014 (#42731, #42733, #42734), then
1 packet of size 774 (#42735), then
1 packet of size 9014 (#42737), and
a final packet of size 4742 (#42739).
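
As a sanity check on the two lists above, here is a sketch that totals the TCP
payload seen at the Windows end and compares the oversized frames in the snoop
file with groups of normal-sized packets on the wire. It assumes 54 bytes of
Ethernet/IPv4/TCP headers per packet, and the grouping is my own pairing from
the arithmetic, not something decoded from the captures:

```python
HDRS = 14 + 20 + 20     # assumed Ethernet + IPv4 + TCP headers, no options

def payloads(frame_sizes):
    """TCP payload bytes for each on-the-wire frame size."""
    return [size - HDRS for size in frame_sizes]

# Frame sizes seen at the Windows end for the two strange reads
win_first = [8294] * 2 + [9014] * 5 + [4742]
win_second = [8294] * 3 + [9014] * 3 + [774, 9014, 4742]

total_first = sum(payloads(win_first))
total_second = sum(payloads(win_second))
print(total_first, total_second)    # 65968 65968

# Oversized frame in the snoop file -> Windows-side frames that follow the
# initial 8294-byte packets, in order (my grouping)
grouping = {
    26934: [9014, 9014, 9014],
    17974: [9014, 9014],
    27654: [9014, 9014, 9014, 774],
    13702: [9014, 4742],
}
balanced = {big: (big - HDRS) == sum(payloads(wire))
            for big, wire in grouping.items()}
print(balanced)
```

Every entry comes out True, so each oversized frame in the snoop file carries
exactly the payload of a run of consecutive packets seen at the Windows end.
Both reads also total 65968 bytes of TCP payload, which is 65536 bytes of data
plus nine 48-byte iscsi headers.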

WireShark seems to be struggling to decode the payload from those packets,
so I will not attempt a more detailed explanation.

For the 'normal' Read(10) commands I mentioned earlier, the packets Windows
receives are exactly the same as OpenSolaris is sending, as one would expect!
A read of 64 blocks comes back as 4 packets of 8294 bytes, plus 
a separate iscsi status response of 102 bytes.
A read of 128 blocks comes back as 8 packets of 8294 bytes, plus 
a separate iscsi status response of 102 bytes.

:---------------:

Ok, so what is causing this strangeness?
Is this what is causing Roman's attempts to format the LUN with NTFS & FAT
to fail?
Is what we are seeing a problem in Snoop?
Or the OpenSolaris TCP stack?
Or the NIC driver? Or Comstar?

Hopefully some of the OpenSolaris kernel code experts have read through
the above, and maybe can offer an explanation.

Roman, if you repeat this test, does 'snoop' always issue the same warnings
in a consistent way, or is it erratic?

Roman, please confirm what network cards you are using, at both sides,
and which release of OpenSolaris. Thanks.

Late last year, we had something strange like this.
First we thought it was the old iscsi target, then we realised it was only
being reported with certain Broadcom NICs, but it turned out to be a bug
in how OpenSolaris handled I/OAT:

  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6826987
  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6750954 
  http://www.intel.com/network/connectivity/vtc_ioat.htm

Roman, are you using an Intel chipset on your OpenSolaris server?
Just a guess, but it may be worth eliminating this as a possible cause,
for example by disabling I/OAT in the BIOS, if possible.
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
