determining window size

2011-04-28 Thread Joe Hoot
Mike or others,

I would like to enable a small (yes, feel free to snicker here...) bit of 
debugging on one of our systems to try to capture tcp window sizes (so that I 
can attempt to trend some behaviors).  Mike, recall awhile ago you had 
mentioned that the following will help us gather more information:

  echo 1  /sys/module/libiscsi2/parameters/debug_libiscsi
  echo 1  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
  echo 1  /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp

I believe that the only one that I really need to turn on to get more 
information about size of data being sent would be this one:

  echo 1  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp

as it gives me something like this when I run this command:
[root@oim61024006 ~]# echo 1  
/sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd 
if=/dev/zero of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img 
bs=1K count=1; echo 0  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
[root@oim61024006 ~]# 

That produced this in /var/log/messages:
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_task_init task 
deq [itt 0x1c6 imm 0 unsol 0]
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 48 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 48 total size 48
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 560 
bytes
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb skb 
deace200 ptr=dd84c800 avail=560
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 48 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_recv 
copying 48
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 48 size 48 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 48 total size 48
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb segment 
done

--- and I'm thinking that this is actually where my dd starts going:
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_dissect 
opcode 0x25 ahslen 0 datalen 512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_dissect 
iscsi_tcp_begin_data_in( offset=0, datalen=512)
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 512 
bytes
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb skb 
deace200 ptr=dd84c830 avail=512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 512 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_recv 
copying 512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 512 size 512 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 512 total size 512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb segment 
done
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_recv_prep 
(digest disabled)
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 0 
bytes
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb no more 
data avail. Consumed 0
-- and this is where it had finished?

Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 0 xmit
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 0 total size 0

I'm assuming that these size 48 recv might be some keepalive thing (nop 
pings, dm-multipath readsector 0, etc..)?


Would someone on the list be able to help me understand what these messages 
mean?  

For example, in the first one above:

1)  iscsi_tcp_hdr_dissect opcode 0x25 ahslen 0 datalen 512 

   0x25 = I don't need to know exactly, but I'm assuming this is telling the 
target to get ready for a write
   ahslen = some type of send length?
   datalen 512 = amount that the initiator will be sending in this segment?

2) iscsi_tcp_hdr_dissect iscsi_tcp_begin_data_in( offset=0, datalen=512)

   basically tells disk to start a 0 and write 512bytes?
   
3) iscsi_tcp_recv_skb in 512 bytes

   skb = send kb's?   

etc...

But what I'm really trying to pull out of this information is trying to trend 
how often the initiator is trying to xmit data, but is not allowed to.   Maybe 
there is also a way (outside of just tcpdumping... or even with tcpdump, with a 
tweaked filter so I don't get gigs of pcaps) to understand what the tcp window 
size is currently being set to.

I would then try to parse it after the fact to put the data in rrd graphs.

If there are any sites that explain this in more detail, I 

Re: determining window size

2011-04-28 Thread Mike Christie

On 04/28/2011 02:42 PM, Joe Hoot wrote:

Mike or others,

I would like to enable a small (yes, feel free to snicker here...) bit of 
debugging on one of our systems to try to capture tcp window sizes (so that I 
can attempt to trend some behaviors).  Mike, recall awhile ago you had 
mentioned that the following will help us gather more information:



How are you setting the window sizes?


   echo 1  /sys/module/libiscsi2/parameters/debug_libiscsi
   echo 1  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
   echo 1  /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp

I believe that the only one that I really need to turn on to get more 
information about size of data being sent would be this one:

   echo 1  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp

as it gives me something like this when I run this command:
[root@oim61024006 ~]# echo 1  
/sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd if=/dev/zero 
of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K count=1; echo 
0  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
[root@oim61024006 ~]#

That produced this in /var/log/messages:
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_task_init task 
deq [itt 0x1c6 imm 0 unsol 0]
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 48 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 48 total size 48


Need libiscsi debugging on, but this looks like it might have been a 
iscsi pdus related to scsi READ commands. On the next line we see some 
data coming in from the target in response to some command.




Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 560 
bytes
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb skb 
deace200 ptr=dd84c800 avail=560
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 48 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_recv 
copying 48
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 48 size 48 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 48 total size 48
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb segment 
done

---  and I'm thinking that this is actually where my dd starts going:
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_dissect 
opcode 0x25 ahslen 0 datalen 512


opcode 25 is a iSCSI SCSI DATA_IN. Basically the target sends data to 
the initiator in this type of pdu in response to a read command. Below 
we see more data getting read in.



Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_dissect 
iscsi_tcp_begin_data_in( offset=0, datalen=512)


so here datalen means that the DATA_IN pdu has 512 bytes of data that we 
are going to read in.



Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 512 
bytes
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb skb 
deace200 ptr=dd84c830 avail=512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 512 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_recv 
copying 512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 512 size 512 recv
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 512 total size 512
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb segment 
done


So the above blob just means we got the 512 bytes read in successfully.


Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_recv_prep 
(digest disabled)
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 0 
bytes
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb no more 
data avail. Consumed 0
--  and this is where it had finished?

Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
copied 0 0 size 0 xmit
Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
total copied 0 total size 0

I'm assuming that these size 48 recv might be some keepalive thing (nop 
pings, dm-multipath readsector 0, etc..)?



Is a iscsi header. In a iscsi packet there is basically a header that 
tells us what type of command it is (ping, scsi command, ready to 
transfer, login, etc), how much data is going to be transferred in the 
data segment after the header, etc.


For the xmit path you need libiscsi debugging on as well as libiscsi_tcp 
to determine what type of iscsi header it is.


See above for info the packets.


3) iscsi_tcp_recv_skb in 512 bytes

skb 

Re: determining window size

2011-04-28 Thread Joe Hoot
Thanks for the quick response (as always) Mike.

Ok.  I see where you're going.  I should be able to build a kernel, but I have 
already veered a little bit away from where Oracle already is with their OVM 
kernel.  I don't want to veer to far (even with printk statements)-- but 
maybe... let me think about possibly doing something with just tcpdump to see 
if I can see tcp windows scaling real low.

We have a concern at ITEC that the EqualLogic is setting TCP windows towards 
zero.  And when it does that we are concerned that if it does this too much, 
even noop pings aren't getting through.

So... let think about this and see if I can respond again to you tomorrow 
morning.

Thanks again for jumping on this.

Joe


On Apr 28, 2011, at 6:29 PM, Mike Christie wrote:

 On 04/28/2011 02:42 PM, Joe Hoot wrote:
 Mike or others,
 
 I would like to enable a small (yes, feel free to snicker here...) bit of 
 debugging on one of our systems to try to capture tcp window sizes (so that 
 I can attempt to trend some behaviors).  Mike, recall awhile ago you had 
 mentioned that the following will help us gather more information:
 
 
 How are you setting the window sizes?
 
   echo 1  /sys/module/libiscsi2/parameters/debug_libiscsi
   echo 1  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
   echo 1  /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp
 
 I believe that the only one that I really need to turn on to get more 
 information about size of data being sent would be this one:
 
   echo 1  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
 
 as it gives me something like this when I run this command:
 [root@oim61024006 ~]# echo 1  
 /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd 
 if=/dev/zero 
 of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K 
 count=1; echo 0  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
 1+0 records in
 1+0 records out
 1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
 [root@oim61024006 ~]#
 
 That produced this in /var/log/messages:
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_task_init 
 task deq [itt 0x1c6 imm 0 unsol 0]
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 copied 0 0 size 48 xmit
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 copied 0 48 size 48 xmit
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 total copied 48 total size 48
 
 Need libiscsi debugging on, but this looks like it might have been a
 iscsi pdus related to scsi READ commands. On the next line we see some
 data coming in from the target in response to some command.
 
 
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 
 560 bytes
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb skb 
 deace200 ptr=dd84c800 avail=560
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 copied 0 0 size 48 recv
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_recv 
 copying 48
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 copied 0 48 size 48 recv
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 total copied 48 total size 48
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb 
 segment done
 
 ---  and I'm thinking that this is actually where my dd starts going:
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_dissect 
 opcode 0x25 ahslen 0 datalen 512
 
 opcode 25 is a iSCSI SCSI DATA_IN. Basically the target sends data to
 the initiator in this type of pdu in response to a read command. Below
 we see more data getting read in.
 
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_dissect 
 iscsi_tcp_begin_data_in( offset=0, datalen=512)
 
 so here datalen means that the DATA_IN pdu has 512 bytes of data that we
 are going to read in.
 
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 
 512 bytes
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb skb 
 deace200 ptr=dd84c830 avail=512
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 copied 0 0 size 512 recv
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_recv 
 copying 512
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 copied 0 512 size 512 recv
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_segment_done 
 total copied 512 total size 512
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb 
 segment done
 
 So the above blob just means we got the 512 bytes read in successfully.
 
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_hdr_recv_prep 
 (digest disabled)
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: iscsi_tcp_recv_skb in 0 
 bytes
 Apr 28 15:26:07 oim61024006 kernel:  connection22:0: