> -----Original Message-----
> From: open-iscsi@googlegroups.com [mailto:open-iscsi@googlegroups.com]
> On Behalf Of Joe Hoot
> Sent: Thursday, April 28, 2011 8:08 PM
> To: open-iscsi@googlegroups.com
> Subject: Re: determining window size
> 
> Thanks for the quick response (as always) Mike.
> 
> Ok.  I see where you're going.  I should be able to build a kernel, but
> I have already veered a little bit away from where Oracle already is
> with their OVM kernel.  I don't want to veer to far (even with printk
> statements)-- but maybe... let me think about possibly doing something
> with just tcpdump to see if I can see tcp windows scaling real low.
> 
> We have a concern at ITEC that the EqualLogic is setting TCP windows
> towards zero.  And when it does that we are concerned that if it does
> this too much, even noop pings aren't getting through.

What is the NIC at the initiator..? Is it an offload session ?

I have seen a TCP zero window issue recently.

> 
> So... let think about this and see if I can respond again to you
> tomorrow morning.
> 
> Thanks again for jumping on this.
> 
> Joe
> 
> 
> On Apr 28, 2011, at 6:29 PM, Mike Christie wrote:
> 
> > On 04/28/2011 02:42 PM, Joe Hoot wrote:
> >> Mike or others,
> >>
> >> I would like to enable a small (yes, feel free to snicker here...)
> bit of debugging on one of our systems to try to capture tcp window
> sizes (so that I can attempt to trend some behaviors).  Mike, recall
> awhile ago you had mentioned that the following will help us gather
> more information:
> >>
> >
> > How are you setting the window sizes?
> >
> >>   echo 1>  /sys/module/libiscsi2/parameters/debug_libiscsi
> >>   echo 1>  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> >>   echo 1>  /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp
> >>
> >> I believe that the only one that I really need to turn on to get
> more information about size of data being sent would be this one:
> >>
> >>   echo 1>  /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> >>
> >> as it gives me something like this when I run this command:
> >> [root@oim61024006 ~]# echo 1>
> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd
> if=/dev/zero
> of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K
> count=1; echo 0>
> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> >> 1+0 records in
> >> 1+0 records out
> >> 1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
> >> [root@oim61024006 ~]#
> >>
> >> That produced this in /var/log/messages:
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_task_init task deq [itt 0x1c6 imm 0 unsol 0]
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 0 size 48 xmit
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 48 size 48 xmit
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done total copied 48 total size 48
> >
> > Need libiscsi debugging on, but this looks like it might have been a
> > iscsi pdus related to scsi READ commands. On the next line we see
> some
> > data coming in from the target in response to some command.
> >
> >
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb in 560 bytes
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb skb deace200 ptr=dd84c800 avail=560
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 0 size 48 recv
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_recv copying 48
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 48 size 48 recv
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done total copied 48 total size 48
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb segment done
> >>
> >> --->  and I'm thinking that this is actually where my dd starts
> going:
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_hdr_dissect opcode 0x25 ahslen 0 datalen 512
> >
> > opcode 25 is a iSCSI SCSI DATA_IN. Basically the target sends data to
> > the initiator in this type of pdu in response to a read command.
> Below
> > we see more data getting read in.
> >
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_hdr_dissect iscsi_tcp_begin_data_in( offset=0, datalen=512)
> >
> > so here datalen means that the DATA_IN pdu has 512 bytes of data that
> we
> > are going to read in.
> >
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb in 512 bytes
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb skb deace200 ptr=dd84c830 avail=512
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 0 size 512 recv
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_recv copying 512
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 512 size 512 recv
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done total copied 512 total size 512
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb segment done
> >
> > So the above blob just means we got the 512 bytes read in
> successfully.
> >
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_hdr_recv_prep (digest disabled)
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb in 0 bytes
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_recv_skb no more data avail. Consumed 0
> >> -->  and this is where it had finished?
> >>
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done copied 0 0 size 0 xmit
> >> Apr 28 15:26:07 oim61024006 kernel:  connection22:0:
> iscsi_tcp_segment_done total copied 0 total size 0
> >>
> >> I'm assuming that these "size 48 recv" might be some keepalive thing
> (nop pings, dm-multipath readsector 0, etc..)?
> >>
> >
> > Is a iscsi header. In a iscsi packet there is basically a header that
> > tells us what type of command it is (ping, scsi command, ready to
> > transfer, login, etc), how much data is going to be transferred in
> the
> > data segment after the header, etc.
> >
> > For the xmit path you need libiscsi debugging on as well as
> libiscsi_tcp
> > to determine what type of iscsi header it is.
> >
> > See above for info the packets.
> >
> >> 3) "iscsi_tcp_recv_skb in 512 bytes"
> >>
> >>    skb = send kb's?
> >
> > skb is a networking thing. It is just short for "struct sk_buff"
> which
> > just represents a buffer that we could send/recv to/from.
> >
> >
> >>
> >> etc...
> >>
> >> But what I'm really trying to pull out of this information is trying
> to trend how often the initiator is trying to xmit data, but is not
> allowed to.   Maybe there is also a way (outside of just tcpdumping...
> or even with tcpdump, with a tweaked filter so I don't get gigs of
> pcaps) to understand what the tcp window size is currently being set
> to.
> >>
> >
> > There is no nice printk for when sendpage/sendmsg has failed to send
> > data. In the past logs I could tell it was happening because in the
> xmit
> > messages we would repeat messages like this:
> >
> > connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
> >
> > which comes from here:
> >
> >         ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "copied %u %u size %u
> %s\n",
> >                       segment->copied, copied, segment->size,
> >                       recv ? "recv" : "xmit");
> >
> > And so the first value is how much we have copied so far. The second
> > values is how much got copied when we just called sendpage/sendmsg
> and
> > the last value size is the total amount of data we want to send.
> >
> > In the logs you sent it kept repeating the same values and we never
> saw
> > the copied values get incremented and we never saw this message
> > indicating we sent everything.
> >
> >         /* Do we have more scatterlist entries? */
> >         ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "total copied %u total
> size
> > %u\n",
> >                       segment->total_copied, segment->total_size);
> >
> >
> >> I would then try to parse it after the fact to put the data in rrd
> graphs.
> >>
> >> If there are any sites that explain this in more detail, I would
> happy to dig.  The only issue I have is that I'm not a programmer... I
> haven't reviewed or coded in C in about 10 years (and even then I was
> just learning it in school).  So if the site is too in depth, I may
> still be asking questions... but regardless, I am more than happy to
> hit the ground running and at least do the due diligence of reviewing
> the info if someone can point me in the right direction.
> >>
> >
> > If you can build I kernel I can make this a lot easier. I can just
> add a
> > printk to the sendpage/sendmsg call, and so it will be really easy to
> > pinpoint.
> >
> > --
> > You received this message because you are subscribed to the Google
> Groups "open-iscsi" group.
> > To post to this group, send email to open-iscsi@googlegroups.com.
> > To unsubscribe from this group, send email to open-
> iscsi+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> http://groups.google.com/group/open-iscsi?hl=en.
> >
> 
> --
> You received this message because you are subscribed to the Google
> Groups "open-iscsi" group.
> To post to this group, send email to open-iscsi@googlegroups.com.
> To unsubscribe from this group, send email to open-
> iscsi+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/open-iscsi?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.

Reply via email to