> -----Original Message----- > From: open-iscsi@googlegroups.com [mailto:open-iscsi@googlegroups.com] > On Behalf Of Joe Hoot > Sent: Thursday, April 28, 2011 8:08 PM > To: open-iscsi@googlegroups.com > Subject: Re: determining window size > > Thanks for the quick response (as always) Mike. > > Ok. I see where you're going. I should be able to build a kernel, but > I have already veered a little bit away from where Oracle already is > with their OVM kernel. I don't want to veer to far (even with printk > statements)-- but maybe... let me think about possibly doing something > with just tcpdump to see if I can see tcp windows scaling real low. > > We have a concern at ITEC that the EqualLogic is setting TCP windows > towards zero. And when it does that we are concerned that if it does > this too much, even noop pings aren't getting through.
What is the NIC at the initiator..? Is it an offload session ? I have seen a TCP zero window issue recently. > > So... let think about this and see if I can respond again to you > tomorrow morning. > > Thanks again for jumping on this. > > Joe > > > On Apr 28, 2011, at 6:29 PM, Mike Christie wrote: > > > On 04/28/2011 02:42 PM, Joe Hoot wrote: > >> Mike or others, > >> > >> I would like to enable a small (yes, feel free to snicker here...) > bit of debugging on one of our systems to try to capture tcp window > sizes (so that I can attempt to trend some behaviors). Mike, recall > awhile ago you had mentioned that the following will help us gather > more information: > >> > > > > How are you setting the window sizes? > > > >> echo 1> /sys/module/libiscsi2/parameters/debug_libiscsi > >> echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp > >> echo 1> /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp > >> > >> I believe that the only one that I really need to turn on to get > more information about size of data being sent would be this one: > >> > >> echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp > >> > >> as it gives me something like this when I run this command: > >> [root@oim61024006 ~]# echo 1> > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd > if=/dev/zero > of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K > count=1; echo 0> > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp > >> 1+0 records in > >> 1+0 records out > >> 1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s > >> [root@oim61024006 ~]# > >> > >> That produced this in /var/log/messages: > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_task_init task deq [itt 0x1c6 imm 0 unsol 0] > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 0 size 48 xmit > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 48 size 48 xmit > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done total copied 48 total size 48 > > > > Need libiscsi debugging on, but this looks like it might have been a > > iscsi pdus related to scsi READ commands. On the next line we see > some > > data coming in from the target in response to some command. > > > > > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb in 560 bytes > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb skb deace200 ptr=dd84c800 avail=560 > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 0 size 48 recv > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_recv copying 48 > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 48 size 48 recv > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done total copied 48 total size 48 > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb segment done > >> > >> ---> and I'm thinking that this is actually where my dd starts > going: > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_hdr_dissect opcode 0x25 ahslen 0 datalen 512 > > > > opcode 25 is a iSCSI SCSI DATA_IN. Basically the target sends data to > > the initiator in this type of pdu in response to a read command. > Below > > we see more data getting read in. > > > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_hdr_dissect iscsi_tcp_begin_data_in( offset=0, datalen=512) > > > > so here datalen means that the DATA_IN pdu has 512 bytes of data that > we > > are going to read in. > > > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb in 512 bytes > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb skb deace200 ptr=dd84c830 avail=512 > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 0 size 512 recv > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_recv copying 512 > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 512 size 512 recv > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done total copied 512 total size 512 > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb segment done > > > > So the above blob just means we got the 512 bytes read in > successfully. > > > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_hdr_recv_prep (digest disabled) > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb in 0 bytes > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_recv_skb no more data avail. Consumed 0 > >> --> and this is where it had finished? > >> > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done copied 0 0 size 0 xmit > >> Apr 28 15:26:07 oim61024006 kernel: connection22:0: > iscsi_tcp_segment_done total copied 0 total size 0 > >> > >> I'm assuming that these "size 48 recv" might be some keepalive thing > (nop pings, dm-multipath readsector 0, etc..)? > >> > > > > Is a iscsi header. In a iscsi packet there is basically a header that > > tells us what type of command it is (ping, scsi command, ready to > > transfer, login, etc), how much data is going to be transferred in > the > > data segment after the header, etc. > > > > For the xmit path you need libiscsi debugging on as well as > libiscsi_tcp > > to determine what type of iscsi header it is. > > > > See above for info the packets. > > > >> 3) "iscsi_tcp_recv_skb in 512 bytes" > >> > >> skb = send kb's? > > > > skb is a networking thing. It is just short for "struct sk_buff" > which > > just represents a buffer that we could send/recv to/from. > > > > > >> > >> etc... > >> > >> But what I'm really trying to pull out of this information is trying > to trend how often the initiator is trying to xmit data, but is not > allowed to. Maybe there is also a way (outside of just tcpdumping... > or even with tcpdump, with a tweaked filter so I don't get gigs of > pcaps) to understand what the tcp window size is currently being set > to. > >> > > > > There is no nice printk for when sendpage/sendmsg has failed to send > > data. In the past logs I could tell it was happening because in the > xmit > > messages we would repeat messages like this: > > > > connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit > > > > which comes from here: > > > > ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "copied %u %u size %u > %s\n", > > segment->copied, copied, segment->size, > > recv ? "recv" : "xmit"); > > > > And so the first value is how much we have copied so far. The second > > values is how much got copied when we just called sendpage/sendmsg > and > > the last value size is the total amount of data we want to send. > > > > In the logs you sent it kept repeating the same values and we never > saw > > the copied values get incremented and we never saw this message > > indicating we sent everything. > > > > /* Do we have more scatterlist entries? */ > > ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "total copied %u total > size > > %u\n", > > segment->total_copied, segment->total_size); > > > > > >> I would then try to parse it after the fact to put the data in rrd > graphs. > >> > >> If there are any sites that explain this in more detail, I would > happy to dig. The only issue I have is that I'm not a programmer... I > haven't reviewed or coded in C in about 10 years (and even then I was > just learning it in school). So if the site is too in depth, I may > still be asking questions... but regardless, I am more than happy to > hit the ground running and at least do the due diligence of reviewing > the info if someone can point me in the right direction. > >> > > > > If you can build I kernel I can make this a lot easier. I can just > add a > > printk to the sendpage/sendmsg call, and so it will be really easy to > > pinpoint. > > > > -- > > You received this message because you are subscribed to the Google > Groups "open-iscsi" group. > > To post to this group, send email to open-iscsi@googlegroups.com. > > To unsubscribe from this group, send email to open- > iscsi+unsubscr...@googlegroups.com. > > For more options, visit this group at > http://groups.google.com/group/open-iscsi?hl=en. > > > > -- > You received this message because you are subscribed to the Google > Groups "open-iscsi" group. > To post to this group, send email to open-iscsi@googlegroups.com. > To unsubscribe from this group, send email to open- > iscsi+unsubscr...@googlegroups.com. > For more options, visit this group at > http://groups.google.com/group/open-iscsi?hl=en. -- You received this message because you are subscribed to the Google Groups "open-iscsi" group. To post to this group, send email to open-iscsi@googlegroups.com. To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.