On 04/28/2011 02:42 PM, Joe Hoot wrote:
Mike or others,
I would like to enable a small (yes, feel free to snicker here...) bit of
debugging on one of our systems to try to capture tcp window sizes (so that I
can attempt to trend some behaviors). Mike, recall awhile ago you had
mentioned that the following will help us gather more information:
How are you setting the window sizes?
echo 1> /sys/module/libiscsi2/parameters/debug_libiscsi
echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
echo 1> /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp
I believe that the only one that I really need to turn on to get more
information about size of data being sent would be this one:
echo 1> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
as it gives me something like this when I run this command:
[root@oim61024006 ~]# echo 1>
/sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp; sleep 1; dd if=/dev/zero
of=/var/ovs/mount/DCA44DBDD7944327A7945E96BF8F7CCC/joetest.dd.img bs=1K count=1; echo
0> /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.7e-05 seconds, 27.7 MB/s
[root@oim61024006 ~]#
That produced this in /var/log/messages:
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_task_init task
deq [itt 0x1c6 imm 0 unsol 0]
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 0 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 48 size 48 xmit
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
total copied 48 total size 48
Need libiscsi debugging on, but this looks like it might have been a
iscsi pdus related to scsi READ commands. On the next line we see some
data coming in from the target in response to some command.
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 560
bytes
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb skb
deace200 ptr=dd84c800 avail=560
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 0 size 48 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_recv
copying 48
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 48 size 48 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
total copied 48 total size 48
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb segment
done
---> and I'm thinking that this is actually where my dd starts going:
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_dissect
opcode 0x25 ahslen 0 datalen 512
opcode 25 is a iSCSI SCSI DATA_IN. Basically the target sends data to
the initiator in this type of pdu in response to a read command. Below
we see more data getting read in.
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_dissect
iscsi_tcp_begin_data_in( offset=0, datalen=512)
so here datalen means that the DATA_IN pdu has 512 bytes of data that we
are going to read in.
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 512
bytes
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb skb
deace200 ptr=dd84c830 avail=512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 0 size 512 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_recv
copying 512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 512 size 512 recv
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
total copied 512 total size 512
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb segment
done
So the above blob just means we got the 512 bytes read in successfully.
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_hdr_recv_prep
(digest disabled)
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb in 0
bytes
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_recv_skb no more
data avail. Consumed 0
--> and this is where it had finished?
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
copied 0 0 size 0 xmit
Apr 28 15:26:07 oim61024006 kernel: connection22:0: iscsi_tcp_segment_done
total copied 0 total size 0
I'm assuming that these "size 48 recv" might be some keepalive thing (nop
pings, dm-multipath readsector 0, etc..)?
Is a iscsi header. In a iscsi packet there is basically a header that
tells us what type of command it is (ping, scsi command, ready to
transfer, login, etc), how much data is going to be transferred in the
data segment after the header, etc.
For the xmit path you need libiscsi debugging on as well as libiscsi_tcp
to determine what type of iscsi header it is.
See above for info the packets.
3) "iscsi_tcp_recv_skb in 512 bytes"
skb = send kb's?
skb is a networking thing. It is just short for "struct sk_buff" which
just represents a buffer that we could send/recv to/from.
etc...
But what I'm really trying to pull out of this information is trying to trend
how often the initiator is trying to xmit data, but is not allowed to. Maybe
there is also a way (outside of just tcpdumping... or even with tcpdump, with a
tweaked filter so I don't get gigs of pcaps) to understand what the tcp window
size is currently being set to.
There is no nice printk for when sendpage/sendmsg has failed to send
data. In the past logs I could tell it was happening because in the xmit
messages we would repeat messages like this:
connection7:0: iscsi_tcp_segment_done copied 2592 0 size 4096 xmit
which comes from here:
ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "copied %u %u size %u %s\n",
segment->copied, copied, segment->size,
recv ? "recv" : "xmit");
And so the first value is how much we have copied so far. The second
values is how much got copied when we just called sendpage/sendmsg and
the last value size is the total amount of data we want to send.
In the logs you sent it kept repeating the same values and we never saw
the copied values get incremented and we never saw this message
indicating we sent everything.
/* Do we have more scatterlist entries? */
ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "total copied %u total size
%u\n",
segment->total_copied, segment->total_size);
I would then try to parse it after the fact to put the data in rrd graphs.
If there are any sites that explain this in more detail, I would happy to dig.
The only issue I have is that I'm not a programmer... I haven't reviewed or
coded in C in about 10 years (and even then I was just learning it in school).
So if the site is too in depth, I may still be asking questions... but
regardless, I am more than happy to hit the ground running and at least do the
due diligence of reviewing the info if someone can point me in the right
direction.
If you can build I kernel I can make this a lot easier. I can just add a
printk to the sendpage/sendmsg call, and so it will be really easy to
pinpoint.
--
You received this message because you are subscribed to the Google Groups
"open-iscsi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/open-iscsi?hl=en.