Re: iscsi diagnosis help

Mike Christie Wed, 18 Nov 2009 17:15:55 -0800

Guru Anbalagane wrote:
> Hi Mike,
> 
> Thanks.
> Can you please try  --target=i686.
>


Thanks Guru.

> On the patches, yes, I will include it in our next VM kernel.


Joseph, attached is a patch that has a couple fixes from upstream in 
that code path. If it does not fix your issue then I will have to send a 
patch with some more debugging in it (either the net layer is not waking 
us up when send space opens, or it really is taking a long time to send 
the data).

Patch is only compile tested.
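
For anyone reading the patch, here is a rough userspace sketch of the check that the new iscsi_has_ping_timed_out() helper performs. It is not the kernel code: struct conn_model, ticks_before_eq(), and the HZ value of 1000 are names and values invented for illustration, and the tick counter is passed in as an argument instead of reading jiffies.

```c
/* Assumed tick rate for this model only; the kernel's HZ is a build option. */
#define HZ 1000UL

/* Hypothetical model of the connection fields the check reads. */
struct conn_model {
	int ping_in_flight;         /* models conn->ping_task being non-NULL */
	unsigned long last_recv;    /* tick at which data was last received */
	unsigned long recv_timeout; /* seconds of rx silence before a ping goes out */
	unsigned long ping_timeout; /* seconds to wait for the ping reply */
};

/*
 * Wraparound-tolerant "a is before or equal to b" for tick counters,
 * written in the style of the kernel's time_before_eq(). The cast to a
 * signed type relies on two's complement behaviour, as the kernel does.
 */
static int ticks_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/*
 * A ping counts as timed out only if one is outstanding and nothing has
 * been received for recv_timeout + ping_timeout seconds.
 */
static int has_ping_timed_out(const struct conn_model *c, unsigned long now)
{
	unsigned long deadline = c->last_recv +
				 (c->recv_timeout + c->ping_timeout) * HZ;

	return c->ping_in_flight && ticks_before_eq(deadline, now);
}
```

With last_recv = 1000, recv_timeout = 5 and ping_timeout = 10 in this model, the check first fires at tick 16000, and only while a ping is outstanding. Because the deadline is measured from last_recv, refreshing last_recv on every received skb (the libiscsi_tcp.c hunk) pushes the deadline out while data is still arriving.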


> regards
> Guru
> Mike Christie wrote:
>> ccing Guru Anbalagane.
>>
>> Hoot, Joseph wrote:
>>> Sure.  It's the OVM 2.2 environment that I talked with you about a 
>>> couple of days ago.  Here is the info:
>>>
>>
>> Doh! I forgot.
>>
>> Guru, I am trying to make a patch for your Oracle VM kernel. I want to 
>> port the patch I told you about
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4c48a82935f833d94fcf44c2b0c5d2922acfc77a;hp=d1acfae514425d680912907c6554852f1e258551
>>  
>>
>> and this one:
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d1acfae514425d680912907c6554852f1e258551
>>  
>>
>> if you do not have it (could not remember if you did or not).
>>
>> I found the src rpm:
>> http://edelivery.oracle.com/EPD/Download/get_form?egroup_aru_number=11874896 
>>
>>
>> The problem is that I keep getting RPM build errors when I try to 
>> build the project:
>>
>> rpm -ivh kernel-2.6.18-128.2.1.4.9.el5.src.rpm
>> rpmbuild -bp --target=noarch /usr/src/redhat/SPECS/kernel-2.6.spec
>>
>> ... a bunch of stuff then
>>
>> make[1]: *** [nonint_oldconfig] Error 15
>> make: *** [nonint_oldconfig] Error 2
>> error: Bad exit status from /var/tmp/rpm-tmp.94270 (%prep)
>>
>>
>> RPM build errors:
>>     Bad exit status from /var/tmp/rpm-tmp.94270 (%prep)
>> (have also tried without the --target arg and with different archs and 
>> always fails)
>>
>> This is how I set up the RHEL kernel source rpm. Are the commands for 
>> the Oracle VM kernel different?
>>
>>
>>
>>> [r...@oim6102506 ~]# uname -r
>>> 2.6.18-128.2.1.4.9.el5xen
>>> [r...@oim6102506 ~]# rpm -qa | grep iscsi
>>> iscsi-initiator-utils-6.2.0.871-0.7.el5
>>>
>>> I have a tcpdump that I had sent to Don Williams.  I'll pull that up 
>>> shortly.
>>>
>>> On Nov 17, 2009, at 10:06 PM, Mike Christie wrote:
>>>
>>>> Mike Christie wrote:
>>>>> Hoot, Joseph wrote:
>>>>>> more INLINE below...
>>>>>>
>>>>>> On Nov 17, 2009, at 7:27 PM, Mike Christie wrote:
>>>>>>
>>>>>>> Pasi Kärkkäinen wrote:
>>>>>>>> On Mon, Nov 16, 2009 at 09:39:00PM -0500, Hoot, Joseph wrote:
>>>>>>>>> On Nov 16, 2009, at 8:19 PM, Hoot, Joseph wrote:
>>>>>>>>>
>>>>>>>>>> thanks.  That helps.  So I know that with the EqualLogic 
>>>>>>>>>> targets, there is a "Group IP" which, I believe, responds with 
>>>>>>>>>> an iscsi login_redirect.
>>>>>>>>>> 1) Could the "Login authentication failed" message be the 
>>>>>>>>>> response because of a login redirect messages from the EQL 
>>>>>>>>>> redirect?
>>>>>>>>>>
>>>>>>>>>> and then my next question is more for curiosity's sake:
>>>>>>>>>>
>>>>>>>>>> 2) Are there plans in the future to have more than one 
>>>>>>>>>> connection per session?  and I guess in addition to that, 
>>>>>>>>>> would that mean multiple connections to a single volume over 
>>>>>>>>>> the same nic?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Also Mike, I'm seeing one or two of these every 30-40 minutes 
>>>>>>>>> if I slam our EqualLogic with roughly 7-15k IOPS (reads and 
>>>>>>>>> writes) non stop on 3 volumes.  In this type of scenario, would 
>>>>>>>>> you expect to see timeouts like this once in a while?  If so, do 
>>>>>>>>> you think increasing my NOOP timeouts would assist so we don't 
>>>>>>>>> get these?  maybe set it to 15 seconds instead of 10?
>>>>>>>>>
>>>>>>>> EqualLogic does active load balancing (redirects) during operation.
>>>>>>>> dunno about the errors though.
>>>>>>>>
>>>>>>> Oh yeah, forgot about that. Thanks Pasi!
>>>>>>>
>>>>>>> Joseph, look in the EQL target logs for something about the EQL 
>>>>>>> box doing load balancing. I think normally we handle the load 
>>>>>>> balancing more gracefully, but we might be messing up. I think if 
>>>>>>> EQL was load balancing in the open-iscsi logs we would see 
>>>>>>> something about getting an async iscsi pdu from the target that 
>>>>>>> asks us to logout. Then when we relogin the target would redirect 
>>>>>>> us to the optimal path.
>>>>>> There are two things that the EQL does, I believe-- one thing is 
>>>>>> async logout, the other is login_redirect.   Unfortunately, from 
>>>>>> the EQL syslog side we don't see any errors related to this.  It's 
>>>>>> my understanding, however, that when a login is initially 
>>>>>> attempted to the EQL, it hits the "group ip" or an alias'd IP 
>>>>>> sitting on a real nic.  The group IP looks at all the interfaces 
>>>>>> on the EQL and decides, based on some algorithm, which EQL nic the 
>>>>>> session should connect to.  It then sends the initiator that made 
>>>>>> the request a login_redirect, which I thought is basically a 
>>>>>> "logout and reconnect" pdu.  It would say, for example, "you 
>>>>>> can't log into the group IP, however, you can log into this IP (a 
>>>>>> real nic) that it would prefer you be logged into."
>>>>>>
>>>>>> I'm thinking that the "failed login" is actually the result of 
>>>>>> that attempt to log into the group IP and it sending a login 
>>>>>> redirect pdu back to it.
>>>>>>
>>>>> If the target was load balancing us it would:
>>>>>
>>>>> - Send an async logout pdu.
>>>>> - We then send a logout pdu.
>>>>> - When we get the logout response pdu we kill the tcp ip connection
>>>>> - We then create a new tcp connection
>>>>> - We then log in to the portal that was passed into iscsiadm/iscsid 
>>>>> (the one in the DB that you see when you run iscsiadm -m node, 
>>>>> which is probably what you call the group IP). For this process we 
>>>>> send a login pdu. It then sends a login response pdu with the login 
>>>>> redirect response. In this response we also get the new IP to log 
>>>>> into.
>>>>> - We see that response and kill the tcp connection, and create a 
>>>>> new tcp connection to the portal we are being redirected to.
>>>>> - We then log into the portal we were redirected to. We again do 
>>>>> this by sending a login pdu. This time the login response pdu 
>>>>> should be ok and we are done.
>>>> Oh yeah, I meant to also say that this is pretty much the same 
>>>> process that happens when we do the first login, and if we have to 
>>>> relogin because of a connection problem like the nop/ping timeout. 
>>>> The only difference in those cases is that we do not get the async 
>>>> logout and we do not do a logout by sending a logout pdu. We start 
>>>> at the killing tcp ip connection step.
>>>>
>>>> So even if we are not getting load balanced we would be in the same 
>>>> place in the open-iscsi code when we are getting the login failed 
>>>> errors.
>>>>
>>>>
>>>> To get back on track solving why we get the nop timeouts: if we 
>>>> are not seeing load balancing messages or async logout messages, it 
>>>> could be the open-iscsi bug I mentioned in the other mail. If you 
>>>> can send the open-iscsi and kernel info I asked for in the other 
>>>> mail, we can start down that path.
>>>>
>>>> -- 
>>>>
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "open-iscsi" group.
>>>> To post to this group, send email to [email protected].
>>>> To unsubscribe from this group, send email to 
>>>> [email protected].
>>>> For more options, visit this group at 
>>>> http://groups.google.com/group/open-iscsi?hl=.
>>>>
>>>>
>>>
>>> ===========================
>>> Joseph R. Hoot
>>> Lead System Programmer/Analyst
>>> (w) 716-878-4832
>>> (c) 716-759-HOOT
>>> [email protected]
>>> GPG KEY:   7145F633
>>> ===========================
>>>
>>>
>>
> 
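
The relogin sequence described in the quoted list above (async logout, logout, reconnect to the configured portal, redirect, reconnect to the redirected portal) can be written down as a small step table. This is only a sketch for following the discussion: the enum and function names are hypothetical and do not correspond to open-iscsi identifiers.

```c
/* Hypothetical step names; they mirror the quoted list, in order. */
enum relogin_step {
	STEP_ASYNC_LOGOUT,     /* target sends an async logout pdu */
	STEP_LOGOUT,           /* we send a logout pdu and wait for the response */
	STEP_KILL_TCP,         /* we kill the existing tcp connection */
	STEP_LOGIN_CONFIGURED, /* new tcp connection, login pdu to the portal in the node DB */
	STEP_FOLLOW_REDIRECT,  /* login response says redirect: kill tcp, connect to the new IP */
	STEP_LOGIN_REDIRECTED, /* login pdu to the portal we were redirected to */
	STEP_DONE              /* login response ok */
};

/* Why we are logging in (hypothetical names). */
enum relogin_cause {
	CAUSE_LOAD_BALANCE, /* target asked us to log out */
	CAUSE_FIRST_LOGIN,
	CAUSE_NOP_TIMEOUT   /* connection problem such as a nop/ping timeout */
};

/*
 * Only a target-initiated load balance starts with the async logout and
 * logout pdus; a first login or a relogin after a connection problem
 * starts at the "kill the tcp connection" step.
 */
static enum relogin_step relogin_first_step(enum relogin_cause cause)
{
	return cause == CAUSE_LOAD_BALANCE ? STEP_ASYNC_LOGOUT : STEP_KILL_TCP;
}

/* Steps are strictly sequential; STEP_DONE is terminal. */
static enum relogin_step relogin_next_step(enum relogin_step step)
{
	return step == STEP_DONE ? STEP_DONE : (enum relogin_step)(step + 1);
}

/* Number of transitions left before the session is logged in again. */
static int relogin_steps_remaining(enum relogin_step step)
{
	return STEP_DONE - step;
}
```

In this model every cause passes through STEP_LOGIN_CONFIGURED and STEP_FOLLOW_REDIRECT, which matches the point made above: whether or not the target is load balancing us, we end up in the same login path when the login failed errors show up.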


diff -aurp linux-2.6.18.i686.orig/drivers/scsi/libiscsi2.c linux-2.6.18.i686/drivers/scsi/libiscsi2.c
--- linux-2.6.18.i686.orig/drivers/scsi/libiscsi2.c	2009-11-18 20:00:43.000000000 -0500
+++ linux-2.6.18.i686/drivers/scsi/libiscsi2.c	2009-11-18 20:13:21.000000000 -0500
@@ -1662,6 +1662,22 @@ static void iscsi_start_tx(struct iscsi_
 		iscsi_conn_queue_work(conn);
 }
 
+/*
+ * We want to make sure a ping is in flight. It has timed out.
+ * And we are not busy processing a pdu that is making
+ * progress but got started before the ping and is taking a while
+ * to complete so the ping is just stuck behind it in a queue.
+ */
+static int iscsi_has_ping_timed_out(struct iscsi_conn *conn)
+{
+	if (conn->ping_task &&
+	    time_before_eq(conn->last_recv + (conn->recv_timeout * HZ) +
+			   (conn->ping_timeout * HZ), jiffies))
+		return 1;
+	else
+		return 0;
+}
+
 static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *scmd)
 {
 	struct iscsi_cls_session *cls_session;
@@ -1697,8 +1713,7 @@ static enum blk_eh_timer_return iscsi_eh
 	 * if the ping timedout then we are in the middle of cleaning up
 	 * and can let the iscsi eh handle it
 	 */
-	if (time_before_eq(conn->last_recv + (conn->recv_timeout * HZ) +
-			    (conn->ping_timeout * HZ), jiffies))
+	if (iscsi_has_ping_timed_out(conn))
 		rc = BLK_EH_RESET_TIMER;
 	/*
 	 * if we are about to check the transport then give the command
@@ -1707,6 +1722,7 @@ static enum blk_eh_timer_return iscsi_eh
 	if (time_before_eq(conn->last_recv + (conn->recv_timeout * HZ),
 			   jiffies))
 		rc = BLK_EH_RESET_TIMER;
+
 	/* if in the middle of checking the transport then give us more time */
 	if (conn->ping_task)
 		rc = BLK_EH_RESET_TIMER;
@@ -1733,13 +1749,12 @@ static void iscsi_check_transport_timeou
 
 	recv_timeout *= HZ;
 	last_recv = conn->last_recv;
-	if (conn->ping_task &&
-	    time_before_eq(conn->last_ping + (conn->ping_timeout * HZ),
-			   jiffies)) {
+	if (iscsi_has_ping_timed_out(conn)) {
 		iscsi_conn_printk(KERN_ERR, conn, "ping timeout of %d secs "
-				  "expired, last rx %lu, last ping %lu, "
-				  "now %lu\n", conn->ping_timeout, last_recv,
-				  conn->last_ping, jiffies);
+				  "expired, recv timeout %d, last rx %lu, "
+				  "last ping %lu, now %lu\n",
+				  conn->ping_timeout, conn->recv_timeout,
+				  last_recv, conn->last_ping, jiffies);
 		spin_unlock(&session->lock);
 		iscsi2_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
 		return;
diff -aurp linux-2.6.18.i686.orig/drivers/scsi/libiscsi_tcp.c linux-2.6.18.i686/drivers/scsi/libiscsi_tcp.c
--- linux-2.6.18.i686.orig/drivers/scsi/libiscsi_tcp.c	2009-11-18 20:00:43.000000000 -0500
+++ linux-2.6.18.i686/drivers/scsi/libiscsi_tcp.c	2009-11-18 20:01:32.000000000 -0500
@@ -866,6 +866,12 @@ int iscsi_tcp_recv_skb(struct iscsi_conn
 	int rc = 0;
 
 	ISCSI_DBG_TCP(conn, "in %d bytes\n", skb->len - offset);
+	/*
+	 * Update for each skb instead of pdu, because over slow networks a
+	 * data_in's data could take a while to read in. We also want to
+	 * account for r2ts.
+	 */
+	conn->last_recv = jiffies;
 
 	if (unlikely(conn->suspend_rx)) {
 		ISCSI_DBG_TCP(conn, "Rx suspended!\n");
