Guru Anbalagane wrote:
> Hi Mike,
>
> Thanks.
> Can you please try --target=i686.
>
Thanks Guru.
> On the patches, yes, I will include it in our next VM kernel.
Joseph, attached is a patch that has a couple of fixes from upstream in
that code path. If it does not fix your issue then I will have to send a
patch with some more debugging in it (either the net layer is not waking
us up when send space opens up, or it really is taking a long time to
send the data).
Patch is only compile tested.
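
For anyone reading along without the kernel tree handy, the timeout check the patch factors out can be sketched in plain user-space C as below. This is only an illustration under stated assumptions, not the kernel code: SKETCH_HZ, struct sketch_conn, ticks_before_eq() and sketch_ping_timed_out() are made-up names standing in for HZ, struct iscsi_conn, time_before_eq() and iscsi_has_ping_timed_out(), and the tick rate is an arbitrary value.

```c
#include <limits.h>

/* Stand-in for the kernel's HZ; the value is assumed for illustration. */
#define SKETCH_HZ 1000UL

/*
 * Wraparound-safe "a <= b" for a free-running tick counter. This mirrors
 * the idea behind the kernel's time_before_eq(): compare the signed
 * difference instead of the raw values.
 */
static int ticks_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Hypothetical, trimmed-down stand-in for the fields the check reads. */
struct sketch_conn {
	int ping_in_flight;		/* non-zero while a ping is outstanding */
	unsigned long last_recv;	/* tick of the last received data */
	unsigned long recv_timeout;	/* seconds */
	unsigned long ping_timeout;	/* seconds */
};

/*
 * A ping counts as timed out only if one is in flight and nothing has
 * been received for recv_timeout + ping_timeout seconds.
 */
static int sketch_ping_timed_out(const struct sketch_conn *c,
				 unsigned long now)
{
	unsigned long deadline = c->last_recv +
		(c->recv_timeout + c->ping_timeout) * SKETCH_HZ;

	return c->ping_in_flight && ticks_before_eq(deadline, now);
}
```

The point of measuring from last_recv is that any incoming data pushes the deadline out, so a ping stuck in a queue behind a slow but progressing pdu should not be treated as a dead connection.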
> regards
> Guru
> Mike Christie wrote:
>> ccing Guru Anbalagane.
>>
>> Hoot, Joseph wrote:
>>> Sure. It's the OVM 2.2 environment that I talked with you about a
>>> couple of days ago. Here is the info:
>>>
>>
>> Doh! I forgot.
>>
>> Guru, I am trying to make a patch for your Oracle VM kernel. I want to
>> port the patch I told you about
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4c48a82935f833d94fcf44c2b0c5d2922acfc77a;hp=d1acfae514425d680912907c6554852f1e258551
>>
>>
>> and this one:
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d1acfae514425d680912907c6554852f1e258551
>>
>>
>> if you do not have it (could not remember if you did or not).
>>
>> I found the src rpm:
>> http://edelivery.oracle.com/EPD/Download/get_form?egroup_aru_number=11874896
>>
>>
>> The problem is that I keep getting RPM build errors when I try to
>> build the project:
>>
>> rpm -ivh kernel-2.6.18-128.2.1.4.9.el5.src.rpm
>> rpmbuild -bp --target=noarch /usr/src/redhat/SPECS/kernel-2.6.spec
>>
>> ... a bunch of stuff then
>>
>> make[1]: *** [nonint_oldconfig] Error 15
>> make: *** [nonint_oldconfig] Error 2
>> error: Bad exit status from /var/tmp/rpm-tmp.94270 (%prep)
>>
>>
>> RPM build errors:
>> Bad exit status from /var/tmp/rpm-tmp.94270 (%prep)
>> (have also tried without the --target arg and with different archs and
>> always fails)
>>
>> This is how I setup the RHEL kernel source rpm. Are the commands for
>> the Oracle VM kernel different?
>>
>>
>>
>>> [r...@oim6102506 ~]# uname -r
>>> 2.6.18-128.2.1.4.9.el5xen
>>> [r...@oim6102506 ~]# rpm -qa | grep iscsi
>>> iscsi-initiator-utils-6.2.0.871-0.7.el5
>>>
>>> I have a tcpdump that I had sent to Don Williams. I'll pull that up
>>> shortly.
>>>
>>> On Nov 17, 2009, at 10:06 PM, Mike Christie wrote:
>>>
>>>> Mike Christie wrote:
>>>>> Hoot, Joseph wrote:
>>>>>> more INLINE below...
>>>>>>
>>>>>> On Nov 17, 2009, at 7:27 PM, Mike Christie wrote:
>>>>>>
>>>>>>> Pasi Kärkkäinen wrote:
>>>>>>>> On Mon, Nov 16, 2009 at 09:39:00PM -0500, Hoot, Joseph wrote:
>>>>>>>>> On Nov 16, 2009, at 8:19 PM, Hoot, Joseph wrote:
>>>>>>>>>
>>>>>>>>>> thanks. That helps. So I know that with the EqualLogic
>>>>>>>>>> targets, there is a "Group IP" which, I believe, responds with
>>>>>>>>>> an iscsi login_redirect.
>>>>>>>>>> 1) Could the "Login authentication failed" message be a
>>>>>>>>>> result of a login redirect message from the EQL?
>>>>>>>>>>
>>>>>>>>>> and then my next question is more for curiosity sake:
>>>>>>>>>>
>>>>>>>>>> 2) Are there plans in the future to have more than one
>>>>>>>>>> connection per session? and I guess in addition to that,
>>>>>>>>>> would that mean multiple connections to a single volume over
>>>>>>>>>> the same nic?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Also Mike, I'm seeing one or two of these every 30-40 minutes
>>>>>>>>> if I slam our EqualLogic with roughly 7-15k IOPS (reads and
>>>>>>>>> writes) non stop on 3 volumes. In this type of scenario, would
>>>>>>>>> you expect to see timeouts like this once in a while? If so, do
>>>>>>>>> you think increasing my NOOP timeouts would help so we don't
>>>>>>>>> get these? Maybe set it to 15 seconds instead of 10?
>>>>>>>>>
>>>>>>>> Equallogic does active loadbalancing (redirects) during operation..
>>>>>>>> dunno about the errors though.
>>>>>>>>
>>>>>>> Oh yeah, forgot about that. Thanks Pasi!
>>>>>>>
>>>>>>> Joseph, look in the EQL target logs for something about the EQL
>>>>>>> box doing load balancing. I think normally we handle the load
>>>>>>> balancing more gracefully, but we might be messing up. I think if
>>>>>>> EQL was load balancing, we would see something in the open-iscsi
>>>>>>> logs about getting an async iscsi pdu from the target that asks
>>>>>>> us to logout. Then when we relogin the target would redirect us
>>>>>>> to the optimal path.
>>>>>> There are two things that the EQL does, I believe-- one thing is
>>>>>> async logout, the other is login_redirect. Unfortunately, from
>>>>>> the EQL syslog side we don't see any errors related to this. It's
>>>>>> my understanding, however, that when a login is initially
>>>>>> attempted to the EQL, it hits the "group ip" or an alias'd IP
>>>>>> sitting on a real nic. The group IP looks at all the interfaces
>>>>>> on the EQL and decides, based on some algorithm, which EQL nic the
>>>>>> session should connect to. It then sends the initiator that made
>>>>>> the request a login_redirect, which I thought is basically a
>>>>>> "logout and reconnect" pdu. It would say, for example, "you
>>>>>> can't log into the group IP, however, you can log into this IP (a
>>>>>> real nic) that it would prefer you be logged into."
>>>>>>
>>>>>> I'm thinking that the "failed login" is actually the result of
>>>>>> that attempt to log into the group IP and it sending a login
>>>>>> redirect pdu back to it.
>>>>>>
>>>>> If the target was load balancing us it would:
>>>>>
>>>>> - Send an async logout pdu.
>>>>> - We then send a logout pdu.
>>>>> - When we get the logout response pdu we kill the tcp ip connection
>>>>> - We then create a new tcp connection
>>>>> - We then log in to the portal that was passed into iscsiadm/iscsid
>>>>> (the one in the DB that you see when you run iscsiadm -m node,
>>>>> which is probably what you call the group IP). For this process we
>>>>> send a login pdu. It then sends a login response pdu with the login
>>>>> redirect response. In this response we also get the new IP to log
>>>>> into.
>>>>> - We see that response and kill the tcp connection, and create a
>>>>> new tcp connection to the portal we are being redirected to.
>>>>> - We then log into the portal we were redirected to. We again do
>>>>> this by sending a login pdu. This time the login response pdu
>>>>> should be ok and we are done.
>>>> Oh yeah, I meant to also say that this is pretty much the same
>>>> process that happens when we do the first login, and if we have to
>>>> relogin because of a connection problem like the nop/ping timeout.
>>>> The only difference in those cases is that we do not get the async
>>>> logout and we do not do a logout by sending a logout pdu. We start
>>>> at the killing tcp ip connection step.
>>>>
>>>> So even if we are not getting load balanced we would be in the same
>>>> place in the open-iscsi code when we are getting the login failed
>>>> errors.
>>>>
>>>>
>>>> To get back on track with why we get the nop timeouts: if we are
>>>> not seeing load balancing messages or async logout messages, it
>>>> could be the open-iscsi bug I mentioned in the other mail. If you
>>>> can send the open-iscsi and kernel info I asked for in the other
>>>> mail, we can start down that path.
>>>>
>>>> --
>>>>
>>>> You received this message because you are subscribed to the Google
>>>> Groups "open-iscsi" group.
>>>> To post to this group, send email to [email protected].
>>>> To unsubscribe from this group, send email to
>>>> [email protected].
>>>> For more options, visit this group at
>>>> http://groups.google.com/group/open-iscsi?hl=.
>>>>
>>>>
>>>
>>> ===========================
>>> Joseph R. Hoot
>>> Lead System Programmer/Analyst
>>> (w) 716-878-4832
>>> (c) 716-759-HOOT
>>> [email protected]
>>> GPG KEY: 7145F633
>>> ===========================
>>>
>>
>
diff -aurp linux-2.6.18.i686.orig/drivers/scsi/libiscsi2.c linux-2.6.18.i686/drivers/scsi/libiscsi2.c
--- linux-2.6.18.i686.orig/drivers/scsi/libiscsi2.c 2009-11-18 20:00:43.000000000 -0500
+++ linux-2.6.18.i686/drivers/scsi/libiscsi2.c 2009-11-18 20:13:21.000000000 -0500
@@ -1662,6 +1662,22 @@ static void iscsi_start_tx(struct iscsi_
iscsi_conn_queue_work(conn);
}
+/*
+ * We want to make sure a ping is in flight. It has timed out.
+ * And we are not busy processing a pdu that is making
+ * progress but got started before the ping and is taking a while
+ * to complete so the ping is just stuck behind it in a queue.
+ */
+static int iscsi_has_ping_timed_out(struct iscsi_conn *conn)
+{
+ if (conn->ping_task &&
+ time_before_eq(conn->last_recv + (conn->recv_timeout * HZ) +
+ (conn->ping_timeout * HZ), jiffies))
+ return 1;
+ else
+ return 0;
+}
+
static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *scmd)
{
struct iscsi_cls_session *cls_session;
@@ -1697,8 +1713,7 @@ static enum blk_eh_timer_return iscsi_eh
* if the ping timedout then we are in the middle of cleaning up
* and can let the iscsi eh handle it
*/
- if (time_before_eq(conn->last_recv + (conn->recv_timeout * HZ) +
- (conn->ping_timeout * HZ), jiffies))
+ if (iscsi_has_ping_timed_out(conn))
rc = BLK_EH_RESET_TIMER;
/*
* if we are about to check the transport then give the command
@@ -1707,6 +1722,7 @@ static enum blk_eh_timer_return iscsi_eh
if (time_before_eq(conn->last_recv + (conn->recv_timeout * HZ),
jiffies))
rc = BLK_EH_RESET_TIMER;
+
/* if in the middle of checking the transport then give us more time */
if (conn->ping_task)
rc = BLK_EH_RESET_TIMER;
@@ -1733,13 +1749,12 @@ static void iscsi_check_transport_timeou
recv_timeout *= HZ;
last_recv = conn->last_recv;
- if (conn->ping_task &&
- time_before_eq(conn->last_ping + (conn->ping_timeout * HZ),
- jiffies)) {
+ if (iscsi_has_ping_timed_out(conn)) {
iscsi_conn_printk(KERN_ERR, conn, "ping timeout of %d secs "
- "expired, last rx %lu, last ping %lu, "
- "now %lu\n", conn->ping_timeout, last_recv,
- conn->last_ping, jiffies);
+ "expired, recv timeout %d, last rx %lu, "
+ "last ping %lu, now %lu\n",
+ conn->ping_timeout, conn->recv_timeout,
+ last_recv, conn->last_ping, jiffies);
spin_unlock(&session->lock);
iscsi2_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
return;
diff -aurp linux-2.6.18.i686.orig/drivers/scsi/libiscsi_tcp.c linux-2.6.18.i686/drivers/scsi/libiscsi_tcp.c
--- linux-2.6.18.i686.orig/drivers/scsi/libiscsi_tcp.c 2009-11-18 20:00:43.000000000 -0500
+++ linux-2.6.18.i686/drivers/scsi/libiscsi_tcp.c 2009-11-18 20:01:32.000000000 -0500
@@ -866,6 +866,12 @@ int iscsi_tcp_recv_skb(struct iscsi_conn
int rc = 0;
ISCSI_DBG_TCP(conn, "in %d bytes\n", skb->len - offset);
+ /*
+ * Update for each skb instead of pdu, because over slow networks a
+ * data_in's data could take a while to read in. We also want to
+ * account for r2ts.
+ */
+ conn->last_recv = jiffies;
if (unlikely(conn->suspend_rx)) {
ISCSI_DBG_TCP(conn, "Rx suspended!\n");