Mike Christie wrote:
> On 06/24/2009 09:40 AM, Hannes Reinecke wrote:
>> During heavy I/O we might hit a receive timeout as the
>> xmitworker is still busy sending PDUs. Even as we strictly
>> speaking didn't receive a reply during the receive timeout,
>> we didn't actually gave the target a chance to reply as
>> we're constantly hitting it with requests.
>> So it's better to ensure that cmdqueue really is empty
>> before start sending NOPs.
>>
>> Signed-off-by: Hannes Reinecke<h...@suse.de>
>> ---
>>   drivers/scsi/libiscsi.c |    4 +++-
>>   1 files changed, 3 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index 716cc34..41bb177 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -1862,7 +1862,9 @@ static void
>> iscsi_check_transport_timeouts(unsigned long data)
>>           return;
>>       }
>>
>> -    if (time_before_eq(last_recv + recv_timeout, jiffies)) {
>> +    if (time_before_eq(last_recv + recv_timeout, jiffies)&&
>> +        list_empty(&conn->cmdqueue)&&
>> +        list_empty(&conn->requeue)) {
>>           /* send a ping to try to provoke some traffic */
>>           ISCSI_DBG_CONN(conn, "Sending nopout as ping\n");
>>           iscsi_send_nopout(conn, NULL);
> 
> If the connection goes bad and the network layer stops sending IO (you
> hit that wait in sendmsg/sendpage) but there are tasks in the cmdqueue
> or requeue then how will we detect that the connection has gone bad
> (tasks will not be removed from the queues at this time)?
> 
Hmm. Yes, indeed, this is a problem. But I think with your patch tracking
all xmits and recvs we don't need to apply that one as last_recv will
actually be last_xmit and hence a NOP will be sent if there is no traffic
at all.
Which as about what we want to achieve here.

> With this patch will we have to wait until the scsi eh fires and the
> tmfs also timeout and then the target reset code eventually gives up and
> drops the session.
> 
> I think we want to change
>         if (time_before_eq(last_recv + recv_timeout, jiffies)) {
>                 /* send a ping to try to provoke some traffic */
>                 ISCSI_DBG_CONN(conn, "Sending nopout as ping\n");
>                 iscsi_send_nopout(conn, NULL);
>                 next_timeout = conn->last_ping + (conn->ping_timeout * HZ);
>         } else
>                 next_timeout = last_recv + recv_timeout;
> 
> 
> to account for xmits and recvs like is done with tasks, so if we have
> been successfully sending or receiving data then do not send a ping.
> 
> Here is a patch. It is only compile tested.  I think we need locking
> around the last_xfer accesses or maybe make it an atomic or something
> (is there a atomic that matches the size of jiffies on all archs that is
> not expensive).
> 
No, we don't need a lock here as it's already been taken
(On the libiscsi side).

And this will probably also take care of the eh_timeout problem
as this will also check last_xmit.

I'll give it a spin.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~----------~----~----~----~------~----~------~--~---

Reply via email to