Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd

2013-04-16 Thread Brian King
On 04/15/2013 05:33 PM, James Bottomley wrote:
 On Mon, 2013-04-15 at 16:55 -0500, Brian King wrote:
 On 04/15/2013 03:45 PM, James Bottomley wrote:
 On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote:
 In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd
 when sending commands and retry for a bit if the driver returns a
 busy response.

 This is already handled by the timeout, I think.  If a driver
 continuously returns MLQUEUE BUSY, then we'll fail the request after the
 timeout on the command expires.

 If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the
 abort works, we return FAILED out of scsi_send_eh_cmnd, which results in
 no retry being performed, since scsi_eh_tur only retries once and
 only if NEEDS_RETRY is returned. Or am I missing something?
 
 Sorry, I'm not being clear.  It comes with being at a conference.  What
 I mean is that if you do this, the criterion for success or failure
 should be the amount of time left not the number of retries.  This is
 what the non-eh submission path also does for retries of events that
 don't count against the retry limit ... so something like this patch
 (uncompiled and untested #include stddisclaimer.h)

Jams,

Wendy and I discussed this a bit more and I think we understand your concern.
Wendy is working on an updated patch.

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd

2013-04-15 Thread wenxiong
In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd
when sending commands and retry for a bit if the driver returns a
busy response.

Thanks,
Wendy

-- 
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd

2013-04-15 Thread James Bottomley
On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote:
 In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd
 when sending commands and retry for a bit if the driver returns a
 busy response.

This is already handled by the timeout, I think.  If a driver
continuously returns MLQUEUE BUSY, then we'll fail the request after the
timeout on the command expires.

James


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd

2013-04-15 Thread Brian King
On 04/15/2013 03:45 PM, James Bottomley wrote:
 On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote:
 In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd
 when sending commands and retry for a bit if the driver returns a
 busy response.
 
 This is already handled by the timeout, I think.  If a driver
 continuously returns MLQUEUE BUSY, then we'll fail the request after the
 timeout on the command expires.

If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the
abort works, we return FAILED out of scsi_send_eh_cmnd, which results in
no retry being performed, since scsi_eh_tur only retries once and
only if NEEDS_RETRY is returned. Or am I missing something?

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd

2013-04-15 Thread James Bottomley
On Mon, 2013-04-15 at 16:55 -0500, Brian King wrote:
 On 04/15/2013 03:45 PM, James Bottomley wrote:
  On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote:
  In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd
  when sending commands and retry for a bit if the driver returns a
  busy response.
  
  This is already handled by the timeout, I think.  If a driver
  continuously returns MLQUEUE BUSY, then we'll fail the request after the
  timeout on the command expires.
 
 If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the
 abort works, we return FAILED out of scsi_send_eh_cmnd, which results in
 no retry being performed, since scsi_eh_tur only retries once and
 only if NEEDS_RETRY is returned. Or am I missing something?

Sorry, I'm not being clear.  It comes with being at a conference.  What
I mean is that if you do this, the criterion for success or failure
should be the amount of time left not the number of retries.  This is
what the non-eh submission path also does for retries of events that
don't count against the retry limit ... so something like this patch
(uncompiled and untested #include stddisclaimer.h)

James


diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c1b05a8..93ab4f4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -793,6 +793,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, 
unsigned char *cmnd,
DECLARE_COMPLETION_ONSTACK(done);
unsigned long timeleft;
struct scsi_eh_save ses;
+   const int stall_for = min(HZ/10,1); /* 100 ms */
int rtn;
 
scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes);
@@ -802,6 +803,8 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, 
unsigned char *cmnd,
scmd-scsi_done = scsi_eh_done;
shost-hostt-queuecommand(shost, scmd);
 
+  retry:
+
timeleft = wait_for_completion_timeout(done, timeout);
 
shost-eh_action = NULL;
@@ -831,8 +834,12 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, 
unsigned char *cmnd,
case TARGET_ERROR:
break;
case ADD_TO_MLQUEUE:
-   rtn = NEEDS_RETRY;
-   break;
+   if (timeleft  stall_for) {
+   timeout = timeleft - stall_for;
+   msleep(stall_for);
+   goto retry;
+   }
+   /* fall through */
default:
rtn = FAILED;
break;



--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd

2013-04-15 Thread wenxiong


Quoting James Bottomley james.bottom...@hansenpartnership.com:


On Mon, 2013-04-15 at 16:55 -0500, Brian King wrote:

On 04/15/2013 03:45 PM, James Bottomley wrote:
 On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote:
 In scsi_send_eh_cmnd(), this fix will check the return code of  
queuecomamnd

 when sending commands and retry for a bit if the driver returns a
 busy response.

 This is already handled by the timeout, I think.  If a driver
 continuously returns MLQUEUE BUSY, then we'll fail the request after the
 timeout on the command expires.

If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the
abort works, we return FAILED out of scsi_send_eh_cmnd, which results in
no retry being performed, since scsi_eh_tur only retries once and
only if NEEDS_RETRY is returned. Or am I missing something?


Sorry, I'm not being clear.  It comes with being at a conference.  What
I mean is that if you do this, the criterion for success or faiure
should be the amount of time left not the number of retries.  This is
what the non-eh submission path also does for retries of events that
don't count against the retry limit ... so something like this patch
(uncompiled and untested #include stddisclaimer.h)

James


Hi James,

The failing case for us is: Doesn't matter what timeout value we set in
wait_for_completion_timeout(), it always returns with timeleft = 0.
For example, if I set timeout = 50 secs, wait_for_completion_timeout()
always returns with timeleft =0(even adapter is already in good shape in
20 secs).  We never gets a chance to call into if (timeleft) leg.

My understanding is: if shost-host-queuecommand() failed with MLQUEUE busy
response at the first time, wait_for_completion_timeout() always wakes  
up by expired.


Here is log when I enabled scsi log:

Apr 15 18:44:35 ltcsatiocp5 kernel: scsi_send_eh_cmnd: scmd:  
c000f88bc980, timeleft: 0


I applied your patch. Because timeleft is always zero, never got a  
chance to call into

if(timeleft) { leg.

Thanks,
Wendy

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html