Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On 04/15/2013 05:33 PM, James Bottomley wrote: On Mon, 2013-04-15 at 16:55 -0500, Brian King wrote: On 04/15/2013 03:45 PM, James Bottomley wrote: On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote: In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd when sending commands and retry for a bit if the driver returns a busy response. This is already handled by the timeout, I think. If a driver continuously returns MLQUEUE BUSY, then we'll fail the request after the timeout on the command expires. If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the abort works, we return FAILED out of scsi_send_eh_cmnd, which results in no retry being performed, since scsi_eh_tur only retries once and only if NEEDS_RETRY is returned. Or am I missing something? Sorry, I'm not being clear. It comes with being at a conference. What I mean is that if you do this, the criterion for success or failure should be the amount of time left not the number of retries. This is what the non-eh submission path also does for retries of events that don't count against the retry limit ... so something like this patch (uncompiled and untested #include stddisclaimer.h) Jams, Wendy and I discussed this a bit more and I think we understand your concern. Wendy is working on an updated patch. Thanks, Brian -- Brian King Power Linux I/O IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd when sending commands and retry for a bit if the driver returns a busy response. Thanks, Wendy -- -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote: In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd when sending commands and retry for a bit if the driver returns a busy response. This is already handled by the timeout, I think. If a driver continuously returns MLQUEUE BUSY, then we'll fail the request after the timeout on the command expires. James -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On 04/15/2013 03:45 PM, James Bottomley wrote: On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote: In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd when sending commands and retry for a bit if the driver returns a busy response. This is already handled by the timeout, I think. If a driver continuously returns MLQUEUE BUSY, then we'll fail the request after the timeout on the command expires. If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the abort works, we return FAILED out of scsi_send_eh_cmnd, which results in no retry being performed, since scsi_eh_tur only retries once and only if NEEDS_RETRY is returned. Or am I missing something? Thanks, Brian -- Brian King Power Linux I/O IBM Linux Technology Center -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On Mon, 2013-04-15 at 16:55 -0500, Brian King wrote: On 04/15/2013 03:45 PM, James Bottomley wrote: On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote: In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd when sending commands and retry for a bit if the driver returns a busy response. This is already handled by the timeout, I think. If a driver continuously returns MLQUEUE BUSY, then we'll fail the request after the timeout on the command expires. If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the abort works, we return FAILED out of scsi_send_eh_cmnd, which results in no retry being performed, since scsi_eh_tur only retries once and only if NEEDS_RETRY is returned. Or am I missing something? Sorry, I'm not being clear. It comes with being at a conference. What I mean is that if you do this, the criterion for success or failure should be the amount of time left not the number of retries. This is what the non-eh submission path also does for retries of events that don't count against the retry limit ... so something like this patch (uncompiled and untested #include stddisclaimer.h) James diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c1b05a8..93ab4f4 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -793,6 +793,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, DECLARE_COMPLETION_ONSTACK(done); unsigned long timeleft; struct scsi_eh_save ses; + const int stall_for = min(HZ/10,1); /* 100 ms */ int rtn; scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); @@ -802,6 +803,8 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, scmd-scsi_done = scsi_eh_done; shost-hostt-queuecommand(shost, scmd); + retry: + timeleft = wait_for_completion_timeout(done, timeout); shost-eh_action = NULL; @@ -831,8 +834,12 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, case TARGET_ERROR: break; case ADD_TO_MLQUEUE: - rtn = NEEDS_RETRY; - break; + if (timeleft stall_for) { + timeout = timeleft - stall_for; + msleep(stall_for); + goto retry; + } + /* fall through */ default: rtn = FAILED; break; -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/1] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
Quoting James Bottomley james.bottom...@hansenpartnership.com: On Mon, 2013-04-15 at 16:55 -0500, Brian King wrote: On 04/15/2013 03:45 PM, James Bottomley wrote: On Mon, 2013-04-15 at 13:39 -0500, wenxi...@linux.vnet.ibm.com wrote: In scsi_send_eh_cmnd(), this fix will check the return code of queuecomamnd when sending commands and retry for a bit if the driver returns a busy response. This is already handled by the timeout, I think. If a driver continuously returns MLQUEUE BUSY, then we'll fail the request after the timeout on the command expires. If we get a timeout in scsi_send_eh_cmnd we call scsi_abort_eh_cmnd. If the abort works, we return FAILED out of scsi_send_eh_cmnd, which results in no retry being performed, since scsi_eh_tur only retries once and only if NEEDS_RETRY is returned. Or am I missing something? Sorry, I'm not being clear. It comes with being at a conference. What I mean is that if you do this, the criterion for success or faiure should be the amount of time left not the number of retries. This is what the non-eh submission path also does for retries of events that don't count against the retry limit ... so something like this patch (uncompiled and untested #include stddisclaimer.h) James Hi James, The failing case for us is: Doesn't matter what timeout value we set in wait_for_completion_timeout(), it always returns with timeleft = 0. For example, if I set timeout = 50 secs, wait_for_completion_timeout() always returns with timeleft =0(even adapter is already in good shape in 20 secs). We never gets a chance to call into if (timeleft) leg. My understanding is: if shost-host-queuecommand() failed with MLQUEUE busy response at the first time, wait_for_completion_timeout() always wakes up by expired. Here is log when I enabled scsi log: Apr 15 18:44:35 ltcsatiocp5 kernel: scsi_send_eh_cmnd: scmd: c000f88bc980, timeleft: 0 I applied your patch. Because timeleft is always zero, never got a chance to call into if(timeleft) { leg. Thanks, Wendy -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html