Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On 05/03/13 20:23, James Bottomley wrote: + const unsigned long stall_for = min(msecs_to_jiffies(10), 1UL); Hello James, Can you please clarify what the intention of this statement is ? Is the purpose of this statement to avoid that stall_for would be zero in case HZ 100 ? If that is the case, maybe you meant max() instead of min() ? Also, are you aware that msecs_to_jiffies() already rounds up the result of the division ? Bart. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On Sat, 2013-05-04 at 19:02 +0200, Bart Van Assche wrote: On 05/03/13 20:23, James Bottomley wrote: + const unsigned long stall_for = min(msecs_to_jiffies(10), 1UL); Hello James, Can you please clarify what the intention of this statement is ? Is the purpose of this statement to avoid that stall_for would be zero in case HZ 100 ? If that is the case, maybe you meant max() instead of min() ? Also, are you aware that msecs_to_jiffies() already rounds up the result of the division ? Yes, I thought afterwards I should dump the bogus min statement as well. Plus HZ/10 is actually 100ms, so the value is 10x wrong. I've fixed it up below (plus a bit of comment rework and some style fixes). Thanks, James --- From 4bd9ef9789ad86656d8e52e8fff5422b741097e1 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke h...@suse.de Date: Thu, 25 Apr 2013 08:10:00 +0200 Subject: [PATCH] [SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. [jejb: fix confusion between msec and jiffies values and other issues] [bvanassche: correct stall_for interval] Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de Signed-off-by: James Bottomley jbottom...@parallels.com diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c1b05a8..f43de1e 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -25,6 +25,7 @@ #include linux/interrupt.h #include linux/blkdev.h #include linux/delay.h +#include linux/jiffies.h #include scsi/scsi.h #include scsi/scsi_cmnd.h @@ -791,32 +792,48 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft; + unsigned long timeleft = timeout; struct scsi_eh_save ses; + const unsigned long stall_for = msecs_to_jiffies(100); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; - shost-hostt-queuecommand(shost, scmd); - - timeleft = wait_for_completion_timeout(done, timeout); + rtn = shost-hostt-queuecommand(shost, scmd); + if (rtn) { + if (timeleft stall_for) { + scsi_eh_restore_cmnd(scmd, ses); + timeleft -= stall_for; + msleep(jiffies_to_msecs(stall_for)); + goto retry; + } + /* signal not to enter either branch of the if () below */ + timeleft = 0; + rtn = NEEDS_RETRY; + } else { + timeleft = wait_for_completion_timeout(done, timeout); + } shost-eh_action = NULL; - scsi_log_completion(scmd, SUCCESS); + scsi_log_completion(scmd, rtn); SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd: %p, timeleft: %ld\n, __func__, scmd, timeleft)); /* -* If there is time left scsi_eh_done got called, and we will -* examine the actual status codes to see whether the command -* actually did complete normally, else tell the host to forget -* about this command. +* If there is time left scsi_eh_done got called, and we will examine +* the actual status codes to see whether the command actually did +* complete normally, else if we have a zero return and no time left, +* the command must still be pending, so abort it and return FAILED. +* If we never actually managed to issue the command, because +* -queuecommand() kept returning non zero, use the rtn = FAILED +* value above (so don't execute either branch of the if) */ if (timeleft) { rtn = scsi_eh_completed_normally(scmd); @@ -837,7 +854,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, rtn = FAILED; break; } - } else { + } else if (!rtn) { scsi_abort_eh_cmnd(scmd); rtn = FAILED; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
Quoting Hannes Reinecke h...@suse.de: scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de Hi James, I have verified this patch with two new ipr adapters. EEH error can be recoery successfully. Do you have any question about this new patch? Thanks, Wendy diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index d58db32..6a3c1d2 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -889,22 +889,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft; + unsigned long timeleft = timeout; struct scsi_eh_save ses; + const int stall_for = min(HZ/10, 1); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; - shost-hostt-queuecommand(shost, scmd); - - timeleft = wait_for_completion_timeout(done, timeout); + rtn = shost-hostt-queuecommand(shost, scmd); + if (rtn) { + if (timeleft) { + scsi_eh_restore_cmnd(scmd, ses); + timeleft -= stall_for; + msleep(stall_for); + goto retry; + } + rtn = NEEDS_RETRY; + } else + timeleft = wait_for_completion_timeout(done, timeout); shost-eh_action = NULL; - scsi_log_completion(scmd, SUCCESS); + scsi_log_completion(scmd, rtn); SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd: %p, timeleft: %ld\n, @@ -935,7 +945,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, rtn = FAILED; break; } - } else { + } else if (!rtn) { scsi_abort_eh_cmnd(scmd); rtn = FAILED; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
On Fri, 2013-05-03 at 10:24 -0400, wenxi...@linux.vnet.ibm.com wrote: Quoting Hannes Reinecke h...@suse.de: scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de Hi James, I have verified this patch with two new ipr adapters. EEH error can be recoery successfully. Do you have any question about this new patch? Yes, it's not correct: stall_for is in jiffies not msec, so msleep(stall_for) is taking far too long. Plus you don't know that stall_for is a divisor of timeleft, so timeleft -= stall_for could wrap and cause the retry loop to go on effectively forever. I think the below is the correct patch, so I'll commit it unless there are objections. James --- From a74498dda7acf6f5fb99e43df54f2c1f5c6beec9 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke h...@suse.de Date: Thu, 25 Apr 2013 08:10:00 +0200 Subject: [PATCH] [SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. [jejb: fix confusion between msec and jiffies values and other issues] Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de Signed-off-by: James Bottomley jbottom...@parallels.com diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c1b05a8..cfd1ef2 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -25,6 +25,7 @@ #include linux/interrupt.h #include linux/blkdev.h #include linux/delay.h +#include linux/jiffies.h #include scsi/scsi.h #include scsi/scsi_cmnd.h @@ -791,22 +792,33 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft; + unsigned long timeleft = timeout; struct scsi_eh_save ses; + const unsigned long stall_for = min(msecs_to_jiffies(10), 1UL); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; - shost-hostt-queuecommand(shost, scmd); - - timeleft = wait_for_completion_timeout(done, timeout); + rtn = shost-hostt-queuecommand(shost, scmd); + if (rtn) { + if (timeleft stall_for) { + scsi_eh_restore_cmnd(scmd, ses); + timeleft -= stall_for; + msleep(jiffies_to_msecs(stall_for)); + goto retry; + } + timeleft = 0; + rtn = NEEDS_RETRY; + } else + timeleft = wait_for_completion_timeout(done, timeout); shost-eh_action = NULL; - scsi_log_completion(scmd, SUCCESS); + scsi_log_completion(scmd, rtn); SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd: %p, timeleft: %ld\n, @@ -837,7 +849,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, rtn = FAILED; break; } - } else { + } else if (!rtn) { scsi_abort_eh_cmnd(scmd); rtn = FAILED; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scsi: Handle MLQUEUE busy response in scsi_send_eh_cmnd
scsi_send_eh_cmnd() is calling queuecommand() directly, so it needs to check the return value here. The only valid return codes for queuecommand() are 'busy' states, so we need to wait for a bit to allow the LLDD to recover. Based on an earlier patch from Wen Xiong. Cc: Wen Xiong wenxi...@linux.vnet.ibm.com Cc: Brian King brk...@linux.vnet.ibm.com Signed-off-by: Hannes Reinecke h...@suse.de diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index d58db32..6a3c1d2 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -889,22 +889,32 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, struct scsi_device *sdev = scmd-device; struct Scsi_Host *shost = sdev-host; DECLARE_COMPLETION_ONSTACK(done); - unsigned long timeleft; + unsigned long timeleft = timeout; struct scsi_eh_save ses; + const int stall_for = min(HZ/10, 1); int rtn; +retry: scsi_eh_prep_cmnd(scmd, ses, cmnd, cmnd_size, sense_bytes); shost-eh_action = done; scsi_log_send(scmd); scmd-scsi_done = scsi_eh_done; - shost-hostt-queuecommand(shost, scmd); - - timeleft = wait_for_completion_timeout(done, timeout); + rtn = shost-hostt-queuecommand(shost, scmd); + if (rtn) { + if (timeleft) { + scsi_eh_restore_cmnd(scmd, ses); + timeleft -= stall_for; + msleep(stall_for); + goto retry; + } + rtn = NEEDS_RETRY; + } else + timeleft = wait_for_completion_timeout(done, timeout); shost-eh_action = NULL; - scsi_log_completion(scmd, SUCCESS); + scsi_log_completion(scmd, rtn); SCSI_LOG_ERROR_RECOVERY(3, printk(%s: scmd: %p, timeleft: %ld\n, @@ -935,7 +945,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd, rtn = FAILED; break; } - } else { + } else if (!rtn) { scsi_abort_eh_cmnd(scmd); rtn = FAILED; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html