Re: [PATCH 7/8] scsi: Add 'eh_deadline' to limit SCSI EH runtime

2014-02-27 Thread Ren Mingxin

Hi, Hannes:

On 10/23/2013 04:51 PM, Hannes Reinecke wrote:

This patch adds an 'eh_deadline' sysfs attribute to the scsi
host which limits the overall runtime of the SCSI EH.


As you know, adding the attribute to the SCSI host means the interface
is also added to SATA and USB controllers. I think users may find the
following three points confusing:

1) SATA controllers should not expose this sysfs interface, because it
   has no effect under SATA's own EH policy;
2) USB controllers should not expose this sysfs interface either,
   because users are unlikely to expect EH recovery on USB devices;
3) Users setting the global default through the scsi module parameter
   may not want it to affect SATA/USB controllers (or even their sysfs
   interfaces).

I have been thinking about how to mask out SATA/USB controllers, but I
haven't found a good solution so far; masking them in each controller
driver does not seem like a clean approach. Do you have any ideas?

Thanks,
Ren
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] scsi: improved eh timeout handler

2013-11-01 Thread Ren Mingxin

Hi, Hannes:

I'm sorry, but I don't understand why you didn't consider my earlier
patch below. It not only lowers the minimum valid value of
'eh_deadline' to '0' for your earlier patchset, but also includes some
fixes for this patchset:

http://www.spinics.net/lists/linux-scsi/msg69361.html

If you would prefer that I post the minimum value change as an
improvement after your patchset is accepted, I have no problem with
that ;-)

On 10/31/2013 09:02 PM, Hannes Reinecke wrote:

+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work.work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p eh timeout, not aborting\n",
+				    scmd));
+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&
+			    (++scmd->retries <= scmd->allowed)) {


scsi_host_eh_past_deadline() should also be checked here, before
starting a potentially long retry.



+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p retry "
+						    "aborted command\n", scmd));
+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p finish "
+						    "aborted command\n", scmd));
+				scsi_finish_command(scmd);
+			}
+			return;
+		}
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p abort failed, rtn %d\n",
+				    scmd, rtn));
+	}
+
+	if (scsi_eh_scmd_add(scmd, 0)) {


scsi_finish_command() should be invoked if scsi_eh_scmd_add()
fails.

Thanks,
Ren



+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_WARNING, scmd,
+				    "scmd %p terminate "
+				    "aborted command\n", scmd));
+		scmd->result |= DID_TIME_OUT << 16;
+		scsi_finish_command(scmd);
+	}
+}
<snip>



Re: [PATCH 7/8] scsi: Add 'eh_deadline' to limit SCSI EH runtime

2013-10-23 Thread Ren Mingxin

Hi, Hannes:

On 10/23/2013 04:51 PM, Hannes Reinecke wrote:

This patch adds an 'eh_deadline' sysfs attribute to the scsi
host which limits the overall runtime of the SCSI EH.
The 'eh_deadline' value is stored in the now obsolete field
'resetting'.
When a command fails, the start time of the EH is stored
in 'last_reset'. If the overall runtime of the SCSI EH is longer
than last_reset + eh_deadline, the EH is short-circuited and
falls through to issue a host reset only.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
  drivers/scsi/hosts.c  |   7 +++
  drivers/scsi/scsi_error.c | 130 +++---
  drivers/scsi/scsi_sysfs.c |  37 +
  include/scsi/scsi_host.h  |   4 +-
  4 files changed, 170 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index df0c3c7..f334859 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -316,6 +316,12 @@ static void scsi_host_dev_release(struct device *dev)
kfree(shost);
  }

+static unsigned int shost_eh_deadline;
+
+module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(eh_deadline,
+		 "SCSI EH timeout in seconds (should be between 1 and 2^32-1)");


Sorry, but didn't you consider accepting '0' as the minimum valid
value, as we discussed on Oct 9?
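
To illustrate why '0' is useful as a valid value, here is a minimal
userspace sketch of the deadline check. It is only a model under stated
assumptions: the struct, the EH_DEADLINE_OFF sentinel and the function
names are hypothetical and are not the kernel's code. It assumes that
-1 means "no deadline" and that last_reset == 0 means EH is not running.

```c
#include <stdbool.h>

/* Hypothetical userspace model; names are illustrative, not kernel API. */
#define EH_DEADLINE_OFF (-1)    /* assumed sentinel: no deadline configured */

struct host_model {
    unsigned long last_reset;   /* tick at which EH started, 0 = not running */
    int eh_deadline;            /* ticks allowed for EH, or EH_DEADLINE_OFF */
};

/* Wraparound-safe "a is before b", the same idea as time_before(). */
static bool tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/* Returns true once EH has used up its time budget. */
static bool eh_past_deadline(const struct host_model *h, unsigned long now)
{
    if (h->last_reset == 0 || h->eh_deadline == EH_DEADLINE_OFF)
        return false;
    return !tick_before(now, h->last_reset + (unsigned long)h->eh_deadline);
}
```

With a separate "off" sentinel, a deadline of 0 expires as soon as EH
starts, instead of being read as "disabled".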

Thanks,
Ren

<snip>

+   int eh_deadline;
unsigned long last_reset;

/*




Re: [PATCH] scsi: Set the minimum valid value of 'eh_deadline' as 0

2013-10-10 Thread Ren Mingxin

Hi, Ewan, Hannes:

On 10/09/2013 08:28 PM, Ewan Milne wrote:

On Wed, 2013-10-09 at 15:43 +0800, Ren Mingxin wrote:

The former minimum valid value of 'eh_deadline' is 1s, which means
the earliest point at which EH can be shortened is 1 second after a
command fails or times out. But if we want to skip the EH steps as
soon as possible, we still have to wait until the first EH step has
finished. If the first EH step takes a long time, this wait is
excruciating. So it is necessary to accept 0 as the minimum valid
value for 'eh_deadline'.

According to my test, with Hannes' patchset 'New EH command timeout
handler' applied as well, the minimum I/O time improves from 73s
(eh_deadline = 1) to 43s (eh_deadline = 0) when commands time out
after disabling RSCN and the target port.

Another thing: scsi_finish_command() should be invoked if
scsi_eh_scmd_add() fails - let EH finish those
commands.

Signed-off-by: Ren Mingxin <re...@cn.fujitsu.com>
---
  drivers/scsi/hosts.c  |   14 +++---
  drivers/scsi/scsi_error.c |   40 +++-
  drivers/scsi/scsi_sysfs.c |   36 +---
  include/scsi/scsi_host.h  |2 +-
  4 files changed, 64 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f334859..e84123a 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -316,11 +316,11 @@ static void scsi_host_dev_release(struct device *dev)
kfree(shost);
  }

-static unsigned int shost_eh_deadline;
+static unsigned int shost_eh_deadline = -1;


This should probably be static int shost_eh_deadline = -1;.
And the range tests in scsi_host_alloc() and store_shost_eh_deadline()
below should probably use INT_MAX rather than UINT_MAX.


That would reduce the maximum value.
Hannes, do you agree?



  module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
  MODULE_PARM_DESC(eh_deadline,
-		 "SCSI EH timeout in seconds (should be between 1 and 2^32-1)");
+		 "SCSI EH timeout in seconds (should be between 0 and 2^32-1)");


And the description above should be changed to:

+		 "SCSI EH timeout in seconds (should be between 0 and (2^31-1)/HZ)");





  static struct device_type scsi_host_type = {
 	.name = "scsi_host",
@@ -394,7 +394,15 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->unchecked_isa_dma = sht->unchecked_isa_dma;
 	shost->use_clustering = sht->use_clustering;
 	shost->ordered_tag = sht->ordered_tag;
-	shost->eh_deadline = shost_eh_deadline * HZ;
+	if (shost_eh_deadline == -1)
+		shost->eh_deadline = -1;
+	else if ((ulong) shost_eh_deadline * HZ > UINT_MAX) {
+		printk(KERN_WARNING "scsi%d: eh_deadline %u exceeds the "
+		       "maximum, setting to %u\n",
+		       shost->host_no, shost_eh_deadline, UINT_MAX / HZ);
+		shost->eh_deadline = UINT_MAX / HZ * HZ;


Just use shost-eh_deadline = INT_MAX here, leave off the / HZ * HZ.


Nod.
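
The range handling discussed above could be sketched in userspace as
follows. This is only an illustration, assuming a signed parameter with
-1 as the "off" value and clamping to INT_MAX as Ewan suggests;
MODEL_HZ, EH_DEADLINE_OFF and eh_deadline_to_ticks() are hypothetical
names, and the real HZ is a kernel build-time setting.

```c
#include <limits.h>

#define MODEL_HZ 1000           /* assumed tick rate, for illustration only */
#define EH_DEADLINE_OFF (-1)    /* assumed sentinel: no deadline configured */

/* Convert a deadline in seconds to ticks; a negative input means "off". */
static int eh_deadline_to_ticks(int seconds)
{
    if (seconds < 0)
        return EH_DEADLINE_OFF;
    /* Compare before multiplying so the product cannot overflow. */
    if (seconds > INT_MAX / MODEL_HZ)
        return INT_MAX;         /* clamp without the "/ HZ * HZ" rounding */
    return seconds * MODEL_HZ;
}
```

Checking the bound by division avoids the cast to a wider type that the
patch uses for the overflow test.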

Thanks,
Ren


[PATCH] scsi: Set the minimum valid value of 'eh_deadline' as 0

2013-10-09 Thread Ren Mingxin
The former minimum valid value of 'eh_deadline' is 1s, which means
the earliest point at which EH can be shortened is 1 second after a
command fails or times out. But if we want to skip the EH steps as
soon as possible, we still have to wait until the first EH step has
finished. If the first EH step takes a long time, this wait is
excruciating. So it is necessary to accept 0 as the minimum valid
value for 'eh_deadline'.

According to my test, with Hannes' patchset 'New EH command timeout
handler' applied as well, the minimum I/O time improves from 73s
(eh_deadline = 1) to 43s (eh_deadline = 0) when commands time out
after disabling RSCN and the target port.

Another thing: scsi_finish_command() should be invoked if
scsi_eh_scmd_add() fails - let EH finish those
commands.

Signed-off-by: Ren Mingxin <re...@cn.fujitsu.com>
---
 drivers/scsi/hosts.c  |   14 +++---
 drivers/scsi/scsi_error.c |   40 +++-
 drivers/scsi/scsi_sysfs.c |   36 +---
 include/scsi/scsi_host.h  |2 +-
 4 files changed, 64 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f334859..e84123a 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -316,11 +316,11 @@ static void scsi_host_dev_release(struct device *dev)
kfree(shost);
 }
 
-static unsigned int shost_eh_deadline;
+static unsigned int shost_eh_deadline = -1;
 
 module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(eh_deadline,
-		 "SCSI EH timeout in seconds (should be between 1 and 2^32-1)");
+		 "SCSI EH timeout in seconds (should be between 0 and 2^32-1)");
 
 static struct device_type scsi_host_type = {
 	.name = "scsi_host",
@@ -394,7 +394,15 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->unchecked_isa_dma = sht->unchecked_isa_dma;
 	shost->use_clustering = sht->use_clustering;
 	shost->ordered_tag = sht->ordered_tag;
-	shost->eh_deadline = shost_eh_deadline * HZ;
+	if (shost_eh_deadline == -1)
+		shost->eh_deadline = -1;
+	else if ((ulong) shost_eh_deadline * HZ > UINT_MAX) {
+		printk(KERN_WARNING "scsi%d: eh_deadline %u exceeds the "
+		       "maximum, setting to %u\n",
+		       shost->host_no, shost_eh_deadline, UINT_MAX / HZ);
+		shost->eh_deadline = UINT_MAX / HZ * HZ;
+	} else
+		shost->eh_deadline = shost_eh_deadline * HZ;
 
if (sht-supported_mode == MODE_UNKNOWN)
/* means we didn't set it ... default to INITIATOR */
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index adb4cbe..c2f9431 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -90,7 +90,7 @@ EXPORT_SYMBOL_GPL(scsi_schedule_eh);
 
 static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 {
-	if (!shost->last_reset || !shost->eh_deadline)
+	if (!shost->last_reset || shost->eh_deadline == -1)
 		return 0;
 
 	if (time_before(jiffies,
@@ -127,29 +127,43 @@ scmd_eh_abort_handler(struct work_struct *work)
 		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
 		if (rtn == SUCCESS) {
 			scmd->result |= DID_TIME_OUT << 16;
-			if (!scsi_noretry_cmd(scmd) &&
+			spin_lock_irqsave(sdev->host->host_lock, flags);
+			if (scsi_host_eh_past_deadline(sdev->host)) {
+				spin_unlock_irqrestore(sdev->host->host_lock,
+						       flags);
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_INFO, scmd,
+						    "scmd %p eh timeout, "
+						    "not retrying aborted "
+						    "command\n", scmd));
+			} else if (!scsi_noretry_cmd(scmd) &&
 			    (++scmd->retries <= scmd->allowed)) {
+				spin_unlock_irqrestore(sdev->host->host_lock,
+						       flags);
 				SCSI_LOG_ERROR_RECOVERY(3,
 					scmd_printk(KERN_WARNING, scmd,
 						    "scmd %p retry "
 						    "aborted command\n", scmd));
 				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+				return;
 			} else {
+				spin_unlock_irqrestore(sdev->host->host_lock,
+						       flags);
 				SCSI_LOG_ERROR_RECOVERY(3,
 					scmd_printk

Re: [PATCH 2/3] scsi: improved eh timeout handler

2013-09-20 Thread Ren Mingxin

Hi, Hannes:

On 09/02/2013 07:58 PM, Hannes Reinecke wrote:

+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work.work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p eh timeout, not aborting\n", scmd));
+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&
+			    (++scmd->retries <= scmd->allowed)) {


I think scsi_host_eh_past_deadline() should be checked here, like this:

-			if (!scsi_noretry_cmd(scmd) &&
+			if (!scsi_host_eh_past_deadline(sdev->host) &&
+			    !scsi_noretry_cmd(scmd) &&

According to my test, a single retry takes 30 seconds. If eh_deadline
has been reached, we can stop EH here without waiting for a long
retry.
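
The effect of the extra check can be modelled in userspace as below.
This is a hedged sketch, not the kernel function: struct cmd_model, the
'noretry' field (standing in for scsi_noretry_cmd()) and
after_successful_abort() are hypothetical names. It assumes, as in the
suggestion above, that a command past the deadline takes the "finish"
branch instead of being retried.

```c
#include <stdbool.h>

enum abort_outcome { OUTCOME_RETRY, OUTCOME_FINISH };

/* Hypothetical stand-in for the few scsi_cmnd fields used here. */
struct cmd_model {
    int retries;    /* retries consumed so far */
    int allowed;    /* maximum number of retries */
    bool noretry;   /* models scsi_noretry_cmd() returning true */
};

/* Decide what to do with a command whose abort succeeded. */
static enum abort_outcome
after_successful_abort(struct cmd_model *c, bool past_deadline)
{
    /* Short-circuit: past the deadline, no retry is consumed at all. */
    if (!past_deadline && !c->noretry && ++c->retries <= c->allowed)
        return OUTCOME_RETRY;
    return OUTCOME_FINISH;
}
```

Because the deadline test comes first, the retry counter is left
untouched once the deadline has passed.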

Thanks,
Ren



+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p retry "
+						    "aborted command\n", scmd));
+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p finish "
+						    "aborted command\n", scmd));
+				scsi_finish_command(scmd);
+			}
+			return;
+		}




Re: [PATCH 2/3] scsi: improved eh timeout handler

2013-09-11 Thread Ren Mingxin

Hi, Hannes:

On 09/02/2013 07:58 PM, Hannes Reinecke wrote:

If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.


I'm still thinking about the abort step 'scsi_eh_abort_cmds()' in SCSI
EH - does it make sense to abort in SCSI EH when we have already tried
to abort via your scsi_abort_command()? On the other hand, the abort in
SCSI EH will handle commands which haven't been aborted in
scsi_abort_command() because EH was already engaged.
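
The behaviour described in the quoted text, where 'eh_eflags' records
the abort so that a second timeout goes to the error handler without
another asynchronous abort, can be sketched like this. It is a
userspace model only: MODEL_ABORT_TRIED, struct timed_out_cmd and
on_timeout() are hypothetical names, and what SCSI EH itself does with
the command afterwards is deliberately left out.

```c
#define MODEL_ABORT_TRIED 0x1u  /* hypothetical flag bit in eh_eflags */

/* Hypothetical stand-in for the one scsi_cmnd field used here. */
struct timed_out_cmd {
    unsigned int eh_eflags;
};

enum timeout_action { ACTION_ASYNC_ABORT, ACTION_ESCALATE_TO_EH };

/* Decide how to handle a command that has just timed out. */
static enum timeout_action on_timeout(struct timed_out_cmd *c)
{
    if (c->eh_eflags & MODEL_ABORT_TRIED)
        return ACTION_ESCALATE_TO_EH;   /* abort already tried once */
    c->eh_eflags |= MODEL_ABORT_TRIED;  /* remember the attempt */
    return ACTION_ASYNC_ABORT;
}
```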

Thanks,
Ren


Re: [PATCH 3/9] scsi: improved eh timeout handler

2013-08-22 Thread Ren Mingxin

Hi, Hannes:

On 07/01/2013 10:24 PM, Hannes Reinecke wrote:

When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the error handler.

This patch implements a new scsi_abort_command() function which
invokes an asynchronous function scsi_eh_abort_handler() to
abort the commands via the usual 'eh_abort_handler'.

If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
  drivers/scsi/scsi.c   |   1 +
  drivers/scsi/scsi_error.c | 139 ++
  drivers/scsi/scsi_priv.h  |   2 +
  include/scsi/scsi_cmnd.h  |   2 +
  4 files changed, 132 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index ebe3b0a..06257cf 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -297,6 +297,7 @@ struct scsi_cmnd *scsi_get_command(struct scsi_device *dev, gfp_t gfp_mask)

 	cmd->device = dev;
 	INIT_LIST_HEAD(&cmd->list);
+	INIT_WORK(&cmd->abort_work, scmd_eh_abort_handler);
 	spin_lock_irqsave(&dev->list_lock, flags);
 	list_add_tail(&cmd->list, &dev->cmd_list);
 	spin_unlock_irqrestore(&dev->list_lock, flags);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index e76e895..835f7e4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -55,6 +55,7 @@ static void scsi_eh_done(struct scsi_cmnd *scmd);
  #define HOST_RESET_SETTLE_TIME  (10)

  static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static int scsi_try_to_abort_cmd(struct scsi_host_template *, struct scsi_cmnd *);

  /* called with shost->host_lock held */
  void scsi_eh_wakeup(struct Scsi_Host *shost)
@@ -102,6 +103,111 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
  }

  /**
+ * scmd_eh_abort_handler - Handle command aborts
+ * @work:  command to be aborted.
+ */
+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "eh timeout, not aborting\n"));


The command address should also be printed to make debugging easier:
+				    "eh timeout, not aborting command %p\n", scmd));



+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&


I think a 'scsi_device_online(scmd->device)' check is also necessary here.



+			    (++scmd->retries <= scmd->allowed)) {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "retry aborted command\n"));


The command address should also be printed here.



+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "finish aborted command\n"));


The command address should also be printed here.



+				scsi_finish_command(scmd);
+			}
+			return;
+		}
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "abort command failed, rtn %d\n", rtn));


The command address should also be printed here.



+	}
+
+	if (scsi_eh_scmd_add(scmd, 0)) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_WARNING, scmd,
+				    "terminate aborted command\n"));


The command address should also be printed here.



+		scmd->result |= DID_TIME_OUT << 16;
+		scsi_finish_command(scmd);
+	}
+}
+
+/**
+ * 

Re: [PATCHv2 0/7] Limit overall SCSI EH runtime

2013-08-07 Thread Ren Mingxin

Hi, James:

On 07/11/2013 04:35 AM, Ewan Milne wrote:

Looks good.  We have been testing this extensively.

Acked-by: Ewan D. Milne <emi...@redhat.com>


Do you think this patchset can be applied? If so, when? Or are you
waiting for someone's feedback?

We've also tested it: the duration was shortened from 6m26s to 44s
when 'eh_deadline' was set to 1s (the minimum timeout value) and 16M
of data was written (the I/O processing time, 0.7s, is negligible).

As Ewan said, this is effective for fast failover policies in
redundant environments.

Thanks,
Ren


Re: [PATCHv3 0/9] New EH command timeout handler

2013-08-07 Thread Ren Mingxin

Hi, Hannes:

On 07/15/2013 02:05 PM, Ren Mingxin wrote:

On 07/12/2013 06:27 PM, Hannes Reinecke wrote:

On 07/12/2013 12:00 PM, Ren Mingxin wrote:

On 07/12/2013 02:09 PM, Hannes Reinecke wrote:

On 07/12/2013 06:14 AM, Ren Mingxin wrote:

On 07/01/2013 10:24 PM, Hannes Reinecke wrote:

With the original SCSI EH I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s

real	2m22.657s
user	0m0.013s
sys	0m0.145s

With this patchset I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s

real	0m52.163s
user	0m0.012s
sys	0m0.145s

Test was to disable RSCN on the target port, disable the
target port, and then start the 'dd' command as indicated.


Do you mean that disabling RSCN and the port is enough? I'm afraid I
couldn't reproduce the problem with your steps. With and without your
patchset, the 'dd' result is the same: 27s. Please let me know what I
missed or got wrong:

1) I made a dm-multipath target 'dm-0' whose grouping policy was
 failover;
2) Disable RSCN/port via brocade fc switch:
 SW300:root   portcfg rscnsupr 15 --enable; portDisable 15
3) Start the 'dd' command:
 # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct
 dd: writing `/dev/sde': Input/output error
 1+0 records in
 0+0 records out
 0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s

 real	0m27.860s
 user	0m0.001s
 sys	0m0.000s


You are aware that you have to disable RSCNs on the _target_ port,
right?
Disabling RSCNs on the _initiator_ ports is a well-tested case, and
the one which actually makes sense (and is even implemented in
QLogic switches).
Disabling RSCNs for the _target_ port, OTOH, has a very questionable
nature (hence QLogic switches don't even allow you to do this).


You're right. By disabling RSCNs on the target port, I've reproduced
this problem. Thank you so much. But I've encountered the bug I
mentioned before. I'll test again with your new patchset once you send
it.



Could you check with the attached patch? That should convert it to
delayed_work and avoid this issue.


Unfortunately, with this patch the system never reached the login
prompt, and BUGs were printed continuously while the OS was booting.
The BUGs look like this:

BUG: scheduling while atomic: swapper/0/0/0x1100
Modules linked in: mptsas(F+) mptscsih(F) mptbase(F) scsi_transport_sas(F)
CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF3.10.0hannes+ #10
Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB-8GDIMM-CN, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012

  88047ee03b68 8153ada4 88047ee03b78
 8107389d 88047ee03c08 8153ca26 81a01fd8
 00012d00 81a00010 00012d00 00012d00
Call Trace:
IRQ  [8153ada4] dump_stack+0x19/0x1d
 [8107389d] __schedule_bug+0x4d/0x60
 [8153ca26] __schedule+0x646/0x6f0
 [8107749a] __cond_resched+0x2a/0x40
 [8153cb60] _cond_resched+0x30/0x40
 [8105fecc] start_flush_work+0x2c/0x140
 [8105fffa] flush_work+0x1a/0x40
 [8105fb39] ? try_to_grab_pending+0x109/0x190
 [8106027e] __cancel_work_timer+0x7e/0x110
 [81060323] cancel_delayed_work_sync+0x13/0x20
 [81374ec5] scsi_put_command+0x65/0xa0


This bug is caused by the synchronous function
'cancel_delayed_work_sync()', which is invoked in interrupt context.
Replacing it with the non-synchronous 'cancel_delayed_work()' in
'scsi_put_command()' avoids the problem.

Do you think there is any need to synchronize in 'scsi_put_command()'?
Since the SCSI command block will be freed here, it should NOT be
necessary to wait for the abort work on it to finish, right?

Thanks,
Ren


 [8137d5aa] scsi_next_command+0x3a/0x60
 [8137dedb] scsi_end_request+0xab/0xb0
 [8137e1ef] scsi_io_completion+0x9f/0x670
 [813744e4] scsi_finish_command+0xd4/0x140
 [8137e927] scsi_softirq_done+0x147/0x170
 [81239534] blk_done_softirq+0x74/0x90
 [81049a4f] __do_softirq+0xef/0x260
 [81049cb5] irq_exit+0xb5/0xc0
 [81548406] do_IRQ+0x66/0xe0
 [8153e5ea] common_interrupt+0x6a/0x6a
EOI  [8109b5f2] ? clockevents_notify+0x52/0x150
 [8142dce3] ? cpuidle_enter_state+0x53/0xd0
 [8142dcdf] ? cpuidle_enter_state+0x4f/0xd0
 [8142e10f] cpuidle_idle_call+0xcf/0x160
 [8100ab1e] arch_cpu_idle+0xe/0x30
 [81093275] cpu_idle_loop+0x65/0x1f0
 [81093470] cpu_startup_entry+0x70/0x80
 [81529427] rest_init+0x77/0x80
 [81b0e1bb] start_kernel+0x41a/0x427
 [81b0dbbf] ? repair_env_string+0x5b/0x5b
 [81b0d5a1] x86_64_start_reservations+0x2a/0x2c
 [81b0d6d2] x86_64_start_kernel+0x12f/0x136



Re: [PATCHv2 0/7] Limit overall SCSI EH runtime

2013-07-26 Thread Ren Mingxin

Hi, Hannes:

On 07/15/2013 06:33 PM, Ren Mingxin wrote:

I noticed that the dd time was reduced from 6m+ to 2m+ when
'eh_deadline' was set to 30s, but the dd time was 6m+ (nearly the same
as the default, where 'eh_deadline' was 0) when 'eh_deadline' was set
to 10s. I haven't been able to dig further, but I guess there is some
restriction on setting 'eh_deadline'. Maybe it should not be less than
some other timeout, otherwise the 'eh_deadline' setting has no
effect?


I've retried and confirmed that the anomaly above was caused by a
mistake on my side: I had two FC hosts forming a failover multipath,
but I set 'eh_deadline' on only one host. When I tested with 10s,
'eh_deadline' on the host of the active path wasn't set :-(

Sorry for my mistake. So:

Tested-by: Ren Mingxin <re...@cn.fujitsu.com>

Thanks,
Ren



Re: [PATCHv3 0/9] New EH command timeout handler

2013-07-15 Thread Ren Mingxin

Hi, Hannes:

On 07/12/2013 06:27 PM, Hannes Reinecke wrote:

On 07/12/2013 12:00 PM, Ren Mingxin wrote:

On 07/12/2013 02:09 PM, Hannes Reinecke wrote:

On 07/12/2013 06:14 AM, Ren Mingxin wrote:

On 07/01/2013 10:24 PM, Hannes Reinecke wrote:

With the original SCSI EH I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s

real	2m22.657s
user	0m0.013s
sys	0m0.145s

With this patchset I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s

real	0m52.163s
user	0m0.012s
sys	0m0.145s

Test was to disable RSCN on the target port, disable the
target port, and then start the 'dd' command as indicated.


Do you mean that disabling RSCN and the port is enough? I'm afraid I
couldn't reproduce the problem with your steps. With and without your
patchset, the 'dd' result is the same: 27s. Please let me know what I
missed or got wrong:

1) I made a dm-multipath target 'dm-0' whose grouping policy was
 failover;
2) Disable RSCN/port via brocade fc switch:
 SW300:root   portcfg rscnsupr 15 --enable; portDisable 15
3) Start the 'dd' command:
 # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct
 dd: writing `/dev/sde': Input/output error
 1+0 records in
 0+0 records out
 0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s

 real	0m27.860s
 user	0m0.001s
 sys	0m0.000s


You are aware that you have to disable RSCNs on the _target_ port,
right?
Disabling RSCNs on the _initiator_ ports is a well-tested case, and
the one which actually makes sense (and is even implemented in
QLogic switches).
Disabling RSCNs for the _target_ port, OTOH, has a very questionable
nature (hence QLogic switches don't even allow you to do this).


You're right. By disabling RSCNs on the target port, I've reproduced
this problem. Thank you so much. But I've encountered the bug I
mentioned before. I'll test again with your new patchset once you send
it.



Could you check with the attached patch? That should convert it to
delayed_work and avoid this issue.


Unfortunately, with this patch the system never reached the login
prompt, and BUGs were printed continuously while the OS was booting.
The BUGs look like this:

BUG: scheduling while atomic: swapper/0/0/0x1100
Modules linked in: mptsas(F+) mptscsih(F) mptbase(F) scsi_transport_sas(F)
CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF3.10.0hannes+ #10
Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB-8GDIMM-CN, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012

  88047ee03b68 8153ada4 88047ee03b78
 8107389d 88047ee03c08 8153ca26 81a01fd8
 00012d00 81a00010 00012d00 00012d00
Call Trace:
IRQ  [8153ada4] dump_stack+0x19/0x1d
 [8107389d] __schedule_bug+0x4d/0x60
 [8153ca26] __schedule+0x646/0x6f0
 [8107749a] __cond_resched+0x2a/0x40
 [8153cb60] _cond_resched+0x30/0x40
 [8105fecc] start_flush_work+0x2c/0x140
 [8105fffa] flush_work+0x1a/0x40
 [8105fb39] ? try_to_grab_pending+0x109/0x190
 [8106027e] __cancel_work_timer+0x7e/0x110
 [81060323] cancel_delayed_work_sync+0x13/0x20
 [81374ec5] scsi_put_command+0x65/0xa0
 [8137d5aa] scsi_next_command+0x3a/0x60
 [8137dedb] scsi_end_request+0xab/0xb0
 [8137e1ef] scsi_io_completion+0x9f/0x670
 [813744e4] scsi_finish_command+0xd4/0x140
 [8137e927] scsi_softirq_done+0x147/0x170
 [81239534] blk_done_softirq+0x74/0x90
 [81049a4f] __do_softirq+0xef/0x260
 [81049cb5] irq_exit+0xb5/0xc0
 [81548406] do_IRQ+0x66/0xe0
 [8153e5ea] common_interrupt+0x6a/0x6a
EOI  [8109b5f2] ? clockevents_notify+0x52/0x150
 [8142dce3] ? cpuidle_enter_state+0x53/0xd0
 [8142dcdf] ? cpuidle_enter_state+0x4f/0xd0
 [8142e10f] cpuidle_idle_call+0xcf/0x160
 [8100ab1e] arch_cpu_idle+0xe/0x30
 [81093275] cpu_idle_loop+0x65/0x1f0
 [81093470] cpu_startup_entry+0x70/0x80
 [81529427] rest_init+0x77/0x80
 [81b0e1bb] start_kernel+0x41a/0x427
 [81b0dbbf] ? repair_env_string+0x5b/0x5b
 [81b0d5a1] x86_64_start_reservations+0x2a/0x2c
 [81b0d6d2] x86_64_start_kernel+0x12f/0x136

If I have left out any information you need, please let me know.

Thanks,
Ren



Re: [PATCHv2 0/7] Limit overall SCSI EH runtime

2013-07-15 Thread Ren Mingxin

Hi, Ewan:

On 07/12/2013 09:30 PM, Ewan Milne wrote:

On Fri, 2013-07-12 at 13:54 +0800, Ren Mingxin wrote:

I'm wondering how you tested: with special hardware or a self-made
module? Would you mind sharing your test method and results?

This was tested in a SAN environment with an EMC Symmetrix and
Brocade FC switches.  The error was injected by the following
commands:

portcfg rscnsupr <port> --enable
portdisable <port>

Where <port> is the FC port of the Symmetrix target.

Multipath is used and the test records how long I/O from userspace
takes to complete after the error handling stops and the I/O is
retried on another path.

What happens is that the target never responds to anything the
HBA sends, so commands and TMFs just timeout.  The HBA doesn't
see link down (since it is the target port) and doesn't get an
RSCN.  When the HBA is finally reset, however, it can't login
to the target port and so further I/O gets an immediate error.

Unfortunately, not all SAN environments will exhibit the failing
behavior -- it appears as if in some cases the HBA detects the
problem regardless of the switch portcfg setting.  But this has
been verified to solve the problem of seemingly endless EH
activity in testing at a large customer site.


Thanks for the detailed explanation. I've been able to reproduce it
with only this patchset applied.


Also, to be clear, we tested with the "Limit overall SCSI EH
runtime" patchset but not the "New EH command timeout handler".
I think the changes to issue the abort in the timeout handler
are a good idea, though, because there really is no need to
wait for all activity on the host to cease before issuing the
abort as far as I can see.


Hmm, agree with you. It is much better to issue aborts without
waiting, which can shorten the timeout handling time.


Acked-by: Ewan D. Milne <emi...@redhat.com>



Hi, Hannes:

I noticed that the dd time was reduced from 6m+ to 2m+ when
'eh_deadline' was set to 30s, but the dd time was 6m+ (nearly the same
as with the default, where 'eh_deadline' is 0) when 'eh_deadline' was
set to 10s. I haven't been able to dig further, but I guess there is
some restriction on the value of 'eh_deadline'. Maybe it must not be
less than some other timeout, otherwise the 'eh_deadline' setting will
not take effect?

Thanks,
Ren


Re: [PATCHv3 0/9] New EH command timeout handler

2013-07-12 Thread Ren Mingxin

Hi, Hannes:

On 07/12/2013 02:09 PM, Hannes Reinecke wrote:

On 07/12/2013 06:14 AM, Ren Mingxin wrote:

On 07/01/2013 10:24 PM, Hannes Reinecke wrote:

With the original SCSI EH I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s

real    2m22.657s
user    0m0.013s
sys     0m0.145s

With this patchset I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s

real    0m52.163s
user    0m0.012s
sys     0m0.145s

Test was to disable RSCN on the target port, disable the
target port, and then start the 'dd' command as indicated.


Do you mean disabling RSCN/port is enough? I'm afraid I couldn't
reproduce the problem with your steps. With and without your
patchset, the 'dd' result is the same: 27s. Please let me know what I
missed or got wrong:

1) I made a dm-multipath target 'dm-0' whose grouping policy was
failover;
2) Disable RSCN/port via brocade fc switch:
SW300:root> portcfg rscnsupr 15 --enable; portDisable 15
3) Start the 'dd' command:
# time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct
dd: writing `/dev/sde': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s

real    0m27.860s
user    0m0.001s
sys     0m0.000s


You are aware that you have to disable RSCNs on the _target_ port,
right?
Disabling RSCNs on the _initiator_ ports is a well-tested case, and
the one which actually makes sense (and is even implemented in
QLogic switches).
Disabling RSCNs for the _target_ port, OTOH, has a very questionable
nature (hence QLogic switches don't even allow you to do this).


You're right. By disabling RSCNs on the target port, I've reproduced the
problem. Thank you so much. But I've hit the bug I mentioned
before. I'll test again with your new patchset once you send it.

Thanks,
Ren




[ .. ]


Another question:

I also tried to produce timeouts by modifying Yasui's module (please
see APPENDIX A):
http://www.spinics.net/lists/linux-scsi/msg35091.html

But I hit a bug with this patchset of yours by following the steps
below (there was no such bug without your patchset):

# grep lpfc_template /proc/kallsyms
a00f9240 d lpfc_template[lpfc]
# multipath -ll
...
mpathb (36000b5d0006a006a14e7000c) dm-1 FUJITSU,ETERNUS_DX400
size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| `- 2:0:0:1 sdf 8:80  active ready running
`-+- policy='round-robin 0' prio=130 status=enabled
   `- 3:0:0:1 sdh 8:112 active ready running
# insmod scsi_tmo_mod.ko param=0xa00f9240,2:0:0:1; time dd
if=/dev/zero of=/dev/dm-1 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 151.194 s, 111 kB/s

real    2m31.195s
user    0m0.004s
sys     0m0.111s

Please see the logs in APPENDIX B. Do you think this bug is unrelated to
your patchset?


Hmm. No, sadly not.

'cancel_work_sync' cannot be called from an interrupt context;
guess I'll need to convert it to delayed work.

Thanks for testing; will be updating the patchset.




Re: [PATCHv3 0/9] New EH command timeout handler

2013-07-11 Thread Ren Mingxin

Hi, Hannes:

On 07/01/2013 10:24 PM, Hannes Reinecke wrote:

With the original SCSI EH I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s

real    2m22.657s
user    0m0.013s
sys     0m0.145s

With this patchset I got:
# time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct
4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s

real    0m52.163s
user    0m0.012s
sys     0m0.145s

Test was to disable RSCN on the target port, disable the
target port, and then start the 'dd' command as indicated.


Do you mean disabling RSCN/port is enough? I'm afraid I couldn't
reproduce the problem with your steps. With and without your
patchset, the 'dd' result is the same: 27s. Please let me know what I
missed or got wrong:

1) I made a dm-multipath target 'dm-0' whose grouping policy was
   failover;
2) Disable RSCN/port via brocade fc switch:
   SW300:root> portcfg rscnsupr 15 --enable; portDisable 15
3) Start the 'dd' command:
   # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct
   dd: writing `/dev/sde': Input/output error
   1+0 records in
   0+0 records out
   0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s

   real    0m27.860s
   user    0m0.001s
   sys     0m0.000s
#) Corresponding logs in /var/log/messages
Jul  9 14:56:06 build kernel: lpfc 0000:0d:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x110 x0 x0
Jul  9 14:56:36 build kernel: rport-3:0-2: blocked FC remote port time out: removing target and saving binding
Jul  9 14:56:36 build kernel: sd 3:0:0:0: rejecting I/O to offline device
Jul  9 14:56:36 build kernel: lpfc 0000:0d:00.1: 1:(0):0203 Devloss timeout on WWPN 20:41:00:0b:5d:6a:14:e7 NPort x620700 Data: x0 x8 x0
Jul  9 14:56:36 build kernel: sd 3:0:0:0: [sde] Synchronizing SCSI cache
Jul  9 14:56:36 build kernel: sd 3:0:0:0: [sde]
Jul  9 14:56:36 build kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Jul  9 14:56:36 build kernel: sd 3:0:0:1: [sdf] Synchronizing SCSI cache
Jul  9 14:56:36 build kernel: sd 3:0:0:1: [sdf]
Jul  9 14:56:36 build kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Jul  9 14:56:36 build multipathd: sdf: remove path (uevent)
Jul  9 14:56:36 build multipathd: mpatha: load table [0 104857600 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:112 1]
Jul  9 14:56:36 build multipathd: sdf: path removed from map mpatha
Jul  9 14:56:36 build udevd-work[8420]: error opening ATTR{/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/0000:02:01.0/0000:0a:00.0/0000:0b:01.0/0000:0d:00.1/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sde/queue/iosched/slice_idle} for writing: No such file or directory
Jul  9 14:56:36 build udevd-work[8420]: error opening ATTR{/sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0/0000:02:01.0/0000:0a:00.0/0000:0b:01.0/0000:0d:00.1/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sde/queue/iosched/quantum} for writing: No such file or directory
Jul  9 14:56:36 build multipathd: sde: remove path (uevent)
Jul  9 14:56:36 build multipathd: mpathb: load table [0 104857600 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:96 1]
Jul  9 14:56:36 build multipathd: sde: path removed from map mpathb
* there are two disks sde and sdf connected via port 15

Another question:

I also tried to produce timeouts by modifying Yasui's module (please
see APPENDIX A):
http://www.spinics.net/lists/linux-scsi/msg35091.html

But I hit a bug with this patchset of yours by following the steps
below (there was no such bug without your patchset):

# grep lpfc_template /proc/kallsyms
a00f9240 d lpfc_template[lpfc]
# multipath -ll
...
mpathb (36000b5d0006a006a14e7000c) dm-1 FUJITSU,ETERNUS_DX400
size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| `- 2:0:0:1 sdf 8:80  active ready running
`-+- policy='round-robin 0' prio=130 status=enabled
  `- 3:0:0:1 sdh 8:112 active ready running
# insmod scsi_tmo_mod.ko param=0xa00f9240,2:0:0:1; time dd 
if=/dev/zero of=/dev/dm-1 bs=4k count=4k oflag=direct

4096+0 records in
4096+0 records out
16777216 bytes (17 MB) copied, 151.194 s, 111 kB/s

real    2m31.195s
user    0m0.004s
sys     0m0.111s

Please see the logs in APPENDIX B. Do you think this bug is unrelated to
your patchset?

Thanks,
Ren

APPENDIX A:

/*
 * scsi timeout injection module
 */
#include <linux/module.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_device.h>

static struct scsi_host_template *sht;
static char config[32];

static struct target {
	short host;
	uint channel;
	uint id;
	uint lun;
} st;

static int (*org_qc)(struct Scsi_Host *, struct scsi_cmnd *);


static inline int check_dev(struct target *st, struct scsi_cmnd *cmd)
{
	return (st->host == cmd->device->host->host_no &&
		st->channel == cmd->device->channel &&
		st->id ==

Re: [PATCHv2 0/7] Limit overall SCSI EH runtime

2013-07-11 Thread Ren Mingxin

Hi, Ewan:

On 07/11/2013 04:35 AM, Ewan Milne wrote:

On Mon, 2013-07-01 at 08:50 +0200, Hannes Reinecke wrote:

This patchset implements a new 'eh_deadline' attribute to the
SCSI host. It will limit the overall SCSI EH runtime by a given
timeout. If the timeout is reached all intermediate EH steps
will be skipped and host reset will be scheduled immediately.
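
The behaviour described above can be pictured with a small user-space
sketch. It is only an illustration under stated assumptions, not the code
from the patch: struct eh_state, eh_past_deadline(), eh_next_step() and the
plain second counters are hypothetical stand-ins, and a deadline of 0 is
assumed to mean "no limit".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host fields mentioned in the description. */
struct eh_state {
	unsigned long eh_start;     /* time (in seconds) at which SCSI EH started */
	unsigned long eh_deadline;  /* allowed EH runtime in seconds; 0 = no limit (assumed) */
};

/* The usual escalation ladder of the SCSI error handler. */
enum eh_step {
	EH_ABORT,
	EH_DEVICE_RESET,
	EH_TARGET_RESET,
	EH_BUS_RESET,
	EH_HOST_RESET
};

/* Return true once EH has been running for longer than the deadline. */
static bool eh_past_deadline(const struct eh_state *st, unsigned long now)
{
	assert(st != NULL);
	if (st->eh_deadline == 0)
		return false;   /* deadline disabled: every EH step may run */
	return now - st->eh_start > st->eh_deadline;
}

/* Choose the next step; once past the deadline, jump straight to host reset. */
static enum eh_step eh_next_step(const struct eh_state *st, enum eh_step cur,
				 unsigned long now)
{
	if (cur == EH_HOST_RESET || eh_past_deadline(st, now))
		return EH_HOST_RESET;
	return (enum eh_step)(cur + 1);
}
```

With eh_start = 100 and eh_deadline = 30, a call at now = 120 moves from
EH_ABORT to EH_DEVICE_RESET, while a call at now = 140 skips the
intermediate steps and returns EH_HOST_RESET.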

For this patch I've re-used the existing 'last_reset' field
of the SCSI host to store the initial time SCSI EH started.
Also the field 'resetting' has been removed as it never has
been used as intended.

As 'last_reset' might be in use by transport-specific EH
implementation I've disallowed eh_deadline setting there.

Changes from the initial version:
- Add list_splice_init() calls to avoid stale commands
- Rename function to scsi_host_eh_past_deadline

Hannes Reinecke (7):
   dpt_i2o: Remove DPTI_STATE_IOCTL
   dpt_i2o: return SCSI_MLQUEUE_HOST_BUSY when in reset
   advansys: Remove 'last_reset' references
   tmscsim: Move 'last_reset' into host structure
   dc395: Move 'last_reset' into internal host structure
   scsi: remove check for 'resetting'
   scsi: Add 'eh_deadline' to limit SCSI EH runtime

  drivers/scsi/advansys.c   |   8 +--
  drivers/scsi/dc395x.c |  24 +
  drivers/scsi/dpt_i2o.c|  35 +
  drivers/scsi/dpti.h   |   1 -
  drivers/scsi/hosts.c  |   7 +++
  drivers/scsi/scsi.c   |  28 --
  drivers/scsi/scsi_error.c | 130 +++---
  drivers/scsi/scsi_sysfs.c |  37 +
  drivers/scsi/tmscsim.c|  14 ++---
  drivers/scsi/tmscsim.h|   1 +
  include/scsi/scsi_host.h  |   4 +-
  11 files changed, 208 insertions(+), 81 deletions(-)


Looks good.  We have been testing this extensively.


I'm wondering how you tested: with special hardware or a self-made
module? Would you mind pasting your test method and results?

Thanks,
Ren



Acked-by: Ewan D. Milne <emi...@redhat.com>




Re: [PATCH 3/4] scsi: improved eh timeout handler

2013-06-07 Thread Ren Mingxin

Hi, Hannes:

On 06/07/2013 04:28 AM, Jörn Engel wrote:

On Thu, 6 June 2013 22:39:14 +0200, Hannes Reinecke wrote:

+   spin_unlock_irqrestore(&sdev->list_lock, flags);
+   SCSI_LOG_ERROR_RECOVERY(3,
+   scmd_printk(KERN_INFO, scmd,
+   "aborting command %p\n", scmd));
+   rtn = scsi_try_to_abort_cmd(shost->hostt, scmd);
+   if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
+   if (((scmd->request->cmd_flags & REQ_FAILFAST_DEV) ||


Am I being stupid again or should this be negated?


Knowing you I would think the former; where do you see the negation?


If REQ_FAILFAST_DEV is set, this runs scsi_queue_insert(), which I
would expect it should run scsi_finish_command().


I also think (scmd->request->cmd_flags & REQ_FAILFAST_DEV) and
(scmd->request->cmd_type == REQ_TYPE_BLOCK_PC) should be negated.
I'm confused: why not use !scsi_noretry_cmd(scmd) directly here, as in
your former patch?
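
To make the point concrete, here is a minimal user-space sketch of the
decision I would expect after a successful abort. It is an assumption-laden
illustration, not kernel code: struct sk_cmd, SK_REQ_FAILFAST_DEV,
sk_noretry() and sk_should_requeue() are hypothetical names, and
sk_noretry() only loosely mirrors what scsi_noretry_cmd() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_REQ_FAILFAST_DEV 0x1u   /* hypothetical flag value for this sketch */

/* Hypothetical, stripped-down view of a command. */
struct sk_cmd {
	unsigned int cmd_flags;   /* failfast flags of the request */
	bool is_block_pc;         /* passthrough (BLOCK_PC style) request */
	int retries;              /* retries done so far */
	int allowed;              /* retries allowed */
};

/* True when the command must not be retried. */
static bool sk_noretry(const struct sk_cmd *c)
{
	return (c->cmd_flags & SK_REQ_FAILFAST_DEV) || c->is_block_pc;
}

/*
 * After a successful abort: true means requeue the command for a retry,
 * false means finish it with an error. Note the negation of sk_noretry().
 */
static bool sk_should_requeue(struct sk_cmd *c)
{
	assert(c != NULL);
	return !sk_noretry(c) && ++c->retries <= c->allowed;
}
```

With this shape, a failfast or passthrough command is finished immediately
and its retry counter is left untouched, while an ordinary command is
requeued until its retries are used up.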


+(scmd->request->cmd_type == REQ_TYPE_BLOCK_PC)) &&
+   (++scmd->retries <= scmd->allowed)) {
+   SCSI_LOG_ERROR_RECOVERY(3,
+   scmd_printk(KERN_WARNING, scmd,
+   "retry aborted command\n"));
+
+   scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+   } else {
+   SCSI_LOG_ERROR_RECOVERY(3,
+   scmd_printk(KERN_WARNING, scmd,
+   "fast fail aborted command\n"));
+   scmd->result |= DID_TRANSPORT_FAILFAST << 16;
+   scsi_finish_command(scmd);
+   }
+   } else {
+   if (!scsi_eh_scmd_add(scmd, 0)) {
+   SCSI_LOG_ERROR_RECOVERY(3,
+   scmd_printk(KERN_WARNING, scmd,
+   "terminate aborted command\n"));
+   scmd->result |= DID_TIME_OUT << 16;
+   scsi_finish_command(scmd);
+   }
+   }
+   spin_lock_irqsave(&sdev->list_lock, flags);
+   }
+   spin_unlock_irqrestore(&sdev->list_lock, flags);

...

+/**
+ * scsi_abort_command - schedule a command abort
+ * @scmd:  scmd to abort.
+ *
+ * We only need to abort commands after a command timeout
+ */
+void
+scsi_abort_command(struct scsi_cmnd *scmd)
+{
+   unsigned long flags;
+   int kick_worker = 0;
+   struct scsi_device *sdev = scmd->device;
+
+   spin_lock_irqsave(&sdev->list_lock, flags);
+   if (list_empty(&sdev->eh_abort_list))
+   kick_worker = 1;
+   list_add(&scmd->eh_entry, &sdev->eh_abort_list);
+   SCSI_LOG_ERROR_RECOVERY(3,
+   scmd_printk(KERN_INFO, scmd, "adding to eh_abort_list\n"));
+   spin_unlock_irqrestore(&sdev->list_lock, flags);
+   if (kick_worker)
+   schedule_work(&sdev->abort_work);
+}
+EXPORT_SYMBOL_GPL(scsi_abort_command);


Should the function above have a more descriptive name, for example
scsi_abort_scmd_add? I got confused among functions named
scsi_abort_eh_cmnd, scsi_eh_abort_cmds...

Thanks,
Ren


Re: [PATCH 0/4] New SCSI command timeout handler

2013-06-07 Thread Ren Mingxin

Hi, Hannes:

On 06/06/2013 05:43 PM, Hannes Reinecke wrote:

this is the first step towards a new non-blocking
error handler. This patch implements a new command
timeout handler which will be sending command aborts
inline without engaging SCSI EH.

In addition the commands will be returned directly
if the command abort succeeded, cutting down recovery
times dramatically.

With the original scsi error recovery I got:
# time dd if=/dev/zero of=/mnt/test.blk bs=512 count=2048 oflag=sync
2048+0 records in
2048+0 records out
1048576 bytes (1.0 MB) copied, 3.72732 s, 281 kB/s

real    2m14.475s
user    0m0.000s
sys     0m0.104s

with this patchset I got:
# time dd if=/dev/zero of=/mnt/test.blk bs=512 count=2048 oflag=sync
2048+0 records in
2048+0 records out
1048576 bytes (1.0 MB) copied, 31.5151 s, 33.3 kB/s

real    0m31.519s
user    0m0.000s
sys     0m0.088s

Test was to disable RSCN on the target port, disable the
target port, and then start the 'dd' command as indicated.

As a proof-of-concept I've also enabled the new timeout
handler for virtio, so that things can be tested out
more easily.


So this 31.5s was measured on virtio disks, right? It is much faster
than your former test via FC.

This approach may not work for some LLDDs, as you said, but I wonder
whether it is applicable to SAS (and whether there will be later
patches for SAS).

Thanks,
Ren



Re: [PATCH 3/3] scsi: Return ENODATA on medium error

2013-06-06 Thread Ren Mingxin

Hi, Hannes:

On 06/05/2013 03:11 PM, Hannes Reinecke wrote:

When a medium error is detected the SCSI stack should return
ENODATA to the upper layers.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
  drivers/scsi/scsi_error.c | 7 ++-
  drivers/scsi/scsi_lib.c   | 5 +
  include/scsi/scsi.h   | 2 ++
  3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index bf5e61a..2ded10a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -235,6 +235,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host 
*shost,
   *NEEDS_RETRY
   *TARGET_ERROR
   *ALLOC_ERROR
+ * MEDIA_FAILURE
   *
   * Notes:
   *When a deferred error is detected the current command has
@@ -375,7 +376,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
sshdr.asc == 0x13 || /* AMNF DATA FIELD */
sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-   return TARGET_ERROR;
+   return MEDIA_FAILURE;
}
return NEEDS_RETRY;

@@ -1598,6 +1599,10 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
/* target hit out-of-space condition */
set_host_byte(scmd, DID_ALLOC_FAILURE);
rtn = SUCCESS;
+   } else if (rtn == MEDIA_FAILURE) {
+   /* medium error */
+   set_host_byte(scmd, DID_MEDIUM_ERROR);
+   rtn = SUCCESS;
}
/* if rtn == FAILED, we have no sense information;
 * returning FAILED will wake the error handler thread
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 209a4d5..39d626e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -711,6 +711,7 @@ EXPORT_SYMBOL(scsi_release_buffers);
   * -EREMOTEIO permanent target failure, do not retry
   * -EBADE permanent nexus failure, retry on other path
   * -ENOSPCNo write space available
+ * -ENODATAMedium error
   */
  static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
  {
@@ -732,6 +733,10 @@ static int __scsi_error_from_host_byte(struct scsi_cmnd 
*cmd, int result)
set_host_byte(cmd, DID_OK);
error = -ENOSPC;
break;
+   case DID_MEDIUM_ERROR:
+   set_host_byte(cmd, DID_OK);
+   error = -ENODATA;
+   break;


It seems that, for debugging purposes, the meaning of these newly
added error codes should also be reported in the function
blk_update_request(), like this:

diff --git a/block/blk-core.c b/block/blk-core.c
index 33c33bc..a396eb6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2315,6 +2315,12 @@ bool blk_update_request(struct request *req, int 
error, unsigned int nr_bytes)

case -EBADE:
error_type = "critical nexus";
break;
+   case -ENOSPC:
+   error_type = "critical space allocation";
+   break;
+   case -ENODATA:
+   error_type = "critical medium";
+   break;
case -EIO:
default:
error_type = "I/O";

# To tell the truth, I don't understand why this patchset is needed
# in practice, as I've only got limited info about LSF. I guess
# this is one of the improvements for SCSI EH. Could you give an
# example/condition that the upper layers are interested in?

Thanks,
Ren


default:
error = -EIO;
break;
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5ead86b..c397684 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -453,6 +453,7 @@ static inline int scsi_is_wlun(unsigned int lun)
  #define DID_NEXUS_FAILURE 0x11  /* Permanent nexus failure, retry on other
 * paths might yield different results */
  #define DID_ALLOC_FAILURE 0x12  /* Space allocation on the device failed */
+#define DID_MEDIUM_ERROR  0x13  /* Medium error */
  #define DRIVER_OK   0x00  /* Driver status   */

  /*
@@ -484,6 +485,7 @@ static inline int scsi_is_wlun(unsigned int lun)
  #define FAST_IO_FAIL  0x2009
  #define TARGET_ERROR0x200A
  #define ALLOC_ERROR 0x200B
+#define MEDIA_FAILURE   0x200C

  /*
   * Midlevel queue return values.




Re: [PATCH 1/3] scsi: Document enhanced error codes

2013-06-05 Thread Ren Mingxin

Hi, Hannes:

I have two questions about the comments:

On 06/05/2013 03:10 PM, Hannes Reinecke wrote:

Document the various error codes returned on I/O failure.

Signed-off-by: Hannes Reinecke <h...@suse.de>
---
  drivers/scsi/scsi_error.c |  7 +--
  drivers/scsi/scsi_lib.c   | 11 +++
  2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f43de1e..443b0e3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -229,8 +229,11 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host 
*shost,
   * scsi_check_sense - Examine scsi cmd sense
   * @scmd: Cmd to have sense checked.
   *
- * Return value:
- * SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR
+ * Possible return values:
+ * SUCCESS
+ * FAILED
+ * NEEDS_RETRY
+ * TARGET_ERROR


This is more likely a historical omission, but there is another
possible return value, 'ADD_TO_MLQUEUE', which may be returned by the
device handler's check_sense() or by the case in scsi_check_sense()
below, right?

switch (sshdr.sense_key) {
case HARDWARE_ERROR:
if (scmd->device->retry_hwerror)
return ADD_TO_MLQUEUE;



   *
   * Notes:
   *When a deferred error is detected the current command has
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..12bfa73 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -700,6 +700,17 @@ void scsi_release_buffers(struct scsi_cmnd *cmd)
  }
  EXPORT_SYMBOL(scsi_release_buffers);

+/**
+ * __scsi_error_from_host_byte - translate SCSI error code into errno
+ * @cmd:   SCSI command (unused)
+ * @result:scsi error code
+ *
+ * Translate SCSI error code into standard UNIX errno.
+ * Return values:
+ * -ENOLINKtemporary transport failure
+ * -EREMOTEIO  permanent target failure, do not retry
+ * -EBADE  permanent nexus failure, retry on other path


Sorry, I'm afraid it's not clear to me why '-EIO' is not listed here...

Perhaps some of them need not be documented for some reason?
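
To show what I mean, here is a minimal user-space sketch of such a
translation with the '-EIO' fallback spelled out. It is only an
illustration: sk_error_from_host_byte() and the SK_DID_* names are
hypothetical, the numeric host-byte values are assumptions taken from my
reading of include/scsi/scsi.h, and EREMOTEIO/EBADE are Linux-specific
errno values.

```c
#include <assert.h>
#include <errno.h>

/* Host-byte values as assumed for this sketch. */
enum sk_host_byte {
	SK_DID_OK                 = 0x00,
	SK_DID_TRANSPORT_FAILFAST = 0x0f,
	SK_DID_TARGET_FAILURE     = 0x10,
	SK_DID_NEXUS_FAILURE      = 0x11
};

/* Translate the host byte of a failed command into a negative errno. */
static int sk_error_from_host_byte(int host_byte)
{
	assert(host_byte >= 0 && host_byte <= 0xff);

	switch (host_byte) {
	case SK_DID_TRANSPORT_FAILFAST:
		return -ENOLINK;    /* temporary transport failure */
	case SK_DID_TARGET_FAILURE:
		return -EREMOTEIO;  /* permanent target failure, do not retry */
	case SK_DID_NEXUS_FAILURE:
		return -EBADE;      /* permanent nexus failure, retry on other path */
	default:
		return -EIO;        /* everything else: generic I/O error */
	}
}
```

The default branch is the value that the comment does not mention, which is
why I think it would be worth listing there as well.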

Thanks,
Ren


+ */
  static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
  {
int error = 0;




Re: [PATCH 0/4] New FC timeout handler

2013-05-30 Thread Ren Mingxin

Hi, Hannes:

On 05/24/2013 05:50 PM, Hannes Reinecke wrote:

this is the first step towards a new FC error handler.
This patch implements a new FC command timeout handler
which will be sending command aborts inline without
engaging SCSI EH.

In addition the commands will be returned directly
if the command abort succeeded, cutting down recovery
times dramatically.


For commands which can be aborted successfully, I guess your
patchset has solved the problem that the error handler can't even be
called until host_failed == host_busy, because there is no longer any
need to wait for the EH thread to be scheduled (without engaging SCSI
EH, as you said), right?


For any other return code from 'eh_abort_handler' the command
will be pushed onto the existing SCSI EH handler, or aborted
with an error if that fails.


For commands which can NOT be aborted successfully, there is
no improvement, because the SCSI EH will be invoked as usual.
But should we consider the repeated, time-consuming work involved,
since the SCSI EH handler will try to abort these commands again?

Thanks,
Ren


Re: [PATCH 0/5] scsi: Allow fast io fail without waiting through timeout

2013-05-22 Thread Ren Mingxin

Hi, James,

On 05/20/2013 11:53 PM, James Smart wrote:

Based on the discussion recently held at LSF 2013, we are
reworking the error recovery path to address all the issues
you are mentioning. That work contradicts these patches.
So for now, these should be held off.


Interesting. Could you briefly describe your general goal/idea, even
if only via a reference? Is the URL below the one you would
refer to?
  http://lwn.net/Articles/548500

And could you share your current progress/schedule, especially
when we can expect to see your patches?

Much appreciated!

Thanks,
Ren



On 5/20/2013 3:14 AM, Ren Mingxin wrote:

When a scsi command times out or fails, the scsi eh
tries a thorough recovery, which is necessary for non-redundant
systems. However, the thorough recovery usually takes much time,
which is not acceptable for mission critical systems. To improve
this latency, if we are working on a redundant system, we should
avoid the scsi eh and its long recovery time, and quickly
fail over to another path.

This set of patches tries to implement the above.

NOTE: the userland tools need to ensure the environment
restriction, which will be implemented later.

Thanks,
Ren

Ren Mingxin (5):
   scsi: rename return code FAST_IO_FAIL to FAST_IO
   FC transport: Add interface to specify fast io level for timed-out cmds
   SAS transport: Add interface to specify fast io level for timed-out cmds
   lpfc: Allow fast timed-out io recovery
   mptfusion: Allow fast timed-out io recovery

  drivers/message/fusion/mptscsih.c   |   29 -
  drivers/scsi/lpfc/lpfc_scsi.c   |   34 ++
  drivers/scsi/scsi_error.c   |   18 ++---
  drivers/scsi/scsi_sas_internal.h|4 -
  drivers/scsi/scsi_transport_fc.c|  112 ++--
  drivers/scsi/scsi_transport_iscsi.c |6 -
  drivers/scsi/scsi_transport_sas.c   |  103 -
  include/scsi/scsi.h |2
  include/scsi/scsi_transport_fc.h|   11 +++
  include/scsi/scsi_transport_sas.h   |8 ++
  10 files changed, 303 insertions(+), 24 deletions(-)



[PATCH 4/5] lpfc: Allow fast timed-out io recovery

2013-05-20 Thread Ren Mingxin
This patch implements fast timed-out io recovery in the LLDD (lpfc) by
checking the corresponding bit fields specified in the newly added
interface fast_io_tmo_flags and returning FAST_IO to avoid the
scsi_eh recovery actions on the corresponding levels.

This is mainly for redundant configurations. For non-redundant
systems, the thorough recovery is necessary.

Furthermore, userland tools such as multipath-tools should ensure
that this policy is available only if more than one path is
active, which will be implemented later.

Here is an example which can show the improvement of this patch:

  before:
- takes about 3s to write 800MB normally
# dd if=/dev/zero of=/dev/mapper/mpathb bs=4k count=200000
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 3.10581 s, 264 MB/s

- takes about 105s to write 800MB when I/Os timed out
# grep lpfc_template /proc/kallsyms
a00f83a0 d lpfc_template[lpfc]
# insmod scsi_timeout.ko param=0xa00f83a0,2:0:0:1[*]
# dd if=/dev/zero of=/dev/mapper/mpathb bs=4k count=200000
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 104.91 s, 7.8 MB/s

  after:
- takes about 34s to write 800MB by using this patch when I/Os
  timed out
# echo 0x1f > /sys/devices/pci0000:00/0000:00:03.0/\
   0000:01:00.0/0000:02:01.0/0000:0a:00.0/\
   0000:0b:01.0/0000:0d:00.0/host2/rport-2:0-2/\
   fc_remote_ports/rport-2:0-2/fast_io_tmo_flags
# insmod scsi_timeout.ko param=0xa00f83a0,2:0:0:1
# dd if=/dev/zero of=/dev/mapper/mpathb bs=4k count=200000
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 33.7718 s, 24.3 MB/s

  * scsi_timeout.ko is a self-made module which wraps the scsi
    queuecommand handler and drops I/Os to the specified device,
    so that they are never passed to the LLDD.
Reference:
  http://www.spinics.net/lists/linux-scsi/msg35091.html

So with this patch, we spend time only on writing (about 3s) and
waiting for the timeout (30s), and save about 71s in scsi eh.

Signed-off-by: Ren Mingxin <re...@cn.fujitsu.com>
---
 drivers/scsi/lpfc/lpfc_scsi.c |   34 --
 1 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 8523b27..796893b 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -4798,6 +4798,7 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
 {
struct Scsi_Host  *shost = cmnd->device->host;
struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata;
+   struct fc_rport   *rport = starget_to_rport(scsi_target(cmnd->device));
struct lpfc_hba   *phba = vport->phba;
struct lpfc_iocbq *iocb;
struct lpfc_iocbq *abtsiocb;
@@ -4811,6 +4812,11 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
if (status != 0 && status != SUCCESS)
return status;
 
+   if (rport->fast_io_tmo_flags & FC_RPORT_IGN_ABORT_CMDS) {
+   scsi_device_set_state(cmnd->device, SDEV_OFFLINE);
+   return FAST_IO;
+   }
+
spin_lock_irqsave(&phba->hbalock, flags);
/* driver queued commands are in process of being flushed */
if (phba->hba_flag & HBA_FCP_IOQ_FLUSH) {
@@ -5150,6 +5156,7 @@ lpfc_device_reset_handler(struct scsi_cmnd *cmnd)
 {
struct Scsi_Host  *shost = cmnd->device->host;
struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata;
+   struct fc_rport   *rport = starget_to_rport(scsi_target(cmnd->device));
struct lpfc_rport_data *rdata = cmnd->device->hostdata;
struct lpfc_nodelist *pnode;
unsigned tgt_id = cmnd->device->id;
@@ -5167,6 +5174,11 @@ lpfc_device_reset_handler(struct scsi_cmnd *cmnd)
if (status != 0 && status != SUCCESS)
return status;
 
+   if (rport->fast_io_tmo_flags & FC_RPORT_IGN_DEVICE_RESET) {
+   scsi_device_set_state(cmnd->device, SDEV_OFFLINE);
+   return FAST_IO;
+   }
+
status = lpfc_chk_tgt_mapped(vport, cmnd);
if (status == FAILED) {
lpfc_printf_vlog(vport, KERN_ERR, LOG_FCP,
@@ -5217,6 +5229,7 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
 {
struct Scsi_Host  *shost = cmnd->device->host;
struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata;
+   struct fc_rport   *rport = starget_to_rport(scsi_target(cmnd->device));
struct lpfc_rport_data *rdata = cmnd->device->hostdata;
struct lpfc_nodelist *pnode;
unsigned tgt_id = cmnd->device->id;
@@ -5234,6 +5247,11 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
if (status != 0 && status != SUCCESS)
return status;
 
+   if (rport->fast_io_tmo_flags & FC_RPORT_IGN_TARGET_RESET) {
+   scsi_device_set_state(cmnd->device, SDEV_OFFLINE);
+   return FAST_IO

[PATCH 2/5] FC transport: Add interface to specify fast io level for timed-out cmds

2013-05-20 Thread Ren Mingxin
This patch introduces new sysfs interfaces for fc hosts
and rports that allow users to skip the scsi_eh recovery actions
on different levels when scsi commands time out, e.g.
  /sys/devices/pci***/.../hostN/fc_host/hostN/fast_io_tmo_flags
  /sys/devices/pci***/.../hostN/rport-X:Y-Z/fc_remote_ports/\
rport-X:Y-Z/fast_io_tmo_flags

The newly added interface fast_io_tmo_flags is an 8-bit mask of which
the low 5 bits are used so far:
  0x01 - Ignore aborting commands
  0x02 - Ignore device resets
  0x04 - Ignore target resets
  0x08 - Ignore bus resets
  0x10 - Ignore host resets

When scsi_eh unjams hosts, the corresponding bit fields will be
checked by the LLDD to decide whether to ignore the specified recovery
levels. The value is zero by default, which keeps the existing
behavior that is necessary for non-redundant systems.
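
As a rough user-space illustration of how such a mask could be parsed and
tested (a sketch under assumptions, not the code of this patch): the first
three macro names follow the ones used in the lpfc patch of this series,
while SK_IGN_BUS_RESET, SK_IGN_HOST_RESET, sk_parse_flags() and
sk_level_ignored() are hypothetical, and strtoul() stands in for the
in-kernel string parser.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bit values from the list above. */
#define FC_RPORT_IGN_ABORT_CMDS    0x01
#define FC_RPORT_IGN_DEVICE_RESET  0x02
#define FC_RPORT_IGN_TARGET_RESET  0x04
#define SK_IGN_BUS_RESET           0x08   /* hypothetical name */
#define SK_IGN_HOST_RESET          0x10   /* hypothetical name */

/* Parse a user-supplied string such as "0x1f" into the 8-bit mask. */
static int sk_parse_flags(const char *buf, uint8_t *val)
{
	char *end;
	unsigned long v;

	assert(buf != NULL && val != NULL);
	v = strtoul(buf, &end, 0);      /* base 0: accepts decimal, octal and hex */
	if (end == buf)
		return -EINVAL;         /* no digits were consumed */
	*val = (uint8_t)(v & 0xff);     /* keep only the low 8 bits */
	return 0;
}

/* Non-zero when the given recovery level should be skipped. */
static int sk_level_ignored(uint8_t flags, uint8_t level_bit)
{
	return (flags & level_bit) != 0;
}
```

For example, writing "0x05" would skip aborts and target resets but still
allow device, bus and host resets, and the default of 0 skips nothing.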

This interface is mainly for redundant environments. Redundant
systems need to give up quickly and fail over,
instead of a thorough recovery which usually takes much time.

The actions in the LLDD and in redundant configurations should be
implemented individually later.

Signed-off-by: Ren Mingxin <re...@cn.fujitsu.com>
---
 drivers/scsi/scsi_transport_fc.c |  108 +-
 include/scsi/scsi_transport_fc.h |   11 
 2 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 7b29e00..155a658 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -310,9 +310,9 @@ static void fc_scsi_scan_rport(struct work_struct *work);
  * Increase these values if you add attributes
  */
 #define FC_STARGET_NUM_ATTRS   3
-#define FC_RPORT_NUM_ATTRS 10
+#define FC_RPORT_NUM_ATTRS 11
 #define FC_VPORT_NUM_ATTRS 9
-#define FC_HOST_NUM_ATTRS  29
+#define FC_HOST_NUM_ATTRS  30
 
 struct fc_internal {
struct scsi_transport_template t;
@@ -995,6 +995,67 @@ store_fc_rport_fast_io_fail_tmo(struct device *dev,
 static FC_DEVICE_ATTR(rport, fast_io_fail_tmo, S_IRUGO | S_IWUSR,
show_fc_rport_fast_io_fail_tmo, store_fc_rport_fast_io_fail_tmo);
 
+/*
+ * fast_io_tmo_flags attribute
+ */
+static ssize_t
+show_fc_rport_fast_io_tmo_flags(struct device *dev,
+   struct device_attribute *attr,
+   char *buf)
+{
+   struct fc_rport *rport = transport_class_to_rport(dev);
+
+   return sprintf(buf, "0x%02x\n", rport->fast_io_tmo_flags);
+}
+
+static int fc_str_to_fast_io_tmo_flags(const char *buf, u8 *val)
+{
+   char *cp;
+
+   *val = simple_strtoul(buf, &cp, 0) & 0xff;
+   if (cp == buf)
+   return -EINVAL;
+
+   return 0;
+}
+
+static int fc_rport_set_fast_io_tmo_flags(struct fc_rport *rport, u8 val)
+{
+   if ((rport->port_state == FC_PORTSTATE_BLOCKED) ||
+   (rport->port_state == FC_PORTSTATE_DELETED) ||
+   (rport->port_state == FC_PORTSTATE_NOTPRESENT))
+   return -EBUSY;
+
+   rport->fast_io_tmo_flags = val;
+
+   return 0;
+}
+
+static ssize_t
+store_fc_rport_fast_io_tmo_flags(struct device *dev,
+struct device_attribute *attr,
+const char *buf,
+size_t count)
+{
+   struct fc_rport *rport = transport_class_to_rport(dev);
+   u8 val;
+   int rc;
+
+   if (count < 1)
+   return -EINVAL;
+
+   rc = fc_str_to_fast_io_tmo_flags(buf, &val);
+   if (rc)
+   return rc;
+
+   rc = fc_rport_set_fast_io_tmo_flags(rport, val);
+   if (rc)
+   return rc;
+   return count;
+}
+static FC_DEVICE_ATTR(rport, fast_io_tmo_flags, S_IRUGO | S_IWUSR,
+   show_fc_rport_fast_io_tmo_flags, store_fc_rport_fast_io_tmo_flags);
+
 
 /*
  * FC SCSI Target Attribute Management
@@ -1679,6 +1740,47 @@ static FC_DEVICE_ATTR(host, dev_loss_tmo, S_IRUGO | S_IWUSR,
  show_fc_host_dev_loss_tmo,
  store_fc_private_host_dev_loss_tmo);
 
+static ssize_t
+show_fc_private_host_fast_io_tmo_flags (struct device *dev,
+   struct device_attribute *attr,
+   char *buf)
+{
+   struct Scsi_Host *shost = transport_class_to_shost(dev);
+
+   return sprintf(buf, "0x%02x\n", fc_host_fast_io_tmo_flags(shost));
+}
+
+static ssize_t
+store_fc_private_host_fast_io_tmo_flags(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf,
+   size_t count)
+{
+   struct Scsi_Host *shost = transport_class_to_shost(dev);
+   struct fc_host_attrs *fc_host = shost_to_fc_host(shost);
+   struct fc_rport *rport;
+   u8 val;
+   int rc;
+   unsigned long flags;
+
+   if (count < 1)
+   return -EINVAL;
+
+   rc

[PATCH 1/5] scsi: rename return code FAST_IO_FAIL to FAST_IO

2013-05-20 Thread Ren Mingxin
The return code FAST_IO_FAIL was introduced for fast failed io
recovery. To use this code for fast timed-out io recovery as well,
rename it to FAST_IO.

Signed-off-by: Ren Mingxin re...@cn.fujitsu.com
---
 drivers/scsi/scsi_error.c   |   18 +-
 drivers/scsi/scsi_transport_fc.c|4 ++--
 drivers/scsi/scsi_transport_iscsi.c |6 +++---
 include/scsi/scsi.h |2 +-
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f43de1e..9e8e37a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1067,9 +1067,9 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
  "0x%p\n", current->comm,
  scmd));
 rtn = scsi_try_to_abort_cmd(scmd->device->host->hostt, scmd);
-   if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
+   if (rtn == SUCCESS || rtn == FAST_IO) {
 scmd->eh_eflags &= ~SCSI_EH_CANCEL_CMD;
-   if (rtn == FAST_IO_FAIL)
+   if (rtn == FAST_IO)
scsi_eh_finish_cmd(scmd, done_q);
else
 list_move_tail(&scmd->eh_entry, &check_list);
@@ -1195,9 +1195,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
   " 0x%p\n", current->comm,
  sdev));
rtn = scsi_try_bus_device_reset(bdr_scmd);
-   if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
+   if (rtn == SUCCESS || rtn == FAST_IO) {
if (!scsi_device_online(sdev) ||
-   rtn == FAST_IO_FAIL ||
+   rtn == FAST_IO ||
!scsi_eh_tur(bdr_scmd)) {
list_for_each_entry_safe(scmd, next,
 work_q, eh_entry) {
@@ -1248,7 +1248,7 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost,
  "to target %d\n",
  current->comm, id));
 rtn = scsi_try_target_reset(scmd);
-   if (rtn != SUCCESS && rtn != FAST_IO_FAIL)
+   if (rtn != SUCCESS && rtn != FAST_IO)
 SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Target reset"
   " failed target: "
  "%d\n",
@@ -1259,7 +1259,7 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost,
 
if (rtn == SUCCESS)
 list_move_tail(&scmd->eh_entry, &check_list);
-   else if (rtn == FAST_IO_FAIL)
+   else if (rtn == FAST_IO)
scsi_eh_finish_cmd(scmd, done_q);
else
 /* push back on work queue for further processing */
@@ -1311,10 +1311,10 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost,
   " %d\n", current->comm,
  channel));
rtn = scsi_try_bus_reset(chan_scmd);
-   if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
+   if (rtn == SUCCESS || rtn == FAST_IO) {
list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
if (channel == scmd_channel(scmd)) {
-   if (rtn == FAST_IO_FAIL)
+   if (rtn == FAST_IO)
scsi_eh_finish_cmd(scmd,
   done_q);
else
@@ -1354,7 +1354,7 @@ static int scsi_eh_host_reset(struct list_head *work_q,
rtn = scsi_try_host_reset(scmd);
if (rtn == SUCCESS) {
 list_splice_init(work_q, &check_list);
-   } else if (rtn == FAST_IO_FAIL) {
+   } else if (rtn == FAST_IO) {
list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
scsi_eh_finish_cmd(scmd, done_q);
}
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index e106c27..7b29e00 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3301,7 +3301,7 @@ fc_scsi_scan_rport(struct work_struct *work)
  * rports which would lead to offlined SCSI devices.
  *
  * Returns: 0 if the fc_rport left the state FC_PORTSTATE_BLOCKED.
- * FAST_IO_FAIL if the fast_io_fail_tmo fired, this should

[PATCH 5/5] mptfusion: Allow fast timed-out io recovery

2013-05-20 Thread Ren Mingxin
This patch implements fast timed-out io recovery in the LLDD
(mptfusion) by checking the corresponding bits in the newly added
fast_io_tmo_flags interface and returning FAST_IO to skip the
scsi_eh recovery actions at the corresponding levels.

This is mainly for redundant configurations. For non-redundant
systems, the thorough recovery is necessary.

Furthermore, userland tools such as mdadm should ensure that this
policy is available only if more than one mirrored device is
active; that will be implemented later.

NOTE: the device reset handler isn't implemented and the bus reset
handler isn't defined for mptsas_driver_template.

Here is an example which can show the improvement of this patch on
md-raid1 devices:

  before:
- takes about 69s to write 8GB normally
# dd if=/dev/zero of=/dev/md0 bs=4k count=2000000
2000000+0 records in
2000000+0 records out
8192000000 bytes (8.2 GB) copied, 68.7898 s, 119 MB/s

- takes about 188s to write 8GB when I/Os timed out
# grep mptsas_driver_template /proc/kallsyms
a00485c0 d mptsas_driver_template   [mptsas]
# insmod scsi_timeout.ko param=0xa00485c0,1:0:1:0[*]
# dd if=/dev/zero of=/dev/md0 bs=4k count=2000000
2000000+0 records in
2000000+0 records out
8192000000 bytes (8.2 GB) copied, 187.857 s, 43.6 MB/s

  after:
- takes about 129s to write 8GB by using this patch when I/Os
  timed out
# echo 0x1f > /sys/devices/pci0000:00/0000:00:03.0/\
   0000:01:00.0/0000:02:00.0/0000:03:00.0/\
   0000:04:03.0/0000:08:00.0/host1/port-1:1/\
   end_device-1:1/sas_device/end_device-1:1/\
   fast_io_tmo_flags
# insmod scsi_timeout.ko param=0xa00485c0,1:0:1:0
# dd if=/dev/zero of=/dev/md127 bs=4k count=2000000
2000000+0 records in
2000000+0 records out
8192000000 bytes (8.2 GB) copied, 129.478 s, 63.3 MB/s

  * scsi_timeout.ko is a self-made module which wraps the scsi
    queuecommand handler and drops I/Os to the specified device, so
    that none of those I/Os are passed to the LLDD.
    Reference:
      http://www.spinics.net/lists/linux-scsi/msg35091.html

So with this patch, we spend time only on writing (about 69s) and
on waiting for the timeout (60s), and save about 59s in scsi eh.

Signed-off-by: Ren Mingxin re...@cn.fujitsu.com
---
 drivers/message/fusion/mptscsih.c |   29 +++--
 1 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c
index 727819c..47ef776 100644
--- a/drivers/message/fusion/mptscsih.c
+++ b/drivers/message/fusion/mptscsih.c
@@ -62,6 +62,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_tcq.h>
 #include <scsi/scsi_dbg.h>
+#include <scsi/scsi_transport_sas.h>
 
 #include "mptbase.h"
 #include "mptscsih.h"
@@ -1698,6 +1699,12 @@ mptscsih_abort(struct scsi_cmnd * SCpnt)
int  retval;
VirtDevice   *vdevice;
MPT_ADAPTER *ioc;
+   struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target);
+
+   if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_ABORT_CMDS) {
+   scsi_device_set_state(SCpnt->device, SDEV_OFFLINE);
+   return FAST_IO;
+   }
 
/* If we can't locate our host adapter structure, return FAILED status.
 */
@@ -1818,6 +1825,12 @@ mptscsih_dev_reset(struct scsi_cmnd * SCpnt)
int  retval;
VirtDevice   *vdevice;
MPT_ADAPTER *ioc;
+   struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target);
+
+   if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_TARGET_RESET) {
+   scsi_device_set_state(SCpnt->device, SDEV_OFFLINE);
+   return FAST_IO;
+   }
 
/* If we can't locate our host adapter structure, return FAILED status.
 */
@@ -1878,6 +1891,12 @@ mptscsih_bus_reset(struct scsi_cmnd * SCpnt)
int  retval;
VirtDevice   *vdevice;
MPT_ADAPTER *ioc;
+   struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target);
+
+   if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_BUS_RESET) {
+   scsi_device_set_state(SCpnt->device, SDEV_OFFLINE);
+   return FAST_IO;
+   }
 
/* If we can't locate our host adapter structure, return FAILED status.
 */
@@ -1924,10 +1943,16 @@ mptscsih_bus_reset(struct scsi_cmnd * SCpnt)
 int
 mptscsih_host_reset(struct scsi_cmnd *SCpnt)
 {
-   MPT_SCSI_HOST *  hd;
-   int  status = SUCCESS;
+   MPT_SCSI_HOST   *hd;
+   int status = SUCCESS;
MPT_ADAPTER *ioc;
int retval;
+   struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target);
+
+   if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_HOST_RESET) {
+   scsi_device_set_state(SCpnt->device, SDEV_OFFLINE);
+   return FAST_IO;
+   }
 
/*  If we can't locate

[PATCH 3/5] SAS transport: Add interface to specify fast io level for timed-out cmds

2013-05-20 Thread Ren Mingxin
This patch introduces new sysfs interfaces for sas hosts and rphys
that let users skip the scsi_eh recovery actions at individual
levels when scsi commands time out, e.g.
  /sys/devices/pci***/.../hostN/sas_host/hostN/fast_io_tmo_flags
  /sys/devices/pci***/.../hostN/port-X:Y/end_device-X:Y/\
  sas_device/end_device-X:Y/fast_io_tmo_flags

The newly added fast_io_tmo_flags interface is an 8-bit mask; the
low 5 bits are defined so far:
  0x01 - Ignore aborting commands
  0x02 - Ignore device resets
  0x04 - Ignore target resets
  0x08 - Ignore bus resets
  0x10 - Ignore host resets

When scsi_eh unjams a host, the LLDD checks the corresponding bits
to decide whether to skip the specified recovery levels. The value
is zero by default, which keeps the existing behavior; that is
necessary for non-redundant systems.

This interface is mainly for redundant environments, which need to
give up quickly and fail over instead of running a thorough
recovery that usually takes a long time.

The LLDD actions and the handling for redundant configurations
will be implemented separately later.
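
For illustration, a string written to fast_io_tmo_flags is parsed
roughly as sketched below. This is a userspace approximation under
stated assumptions: strtoul() stands in for the kernel's
simple_strtoul(), the function name is hypothetical, and unlike the
helper in the patch it validates the input before storing the result.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a flags string such as "0x1f\n" into an 8-bit mask.
 * Returns 0 on success, -EINVAL when no digits could be parsed.
 */
static int parse_fast_io_tmo_flags(const char *buf, uint8_t *val)
{
	char *end;
	unsigned long v;

	assert(buf && val);

	/* Base 0 accepts "0x..." (hex), "0..." (octal) and decimal. */
	v = strtoul(buf, &end, 0);
	if (end == buf)		/* nothing was consumed */
		return -EINVAL;

	*val = (uint8_t)(v & 0xff);	/* keep only the low 8 bits */
	return 0;
}
```

Values wider than 8 bits are silently truncated by the mask, so
"0x1ff" is stored as 0xff rather than rejected.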

Signed-off-by: Ren Mingxin re...@cn.fujitsu.com
---
 drivers/scsi/scsi_sas_internal.h  |4 +-
 drivers/scsi/scsi_transport_sas.c |  103 -
 include/scsi/scsi_transport_sas.h |8 +++
 3 files changed, 112 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_sas_internal.h b/drivers/scsi/scsi_sas_internal.h
index 6266a5d..8c7ab08 100644
--- a/drivers/scsi/scsi_sas_internal.h
+++ b/drivers/scsi/scsi_sas_internal.h
@@ -1,10 +1,10 @@
 #ifndef _SCSI_SAS_INTERNAL_H
 #define _SCSI_SAS_INTERNAL_H
 
-#define SAS_HOST_ATTRS 0
+#define SAS_HOST_ATTRS 1
 #define SAS_PHY_ATTRS  17
 #define SAS_PORT_ATTRS 1
-#define SAS_RPORT_ATTRS 7
+#define SAS_RPORT_ATTRS 8
 #define SAS_END_DEV_ATTRS  5
 #define SAS_EXPANDER_ATTRS 7
 
diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 1b68142..960f3e5 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -37,6 +37,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_transport.h>
 #include <scsi/scsi_transport_sas.h>
+#include <scsi/scsi_cmnd.h>
 
 #include "scsi_sas_internal.h"
 struct sas_host_attrs {
@@ -46,6 +47,7 @@ struct sas_host_attrs {
u32 next_target_id;
u32 next_expander_id;
int next_port_id;
+   u8 fast_io_tmo_flags;
 };
 #define to_sas_host_attrs(host) ((struct sas_host_attrs *)(host)->shost_data)
 
@@ -277,6 +279,59 @@ static void sas_bsg_remove(struct Scsi_Host *shost, struct sas_rphy *rphy)
  * SAS host attributes
  */
 
+static ssize_t
+show_sas_private_host_fast_io_tmo_flags(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct Scsi_Host *shost = dev_to_shost(dev);
+   struct sas_host_attrs *sas_host = to_sas_host_attrs(shost);
+
+   return sprintf(buf, "0x%02x\n", sas_host->fast_io_tmo_flags);
+}
+
+static int sas_str_to_fast_io_tmo_flags(const char *buf, u8 *val)
+{
+   char *cp;
+
+   *val = simple_strtoul(buf, &cp, 0) & 0xff;
+   if (cp == buf)
+   return -EINVAL;
+
+   return 0;
+}
+
+static ssize_t
+store_sas_private_host_fast_io_tmo_flags(struct device *dev,
+struct device_attribute *attr,
+const char *buf,
+size_t count)
+{
+   struct Scsi_Host *shost = dev_to_shost(dev);
+   struct sas_host_attrs *sas_host = to_sas_host_attrs(shost);
+   struct sas_rphy *rphy;
+   u8 val;
+   int rc;
+   unsigned long flags;
+
+   if (count < 1)
+   return -EINVAL;
+
+   rc = sas_str_to_fast_io_tmo_flags(buf, &val);
+   if (rc)
+   return rc;
+
+   sas_host->fast_io_tmo_flags = val;
+   spin_lock_irqsave(shost->host_lock, flags);
+   list_for_each_entry(rphy, &sas_host->rphy_list, list)
+   rphy->fast_io_tmo_flags = val;
+   spin_unlock_irqrestore(shost->host_lock, flags);
+   return count;
+}
+
+static SAS_DEVICE_ATTR(host, fast_io_tmo_flags, S_IRUGO | S_IWUSR,
+   show_sas_private_host_fast_io_tmo_flags,
+   store_sas_private_host_fast_io_tmo_flags);
+
 static int sas_host_setup(struct transport_container *tc, struct device *dev,
  struct device *cdev)
 {
@@ -1267,6 +1322,38 @@ sas_rphy_simple_attr(identify.sas_address, sas_address, "0x%016llx\n",
 unsigned long long);
 sas_rphy_simple_attr(identify.phy_identifier, phy_identifier, "%d\n", u8);
 
+static ssize_t show_sas_rphy_fast_io_tmo_flags (struct device *dev,
+   struct device_attribute *attr,
+   char *buf)
+{
+   struct sas_rphy *rphy

[PATCH 0/5] scsi: Allow fast io fail without waiting through timeout

2013-05-20 Thread Ren Mingxin
When a scsi command times out or fails, the scsi eh tries a
thorough recovery, which is necessary for non-redundant systems.
However, a thorough recovery usually takes a long time, which is
not acceptable for mission-critical systems. To reduce this
latency on a redundant system, we should skip the lengthy scsi eh
recovery and fail over quickly to another path.

This patch set tries to implement the above.

NOTE: the userland tools need to ensure the environment
restriction, which will be implemented later.

Thanks,
Ren

Ren Mingxin (5):
  scsi: rename return code FAST_IO_FAIL to FAST_IO
  FC transport: Add interface to specify fast io level for timed-out cmds
  SAS transport: Add interface to specify fast io level for timed-out cmds
  lpfc: Allow fast timed-out io recovery
  mptfusion: Allow fast timed-out io recovery

 drivers/message/fusion/mptscsih.c   |   29 -
 drivers/scsi/lpfc/lpfc_scsi.c   |   34 ++
 drivers/scsi/scsi_error.c   |   18 ++---
 drivers/scsi/scsi_sas_internal.h|4 -
 drivers/scsi/scsi_transport_fc.c|  112 ++--
 drivers/scsi/scsi_transport_iscsi.c |6 -
 drivers/scsi/scsi_transport_sas.c   |  103 -
 include/scsi/scsi.h |2 
 include/scsi/scsi_transport_fc.h|   11 +++
 include/scsi/scsi_transport_sas.h   |8 ++
 10 files changed, 303 insertions(+), 24 deletions(-)



[PATCH] scsi_dh: remove unused declaration dm_pg_init_complete()

2013-04-16 Thread Ren Mingxin
This patch removes dm_pg_init_complete()'s declaration as it is
not needed anymore since 2651f5d7d3bc5120a439e498f131e4d731f99b3e.

Signed-off-by: Ren Mingxin re...@cn.fujitsu.com
---
 drivers/md/dm-mpath.h |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-mpath.h b/drivers/md/dm-mpath.h
index e230f71..9c36d0f 100644
--- a/drivers/md/dm-mpath.h
+++ b/drivers/md/dm-mpath.h
@@ -16,7 +16,4 @@ struct dm_path {
 void *pscontext; /* For path-selector use */
 };
 
-/* Callback for hwh_pg_init_fn to use when complete */
-void dm_pg_init_complete(struct dm_path *path, unsigned err_flags);
-
 #endif
-- 
1.7.1



Re: error handler scheduling

2013-04-12 Thread Ren Mingxin

On 03/29/2013 12:02 AM, Elliott, Robert (Server Storage) wrote:

There are several possible reasons for SCSI command timeouts:
 a) the command request did not get to the SCSI target port and logical
unit (e.g., error on the wire)
 b) logical unit is still working on the command
 c) the command completed, but status didn't get to the SCSI initiator port
and application client (e.g., error on the wire)

SCSI doesn't have a good way to detect case (c). For status delivery errors
detected by the logical unit, I once proposed that the logical unit establish
a unit attention condition and record the status delivery problem in a log
page (T10 proposal 04-072) but this proposal didn't draw much interest. The
QUERY TASK task management function can detect case (b) vs. the other cases.

With SSDs, a lengthy timeout derived from ancient SCSI floppy drives doesn't
make sense. Timeouts should scale automatically based on the device type
(e.g., use microseconds for SSDs and seconds for HDDs). The REPORT
SUPPORTED OPERATION CODES command provides some command timeout values
to facilitate this.

For Base feature set drives I'm encouraging an approach like this for
handling command timeouts:

1) at discovery time:
 1a) send REPORT SUPPORTED OPERATION CODES to determine the nominal
 and maximum command timeouts
 1b) send REPORT SUPPORTED TASK MANAGEMENT FUNCTION to determine
 the TMF timeouts

2) send the command (e.g., READ, WRITE, FORMAT UNIT, ...)

If status arrives for the command at any time, exit out of this procedure.
If an I_T nexus loss occurs, then that handling overrides this procedure
as well. Otherwise:

3) if the nominal command timeout is long (e.g., for a command like FORMAT
UNIT with IMMED=0, but not for IO commands like READ and WRITE), then wait
a short time and send QUERY TASK to ensure the command got there:
 3a) if the command is not there (probably lost in delivery, but
 possibly lost status), go to step (2) to resend the command
 3b) if the command is still being processed, keep waiting

4) if the nominal command timeout is reached, send QUERY TASK to determine
what is happening:
 4a) if the command is not there (if step (3) was run, then this
 probably means lost status), go to step (2) to resend the command
 4b) if the command is still being processed, keep waiting

5) if the maximum command timeout is reached, send QUERY TASK to determine
what is happening:
 5a) if the command is not there (since step (4) was run, this
  probably means lost status), go to step (2) to resend the command
 5b) if the command is still being processed, proceed to step (6)
 to abort the command

6) send ABORT TASK to abort the command

7) If ABORT TASK succeeds, either:
 7a) escalate to a stronger TMF or hard reset if this command
keeps having repeated problems; or
 7b) go to step (2) to resend the command

8) If the ABORT TASK timeout is reached, either:
 8a) escalate to a stronger TMF or hard reset, then go to step (2)
 to resend the command; or
 8b) declare the logical unit is unavailable
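
The decision points in steps (3) through (8) above can be sketched as
two small functions. This is only an illustrative model: all type and
function names are hypothetical, the QUERY TASK and ABORT TASK
exchanges themselves are not modeled, and the choice between (8a) and
(8b) is reduced to a can_escalate flag, which is an assumption made
for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* When QUERY TASK is sent: early check (3), nominal (4), maximum (5). */
enum check_point { CHECK_EARLY, CHECK_NOMINAL_TIMEOUT, CHECK_MAX_TIMEOUT };
enum query_task_result { TASK_NOT_PRESENT, TASK_IN_PROGRESS };
enum next_step { STEP_RESEND, STEP_KEEP_WAITING, STEP_ABORT_TASK };

/* Steps 3a/4a/5a resend; 3b/4b keep waiting; 5b proceeds to ABORT TASK. */
static enum next_step after_query_task(enum check_point when,
				       enum query_task_result res)
{
	assert(when <= CHECK_MAX_TIMEOUT);

	if (res == TASK_NOT_PRESENT)
		return STEP_RESEND;
	if (when == CHECK_MAX_TIMEOUT)
		return STEP_ABORT_TASK;
	return STEP_KEEP_WAITING;
}

enum abort_outcome { ABORT_SUCCEEDED, ABORT_TIMED_OUT };
enum recovery {
	RECOVER_RESEND,			/* 7b */
	RECOVER_ESCALATE,		/* 7a */
	RECOVER_ESCALATE_THEN_RESEND,	/* 8a */
	RECOVER_LU_UNAVAILABLE,		/* 8b */
};

/* Steps 7 and 8: what to do once ABORT TASK has succeeded or timed out. */
static enum recovery after_abort_task(enum abort_outcome out,
				      bool repeated_problems,
				      bool can_escalate)
{
	if (out == ABORT_SUCCEEDED)
		return repeated_problems ? RECOVER_ESCALATE : RECOVER_RESEND;
	return can_escalate ? RECOVER_ESCALATE_THEN_RESEND
			    : RECOVER_LU_UNAVAILABLE;
}
```
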

Doug: for ***, In addition to WSNZ bit now letting the drive not support
the value of zero, T10 proposal 13-052 changes WRITE SAME so the NUMBER
OF LOGICAL BLOCKS set to zero (if supported) must honor the MAXIMUM WRITE
SAME LENGTH field, so the drive can provide a reasonable timeout value
for the command (not worry that the entire capacity might be specified).


Please let me summarize what this thread has discussed about scsi
eh latency:

1) some scsi cmds' timeout values are inappropriate; we can avoid
   timeouts by:
   a) having sg_format set the IMMED bit and use TEST UNIT READY or
      REQUEST SENSE polling to monitor progress - by Douglas
   b) cutting a big cmd into several reasonably sized ones - by Douglas
   c) choosing timeout values according to device type - by Elliott
2) call ->done() on the command after lun reset - by Hannes

And my question is:
- could we wake up the eh thread ASAP, instead of waiting for all
  cmds to complete, so that it is scheduled faster?

BTW: my original question is here:
http://www.spinics.net/lists/linux-scsi/msg65107.html

Thanks,
Ren


---
Rob Elliott    HP Server Storage




-----Original Message-----
From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Douglas Gilbert
Sent: Wednesday, 27 March, 2013 9:39 AM
To: james.sm...@emulex.com
Cc: linux-scsi@vger.kernel.org
Subject: Re: error handler scheduling

On 13-03-26 10:11 PM, James Smart wrote:

In looking through the error handler, if a command times out and is added to
the eh_cmd_q for the shost, the error handler is only awakened once
shost->host_busy (total number of i/os posted to the shost) is equal to
shost->host_failed (number of i/o that have been failed and put on the
eh_cmd_q).  Which means, any other i/o that was outstanding must either
complete or have their timeout fire.


scsi_error: improve the recovery latency for timeouted scsi cmds

2013-03-19 Thread Ren Mingxin

Hi,

Please let me ask one question about improving the recovery latency
for timed-out scmds:

The functions 'scsi_eh_wakeup()' and 'scsi_error_handler()' both
check the same condition, which ensures that the number of active
scmds equals the number of failed scmds:

  void scsi_eh_wakeup(struct Scsi_Host *shost)
  {
          if (shost->host_busy == shost->host_failed)
                  wake_up_process(shost->ehandler);
  }

  int scsi_error_handler(void *data)
  {
  while (!kthread_should_stop()) {
                  if ((shost->host_failed == 0 &&
                       shost->host_eh_scheduled == 0) ||
                       shost->host_failed != shost->host_busy) {
  schedule();
  continue;
  }
  
  }
  
  }

I think the original reason for waking up the eh thread only after
all scmds have completed or failed may be to avoid the overhead of
waking the thread up again and again, right?

But in the condition below, that strategy does not seem appropriate:

  If a scmd is issued and gets stuck, and another scmd is then issued,
  scsi eh detects the timeout of the first scmd but has to wait for
  the second one to time out or complete. This means the first
  timed-out scmd cannot be handled in time.
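
The scenario can be reproduced with a small userspace model of the
sleep condition quoted above. The struct and function names are
hypothetical; only the three counters and the condition itself are
taken from 'scsi_error_handler()'.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the counters kept in struct Scsi_Host. */
struct host_counters {
	unsigned int host_busy;		/* commands outstanding on the host */
	unsigned int host_failed;	/* commands queued for error handling */
	unsigned int host_eh_scheduled;	/* explicit requests to run eh */
};

/* Mirrors the condition under which the eh thread goes back to sleep. */
static bool eh_must_sleep(const struct host_counters *h)
{
	assert(h);

	return (h->host_failed == 0 && h->host_eh_scheduled == 0) ||
	       h->host_failed != h->host_busy;
}
```

With two commands outstanding and only the first one timed out
(host_busy = 2, host_failed = 1), eh_must_sleep() returns true, so
recovery of the first command does not start until the second one
completes or times out as well.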

This may be fatal to a certain extent (especially for critical
systems). So please let me know the rationale behind the wakeup
strategy in eh. We will investigate further based on your comments.
Any suggestions will be appreciated.

Thanks,
Ren


[PATCH] lpfc: init: fix misspelling word in mailbox command waiting comments

2012-12-10 Thread Ren Mingxin
Correct misspelling of "outstanding" in mailbox command waiting comments.

Signed-off-by: Ren Mingxin re...@cn.fujitsu.com
Signed-off-by: Pan Dayu pandy.f...@cn.fujitsu.com
---
 drivers/scsi/lpfc/lpfc_init.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 7dc4218..8533160 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -2566,7 +2566,7 @@ lpfc_block_mgmt_io(struct lpfc_hba *phba, int mbx_action)
}
 spin_unlock_irqrestore(&phba->hbalock, iflag);
 
-   /* Wait for the outstnading mailbox command to complete */
+   /* Wait for the outstanding mailbox command to complete */
 while (phba->sli.mbox_active) {
/* Check active mailbox complete status every 2ms */
msleep(2);
-- 
1.7.1



Re: [PATCH] lpfc: init: fix misspelling word in mailbox command waiting comments

2012-12-10 Thread Ren Mingxin

On 12/11/2012 11:53 AM, re...@cn.fujitsu.com wrote:

From: Ren Mingxin re...@cn.fujitsu.com


Superfluous, sorry for disturbing everyone :-(

Ren