Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-24 Thread Bart Van Assche

On 02/18/13 09:11, Sagi Grimberg wrote:

On 2/18/2013 6:06 AM, David Dillow wrote:

On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -734,6 +734,7 @@ static int srp_reconnect_target(struct
srp_target_port *target)
  scsi_target_unblock(shost-shost_gendev, ret == 0 ?
SDEV_RUNNING :
  SDEV_TRANSPORT_OFFLINE);
+target-transport_offline = ret != 0;

Minor nit, that line is hard to read; I keep thinking it needs parens
around the conditional...

Perhaps
target-transport_offline = !!ret;
or
target-transport_offline = ret;

gcc should do the right conversion since we're assigning to a bool.


Or, Vu, does this solve the issue you've seen? I may have time to test
later this week, but not before.


This indeed solve scsi_host removal issues.
Vu is on vacation, I'll perform some more failover tests...


Hello Sagi,

Since no further feedback was posted on the list I assume that means 
that all tests passed ?


Bart.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-24 Thread Sagi Grimberg

On 2/24/2013 10:09 AM, Bart Van Assche wrote:

On 02/18/13 09:11, Sagi Grimberg wrote:

On 2/18/2013 6:06 AM, David Dillow wrote:

On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -734,6 +734,7 @@ static int srp_reconnect_target(struct
srp_target_port *target)
  scsi_target_unblock(shost-shost_gendev, ret == 0 ?
SDEV_RUNNING :
  SDEV_TRANSPORT_OFFLINE);
+target-transport_offline = ret != 0;

Minor nit, that line is hard to read; I keep thinking it needs parens
around the conditional...

Perhaps
target-transport_offline = !!ret;
or
target-transport_offline = ret;

gcc should do the right conversion since we're assigning to a bool.


Or, Vu, does this solve the issue you've seen? I may have time to test
later this week, but not before.


This indeed solve scsi_host removal issues.
Vu is on vacation, I'll perform some more failover tests...


Hello Sagi,

Since no further feedback was posted on the list I assume that means 
that all tests passed ?


Bart.



Hey Bart,
Sorry for the delay I was just about to reply...

From my end, the related patchset seems solve the scsi_host removal 
issue and prevents the SCSI error handling loop.
Generally our tests passed, I still have some issue with long-term 
failover test but I'm not sure its SRP (perhaps might origin in IB layer).

So ack from me...

-Sagi
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-24 Thread Or Gerlitz
On Sun, Feb 24, 2013 at 10:59 AM, Sagi Grimberg sa...@mellanox.com wrote:
 On 2/24/2013 10:09 AM, Bart Van Assche wrote:
 On 02/18/13 09:11, Sagi Grimberg wrote:
 On 2/18/2013 6:06 AM, David Dillow wrote:

 Or, Vu, does this solve the issue you've seen? I may have time to test
 later this week, but not before.

 This indeed solve scsi_host removal issues.
 Vu is on vacation, I'll perform some more failover tests...

 Hello Sagi, Since no further feedback
 was posted on the list I assume that means that all tests passed ?

 Hey Bart, Sorry for the delay I was just about to reply...
 From my end, the related patchset seems solve the scsi_host removal issue
 and prevents the SCSI error handling loop.


 Generally our tests passed, I still have some issue with long-term
 failover test but I'm not sure its SRP (perhaps might origin in IB layer).
 So ack from me...

OK

Dave/Roland - sounds like we did managed to get real progress here...
lets get these fix in for 3.9 and 3.8-stable too,

Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-21 Thread Bart Van Assche

On 02/18/13 05:06, David Dillow wrote:

On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port 
*target)

scsi_target_unblock(shost-shost_gendev, ret == 0 ? SDEV_RUNNING :
SDEV_TRANSPORT_OFFLINE);
+   target-transport_offline = ret != 0;


Minor nit, that line is hard to read; I keep thinking it needs parens
around the conditional...

Perhaps
target-transport_offline = !!ret;
or
target-transport_offline = ret;

gcc should do the right conversion since we're assigning to a bool.


Personally I prefer changing the assignment into the first alternative 
since it's more explicit than the second alternative - it doesn't 
require the person who's reading the source code that the 
transport_offline variable has been declared as bool.


Bart.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-18 Thread Sagi Grimberg

On 2/18/2013 6:06 AM, David Dillow wrote:

On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port 
*target)
  
  	scsi_target_unblock(shost-shost_gendev, ret == 0 ? SDEV_RUNNING :

SDEV_TRANSPORT_OFFLINE);
+   target-transport_offline = ret != 0;

Minor nit, that line is hard to read; I keep thinking it needs parens
around the conditional...

Perhaps
target-transport_offline = !!ret;
or
target-transport_offline = ret;

gcc should do the right conversion since we're assigning to a bool.


Or, Vu, does this solve the issue you've seen? I may have time to test
later this week, but not before.



Hey David,

This indeed solve scsi_host removal issues.
Vu is on vacation, I'll perform some more failover tests...

-Sagi
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-17 Thread David Dillow
On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
 diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
 b/drivers/infiniband/ulp/srp/ib_srp.c
 index 8a7eb9f..b34752d 100644
 --- a/drivers/infiniband/ulp/srp/ib_srp.c
 +++ b/drivers/infiniband/ulp/srp/ib_srp.c
 @@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port 
 *target)
  
   scsi_target_unblock(shost-shost_gendev, ret == 0 ? SDEV_RUNNING :
   SDEV_TRANSPORT_OFFLINE);
 + target-transport_offline = ret != 0;

Minor nit, that line is hard to read; I keep thinking it needs parens
around the conditional...

Perhaps
target-transport_offline = !!ret;
or
target-transport_offline = ret;

gcc should do the right conversion since we're assigning to a bool.


Or, Vu, does this solve the issue you've seen? I may have time to test
later this week, but not before.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IB/srp: Fail I/O requests if the transport is offline

2013-02-15 Thread Bart Van Assche
If an SRP target is no longer reachable and srp_reset_host()
fails to reconnect then ib_srp will invoke scsi_remove_host().
That function will invoke __scsi_remove_device() for each LUN.
And that last function will change the device state from
SDEV_TRANSPORT_OFFLINE into SDEV_CANCEL. Certain user space
software, e.g. older versions of multipathd, continue queueing
I/O to SCSI devices that are in the SDEV_CANCEL state. If these
I/O requests are submitted as SG_IO that means that the
REQ_PREEMPT flag will be set and hence that these requests will
be passed to srp_queuecommand(). These requests will time out.
If new requests are queued fast enough from user space these
active requests will prevent __scsi_remove_device() to finish.
Avoid this by failing I/O requests in the SDEV_CANCEL state if
the transport is offline. Introduce a new variable to keep
track of the transport state instead of failing requests if
(!target-connected || target-qp_in_error) such that the SCSI
error handler has a chance to retry commands after a transport
layer failure occurred.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow d...@thedillows.org
Cc: Or Gerlitz ogerl...@mellanox.com
Cc: Vu Pham v...@mellanox.com
---
 drivers/infiniband/ulp/srp/ib_srp.c |7 +++
 drivers/infiniband/ulp/srp/ib_srp.h |1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -734,6 +734,7 @@ static int srp_reconnect_target(struct srp_target_port 
*target)
 
scsi_target_unblock(shost-shost_gendev, ret == 0 ? SDEV_RUNNING :
SDEV_TRANSPORT_OFFLINE);
+   target-transport_offline = ret != 0;
 
if (ret)
goto err;
@@ -1353,6 +1354,12 @@ static int srp_queuecommand(struct Scsi_Host *shost, 
struct scsi_cmnd *scmnd)
unsigned long flags;
int len;
 
+   if (unlikely(target-transport_offline)) {
+   scmnd-result = DID_NO_CONNECT  16;
+   scmnd-scsi_done(scmnd);
+   return 0;
+   }
+
spin_lock_irqsave(target-lock, flags);
iu = __srp_get_tx_iu(target, SRP_IU_CMD);
if (!iu)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h 
b/drivers/infiniband/ulp/srp/ib_srp.h
index de2d0b3..66fbedd 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -140,6 +140,7 @@ struct srp_target_port {
unsigned intcmd_sg_cnt;
unsigned intindirect_size;
boolallow_ext_sg;
+   booltransport_offline;
 
/* Everything above this point is used in the hot path of
 * command processing. Try to keep them packed into cachelines.
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html