On 11/26/12 05:44, David Dillow wrote:
> From: Bart Van Assche
>
> Some SCSI upper layer drivers, e.g. sd, issue SCSI commands from
> inside scsi_remove_host() (see also the sd_shutdown() call in
> sd_remove()). Make sure that these commands have a chance to reach
> the SCSI d
On 11/26/12 05:44, David Dillow wrote:
The state of the target has several conditions that overlap, making it
easier to model as a bit-field of exceptional conditions rather than an
enum of all possible states.
Bart Van Assche did the hard work of identifying the states that can be
removed, and
On 11/26/12 05:44, David Dillow wrote:
From: Ishai Rabinovitz
HW QP FATAL errors persist over a reset operation, but we can recover
from that by recreating the QP and associated CQs for each connection.
Creating a new QP/CQ also completely forecloses any possibility of
getting stale completions
On 11/26/12 05:44, David Dillow wrote:
Here is a first, UNTESTED, pass at preparing a merge of Bart's SRP HA
work to upstream. It is not complete, as I have not yet added the
transport layer error handling and related patches. It is also currently
missing the patch to maintain a single connection
On 11/27/12 23:13, Or Gerlitz wrote:
On Tue, Nov 27, 2012 at 6:34 PM, Bart Van Assche wrote:
Thanks Dave for doing all this work. A reworked and retested patch series
that should address all comments that have been posted so far can be found
here: http://github.com/bvanassche/linux/srp-ha. I
On 12/05/12 19:23, Or Gerlitz wrote:
On Fri, Nov 30, 2012 at 4:21 AM, David Dillow wrote:
[...]
Modulo a few style issues (braces around one line if branches, etc.) and
having three state variables vs one, I can live with everything up to
aabfa852acd27962 at git://github.com/bvanassche/linux.gi
On 12/05/12 19:50, Bart Van Assche wrote:
On 12/05/12 19:23, Or Gerlitz wrote:
On Fri, Nov 30, 2012 at 4:21 AM, David Dillow wrote:
[...]
Modulo a few style issues (braces around one line if branches, etc.) and
having three state variables vs one, I can live with everything up to
On 12/06/12 10:52, Vasiliy Tolstov wrote:
I have now switched from the SLES kernel to 3.6.7.
Everything works fine, but your patches from GitHub now produce some errors:
/sbin/service openibd restart
Unloading ib_srp [FAILED]
Removing 'ib_srp': Device or resource busy
xen11:~ #
On 12/05/12 22:32, Or Gerlitz wrote:
On Wed, Dec 5, 2012 at 8:50 PM, Bart Van Assche wrote:
[...]
The only way to make I/O work reliably if a failure can occur at the
transport layer is to use multipathd on top of ib_srp. If a connection fails
for some reason, then the SRP SCSI host will be
On 12/06/12 15:27, Or Gerlitz wrote:
The core problem here seems to be that scsi_remove_host simply never ends.
Hello Or,
The later patches in the srp-ha patch series avoided such behavior by
checking whether the connection between SRP initiator and target is
unique, and by removing duplicat
If a SCSI command times out it is passed to the SCSI error
handler. The SCSI error handler keeps trying to abort a
command until aborting succeeded or the command has been
finished. Avoid that attempts to abort a command without
RDMA RC connection trigger an endless loop.
Signed-off-by: Bart Van
On 12/07/12 22:47, Vu Pham wrote:
I applied your latest patch, [PATCH for-next] IB/srp: Make SCSI error
handling finish, and tested it.
Let me capture what I'm seeing:
The host has two paths (scsi_host 7 & 8) to the target through two physical
ports, 1 & 2.
[root@rsws42 ~]# multipath -l
size=50G features='0' hwhan
On 12/09/12 10:26, Garrett Cooper wrote:
+#if defined(__FreeBSD__)
+"%s %02d %02d:%02d:%02d %06d [%p] 0x%02x -> %s",
+#else
"%s %02d %02d:%02d:%02d %06d [%04X] 0x%02x -> %s",
+#endif
Please cast the pthread_t value to an unsigned long long or another
integral type. Su
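The review comment above concerns printing a `pthread_t` with a fixed printf conversion. A minimal sketch of one portable approach, assuming only that `pthread_t` is either an integer or a pointer type on the platform (POSIX leaves the type opaque, so this is not guaranteed everywhere). The helper name `format_thread_id` is hypothetical and is not taken from the patch under review:

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the calling thread's id into buf as hex.
 * Going through uintptr_t works whether pthread_t is an integer (Linux)
 * or a pointer (FreeBSD); it would not compile if pthread_t were a struct.
 */
static int format_thread_id(char *buf, size_t len)
{
	unsigned long long tid =
		(unsigned long long)(uintptr_t)pthread_self();

	return snprintf(buf, len, "[%llX]", tid);
}
```

With a cast like this a single format string can serve all platforms, so the `#if defined(__FreeBSD__)` split in the quoted hunk might not be needed.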
On 12/09/12 12:07, Garrett Cooper wrote:
> It seems that there's a bug when linking inlined functions with clang; this
> issue will need to be upstreamed and reverified with clang 3.2.
>
> Signed-off-by: Garrett Cooper
> ---
> osmtest/osmtest.c | 2 +-
> 1 file changed, 1 insertion(+), 1 delet
Hello Dave,
It would be appreciated if you could have a look at the following two
patches:
* Track connection state properly. Apparently an assignment statement
had not been dropped while it should have been dropped.
* Avoid endless SCSI error handling loop after cable pull.
Thanks,
Bart.
-
The connection state must be initialized before srp_connect_target()
is invoked. Drop the assignment in srp_add_target() since
scsi_host_alloc() zero-initializes the Scsi_Host structure anyway.
This patch makes ib_srp again report the first QP error.
Signed-off-by: Bart Van Assche
Cc: David
if the QP is in the error
state.
- Make srp_reset_host() reset SCSI requests even if host
removal has already started or if reconnecting fails.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc: Roland Dreier
Reported-by: Or Gerlitz
Cc: Vu Pham
Cc: Alex Turin
---
drivers/infiniband/ulp
On 12/14/12 16:55, David Dillow wrote:
On Fri, 2012-12-14 at 16:38 +0100, Bart Van Assche wrote:
If a SCSI command times out it is passed to the SCSI error
handler. The SCSI error handler will try to abort the command
that timed out. If aborting failed a device reset will be
attempted. If the
On 12/14/12 17:19, David Dillow wrote:
On Fri, 2012-12-14 at 17:12 +0100, Bart Van Assche wrote:
On 12/14/12 16:55, David Dillow wrote:
This is much more than your original patch that Alex claimed fixed his
issues; are you not merging two separate issues?
>
Also, there's no r
On 12/19/12 05:09, David Dillow wrote:
Did you update the patch? I think I'm on-board with the idea.
Sorry for the delay. I will post the updated patch series.
Bart.
This patch series avoids that SCSI error handling triggers an endless
loop and also restores reporting of QP errors in the kernel log.
Changes between v3 and v2:
- As proposed by Dave, added a patch that prevents sending of a task
management function over a closed connection.
Changes between
The connection state must be initialized before srp_connect_target()
is invoked. Drop the assignment in srp_add_target() since it occurs
after srp_connect_target() and since scsi_host_alloc()
zero-initializes the Scsi_Host structure anyway.
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
Do not send a task management function if sending will fail anyway
because either there is no RDMA/RC connection or the QP is in the
error state.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc: Roland Dreier
---
drivers/infiniband/ulp/srp/ib_srp.c |5 +++--
1 file changed, 3
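The check described in this patch can be sketched as follows. This is a minimal user-space illustration, not the driver code: the field names `connected` and `qp_in_error` mirror the condition `(!target->connected || target->qp_in_error)` quoted elsewhere in this listing, while `struct target_state` and `send_tsk_mgmt_sketch()` are hypothetical stand-ins.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-target state kept by the driver. */
struct target_state {
	bool connected;   /* an RDMA RC connection is established */
	bool qp_in_error; /* the queue pair has entered the error state */
};

/*
 * Return -1 without attempting to send when the request cannot reach the
 * target anyway; otherwise return 0, where real code would post the task
 * management request.
 */
static int send_tsk_mgmt_sketch(const struct target_state *target)
{
	if (!target->connected || target->qp_in_error)
		return -1;

	/* ... build and post the task management request here ... */
	return 0;
}
```

Failing early like this lets the caller report the failure immediately instead of waiting for a response that can never arrive.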
.
Modify the SCSI error handling functions in ib_srp as follows:
- Abort SCSI commands properly even if the QP is in the error
state.
- Make srp_reset_host() reset SCSI requests even after host
removal has already started or if reconnecting fails.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc
On 12/19/12 19:04, David Dillow wrote:
On Wed, 2012-12-19 at 15:21 +0100, Bart Van Assche wrote:
The connection state must be initialized before srp_connect_target()
is invoked. Drop the assignment in srp_add_target() since it occurs
after srp_connect_target() and since scsi_host_alloc()
zero
On 12/20/12 13:38, Or Gerlitz wrote:
I think a few days ago you had a patch in your tree named "Save and
restore host_scribble during error handling". Is it possible that we need
it here for clean removal of the SCSI host?
No. Host removal works fine even without that patch. That's because
srp_abor
On 12/20/12 16:10, David Dillow wrote:
On Thu, 2012-12-20 at 09:13 +0100, Bart Van Assche wrote:
On 12/19/12 19:04, David Dillow wrote:
On Wed, 2012-12-19 at 15:21 +0100, Bart Van Assche wrote:
The connection state must be initialized before srp_connect_target()
is invoked. Drop the
Remove an assignment that incorrectly overwrites the connection
state update by srp_connect_target().
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
Cc: Roland Dreier
---
drivers/infiniband/ulp/srp/ib_srp.c |1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/infiniband/ulp
On 12/21/12 19:01, Mike Marciniszyn wrote:
> 0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs")
> added uses of sg_dma_len() and sg_dma_address(). This makes
> RDS DOA with the qib driver.
>
> IB ulps should use ib_sg_dma_len() and ib_sg_dma_address
> respectively since some HCAs overload
This patch is based on the srp-ha-v3.7 tree by Bart Van Assche.
See also <https://github.com/advance38/linux/tree/ib-srp-remove-target-v3.7>.
If necessary, I could rebase it on the stable tree.
Signed-off-by: Dongsu Park
Cc: Sebastian Riemer
Cc: Bart Van Assche
Cc: David Dillow
Cc: Rol
On 01/11/13 15:07, Dongsu Park wrote:
However, that action will hang forever until the target machine comes up
again. Precisely it's blocked on scsi_execute() directly after sending
SYNCHRONIZE_CACHE command to the first target of the host. As IB stack
is not able to give any response, further ta
On 01/29/13 18:18, Alex Netes wrote:
During opensm RPM packaging, `chkconfig --add opensmd` is called.
`chkconfig --add` creates the appropriate entry as specified by the
default values in the init script. Having opensmd run by default on boot
isn't desired.
Signed-off-by: Alex Netes
---
conf
On 01/30/13 16:43, Doug Ledford wrote:
On 01/30/13 03:59, Bart Van Assche wrote:
On 01/29/13 18:18, Alex Netes wrote:
During opensm RPM packaging, `chkconfig --add opensmd` is called.
`chkconfig --add` creates the appropriate entry as specified by the
default values in the init script. Having
On 01/30/13 18:48, Doug Ledford wrote:
On 1/30/2013 11:00 AM, Bart Van Assche wrote:
Which convention is followed for other packages ? This is what I found in
the Fedora 18 iscsi-initiator-utils package
(http://be.mirror.eurid.eu/fedora/linux/releases/18/Fedora/source/SRPMS/i/iscsi-initiator
It is known that it takes about two to three minutes before the upstream
SRP initiator fails over from a failed path to a working path. This is
not only considered longer than acceptable but is also longer than other
Linux SCSI initiators (e.g. iSCSI and FC). Progress so far with
improving the
This patch series avoids that SCSI error handling triggers an endless
loop and also restores reporting of QP errors in the kernel log.
Changes between v3 and v2:
- As proposed by Dave, added a patch that prevents sending of a task
management function over a closed connection.
Changes between
Remove an assignment that incorrectly overwrites the connection
state update by srp_connect_target().
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
Cc: Roland Dreier
---
drivers/infiniband/ulp/srp/ib_srp.c |1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/infiniband/ulp
Do not send a task management function if sending will fail anyway
because either there is no RDMA/RC connection or the QP is in the
error state.
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
Cc: Roland Dreier
---
drivers/infiniband/ulp/srp/ib_srp.c |5 +++--
1 file changed, 3
.
Modify the SCSI error handling functions in ib_srp as follows:
- Abort SCSI commands properly even if the QP is in the error
state.
- Make srp_reset_host() reset SCSI requests even after host
removal has already started or if reconnecting fails.
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
On 02/04/13 16:36, Alex Netes wrote:
On 09:20 Thu 31 Jan , Doug Ledford wrote:
On 01/31/13 02:21, Alex Netes wrote:
On 14:24 Wed 30 Jan , Doug Ledford wrote:
On 1/30/2013 2:12 PM, Bart Van Assche wrote:
On 01/30/13 18:48, Doug Ledford wrote:
On 1/30/2013 11:00 AM, Bart Van Assche
On 02/04/13 22:11, Or Gerlitz wrote:
On Fri, Feb 1, 2013 at 5:18 PM, Bart Van Assche wrote:
This patch series avoids that SCSI error handling triggers an endless loop
and also restores reporting of QP errors in the kernel log.
Bart,
You wrote "resend" in the subject line, anythi
On 02/05/13 21:54, Or Gerlitz wrote:
On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche wrote:
On 02/04/13 22:11, Or Gerlitz wrote:
Bart, I'd like to sharpen the point: could you please clarify if the
series posted to linux-rdma stands for itself in the sense that SRP HA
scheme X (please
On 02/06/13 08:44, Or Gerlitz wrote:
On 06/02/2013 09:22, Bart Van Assche wrote:
A huge number of patches have been taken upstream between 3.8-rc1 and
3.8-rc6. I have retested these three patches with 3.8-rc6 and would
appreciate if you would also repeat your tests.
not really... this is
On 02/06/13 22:42, Vu Pham wrote:
Conclusion:
1. If the port/path is disabled long enough (>35 minutes), we are left with
a dangling SCSI host.
2. If the port is enabled within 30 minutes, the SCSI host re-establishes the
connection, the path is reinstated and then the scsi_host is removed (no entry in sysfs).
I attached a log here to show wha
On 02/07/13 10:41, Or Gerlitz wrote:
(BTW - if the fourth patch that Vu used "save
& restore host_scribble during error handling" is also needed, maybe you
add it to this series, so they are reviewed/accepted together).
Hello Or,
The three patches I posted guarantee timely host removal even w
On 10/29/12 10:50, Paul Bolle wrote:
On Wed, 2012-10-10 at 09:23 +0200, Jack Morgenstein wrote:
You could use:
u16 uninitialized_var(vlan);
instead.
I guess we'd better just wait and see whether uninitialized_var()
survives before discussing your suggestion (see the thread starting at
ht
of failing requests if
(!target->connected || target->qp_in_error) such that the SCSI
error handler has a chance to retry commands after a transport
layer failure occurred.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc: Or Gerlitz
Cc: Vu Pham
---
drivers/infiniband/ulp/srp/ib_srp.c
On 02/18/13 05:06, David Dillow wrote:
On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp
This patch series avoids that SCSI error handling triggers an endless
loop and also restores reporting of QP errors in the kernel log.
Changes between v4 and v3:
- Added a patch that ensures that the SCSI host gets removed in time
when a user space process keeps queueing I/O during removal, e.
Remove an assignment that incorrectly overwrites the connection
state update by srp_connect_target().
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
Cc: Roland Dreier
Cc: # 3.8
---
drivers/infiniband/ulp/srp/ib_srp.c |1 -
1 file changed, 1 deletion(-)
diff --git a/drivers
Do not send a task management function if sending will fail anyway
because either there is no RDMA/RC connection or the QP is in the
error state.
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
Cc: Roland Dreier
Cc: # 3.8
---
drivers/infiniband/ulp/srp/ib_srp.c |5 +++--
1 file
.
Modify the SCSI error handling functions in ib_srp as follows:
- Abort SCSI commands properly even if the QP is in the error
state.
- Make srp_reset_host() reset SCSI requests even after host
removal has already started or if reconnecting fails.
Signed-off-by: Bart Van Assche
Acked-by: David Dillow
of failing requests if
(!target->connected || target->qp_in_error) such that the SCSI
error handler has a chance to retry commands after a transport
layer failure occurred.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc: Or Gerlitz
Cc: Vu Pham
Cc: # 3.8
---
drivers/infiniband/u
On 02/18/13 09:11, Sagi Grimberg wrote:
On 2/18/2013 6:06 AM, David Dillow wrote:
On Fri, 2013-02-15 at 10:39 +0100, Bart Van Assche wrote:
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index 8a7eb9f..b34752d 100644
--- a/drivers/infiniband/ulp/srp
On 02/06/13 11:40, Sebastian Riemer wrote:
On 06.02.2013 11:20, Or Gerlitz wrote:
On 06/02/2013 12:04, Mathis GAVILLON wrote:
Just one last question: is it possible for a VF's LID to be different from
the PF's?
No, we've implemented a "shared port" model, so all functions on the
same IB port use the
On 11/26/12 09:00, Or Gerlitz wrote:
On Fri, Nov 23, 2012 at 2:10 PM, Bart Van Assche wrote:
Apparently unloading the ib_ipoib kernel module triggers a circular locking
dependency complaint. Has anyone already been looking into this ?
Yes, I see that this happens here e.g when doing hot
On 03/19/13 11:16, Sebastian Riemer wrote:
Hi Bart,
now I've got my priority on SRP again.
I've also noticed that your ib_srp-backport doesn't fail the IO fast
enough. The fast_io_fail_tmo only comes into play after the QP is
already in timeout and the "terminate_rport_io" function is missing.
On 03/26/13 17:24, Doug Ledford wrote:
If you have a patched up dhcp server (and dhclient), they will use
AF_PACKET/SOCK_DGRAM pair to send dhcp packets over IPoIB. This has
worked since forever if you use OFED kernels or one of the distribution
kernels. However, when testing an upstream kernel
Yevgeny Petrilin writes:
> Hello Roland,
>
> Those patches were submitted a while ago. I cleaned them up a little and
> generated them against your latest git.
> They allow the hw driver to choose to which EQ a CQ is attached,
> considering the load on its EQs.
(replying to an e-mail from three ye
On 04/03/13 14:43, Patrick McHardy wrote:
diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h
+#ifdef CONFIG_TIPC_MEDIA_IB
+int tipc_ib_media_start(void);
+void tipc_ib_media_stop(void);
+#else
+int tipc_ib_media_start(void) { return 0; }
+void tipc_ib_media_stop(void) { return; }
+#endif
Is
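For context on the quoted hunk: stub functions defined in a header without `static inline` get one external definition per including source file, which typically fails at link time with multiple-definition errors. Whether that is what the truncated reply goes on to ask is not visible here. A minimal sketch of the usual header idiom, reusing the names from the hunk:

```c
/*
 * Sketch of the common stub idiom for an optional feature. With the
 * config symbol undefined, callers get empty inline functions that the
 * compiler can drop, and no external symbols are emitted.
 */
#ifdef CONFIG_TIPC_MEDIA_IB
int tipc_ib_media_start(void);
void tipc_ib_media_stop(void);
#else
static inline int tipc_ib_media_start(void) { return 0; }
static inline void tipc_ib_media_stop(void) { }
#endif
```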
On 04/08/13 10:00, Vasiliy Tolstov wrote:
Hello. Some time ago, when I was using kernel 3.6, I used
https://github.com/bvanassche/ib_srp-backport/ for the SRP drivers on my
Linux server.
Now I'm using 3.8.6. Do I need anything from
https://github.com/bvanassche/ib_srp-backport/ or are all patches already
On 04/30/13 09:34, Vasiliy Tolstov wrote:
What is the main difference between the bvanassche repo and the sriemer one?
Good question. As soon as I have the time I will try to find a single
approach that works for everyone and post a new patch series for review
on the linux-rdma mailing list such that these
On 05/06/13 10:44, Sebastian Riemer wrote:
Sorry Bart, but a reconnect with just the commit message
"IB/srp: Add kernel-level transport layer recovery" and no further
description isn't very trustworthy for me. I also wonder why you need so
much locking.
Hello Sebastian,
There is a very good re
On 05/14/13 12:00, Vasiliy Tolstov wrote:
If I need faster reconnects and the ability to close a session from the
initiator side on QLogic hardware, is that possible? Or do these
patches only cover Mellanox cards?
The ability to close a session from the initiator side went upstream in
kernel 3.8 (/sys/c
On 05/15/13 07:12, Vasiliy Tolstov wrote:
Thanks. What about closing a session from the target side? For example, what if I need
to close the SRP session and block all access from a specific initiator?
The traditional approach to block access from a specific initiator is to
modify the LUN masking configuration a
On 05/21/13 11:40, Or Gerlitz wrote:
2. Is it possible in the Linux kernel for one hard IRQ callback to fire on
CPU X while another hard IRQ callback is running on the same CPU?
I think that from kernel 2.6.35 on MSI IRQs are no longer nested. See
also
http://git.kernel.org/cgit/linux/kernel/gi
On 05/24/13 19:02, Bryce Lelbach wrote:
> The attached patch modifies the kernel Infiniband drivers to support the Xeon
> Phi co-processor.
>
> This patch is a modified version of a patch from Intel's MPSS framework
> (specifically, from the "KNC_gold_update_1-2.1.4982-15-rhel-6.3" package),
>
On 06/08/13 04:31, Bruce McKenzie wrote:
I've compiled Ubuntu 13.04 with kernel 3.6.11 and OFED 2 from Mellanox, and it
works OK; performance is a little better with SRP. Some packages don't seem
to work, i.e. some srptools and IB-diags commands fail, which looks like those
tools haven't been tested
On 06/10/13 14:05, Sebastian Riemer wrote:
Perhaps, I should collect all guys who require MD RAID-1 for remote
storage replication in order to put some pressure on Neil.
If I remember correctly one of the things Neil is trying to explain to
md users is that when md is used without write-intent
The purpose of this InfiniBand SRP initiator patch series is as follows:
- Make the SRP initiator driver better suited for use in a H.A. setup.
Speed up failover by reducing the IB RC retry count and by notifying
multipathd faster about transport layer failures by adding
fast_io_fail_tmo and
+0x16/0x1b
[bvanassche: Shortened patch description]
Signed-off-by: Dotan Barak
Reviewed-by: Eli Cohen
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c |2 ++
1 file changed, 2 insertions(+)
diff --
Avoid that srp_claim_command() can claim a command while
srp_queuecommand() is still busy queueing the same command.
Found this via source reading.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c
after a transport
layer error.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c | 11 ---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b
The SRP initiator implements host reset by reconnecting to the SRP
target. That means that communication with the target is possible
as soon as host reset finished. Hence skip the host settle delay.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian
or handler will cause the SRP initiator to
reconnect, which will cause I/O over the second connection to fail.
Avoid such ping-pong behavior by disabling relogins. Note: if
reconnecting manually is necessary, that is possible by deleting
and recreating an rport via sysfs.
Signed-off-by: Bart Van
queuecommand callback is racy
because srp_remove_host() must be invoked before scsi_remove_host()
and because the queuecommand callback may get invoked after
srp_remove_host() has finished.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: James Bottomley
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian
after having detected a transport layer problem and
before failing I/O.
- Support for implementing dev_loss_tmo, the time that should
elapse after having detected a transport layer problem and
before removing a remote port.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: James Bottomley
Finish all outstanding I/O requests after fast_io_fail_tmo expired,
which speeds up failover in a multipath setup. This patch is a
reworked version of a patch from Sebastian Riemer.
Reported-by: Sebastian Riemer
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc
Enable fast_io_fail_tmo and dev_loss_tmo functionality for the IB
SRP initiator.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc: Roland Dreier
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c | 123 +--
drivers/infiniband/ulp/srp
Start the reconnect timer, fast_io_fail timer and dev_loss timer
if a transport layer error occurs.
Signed-off-by: Bart Van Assche
Cc: David Dillow
Cc: Roland Dreier
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c | 19 +++
drivers/infiniband/ulp
description]
Signed-off-by: Sebastian Riemer
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
---
drivers/infiniband/ulp/srp/ib_srp.c |3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index
ws to reduce latency on an initiator
connected to multiple SRP targets but also allows to improve
throughput.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c | 26 --
drivers/infin
tivating
the SCSI error handler on an IB path with a regular BER or due to
brief IB network congestion.
[bvanassche: Rewrote patch description]
Signed-off-by: Vu Pham
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/
From: Vu Pham
Signed-off-by: Vu Pham
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Sebastian Riemer
---
drivers/infiniband/ulp/srp/ib_srp.c |4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers
On 06/12/13 15:20, Bart Van Assche wrote:
If the add_one callback fails during driver load no resources are
allocated so there isn't a need to release any resources. Trying
to clean the resource may lead to the following kernel panic:
BUG: unable to handle kernel NULL pointer dereferen
On 06/12/13 16:58, Sebastian Riemer wrote:
Wait a minute, so you've changed this commit to also hold that target
lock in the following functions in error case:
srp_unmap_data(),
srp_put_tx_iu()
This is different from:
https://github.com/bvanassche/ib_srp-backport/commit/6ce0e30dbb69973926df8429
via sysfs.
Signed-off-by: Bart Van Assche
Cc: Roland Dreier
Cc: David Dillow
Cc: Vu Pham
Cc: Sebastian Riemer
Thanks Bart for refreshing this new patch set. I think you should add
Signed-off-by for Sebastian ?
This patch differs slightly from what Sebastian had posted. But if
Sebastian a
On 06/13/13 11:30, Sebastian Riemer wrote:
On 12.06.2013 15:23, Bart Van Assche wrote:
The SCSI error handler assumes that the transport layer is
operational if an eh_abort_handler() returns SUCCESS. Hence let
srp_abort() only return SUCCESS if sending the ABORT TASK task
management function
On 06/13/13 15:57, Sebastian Riemer wrote:
> You've only changed the style of this function. Functionality is still
> the same. Fine for me.
>
> But why do you put it that high in the source code?
> Do you (still) need it for something else?
>
> I would put it directly in front of srp_create_targ
On 06/13/13 17:52, Vasiliy Tolstov wrote:
Hello. I'm using 3.9.5 and ib_srp-backport from GitHub.
What are the recommended settings for dev_loss_tmo and fast_io_fail_tmo?
Currently I have
dev_loss_tmo = 60
fast_io_fail_tmo = 40
Is that right?
I need to wait no more than 100 seconds for a failed path. (I use so
On 06/13/13 19:50, Vu Pham wrote:
Hello Bart,
+/**
+ * srp_conn_unique() - check whether the connection to a target is unique
+ */
+static bool srp_conn_unique(struct srp_host *host,
+			    struct srp_target_port *target)
+{
+	struct srp_target_port *t;
+	bool ret = false;
+
+
On 06/13/13 21:43, Vu Pham wrote:
> Hello Bart,
>
>>
>> +What:		/sys/class/srp_remote_ports/port-:/dev_loss_tmo
>> +Date:		September 1, 2013
>> +KernelVersion:	3.11
>> +Contact:	linux-s...@vger.kernel.org, linux-rdma@vger.kernel.org
>> +Description:	Number of seconds the SCSI
On 06/13/13 20:21, Vasiliy Tolstov wrote:
P.S. With kernel 3.9.5, do I need your backported ib_srp driver,
or can I use the mainline kernel driver for faster reconnects?
P.P.S. Can you give me the subject of the e-mail or a link to the patches to
switch off dev_loss_tmo?
Hello Vasiliy,
Does this mean t
On 06/14/13 19:59, Vu Pham wrote:
On 06/13/13 21:43, Vu Pham wrote:
+/**
+ * srp_tmo_valid() - check timeout combination validity
+ *
+ * If no fast I/O fail timeout has been configured then the device loss timeout
+ * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail timeout has
+
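The first rule stated in the quoted comment can be sketched as below. Assumptions, none of which come from the quoted text: a negative value means that a timeout is not configured, `SCSI_DEVICE_BLOCK_MAX_TIMEOUT` is 600 seconds as in the kernel's SCSI headers, "below" is read as "not above", and the function name `tmo_rule_ok()` is hypothetical. The rules that follow in the truncated comment are deliberately left out.

```c
#include <stdbool.h>

/* Assumed value (seconds); the kernel defines this in its SCSI headers. */
#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT 600

/*
 * Hypothetical check covering only the first rule of the quoted comment:
 * without a fast I/O fail timeout, the device loss timeout may not exceed
 * SCSI_DEVICE_BLOCK_MAX_TIMEOUT. Negative values mean "not configured".
 */
static bool tmo_rule_ok(int fast_io_fail_tmo, int dev_loss_tmo)
{
	if (fast_io_fail_tmo < 0 &&
	    dev_loss_tmo > SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
		return false;

	return true;
}
```

Under these assumptions, `tmo_rule_ok(-1, 601)` is rejected while `tmo_rule_ok(40, 60)` passes this particular rule.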
On 06/17/13 08:18, Hannes Reinecke wrote:
On 06/15/2013 11:52 AM, Bart Van Assche wrote:
On 06/14/13 19:59, Vu Pham wrote:
On 06/13/13 21:43, Vu Pham wrote:
+/**
+ * srp_tmo_valid() - check timeout combination validity
+ *
+ * If no fast I/O fail timeout has been configured then the device
On 06/17/13 09:14, Hannes Reinecke wrote:
On 06/17/2013 09:04 AM, Bart Van Assche wrote:
I agree that the value of fast_io_fail_tmo should be kept small.
Although as you explained changing the SCSI device state into
SDEV_BLOCK doesn't help for I/O that has already been queued on a
failed
On 06/18/13 18:59, Vu Pham wrote:
Bart Van Assche wrote:
On 06/14/13 19:59, Vu Pham wrote:
On 06/13/13 21:43, Vu Pham wrote:
If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
to do extra block with scsi_block_requests()
Please keep in mind that srp_reconnect_r
On 06/19/13 15:44, Jack Wang wrote:
+ /*
+* It can occur that after fast_io_fail_tmo expired and before
+* dev_loss_tmo expired that the SCSI error handler has
+* offlined one or more devices. scsi_target_unblock() doesn't
+
On 06/23/13 23:13, Mike Christie wrote:
> On 06/12/2013 08:28 AM, Bart Van Assche wrote:
>> +/*
>> + * It can occur that after fast_io_fail_tmo expired and before
>> + * dev_loss_tmo expired that the SCSI error handler has
>> +
On 06/24/13 15:48, Jack Wang wrote:
I'm not sure it's possible to avoid such a race without introducing
a new mutex. How about something like the (untested) SCSI core patch
below, and invoking scsi_block_eh() and scsi_unblock_eh() around any
reconnect activity not initiated from the SCSI EH threa