On Sat, Dec 17, 2011 at 5:03 PM, Or Gerlitz or.gerl...@gmail.com wrote:
Bart Van Assche bvanass...@acm.org wrote:
The default block layer timeout is 30 seconds.
Could you provide a pointer to where this is defined?
Sorry, it's not the block layer but sd (SCSI disk) that sets that
timeout
On Thu, Dec 15, 2011 at 7:08 PM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 19:58 +0100, Bart Van Assche wrote:
Currently the ib_srp driver allows logging in multiple times to the
same target via its sysfs interface. This leads to each target LUN
being imported multiple
On Mon, Dec 19, 2011 at 12:33 AM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 20:09 +0100, Bart Van Assche wrote:
+ * cmd_sg_entries, the maximum number of memory descriptors
+ that fit in a single SRP command when using the direct data
On Mon, Dec 19, 2011 at 4:03 AM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 20:13 +0100, Bart Van Assche wrote:
Make it possible to disconnect via sysfs the IB RC connection used by the
SRP protocol to communicate with a target.
Let the SRP transport layer create a sysfs
On Mon, Dec 19, 2011 at 12:50 AM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 20:11 +0100, Bart Van Assche wrote:
Add a time-based transport layer test such that fail-over in a multipath
setup can happen quickly.
Why should this be done in the kernel? multipathd already
On Sun, Dec 18, 2011 at 9:40 PM, David Dillow dillo...@ornl.gov wrote:
A SRP target port (for the purposes of an I_T nexus) is defined by the
IOC GUID and extension; if you want to have redundant connections to a
port, multi-channel logins must be supported. The way most targets avoid
this
On Mon, Dec 19, 2011 at 3:36 AM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 20:10 +0100, Bart Van Assche wrote:
Rework ib_srp transport layer error handling. Instead of letting SCSI
commands time out if a transport layer error occurs,
This is good, but should probably be part
On Mon, Dec 19, 2011 at 10:51 PM, David Dillow dillo...@ornl.gov wrote:
I haven't parsed it all out from your changes just yet, but I think part
of the reason you may have had problems with req->scmd being null in
srp_handle_recv() is due to a new race between the tear down of the
connection
On Mon, Dec 19, 2011 at 10:32 PM, David Dillow dillo...@ornl.gov wrote:
I still think this is already solved in user space, but the new
reconnect model you've implemented doesn't match up with the expected
semantics. It'd be better to match the rest of the SCSI stack for this.
Sorry, but I'm
On Mon, Dec 19, 2011 at 12:07 AM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 20:08 +0100, Bart Van Assche wrote:
Eliminate the private_rport_attrs[] array and the SETUP_*() macros
used to set up that array since the information in that array
duplicates the information
On Mon, Dec 19, 2011 at 10:32 PM, David Dillow dillo...@ornl.gov wrote:
Part of the problem is introduced by allowing for permanent connections
rather than using the familiar dev_loss_tmo and fast_io_fail_tmo
parameters from other SCSI transports. For instance, in the FC
transport, rports are
On Wed, Dec 21, 2011 at 3:33 AM, David Dillow dillo...@ornl.gov wrote:
What keeps the srp_recv_completion() -> srp_handle_recv() ->
srp_process_rsp() -> etc. call chain from racing with
srp_reconnect_target()?
It looks like the srp_reset_req() in srp_reconnect_target() could race
with
On Wed, Dec 21, 2011 at 3:05 AM, David Dillow dillo...@ornl.gov wrote:
We don't want to leave a target blocked indefinitely -- commands caught
in the blocked queue won't be reissued until the queue is unblocked --
but we may want to keep the sdX mappings around for a long time.
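The distinction drawn above, between failing blocked I/O quickly and keeping the device mappings for a long time, matches the fast_io_fail_tmo and dev_loss_tmo parameters mentioned elsewhere in this thread. A minimal sketch of that timeline follows. The enum, the function name, and the convention that a negative value disables a timer are assumptions made for illustration; this is not code from ib_srp or from the SRP transport class.

```c
/* Hypothetical phases of a remote port after a transport layer failure. */
enum rport_phase {
    RPORT_BLOCKED,   /* I/O is held in the blocked queue, nothing is failed yet */
    RPORT_FAIL_FAST, /* I/O is failed quickly so that multipath can switch paths */
    RPORT_LOST       /* the remote port and its sdX mappings are removed */
};

/*
 * Return the phase reached elapsed_s seconds after the failure was detected.
 * Assumption of this sketch: a negative timeout means "timer disabled".
 */
static enum rport_phase rport_phase_after(int elapsed_s, int fast_io_fail_tmo,
                                          int dev_loss_tmo)
{
    if (dev_loss_tmo >= 0 && elapsed_s >= dev_loss_tmo)
        return RPORT_LOST;
    if (fast_io_fail_tmo >= 0 && elapsed_s >= fast_io_fail_tmo)
        return RPORT_FAIL_FAST;
    return RPORT_BLOCKED;
}
```

With a short fast_io_fail_tmo and a long dev_loss_tmo, for example rport_phase_after(10, 5, 600), the target stops blocking I/O after a few seconds while the sdX mappings stay around for ten minutes.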
On Mon, Dec 19, 2011 at 3:36 AM, David Dillow dillo...@ornl.gov wrote:
On Thu, 2011-12-01 at 20:10 +0100, Bart Van Assche wrote:
/*
* Copyright (c) 2005 Cisco Systems. All rights reserved.
+ * Copyright (c) 2010-2011 Bart Van Assche bvanass...@acm.org.
You've tried to add
On Sat, Dec 24, 2011 at 8:07 PM, David Dillow dillo...@ornl.gov wrote:
This says to me that SRP should use the dev_loss_tmo semantics, though
the naming of fast_io_fail vs replacement_timeout is a bit more of a
question than I thought. I tend to think of SRP more in terms of FC than
iSCSI, so
On Fri, Dec 23, 2011 at 10:56 PM, Mike Christie micha...@cs.wisc.edu wrote:
iSCSI replacement_timeout is the same as fast_io_fail_tmo for FC. iSCSI
replacement_timeout actually came first so you should say FC should have
copied our name :)
Agreed. But as far as I can see multipathd only
On Thu, Dec 29, 2011 at 4:42 PM, Flavio Baronti
f.baro...@list-group.com wrote:
I'm new to RDMA development and I have a question regarding resource release.
If I understood correctly, when ibv_get_cq_event returns, it holds some sort
of lock over the completion queue, which is released when I
On Thu, Dec 29, 2011 at 3:43 PM, Or Gerlitz ogerl...@mellanox.com wrote:
5. what happens if we just want to enhance an -- existing -- function -
suppose we want to enhance ibv_post_send / ibv_poll_cq to support features
like LSO, checksum offload, masked atomic operations, fast memory remote
On Mon, Dec 19, 2011 at 5:39 PM, Roland Dreier rol...@kernel.org wrote:
Roland Dreier (2):
IB/mlx4: Fix shutdown crash accessing a non-existent bitmap
Hi Roland,
As far as I can see this fix is upstream as commit
4af3ce0de0c12e5c17811eaefad36ab8e146c0fd but is not yet included in
v3.1.7.
On Mon, Jan 2, 2012 at 4:39 PM, Hefty, Sean sean.he...@intel.com wrote:
If I understood correctly, when ibv_get_cq_event returns, it holds some sort
of lock over the completion queue, which is
ibv_destroy_cq, so that:
1) When ibv_destroy_cq returns, I am certain that there is no thread running
On Wed, Jan 4, 2012 at 9:05 PM, Hefty, Sean sean.he...@intel.com wrote:
I've just had a look at the kernel code that implements all this
(uverbs_cmd.c and uverbs_main.c). I haven't found any precautions
against ib_uverbs_comp_handler() accessing *uobj after
ib_uverbs_destroy_cq() has invoked
On Thu, Jan 5, 2012 at 5:29 PM, Roland Dreier rol...@purestorage.com wrote:
That should be OK. ib_uverbs_destroy_cq() doesn't really do anything
until ib_destroy_cq() has returned, and at that point it is guaranteed
that the completion handler for the CQ is done.
That sounds like a
Hi,
Sparse complains about nes_addr_resolve_neigh(). Does the patch below make
sense ?
---
drivers/infiniband/hw/nes/nes_cm.c |1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/hw/nes/nes_cm.c
b/drivers/infiniband/hw/nes/nes_cm.c
index 425065b..1b2ccc7
On Thu, Jan 12, 2012 at 12:41 PM, Dan Carpenter
dan.carpen...@oracle.com wrote:
Sparse complains because len in struct srp_direct_buf is declared as
big endian but it's used throughout as CPU endian. struct
srp_indirect_buf has the same thing. It's declared one way but used the
other way.
This patch series makes the ib_srp driver better suited for use in a H.A.
setup because:
- Switchover without triggering read or write errors becomes possible. Such
errors are bad because these can make a filesystem switch to read-only
mode.
- A ping mechanism has been added that allows to
statement in srp_qp_event() from error to debug. Change a
double space into a single in the bad IO class parameter error
message. Remove one trailing space to avoid a checkpatch warning.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol
Remove sysfs attributes before removing a target instead of
testing the target state in every sysfs attribute callback
method. Note: it is safe to invoke a sysfs attribute removal
method like device_remove_file() twice on the same attribute.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc
type of
SCSI device (M/O disk, tape, CD-ROM, ...).
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 46 +++
drivers/infiniband/ulp/srp/ib_srp.h
from int to bool and
move the initialization of that variable into srp_connect_target().
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 57
Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 52 +--
drivers/infiniband/ulp/srp/ib_srp.h |2 +-
2 files changed, 38 insertions(+), 16 deletions(-)
diff --git
such that receiving a
completion with zero wr_id is recognized as an end-of-work marker.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 81
Introduce srp_remove_target(), srp_change_state_to_removed() and
srp_scan_target().
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 49 +--
1
safe to
invoke that last function even if the IB connection has already
been disconnected. Rename srp_target_port.work into
srp_target_port.remove_work and move function srp_change_state()
to just before srp_change_state_to_removed().
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow
Document the sysfs attributes of the SRP initiator (ib_srp) according
to the rules specified in Documentation/ABI/README.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
Documentation/ABI/stable/sysfs-driver-ib_srp
Make it possible to disconnect the IB RC connection used by the
SRP protocol to communicate with a target.
Let the SRP transport layer create a sysfs delete attribute for
initiator drivers that support this functionality.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo
Move the code for removing a target port if reconnecting during
a reset triggered by the SCSI mid-layer fails from inside
srp_reconnect_target() into srp_reset_host().
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
, remove the
target port. Add a target to the target list before connecting
instead of after such that this algorithm has a chance to work.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp
sure that error recovery is not triggered during host
removal. Swap the connected and removed tests in
srp_queuecommand() because of this change.
Rescan LUNs after having unblocked a SCSI target controlled by
ib_srp.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo
On Sat, Jan 14, 2012 at 10:10 PM, David Dillow dillo...@ornl.gov wrote:
On Sat, 2012-01-14 at 12:36 +0000, Bart Van Assche wrote:
This patch series makes the ib_srp driver better suited for use in a H.A.
What kernel version is this based on?
3.2.0+ (commit
On Sat, Jan 14, 2012 at 1:36 PM, Bart Van Assche bvanass...@acm.org wrote:
This patch series makes the ib_srp driver better suited for use in a H.A.
Hi Dave,
Do you have any review comments about this patch series ?
Thanks,
Bart.
--
To unsubscribe from this list: send the line unsubscribe
On Tue, Feb 7, 2012 at 1:36 AM, Dave Dillow dillo...@ornl.gov wrote:
On Mon, Feb 06, 2012 at 11:16:25AM -0500, Bart Van Assche wrote:
On Sat, Jan 14, 2012 at 1:36 PM, Bart Van Assche bvanass...@acm.org wrote:
This patch series makes the ib_srp driver better suited for use in a H.A.
Do
On Sun, Feb 26, 2012 at 6:32 AM, David Dillow dillo...@ornl.gov wrote:
On Sat, 2012-01-14 at 12:41 +0000, Bart Van Assche wrote:
Enlarge the block layer timeout such that it is above the
InfiniBand transport layer timeout. This is necessary to avoid
that an SRP response is received after
On 02/26/12 06:32, David Dillow wrote:
On Sat, 2012-01-14 at 12:43 +0000, Bart Van Assche wrote:
Separate connection and host state. Only report QP errors while
connected. Only invoke ib_send_cm_dreq() from inside
srp_disconnect_target() when connected such that invoking
srp_disconnect_target
On 02/26/12 06:32, David Dillow wrote:
On Sat, 2012-01-14 at 12:45 +0000, Bart Van Assche wrote:
Introduce srp_remove_target(), srp_change_state_to_removed() and
srp_scan_target().
+static bool srp_change_state_to_removed(struct srp_target_port *target)
+{
+bool changed = false
On 02/26/12 06:32, David Dillow wrote:
On Sat, 2012-01-14 at 12:44 +0000, Bart Van Assche wrote:
When disconnecting the IB connection via the IB CM, wait until
any invoked completion handlers have finished processing SRP
protocol data and prevent that new work completions are queued.
Change
On 02/26/12 06:34, David Dillow wrote:
On Sat, 2012-01-14 at 12:54 +0000, Bart Van Assche wrote:
The sysfs attribute 'add_target' may be used to relogin to a
target. An SRP target that receives a second login request from
an initiator will disconnect the previous connection. So before
trying
On 02/26/12 06:39, David Dillow wrote:
On Sat, 2012-01-14 at 12:57 +0000, Bart Van Assche wrote:
Add fast_io_fail_tmo and dev_loss_tmo sysfs attributes. Block
the SCSI target as soon as a transport layer error has been
detected (ping timeout, disconnect or IB error completion). Try
This patch series makes the ib_srp driver better suited for use in a H.A. setup
because:
- Switchover can be triggered explicitly by deleting an initiator device.
- Disconnecting from a target without unloading ib_srp becomes possible.
Changes since v2:
- Addressed the v2 review comments.
-
Enlarge the block layer timeout for disks such that it is above
the InfiniBand transport layer timeout. This is necessary to avoid
that an SRP response is received after the SCSI layer has already
killed the associated SCSI command.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David
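To make the relation between the two timeouts concrete, here is a rough sketch of how a SCSI command timeout could be chosen so that it stays above the InfiniBand transport layer timeout. The local ACK timeout formula (4.096 us times 2^timeout) comes from the InfiniBand specification. The factor of four per attempt, the one-second margin, and all function names are assumptions of this sketch, not a quotation of the patch.

```c
#include <stdint.h>

/* Local ACK timeout in nanoseconds: 4.096 us * 2^timeout (5-bit QP attribute). */
static uint64_t ib_local_ack_timeout_ns(unsigned timeout)
{
    return 4096ULL << timeout;
}

/*
 * Upper bound in milliseconds for a request to complete or fail.
 * Assumption: each of retry_cnt attempts may take up to 4 times the
 * local ACK timeout.
 */
static uint64_t ib_max_completion_ms(unsigned timeout, unsigned retry_cnt)
{
    return retry_cnt * 4 * ib_local_ack_timeout_ns(timeout) / 1000000ULL;
}

/*
 * Pick a SCSI command timeout in seconds: never below default_s, and at
 * least one second above the transport bound (the margin is an assumption).
 */
static unsigned scsi_timeout_s(unsigned default_s, unsigned timeout,
                               unsigned retry_cnt)
{
    uint64_t bound_s = ib_max_completion_ms(timeout, retry_cnt) / 1000 + 1;

    return bound_s > default_s ? (unsigned)bound_s : default_s;
}
```

With the 30 second sd default discussed earlier in the thread, a QP timeout value of 19 and a retry count of 7 push the result above the default, while a timeout value of 14 leaves the default unchanged.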
likely case.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 37 ++
drivers/infiniband/ulp/srp/ib_srp.h |2 +-
2 files changed, 21 insertions
Block the SCSI host while reconnecting instead of representing
the reconnect activity as a distinct SRP target state.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 16
to be printed.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 43 +++---
drivers/infiniband/ulp/srp/ib_srp.h |1 +
2 files changed, 35 insertions(+), 9
Null scmnd for RSP ...
followed by a kernel oops.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c |6 --
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 19 ---
1 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b
srp_change_state() to just
before srp_change_state_to_removed() to avoid having to
introduce a forward declaration.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 108
Modify srp_disconnect_target() such that it waits until it is
sure that no new IB completions will be received.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 99
array will see all
values written into that array.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp
Cc: Brian King brk...@linux.vnet.ibm.com
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
Cc: sta...@kernel.org
adding new attributes.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp
Cc: Brian King brk...@linux.vnet.ibm.com
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/scsi/scsi_transport_srp.c | 26
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp
Cc: Brian King brk...@linux.vnet.ibm.com
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
Documentation/ABI/stable/sysfs-transport-srp | 12
1 files
Document the sysfs attributes of the SRP initiator (ib_srp) according
to the rules specified in Documentation/ABI/README.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
Documentation/ABI/stable/sysfs-driver-ib_srp
Make it possible to disconnect the IB RC connection used by the
SRP protocol to communicate with a target.
Let the SRP transport layer create a sysfs delete attribute for
initiator drivers that support this functionality.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 10 ++
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers
, remove the
target port. Add a target to the target list before connecting
instead of after such that this algorithm has a chance to work.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp
On Sunday 25 March 2012 15:17, Bart Van Assche bvanass...@acm.org wrote:
This patch series makes the ib_srp driver better suited for use in a H.A.
setup because [ ... ]
The patch series is also available here:
http://github.com/bvanassche/linux/commits/srp-ha/.
Bart.
--
To unsubscribe from
On 10/11/11 00:41, Roland Dreier wrote:
On Mon, Oct 10, 2011 at 10:47 AM, Bart Van Assche bvanass...@acm.org wrote:
- uint32_t hi = *(uint32_t *)(gid->raw);
- uint32_t lo = *(uint32_t *)(gid->raw + 4);
- if (hi == htonl(0xfe800000) && lo == 0)
- return 1
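The removed lines above test whether the first eight bytes of a GID equal the link-local prefix by casting the byte array to uint32_t pointers. A byte-wise comparison does the same test without depending on alignment or byte order. The following is only a sketch: struct gid16 and gid_is_link_local() are hypothetical names, and the struct stands in for the real GID type with its raw[16] member.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a 16-byte GID; hypothetical type used only in this sketch. */
struct gid16 {
    uint8_t raw[16];
};

/* Return 1 if the upper 64 bits of the GID are the link-local prefix fe80::/64. */
static int gid_is_link_local(const struct gid16 *gid)
{
    static const uint8_t prefix[8] = { 0xfe, 0x80, 0, 0, 0, 0, 0, 0 };

    return memcmp(gid->raw, prefix, sizeof(prefix)) == 0;
}
```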
Hi,
Apparently applications based on libumad can find local ports with
kernel 3.2.x but not with kernel 3.4-rc1.
# uname -r
3.4.0-rc1
# ls /sys/class/infiniband/mlx4_0/ports/1/rate
/sys/class/infiniband/mlx4_0/ports/1/rate
# cat /sys/class/infiniband/mlx4_0/ports/1/rate
cat:
On 04/02/12 10:33, Or Gerlitz wrote:
On 4/2/2012 10:42 AM, Bart Van Assche wrote:
# uname -r
3.4.0-rc1
# ls /sys/class/infiniband/mlx4_0/ports/1/rate
/sys/class/infiniband/mlx4_0/ports/1/rate
# cat /sys/class/infiniband/mlx4_0/ports/1/rate
cat: /sys/class/infiniband/mlx4_0/ports/1/rate
On 04/02/12 11:20, Or Gerlitz wrote:
On 4/2/2012 2:16 PM, Bart Van Assche wrote:
On 04/02/12 10:33, Or Gerlitz wrote:
As far as I can see the link layer value is fine:
$ cat /sys/class/infiniband/mlx4_0/ports/1/link_layer
InfiniBand
$ cat /sys/class/infiniband/mlx4_0/ports/2/link_layer
On 04/02/12 12:51, Or Gerlitz wrote:
On 4/2/2012 2:48 PM, Bart Van Assche wrote:
The two ports are connected back-to-back to another mlx4 HCA. I
noticed this behavior change since opensm stopped working after
rebooting into 3.4-rc1.
can you add these prints and send me the output after
On 04/01/12 19:09, Bart Van Assche wrote:
On 10/11/11 00:41, Roland Dreier wrote:
On Mon, Oct 10, 2011 at 10:47 AM, Bart Van Assche bvanass...@acm.org wrote:
- uint32_t hi = *(uint32_t *)(gid->raw);
- uint32_t lo = *(uint32_t *)(gid->raw + 4);
- if (hi == htonl(0xfe80
;
+ case IB_SPEED_SDR:
+ default:/* default to SDR for invalid rates */
+ rate = 25;
+ break;
}
rate *= ib_width_enum_to_int(attr.active_width);
--
1.7.9.1
Tested-by: Bart Van Assche bvanass...@acm.org
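The visible part of the patch sets rate to 25 for SDR and then multiplies by the link width, which suggests a per-lane rate in units of 0.1 Gb/s (SDR signals at 2.5 Gb/s per lane). The sketch below illustrates that calculation. The enum and function names are hypothetical, and the DDR and QDR values are filled in from the standard per-lane signaling rates rather than taken from the truncated patch.

```c
/* Hypothetical speed codes for this sketch; the kernel uses IB_SPEED_* values. */
enum link_speed { SPEED_SDR, SPEED_DDR, SPEED_QDR };

/* Per-lane signaling rate in units of 0.1 Gb/s. */
static int lane_rate_x10(enum link_speed speed)
{
    switch (speed) {
    case SPEED_DDR:
        return 50;
    case SPEED_QDR:
        return 100;
    case SPEED_SDR:
    default: /* default to SDR for invalid rates, as in the patch */
        return 25;
    }
}

/* Port rate in units of 0.1 Gb/s for a link that is `width` lanes wide. */
static int port_rate_x10(enum link_speed speed, int width)
{
    return lane_rate_x10(speed) * width;
}
```

For a 4x SDR link this gives 100, that is 10 Gb/s. An unrecognized speed code falls back to the SDR value instead of leaving the rate undefined.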
On 03/31/12 00:39, Ira Weiny wrote:
First, a question: what package installs the openibd script in OFED? For the
life of me I can't find this script in 1.5.4.1 or 3.2 ... :-/ [*]
That's easy to figure out:
$ rpm -qf /etc/init.d/openibd
kernel-ib-1.5.4-3.0.12+.x86_64
Bart.
--
To
On 04/05/12 19:10, Hefty, Sean wrote:
I create a .tar.gz package using 'make dist', copy it to another
system, then install it using 'configure && make install'. When I
do that, sysconfdir defaults to /usr/local/etc, sbindir to
/usr/local/sbin, and bindir to /usr/local/bin. I added /usr/local
On 04/11/12 18:29, Hefty, Sean wrote:
The following patch set contains an initial implementation of rsockets as
presented at the 2012 OpenFabrics Workshop. A copy of that presentation
is available at:
https://www.openfabrics.org/downloads/rdmacm/rsockets-ofa12.pptx
and a video of
On 04/18/12 20:21, Roland Dreier wrote:
On Wed, Apr 18, 2012 at 2:07 AM, Alexey Shvetsov ale...@gentoo.org wrote:
Apr 18 13:04:01 store kernel: mlx4_core 0000:4b:00.0: command 0x19 failed:
fw status = 0x9
status 0x9 is:
/* Resource is not in the appropriate state or ownership:
On 03/29/12 16:59, Dave Dillow wrote:
[ ... ]
I haven't chewed on the rest yet, but would like to see this one at
least in 3.4 if possible.
Hi Dave,
If you have further comments about any of the patches in this series,
these are welcome. The 3.5 merge window isn't that far away anymore.
Hello,
If I interpret the source code in drivers/infiniband/core/cm.c correctly
ib_destroy_cm_id() can return before an ongoing cm_id callback has
finished. Is this on purpose ? If not, isn't there a
flush_workqueue(cm.wq) call missing in cm_destroy_id() ?
Thanks,
Bart.
--
To unsubscribe from
On 04/27/12 17:18, Hefty, Sean wrote:
If I interpret the source code in drivers/infiniband/core/cm.c correctly
ib_destroy_cm_id() can return before an ongoing cm_id callback has
finished. Is this on purpose ? If not, isn't there a
flush_workqueue(cm.wq) call missing in cm_destroy_id() ?
On 04/30/12 18:29, Hefty, Sean wrote:
That makes me wonder how it is prevented that two CM callbacks for the
same CM ID run concurrently on different CPUs ?
The callback code ends up looking like this:
ret = atomic_inc_and_test(&cm_id_priv->work_count);
if (!ret)
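The fragment above is cut off, so the following is only a user-space sketch of the general idea, written with C11 atomics: a counter that starts at -1 lets exactly one caller run callbacks for a given ID, while any other caller queues its work for that first caller to drain. All names are hypothetical. The plain integers stand in for a work list that real code would protect with a lock, and the assumption that non-first callers queue their work is not shown in the quoted snippet.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct cm_id_sketch {
    atomic_int work_count; /* -1 means no callback is running */
    int queued;            /* stand-in for a lock-protected work list */
    int handled;           /* number of work items whose callback has run */
};

static void cm_id_sketch_init(struct cm_id_sketch *id)
{
    atomic_init(&id->work_count, -1);
    id->queued = 0;
    id->handled = 0;
}

/*
 * Emulates "increment and test for zero". Returns true if the caller is the
 * first one and must run the callback itself; otherwise the work is queued.
 */
static bool submit_work(struct cm_id_sketch *id)
{
    bool first = atomic_fetch_add(&id->work_count, 1) == -1;

    if (!first)
        id->queued++;
    return first;
}

/* Run the caller's own work item, then drain whatever was queued meanwhile. */
static void process_work(struct cm_id_sketch *id)
{
    id->handled++;
    /* Emulates "decrement and test for negative". */
    while (atomic_fetch_sub(&id->work_count, 1) - 1 >= 0) {
        id->queued--;
        id->handled++;
    }
}
```

Because only the caller that moved the counter from -1 to 0 ever runs callbacks, two callbacks for the same ID do not execute concurrently in this model, even when the work arrives on different CPUs.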
On 04/30/12 19:27, Hefty, Sean wrote:
* User requests shutdown and hence from another thread ib_send_cm_dreq() is
invoked.
ib_send_cm_dreq() at this point should fail with EINVAL, as the
connection state is not yet established.
You are right, that didn't make sense. What I have noticed
Just like other POSIX thread functions, pthread_create() either returns zero
or a positive error code. Found this through source code review. See also
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html
Signed-off-by: Bart Van Assche bvanass...@acm.org
---
complib
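As an illustration of the point made in this commit message, the sketch below checks the pthread_create() return value directly instead of testing for a negative value or consulting errno. The helper names worker() and start_and_join() are hypothetical and do not come from complib.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread function for the sketch: hands its argument back. */
static void *worker(void *arg)
{
    return arg;
}

/* Returns 0 on success or the positive error number reported by pthreads. */
static int start_and_join(void *arg, void **result)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, worker, arg);

    if (err != 0) { /* not "< 0": the error code is positive, errno is not set */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return err;
    }
    return pthread_join(tid, result);
}
```

A test such as `if (pthread_create(...) < 0)` never fires, which is the kind of mistake a source code review like the one described here would catch.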
On 07/24/12 15:16, Joseph Glanville wrote:
I have been seeing this KP occur about every 3 days on our staging cluster.
I am not exactly sure what the root cause would be.. I assume this
would be a bug in SCST.
The kernel is a 3.2.14 with Ubuntu patch series applied and Bart's SRP
HA patches.
On 07/24/12 19:50, Joseph Glanville wrote:
On 25 July 2012 03:53, Bart Van Assche bvanass...@acm.org wrote:
On 07/24/12 15:16, Joseph Glanville wrote:
I have been seeing this KP occur about every 3 days on our staging cluster.
I am not exactly sure what the root cause would be.. I assume
On 07/24/12 15:43, Joseph Glanville wrote:
[35404.804723] BUG: unable to handle kernel NULL pointer dereference at (null)
I've been able to reproduce this ib_srp crash. Apparently if an SRP
response is received after srp_reset_host() has been invoked
srp_process_rsp() tries to call
On 08/02/12 20:12, David Dillow wrote:
On Thu, 2012-08-02 at 11:04 +0000, Bart Van Assche wrote:
On 07/24/12 15:43, Joseph Glanville wrote:
[35404.804723] BUG: unable to handle kernel NULL pointer dereference at
(null)
I've been able to reproduce this ib_srp crash. Apparently if an SRP
...@mellanox.com
Signed-off-by: Jack Morgenstein ja...@dev.mellanox.co.il
Tested-by: Bart Van Assche bvanass...@acm.org
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo
This patch series makes the ib_srp driver better suited for use in a
H.A. setup because:
- multipathd is notified faster about transport layer failures.
- Transport layer failures reliably result in a reconnect.
- Switchover can be triggered explicitly by deleting an initiator
device.
-
Test the QP state inside srp_send_tsk_mgmt() instead of letting each
caller perform that test.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c |8 +++-
1 files changed, 3
srp_handle_qp_err(), change the type of
qp_in_error from int into bool and move the initialization of that
variable from srp_reconnect_target() to srp_connect_target().
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers
Block the SCSI host while reconnecting instead of representing
the reconnection activity as a distinct SRP target state. This
makes it possible to eliminate the target state SRP_TARGET_CONNECTING.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol
Keep track of the connection state. Only report QP errors while
connected. Only invoke ib_send_cm_dreq() when connected such that
invoking srp_disconnect_target() after having received a DREQ
does not cause an error message to be printed.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc
Null scmnd for RSP ...
followed by a kernel oops.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c |2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 19 ---
1 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b
Modify srp_disconnect_target() such that it waits until it is
sure that no new IB completions will be received anymore.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 104
adding new attributes.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp
Cc: Robert Jennings r...@linux.vnet.ibm.com
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/scsi/scsi_transport_srp.c | 26
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 10 ++
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers
.
- Support for implementing dev_loss_tmo, the time that should
elapse after having detected a transport layer problem and
before removing a remote port.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp
Cc: Robert Jennings r...@linux.vnet.ibm.com
Cc
Remove an SRP host if either dev_loss_tmo expired or the target
closed the IB connection.
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
---
drivers/infiniband/ulp/srp/ib_srp.c | 53
On 08/09/12 15:41, Bart Van Assche wrote:
[ ... ]
The patch series is also available on top of 3.6-rc1 here:
http://github.com/bvanassche/linux/tree/srp-ha
Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More
-rdmam=134314367801595
Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: David Dillow dillo...@ornl.gov
Cc: Roland Dreier rol...@purestorage.com
Cc: sta...@vger.kernel.org
---
drivers/infiniband/ulp/srp/ib_srp.c | 87 +--
1 files changed, 63 insertions(+), 24