Re: [PATCH 04/14] ib_srp: Set block layer timeout

2011-12-18 Thread Bart Van Assche
On Sat, Dec 17, 2011 at 5:03 PM, Or Gerlitz or.gerl...@gmail.com wrote: Bart Van Assche bvanass...@acm.org wrote: The default block layer timeout is 30 seconds. Could you provide a pointer to where this is defined? Sorry, it's not the block layer but sd (SCSI disk) that sets that timeout

Re: [PATCH 03/14] ib_srp: Disallow duplicate logins

2011-12-18 Thread Bart Van Assche
On Thu, Dec 15, 2011 at 7:08 PM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 19:58 +0100, Bart Van Assche wrote: Currently the ib_srp driver allows logging in to the same target multiple times via its sysfs interface. This leads to each target LUN being imported multiple

Re: [PATCH 11/14] ib_srp: Document sysfs attributes

2011-12-19 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 12:33 AM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 20:09 +0100, Bart Van Assche wrote: +             * cmd_sg_entries, the maximum number of memory descriptors +               that fit in a single SRP command when using the direct data

Re: [PATCH 14/14] ib_srp: Allow SRP disconnect through sysfs

2011-12-19 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 4:03 AM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 20:13 +0100, Bart Van Assche wrote: Make it possible to disconnect via sysfs the IB RC connection used by the SRP protocol to communicate with a target. Let the SRP transport layer create a sysfs

Re: [PATCH 13/14] ib_srp: Implement transport layer ping

2011-12-19 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 12:50 AM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 20:11 +0100, Bart Van Assche wrote: Add a time-based transport layer test such that fail-over in a multipath setup can happen quickly. Why should this be done in the kernel? multipathd already

Re: [PATCH 03/14] ib_srp: Disallow duplicate logins

2011-12-19 Thread Bart Van Assche
On Sun, Dec 18, 2011 at 9:40 PM, David Dillow dillo...@ornl.gov wrote: A SRP target port (for the purposes of an I_T nexus) is defined by the IOC GUID and extension; if you want to have redundant connections to a port, multi-channel logins must be supported. The way most targets avoid this

Re: [PATCH 12/14] ib_srp: Rework error handling

2011-12-19 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 3:36 AM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 20:10 +0100, Bart Van Assche wrote: Rework ib_srp transport layer error handling. Instead of letting SCSI commands time out if a transport layer error occurs, This is good, but should probably be part

Re: [PATCH 12/14] ib_srp: Rework error handling

2011-12-20 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 10:51 PM, David Dillow dillo...@ornl.gov wrote: I haven't parsed it all out from your changes just yet, but I think part of the reason you may have had problems with req->scmd being null in srp_handle_recv() is due to a new race between the tear down of the connection

Re: [PATCH 13/14] ib_srp: Implement transport layer ping

2011-12-20 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 10:32 PM, David Dillow dillo...@ornl.gov wrote: I still think this is already solved in user space, but the new reconnect model you've implemented doesn't match up with the expected semantics. It'd be better to match the rest of the SCSI stack for this. Sorry, but I'm

Re: [PATCH 10/14] srp_transport: Simplify attribute initialization code

2011-12-20 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 12:07 AM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 20:08 +0100, Bart Van Assche wrote: Eliminate the private_rport_attrs[] array and the SETUP_*() macros used to set up that array since the information in that array duplicates the information

Re: [PATCH 13/14] ib_srp: Implement transport layer ping

2011-12-20 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 10:32 PM, David Dillow dillo...@ornl.gov wrote: Part of the problem is introduced by allowing for permanent connections rather than using the familiar dev_loss_tmo and fast_io_fail_tmo parameters from other SCSI transports. For instance, in the FC transport, rports are

Re: [PATCH 12/14] ib_srp: Rework error handling

2011-12-21 Thread Bart Van Assche
On Wed, Dec 21, 2011 at 3:33 AM, David Dillow dillo...@ornl.gov wrote: What keeps the srp_recv_completion() --> srp_handle_recv() --> srp_process_rsp() --> etc. call chain from racing with srp_reconnect_target()? It looks like the srp_reset_req() in srp_reconnect_target() could race with

Re: [PATCH 13/14] ib_srp: Implement transport layer ping

2011-12-21 Thread Bart Van Assche
On Wed, Dec 21, 2011 at 3:05 AM, David Dillow dillo...@ornl.gov wrote: We don't want to leave a target blocked indefinitely -- commands caught in the blocked queue won't be reissued until the queue is unblocked -- but we may want to keep the sdX mappings around for a long time.

Re: [PATCH 12/14] ib_srp: Rework error handling

2011-12-26 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 3:36 AM, David Dillow dillo...@ornl.gov wrote: On Thu, 2011-12-01 at 20:10 +0100, Bart Van Assche wrote:  /*   * Copyright (c) 2005 Cisco Systems.  All rights reserved. + * Copyright (c) 2010-2011 Bart Van Assche bvanass...@acm.org. You've tried to add

Re: [PATCH 13/14] ib_srp: Implement transport layer ping

2011-12-26 Thread Bart Van Assche
On Sat, Dec 24, 2011 at 8:07 PM, David Dillow dillo...@ornl.gov wrote: This says to me that SRP should use the dev_loss_tmo semantics, though the naming of fast_io_fail vs replacement_timeout is a bit more of a question than I thought. I tend to think of SRP more in terms of FC than iSCSI, so

Re: [PATCH 13/14] ib_srp: Implement transport layer ping

2011-12-26 Thread Bart Van Assche
On Fri, Dec 23, 2011 at 10:56 PM, Mike Christie micha...@cs.wisc.edu wrote: iSCSI replacement_timeout is the same as fast_io_fail_tmo for FC. iSCSI replacement_timeout actually came first so you should say FC should have copied our name :) Agreed. But as far as I can see multipathd only

Re: When is it safe to release connection resources?

2011-12-31 Thread Bart Van Assche
On Thu, Dec 29, 2011 at 4:42 PM, Flavio Baronti f.baro...@list-group.com wrote: I'm new to RDMA development and I have a question regarding resource release. If I understood correctly, when ibv_get_cq_event returns, it holds some sort of lock over the completion queue, which is released when I

Re: [PATCH 1/2] libibverbs: Allow 3rd party extensions to verb routines

2011-12-31 Thread Bart Van Assche
On Thu, Dec 29, 2011 at 3:43 PM, Or Gerlitz ogerl...@mellanox.com wrote: 5. what happens if we just want to enhance an -- existing -- function - suppose we want to enhance ibv_post_send / ibv_poll_cq to support features like LSO, checksum offload, masked atomic operations, fast memory remote

Re: [GIT PULL] please pull infiniband.git

2012-01-03 Thread Bart Van Assche
On Mon, Dec 19, 2011 at 5:39 PM, Roland Dreier rol...@kernel.org wrote: Roland Dreier (2):      IB/mlx4: Fix shutdown crash accessing a non-existent bitmap Hi Roland, As far as I can see this fix is upstream as commit 4af3ce0de0c12e5c17811eaefad36ab8e146c0fd but is not yet included in v3.1.7.

Re: When is it safe to release connection resources?

2012-01-04 Thread Bart Van Assche
On Mon, Jan 2, 2012 at 4:39 PM, Hefty, Sean sean.he...@intel.com wrote: If I understood correctly, when ibv_get_cq_event returns, it holds some sort of lock over the completion queue, which is ibv_destroy_cq, so that: 1) When ibv_destroy_cq returns, I am certain that there is no thread running

Re: When is it safe to release connection resources?

2012-01-05 Thread Bart Van Assche
On Wed, Jan 4, 2012 at 9:05 PM, Hefty, Sean sean.he...@intel.com wrote: I've just had a look at the kernel code that implements all this (uverbs_cmd.c and uverbs_main.c). I haven't found any precautions against ib_uverbs_comp_handler() accessing *uobj after ib_uverbs_destroy_cq() has invoked

Re: When is it safe to release connection resources?

2012-01-07 Thread Bart Van Assche
On Thu, Jan 5, 2012 at 5:29 PM, Roland Dreier rol...@purestorage.com wrote: That should be OK.  ib_uverbs_destroy_cq() doesn't really do anything until ib_destroy_cq() has returned, and at that point it is guaranteed that the completion handler for the CQ is done. That sounds like a

[PATCH, RFC] nes: Add missing rcu_read_unlock() call

2012-01-10 Thread Bart Van Assche
Hi, Sparse complains about nes_addr_resolve_neigh(). Does the patch below make sense ? --- drivers/infiniband/hw/nes/nes_cm.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c index 425065b..1b2ccc7

Re: endian question about struct srp_direct_buf

2012-01-12 Thread Bart Van Assche
On Thu, Jan 12, 2012 at 12:41 PM, Dan Carpenter dan.carpen...@oracle.com wrote: Sparse complains because len in struct srp_direct_buf is declared as big endian but it's used throughout as CPU endian.  struct srp_indirect_buf has the same thing.  It's declared one way but used the other way.

[PATCH 00/18, v2] Make ib_srp better suited for H.A. purposes

2012-01-14 Thread Bart Van Assche
This patch series makes the ib_srp driver better suited for use in a H.A. setup because: - Switchover without triggering read or write errors becomes possible. Such errors are bad because these can make a filesystem switch to read-only mode. - A ping mechanism has been added that allows to

[PATCH 01/18] ib_srp: Introduce pr_fmt()

2012-01-14 Thread Bart Van Assche
statement in srp_qp_event() from error to debug. Change a double space into a single in the bad IO class parameter error message. Remove one trailing space to avoid a checkpatch warning. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol

[PATCH 02/18] ib_srp: Consolidate repetitive sysfs code

2012-01-14 Thread Bart Van Assche
Remove sysfs attributes before removing a target instead of testing the target state in every sysfs attribute callback method. Note: it is safe to invoke a sysfs attribute removal method like device_remove_file() twice on the same attribute. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc

[PATCH 03/18] ib_srp: Enlarge block layer timeout

2012-01-14 Thread Bart Van Assche
type of SCSI device (M/O disk, tape, CD-ROM, ...). Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 46 +++ drivers/infiniband/ulp/srp/ib_srp.h

[PATCH 04/18] ib_srp: Micro-optimize completion handlers

2012-01-14 Thread Bart Van Assche
from int to bool and move the initialization of that variable into srp_connect_target(). Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 57

[PATCH 05/18] ib_srp: Separate connection and host state

2012-01-14 Thread Bart Van Assche
Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 52 +-- drivers/infiniband/ulp/srp/ib_srp.h |2 +- 2 files changed, 38 insertions(+), 16 deletions(-) diff --git

[PATCH 06/18] ib_srp: Wait for last completion when disconnecting

2012-01-14 Thread Bart Van Assche
such that receiving a completion with zero wr_id is recognized as an end-of-work marker. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 81

[PATCH 07/18] ib_srp: Introduce three helper functions

2012-01-14 Thread Bart Van Assche
Introduce srp_remove_target(), srp_change_state_to_removed() and srp_scan_target(). Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 49 +-- 1

[PATCH 08/18] ib_srp: Eliminate state SRP_TARGET_DEAD

2012-01-14 Thread Bart Van Assche
safe to invoke that last function even if the IB connection has already been disconnected. Rename srp_target_port.work into srp_target_port.remove_work and move function srp_change_state() to just before srp_change_state_to_removed(). Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow

[PATCH 12/18] ib_srp: Document sysfs attributes

2012-01-14 Thread Bart Van Assche
Document the sysfs attributes of the SRP initiator (ib_srp) according to the rules specified in Documentation/ABI/README. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- Documentation/ABI/stable/sysfs-driver-ib_srp

[PATCH 13/18] ib_srp: Allow SRP disconnect through sysfs

2012-01-14 Thread Bart Van Assche
Make it possible to disconnect the IB RC connection used by the SRP protocol to communicate with a target. Let the SRP transport layer create a sysfs delete attribute for initiator drivers that support this functionality. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo

[PATCH 14/18] ib_srp: Move target port removal code

2012-01-14 Thread Bart Van Assche
Move the code for removing a target port if reconnecting during a reset triggered by the SCSI mid-layer fails from inside srp_reconnect_target() into srp_reset_host(). Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com

[PATCH 15/18] ib_srp: Maintain a single connection per I_T nexus

2012-01-14 Thread Bart Van Assche
, remove the target port. Add a target to the target list before connecting instead of after such that this algorithm has a chance to work. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp

[PATCH 18/18] ib_srp: Rework error handling

2012-01-14 Thread Bart Van Assche
sure that error recovery is not triggered during host removal. Swap the connected and removed tests in srp_queuecommand() because of this change. Rescan LUNs after having unblocked a SCSI target controlled by ib_srp. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo

Re: [PATCH 00/18, v2] Make ib_srp better suited for H.A. purposes

2012-01-15 Thread Bart Van Assche
On Sat, Jan 14, 2012 at 10:10 PM, David Dillow dillo...@ornl.gov wrote: On Sat, 2012-01-14 at 12:36 +0000, Bart Van Assche wrote: This patch series makes the ib_srp driver better suited for use in a H.A. What kernel version is this based on? 3.2.0+ (commit

Re: [PATCH 00/18, v2] Make ib_srp better suited for H.A. purposes

2012-02-06 Thread Bart Van Assche
On Sat, Jan 14, 2012 at 1:36 PM, Bart Van Assche bvanass...@acm.org wrote: This patch series makes the ib_srp driver better suited for use in a H.A. Hi Dave, Do you have any review comments about this patch series ? Thanks, Bart.

Re: [PATCH 00/18, v2] Make ib_srp better suited for H.A. purposes

2012-02-24 Thread Bart Van Assche
On Tue, Feb 7, 2012 at 1:36 AM, Dave Dillow dillo...@ornl.gov wrote: On Mon, Feb 06, 2012 at 11:16:25AM -0500, Bart Van Assche wrote: On Sat, Jan 14, 2012 at 1:36 PM, Bart Van Assche bvanass...@acm.org wrote: This patch series makes the ib_srp driver better suited for use in a H.A. Do

Re: [PATCH 03/18] ib_srp: Enlarge block layer timeout

2012-02-26 Thread Bart Van Assche
On Sun, Feb 26, 2012 at 6:32 AM, David Dillow dillo...@ornl.gov wrote: On Sat, 2012-01-14 at 12:41 +0000, Bart Van Assche wrote: Enlarge the block layer timeout such that it is above the InfiniBand transport layer timeout. This is necessary to prevent an SRP response from being received after

Re: [PATCH 05/18] ib_srp: Separate connection and host state

2012-03-03 Thread Bart Van Assche
On 02/26/12 06:32, David Dillow wrote: On Sat, 2012-01-14 at 12:43 +0000, Bart Van Assche wrote: Separate connection and host state. Only report QP errors while connected. Only invoke ib_send_cm_dreq() from inside srp_disconnect_target() when connected such that invoking srp_disconnect_target

Re: [PATCH 07/18] ib_srp: Introduce three helper functions

2012-03-03 Thread Bart Van Assche
On 02/26/12 06:32, David Dillow wrote: On Sat, 2012-01-14 at 12:45 +0000, Bart Van Assche wrote: Introduce srp_remove_target(), srp_change_state_to_removed() and srp_scan_target(). +static bool srp_change_state_to_removed(struct srp_target_port *target) +{ +bool changed = false

Re: [PATCH 06/18] ib_srp: Wait for last completion when disconnecting

2012-03-03 Thread Bart Van Assche
On 02/26/12 06:32, David Dillow wrote: On Sat, 2012-01-14 at 12:44 +0000, Bart Van Assche wrote: When disconnecting the IB connection via the IB CM, wait until any invoked completion handlers have finished processing SRP protocol data and prevent new work completions from being queued. Change

Re: [PATCH 15/18] ib_srp: Maintain a single connection per I_T nexus

2012-03-03 Thread Bart Van Assche
On 02/26/12 06:34, David Dillow wrote: On Sat, 2012-01-14 at 12:54 +0000, Bart Van Assche wrote: The sysfs attribute 'add_target' may be used to relogin to a target. An SRP target that receives a second login request from an initiator will disconnect the previous connection. So before trying

Re: [PATCH 18/18] ib_srp: Rework error handling

2012-03-04 Thread Bart Van Assche
On 02/26/12 06:39, David Dillow wrote: On Sat, 2012-01-14 at 12:57 +0000, Bart Van Assche wrote: Add fast_io_fail_tmo and dev_loss_tmo sysfs attributes. Block the SCSI target as soon as a transport layer error has been detected (ping timeout, disconnect or IB error completion). Try

[PATCH 00/15, v3] Make ib_srp better suited for H.A. purposes

2012-03-25 Thread Bart Van Assche
This patch series makes the ib_srp driver better suited for use in a H.A. setup because: - Switchover can be triggered explicitly by deleting an initiator device. - Disconnecting from a target without unloading ib_srp becomes possible. Changes since v2: - Addressed the v2 review comments. -

[PATCH 01/15] ib_srp: Enlarge block layer timeout

2012-03-25 Thread Bart Van Assche
Enlarge the block layer timeout for disks such that it is above the InfiniBand transport layer timeout. This is necessary to prevent an SRP response from being received after the SCSI layer has already killed the associated SCSI command. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David

[PATCH 02/15] ib_srp: Introduce srp_handle_qp_err()

2012-03-25 Thread Bart Van Assche
likely case. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 37 ++ drivers/infiniband/ulp/srp/ib_srp.h |2 +- 2 files changed, 21 insertions

[PATCH 03/15] ib_srp: Micro-optimize srp_queuecommand()

2012-03-25 Thread Bart Van Assche
Block the SCSI host while reconnecting instead of representing the reconnect activity as a distinct SRP target state. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 16

[PATCH 04/15] ib_srp: Suppress superfluous error messages

2012-03-25 Thread Bart Van Assche
to be printed. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 43 +++--- drivers/infiniband/ulp/srp/ib_srp.h |1 + 2 files changed, 35 insertions(+), 9

[PATCH 05/15] ib_srp: Avoid that SCSI error handling triggers a crash

2012-03-25 Thread Bart Van Assche
Null scmnd for RSP ... followed by a kernel oops. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git

[PATCH 06/15] ib_srp: Introduce the helper function srp_remove_target()

2012-03-25 Thread Bart Van Assche
Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 19 --- 1 files changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b

[PATCH 07/15] ib_srp: Eliminate state SRP_TARGET_DEAD

2012-03-25 Thread Bart Van Assche
srp_change_state() to just before srp_change_state_to_removed() to avoid having to introduce a forward declaration. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 108

[PATCH 08/15] ib_srp: Make srp_disconnect_target() wait for IB completions

2012-03-25 Thread Bart Van Assche
Modify srp_disconnect_target() such that it waits until it is sure that no new IB completions will be received. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 99

[PATCH 09/15] srp_transport: Fix atttribute registration

2012-03-25 Thread Bart Van Assche
array will see all values written into that array. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp Cc: Brian King brk...@linux.vnet.ibm.com Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com Cc: sta...@kernel.org

[PATCH 10/15] srp_transport: Simplify attribute initialization code

2012-03-25 Thread Bart Van Assche
adding new attributes. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp Cc: Brian King brk...@linux.vnet.ibm.com Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/scsi/scsi_transport_srp.c | 26

[PATCH 11/15] srp_transport: Document sysfs attributes

2012-03-25 Thread Bart Van Assche
Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp Cc: Brian King brk...@linux.vnet.ibm.com Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- Documentation/ABI/stable/sysfs-transport-srp | 12 1 files

[PATCH 12/15] ib_srp: Document sysfs attributes

2012-03-25 Thread Bart Van Assche
Document the sysfs attributes of the SRP initiator (ib_srp) according to the rules specified in Documentation/ABI/README. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- Documentation/ABI/stable/sysfs-driver-ib_srp

[PATCH 13/15] ib_srp: Allow SRP disconnect through sysfs

2012-03-25 Thread Bart Van Assche
Make it possible to disconnect the IB RC connection used by the SRP protocol to communicate with a target. Let the SRP transport layer create a sysfs delete attribute for initiator drivers that support this functionality. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo

[PATCH 14/15] ib_srp: Introduce a temporary variable in srp_remove_target()

2012-03-25 Thread Bart Van Assche
Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers

[PATCH 15/15] ib_srp: Maintain a single connection per I_T nexus

2012-03-25 Thread Bart Van Assche
, remove the target port. Add a target to the target list before connecting instead of after such that this algorithm has a chance to work. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp

Re: [PATCH 00/15, v3] Make ib_srp better suited for H.A. purposes

2012-03-25 Thread Bart Van Assche
On Sunday 25 March 2012 15:17, Bart Van Assche bvanass...@acm.org wrote: This patch series makes the ib_srp driver better suited for use in a H.A. setup because [ ... ] The patch series is also available here: http://github.com/bvanassche/linux/commits/srp-ha/. Bart.

Re: [PATCH] libmlx4: Fix a compiler warning

2012-04-01 Thread Bart Van Assche
On 10/11/11 00:41, Roland Dreier wrote: On Mon, Oct 10, 2011 at 10:47 AM, Bart Van Assche bvanass...@acm.org wrote: - uint32_t hi = *(uint32_t *)(gid->raw); - uint32_t lo = *(uint32_t *)(gid->raw + 4); - if (hi == htonl(0xfe800000) && lo == 0) - return 1
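The removed lines above test for the link-local GID prefix (fe80::/64) by casting the raw byte array to uint32_t pointers. The snippet does not show the replacement code or name the warning, so the following is only a minimal sketch of one cast-free way to write the same test. It assumes the warning came from the pointer casts, and `struct sketch_gid` and `sketch_is_link_local()` are hypothetical names, not libmlx4 identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte GID; not the libibverbs type. */
struct sketch_gid {
    uint8_t raw[16];
};

/*
 * Return 1 if the first eight bytes of the GID equal the link-local
 * prefix fe80:0000:0000:0000, else 0. Comparing bytes with memcmp()
 * avoids casting the byte array to a wider integer type.
 */
static int sketch_is_link_local(const struct sketch_gid *gid)
{
    static const uint8_t prefix[8] = { 0xfe, 0x80, 0, 0, 0, 0, 0, 0 };

    assert(gid != NULL);
    return memcmp(gid->raw, prefix, sizeof(prefix)) == 0;
}
```

A byte-wise comparison also removes the need for htonl(), since no host-order integer is ever formed.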

mlx4: kernel 3.4-rc1 breaks libumad

2012-04-02 Thread Bart Van Assche
Hi, Apparently applications based on libumad can find local ports with kernel 3.2.x but not with kernel 3.4-rc1. # uname -r 3.4.0-rc1 # ls /sys/class/infiniband/mlx4_0/ports/1/rate /sys/class/infiniband/mlx4_0/ports/1/rate # cat /sys/class/infiniband/mlx4_0/ports/1/rate cat:

Re: mlx4: kernel 3.4-rc1 breaks libumad

2012-04-02 Thread Bart Van Assche
On 04/02/12 10:33, Or Gerlitz wrote: On 4/2/2012 10:42 AM, Bart Van Assche wrote: # uname -r 3.4.0-rc1 # ls /sys/class/infiniband/mlx4_0/ports/1/rate /sys/class/infiniband/mlx4_0/ports/1/rate # cat /sys/class/infiniband/mlx4_0/ports/1/rate cat: /sys/class/infiniband/mlx4_0/ports/1/rate

Re: mlx4: kernel 3.4-rc1 breaks libumad

2012-04-02 Thread Bart Van Assche
On 04/02/12 11:20, Or Gerlitz wrote: On 4/2/2012 2:16 PM, Bart Van Assche wrote: On 04/02/12 10:33, Or Gerlitz wrote: As far as I can see the link layer value is fine: $ cat /sys/class/infiniband/mlx4_0/ports/1/link_layer InfiniBand $ cat /sys/class/infiniband/mlx4_0/ports/2/link_layer

Re: mlx4: kernel 3.4-rc1 breaks libumad

2012-04-02 Thread Bart Van Assche
On 04/02/12 12:51, Or Gerlitz wrote: On 4/2/2012 2:48 PM, Bart Van Assche wrote: The two ports are connected back-to-back to another mlx4 HCA. I noticed this behavior change since opensm stopped working after rebooting into 3.4-rc1. can you add these prints and send me the output after

Re: [PATCH] libmlx4: Fix a compiler warning

2012-04-02 Thread Bart Van Assche
On 04/01/12 19:09, Bart Van Assche wrote: On 10/11/11 00:41, Roland Dreier wrote: On Mon, Oct 10, 2011 at 10:47 AM, Bart Van Assche bvanass...@acm.org wrote: - uint32_t hi = *(uint32_t *)(gid->raw); - uint32_t lo = *(uint32_t *)(gid->raw + 4); - if (hi == htonl(0xfe800000

Re: [PATCH] IB/core: Don't return EINVAL from sysfs rate attribute for invalid speeds

2012-04-04 Thread Bart Van Assche
; + case IB_SPEED_SDR: + default:/* default to SDR for invalid rates */ + rate = 25; + break; } rate *= ib_width_enum_to_int(attr.active_width); -- 1.7.9.1 Tested-by: Bart Van Assche bvanass...@acm.org
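The quoted hunk falls back to the SDR rate for unknown speeds instead of returning EINVAL, then multiplies by the link width. A minimal user-space sketch of that calculation follows. It assumes rates are kept in tenths of Gb/s per lane (25 for SDR, as in the hunk; 50 for DDR and 100 for QDR follow from the 5 and 10 Gb/s lane rates). The enum and function names are hypothetical and are not the kernel's.

```c
#include <assert.h>

/* Hypothetical speed codes for illustration; not the kernel's enum. */
enum sketch_speed {
    SKETCH_SPEED_SDR = 1,
    SKETCH_SPEED_DDR = 2,
    SKETCH_SPEED_QDR = 4
};

/*
 * Return the port rate in tenths of Gb/s for the given per-lane speed
 * and lane count. Unknown speed codes are treated as SDR rather than
 * reported as an error.
 */
static int sketch_port_rate(int speed, int width_lanes)
{
    int rate;

    assert(width_lanes > 0);
    switch (speed) {
    case SKETCH_SPEED_DDR:
        rate = 50;
        break;
    case SKETCH_SPEED_QDR:
        rate = 100;
        break;
    case SKETCH_SPEED_SDR:
    default: /* default to SDR for invalid speeds */
        rate = 25;
        break;
    }
    return rate * width_lanes;
}
```

With this fallback a reader of the attribute always gets a number, which is what the libumad report earlier in this listing appears to depend on.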

Re: [RFC] Proposal to change Node Description naming scheme for HCA's

2012-04-05 Thread Bart Van Assche
On 03/31/12 00:39, Ira Weiny wrote: First, a question: what package installs the openibd script in OFED? For the life of me I can't find this script in 1.5.4.1 or 3.2 ... :-/ [*] That's easy to figure out: $ rpm -qf /etc/init.d/openibd kernel-ib-1.5.4-3.0.12+.x86_64 Bart.

Re: [PATCH] ibacm: Fixes to ACM package to support distros

2012-04-05 Thread Bart Van Assche
On 04/05/12 19:10, Hefty, Sean wrote: I create a .tar.gz package using 'make dist', copy it to another system, then install it using 'configure make install'. When I do that, sysconfdir defaults to /usr/local/etc, sbindir to /usr/local/sbin, and bindir to /usr/local/bin. I added /usr/local

Re: [RFC] [PATCH 0/4] librdmacm: Rsockets API and implementation

2012-04-11 Thread Bart Van Assche
On 04/11/12 18:29, Hefty, Sean wrote: The following patch set contains an initial implementation of rsockets as presented at the 2012 OpenFabrics Workshop. A copy of that presentation is available at: https://www.openfabrics.org/downloads/rdmacm/rsockets-ofa12.pptx and a video of

Re: kernel: rejected SRP_LOGIN_REQ because creating a new RDMA channel failed.

2012-04-19 Thread Bart Van Assche
On 04/18/12 20:21, Roland Dreier wrote: On Wed, Apr 18, 2012 at 2:07 AM, Alexey Shvetsov ale...@gentoo.org wrote: Apr 18 13:04:01 store kernel: mlx4_core 0000:4b:00.0: command 0x19 failed: fw status = 0x9 status 0x9 is: /* Resource is not in the appropriate state or ownership:

Re: [PATCH 01/15] ib_srp: Enlarge block layer timeout

2012-04-22 Thread Bart Van Assche
On 03/29/12 16:59, Dave Dillow wrote: [ ... ] I haven't chewed on the rest yet, but would like to see this one at least in 3.4 if possible. Hi Dave, If you have further comments about any of the patches in this series, these are welcome. The 3.5 merge window isn't that far away anymore.

ib_destroy_cm_id() versus cm callback race ?

2012-04-27 Thread Bart Van Assche
Hello, If I interpret the source code in drivers/infiniband/core/cm.c correctly ib_destroy_cm_id() can return before an ongoing cm_id callback has finished. Is this on purpose ? If not, isn't there a flush_workqueue(cm.wq) call missing in cm_destroy_id() ? Thanks, Bart.

Re: ib_destroy_cm_id() versus cm callback race ?

2012-04-28 Thread Bart Van Assche
On 04/27/12 17:18, Hefty, Sean wrote: If I interpret the source code in drivers/infiniband/core/cm.c correctly ib_destroy_cm_id() can return before an ongoing cm_id callback has finished. Is this on purpose ? If not, isn't there a flush_workqueue(cm.wq) call missing in cm_destroy_id() ?

Re: ib_destroy_cm_id() versus cm callback race ?

2012-04-30 Thread Bart Van Assche
On 04/30/12 18:29, Hefty, Sean wrote: That makes me wonder how it is prevented that two CM callbacks for the same CM ID run concurrently on different CPUs ? The callback code ends up looking like this: ret = atomic_inc_and_test(&cm_id_priv->work_count); if (!ret)

Re: ib_destroy_cm_id() versus cm callback race ?

2012-05-01 Thread Bart Van Assche
On 04/30/12 19:27, Hefty, Sean wrote: * User requests shutdown and hence from another thread ib_send_cm_dreq() is invoked. ib_send_cm_dreq() at this point should fail with EINVAL, as the connection state is not yet established. You are right, that didn't make sense. What I have noticed

[PATCH] opensm: Fix pthread_create() return value checks

2012-06-07 Thread Bart Van Assche
Just like other POSIX thread functions, pthread_create() either returns zero or a positive error code. Found this through source code review. See also http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html Signed-off-by: Bart Van Assche bvanass...@acm.org --- complib
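As a minimal illustration of the rule stated in this patch description, the sketch below checks the pthread_create() return value against zero and passes the returned code to strerror(). The `worker()` and `start_worker()` names are hypothetical and are not taken from opensm.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical thread body used only for this illustration. */
static void *worker(void *arg)
{
    return arg;
}

/*
 * Start a thread and report failure correctly. pthread_create()
 * returns 0 on success and a positive error number on failure, so a
 * "rc < 0" test would never detect an error.
 * Returns 0 on success or the positive error number on failure.
 */
static int start_worker(pthread_t *tid, void *arg)
{
    int rc;

    assert(tid != NULL);
    rc = pthread_create(tid, NULL, worker, arg);
    if (rc != 0)
        fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
    return rc;
}
```

The error number comes from the return value, not from errno, which is why it is handed directly to strerror().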

Re: Kernel panic under 3.2.14 Xen dom0 and SCST trunk

2012-07-24 Thread Bart Van Assche
On 07/24/12 15:16, Joseph Glanville wrote: I have been seeing this KP occur about every 3 days on our staging cluster. I am not exactly sure what the root cause would be.. I assume this would be a bug in SCST. The kernel is a 3.2.14 with Ubuntu patch series applied and Bart's SRP HA patches.

Re: Kernel panic under 3.2.14 Xen dom0 and SCST trunk

2012-07-24 Thread Bart Van Assche
On 07/24/12 19:50, Joseph Glanville wrote: On 25 July 2012 03:53, Bart Van Assche bvanass...@acm.org wrote: On 07/24/12 15:16, Joseph Glanville wrote: I have been seeing this KP occur about every 3 days on our staging cluster. I am not exactly sure what the root cause would be.. I assume

Re: Kernel panic under 3.2.14 Xen dom0 and SCST trunk

2012-08-02 Thread Bart Van Assche
On 07/24/12 15:43, Joseph Glanville wrote: [35404.804723] BUG: unable to handle kernel NULL pointer dereference at (null) I've been able to reproduce this ib_srp crash. Apparently if an SRP response is received after srp_reset_host() has been invoked srp_process_rsp() tries to call

Re: Kernel panic under 3.2.14 Xen dom0 and SCST trunk

2012-08-03 Thread Bart Van Assche
On 08/02/12 20:12, David Dillow wrote: On Thu, 2012-08-02 at 11:04 +0000, Bart Van Assche wrote: On 07/24/12 15:43, Joseph Glanville wrote: [35404.804723] BUG: unable to handle kernel NULL pointer dereference at (null) I've been able to reproduce this ib_srp crash. Apparently if an SRP

Re: [PATCH] IB/mlx4: fix possible deadlock with sm_lock spinlock

2012-08-07 Thread Bart Van Assche
...@mellanox.com Signed-off-by: Jack Morgenstein ja...@dev.mellanox.co.il Tested-by: Bart Van Assche bvanass...@acm.org -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo

[PATCH 00/20, v4] Make ib_srp better suited for H.A. purposes

2012-08-09 Thread Bart Van Assche
This patch series makes the ib_srp driver better suited for use in a H.A. setup because: - multipathd is notified faster about transport layer failures. - Transport layer failures reliably result in a reconnect. - Switchover can be triggered explicitly by deleting an initiator device. -

[PATCH 03/20] ib_srp: Move QP state check into srp_send_tsk_mgmt()

2012-08-09 Thread Bart Van Assche
Test the QP state inside srp_send_tsk_mgmt() instead of letting each caller perform that test. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c |8 +++- 1 files changed, 3

[PATCH 04/20] ib_srp: Stop queueing if QP in error

2012-08-09 Thread Bart Van Assche
srp_handle_qp_err(), change the type of qp_in_error from int into bool and move the initialization of that variable from srp_reconnect_target() to srp_connect_target(). Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers

[PATCH 05/20] ib_srp: Eliminate state SRP_TARGET_CONNECTING

2012-08-09 Thread Bart Van Assche
Block the SCSI host while reconnecting instead of representing the reconnection activity as a distinct SRP target state. This makes it possible to eliminate the target state SRP_TARGET_CONNECTING. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol

[PATCH 06/20] ib_srp: Suppress superfluous error messages

2012-08-09 Thread Bart Van Assche
Keep track of the connection state. Only report QP errors while connected. Only invoke ib_send_cm_dreq() when connected such that invoking srp_disconnect_target() after having received a DREQ does not cause an error message to be printed. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc
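The idea of tracking the connection state so that a disconnect request is only sent while connected can be sketched in plain userspace C. This is an assumption-laden illustration, not the ib_srp implementation: `struct conn`, `send_dreq()`, `on_dreq_received()` and `disconnect()` are hypothetical names, `send_dreq()` merely stands in for a call such as ib_send_cm_dreq(), and the locking a real driver would need is left out.

```c
#include <stdbool.h>

/* Hypothetical connection object; not the driver's data structure. */
struct conn {
	bool connected;	/* true between connection setup and teardown */
	int dreq_sent;	/* counts disconnect requests, for illustration */
};

/* Stand-in for sending a disconnect request to the peer. */
static void send_dreq(struct conn *c)
{
	c->dreq_sent++;
}

/* The peer closed the connection first: just record the new state. */
static void on_dreq_received(struct conn *c)
{
	c->connected = false;
}

/*
 * Disconnect only if still connected. Returns true if a disconnect
 * request was sent, false if there was nothing left to tear down.
 */
static bool disconnect(struct conn *c)
{
	if (!c->connected)
		return false;	/* already disconnected: no request, no error */
	c->connected = false;
	send_dreq(c);
	return true;
}
```

With this shape, calling disconnect() after on_dreq_received() is a silent no-op, which is the behaviour the patch description asks for.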

[PATCH 07/20] ib_srp: Avoid that SCSI error handling triggers a crash

2012-08-09 Thread Bart Van Assche
Null scmnd for RSP ... followed by a kernel oops. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers

[PATCH 08/20] ib_srp: Introduce the helper function, srp_remove_target()

2012-08-09 Thread Bart Van Assche
Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 19 --- 1 files changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b

[PATCH 11/20] ib_srp: Make srp_disconnect_target() wait for IB completions

2012-08-09 Thread Bart Van Assche
Modify srp_disconnect_target() such that it waits until it is sure that no new IB completions will be received anymore. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 104

[PATCH 14/20] srp_transport: Simplify attribute initialization code

2012-08-09 Thread Bart Van Assche
adding new attributes. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp Cc: Robert Jennings r...@linux.vnet.ibm.com Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/scsi/scsi_transport_srp.c | 26

[PATCH 17/20] ib_srp: Introduce a temporary variable in srp_remove_target()

2012-08-09 Thread Bart Van Assche
Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers

[PATCH 19/20] srp_transport: Add transport layer error handling

2012-08-09 Thread Bart Van Assche
. - Support for implementing dev_loss_tmo, the time that should elapse after having detected a transport layer problem and before removing a remote port. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp Cc: Robert Jennings r...@linux.vnet.ibm.com Cc

[PATCH 20/20] ib_srp: Add dev_loss_tmo support

2012-08-09 Thread Bart Van Assche
Remove an SRP host if either dev_loss_tmo expired or the target closed the IB connection. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/srp/ib_srp.c | 53

Re: [PATCH 00/20, v4] Make ib_srp better suited for H.A. purposes

2012-08-09 Thread Bart Van Assche
On 08/09/12 15:41, Bart Van Assche wrote: [ ... ] The patch series is also available on top of 3.6-rc1 here: http://github.com/bvanassche/linux/tree/srp-ha Bart.

[PATCH 01/20 v4b] ib_srp: Fix a race condition

2012-08-14 Thread Bart Van Assche
-rdmam=134314367801595 Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com Cc: sta...@vger.kernel.org --- drivers/infiniband/ulp/srp/ib_srp.c | 87 +-- 1 files changed, 63 insertions(+), 24
