Re: [PATCH 11/15] IB/srpt: Fix how aborted commands are processed

2016-01-06 Thread Bart Van Assche

On 01/06/2016 06:13 AM, Christoph Hellwig wrote:

pr_debug("Aborting cmd with state %d and tag %lld\n", state,
 ioctx->cmd.tag);

@@ -1299,14 +1291,16 @@ static int srpt_abort_cmd(struct srpt_send_ioctx *ioctx)
case SRPT_STATE_NEW:
case SRPT_STATE_DATA_IN:
case SRPT_STATE_MGMT:
+   case SRPT_STATE_DONE:
/*
 * Do nothing - defer abort processing until
 * srpt_queue_response() is invoked.
 */
-   WARN_ON(!(ioctx->cmd.transport_state & CMD_T_ABORTED));


Seems like this depends on your target core changes?  Maybe it would be better
to respin the series to got just on top of Doug's RDMA tree, as I think
we're more likely to get this series merged for 4.5 than the target core
changes..


This series indeed depends on my recently posted target core patch 
series. I will rebase, retest and repost this patch series.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] IB/srpt: Detect session shutdown reliably

2016-01-06 Thread Bart Van Assche

On 01/06/2016 06:21 AM, Christoph Hellwig wrote:

On Tue, Jan 05, 2016 at 03:26:49PM +0100, Bart Van Assche wrote:

The Last WQE Reached event is only generated after one or more work
requests have been queued on the QP associated with a session. Since
session shutdown can start before any work requests have been queued,
use a zero-length RDMA write to wait until a QP has been drained.


We actually ran into the same issue with a SRPT-derived work in progress
driver recently..


@@ -2314,14 +2346,13 @@ static void srpt_cm_timewait_exit(struct srpt_rdma_ch 
*ch)
  {
pr_info("Received CM TimeWait exit for ch %s-%d.\n", ch->sess_name,
ch->qp->qp_num);
+   srpt_close_ch(ch);
  }

  static void srpt_cm_rep_error(struct srpt_rdma_ch *ch)
  {
pr_info("Received CM REP error for ch %s-%d.\n", ch->sess_name,
ch->qp->qp_num);
  }

  /**
@@ -2329,33 +2360,7 @@ static void srpt_cm_rep_error(struct srpt_rdma_ch *ch)
   */
  static void srpt_cm_dreq_recv(struct srpt_rdma_ch *ch)
  {
+   srpt_disconnect_ch(ch);
  }

  /**
@@ -2364,7 +2369,7 @@ static void srpt_cm_dreq_recv(struct srpt_rdma_ch *ch)
  static void srpt_cm_drep_recv(struct srpt_rdma_ch *ch)
  {
pr_info("Received InfiniBand DREP message for cm_id %p.\n", ch->cm_id);
+   srpt_close_ch(ch);
  }



Is there any good reason to keep these one-liner helpers around?


Hello Christoph,

Not really. I will inline these functions.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/15] IB/srpt: Detect session shutdown reliably

2016-01-06 Thread Bart Van Assche

On 01/06/2016 03:39 PM, Sagi Grimberg wrote:

On 05/01/2016 16:26, Bart Van Assche wrote:

The Last WQE Reached event is only generated after one or more work
requests have been queued on the QP associated with a session. Since
session shutdown can start before any work requests have been queued,
use a zero-length RDMA write to wait until a QP has been drained.


Is it just me or is there (a lot) more going on here other than
this patch description?


Hello Sagi,

I will make the patch description more detailed. Sorry if some of this 
code is hard to follow but that's because of the high level of 
concurrency in the SRP target driver. Some time ago I documented how 
session management in the SCST ib_srpt driver works. This driver follows 
the same model. These notes can be found here: 
http://sourceforge.net/p/scst/svn/HEAD/tree/trunk/srpt/session-management.txt.


Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/15] IB/srpt: Fix srpt_handle_cmd() error paths

2016-01-06 Thread Bart Van Assche

On 01/06/2016 03:31 PM, Sagi Grimberg wrote:

On 05/01/2016 16:25, Bart Van Assche wrote:

@@ -1518,8 +1517,7 @@ static int srpt_handle_cmd(struct srpt_rdma_ch *ch,
  if (srpt_get_desc_tbl(send_ioctx, srp_cmd, , _len)) {
  pr_err("0x%llx: parsing SRP descriptor table failed.\n",
 srp_cmd->tag);
-ret = TCM_INVALID_CDB_FIELD;
-goto send_sense;
+goto put;


Do you still need to call target_put_sess_cmd() in this failure?


Good catch. I will update this patch before I repost this patch series.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/15] IB/srpt: Simplify srpt_handle_tsk_mgmt()

2016-01-05 Thread Bart Van Assche
Let the target core check task existence instead of the SRP target
driver.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 54 ++-
 1 file changed, 2 insertions(+), 52 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index fc19203..9cb1a14 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1554,47 +1554,6 @@ send_sense:
return -1;
 }
 
-/**
- * srpt_rx_mgmt_fn_tag() - Process a task management function by tag.
- * @ch: RDMA channel of the task management request.
- * @fn: Task management function to perform.
- * @req_tag: Tag of the SRP task management request.
- * @mgmt_ioctx: I/O context of the task management request.
- *
- * Returns zero if the target core will process the task management
- * request asynchronously.
- *
- * Note: It is assumed that the initiator serializes tag-based task management
- * requests.
- */
-static int srpt_rx_mgmt_fn_tag(struct srpt_send_ioctx *ioctx, u64 tag)
-{
-   struct srpt_device *sdev;
-   struct srpt_rdma_ch *ch;
-   struct srpt_send_ioctx *target;
-   int ret, i;
-
-   ret = -EINVAL;
-   ch = ioctx->ch;
-   BUG_ON(!ch);
-   BUG_ON(!ch->sport);
-   sdev = ch->sport->sdev;
-   BUG_ON(!sdev);
-   spin_lock_irq(>spinlock);
-   for (i = 0; i < ch->rq_size; ++i) {
-   target = ch->ioctx_ring[i];
-   if (target->cmd.se_lun == ioctx->cmd.se_lun &&
-   target->cmd.tag == tag &&
-   srpt_get_cmd_state(target) != SRPT_STATE_DONE) {
-   ret = 0;
-   /* now let the target core abort >cmd; */
-   break;
-   }
-   }
-   spin_unlock_irq(>spinlock);
-   return ret;
-}
-
 static int srp_tmr_to_tcm(int fn)
 {
switch (fn) {
@@ -1628,7 +1587,6 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
struct srp_tsk_mgmt *srp_tsk;
struct se_cmd *cmd;
struct se_session *sess = ch->sess;
-   uint32_t tag = 0;
int tcm_tmr;
int rc;
 
@@ -1649,18 +1607,10 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch 
*ch,
TMR_TASK_MGMT_FUNCTION_NOT_SUPPORTED;
goto fail;
}
-   if (srp_tsk->tsk_mgmt_func == SRP_TSK_ABORT_TASK) {
-   rc = srpt_rx_mgmt_fn_tag(send_ioctx, srp_tsk->task_tag);
-   if (rc < 0) {
-   send_ioctx->cmd.se_tmr_req->response =
-   TMR_TASK_DOES_NOT_EXIST;
-   goto fail;
-   }
-   tag = srp_tsk->task_tag;
-   }
rc = target_submit_tmr(_ioctx->cmd, sess, NULL,
   scsilun_to_int(_tsk->lun), srp_tsk, tcm_tmr,
-  GFP_KERNEL, tag, TARGET_SCF_ACK_KREF);
+  GFP_KERNEL, srp_tsk->task_tag,
+  TARGET_SCF_ACK_KREF);
if (rc != 0) {
send_ioctx->cmd.se_tmr_req->response = TMR_FUNCTION_REJECTED;
goto fail;
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/15] IB/srpt: Simplify channel state management

2016-01-05 Thread Bart Van Assche
The only allowed channel state changes are those that change
the channel state into a state with a higher numerical value.
This allows to merge the functions srpt_set_ch_state() and
srpt_test_and_set_ch_state() into a single function.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 42 +--
 1 file changed, 10 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 9cb1a14..a27414d 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -96,37 +96,21 @@ static int srpt_queue_status(struct se_cmd *cmd);
 static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc);
 static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc);
 
-static enum rdma_ch_state
-srpt_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state new_state)
+static bool srpt_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state new)
 {
unsigned long flags;
enum rdma_ch_state prev;
+   bool changed = false;
 
spin_lock_irqsave(>spinlock, flags);
prev = ch->state;
-   ch->state = new_state;
-   spin_unlock_irqrestore(>spinlock, flags);
-   return prev;
-}
-
-/**
- * srpt_test_and_set_ch_state() - Test and set the channel state.
- *
- * Returns true if and only if the channel state has been set to the new state.
- */
-static bool
-srpt_test_and_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state old,
-  enum rdma_ch_state new)
-{
-   unsigned long flags;
-   enum rdma_ch_state prev;
-
-   spin_lock_irqsave(>spinlock, flags);
-   prev = ch->state;
-   if (prev == old)
+   if (new > prev) {
ch->state = new;
+   changed = true;
+   }
spin_unlock_irqrestore(>spinlock, flags);
-   return prev == old;
+
+   return changed;
 }
 
 /**
@@ -199,8 +183,7 @@ static void srpt_qp_event(struct ib_event *event, struct 
srpt_rdma_ch *ch)
ib_cm_notify(ch->cm_id, event->event);
break;
case IB_EVENT_QP_LAST_WQE_REACHED:
-   if (srpt_test_and_set_ch_state(ch, CH_DRAINING,
-  CH_RELEASING))
+   if (srpt_set_ch_state(ch, CH_RELEASING))
srpt_release_channel(ch);
else
pr_debug("%s: state %d - ignored LAST_WQE.\n",
@@ -1946,12 +1929,7 @@ static void srpt_drain_channel(struct ib_cm_id *cm_id)
spin_lock_irq(>spinlock);
list_for_each_entry(ch, >rch_list, list) {
if (ch->cm_id == cm_id) {
-   do_reset = srpt_test_and_set_ch_state(ch,
-   CH_CONNECTING, CH_DRAINING) ||
-  srpt_test_and_set_ch_state(ch,
-   CH_LIVE, CH_DRAINING) ||
-  srpt_test_and_set_ch_state(ch,
-   CH_DISCONNECTING, CH_DRAINING);
+   do_reset = srpt_set_ch_state(ch, CH_DRAINING);
break;
}
}
@@ -2368,7 +2346,7 @@ static void srpt_cm_rtu_recv(struct ib_cm_id *cm_id)
ch = srpt_find_channel(cm_id->context, cm_id);
BUG_ON(!ch);
 
-   if (srpt_test_and_set_ch_state(ch, CH_CONNECTING, CH_LIVE)) {
+   if (srpt_set_ch_state(ch, CH_LIVE)) {
struct srpt_recv_ioctx *ioctx, *ioctx_tmp;
 
ret = srpt_ch_qp_rts(ch, ch->qp);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/15] IB/srpt: Simplify srpt_shutdown_session()

2016-01-05 Thread Bart Van Assche
The target core guarantees that shutdown_session() is only invoked
once per session. This means that the ib_srpt target driver doesn't
have to track whether or not shutdown_session() has been called.
Additionally, ensure that target_sess_cmd_list_set_waiting() is
called before target_wait_for_sess_cmds() by moving it into
srpt_release_channel_work().

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 14 +-
 drivers/infiniband/ulp/srpt/ib_srpt.h |  1 -
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index a27414d..0ff4ed6 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1887,19 +1887,6 @@ static void srpt_close_ch(struct srpt_rdma_ch *ch)
  */
 static int srpt_shutdown_session(struct se_session *se_sess)
 {
-   struct srpt_rdma_ch *ch = se_sess->fabric_sess_ptr;
-   unsigned long flags;
-
-   spin_lock_irqsave(>spinlock, flags);
-   if (ch->in_shutdown) {
-   spin_unlock_irqrestore(>spinlock, flags);
-   return true;
-   }
-
-   ch->in_shutdown = true;
-   target_sess_cmd_list_set_waiting(se_sess);
-   spin_unlock_irqrestore(>spinlock, flags);
-
return true;
 }
 
@@ -2003,6 +1990,7 @@ static void srpt_release_channel_work(struct work_struct 
*w)
se_sess = ch->sess;
BUG_ON(!se_sess);
 
+   target_sess_cmd_list_set_waiting(se_sess);
target_wait_for_sess_cmds(se_sess);
 
transport_deregister_session_configfs(se_sess);
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h 
b/drivers/infiniband/ulp/srpt/ib_srpt.h
index a98b86b..d0de468 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
@@ -288,7 +288,6 @@ struct srpt_rdma_ch {
u8  sess_name[36];
struct work_struct  release_work;
struct completion   *release_done;
-   boolin_shutdown;
 };
 
 /**
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/15] IB/srpt: Add parentheses around sizeof argument

2016-01-05 Thread Bart Van Assche
Although sizeof is an operator and hence in many cases parentheses can
be left out, the recommended kernel coding style is to surround the
sizeof argument with parentheses. This patch does not change any
functionality. This patch has been generated by running the following
shell command:

sed -i 's/sizeof \([^ );,]*\)/sizeof(\1)/g' drivers/infiniband/ulp/srpt/*.[ch]

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 42 +--
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 0c3ed7c..fe9474f 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -281,7 +281,7 @@ static void srpt_get_class_port_info(struct ib_dm_mad *mad)
struct ib_class_port_info *cif;
 
cif = (struct ib_class_port_info *)mad->data;
-   memset(cif, 0, sizeof *cif);
+   memset(cif, 0, sizeof(*cif));
cif->base_version = 1;
cif->class_version = 1;
cif->resp_time_value = 20;
@@ -340,7 +340,7 @@ static void srpt_get_ioc(struct srpt_port *sport, u32 slot,
return;
}
 
-   memset(iocp, 0, sizeof *iocp);
+   memset(iocp, 0, sizeof(*iocp));
strcpy(iocp->id_string, SRPT_ID_STRING);
iocp->guid = cpu_to_be64(srpt_service_guid);
iocp->vendor_id = cpu_to_be32(sdev->dev_attr.vendor_id);
@@ -390,7 +390,7 @@ static void srpt_get_svc_entries(u64 ioc_guid,
}
 
svc_entries = (struct ib_dm_svc_entries *)mad->data;
-   memset(svc_entries, 0, sizeof *svc_entries);
+   memset(svc_entries, 0, sizeof(*svc_entries));
svc_entries->service_entries[0].id = cpu_to_be64(ioc_guid);
snprintf(svc_entries->service_entries[0].name,
 sizeof(svc_entries->service_entries[0].name),
@@ -483,7 +483,7 @@ static void srpt_mad_recv_handler(struct ib_mad_agent 
*mad_agent,
rsp->ah = ah;
 
dm_mad = rsp->mad;
-   memcpy(dm_mad, mad_wc->recv_buf.mad, sizeof *dm_mad);
+   memcpy(dm_mad, mad_wc->recv_buf.mad, sizeof(*dm_mad));
dm_mad->mad_hdr.method = IB_MGMT_METHOD_GET_RESP;
dm_mad->mad_hdr.status = 0;
 
@@ -531,7 +531,7 @@ static int srpt_refresh_port(struct srpt_port *sport)
struct ib_port_attr port_attr;
int ret;
 
-   memset(_modify, 0, sizeof port_modify);
+   memset(_modify, 0, sizeof(port_modify));
port_modify.set_port_cap_mask = IB_PORT_DEVICE_MGMT_SUP;
port_modify.clr_port_cap_mask = 0;
 
@@ -552,7 +552,7 @@ static int srpt_refresh_port(struct srpt_port *sport)
goto err_query_port;
 
if (!sport->mad_agent) {
-   memset(_req, 0, sizeof reg_req);
+   memset(_req, 0, sizeof(reg_req));
reg_req.mgmt_class = IB_MGMT_CLASS_DEVICE_MGMT;
reg_req.mgmt_class_version = IB_MGMT_BASE_VERSION;
set_bit(IB_MGMT_METHOD_GET, reg_req.method_mask);
@@ -902,14 +902,14 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx 
*ioctx,
 
db = (struct srp_direct_buf *)(srp_cmd->add_data
   + add_cdb_offset);
-   memcpy(ioctx->rbufs, db, sizeof *db);
+   memcpy(ioctx->rbufs, db, sizeof(*db));
*data_len = be32_to_cpu(db->len);
} else if (((srp_cmd->buf_fmt & 0xf) == SRP_DATA_DESC_INDIRECT) ||
   ((srp_cmd->buf_fmt >> 4) == SRP_DATA_DESC_INDIRECT)) {
idb = (struct srp_indirect_buf *)(srp_cmd->add_data
  + add_cdb_offset);
 
-   ioctx->n_rbuf = be32_to_cpu(idb->table_desc.len) / sizeof *db;
+   ioctx->n_rbuf = be32_to_cpu(idb->table_desc.len) / sizeof(*db);
 
if (ioctx->n_rbuf >
(srp_cmd->data_out_desc_cnt + srp_cmd->data_in_desc_cnt)) {
@@ -928,7 +928,7 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx,
ioctx->rbufs = >single_rbuf;
else {
ioctx->rbufs =
-   kmalloc(ioctx->n_rbuf * sizeof *db, GFP_ATOMIC);
+   kmalloc(ioctx->n_rbuf * sizeof(*db), 
GFP_ATOMIC);
if (!ioctx->rbufs) {
ioctx->n_rbuf = 0;
ret = -ENOMEM;
@@ -937,7 +937,7 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx,
}
 
db = idb->desc_list;
-   memcpy(ioctx->rbufs, db, ioctx->n_rbuf * sizeof *db);
+   memcpy(ioctx->rbufs, db, ioctx->n_rbuf * sizeof(*db));

[PATCH 02/15] IB/srpt: Inline srpt_sdev_name()

2016-01-05 Thread Bart Van Assche
srpt_sdev_name() is too trivial to keep it as a separate function.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index fe9474f..7abad26 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -109,16 +109,6 @@ enum dma_data_direction opposite_dma_dir(enum 
dma_data_direction dir)
}
 }
 
-/**
- * srpt_sdev_name() - Return the name associated with the HCA.
- *
- * Examples are ib0, ib1, ...
- */
-static inline const char *srpt_sdev_name(struct srpt_device *sdev)
-{
-   return sdev->device->name;
-}
-
 static enum rdma_ch_state srpt_get_ch_state(struct srpt_rdma_ch *ch)
 {
unsigned long flags;
@@ -182,7 +172,7 @@ static void srpt_event_handler(struct ib_event_handler 
*handler,
return;
 
pr_debug("ASYNC event= %d on device= %s\n", event->event,
-srpt_sdev_name(sdev));
+sdev->device->name);
 
switch (event->event) {
case IB_EVENT_PORT_ERR:
@@ -3089,7 +3079,7 @@ static void srpt_add_one(struct ib_device *device)
 
if (srpt_refresh_port(sport)) {
pr_err("MAD registration failed for %s-%d.\n",
-  srpt_sdev_name(sdev), i);
+  sdev->device->name, i);
goto err_ring;
}
snprintf(sport->port_guid, sizeof(sport->port_guid),
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 15/15] IB/srpt: Fix a rare crash in srpt_close_session()

2016-01-05 Thread Bart Van Assche
Keep the ib_srpt session as long as srpt_close_session() may
access it.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 669ae5c..7548bd5 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2228,6 +2228,8 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
pr_debug("Establish connection sess=%p name=%s cm_id=%p\n", ch->sess,
 ch->sess_name, ch->cm_id);
 
+   kref_get(>kref);
+
/* create srp_login_response */
rsp->opcode = SRP_LOGIN_RSP;
rsp->tag = req->tag;
@@ -2991,6 +2993,8 @@ static void srpt_close_session(struct se_session *se_sess)
 
srpt_disconnect_ch(ch);
 
+   kref_put(>kref, srpt_free_ch);
+
if (!wait)
return;
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/15] IB/srpt: Fix how aborted commands are processed

2016-01-05 Thread Bart Van Assche
srpt_abort_cmd() must not be called in state SRPT_STATE_DATA_IN. Issue
a warning if this occurs.

srpt_abort_cmd() must not invoke target_put_sess_cmd() for commands
in state SRPT_STATE_DONE because the srpt_abort_cmd() callers already
do this when necessary. Hence remove this call.

If an RDMA read fails the corresponding SCSI command must fail. Hence
add a transport_generic_request_failure() call.

Remove an incorrect srpt_abort_cmd() call from srpt_rdma_write_done().

Avoid that srpt_send_done() calls srpt_abort_cmd() for finished SCSI
commands.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 39 ++-
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 395bc1b..2ae3c1b 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1273,25 +1273,17 @@ static int srpt_abort_cmd(struct srpt_send_ioctx *ioctx)
case SRPT_STATE_NEED_DATA:
ioctx->state = SRPT_STATE_DATA_IN;
break;
-   case SRPT_STATE_DATA_IN:
case SRPT_STATE_CMD_RSP_SENT:
case SRPT_STATE_MGMT_RSP_SENT:
ioctx->state = SRPT_STATE_DONE;
break;
default:
+   WARN_ONCE(true, "%s: unexpected I/O context state %d\n",
+ __func__, state);
break;
}
spin_unlock_irqrestore(>spinlock, flags);
 
-   if (state == SRPT_STATE_DONE) {
-   struct srpt_rdma_ch *ch = ioctx->ch;
-
-   BUG_ON(ch->sess == NULL);
-
-   target_put_sess_cmd(>cmd);
-   goto out;
-   }
-
pr_debug("Aborting cmd with state %d and tag %lld\n", state,
 ioctx->cmd.tag);
 
@@ -1299,14 +1291,16 @@ static int srpt_abort_cmd(struct srpt_send_ioctx *ioctx)
case SRPT_STATE_NEW:
case SRPT_STATE_DATA_IN:
case SRPT_STATE_MGMT:
+   case SRPT_STATE_DONE:
/*
 * Do nothing - defer abort processing until
 * srpt_queue_response() is invoked.
 */
-   WARN_ON(!(ioctx->cmd.transport_state & CMD_T_ABORTED));
break;
case SRPT_STATE_NEED_DATA:
-   /* DMA_TO_DEVICE (write) - RDMA read error. */
+   pr_debug("tag %#llx: RDMA read error\n", ioctx->cmd.tag);
+   transport_generic_request_failure(>cmd,
+   TCM_CHECK_CONDITION_ABORT_CMD);
break;
case SRPT_STATE_CMD_RSP_SENT:
/*
@@ -1314,18 +1308,16 @@ static int srpt_abort_cmd(struct srpt_send_ioctx *ioctx)
 * not been received in time.
 */
srpt_unmap_sg_to_ib_sge(ioctx->ch, ioctx);
-   target_put_sess_cmd(>cmd);
+   transport_generic_free_cmd(>cmd, 0);
break;
case SRPT_STATE_MGMT_RSP_SENT:
-   srpt_set_cmd_state(ioctx, SRPT_STATE_DONE);
-   target_put_sess_cmd(>cmd);
+   transport_generic_free_cmd(>cmd, 0);
break;
default:
WARN(1, "Unexpected command state (%d)", state);
break;
}
 
-out:
return state;
 }
 
@@ -1365,9 +1357,14 @@ static void srpt_rdma_write_done(struct ib_cq *cq, 
struct ib_wc *wc)
container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
if (unlikely(wc->status != IB_WC_SUCCESS)) {
+   /*
+* Note: if an RDMA write error completion is received that
+* means that a SEND also has been posted. Defer further
+* processing of the associated command until the send error
+* completion has been received.
+*/
pr_info("RDMA_WRITE for ioctx 0x%p failed with status %d\n",
ioctx, wc->status);
-   srpt_abort_cmd(ioctx);
}
 }
 
@@ -1716,15 +1713,10 @@ static void srpt_send_done(struct ib_cq *cq, struct 
ib_wc *wc)
 
atomic_inc(>sq_wr_avail);
 
-   if (wc->status != IB_WC_SUCCESS) {
+   if (wc->status != IB_WC_SUCCESS)
pr_info("sending response for ioctx 0x%p failed"
" with status %d\n", ioctx, wc->status);
 
-   atomic_dec(>req_lim);
-   srpt_abort_cmd(ioctx);
-   goto out;
-   }
-
if (state != SRPT_STATE_DONE) {
srpt_unmap_sg_to_ib_sge(ch, ioctx);
transport_generic_free_cmd(>cmd, 0);
@@ -1733,7 +1725,6 @@ static void srpt_send_done(struct ib_cq *cq, struct ib_w

[PATCH 12/15] IB/srpt: Eliminate srpt_find_channel()

2016-01-05 Thread Bart Van Assche
In the CM REQ message handler, store the channel pointer in
cm_id->context such that the function srpt_find_channel() is no
longer needed. Additionally, make the CM event messages more
informative.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 113 +-
 1 file changed, 43 insertions(+), 70 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 2ae3c1b..6ec130d 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1893,25 +1893,14 @@ static int srpt_shutdown_session(struct se_session 
*se_sess)
  * ib_destroy_cm_id(), which locks the cm_id spinlock and hence waits until
  * this function has finished).
  */
-static void srpt_drain_channel(struct ib_cm_id *cm_id)
+static void srpt_drain_channel(struct srpt_rdma_ch *ch)
 {
-   struct srpt_device *sdev;
-   struct srpt_rdma_ch *ch;
int ret;
bool do_reset = false;
 
WARN_ON_ONCE(irqs_disabled());
 
-   sdev = cm_id->context;
-   BUG_ON(!sdev);
-   spin_lock_irq(>spinlock);
-   list_for_each_entry(ch, >rch_list, list) {
-   if (ch->cm_id == cm_id) {
-   do_reset = srpt_set_ch_state(ch, CH_DRAINING);
-   break;
-   }
-   }
-   spin_unlock_irq(>spinlock);
+   do_reset = srpt_set_ch_state(ch, CH_DRAINING);
 
if (do_reset) {
if (ch->sess)
@@ -1925,34 +1914,6 @@ static void srpt_drain_channel(struct ib_cm_id *cm_id)
 }
 
 /**
- * srpt_find_channel() - Look up an RDMA channel.
- * @cm_id: Pointer to the CM ID of the channel to be looked up.
- *
- * Return NULL if no matching RDMA channel has been found.
- */
-static struct srpt_rdma_ch *srpt_find_channel(struct srpt_device *sdev,
- struct ib_cm_id *cm_id)
-{
-   struct srpt_rdma_ch *ch;
-   bool found;
-
-   WARN_ON_ONCE(irqs_disabled());
-   BUG_ON(!sdev);
-
-   found = false;
-   spin_lock_irq(>spinlock);
-   list_for_each_entry(ch, >rch_list, list) {
-   if (ch->cm_id == cm_id) {
-   found = true;
-   break;
-   }
-   }
-   spin_unlock_irq(>spinlock);
-
-   return found ? ch : NULL;
-}
-
-/**
  * srpt_release_channel() - Release channel resources.
  *
  * Schedules the actual release because:
@@ -2160,6 +2121,7 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
memcpy(ch->t_port_id, req->target_port_id, 16);
ch->sport = >port[param->port - 1];
ch->cm_id = cm_id;
+   cm_id->context = ch;
/*
 * Avoid QUEUE_FULL conditions by limiting the number of buffers used
 * for the SRP protocol to the command queue size.
@@ -2304,10 +2266,23 @@ out:
return ret;
 }
 
-static void srpt_cm_rej_recv(struct ib_cm_id *cm_id)
+static void srpt_cm_rej_recv(struct srpt_rdma_ch *ch,
+enum ib_cm_rej_reason reason,
+const u8 *private_data,
+u8 private_data_len)
 {
-   pr_info("Received IB REJ for cm_id %p.\n", cm_id);
-   srpt_drain_channel(cm_id);
+   char *priv = kmalloc(private_data_len * 3 + 1, GFP_KERNEL);
+   int i;
+
+   if (priv) {
+   priv[0] = '\0';
+   for (i = 0; i < private_data_len; i++)
+   sprintf(priv + 3 * i, "%02x ", private_data[i]);
+   }
+   pr_info("Received CM REJ for ch %s-%d; reason %d; private data %s.\n",
+   ch->sess_name, ch->qp->qp_num, reason, priv ? : "(?)");
+   kfree(priv);
+   srpt_drain_channel(ch);
 }
 
 /**
@@ -2316,14 +2291,10 @@ static void srpt_cm_rej_recv(struct ib_cm_id *cm_id)
  * An IB_CM_RTU_RECEIVED message indicates that the connection is established
  * and that the recipient may begin transmitting (RTU = ready to use).
  */
-static void srpt_cm_rtu_recv(struct ib_cm_id *cm_id)
+static void srpt_cm_rtu_recv(struct srpt_rdma_ch *ch)
 {
-   struct srpt_rdma_ch *ch;
int ret;
 
-   ch = srpt_find_channel(cm_id->context, cm_id);
-   BUG_ON(!ch);
-
if (srpt_set_ch_state(ch, CH_LIVE)) {
struct srpt_recv_ioctx *ioctx, *ioctx_tmp;
 
@@ -2339,31 +2310,30 @@ static void srpt_cm_rtu_recv(struct ib_cm_id *cm_id)
}
 }
 
-static void srpt_cm_timewait_exit(struct ib_cm_id *cm_id)
+static void srpt_cm_timewait_exit(struct srpt_rdma_ch *ch)
 {
-   pr_info("Received IB TimeWait exit for cm_id %p.\n", cm_id);
-   srpt_drain_channel(cm_id);
+   pr_info("Received CM TimeWait exit for ch %s-%d.\n", ch->sess_name,
+   ch->qp->qp_nu

[PATCH 10/15] IB/srpt: Fix srpt_handle_cmd() error paths

2016-01-05 Thread Bart Van Assche
The target core function that should be called if target_submit_cmd()
fails is target_put_sess_cmd(). Additionally, change the return type
of srpt_handle_cmd() from int into void.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 1857d17..395bc1b 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1482,15 +1482,14 @@ static int srpt_check_stop_free(struct se_cmd *cmd)
 /**
  * srpt_handle_cmd() - Process SRP_CMD.
  */
-static int srpt_handle_cmd(struct srpt_rdma_ch *ch,
-  struct srpt_recv_ioctx *recv_ioctx,
-  struct srpt_send_ioctx *send_ioctx)
+static void srpt_handle_cmd(struct srpt_rdma_ch *ch,
+   struct srpt_recv_ioctx *recv_ioctx,
+   struct srpt_send_ioctx *send_ioctx)
 {
struct se_cmd *cmd;
struct srp_cmd *srp_cmd;
u64 data_len;
enum dma_data_direction dir;
-   sense_reason_t ret;
int rc;
 
BUG_ON(!send_ioctx);
@@ -1518,8 +1517,7 @@ static int srpt_handle_cmd(struct srpt_rdma_ch *ch,
if (srpt_get_desc_tbl(send_ioctx, srp_cmd, , _len)) {
pr_err("0x%llx: parsing SRP descriptor table failed.\n",
   srp_cmd->tag);
-   ret = TCM_INVALID_CDB_FIELD;
-   goto send_sense;
+   goto put;
}
 
rc = target_submit_cmd(cmd, ch->sess, srp_cmd->cdb,
@@ -1527,14 +1525,16 @@ static int srpt_handle_cmd(struct srpt_rdma_ch *ch,
   scsilun_to_int(_cmd->lun), data_len,
   TCM_SIMPLE_TAG, dir, TARGET_SCF_ACK_KREF);
if (rc != 0) {
-   ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
-   goto send_sense;
+   pr_debug("target_submit_cmd() returned %d for tag %#llx\n", rc,
+srp_cmd->tag);
+   goto put;
}
-   return 0;
+   return;
 
-send_sense:
-   transport_send_check_condition_and_sense(cmd, ret, 0);
-   return -1;
+put:
+   send_ioctx->state = SRPT_STATE_DONE;
+   target_put_sess_cmd(cmd);
+   return;
 }
 
 static int srp_tmr_to_tcm(int fn)
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/15] IB/srpt: Fix srpt_close_session()

2016-01-05 Thread Bart Van Assche
Avoid that srpt_close_session() waits if it doesn't have to wait.
Additionally, increase the time during which srpt_close_session()
waits until closing a session has finished. This makes it easier
to detect session shutdown bugs.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 0ff4ed6..1857d17 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1981,8 +1981,8 @@ static void srpt_release_channel_work(struct work_struct 
*w)
struct se_session *se_sess;
 
ch = container_of(w, struct srpt_rdma_ch, release_work);
-   pr_debug("ch = %p; ch->sess = %p; release_done = %p\n", ch, ch->sess,
-ch->release_done);
+   pr_debug("%s: %s-%d; release_done = %p\n", __func__, ch->sess_name,
+ch->qp->qp_num, ch->release_done);
 
sdev = ch->sport->sdev;
BUG_ON(!sdev);
@@ -2006,11 +2006,10 @@ static void srpt_release_channel_work(struct 
work_struct *w)
 ch->rsp_size, DMA_TO_DEVICE);
 
spin_lock_irq(>spinlock);
-   list_del(>list);
-   spin_unlock_irq(>spinlock);
-
+   list_del_init(>list);
if (ch->release_done)
complete(ch->release_done);
+   spin_unlock_irq(>spinlock);
 
wake_up(>ch_releaseQ);
 
@@ -3033,24 +3032,26 @@ static void srpt_release_cmd(struct se_cmd *se_cmd)
 static void srpt_close_session(struct se_session *se_sess)
 {
DECLARE_COMPLETION_ONSTACK(release_done);
-   struct srpt_rdma_ch *ch;
-   struct srpt_device *sdev;
-   unsigned long res;
-
-   ch = se_sess->fabric_sess_ptr;
-   WARN_ON(ch->sess != se_sess);
+   struct srpt_rdma_ch *ch = se_sess->fabric_sess_ptr;
+   struct srpt_device *sdev = ch->sport->sdev;
+   bool wait;
 
-   pr_debug("ch %p state %d\n", ch, ch->state);
+   pr_debug("ch %s-%d state %d\n", ch->sess_name, ch->qp->qp_num,
+ch->state);
 
-   sdev = ch->sport->sdev;
spin_lock_irq(>spinlock);
BUG_ON(ch->release_done);
ch->release_done = _done;
+   wait = !list_empty(>list);
__srpt_close_ch(ch);
spin_unlock_irq(>spinlock);
 
-   res = wait_for_completion_timeout(_done, 60 * HZ);
-   WARN_ON(res == 0);
+   if (!wait)
+   return;
+
+   while (wait_for_completion_timeout(_done, 180 * HZ) == 0)
+   pr_info("%s(%s-%d state %d): still waiting ...\n", __func__,
+   ch->sess_name, ch->qp->qp_num, ch->state);
 }
 
 /**
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/15] IB/srpt: Inline srpt_get_ch_state()

2016-01-05 Thread Bart Van Assche
The callers of srpt_get_ch_state() can access ch->state safely without
using locking. Hence inline this function.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 42 ++-
 1 file changed, 12 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 7abad26..cbcc73e 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -109,17 +109,6 @@ enum dma_data_direction opposite_dma_dir(enum 
dma_data_direction dir)
}
 }
 
-static enum rdma_ch_state srpt_get_ch_state(struct srpt_rdma_ch *ch)
-{
-   unsigned long flags;
-   enum rdma_ch_state state;
-
-   spin_lock_irqsave(>spinlock, flags);
-   state = ch->state;
-   spin_unlock_irqrestore(>spinlock, flags);
-   return state;
-}
-
 static enum rdma_ch_state
 srpt_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state new_state)
 {
@@ -216,7 +205,7 @@ static void srpt_srq_event(struct ib_event *event, void 
*ctx)
 static void srpt_qp_event(struct ib_event *event, struct srpt_rdma_ch *ch)
 {
pr_debug("QP event %d on cm_id=%p sess_name=%s state=%d\n",
-event->event, ch->cm_id, ch->sess_name, srpt_get_ch_state(ch));
+event->event, ch->cm_id, ch->sess_name, ch->state);
 
switch (event->event) {
case IB_EVENT_COMM_EST:
@@ -228,7 +217,7 @@ static void srpt_qp_event(struct ib_event *event, struct 
srpt_rdma_ch *ch)
srpt_release_channel(ch);
else
pr_debug("%s: state %d - ignored LAST_WQE.\n",
-ch->sess_name, srpt_get_ch_state(ch));
+ch->sess_name, ch->state);
break;
default:
pr_err("received unrecognized IB QP event %d\n", event->event);
@@ -1784,7 +1773,6 @@ static void srpt_handle_new_iu(struct srpt_rdma_ch *ch,
   struct srpt_send_ioctx *send_ioctx)
 {
struct srp_cmd *srp_cmd;
-   enum rdma_ch_state ch_state;
 
BUG_ON(!ch);
BUG_ON(!recv_ioctx);
@@ -1793,13 +1781,12 @@ static void srpt_handle_new_iu(struct srpt_rdma_ch *ch,
   recv_ioctx->ioctx.dma, srp_max_req_size,
   DMA_FROM_DEVICE);
 
-   ch_state = srpt_get_ch_state(ch);
-   if (unlikely(ch_state == CH_CONNECTING)) {
+   if (unlikely(ch->state == CH_CONNECTING)) {
list_add_tail(_ioctx->wait_list, >cmd_wait_list);
goto out;
}
 
-   if (unlikely(ch_state != CH_LIVE))
+   if (unlikely(ch->state != CH_LIVE))
goto out;
 
srp_cmd = recv_ioctx->ioctx.buf;
@@ -1908,7 +1895,7 @@ static void srpt_send_done(struct ib_cq *cq, struct ib_wc 
*wc)
 
 out:
while (!list_empty(>cmd_wait_list) &&
-  srpt_get_ch_state(ch) == CH_LIVE &&
+  ch->state == CH_LIVE &&
   (ioctx = srpt_get_send_ioctx(ch)) != NULL) {
struct srpt_recv_ioctx *recv_ioctx;
 
@@ -2314,17 +2301,14 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
&& param->port == ch->sport->port
&& param->listen_id == ch->sport->sdev->cm_id
&& ch->cm_id) {
-   enum rdma_ch_state ch_state;
-
-   ch_state = srpt_get_ch_state(ch);
-   if (ch_state != CH_CONNECTING
-   && ch_state != CH_LIVE)
+   if (ch->state != CH_CONNECTING
+   && ch->state != CH_LIVE)
continue;
 
/* found an existing channel */
pr_debug("Found existing channel %s"
 " cm_id= %p state= %d\n",
-ch->sess_name, ch->cm_id, ch_state);
+ch->sess_name, ch->cm_id, ch->state);
 
__srpt_close_ch(ch);
 
@@ -2566,7 +2550,7 @@ static void srpt_cm_dreq_recv(struct ib_cm_id *cm_id)
ch = srpt_find_channel(cm_id->context, cm_id);
BUG_ON(!ch);
 
-   pr_debug("cm_id= %p ch->state= %d\n", cm_id, srpt_get_ch_state(ch));
+   pr_debug("cm_id= %p ch->state= %d\n", cm_id, ch->state);
 
spin_lock_irqsave(>spinlock, flags);
switch (ch->state) {
@@ -2750,7 +2734,6 @@ st

[PATCH 13/15] IB/srpt: Detect session shutdown reliably

2016-01-05 Thread Bart Van Assche
The Last WQE Reached event is only generated after one or more work
requests have been queued on the QP associated with a session. Since
session shutdown can start before any work requests have been queued,
use a zero-length RDMA write to wait until a QP has been drained.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 274 +-
 drivers/infiniband/ulp/srpt/ib_srpt.h |  18 ++-
 2 files changed, 150 insertions(+), 142 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 6ec130d..cacb697 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -91,10 +91,11 @@ MODULE_PARM_DESC(srpt_service_guid,
 " instead of using the node_guid of the first HCA.");
 
 static struct ib_client srpt_client;
-static void srpt_release_channel(struct srpt_rdma_ch *ch);
+static void srpt_free_ch(struct kref *kref);
 static int srpt_queue_status(struct se_cmd *cmd);
 static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc);
 static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc);
+static void srpt_zerolength_write_done(struct ib_cq *cq, struct ib_wc *wc);
 
 static bool srpt_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state new)
 {
@@ -170,6 +171,23 @@ static void srpt_srq_event(struct ib_event *event, void 
*ctx)
pr_info("SRQ event %d\n", event->event);
 }
 
+static const char *get_ch_state_name(enum rdma_ch_state s)
+{
+   switch (s) {
+   case CH_CONNECTING:
+   return "connecting";
+   case CH_LIVE:
+   return "live";
+   case CH_DISCONNECTING:
+   return "disconnecting";
+   case CH_DRAINING:
+   return "draining";
+   case CH_DISCONNECTED:
+   return "disconnected";
+   }
+   return "???";
+}
+
 /**
  * srpt_qp_event() - QP event callback function.
  */
@@ -183,11 +201,9 @@ static void srpt_qp_event(struct ib_event *event, struct 
srpt_rdma_ch *ch)
ib_cm_notify(ch->cm_id, event->event);
break;
case IB_EVENT_QP_LAST_WQE_REACHED:
-   if (srpt_set_ch_state(ch, CH_RELEASING))
-   srpt_release_channel(ch);
-   else
-   pr_debug("%s: state %d - ignored LAST_WQE.\n",
-ch->sess_name, ch->state);
+   pr_debug("%s-%d, state %s: received Last WQE event.\n",
+ch->sess_name, ch->qp->qp_num,
+get_ch_state_name(ch->state));
break;
default:
pr_err("received unrecognized IB QP event %d\n", event->event);
@@ -789,6 +805,37 @@ out:
 }
 
 /**
+ * srpt_zerolength_write() - Perform a zero-length RDMA write.
+ *
+ * A quote from the InfiniBand specification: C9-88: For an HCA responder
+ * using Reliable Connection service, for each zero-length RDMA READ or WRITE
+ * request, the R_Key shall not be validated, even if the request includes
+ * Immediate data.
+ */
+static int srpt_zerolength_write(struct srpt_rdma_ch *ch)
+{
+   struct ib_send_wr wr, *bad_wr;
+
+   memset(, 0, sizeof(wr));
+   wr.opcode = IB_WR_RDMA_WRITE;
+   wr.wr_cqe = >zw_cqe;
+   wr.send_flags = IB_SEND_SIGNALED;
+   return ib_post_send(ch->qp, , _wr);
+}
+
+static void srpt_zerolength_write_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+   struct srpt_rdma_ch *ch = cq->cq_context;
+
+   WARN(wc->status == IB_WC_SUCCESS, "%s-%d: QP not in error state\n",
+ch->sess_name, ch->qp->qp_num);
+   if (srpt_set_ch_state(ch, CH_DISCONNECTED))
+   schedule_work(>release_work);
+   else
+   WARN_ONCE("%s-%d\n", ch->sess_name, ch->qp->qp_num);
+}
+
+/**
  * srpt_get_desc_tbl() - Parse the data descriptors of an SRP_CMD request.
  * @ioctx: Pointer to the I/O context associated with the request.
  * @srp_cmd: Pointer to the SRP_CMD request data.
@@ -1819,111 +1866,102 @@ static void srpt_destroy_ch_ib(struct srpt_rdma_ch 
*ch)
 }
 
 /**
- * __srpt_close_ch() - Close an RDMA channel by setting the QP error state.
+ * srpt_close_ch() - Close an RDMA channel.
  *
- * Reset the QP and make sure all resources associated with the channel will
- * be deallocated at an appropriate time.
+ * Make sure all resources associated with the channel will be deallocated at
+ * an appropriate time.
  *
- * Note: The caller must hold ch->sport->sdev->spinlock.
+ * Returns true if and only if the channel state has been modified into
+ * CH_DRAINING.
  */
-static void __srpt_close_ch(struct srpt_rdma_ch *ch)
+static bool srpt_close_ch(stru

[PATCH 14/15] IB/srpt: Fix srpt_write_pending()

2016-01-05 Thread Bart Van Assche
The only allowed return values for the write_pending() callback
function are 0, -EAGAIN and -ENOMEM. Since attempting to perform
RDMA over a disconnecting channel will result in an IB error
completion anyway, remove the code that checks the channel state
from srpt_write_pending().

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 33 -
 1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index cacb697..669ae5c 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2522,39 +2522,14 @@ out_unmap:
  */
 static int srpt_write_pending(struct se_cmd *se_cmd)
 {
-   struct srpt_rdma_ch *ch;
-   struct srpt_send_ioctx *ioctx;
+   struct srpt_send_ioctx *ioctx =
+   container_of(se_cmd, struct srpt_send_ioctx, cmd);
+   struct srpt_rdma_ch *ch = ioctx->ch;
enum srpt_command_state new_state;
-   int ret;
-
-   ioctx = container_of(se_cmd, struct srpt_send_ioctx, cmd);
 
new_state = srpt_set_cmd_state(ioctx, SRPT_STATE_NEED_DATA);
WARN_ON(new_state == SRPT_STATE_DONE);
-
-   ch = ioctx->ch;
-   BUG_ON(!ch);
-
-   switch (ch->state) {
-   case CH_CONNECTING:
-   WARN(true, "unexpected channel state %d\n", ch->state);
-   ret = -EINVAL;
-   goto out;
-   case CH_LIVE:
-   break;
-   case CH_DISCONNECTING:
-   case CH_DRAINING:
-   case CH_DISCONNECTED:
-   pr_debug("cmd with tag %lld: channel disconnecting\n",
-ioctx->cmd.tag);
-   srpt_set_cmd_state(ioctx, SRPT_STATE_DATA_IN);
-   ret = -EINVAL;
-   goto out;
-   }
-   ret = srpt_xfer_data(ch, ioctx);
-
-out:
-   return ret;
+   return srpt_xfer_data(ch, ioctx);
 }
 
 static u8 tcm_to_srp_tsk_mgmt_status(const int tcm_mgmt_status)
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/15] IB/srpt: Introduce target_reverse_dma_direction()

2016-01-05 Thread Bart Van Assche
Use the function target_reverse_dma_direction() instead of
reimplementing it.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index cbcc73e..fd94780 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -96,19 +96,6 @@ static int srpt_queue_status(struct se_cmd *cmd);
 static void srpt_recv_done(struct ib_cq *cq, struct ib_wc *wc);
 static void srpt_send_done(struct ib_cq *cq, struct ib_wc *wc);
 
-/**
- * opposite_dma_dir() - Swap DMA_TO_DEVICE and DMA_FROM_DEVICE.
- */
-static inline
-enum dma_data_direction opposite_dma_dir(enum dma_data_direction dir)
-{
-   switch (dir) {
-   case DMA_TO_DEVICE: return DMA_FROM_DEVICE;
-   case DMA_FROM_DEVICE:   return DMA_TO_DEVICE;
-   default:return dir;
-   }
-}
-
 static enum rdma_ch_state
 srpt_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state new_state)
 {
@@ -1048,7 +1035,7 @@ static void srpt_unmap_sg_to_ib_sge(struct srpt_rdma_ch 
*ch,
dir = ioctx->cmd.data_direction;
BUG_ON(dir == DMA_NONE);
ib_dma_unmap_sg(ch->sport->sdev->device, sg, ioctx->sg_cnt,
-   opposite_dma_dir(dir));
+   target_reverse_dma_direction(>cmd));
ioctx->mapped_sg_count = 0;
}
 }
@@ -1085,7 +1072,7 @@ static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch,
ioctx->sg_cnt = sg_cnt = cmd->t_data_nents;
 
count = ib_dma_map_sg(ch->sport->sdev->device, sg, sg_cnt,
- opposite_dma_dir(dir));
+ target_reverse_dma_direction(cmd));
if (unlikely(!count))
return -EAGAIN;
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/15] IB/srpt: Use scsilun_to_int()

2016-01-05 Thread Bart Van Assche
Just like other target drivers, use scsilun_to_int() to unpack SCSI
LUN numbers. This patch only changes the behavior of ib_srpt for LUN
numbers >= 16384.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 92 +++
 1 file changed, 6 insertions(+), 86 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index fd94780..fc19203 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1488,80 +1488,6 @@ static int srpt_build_tskmgmt_rsp(struct srpt_rdma_ch 
*ch,
return resp_len;
 }
 
-#define NO_SUCH_LUN ((uint64_t)-1LL)
-
-/*
- * SCSI LUN addressing method. See also SAM-2 and the section about
- * eight byte LUNs.
- */
-enum scsi_lun_addr_method {
-   SCSI_LUN_ADDR_METHOD_PERIPHERAL   = 0,
-   SCSI_LUN_ADDR_METHOD_FLAT = 1,
-   SCSI_LUN_ADDR_METHOD_LUN  = 2,
-   SCSI_LUN_ADDR_METHOD_EXTENDED_LUN = 3,
-};
-
-/*
- * srpt_unpack_lun() - Convert from network LUN to linear LUN.
- *
- * Convert an 2-byte, 4-byte, 6-byte or 8-byte LUN structure in network byte
- * order (big endian) to a linear LUN. Supports three LUN addressing methods:
- * peripheral, flat and logical unit. See also SAM-2, section 4.9.4 (page 40).
- */
-static uint64_t srpt_unpack_lun(const uint8_t *lun, int len)
-{
-   uint64_t res = NO_SUCH_LUN;
-   int addressing_method;
-
-   if (unlikely(len < 2)) {
-   pr_err("Illegal LUN length %d, expected 2 bytes or more\n",
-  len);
-   goto out;
-   }
-
-   switch (len) {
-   case 8:
-   if ((*((__be64 *)lun) &
-cpu_to_be64(0xLL)) != 0)
-   goto out_err;
-   break;
-   case 4:
-   if (*((__be16 *)[2]) != 0)
-   goto out_err;
-   break;
-   case 6:
-   if (*((__be32 *)[2]) != 0)
-   goto out_err;
-   break;
-   case 2:
-   break;
-   default:
-   goto out_err;
-   }
-
-   addressing_method = (*lun) >> 6; /* highest two bits of byte 0 */
-   switch (addressing_method) {
-   case SCSI_LUN_ADDR_METHOD_PERIPHERAL:
-   case SCSI_LUN_ADDR_METHOD_FLAT:
-   case SCSI_LUN_ADDR_METHOD_LUN:
-   res = *(lun + 1) | (((*lun) & 0x3f) << 8);
-   break;
-
-   case SCSI_LUN_ADDR_METHOD_EXTENDED_LUN:
-   default:
-   pr_err("Unimplemented LUN addressing method %u\n",
-  addressing_method);
-   break;
-   }
-
-out:
-   return res;
-
-out_err:
-   pr_err("Support for multi-level LUNs has not yet been implemented\n");
-   goto out;
-}
-
 static int srpt_check_stop_free(struct se_cmd *cmd)
 {
struct srpt_send_ioctx *ioctx = container_of(cmd,
@@ -1579,7 +1505,6 @@ static int srpt_handle_cmd(struct srpt_rdma_ch *ch,
 {
struct se_cmd *cmd;
struct srp_cmd *srp_cmd;
-   uint64_t unpacked_lun;
u64 data_len;
enum dma_data_direction dir;
sense_reason_t ret;
@@ -1614,11 +1539,10 @@ static int srpt_handle_cmd(struct srpt_rdma_ch *ch,
goto send_sense;
}
 
-   unpacked_lun = srpt_unpack_lun((uint8_t *)_cmd->lun,
-  sizeof(srp_cmd->lun));
rc = target_submit_cmd(cmd, ch->sess, srp_cmd->cdb,
-   _ioctx->sense_data[0], unpacked_lun, data_len,
-   TCM_SIMPLE_TAG, dir, TARGET_SCF_ACK_KREF);
+  _ioctx->sense_data[0],
+  scsilun_to_int(_cmd->lun), data_len,
+  TCM_SIMPLE_TAG, dir, TARGET_SCF_ACK_KREF);
if (rc != 0) {
ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
goto send_sense;
@@ -1704,7 +1628,6 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
struct srp_tsk_mgmt *srp_tsk;
struct se_cmd *cmd;
struct se_session *sess = ch->sess;
-   uint64_t unpacked_lun;
uint32_t tag = 0;
int tcm_tmr;
int rc;
@@ -1726,9 +1649,6 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,
TMR_TASK_MGMT_FUNCTION_NOT_SUPPORTED;
goto fail;
}
-   unpacked_lun = srpt_unpack_lun((uint8_t *)_tsk->lun,
-  sizeof(srp_tsk->lun));
-
if (srp_tsk->tsk_mgmt_func == SRP_TSK_ABORT_TASK) {
rc = srpt_rx_mgmt_fn_tag(send_ioctx, srp_tsk->task_tag);
if (rc < 0) {
@@ -1738,9 +1658,9 @@ static void srpt_handle_tsk_mgmt(struct srpt_rdma_ch *ch,

[PATCH 00/15] Various ib_srpt patches

2016-01-05 Thread Bart Van Assche
The following series of patches is what I came up with while testing the 
most recent version of my SCSI target patch series (see also 
http://thread.gmane.org/gmane.linux.scsi.target.devel/10905):


0001-IB-srpt-Add-parentheses-around-sizeof-argument.patch
0002-IB-srpt-Inline-srpt_sdev_name.patch
0003-IB-srpt-Inline-srpt_get_ch_state.patch
0004-IB-srpt-Introduce-target_reverse_dma_direction.patch
0005-IB-srpt-Use-scsilun_to_int.patch
0006-IB-srpt-Simplify-srpt_handle_tsk_mgmt.patch
0007-IB-srpt-Simplify-channel-state-management.patch
0008-IB-srpt-Simplify-srpt_shutdown_session.patch
0009-IB-srpt-Fix-srpt_close_session.patch
0010-IB-srpt-Fix-srpt_handle_cmd-error-paths.patch
0011-IB-srpt-Fix-how-aborted-commands-are-processed.patch
0012-IB-srpt-Eliminate-srpt_find_channel.patch
0013-IB-srpt-Detect-session-shutdown-reliably.patch
0014-IB-srpt-Fix-srpt_write_pending.patch
0015-IB-srpt-Fix-a-rare-crash-in-srpt_close_session.patch
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/sysfs: Fix sparse warning on attr_id

2016-01-04 Thread Bart Van Assche

On 01/04/2016 04:44 AM, ira.we...@intel.com wrote:

From: Ira Weiny <ira.we...@intel.com>

Attributed ID was declared as an int while the value should really be big
endian 16.

Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")

Reported-by: Bart Van Assche <bart.vanass...@sandisk.com>
Signed-off-by: Ira Weiny <ira.we...@intel.com>


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] IB/srpt: Fix a race condition related to SRP login

2016-01-03 Thread Bart Van Assche

On 01/03/2016 12:25 PM, Sagi Grimberg wrote:

But now the first I/O(s) could be lost if no other I/O comes in,
right?  I suspect that we need to keep this loop to protect against
such corner cases.


It can happen theoretically, but why do we even bother? Why not just
post the recv buffer after we moved in to CH_LIVE? This why we let the
RC transport handle the "Recv-Not-Ready" NAK by retrying?


Making wait list processing in the RTU handler safe would require to 
introduce additional locking. Posting receive buffers after the RTU 
event has been received would introduce another race condition because 
then it can happen that an initiator starts sending data before the 
receive buffers have been posted. A possible solution would be to 
trigger the receive handler after the RTU event has been received in 
such a way that all invocations of the receive handler are still serialized.


Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IB/cm: Fix a recently introduced deadlock

2016-01-01 Thread Bart Van Assche
ib_send_cm_drep() calls cm_enter_timewait() while holding a spinlock
that can be locked from inside an interrupt handler. Hence do not
enable interrupts inside cm_enter_timewait() if called with interrupts
disabled.

This patch fixes e.g. the following deadlock:

=
[ INFO: inconsistent lock state ]
4.4.0-rc7+ #1 Tainted: GE
-
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(_id_priv->lock)->rlock){?.+...}, at: [] cm_establish+0x
74/0x1b0 [ib_cm]
{HARDIRQ-ON-W} state was registered at:
  [] mark_held_locks+0x71/0x90
  [] trace_hardirqs_on_caller+0xa7/0x1c0
  [] trace_hardirqs_on+0xd/0x10
  [] _raw_spin_unlock_irq+0x2b/0x40
  [] cm_enter_timewait+0xae/0x100 [ib_cm]
  [] ib_send_cm_drep+0xb6/0x190 [ib_cm]
  [] srp_cm_handler+0x128/0x1a0 [ib_srp]
  [] cm_process_work+0x20/0xf0 [ib_cm]
  [] cm_dreq_handler+0x135/0x2c0 [ib_cm]
  [] cm_work_handler+0x75/0xd0 [ib_cm]
  [] process_one_work+0x1bd/0x460
  [] worker_thread+0x118/0x420
  [] kthread+0xe4/0x100
  [] ret_from_fork+0x3f/0x70
irq event stamp: 1672286
hardirqs last  enabled at (1672283): [] poll_idle+0x10/0x80
hardirqs last disabled at (1672284): [] 
common_interrupt+0x84/0x89
softirqs last  enabled at (1672286): [] 
_local_bh_enable+0x1c/0x50
softirqs last disabled at (1672285): [] irq_enter+0x47/0x70

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(&(_id_priv->lock)->rlock);
  
lock(&(_id_priv->lock)->rlock);

 *** DEADLOCK ***

no locks held by swapper/8/0.

stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: GE   4.4.0-rc7+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
 88045af5e950 88046e503a88 81251c1b 0007
 0006 0003 88045af5ddc0 88046e503ad8
 810a32f4   0001
Call Trace:
   [] dump_stack+0x4f/0x74
 [] print_usage_bug+0x184/0x190
 [] mark_lock_irq+0xf2/0x290
 [] mark_lock+0x115/0x1b0
 [] mark_irqflags+0x15c/0x170
 [] __lock_acquire+0x1ef/0x560
 [] lock_acquire+0x62/0x80
 [] _raw_spin_lock_irqsave+0x43/0x60
 [] cm_establish+0x74/0x1b0 [ib_cm]
 [] ib_cm_notify+0x31/0x100 [ib_cm]
 [] srpt_qp_event+0x54/0xd0 [ib_srpt]
 [] mlx4_ib_qp_event+0x72/0xc0 [mlx4_ib]
 [] mlx4_qp_event+0x69/0xd0 [mlx4_core]
 [] mlx4_eq_int+0x51e/0xd50 [mlx4_core]
 [] mlx4_msi_x_interrupt+0xf/0x20 [mlx4_core]
 [] handle_irq_event_percpu+0x40/0x110
 [] handle_irq_event+0x3f/0x70
 [] handle_edge_irq+0x79/0x120
 [] handle_irq+0x5d/0x130
 [] do_IRQ+0x6d/0x130
 [] common_interrupt+0x89/0x89
   [] cpuidle_enter_state+0xcf/0x200
 [] cpuidle_enter+0x12/0x20
 [] call_cpuidle+0x36/0x60
 [] cpuidle_idle_call+0x63/0x110
 [] cpu_idle_loop+0xfa/0x130
 [] cpu_startup_entry+0xe/0x10
 [] start_secondary+0x83/0x90

Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going 
away")
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Erez Shitrit <ere...@mellanox.com>
Cc: stable <sta...@vger.kernel.org>
---
 drivers/infiniband/core/cm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 0a26dd6..d6d2b35 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -782,11 +782,11 @@ static void cm_enter_timewait(struct cm_id_private 
*cm_id_priv)
wait_time = cm_convert_to_ms(cm_id_priv->av.timeout);
 
/* Check if the device started its remove_one */
-   spin_lock_irq();
+   spin_lock_irqsave(, flags);
if (!cm_dev->going_down)
queue_delayed_work(cm.wq, _id_priv->timewait_info->work.work,
   msecs_to_jiffies(wait_time));
-   spin_unlock_irq();
+   spin_unlock_irqrestore(, flags);
 
cm_id_priv->timewait_info = NULL;
 }
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 crash during boot

2015-12-31 Thread Bart Van Assche

On 12/30/2015 01:34 PM, Leon Romanovsky wrote:

On Wed, Dec 30, 2015 at 02:22:23PM +0200, Or Gerlitz wrote:

On 12/30/2015 2:04 PM, Bart Van Assche wrote:

Hello Christoph,

Can you check whether the branch in the subject of this e-mail works fine
on your setup (commit 59caaed7a7) ? On my test setup (Dell R430 with two
ConnectX-3 adapters) this branch crashes during boot in
get_counter_table() (see also the attached screenshot).


Can you please check with Hal's fix that was posted here 1-2 days ago,
hopefully this should make your system to work


Or referenced to this patch [1], it should fix your crash.

[1] https://patchwork.kernel.org/patch/7929551/


Hello Or and Leon,

That patch indeed fixes the crash I had reported.

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] irq_poll: Fix irq_poll_sched()

2015-12-31 Thread Bart Van Assche
The IRQ_POLL_F_SCHED bit is set as long as polling is ongoing.
This means that irq_poll_sched() must proceed if this bit has
not yet been set.

Fixes: commit ea51190c0315 ("irq_poll: fold irq_poll_sched_prep into 
irq_poll_sched").
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
---
 lib/irq_poll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2836620..836f7db 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -29,7 +29,7 @@ void irq_poll_sched(struct irq_poll *iop)
 
if (test_bit(IRQ_POLL_F_DISABLE, >state))
return;
-   if (!test_and_set_bit(IRQ_POLL_F_SCHED, >state))
+   if (test_and_set_bit(IRQ_POLL_F_SCHED, >state))
return;
 
local_irq_save(flags);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] Three SRP driver patches for kernel 4.5

2015-12-31 Thread Bart Van Assche

Hello Doug,

This series of three patches fixes new issues that exist in the
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 
branch. It would be appreciated if these patches could be included in 
the kernel 4.5 RDMA pull request.


0001-irq_poll-Fix-irq_poll_sched.patch
0002-IB-srpt-Fix-the-RDMA-completion-handlers.patch
0003-IB-srpt-Fix-a-race-condition-related-to-SRP-login.patch

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] IB/srpt: Fix the RDMA completion handlers

2015-12-31 Thread Bart Van Assche
Avoid that the following kernel crash is triggered when processing
an RDMA completion:

BUG: unable to handle kernel paging request at 00010198
IP: [] __lock_acquire+0xa2/0x560
Call Trace:
 [] lock_acquire+0x62/0x80
 [] _raw_spin_lock_irqsave+0x43/0x60
 [] srpt_rdma_read_done+0x57/0x120 [ib_srpt]
 [] __ib_process_cq+0x43/0xc0 [ib_core]
 [] ib_cq_poll_work+0x25/0x70 [ib_core]
 [] process_one_work+0x1bd/0x460
 [] worker_thread+0x118/0x420
 [] kthread+0xe4/0x100
 [] ret_from_fork+0x3f/0x70

Fixes: commit 59fae4deaad3 ("IB/srpt: chain RDMA READ/WRITE requests").
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 105512d..b42cf0d 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1395,7 +1395,7 @@ static void srpt_rdma_read_done(struct ib_cq *cq, struct 
ib_wc *wc)
 {
struct srpt_rdma_ch *ch = cq->cq_context;
struct srpt_send_ioctx *ioctx =
-   container_of(wc->wr_cqe, struct srpt_send_ioctx, ioctx.cqe);
+   container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
WARN_ON(ioctx->n_rdma <= 0);
atomic_add(ioctx->n_rdma, >sq_wr_avail);
@@ -1418,7 +1418,7 @@ static void srpt_rdma_read_done(struct ib_cq *cq, struct 
ib_wc *wc)
 static void srpt_rdma_write_done(struct ib_cq *cq, struct ib_wc *wc)
 {
struct srpt_send_ioctx *ioctx =
-   container_of(wc->wr_cqe, struct srpt_send_ioctx, ioctx.cqe);
+   container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
if (unlikely(wc->status != IB_WC_SUCCESS)) {
pr_info("RDMA_WRITE for ioctx 0x%p failed with status %d\n",
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] IB/srpt: Fix a race condition related to SRP login

2015-12-31 Thread Bart Van Assche
Since patch "IB/srpt: chain RDMA READ/WRITE requests" there are
two loops that process the command wait list. ch->cmd_wait_list
is accessed without locking which means that all code that
accesses this list must be serialized. Since processing of the
RTU event happens from another context than IB WC processing,
remove the wait list processing code from the RTU handler.

Fixes: commit a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1").
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index b42cf0d..48b27dc 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2540,15 +2540,7 @@ static void srpt_cm_rtu_recv(struct ib_cm_id *cm_id)
BUG_ON(!ch);
 
if (srpt_test_and_set_ch_state(ch, CH_CONNECTING, CH_LIVE)) {
-   struct srpt_recv_ioctx *ioctx, *ioctx_tmp;
-
ret = srpt_ch_qp_rts(ch, ch->qp);
-
-   list_for_each_entry_safe(ioctx, ioctx_tmp, >cmd_wait_list,
-wait_list) {
-   list_del(>wait_list);
-   srpt_handle_new_iu(ch, ioctx, NULL);
-   }
if (ret)
srpt_close_ch(ch);
}
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/core: Allocating larger memory than required for cma_configfs

2015-12-30 Thread Bart Van Assche

On 12/30/2015 03:14 PM, Matan Barak wrote:

We were allocating larger memory space than requried for
cma_dev_group->default_ports_group.


Please change the subject into something like "Do not allocate more 
...". Please also fix the spelling error in the patch description.


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 crash during boot

2015-12-30 Thread Bart Van Assche

Hello Christoph,

Can you check whether the branch in the subject of this e-mail works 
fine on your setup (commit 59caaed7a7) ? On my test setup (Dell R430 
with two ConnectX-3 adapters) this branch crashes during boot in 
get_counter_table() (see also the attached screenshot).


Thanks,

Bart.


Re: [PATCH for-next 2/3] IB/core: Change per-entry lock in RoCE GID table to one lock

2015-12-29 Thread Bart Van Assche

On 12/30/2015 07:01 AM, Or Gerlitz wrote:

On 10/28/2015 4:52 PM, Matan Barak wrote:

@@ -134,16 +138,14 @@ static int write_gid(struct ib_device *ib_dev,
u8 port,
  {
  int ret = 0;
  struct net_device *old_net_dev;
-unsigned long flags;
  /* in rdma_cap_roce_gid_table, this funciton should be protected
by a
   * sleep-able lock.
   */
-write_lock_irqsave(>data_vec[ix].lock, flags);
  if (rdma_cap_roce_gid_table(ib_dev, port)) {
  table->data_vec[ix].props |= GID_TABLE_ENTRY_INVALID;
-write_unlock_irqrestore(>data_vec[ix].lock, flags);
+write_unlock_irq(>rwlock);
  /* GID_TABLE_WRITE_ACTION_MODIFY currently isn't supported by
   * RoCE providers and thus only updates the cache.
   */
@@ -153,7 +155,7 @@ static int write_gid(struct ib_device *ib_dev, u8
port,
  else if (action == GID_TABLE_WRITE_ACTION_DEL)
  ret = ib_dev->del_gid(ib_dev, port, ix,
>data_vec[ix].context);
-write_lock_irqsave(>data_vec[ix].lock, flags);
+write_lock_irq(>rwlock);
  }


sparse complains on

drivers/infiniband/core/cache.c:186:17: warning: context imbalance in
'write_gid' - unexpected unlock

is this false positive?


Hello Or,

sparse expects __release() and __acquire() annotations for functions 
that unlock a lock object that has been locked by its caller. See e.g. 
http://lists.kernelnewbies.org/pipermail/kernelnewbies/2011-October/003541.html.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: completion queue abstraction V2

2015-12-29 Thread Bart Van Assche

On 12/07/2015 09:51 PM, Christoph Hellwig wrote:

This series adds a new RDMA core abstraction that insulated the
ULPs from the nitty gritty details of CQ polling.  See the individual
patches for more details.


Hello Christoph,

After having tested the SRP initiator and target drivers with this patch 
series applied I have further feedback about this patch series. I will 
provide that feedback as replies to the individual patches.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/13] irq_poll: fold irq_poll_sched_prep into irq_poll_sched

2015-12-29 Thread Bart Van Assche

On 12/07/2015 09:51 PM, Christoph Hellwig wrote:

diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 88af879..13cb149 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -21,13 +21,17 @@ static DEFINE_PER_CPU(struct list_head, blk_cpu_iopoll);
   *
   * Description:
   * Add this irq_poll structure to the pending poll list and trigger the
- * raise of the blk iopoll softirq. The driver must already have gotten a
- * successful return from irq_poll_sched_prep() before calling this.
+ * raise of the blk iopoll softirq.
   **/
  void irq_poll_sched(struct irq_poll *iop)
  {
unsigned long flags;

+   if (test_bit(IRQ_POLL_F_DISABLE, >state))
+   return;
+   if (!test_and_set_bit(IRQ_POLL_F_SCHED, >state))
+   return;
+
local_irq_save(flags);
list_add_tail(>list, this_cpu_ptr(_cpu_iopoll));
__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);


After having applied these changes the SRP initiator didn't receive any 
RDMA completions anymore. I could remedy that by changing 
"!test_and_set_bit()" into "test_and_set_bit()":


diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 43a3370..3a67019 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -29,7 +29,7 @@ void irq_poll_sched(struct irq_poll *iop)

if (test_bit(IRQ_POLL_F_DISABLE, >state))
return;
-   if (!test_and_set_bit(IRQ_POLL_F_SCHED, >state))
+   if (test_and_set_bit(IRQ_POLL_F_SCHED, >state))
return;

local_irq_save(flags);
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] IB/srpt: chain RDMA READ/WRITE requests

2015-12-29 Thread Bart Van Assche
On 12/07/2015 09:51 PM, Christoph Hellwig wrote:
> Remove struct rdma_iu and instead allocate the struct ib_rdma_wr array
> early and fill out directly.  This allows us to chain the WRs, and thus
> archive both less lock contention on the HCA workqueue as well as much
> simpler error handling.

Please consider folding the patch below into this patch.

Thanks,

Bart.

[PATCH] IB/srpt: Fix a recently introduced kernel crash

BUG: unable to handle kernel paging request at 00010198
IP: [] __lock_acquire+0xa2/0x560
Call Trace:
 [] lock_acquire+0x62/0x80
 [] _raw_spin_lock_irqsave+0x43/0x60
 [] srpt_rdma_read_done+0x57/0x120 [ib_srpt]
 [] __ib_process_cq+0x43/0xc0 [ib_core]
 [] ib_cq_poll_work+0x25/0x70 [ib_core]
 [] process_one_work+0x1bd/0x460
 [] worker_thread+0x118/0x420
 [] kthread+0xe4/0x100
 [] ret_from_fork+0x3f/0x70

---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 8068aff..3daab39 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1395,7 +1395,7 @@ static void srpt_rdma_read_done(struct ib_cq *cq, struct 
ib_wc *wc)
 {
struct srpt_rdma_ch *ch = cq->cq_context;
struct srpt_send_ioctx *ioctx =
-   container_of(wc->wr_cqe, struct srpt_send_ioctx, ioctx.cqe);
+   container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
WARN_ON(ioctx->n_rdma <= 0);
atomic_add(ioctx->n_rdma, >sq_wr_avail);
@@ -1418,7 +1418,7 @@ static void srpt_rdma_read_done(struct ib_cq *cq, struct 
ib_wc *wc)
 static void srpt_rdma_write_done(struct ib_cq *cq, struct ib_wc *wc)
 {
struct srpt_send_ioctx *ioctx =
-   container_of(wc->wr_cqe, struct srpt_send_ioctx, ioctx.cqe);
+   container_of(wc->wr_cqe, struct srpt_send_ioctx, rdma_cqe);
 
if (unlikely(wc->status != IB_WC_SUCCESS)) {
pr_info("RDMA_WRITE for ioctx 0x%p failed with status %d\n",
-- 
2.1.4


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IB/core: Remove a set-but-not-used variable from ib_sg_to_pages()

2015-12-29 Thread Bart Van Assche
Detected this by building the IB core with W=1. See also patch
"IB core: Fix ib_sg_to_pages()" (commit 8f5ba10ed40a).

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/core/verbs.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 545906d..c90ed29 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1530,7 +1530,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
   int (*set_page)(struct ib_mr *, u64))
 {
struct scatterlist *sg;
-   u64 last_end_dma_addr = 0, last_page_addr = 0;
+   u64 last_end_dma_addr = 0;
unsigned int last_page_off = 0;
u64 page_mask = ~((u64)mr->page_size - 1);
int i, ret;
@@ -1572,7 +1572,6 @@ next_page:
 
mr->length += dma_len;
last_end_dma_addr = end_dma_addr;
-   last_page_addr = end_dma_addr & page_mask;
last_page_off = end_dma_addr & ~page_mask;
}
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr

2015-12-18 Thread Bart Van Assche

On 12/18/2015 02:55 PM, Christoph Hellwig wrote:

Signed-off-by: Christoph Hellwig <h...@lst.de>


Shouldn't the description of this patch be changed into something like 
"Remove the usecnt field from ib_mr since it is always zero" ?


Anyway:

Reviewed-by: Bart Van Assche <bvanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next V1 3/7] IB/ulps: Avoid calling ib_query_device

2015-12-18 Thread Bart Van Assche

On 12/18/2015 09:59 AM, Or Gerlitz wrote:

Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz 
---
  drivers/infiniband/ulp/ipoib/ipoib_cm.c  | 19 ---
  drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++--
  drivers/infiniband/ulp/ipoib/ipoib_main.c| 21 +
  drivers/infiniband/ulp/iser/iscsi_iser.c |  4 +--
  drivers/infiniband/ulp/iser/iscsi_iser.h |  2 --
  drivers/infiniband/ulp/iser/iser_memory.c|  9 +++---
  drivers/infiniband/ulp/iser/iser_verbs.c | 38 ++
  drivers/infiniband/ulp/isert/ib_isert.c  | 47 +---
  drivers/infiniband/ulp/isert/ib_isert.h  |  1 -
  drivers/infiniband/ulp/srp/ib_srp.c  | 32 ++-
  drivers/infiniband/ulp/srpt/ib_srpt.c| 15 -
  drivers/infiniband/ulp/srpt/ib_srpt.h|  3 --
  12 files changed, 63 insertions(+), 142 deletions(-)


Shouldn't this patch be split per ULP driver to make review easier ?

If you have a look at the MAINTAINERS file then you will see that you 
forgot to CC the SRP initiator maintainer.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next V1 0/7] dev attr cleanup (less is more)

2015-12-18 Thread Bart Van Assche

On 12/18/2015 09:59 AM, Or Gerlitz wrote:

OK, Doug, this is my suggestion for the dev attr cleanup -- it
has the advantages of leaving the attrs on a well defined location,
a field in the IB device, the ability to  get that through smaller
patches, avoid touching any of the HW drivers, etc.


Hello Or,

This version of the dev attr cleanup patches (just like the previous 
version) makes kernel code access the attrs member of struct ib_device 
while user space code keeps calling the query_device function. If there 
would be a bug in one of the query_device implementations, e.g. that the 
correct data is only returned some time after device initialization has 
finished, then user space code will see other attribute values than 
kernel code. I think we should avoid such subtle and hard to debug 
behavior. Hence my proposal to modify ib_uverbs_ex_query_device() such 
that it fetches attribute information from the ib_device structure 
instead of by calling query_device().


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure

2015-12-17 Thread Bart Van Assche

On 12/17/2015 06:41 PM, Jason Gunthorpe wrote:

On Thu, Dec 17, 2015 at 03:44:19PM +0200, Sagi Grimberg wrote:



+   ret = ib_query_device(device, >attrs);
+   if (ret) {
+   printk(KERN_WARNING "Couldn't query the device attributes\n");
+   goto out;
+   }
+


I thought we're all for removing the call altogether aren't we?

I'd say just call device->query_device() instead.


Christoph's patch even got rid of device->query_device(), which, IHMO,
I prefer to see over this. It re-enforces that these values are
constants and drivers cannot change them on the fly.


I also would like to see the query_device() implementations to be 
removed from the hw drivers.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/13] irq_poll: remove unused data and max fields

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

[ ... ]


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] IB/srpt: use the new CQ API

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

[ ... ]


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/13] irq_poll: don't disable new irq_poll instances

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

There is no good reason to start out disabled - drivers can control if
the poll instance can be scheduled by simply not scheduling it yet.


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/13] irq_poll: mark __irq_poll_complete static

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

[ ... ]


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/13] irq_poll: fold irq_poll_disable_pending into irq_poll_softirq

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:
> [ ... ]

Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/13] irq_poll: fold irq_poll_sched_prep into irq_poll_sched

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

@@ -58,7 +62,7 @@ EXPORT_SYMBOL(__irq_poll_complete);
   * Description:
   * If a driver consumes less than the assigned budget in its run of the
   * iopoll handler, it'll end the polled mode by calling this function. The
- * iopoll handler will not be invoked again before irq_poll_sched_prep()
+ * iopoll handler will not be invoked again before irq_poll_schedp()
   * is called.
   **/
  void irq_poll_complete(struct irq_poll *iop)


Is that a typo at the end of the modified line ("schedp") ? If this gets 
addressed:


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/13] IB/srp: use the new CQ API

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

+static void srp_send_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+   struct srp_iu *iu = container_of(wc->wr_cqe, struct srp_iu, cqe);
+   struct srp_rdma_ch *ch = cq->cq_context;
+
+   if (likely(wc->status != IB_WC_SUCCESS)) {
+   srp_handle_qp_err(cq, wc, "SEND");
+   return;
+   }
+
+   list_add(>list, >free_tx);
+}


Please change likely() in the above code into unlikely().

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/13] IB: add a proper completion queue abstraction

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

This adds an abstraction that allows ULP to simply pass a completion

   ^^^

I think this should either be changed into either "an ULP" or "ULPs".


+/**
+ * ib_process_direct_cq - process a CQ in caller context
+ * @cq:CQ to process
+ * @budget:number of CQEs to poll for
+ *
+ * This function is used to process all outstanding CQ entries on a
+ * %IB_POLL_DIRECT CQ.  It does not offload CQ processing to a different
+ * context and does not ask from completion interrupts from the HCA.

   

Should this perhaps be changed into "for" ?


+ *
+ * Note: for compatibility reasons -1 can be passed in %budget for unlimited
+ * polling.  Do not use this feature in new code, it will be remove soon.

^^

Did you perhaps intend "removed" ?



+struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private,
+   int nr_cqe, int comp_vector, enum ib_poll_context poll_ctx)
+{

> [ ... ]

+   cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);


Why is the wc array allocated separately instead of being embedded in 
struct ib_cq ? I think the faster completion queues can be created the 
better so if it is possible to eliminate the above kmalloc() call I 
would prefer that.



--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -457,10 +457,11 @@ static struct srp_fr_pool *srp_alloc_fr_pool(struct 
srp_target_port *target)
  static void srp_destroy_qp(struct srp_rdma_ch *ch)
  {
static struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
-   static struct ib_recv_wr wr = { .wr_id = SRP_LAST_WR_ID };
+   static struct ib_recv_wr wr = { 0 };
struct ib_recv_wr *bad_wr;
int ret;


Is explicit initialization to "{ 0 }" really needed for static structures ?

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/13] irq_poll: make blk-iopoll available outside the block layer

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

-enum {
-   IOPOLL_F_SCHED  = 0,
-   IOPOLL_F_DISABLE= 1,
-};


[ ... ]


+enum {
+   IRQ_POLL_F_SCHED= 0,
+   IRQ_POLL_F_DISABLE  = 1,
+};


A nitpick: the values of these constants were aligned in the original 
code but no longer in the new code. Whether or not this gets addressed:


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/13] IB/srpt: chain RDMA READ/WRITE requests

2015-12-10 Thread Bart Van Assche

On 12/07/2015 12:51 PM, Christoph Hellwig wrote:

Remove struct rdma_iu and instead allocate the struct ib_rdma_wr array
early and fill out directly.  This allows us to chain the WRs, and thus
archive both less lock contention on the HCA workqueue as well as much

  ^^^

Did you perhaps intend "achieve" ?


  struct srpt_send_ioctx {
struct srpt_ioctx   ioctx;
struct srpt_rdma_ch *ch;
-   struct rdma_iu  *rdma_ius;
+   struct ib_rdma_wr   *rdma_ius;


Please rename the "rdma_ius" member into "wr" or any other name that 
shows that this member is now a work request array and nothing else.


Otherwise this patch looks fine to me.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] SRP initiator related bug fixes

2015-12-07 Thread Bart Van Assche

On 12/05/2015 03:17 AM, Christoph Hellwig wrote:

Hi Bart,

On Fri, Dec 04, 2015 at 03:08:15PM -0800, Bart Van Assche wrote:

While preparing this patch series I noticed that none of the SCSI RDMA
initiator drivers syncs RDMA buffers before performing RDMA. Does
anyone know why something like the code below is not present in these
drivers and why no dma_sync_sg operations are present in struct
ib_dma_mapping_ops ?


dma_map*/dma_unmap* have to do implicit syncs, so no explicit call
is required.  That's probably the reason why we don't need them in
the weird RDMA shadows of these functions (for which I still don't
understand the reason, while we're at it..).


Hello Christoph,

struct ib_dma_mapping_ops was added through commit "IB: Add DMA mapping 
functions to allow device drivers to interpose" 
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=9b513090a3c5e4964f9ac09016c1586988abb3d5). 
I'm not sure that these functions are still needed by the qib or ipath 
drivers since the .dma_address and .dma_len members have been removed 
some time ago from that structure (see also 
http://thread.gmane.org/gmane.linux.drivers.rdma/19730). It would be 
useful to have data available that shows the performance difference with 
and without struct ib_dma_mapping_ops for the qib and ipath drivers but 
I do not know whether such data is available publicly.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] SRP initiator related bug fixes

2015-12-07 Thread Bart Van Assche

On 12/01/2015 10:16 AM, Bart Van Assche wrote:

This patch series contains six bug fixes either for the SRP initiator
itself or for IB core functionality used by the SRP initiator. The order
of these patches matches the order in which the corresponding bugs have
been introduced. The patches in this series are:

0001-IB-srp-Fix-a-memory-leak.patch
0002-IB-srp-Fix-possible-send-queue-overflow.patch
0003-IB-srp-Initialize-dma_length-in-srp_map_idb.patch
0004-IB-srp-Fix-indirect-data-buffer-rkey-endianness.patch
0005-IB-core-Fix-ib_sg_to_pages.patch
0006-IB-srp-Fix-srp_map_sg_fr.patch


(replying to my own e-mail)

Hello Doug,

A consensus has been achieved about this patch series. Since one of the 
patches in this series fixes a data corruption bug that has been 
introduced during the latest (v4.4) merge window I'd like to see these 
patches upstream soon. Please recommend how I should proceed. When I 
repost this patch series, should I send it to you, to Linus or perhaps 
to someone else ?


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] SRP initiator related bug fixes

2015-12-07 Thread Bart Van Assche

On 12/07/2015 01:07 PM, Doug Ledford wrote:

On 12/07/2015 02:26 PM, Bart Van Assche wrote:

On 12/01/2015 10:16 AM, Bart Van Assche wrote:

This patch series contains six bug fixes either for the SRP initiator
itself or for IB core functionality used by the SRP initiator. The order
of these patches matches the order in which the corresponding bugs have
been introduced. The patches in this series are:

0001-IB-srp-Fix-a-memory-leak.patch
0002-IB-srp-Fix-possible-send-queue-overflow.patch
0003-IB-srp-Initialize-dma_length-in-srp_map_idb.patch
0004-IB-srp-Fix-indirect-data-buffer-rkey-endianness.patch
0005-IB-core-Fix-ib_sg_to_pages.patch
0006-IB-srp-Fix-srp_map_sg_fr.patch


(replying to my own e-mail)

Hello Doug,

A consensus has been achieved about this patch series. Since one of the
patches in this series fixes a data corruption bug that has been
introduced during the latest (v4.4) merge window I'd like to see these
patches upstream soon. Please recommend how I should proceed. When I
repost this patch series, should I send it to you, to Linus or perhaps
to someone else ?


I was just getting ready to pull this set in.  The only one you actually
changed, and therefore needs either a resubmit or I have to be careful
to get the follow-on version in Patchworks is 5/6, yes?


Hello Doug,

You are correct, only for patch 5 the patch itself has been modified. 
The only change for the other patches is that Reviewed-by tags have been 
posted on the list.


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] SRP initiator related bug fixes

2015-12-04 Thread Bart Van Assche
On 12/01/2015 10:16 AM, Bart Van Assche wrote:
> [ ... ]

Hello,

While preparing this patch series I noticed that none of the SCSI RDMA
initiator drivers syncs RDMA buffers before performing RDMA. Does
anyone know why something like the code below is not present in these
drivers and why no dma_sync_sg operations are present in struct
ib_dma_mapping_ops ?

Thanks,

Bart.


[PATCH] IB/srp: Sync scatterlists before and after DMA

---
 drivers/infiniband/ulp/srp/ib_srp.c | 10 ++
 include/rdma/ib_verbs.h | 16 
 2 files changed, 26 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 3db9a65..23e3c25 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1780,6 +1780,7 @@ static int srp_post_recv(struct srp_rdma_ch *ch, struct 
srp_iu *iu)
 static void srp_process_rsp(struct srp_rdma_ch *ch, struct srp_rsp *rsp)
 {
struct srp_target_port *target = ch->target;
+   struct ib_device *dev = target->srp_host->srp_dev->dev;
struct srp_request *req;
struct scsi_cmnd *scmnd;
unsigned long flags;
@@ -1828,6 +1829,11 @@ static void srp_process_rsp(struct srp_rdma_ch *ch, 
struct srp_rsp *rsp)
else if (unlikely(rsp->flags & SRP_RSP_FLAG_DOOVER))
scsi_set_resid(scmnd, 
-be32_to_cpu(rsp->data_out_res_cnt));
 
+   if (scmnd->sc_data_direction != DMA_NONE)
+   ib_dma_sync_sg_for_cpu(dev, scsi_sglist(scmnd),
+  scsi_sg_count(scmnd),
+  scmnd->sc_data_direction);
+
srp_free_req(ch, req, scmnd,
 be32_to_cpu(rsp->req_lim_delta));
 
@@ -2112,6 +2118,10 @@ static int srp_queuecommand(struct Scsi_Host *shost, 
struct scsi_cmnd *scmnd)
 
ib_dma_sync_single_for_device(dev, iu->dma, target->max_iu_len,
  DMA_TO_DEVICE);
+   if (scmnd->sc_data_direction != DMA_NONE)
+   ib_dma_sync_sg_for_device(dev, scsi_sglist(scmnd),
+ scsi_sg_count(scmnd),
+ scmnd->sc_data_direction);
 
if (srp_post_send(ch, iu, len)) {
shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 9a68a19..5f9cba7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2796,6 +2796,22 @@ static inline void ib_dma_sync_single_for_device(struct 
ib_device *dev,
dma_sync_single_for_device(dev->dma_device, addr, size, dir);
 }
 
+/* Prepare DMA region to be accessed by CPU */
+static inline void ib_dma_sync_sg_for_cpu(struct ib_device *dev,
+ struct scatterlist *sg, int nelems,
+ enum dma_data_direction dir)
+{
+   dma_sync_sg_for_cpu(dev->dma_device, sg, nelems, dir);
+}
+
+/* Prepare DMA region to be accessed by HCA */
+static inline void ib_dma_sync_sg_for_device(struct ib_device *dev,
+struct scatterlist *sg, int nelems,
+enum dma_data_direction dir)
+{
+   dma_sync_sg_for_device(dev->dma_device, sg, nelems, dir);
+}
+
 /**
  * ib_dma_alloc_coherent - Allocate memory and map it for DMA
  * @dev: The device for which the DMA address is requested
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] IB core: Fix ib_sg_to_pages()

2015-12-03 Thread Bart Van Assche
On 12/03/2015 01:18 AM, Christoph Hellwig wrote:
> The patch looks good to me, but while we touch this area, how about
> throwing in a few cosmetic fixes as well?
 
How about the patch below ? In that version of the ib_sg_to_pages() fix
these concerns have been addressed and additionally to more bugs have been 
fixed.



[PATCH] IB core: Fix ib_sg_to_pages()

Fix the code for detecting gaps. A gap occurs not only if the
second or later scatterlist element is not aligned but also if
any scatterlist element other than the last does not end at a
page boundary.

In the code for coalescing contiguous elements, ensure that
mr->length is correct and that last_page_addr is up-to-date.

Ensure that this function returns a negative
error code instead of zero if the first set_page() call fails.

Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API")
Reported-by: Christoph Hellwig 
---
 drivers/infiniband/core/verbs.c | 43 +
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 043a60e..545906d 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1516,7 +1516,7 @@ EXPORT_SYMBOL(ib_map_mr_sg);
  * @sg_nents:  number of entries in sg
  * @set_page:  driver page assignment function pointer
  *
- * Core service helper for drivers to covert the largest
+ * Core service helper for drivers to convert the largest
  * prefix of given sg list to a page vector. The sg list
  * prefix converted is the prefix that meet the requirements
  * of ib_map_mr_sg.
@@ -1533,7 +1533,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
u64 last_end_dma_addr = 0, last_page_addr = 0;
unsigned int last_page_off = 0;
u64 page_mask = ~((u64)mr->page_size - 1);
-   int i;
+   int i, ret;
 
mr->iova = sg_dma_address([0]);
mr->length = 0;
@@ -1544,27 +1544,29 @@ int ib_sg_to_pages(struct ib_mr *mr,
u64 end_dma_addr = dma_addr + dma_len;
u64 page_addr = dma_addr & page_mask;
 
-   if (i && page_addr != dma_addr) {
-   if (last_end_dma_addr != dma_addr) {
-   /* gap */
-   goto done;
-
-   } else if (last_page_off + dma_len <= mr->page_size) {
-   /* chunk this fragment with the last */
-   mr->length += dma_len;
-   last_end_dma_addr += dma_len;
-   last_page_off += dma_len;
-   continue;
-   } else {
-   /* map starting from the next page */
-   page_addr = last_page_addr + mr->page_size;
-   dma_len -= mr->page_size - last_page_off;
-   }
+   /*
+* For the second and later elements, check whether either the
+* end of element i-1 or the start of element i is not aligned
+* on a page boundary.
+*/
+   if (i && (last_page_off != 0 || page_addr != dma_addr)) {
+   /* Stop mapping if there is a gap. */
+   if (last_end_dma_addr != dma_addr)
+   break;
+
+   /*
+* Coalesce this element with the last. If it is small
+* enough just update mr->length. Otherwise start
+* mapping from the next page.
+*/
+   goto next_page;
}
 
do {
-   if (unlikely(set_page(mr, page_addr)))
-   goto done;
+   ret = set_page(mr, page_addr);
+   if (unlikely(ret < 0))
+   return i ? : ret;
+next_page:
page_addr += mr->page_size;
} while (page_addr < end_dma_addr);
 
@@ -1574,7 +1576,6 @@ int ib_sg_to_pages(struct ib_mr *mr,
last_page_off = end_dma_addr & ~page_mask;
}
 
-done:
return i;
 }
 EXPORT_SYMBOL(ib_sg_to_pages);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] IB core: Fix ib_sg_to_pages()

2015-12-02 Thread Bart Van Assche
On 12/02/2015 01:31 AM, Sagi Grimberg wrote:
> On 01/12/2015 21:10, Bart Van Assche wrote:
>> On 12/01/2015 10:32 AM, Sagi Grimberg wrote:
>> How ib_sg_to_pages() reports to its caller that mapping the first
>> scatterlist element failed is not important to me. I included that
>> change in this patch because I noticed that in the upstream SRP
>> initiator does not consider ib_map_mr_sg() returning zero as an error. I
>> think either ib_sg_to_pages() or ib_map_mr_sg() should be modified such
>> that "no pages have been mapped" is handled as a mapping failure.
>
> I don't either, but the patch introduces inconsistency in a case where
> the caller passed sg_nents=0 (which will return 0 and not -errno).
>
> Would it make better sense to have srp treat 0 return as error? note
> that all other ULPs treat return_val != sg_nents as error.

Hello Sagi,

Hmm ... why would it be unacceptable to return 0 if sg_nents == 0 ?

Regarding which component to modify if mapping the first page fails:
for almost every kernel function I know a negative return value means
failure and a return value >= 0 means success. Hence my proposal to
change the return value of the ib_map_mr_sg() function if mapping the
first page fails.

How about the patch below ?

Bart.



[PATCH] IB core: Fix ib_sg_to_pages()

Fix the code for detecting gaps. A gap occurs not only if the
second or later scatterlist element is not aligned but also if
any scatterlist element other than the last does not end at a
page boundary. Ensure that this function returns a negative
error code instead of zero if the first set_page() call fails.

Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API")
Reported-by: Christoph Hellwig <h...@lst.de>
---
 drivers/infiniband/core/verbs.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 043a60e..1972c50 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1516,7 +1516,7 @@ EXPORT_SYMBOL(ib_map_mr_sg);
  * @sg_nents:  number of entries in sg
  * @set_page:  driver page assignment function pointer
  *
- * Core service helper for drivers to covert the largest
+ * Core service helper for drivers to convert the largest
  * prefix of given sg list to a page vector. The sg list
  * prefix converted is the prefix that meet the requirements
  * of ib_map_mr_sg.
@@ -1533,7 +1533,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
u64 last_end_dma_addr = 0, last_page_addr = 0;
unsigned int last_page_off = 0;
u64 page_mask = ~((u64)mr->page_size - 1);
-   int i;
+   int i, ret;
 
mr->iova = sg_dma_address([0]);
mr->length = 0;
@@ -1544,11 +1544,10 @@ int ib_sg_to_pages(struct ib_mr *mr,
u64 end_dma_addr = dma_addr + dma_len;
u64 page_addr = dma_addr & page_mask;
 
-   if (i && page_addr != dma_addr) {
+   if (i && (page_addr != dma_addr || last_page_off != 0)) {
if (last_end_dma_addr != dma_addr) {
/* gap */
-   goto done;
-
+   break;
} else if (last_page_off + dma_len <= mr->page_size) {
/* chunk this fragment with the last */
mr->length += dma_len;
@@ -1563,8 +1562,9 @@ int ib_sg_to_pages(struct ib_mr *mr,
}
 
do {
-   if (unlikely(set_page(mr, page_addr)))
-   goto done;
+   ret = set_page(mr, page_addr);
+   if (unlikely(ret < 0))
+   return i ? : ret;
page_addr += mr->page_size;
} while (page_addr < end_dma_addr);
 
@@ -1574,7 +1574,6 @@ int ib_sg_to_pages(struct ib_mr *mr,
last_page_off = end_dma_addr & ~page_mask;
}
 
-done:
return i;
 }
 EXPORT_SYMBOL(ib_sg_to_pages);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/6] IB/srp: Fix srp_map_sg_fr()

2015-12-01 Thread Bart Van Assche
After dma_map_sg() has been called the return value of that function
must be used as the number of elements in the scatterlist instead of
scsi_sg_count().

Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API")
Reported-by: Christoph Hellwig <h...@lst.de>
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: stable <sta...@vger.kernel.org> # v4.4+
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 19 ---
 drivers/infiniband/ulp/srp/ib_srp.h |  5 +
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index fac1423..3db9a65 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1313,7 +1313,7 @@ reset_state:
 }
 
 static int srp_map_finish_fr(struct srp_map_state *state,
-struct srp_rdma_ch *ch)
+struct srp_rdma_ch *ch, int sg_nents)
 {
struct srp_target_port *target = ch->target;
struct srp_device *dev = target->srp_host->srp_dev;
@@ -1328,10 +1328,10 @@ static int srp_map_finish_fr(struct srp_map_state 
*state,
 
WARN_ON_ONCE(!dev->use_fast_reg);
 
-   if (state->sg_nents == 0)
+   if (sg_nents == 0)
return 0;
 
-   if (state->sg_nents == 1 && target->global_mr) {
+   if (sg_nents == 1 && target->global_mr) {
srp_map_desc(state, sg_dma_address(state->sg),
 sg_dma_len(state->sg),
 target->global_mr->rkey);
@@ -1345,8 +1345,7 @@ static int srp_map_finish_fr(struct srp_map_state *state,
rkey = ib_inc_rkey(desc->mr->rkey);
ib_update_fast_reg_key(desc->mr, rkey);
 
-   n = ib_map_mr_sg(desc->mr, state->sg, state->sg_nents,
-dev->mr_page_size);
+   n = ib_map_mr_sg(desc->mr, state->sg, sg_nents, dev->mr_page_size);
if (unlikely(n < 0))
return n;
 
@@ -1452,16 +1451,15 @@ static int srp_map_sg_fr(struct srp_map_state *state, 
struct srp_rdma_ch *ch,
state->fr.next = req->fr_list;
state->fr.end = req->fr_list + ch->target->cmd_sg_cnt;
state->sg = scat;
-   state->sg_nents = scsi_sg_count(req->scmnd);
 
-   while (state->sg_nents) {
+   while (count) {
int i, n;
 
-   n = srp_map_finish_fr(state, ch);
+   n = srp_map_finish_fr(state, ch, count);
if (unlikely(n < 0))
return n;
 
-   state->sg_nents -= n;
+   count -= n;
for (i = 0; i < n; i++)
state->sg = sg_next(state->sg);
}
@@ -1521,13 +1519,12 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct 
srp_request *req,
 
if (dev->use_fast_reg) {
state.sg = idb_sg;
-   state.sg_nents = 1;
sg_set_buf(idb_sg, req->indirect_desc, idb_len);
idb_sg->dma_address = req->indirect_dma_addr; /* hack! */
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
idb_sg->dma_length = idb_sg->length;  /* hack^2 */
 #endif
-   ret = srp_map_finish_fr(, ch);
+   ret = srp_map_finish_fr(, ch, 1);
if (ret < 0)
return ret;
} else if (dev->use_fmr) {
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h 
b/drivers/infiniband/ulp/srp/ib_srp.h
index 87a2a91..f6af531 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -300,10 +300,7 @@ struct srp_map_state {
dma_addr_t  base_dma_addr;
u32 dma_len;
u32 total_len;
-   union {
-   unsigned intnpages;
-   int sg_nents;
-   };
+   unsigned intnpages;
unsigned intnmdesc;
unsigned intndesc;
 };
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/6] SRP initiator related bug fixes

2015-12-01 Thread Bart Van Assche
This patch series contains six bug fixes either for the SRP initiator 
itself or for IB core functionality used by the SRP initiator. The order 
of these patches matches the order in which the corresponding bugs have 
been introduced. The patches in this series are:


0001-IB-srp-Fix-a-memory-leak.patch
0002-IB-srp-Fix-possible-send-queue-overflow.patch
0003-IB-srp-Initialize-dma_length-in-srp_map_idb.patch
0004-IB-srp-Fix-indirect-data-buffer-rkey-endianness.patch
0005-IB-core-Fix-ib_sg_to_pages.patch
0006-IB-srp-Fix-srp_map_sg_fr.patch
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/6] IB core: Fix ib_sg_to_pages()

2015-12-01 Thread Bart Van Assche
Fix the code for detecting gaps and disable the code for chunking.
A gap occurs not only if the second or later scatterlist element
is not aligned but also if any scatterlist element other than the
last does not end at a page boundary. Disable the code for chunking.
Ensure that this function returns a negative error code instead of
zero if the first set_page() call fails.

Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API")
Reported-by: Christoph Hellwig <h...@lst.de>
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: stable <sta...@vger.kernel.org> # v4.4+
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
---
 drivers/infiniband/core/verbs.c | 31 +++
 1 file changed, 7 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 043a60e..95836c6 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1530,10 +1530,9 @@ int ib_sg_to_pages(struct ib_mr *mr,
   int (*set_page)(struct ib_mr *, u64))
 {
struct scatterlist *sg;
-   u64 last_end_dma_addr = 0, last_page_addr = 0;
unsigned int last_page_off = 0;
u64 page_mask = ~((u64)mr->page_size - 1);
-   int i;
+   int i, ret;
 
mr->iova = sg_dma_address([0]);
mr->length = 0;
@@ -1544,37 +1543,21 @@ int ib_sg_to_pages(struct ib_mr *mr,
u64 end_dma_addr = dma_addr + dma_len;
u64 page_addr = dma_addr & page_mask;
 
-   if (i && page_addr != dma_addr) {
-   if (last_end_dma_addr != dma_addr) {
-   /* gap */
-   goto done;
-
-   } else if (last_page_off + dma_len <= mr->page_size) {
-   /* chunk this fragment with the last */
-   mr->length += dma_len;
-   last_end_dma_addr += dma_len;
-   last_page_off += dma_len;
-   continue;
-   } else {
-   /* map starting from the next page */
-   page_addr = last_page_addr + mr->page_size;
-   dma_len -= mr->page_size - last_page_off;
-   }
-   }
+   /* gap */
+   if (i && (last_page_off || page_addr != dma_addr))
+   break;
 
do {
-   if (unlikely(set_page(mr, page_addr)))
-   goto done;
+   ret = set_page(mr, page_addr);
+   if (unlikely(ret < 0))
+   return i ? i : ret;
page_addr += mr->page_size;
} while (page_addr < end_dma_addr);
 
mr->length += dma_len;
-   last_end_dma_addr = end_dma_addr;
-   last_page_addr = end_dma_addr & page_mask;
last_page_off = end_dma_addr & ~page_mask;
}
 
-done:
return i;
 }
 EXPORT_SYMBOL(ib_sg_to_pages);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/6] IB/srp: Initialize dma_length in srp_map_idb

2015-12-01 Thread Bart Van Assche
From: Christoph Hellwig 

Without this sg_dma_len will return 0 on architectures tha have
the dma_length field.

Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API")
Signed-off-by: Christoph Hellwig 
---
 drivers/infiniband/ulp/srp/ib_srp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index c7a95d2..72fac20 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1524,6 +1524,9 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct 
srp_request *req,
state.sg_nents = 1;
sg_set_buf(idb_sg, req->indirect_desc, idb_len);
idb_sg->dma_address = req->indirect_dma_addr; /* hack! */
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   idb_sg->dma_length = idb_sg->length;  /* hack^2 */
+#endif
ret = srp_map_finish_fr(, ch);
if (ret < 0)
return ret;
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/6] IB/srp: Fix indirect data buffer rkey endianness

2015-12-01 Thread Bart Van Assche
Detected by sparse.

Fixes: commit 330179f2fa93 ("IB/srp: Register the indirect data buffer 
descriptor")
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: stable <sta...@vger.kernel.org> # v4.3+
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 72fac20..fac1423 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1662,7 +1662,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct 
srp_rdma_ch *ch,
return ret;
req->nmdesc++;
} else {
-   idb_rkey = target->global_mr->rkey;
+   idb_rkey = cpu_to_be32(target->global_mr->rkey);
}
 
indirect_hdr->table_desc.va = cpu_to_be64(req->indirect_dma_addr);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6] IB/srp: Fix a memory leak

2015-12-01 Thread Bart Van Assche
If srp_connect_ch() returns a positive value then that is considered
by its caller as a connection failure but this does not result in a
scsi_host_put() call and additionally causes the srp_create_target()
function to return a positive value while it should return a negative
value. Avoid all this confusion and additionally fix a memory leak by
ensuring that srp_connect_ch() always returns a value that is <= 0.
This patch avoids that a rejected login triggers the following memory
leak:

unreferenced object 0x88021b24a220 (size 8):
  comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s)
  hex dump (first 8 bytes):
68 6f 73 74 35 38 00 a5  host58..
  backtrace:
[] kmemleak_alloc+0x7a/0xc0
[] __kmalloc_track_caller+0xfe/0x160
[] kvasprintf+0x5b/0x90
[] kvasprintf_const+0x8d/0xb0
[] kobject_set_name_vargs+0x3c/0xa0
[] dev_set_name+0x3c/0x40
[] scsi_host_alloc+0x327/0x4b0
[] srp_create_target+0x4e/0x8a0 [ib_srp]
[] dev_attr_store+0x1b/0x20
[] sysfs_kf_write+0x4a/0x60
[] kernfs_fop_write+0x14e/0x180
[] __vfs_write+0x2f/0xf0
[] vfs_write+0xa4/0x100
[] SyS_write+0x54/0xc0
[] entry_SYSCALL_64_fastpath+0x12/0x6f

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Cc: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
Cc: stable <sta...@vger.kernel.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 9909022..a074dad 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -994,16 +994,16 @@ static int srp_connect_ch(struct srp_rdma_ch *ch, bool 
multich)
 
ret = srp_lookup_path(ch);
if (ret)
-   return ret;
+   goto out;
 
while (1) {
init_completion(>done);
ret = srp_send_req(ch, multich);
if (ret)
-   return ret;
+   goto out;
ret = wait_for_completion_interruptible(>done);
if (ret < 0)
-   return ret;
+   goto out;
 
/*
 * The CM event handling code will set status to
@@ -1011,15 +1011,16 @@ static int srp_connect_ch(struct srp_rdma_ch *ch, bool 
multich)
 * back, or SRP_DLID_REDIRECT if we get a lid/qp
 * redirect REJ back.
 */
-   switch (ch->status) {
+   ret = ch->status;
+   switch (ret) {
case 0:
ch->connected = true;
-   return 0;
+   goto out;
 
case SRP_PORT_REDIRECT:
ret = srp_lookup_path(ch);
if (ret)
-   return ret;
+   goto out;
break;
 
case SRP_DLID_REDIRECT:
@@ -1028,13 +1029,16 @@ static int srp_connect_ch(struct srp_rdma_ch *ch, bool 
multich)
case SRP_STALE_CONN:
shost_printk(KERN_ERR, target->scsi_host, PFX
 "giving up on stale connection\n");
-   ch->status = -ECONNRESET;
-   return ch->status;
+   ret = -ECONNRESET;
+   goto out;
 
default:
-   return ch->status;
+   goto out;
}
}
+
+out:
+   return ret <= 0 ? ret : -ENODEV;
 }
 
 static int srp_inv_rkey(struct srp_rdma_ch *ch, u32 rkey)
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/6] IB/srp: Fix possible send queue overflow

2015-12-01 Thread Bart Van Assche
From: Sagi Grimberg 

When using work request based memory registration (fast_reg)
we must reserve SQ entries for registration and invalidation
in addition to send operations. Each IO consumes 3 SQ entries
(registration, send, invalidation) so we need to allocate 3x
larger send-queue instead of 2x.

Signed-off-by: Sagi Grimberg 
CC: Stable 
---
 drivers/infiniband/ulp/srp/ib_srp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index a074dad..c7a95d2 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -488,7 +488,7 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
struct ib_qp *qp;
struct ib_fmr_pool *fmr_pool = NULL;
struct srp_fr_pool *fr_pool = NULL;
-   const int m = 1 + dev->use_fast_reg;
+   const int m = dev->use_fast_reg ? 3 : 1;
struct ib_cq_init_attr cq_attr = {};
int ret;
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/6] IB/srp: Fix srp_map_sg_fr()

2015-12-01 Thread Bart Van Assche

On 12/01/2015 10:35 AM, Sagi Grimberg wrote:

After dma_map_sg() has been called the return value of that function
must be used as the number of elements in the scatterlist instead of
scsi_sg_count().


Umm, but ib_map_mr_sg iterates on the sg list. Say you have sg_nents=3
and after mapping you got dma_nents=2 (2 entries are contig) and you
pass that, ib_sg_to_pages will only iterate on 2 elements won't it? am
I missing something?


Hello Sagi,

From https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt:


With scatterlists, you map a region gathered from several regions by:

int i, count = dma_map_sg(dev, sglist, nents, direction);
struct scatterlist *sg;

for_each_sg(sglist, sg, count, i) {
hw_address[i] = sg_dma_address(sg);
hw_len[i] = sg_dma_len(sg);
}

where nents is the number of entries in the sglist.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] IB core: Fix ib_sg_to_pages()

2015-12-01 Thread Bart Van Assche

On 12/01/2015 10:32 AM, Sagi Grimberg wrote:

Fix the code for detecting gaps and disable the code for chunking.
A gap occurs not only if the second or later scatterlist element
is not aligned but also if any scatterlist element other than the
last does not end at a page boundary. Disable the code for chunking.


So can you explain what went wrong with the code? If ib_sg_to_pages()
supports chunking, then sg elements are allowed not to end at a page
boundary if it is physically contig to the next sg and then the next
is chunked. Care to explain how did ib_sg_to_pages mess up?


With the upstream implementation, if the previous element ends at a page 
boundary (last_end_dma_addr == dma_addr) but the current element does 
not start at a page boundary (page_addr != dma_addr) then the current 
element is mapped. I think this is wrong because this condition 
indicates a gap and hence that ib_sg_to_pages() should stop iterating in 
this case.



Ensure that this function returns a negative error code instead of
zero if the first set_page() call fails.


Umm, my recollection was to make ib_map_mr_sg return the largest prefix
mapped. I don't mind a negative error in this case, but isn't zero an
implicit error (given you didn't want to map 0 sg elements)?

If we do agree on this we need to change ib_map_mr_sg documentation
accordingly.


How ib_sg_to_pages() reports to its caller that mapping the first 
scatterlist element failed is not important to me. I included that 
change in this patch because I noticed that in the upstream SRP 
initiator does not consider ib_map_mr_sg() returning zero as an error. I 
think either ib_sg_to_pages() or ib_map_mr_sg() should be modified such 
that "no pages have been mapped" is handled as a mapping failure.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/6] IB/srp: Fix indirect data buffer rkey endianness

2015-12-01 Thread Bart Van Assche

On 12/01/2015 10:37 AM, Sagi Grimberg wrote:

On 01/12/2015 20:18, Bart Van Assche wrote:

Detected by sparse.

Fixes: commit 330179f2fa93 ("IB/srp: Register the indirect data buffer
descriptor")
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: stable <sta...@vger.kernel.org> # v4.3+
Cc: Sagi Grimberg <sa...@mellanox.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: Sebastian Parschauer <sebastian.rie...@profitbricks.com>
---
  drivers/infiniband/ulp/srp/ib_srp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index 72fac20..fac1423 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1662,7 +1662,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd,
struct srp_rdma_ch *ch,
  return ret;
  req->nmdesc++;
  } else {
-idb_rkey = target->global_mr->rkey;
+idb_rkey = cpu_to_be32(target->global_mr->rkey);
  }


Wouldn't it make more sense to define idb_rkey to be u32 and change
endianness when assigning it to the indirect desc?


Hello Sagi,

That's possible, but that would cause the endianness of the indirect 
data buffer rkey to be changed three times - a first time in 
srp_map_desc(), a second time in srp_map_idb() and a third time in 
srp_map_data(). Hence the choice to fix the IDB rkey via the above patch.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] Fix compiler warnings about set-but-not-used variables

2015-11-29 Thread Bart Van Assche
Recent gcc versions emit a warning if a variable is set but not used.
Suppress these warnings by removing dummy assignments to unused arguments.
Additionally, enable the compiler command-line option -Wno-unused-parameter
and remove the '(void)arg' statements that thereby became superfluous.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
---
 cmdif/tools_cif.c|   1 -
 cmdparser/my_getopt.c|   3 -
 common/compatibility.h   |   7 +
 flint/Makefile.am|   2 +-
 flint/subcommands.cpp|   5 -
 mflash/Makefile.am   |   2 +-
 mflash/mflash.c  |  15 +-
 mlxconfig/Makefile.am|   2 +-
 mlxconfig/mlxcfg_param_lib.cpp   |  19 -
 mlxconfig/mlxcfg_param_lib.h |  10 +-
 mlxconfig/mlxcfg_ui.cpp  |   1 -
 mlxfwops/lib/Makefile.am |   2 +-
 mlxfwops/lib/flint_io.cpp|   5 +-
 mlxfwops/lib/fs2_ops.cpp |  31 --
 mlxfwops/lib/fs3_ops.cpp |  17 +-
 mlxfwops/lib/fw_ops.cpp  |   4 -
 mtcr_ul/Makefile.am  |   2 +-
 mtcr_ul/mtcr_ib_ofed.c   |   1 -
 mtcr_ul/mtcr_ul.c|  22 +-
 mtcr_ul/mtcr_ul_icmd_cif.c   |   1 -
 small_utils/mcra.c   |   1 -
 tools_layouts/cibfw_layouts.c| 249 --
 tools_layouts/cx4fw_layouts.c|  95 
 tools_layouts/register_access_open_layouts.c |  49 --
 tools_layouts/register_access_sib_layouts.c  |  97 
 tools_layouts/tools_open_layouts.c   | 685 ---
 26 files changed, 25 insertions(+), 1303 deletions(-)

diff --git a/cmdif/tools_cif.c b/cmdif/tools_cif.c
index c6afc68..d767ada 100644
--- a/cmdif/tools_cif.c
+++ b/cmdif/tools_cif.c
@@ -132,7 +132,6 @@ const char* tcif_err2str(MError rc) {
 MError tcif_cr_mbox_supported(mfile* dev)
 {
 #if defined(__FreeBSD__) || defined(UEFI_BUILD)
-(void)dev;
 return ME_NOT_IMPLEMENTED;
 #else
 return tools_cmdif_is_cr_mbox_supported(dev);
diff --git a/cmdparser/my_getopt.c b/cmdparser/my_getopt.c
index be6281d..6e64b35 100755
--- a/cmdparser/my_getopt.c
+++ b/cmdparser/my_getopt.c
@@ -394,9 +394,6 @@
/* Start processing options with ARGV-element 1 (since ARGV-element 0
   is the program name); the sequence of previously skipped
   non-option ARGV-elements is empty.  */
-   // avoid compiler warnings
-   (void) argc;
-   (void) argv;
first_nonopt = last_nonopt = tools_optind;
 
nextchar = NULL;
diff --git a/common/compatibility.h b/common/compatibility.h
index 8dc85e3..8f3b4d5 100644
--- a/common/compatibility.h
+++ b/common/compatibility.h
@@ -215,6 +215,13 @@
 #endif
 
 /*
+ * Windows (Microsoft C Compiler)
+ */
+#ifdef _MSC_VER
+#pragma warning (disable: 4100) // 'identifier' : unreferenced formal parameter
+#endif
+
+/*
  * Windows (CYGWIN)
  */
 #if defined(__CYGWIN32__)
diff --git a/flint/Makefile.am b/flint/Makefile.am
index 96bb6ad..5ae5ee5 100755
--- a/flint/Makefile.am
+++ b/flint/Makefile.am
@@ -41,7 +41,7 @@ CMDIF_DIR   = $(top_srcdir)/cmdif
 AM_CPPFLAGS = -I$(top_srcdir) -I$(srcdir) -I$(MTCR_DIR) -I$(MFLASH_DIR) 
-I$(COMMON_DIR) \
-I$(LAYOUTS_DIR) -I$(MFT_UTILS_DIR)
 
-mstflint_CXXFLAGS = -Wall -W -g -MP -MD -pipe -DEXTERNAL
+mstflint_CXXFLAGS = -Wall -Wextra -Wno-unused-parameter -g -MP -MD -pipe 
-DEXTERNAL
 bin_PROGRAMS = mstflint
 
 mstflint_SOURCES = flint.cpp flint.h subcommands.cpp subcommands.h\
diff --git a/flint/subcommands.cpp b/flint/subcommands.cpp
index 91400a5..5b60334 100644
--- a/flint/subcommands.cpp
+++ b/flint/subcommands.cpp
@@ -981,10 +981,6 @@ bool SubCommand::unzipDataFile (std::vector 
data, std::vector& data, con
 
 bool SubCommand::checkGuidsFlags (chip_type_t ct, u_int16_t devType, u_int8_t 
fwType,
 bool guidsSpecified, bool macsSpecified, bool 
uidsSpecified, bool ibDev, bool ethDev) {
-(void)ibDev;
 if (guidsSpecified || macsSpecified || uidsSpecified) {
 if (ct == CT_BRIDGEX) {
 if (macsSpecified || guidsSpecified) {
diff --git a/mflash/Makefile.am b/mflash/Makefile.am
index c45351a..12da860 100644
--- a/mflash/Makefile.am
+++ b/mflash/Makefile.am
@@ -34,7 +34,7 @@
 AM_CPPFLAGS= -I. -I$(top_srcdir)/include/mtcr_ul -I$(top_srcdir)/common 
-I$(top_srcdir)/tools_layouts -I$(top_srcdir)/reg_access \
   -I$(top_srcdir)/cmdif -I$(top_srcdir)/tools_res_mgmt
 
-AM_CFLAGS = -MD -pipe -Wall -W -DMST_UL -g ${MFLASH_INBAND_FLAG}
+AM_CFLAGS = -MD -pipe -Wall -Wextra -Wno-unused-parameter -DMST_UL -g 
${MFLASH_INBAND_FLAG}
 
 noinst_LIBRARIES = libmflash.a
 
diff --git a/mflash/mflash.c b/mflash/mflash.c
index c039127..4174f39 100644
--- a/mflash/mflash.c
+++ b/mflash/mflash.c
@@ -10

[PATCH 1/4] Fix several automake warnings

2015-11-29 Thread Bart Van Assche
Avoid that automake reports the following warning messages:

configure.ac:13: warning: AM_INIT_AUTOMAKE: two- and three-arguments forms are 
deprecated.  For more info, see:
configure.ac:13: 
http://www.gnu.org/software/automake/manual/automake.html#Modernize-AM_005fINIT_005fAUTOMAKE-invocation
small_utils/Makefile.am:50: warning: compiling 'mtserver.c' with per-target 
flags requires 'AM_PROG_CC_C_O' in 'configure.ac'
warning: 'libdev_mgt.a': linking libraries using a non-POSIX archiver requires 
'AM_PROG_AR' in 'configure.ac'
warning: 'INCLUDES' is the old name for 'AM_CPPFLAGS' (or '*_CPPFLAGS')

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
---
 cmdif/Makefile.am| 2 +-
 cmdparser/Makefile.am| 2 --
 configure.ac | 4 +++-
 dev_mgt/Makefile.am  | 2 +-
 flint/Makefile.am| 2 +-
 mflash/Makefile.am   | 2 +-
 mft_utils/Makefile.am| 2 +-
 mlxconfig/Makefile.am| 2 +-
 mlxfwops/lib/Makefile.am | 2 +-
 mstdump/crd_lib/Makefile.am  | 2 +-
 mstdump/crd_main/Makefile.am | 2 +-
 mtcr_ul/Makefile.am  | 2 +-
 reg_access/Makefile.am   | 2 +-
 small_utils/Makefile.am  | 2 +-
 tools_layouts/Makefile.am| 2 +-
 tools_res_mgmt/Makefile.am   | 2 +-
 16 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/cmdif/Makefile.am b/cmdif/Makefile.am
index 78aba47..8c03fe9 100644
--- a/cmdif/Makefile.am
+++ b/cmdif/Makefile.am
@@ -34,7 +34,7 @@
 USER_DIR = $(top_srcdir)
 MTCR_DIR = $(USER_DIR)/include/mtcr_ul
 TOOLS_LAYOUTS_DIR = $(USER_DIR)/tools_layouts
-INCLUDES = -I. -I../common -I../tools_layouts  -I$(MTCR_DIR) -I.. 
-I$(USER_DIR)/mtcr_ul
+AM_CPPFLAGS = -I. -I../common -I../tools_layouts  -I$(MTCR_DIR) -I.. 
-I$(USER_DIR)/mtcr_ul
 
 AM_CFLAGS = -W -Wall -Werror -g -MP -MD $(COMPILER_FPIC) -DCMDIF_EXPORTS
 CMDIF_VERSION = 1
diff --git a/cmdparser/Makefile.am b/cmdparser/Makefile.am
index 9696f71..952a074 100644
--- a/cmdparser/Makefile.am
+++ b/cmdparser/Makefile.am
@@ -32,8 +32,6 @@
 
 # Makefile.am -- Process this file with automake to produce Makefile.in
 
-INCLUDES =
-
 AM_CPPFLAGS = -W  -g -MP -MD -fPIC
 
 noinst_LIBRARIES = libcmdparser.a
diff --git a/configure.ac b/configure.ac
index be0f9d1..04fb61d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -10,11 +10,13 @@ AC_SUBST([VERSION])
 
 AC_CONFIG_AUX_DIR(config)
 AC_CONFIG_SRCDIR([README])
-AM_INIT_AUTOMAKE(mstflint, 4.1.0)
+AM_INIT_AUTOMAKE([-Wall foreign])
 
 dnl Checks for programs
 AC_PROG_CC
+AM_PROG_CC_C_O
 AC_PROG_CXX
+AM_PROG_AR
 AC_PROG_LIBTOOL
 AC_CONFIG_HEADERS( config.h )
 
diff --git a/dev_mgt/Makefile.am b/dev_mgt/Makefile.am
index 703a27a..147f7aa 100644
--- a/dev_mgt/Makefile.am
+++ b/dev_mgt/Makefile.am
@@ -32,7 +32,7 @@
 
 # Makefile.am -- Process this file with automake to produce Makefile.in
 
-INCLUDES = -I$(srcdir) -I$(top_srcdir) -I$(top_srcdir)/common  
-I$(top_srcdir)/include/mtcr_ul
+AM_CPPFLAGS = -I$(srcdir) -I$(top_srcdir) -I$(top_srcdir)/common  
-I$(top_srcdir)/include/mtcr_ul
 
 AM_CFLAGS = -W -Wall -g -MP -MD -Wswitch-enum $(COMPILER_FPIC) -DMTCR_EXPORT
 noinst_LIBRARIES = libdev_mgt.a
diff --git a/flint/Makefile.am b/flint/Makefile.am
index f001515..96bb6ad 100755
--- a/flint/Makefile.am
+++ b/flint/Makefile.am
@@ -38,7 +38,7 @@ LAYOUTS_DIR = $(top_srcdir)/tools_layouts
 MFT_UTILS_DIR = $(top_srcdir)/mft_utils
 CMDIF_DIR   = $(top_srcdir)/cmdif
 
-INCLUDES = -I$(top_srcdir) -I$(srcdir) -I$(MTCR_DIR) -I$(MFLASH_DIR) 
-I$(COMMON_DIR) \
+AM_CPPFLAGS = -I$(top_srcdir) -I$(srcdir) -I$(MTCR_DIR) -I$(MFLASH_DIR) 
-I$(COMMON_DIR) \
-I$(LAYOUTS_DIR) -I$(MFT_UTILS_DIR)
 
 mstflint_CXXFLAGS = -Wall -W -g -MP -MD -pipe -DEXTERNAL
diff --git a/mflash/Makefile.am b/mflash/Makefile.am
index 950cabd..c45351a 100644
--- a/mflash/Makefile.am
+++ b/mflash/Makefile.am
@@ -31,7 +31,7 @@
 #--
 
 # Makefile.am -- Process this file with automake to produce Makefile.in
-INCLUDES= -I. -I$(top_srcdir)/include/mtcr_ul -I$(top_srcdir)/common 
-I$(top_srcdir)/tools_layouts -I$(top_srcdir)/reg_access \
+AM_CPPFLAGS= -I. -I$(top_srcdir)/include/mtcr_ul -I$(top_srcdir)/common 
-I$(top_srcdir)/tools_layouts -I$(top_srcdir)/reg_access \
   -I$(top_srcdir)/cmdif -I$(top_srcdir)/tools_res_mgmt
 
 AM_CFLAGS = -MD -pipe -Wall -W -DMST_UL -g ${MFLASH_INBAND_FLAG}
diff --git a/mft_utils/Makefile.am b/mft_utils/Makefile.am
index ad9ad8e..ebb3bf3 100644
--- a/mft_utils/Makefile.am
+++ b/mft_utils/Makefile.am
@@ -32,7 +32,7 @@
 
 # Makefile.am -- Process this file with automake to produce Makefile.in
 USER_DIR = $(top_srcdir)
-INCLUDES = -I. -I$(USER_DIR)/common
+AM_CPPFLAGS = -I. -I$(USER_DIR)/common
 
 AM_CFLAGS = -MD -pipe -Wall -W -Werror
 
diff --git a/mlxconfig/Makefile.am b/mlxconfig/Makefile.am
index f4ce674..2b2fd3e 100755
--- a/mlxconfig/Makefile.am
+++ b/mlxconfig/Makefile.am
@@ -42,7 +42,7 @@ UTILS_LIB = $(USER_DIR)/mft_utils/libmftutils.a
 CMDIF_DIR = $(USER_DIR)/cmdif
 
 
-INCLUDES = -I. -I$(USER_DIR) -I$(top_srcdir)/i

[PATCH 2/4] Makefile.am: Remove the undefined variable MFT_EXT_LIBS_INC_DIR

2015-11-29 Thread Bart Van Assche
This avoids that the option sequence "-I -I" is passed to the compiler,
a sequence that causes weird error messages to be printed.

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
---
 mlxconfig/Makefile.am| 2 +-
 mlxfwops/lib/Makefile.am | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mlxconfig/Makefile.am b/mlxconfig/Makefile.am
index 2b2fd3e..c9189aa 100755
--- a/mlxconfig/Makefile.am
+++ b/mlxconfig/Makefile.am
@@ -43,7 +43,7 @@ CMDIF_DIR = $(USER_DIR)/cmdif
 
 
 AM_CPPFLAGS = -I. -I$(USER_DIR) -I$(top_srcdir)/include/mtcr_ul -I$(MTCR_DIR) 
-I$(COMMON_DIR) $(WIN64_INC)\
-   -I$(MFT_EXT_LIBS_INC_DIR) -I $(LAYOUTS_DIR) 
-I$(MFT_EXT_LIBS_INC_DIR)/zlib -I $(UTILS_DIR) -I$(DEV_MGT_DIR) -I$(CMDIF_DIR)
+   -I $(LAYOUTS_DIR) -I $(UTILS_DIR) -I$(DEV_MGT_DIR) -I$(CMDIF_DIR)
 
 AM_CXXFLAGS = -Wall -W -g -MP -MD -pipe
 bin_PROGRAMS = mstconfig
diff --git a/mlxfwops/lib/Makefile.am b/mlxfwops/lib/Makefile.am
index 8690345..c04d333 100755
--- a/mlxfwops/lib/Makefile.am
+++ b/mlxfwops/lib/Makefile.am
@@ -41,7 +41,7 @@ UTILS_LIB = $(top_srcdir)/mft_utils
 UEFI_COMMON_DIR = $(top_srcdir)/mlxfwops/uefi_c
 
 AM_CPPFLAGS = -I$(srcdir) -I$(MTCR_INC_DIR) -I$(MFLASH_DIR) 
-I$(top_srcdir)/ext_libs/json -I$(MINIXZ_DIR)\
-   -I$(COMMON_DIR) -I$(MFT_EXT_LIBS_INC_DIR)/zlib -I $(LAYOUTS_DIR) 
-I$(top_srcdir)/common -I$(UTILS_LIB) -I$(UEFI_COMMON_DIR)
+   -I$(COMMON_DIR) -I $(LAYOUTS_DIR) -I$(top_srcdir)/common 
-I$(UTILS_LIB) -I$(UEFI_COMMON_DIR)
 
 MLXFWOPS_VERSION = 1
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] crd_read_line(): Rework code to suppress a compiler warning

2015-11-29 Thread Bart Van Assche
Avoid that gcc prints a warning about not checking the result of fgets().

Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
---
 mstdump/crd_lib/crdump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mstdump/crd_lib/crdump.c b/mstdump/crd_lib/crdump.c
index 451a944..3e3c4bc 100755
--- a/mstdump/crd_lib/crdump.c
+++ b/mstdump/crd_lib/crdump.c
@@ -504,8 +504,8 @@ static int crd_read_line(IN FILE *fd, OUT char *tmp) {
 if (!feof(fd)) {
 int c = fgetc(fd);
 if (c == '#') {
-char* _ptr=fgets (tmp, 300, fd);
-(void)_ptr;//avoid warning
+if (fgets(tmp, 300, fd)) {
+}
 tmp[0] = 0;
 continue;
 }
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] mstflint source code cleanup patches

2015-11-29 Thread Bart Van Assche
These four patches bring the mstflint coding style closer to that of the 
other RDMA user space software. The patches in this series are:


0001-Fix-several-automake-warnings.patch
0002-Makefile.am-Remove-the-undefined-variable-MFT_EXT_LI.patch
0003-crd_read_line-Rework-code-to-suppress-a-compiler-war.patch
0004-Fix-compiler-warnings-about-set-but-not-used-variabl.patch
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-25 Thread Bart Van Assche
On 11/25/2015 11:34 AM, Bart Van Assche wrote:
> On 11/23/2015 05:14 PM, Bart Van Assche wrote:
>> On 11/22/2015 07:31 AM, Christoph Hellwig wrote:
>>> On Sun, Nov 22, 2015 at 05:26:28PM +0200, Sagi Grimberg wrote:
>>>>> No. register_always=Y is already broken in 4.3, but
>>>>> register_always=N is
>>>>> now also broken in 4.4.
>>>>
>>>> OK, I'm confused so please let me understand slowly :)
>>>>
>>>> Your patch "ib_srp: initialize dma_length in srp_map_idb" solves
>>>> the register_always=Y dma_length = 0 WARN_ON() on 4.4-rc, does it solve
>>>> the data integrity errors too?
>>>
>>> No, it doesn't.
>>>
>>>> This code path is specific to srp because all other ULPs guarantee
>>>> no-gaps...
>>>
>>> Yes.  Life would be simpler if we could just set the virt_boundary
>>> on SRP, and Bart has indicated that he's willing to at least looks at
>>> this for the next merge window.
>>
>> Hello Christoph,
>>
>> Tomorrow I will try to reproduce this behavior on my test setup. I
>> prepared a setup with kernel v4.4-rc2 and on which the SRP initiator and
>> target are running on the same server. Tomorrow I will install xfstests
>> and see whether these tests pass fine against an XFS filesystem.
> 
> (replying to my own e-mail)
> 
> I can reproduce this behavior with both LIO and SCST. I have modified
> initiator and target code such that the target appends a data CRC at the
> end of the SRP_RSP IU and such that the initiator checks that CRC. A CRC
> mismatch was reported for the following SG-list by the initiator:
> 
> scsi host10: srp_check_crc: bufflen 1024; resid 0; sg-list len 2; dir
> DMA_TO_DEVICE; CRC mismatch (0x7916620b <> 0xde97b796); sg-list:
> [0] 880407378348 len 512
> [1] 880407378000 len 512
> 
> I will check the memory registration code next.

(again replying to my own e-mail)

The patch below helps somewhat but is not sufficient to make the XFS tests
pass. On Monday I will continue to work on this.

Bart.

[PATCH] IB core: Fix ib_sg_to_pages()

Fix the code for detecting gaps and disable the code for chunking.

Fixes: "IB/core: Introduce new fast registration API" (commit 4c67e2bfc8b7)
Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
---
 drivers/infiniband/core/verbs.c | 23 +++
 1 file changed, 3 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 043a60e..3f9c820 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1530,7 +1530,6 @@ int ib_sg_to_pages(struct ib_mr *mr,
   int (*set_page)(struct ib_mr *, u64))
 {
struct scatterlist *sg;
-   u64 last_end_dma_addr = 0, last_page_addr = 0;
unsigned int last_page_off = 0;
u64 page_mask = ~((u64)mr->page_size - 1);
int i;
@@ -1544,23 +1543,9 @@ int ib_sg_to_pages(struct ib_mr *mr,
u64 end_dma_addr = dma_addr + dma_len;
u64 page_addr = dma_addr & page_mask;
 
-   if (i && page_addr != dma_addr) {
-   if (last_end_dma_addr != dma_addr) {
-   /* gap */
-   goto done;
-
-   } else if (last_page_off + dma_len <= mr->page_size) {
-   /* chunk this fragment with the last */
-   mr->length += dma_len;
-   last_end_dma_addr += dma_len;
-   last_page_off += dma_len;
-   continue;
-   } else {
-   /* map starting from the next page */
-   page_addr = last_page_addr + mr->page_size;
-   dma_len -= mr->page_size - last_page_off;
-   }
-   }
+   /* gap */
+   if (i && (last_page_off || page_addr != dma_addr))
+   goto done;
 
do {
if (unlikely(set_page(mr, page_addr)))
@@ -1569,8 +1554,6 @@ int ib_sg_to_pages(struct ib_mr *mr,
} while (page_addr < end_dma_addr);
 
mr->length += dma_len;
-   last_end_dma_addr = end_dma_addr;
-   last_page_addr = end_dma_addr & page_mask;
last_page_off = end_dma_addr & ~page_mask;
}
 
-- 
2.1.4



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-25 Thread Bart Van Assche

On 11/23/2015 05:14 PM, Bart Van Assche wrote:

On 11/22/2015 07:31 AM, Christoph Hellwig wrote:

On Sun, Nov 22, 2015 at 05:26:28PM +0200, Sagi Grimberg wrote:

No. register_always=Y is already broken in 4.3, but
register_always=N is
now also broken in 4.4.


OK, I'm confused so please let me understand slowly :)

Your patch "ib_srp: initialize dma_length in srp_map_idb" solves
the register_always=Y dma_length = 0 WARN_ON() on 4.4-rc, does it solve
the data integrity errors too?


No, it doesn't.


This code path is specific to srp because all other ULPs guarantee
no-gaps...


Yes.  Life would be simpler if we could just set the virt_boundary
on SRP, and Bart has indicated that he's willing to at least looks at
this for the next merge window.


Hello Christoph,

Tomorrow I will try to reproduce this behavior on my test setup. I
prepared a setup with kernel v4.4-rc2 and on which the SRP initiator and
target are running on the same server. Tomorrow I will install xfstests
and see whether these tests pass fine against an XFS filesystem.


(replying to my own e-mail)

I can reproduce this behavior with both LIO and SCST. I have modified 
initiator and target code such that the target appends a data CRC at the 
end of the SRP_RSP IU and such that the initiator checks that CRC. A CRC 
mismatch was reported for the following SG-list by the initiator:


scsi host10: srp_check_crc: bufflen 1024; resid 0; sg-list len 2; dir 
DMA_TO_DEVICE; CRC mismatch (0x7916620b <> 0xde97b796); sg-list:

[0] 880407378348 len 512
[1] 880407378000 len 512

I will check the memory registration code next.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-23 Thread Bart Van Assche

On 11/23/2015 02:18 PM, Jason Gunthorpe wrote:

On Mon, Nov 23, 2015 at 01:54:05PM -0800, Bart Van Assche wrote:
What I don't see is how SRP handles things when the
sendq fills up, ie the case where __srp_get_tx_iu() == NULL. It looks
like the driver starts to panic and generates printks. I can't tell if
it can survive that, but it doesn't look very good..


Hello Jason,

From srp_cm_rep_handler():

target->scsi_host->can_queue
= min(ch->req_lim - SRP_TSK_MGMT_SQ_SIZE,
  target->scsi_host->can_queue);

In other words, the SCSI core is told to ensure that the number of 
outstanding SCSI commands is one less than the number of elements in the 
ch->free_tx list. And since the SRP initiator serializes task management 
requests it is guaranteed that __srp_get_tx_iu() won't fail due to 
ch->free_tx being empty.


Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-23 Thread Bart Van Assche

On 11/23/2015 01:28 PM, Jason Gunthorpe wrote:

On Mon, Nov 23, 2015 at 01:04:25PM -0800, Bart Van Assche wrote:


Considerable time ago the send queue in the SRP initiator driver was
modified from signaled to non-signaled to reduce the number of interrupts
triggered by the SRP initiator driver. The SRP initiator driver polls the
send queue every time before a SCSI command is sent to the target. I think
this is a pattern that is also useful for other ULP's so I'm not convinced
that ib_process_cq_direct() should be deprecated :-)


As I explained, that is a fine idea, but I can't see how SRP is able
to correctly do sendq flow control without spinning on the poll, which
it does not do.

I'm guessing SRP is trying to drive sendq flow control from the recv
side, like NFS was. This is wrong and should not be part of the common
API.

Does that make sense?


Not really ... Please have a look at the SRP initiator source code. What 
the SRP initiator does is to poll the send queue before sending a new 
SCSI command to the target system starts. I think this approach could 
also be used in other ULP drivers if the send queue poll frequency is 
such that no send queue overflow occurs.


Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix possible send queue overflow

2015-11-23 Thread Bart Van Assche

On 11/22/2015 05:37 AM, Christoph Hellwig wrote:

On Tue, Nov 10, 2015 at 12:35:05PM +0200, Sagi Grimberg wrote:

Are you planning to pick this up? Note that this patch
is stable material as well.


Doug? any plans for this patch?


We should really get this in an into -stable.  Bart, can you resend
the series and if it doesn't get picked up next week send it directly
to Linus?


Hi Christoph,

Are you referring to Sagi's patch only or also to my patch series that 
fixes error handling ?


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-23 Thread Bart Van Assche

On 11/22/2015 07:31 AM, Christoph Hellwig wrote:

On Sun, Nov 22, 2015 at 05:26:28PM +0200, Sagi Grimberg wrote:

No. register_always=Y is already broken in 4.3, but register_always=N is
now also broken in 4.4.


OK, I'm confused so please let me understand slowly :)

Your patch "ib_srp: initialize dma_length in srp_map_idb" solves
the register_always=Y dma_length = 0 WARN_ON() on 4.4-rc, does it solve
the data integrity errors too?


No, it doesn't.


This code path is specific to srp because all other ULPs guarantee
no-gaps...


Yes.  Life would be simpler if we could just set the virt_boundary
on SRP, and Bart has indicated that he's willing to at least looks at
this for the next merge window.


Hello Christoph,

Tomorrow I will try to reproduce this behavior on my test setup. I 
prepared a setup with kernel v4.4-rc2 and on which the SRP initiator and 
target are running on the same server. Tomorrow I will install xfstests 
and see whether these tests pass fine against an XFS filesystem.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-23 Thread Bart Van Assche

On 11/23/2015 12:37 PM, Jason Gunthorpe wrote:

On Sat, Nov 14, 2015 at 08:13:44AM +0100, Christoph Hellwig wrote:

On Fri, Nov 13, 2015 at 03:06:36PM -0700, Jason Gunthorpe wrote:

Looking at that thread and then at the patch a bit more..

+void ib_process_cq_direct(struct ib_cq *cq)
[..]
+   __ib_process_cq(cq, INT_MAX);

INT_MAX is not enough, it needs to loop.
This is missing a ib_req_notify also.


No.  Direct cases _never_ calls ib_req_notify.  Its for the case where
the SRP case polls the send CQ only from the same context it sends for
without any interrupt notification at al.


Hurm, okay, that is not at all what I was thinking this was for..

So the only use of this function is to drain a send cq, in a state
where it is guarenteed no new entries can be added, and only if the cq
is not already event driven. I'd stick those notes in the comment..

Hum. I wonder if this is even a reasonable way to run a ULP. It is
important that rx completions are not used to drive reaping of
resources that are still committed to the send queue. ie do not
trigger send buffer reuse based on a rx completion.

So, if a ULP uses this API, how does it handle the sendq becoming
full? As above, a ULP cannot use recvs to infer available sendq
space. It must directly reap the sendq. So a correct ULP would have to
spin calling ib_process_direct_cq until it makes enough progress to
add more things to the sendq. I don't obviously see that in SRP - so
I'm guessing it has buggered up sendq flow control?

NFS had similar problems lately too, I wrote a long explanation to
Chuck on this subject.

That said, the demand poll almost seems like a reasonable way for a
ULP to run the sendq, do the polls on send occasionally or when more
space is needed to better amortize the reaping overhead at the cost of
send latency. But API wise it needs to be able to switch over to a
sleep if enough progress hasn't been made.

So.. maybe also add to the comment that ib_process_cq_direct is
deprecated and should not be used in new code until SRP gets sorted?


Hello Jason,

Considerable time ago the send queue in the SRP initiator driver was 
modified from signaled to non-signaled to reduce the number of 
interrupts triggered by the SRP initiator driver. The SRP initiator 
driver polls the send queue every time before a SCSI command is sent to 
the target. I think this is a pattern that is also useful for other 
ULP's so I'm not convinced that ib_process_cq_direct() should be 
deprecated :-)


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-22 Thread Bart Van Assche

On 11/22/15 06:57, Sagi Grimberg wrote:

I think that bart wants to allow the caller to select cpu affinity
per CQ. In this case ib_alloc_cq in workqueue mode would need to
accept a affinity_hint from the caller (default to wild-card
WORK_CPU_UNBOUND).


Hmm, true.  How would be set that hint from userspace?  I'd really prefer
to see a practical justification for it first.


In order to assign CPUs from user-space we'd need an ethtool like
interface for isert/srpt/. Given that this is something we don't
want to get into right now, I assumed that Bart meant that srpt
would take a "least used" approach from srpt driver (which isn't better
taking the wild-card option I'd say), So I'll let Bart answer...


Hello Christoph and Sagi,

My intention is indeed to allow to control CPU affinity per CQ. One use 
case is to implement a least-used policy in RDMA drivers that use 
multiple completion queues. Another use case is to make CPU affinity 
configurable from user space through something similar to ethtool or via 
sysfs.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-20 Thread Bart Van Assche
On 11/20/2015 02:16 AM, Christoph Hellwig wrote:
> On Wed, Nov 18, 2015 at 10:20:14AM -0800, Bart Van Assche wrote:
>> Are you perhaps referring to the sysfs CPU mask that allows to control
>> workqueue affinity ?
> 
> I think he is referring to the defintion of WQ_UNBOUND:
> 
>WQ_UNBOUND
> 
>   Work items queued to an unbound wq are served by the special
>   woker-pools which host workers which are not bound to any
>   specific CPU.  This makes the wq behave as a simple execution
>   context provider without concurrency management.  The unbound
>   worker-pools try to start execution of work items as soon as
>   possible.  Unbound wq sacrifices locality but is useful for
>   the following cases.
> 
>   * Wide fluctuation in the concurrency level requirement is
> expected and using bound wq may end up creating large number
> of mostly unused workers across different CPUs as the issuer
> hops through different CPUs.
> 
>   * Long running CPU intensive workloads which can be better
> managed by the system scheduler.
 
Hello Christoph,

The comment about locality in the above quote is interesting. How about
modifying patch 2/9 as indicated below ? The modification below does not
change the behavior of this patch if ib_cq.w.cpu is not modified. And it
allows users who care about locality and who want to skip the scheduler
overhead by setting ib_cq.w.cpu to the index of the CPU they want the
work to be processed on.

Thanks,

Bart.

---
 drivers/infiniband/core/cq.c | 11 ++-
 include/rdma/ib_verbs.h  |  5 -
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index bf2a079..4d80d8c 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -94,18 +94,18 @@ static void ib_cq_completion_softirq(struct ib_cq *cq, void 
*private)
 
 static void ib_cq_poll_work(struct work_struct *work)
 {
-   struct ib_cq *cq = container_of(work, struct ib_cq, work);
+   struct ib_cq *cq = container_of(work, struct ib_cq, w.work);
int completed;
 
completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE);
if (completed >= IB_POLL_BUDGET_WORKQUEUE ||
ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
-   queue_work(ib_comp_wq, >work);
+   queue_work_on(cq->w.cpu, ib_comp_wq, >w.work);
 }
 
 static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
 {
-   queue_work(ib_comp_wq, >work);
+   queue_work_on(cq->w.cpu, ib_comp_wq, >w.work);
 }
 
 /**
@@ -159,7 +159,8 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void 
*private,
break;
case IB_POLL_WORKQUEUE:
cq->comp_handler = ib_cq_completion_workqueue;
-   INIT_WORK(>work, ib_cq_poll_work);
+   INIT_WORK(>w.work, ib_cq_poll_work);
+   cq->w.cpu = WORK_CPU_UNBOUND;
ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
break;
default:
@@ -195,7 +196,7 @@ void ib_free_cq(struct ib_cq *cq)
irq_poll_disable(>iop);
break;
case IB_POLL_WORKQUEUE:
-   flush_work(>work);
+   flush_work(>w.work);
break;
default:
WARN_ON_ONCE(1);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index f59a8d3..b1344f8 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1291,7 +1291,10 @@ struct ib_cq {
struct ib_wc*wc;
union {
struct irq_poll iop;
-   struct work_struct  work;
+   struct {
+   struct work_struct  work;
+   int cpu;
+   } w;
};
 };
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IB/cma: Add a missing rcu_read_unlock()

2015-11-20 Thread Bart Van Assche
Ensure that validate_ipv4_net_dev() calls rcu_read_unlock() if
fib_lookup() fails. Detected by sparse. Compile-tested only.

Fixes: "IB/cma: Validate routing of incoming requests" (commit f887f2ac87c2).
Cc: Haggai Eran 
Cc: stable 
---
 drivers/infiniband/core/cma.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 69e5477..f8dfc63 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1126,10 +1126,7 @@ static bool validate_ipv4_net_dev(struct net_device 
*net_dev,
 
rcu_read_lock();
err = fib_lookup(dev_net(net_dev), , , 0);
-   if (err)
-   return false;
-
-   ret = FIB_RES_DEV(res) == net_dev;
+   ret = err == 0 && FIB_RES_DEV(res) == net_dev;
rcu_read_unlock();
 
return ret;
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/9] IB: add a helper to safely drain a QP

2015-11-17 Thread Bart Van Assche

On 11/15/2015 01:34 AM, Sagi Grimberg wrote:

This is taken from srp, and srp drains using a recv wr due to a race
causing a use-after-free condition in srp which re-posts a recv buffer
in the recv completion handler.


Hello Sagi,

Would it be possible to clarify this ? Does this refer to an existing 
race or a race that would only occur if the code would be modified ?


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] move blk_iopoll to limit and make it generally available

2015-11-17 Thread Bart Van Assche

On 11/13/2015 11:02 PM, Christoph Hellwig wrote:

On Fri, Nov 13, 2015 at 11:19:24AM -0800, Bart Van Assche wrote:

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:

The new name is irq_poll as iopoll is already taken.  Better suggestions
welcome.


Would it be possible to provide more background information about this ?
Which other kernel subsystem is using the name iopoll ?


Take a look at include/linux/iopoll.h  - I can't reaplly make much sense
of it to be honest, but it's used in a quite a few places.


How about renaming blk_iopoll into blk_poll ? That way the name still 
refers to the block layer. And although the current implementation 
performs polling from IRQ context future implementations maybe will 
allow polling from thread context.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] move blk_iopoll to limit and make it generally available

2015-11-17 Thread Bart Van Assche

On 11/17/2015 09:16 AM, Bart Van Assche wrote:

On 11/13/2015 11:02 PM, Christoph Hellwig wrote:

On Fri, Nov 13, 2015 at 11:19:24AM -0800, Bart Van Assche wrote:

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:

The new name is irq_poll as iopoll is already taken.  Better suggestions
welcome.


Would it be possible to provide more background information about this ?
Which other kernel subsystem is using the name iopoll ?


Take a look at include/linux/iopoll.h  - I can't reaplly make much sense
of it to be honest, but it's used in a quite a few places.


How about renaming blk_iopoll into blk_poll ? That way the name still
refers to the block layer. And although the current implementation
performs polling from IRQ context future implementations maybe will
allow polling from thread context.


(replying to my own e-mail)

Please ignore the previous comment - I just noticed that this mechanism 
is not limited to block devices.


Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/9] srpt: use the new CQ API

2015-11-17 Thread Bart Van Assche

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:
> [ ... ]

The previous patch and this patch look like great work to me. However, 
this patch not only reworks the SRP target driver but also prevents 
users to move the SRP completion thread to another CPU core than the CPU 
core that processes the completion interrupts (with the help of e.g. the 
taskset command). Hence my request to make that CPU core configurable in 
the second patch of this patch series.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-17 Thread Bart Van Assche

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:

+ * context and does not ask from completion interrupts from the HCA.

   
Should this perhaps be changed into "for" ?


+ */
+void ib_process_cq_direct(struct ib_cq *cq)
+{
+   WARN_ON_ONCE(cq->poll_ctx != IB_POLL_DIRECT);
+
+   __ib_process_cq(cq, INT_MAX);
+}
+EXPORT_SYMBOL(ib_process_cq_direct);


My proposal is to drop this function and to export __ib_process_cq() 
instead (with or without renaming). That will allow callers of this 
function to compare the poll budget with the number of completions that 
have been processed and use that information to decide whether or not to 
call this function again.



+static void ib_cq_poll_work(struct work_struct *work)
+{
+   struct ib_cq *cq = container_of(work, struct ib_cq, work);
+   int completed;
+
+   completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE);
+   if (completed >= IB_POLL_BUDGET_WORKQUEUE ||
+   ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
+   queue_work(ib_comp_wq, >work);
+}
+
+static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
+{
+   queue_work(ib_comp_wq, >work);
+}


The above code will cause all polling to occur on the context of the CPU 
that received the completion interrupt. This approach is not powerful 
enough. For certain workloads throughput is higher if work completions 
are processed by another CPU core on the same CPU socket. Has it been 
considered to make the CPU core on which work completions are processed 
configurable ?



diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 62b6cba..3027824 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -457,10 +457,11 @@ static struct srp_fr_pool *srp_alloc_fr_pool(struct 
srp_target_port *target)
  static void srp_destroy_qp(struct srp_rdma_ch *ch)
  {
static struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
-   static struct ib_recv_wr wr = { .wr_id = SRP_LAST_WR_ID };
+   static struct ib_recv_wr wr = { 0 };
struct ib_recv_wr *bad_wr;
int ret;


Since the 'wr' structure is static I don't think it needs to be 
zero-initialized explicitly.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/9] srpt: use the new CQ API

2015-11-17 Thread Bart Van Assche

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:

[ ... ]


This patch contains two logical changes:
- Conversion to the new CQ API.
- Removal of the ib_srpt_compl thread.

Had it been considered to implement these changes as two separate patches ?

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/9] srpt: chain RDMA READ/WRITE requests

2015-11-17 Thread Bart Van Assche

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:

-   ret = ib_post_send(ch->qp, , _wr);
-   if (ret)
-   break;
+   if (i == n_rdma - 1) {
+   /* only get completion event for the last rdma read */
+   if (dir == DMA_TO_DEVICE)
+   wr->wr.send_flags = IB_SEND_SIGNALED;
+   wr->wr.next = NULL;
+   } else {
+   wr->wr.next = >rdma_ius[i + 1].wr;
+   }
}

+   ret = ib_post_send(ch->qp, >rdma_ius->wr, _wr);
if (ret)
pr_err("%s[%d]: ib_post_send() returned %d for %d/%d\n",
 __func__, __LINE__, ret, i, n_rdma);


Hello Christoph,

Chaining RDMA requests is a great idea. But it seems to me that this 
patch is based on the assumption that posting multiple RDMA requests 
either succeeds as a whole or fails as a whole. Sorry but I'm not sure 
that the verbs API guarantees this. In the ib_srpt driver a QP can be 
changed at any time into the error state and there might be drivers that 
report an immediate failure in that case. I think even when chaining 
RDMA requests that we still need a mechanism to wait until ongoing RDMA 
transfers have finished if some but not all RDMA requests have been posted.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ib_srp: initialize dma_length in srp_map_idb

2015-11-16 Thread Bart Van Assche

On 11/15/2015 09:59 AM, Christoph Hellwig wrote:

Without this sg_dma_len will return 0 on architectures tha have
the dma_length field.

Signed-off-by: Christoph Hellwig 
---
  drivers/infiniband/ulp/srp/ib_srp.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 32f7962..445c0a6 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1520,6 +1520,9 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct 
srp_request *req,
state.sg_nents = 1;
sg_set_buf(idb_sg, req->indirect_desc, idb_len);
idb_sg->dma_address = req->indirect_dma_addr; /* hack! */
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   idb_sg->dma_length = idb_sg->length;/* hack^2 */
+#endif
ret = srp_map_finish_fr(, ch);
if (ret < 0)
return ret;



Hello Christoph,

How about adding "Cc: stable" to this patch such that it not only will 
be integrated in kernel v4.4 but also in kernel v4.3.1 or later ?


Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ib_srp: initialize dma_length in srp_map_idb

2015-11-16 Thread Bart Van Assche

On 11/16/2015 09:22 AM, Christoph Hellwig wrote:

the code in this area changed enough since 4.3 that it won't easily
apply.  But a backport would still be very useful!


In that case:

Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-15 Thread Bart Van Assche

On 11/15/15 10:06, Christoph Hellwig wrote:

FYI, I sent a patch for the zero S/G length issue.  With this xfstests
does fine for ext4 and btrfs.  With XFS I still run into corruption
warnings for the slab use after free poison pattern.  I suspect that
issue might be related to uniqueue XFS I/O patterns.  One thing that
might be related is that XFS can submit bios backed by kmalloced memory.

I've also tested XFS on various other storage devices on the same kernel
and didn't see an issue like that just to be sure.


Thanks for submitting that patch.

Did I understand correctly that page-aligned I/O works fine but I/O that 
is not aligned on a page boundary not ? Have you already had the time to 
verify whether the "IB/srp: Convert to new registration API" patch is 
the patch that introduced this issue ?


BTW, another location where a buffer is used that is not aligned on a 
page boundary is the scsi_probe_and_add_lun() function. If a crash 
occurs while probing LUNs or lsscsi shows garbled data for SRP LUNs that 
might indicate a problem in the buffer mapping or registration code.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] IB: add a proper completion queue abstraction

2015-11-13 Thread Bart Van Assche

On 11/13/2015 10:25 AM, Jason Gunthorpe wrote:

On Fri, Nov 13, 2015 at 02:46:43PM +0100, Christoph Hellwig wrote:

This adds an abstraction that allows ULP to simply pass a completion
object and completion callback with each submitted WR and let the RDMA
core handle the nitty gritty details of how to handle completion
interrupts and poll the CQ.


This looks pretty nice, I'd really like to look it over carefully
after SC|15..

I know Bart and others have attempted to have switching between event
and polling driven operation, but there were problems resolving the
races. Would be nice to review that conversation.. Do you remember the
details Bart?


Hello Jason,

I think this is the conversation you are referring to: "About a 
shortcoming of the verbs API" 
(http://thread.gmane.org/gmane.linux.drivers.rdma/5028). That 
conversation occurred five years ago, which means that you have an 
excellent memory :-)


I doesn't seem to me like Christoph wanted to support dynamic switching 
between the IB_POLL_DIRECT, IB_POLL_SOFTIRQ and IB_POLL_WORKQUEUE 
polling modes. I think this should have been mentioned in the patch 
description.


The implementation of this patch makes it clear that it is essential 
that all polling is serialized. The WC array that is used for polling is 
embedded in the CQ and is not protected against concurrent access. This 
means that it is essential that _ib_process_cq() calls are serialized. I 
need some more time to verify whether such serialization is always 
guaranteed by this patch.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] move blk_iopoll to limit and make it generally available

2015-11-13 Thread Bart Van Assche

On 11/13/2015 05:46 AM, Christoph Hellwig wrote:

The new name is irq_poll as iopoll is already taken.  Better suggestions
welcome.


Hello Christoph,

Would it be possible to provide more background information about this ? 
Which other kernel subsystem is using the name iopoll ? I think the name 
blk-iopoll was chosen six years ago for this subsystem (see also 
https://lwn.net/Articles/346187/). If the conflicting subsystem is 
newer, how about renaming the other polling mechanism ?


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-12 Thread Bart Van Assche

On 11/12/2015 09:59 AM, Christoph Hellwig wrote:

[  108.998574] WARNING: CPU: 0 PID: 1258 at kernel/sched/core.c:7389 
__might_sleep+0xa7/0xb0()
[  108.998580] do not call blocking ops when !TASK_RUNNING; state=1 set


Although this is most likely unrelated to the issue reported at the 
start of this thread, I have started working on a patch for this warning.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/mad: In validate_mad, validate CM method and attribute

2015-11-12 Thread Bart Van Assche

On 11/12/2015 03:48 AM, Hal Rosenstock wrote:

A new SRP target has been observed to respond to Send CM REQ
with GetResp of CM REQ with bad status. This is non conformant
with IBA spec but exposes a vulnerability in the current MAD/CM
code which will respond to the incoming GetResp of CM REQ as if
it was a valid incoming Send of CM REQ rather than tossing
this on the floor. It also causes the MAD layer not to
retransmit the original REQ even though it has not received a REP.


Hello Hal,

With which SRP target has this behavior been observed ? Has this patch 
been tested with the LIO SRP target ?


Thanks,

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-11 Thread Bart Van Assche

On 11/10/2015 09:15 AM, Christoph Hellwig wrote:

This is a simply xfstests run using XFS on a remote LIO ramdisk.


Hello Christoph,

Which version of the kernel and LIO were installed at the target side ?

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-11 Thread Bart Van Assche

On 11/11/2015 07:46 AM, Christoph Hellwig wrote:

On Wed, Nov 11, 2015 at 07:35:47AM -0800, Bart Van Assche wrote:

On 11/10/2015 09:15 AM, Christoph Hellwig wrote:

This is a simply xfstests run using XFS on a remote LIO ramdisk.


Hello Christoph,

Which version of the kernel and LIO were installed at the target side ?


I've tried a couple different one from 4.1-rc3 to latest Linus tree
from yesterday, and the target side doesn't matter.

I've also bisected things down on the initiator side and the changes to
enable prefer_fr and register_always seem to be the culprit at least as
far as 4.3 is concerned.  If I disable both 4.3 is working fine, but
enabling either one causes frequent map failures.  Just rebuilding
a new Linus current tree to see if the options help there as well.


Hello Christoph,

The SRP initiator from kernel 4.3 is working fine on my test setup. I 
will start a test with Linus' tree and with the following SRP kernel 
module parameters:


# cat /etc/modprobe.d/ib_srp.conf
options ib_srp cmd_sg_entries=255 register_always=1 prefer_fr=1

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: srp state in current mainline

2015-11-11 Thread Bart Van Assche

On 11/10/2015 09:15 AM, Christoph Hellwig wrote:

scsi host3: ib_srp: failed receive status WR flushed (5) for iu 880313f4ca40


Can you also post the logs from the target system from around the time 
this message was logged on the initiator system ? Usually this message 
means that the target system closed a QP. I'm looking for messages 
generated by the following statement in ib_srpt.c:


pr_info("RDMA t %d for idx %u failed with status %d\n", opcode,
index, wc->status);

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH ib-next 2/3] IB/core: IB/core: Allow legacy verbs through extended interfaces

2015-11-10 Thread Bart Van Assche
On 11/10/2015 07:40 AM, Eli Cohen wrote:
> On Tue, Nov 10, 2015 at 05:23:10PM +0200, Eli Cohen wrote:
>>>
>>> Call it ENOTSUP then:
>>>
>>> ENOTSUP Operation not supported (POSIX.1)
>>>
>>> Same value on Linux.
>>>
>>
>> Yes, that's better. I will resend.
> 
> 
> Well it seems like ENOTSUP is defined only here:
> 
> ./arch/parisc/include/uapi/asm/errno.h:115:#define ENOTSUP 252 /* 
> Function not implemented (POSIX.4 / HPUX) */
> 
> Which obviously I cannot use. Jason, do you have another idea?

How about using ENOTSUPP ?

$ PAGER= git grep 'define ENOTSUPP' include
include/linux/errno.h:#define ENOTSUPP 524 /* Operation is not supported */

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] IB/srp: Fix a potential queue overflow in an error path

2015-11-04 Thread Bart Van Assche

On 11/03/2015 08:03 PM, Christoph Hellwig wrote:

On Tue, Nov 03, 2015 at 12:50:59PM -0800, Bart Van Assche wrote:

Such a check wouldn't be that simple because the only way to perform such a
check is either by doubling the number of ib_map_mr_sg() calls or by
performing additional memory allocations.


The other option woud be to disallow gappy SG lists unless supported by
the hardware with a single MR similar to iSER.  While this would leave
the SRP protocol support for multiple SG entries per command unused it
would significantly simplify the driver.


Hello Christoph,

I will have a look into this after the kernel 4.4 merge window has been 
closed.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] IB/srp: Fix a potential queue overflow in an error path

2015-11-03 Thread Bart Van Assche

On 11/03/2015 09:36 AM, Sagi Grimberg wrote:

On 28/10/2015 00:02, Bart Van Assche wrote:

Wait until memory registration has finished in the srp_queuecommand()
error path before invalidating memory regions to avoid a send queue
overflow.


This looks backwards to me... Why do we even post anything on our
queue-pair to begin with if we got an unsupported sg list?

Can't we perform a simple sanity check on the sg list instead?


Hello Sagi,

There is one memory descriptor pool per RDMA channel and RDMA channels 
are typically used by more than one CPU. This means that memory 
registration for more than one command can happen concurrently from a 
single memory descriptor pool. This is why checking how many memory 
descriptors are left before memory registration occurs wouldn't be 
sufficient.


Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   4   5   6   7   8   9   10   >