[PATCH for-3.2 v2 3/5] ib/mlx4: Add extended port capabilities support

2011-10-24 Thread Marcel Apfelbaum
An Extended Port Info packet is sent to each hw port during HCA init,
and if it returns without error the port is assumed to support
extended port capabilities.

Signed-off-by: Marcel Apfelbaum marc...@dev.mellanox.co.il
Reviewed-by: Jack Morgenstein ja...@dev.mellanox.com
---

Changes from v1
- method mlx4_get_ext_port_caps changed to mlx4_check_ext_port_caps
- extended port capabilities are assigned inside mlx4_check_ext_port_caps
rather than in mlx4_setup_hca
- ext_port_cap field is u8 instead of u64
- minor warning notification change in mlx4_setup_hca

 drivers/net/mlx4/main.c |7 +++
 drivers/net/mlx4/mlx4.h |1 +
 drivers/net/mlx4/port.c |   42 ++
 include/linux/mlx4/device.h |7 +++
 4 files changed, 57 insertions(+)

--- a/drivers/net/mlx4/main.c
+++ b/drivers/net/mlx4/main.c
@@ -998,6 +998,13 @@ static int mlx4_setup_hca(struct mlx4_de
 				  "ib capabilities (%d). Continuing with "
 				  "caps = 0\n", port, err);
 		dev->caps.ib_port_def_cap[port] = ib_port_default_caps;
+
+		err = mlx4_check_ext_port_caps(dev, port);
+		if (err)
+			mlx4_warn(dev, "failed to get port %d extended "
+				  "port capabilities support info (%d)."
+				  " Assuming not supported\n", port, err);
+
 		err = mlx4_SET_PORT(dev, port);
 		if (err) {
 			mlx4_err(dev, "Failed to set port %d, aborting\n",
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -450,6 +450,7 @@ void mlx4_init_vlan_table(struct mlx4_de
 
 int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port);
 int mlx4_get_port_ib_caps(struct mlx4_dev *dev, u8 port, __be32 *caps);
+int mlx4_check_ext_port_caps(struct mlx4_dev *dev, u8 port);
 
 int mlx4_qp_detach_common(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
  enum mlx4_protocol prot, enum mlx4_steer_type steer);
--- a/drivers/net/mlx4/port.c
+++ b/drivers/net/mlx4/port.c
@@ -464,6 +464,48 @@ int mlx4_get_port_ib_caps(struct mlx4_de
return err;
 }
 
+int mlx4_check_ext_port_caps(struct mlx4_dev *dev, u8 port)
+{
+   struct mlx4_cmd_mailbox *inmailbox, *outmailbox;
+   u8 *inbuf, *outbuf;
+   int err, packet_error;
+
+   inmailbox = mlx4_alloc_cmd_mailbox(dev);
+   if (IS_ERR(inmailbox))
+   return PTR_ERR(inmailbox);
+
+   outmailbox = mlx4_alloc_cmd_mailbox(dev);
+   if (IS_ERR(outmailbox)) {
+   mlx4_free_cmd_mailbox(dev, inmailbox);
+   return PTR_ERR(outmailbox);
+   }
+
+	inbuf = inmailbox->buf;
+	outbuf = outmailbox->buf;
+   memset(inbuf, 0, 256);
+   memset(outbuf, 0, 256);
+   inbuf[0] = 1;
+   inbuf[1] = 1;
+   inbuf[2] = 1;
+   inbuf[3] = 1;
+
+	*(__be16 *) (&inbuf[16]) = MLX4_ATTR_EXTENDED_PORT_INFO;
+	*(__be32 *) (&inbuf[20]) = cpu_to_be32(port);
+
+	err = mlx4_cmd_box(dev, inmailbox->dma, outmailbox->dma, port, 3,
+			   MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C);
+
+   packet_error = be16_to_cpu(*(__be16 *) (outbuf + 4));
+
+	dev->caps.ext_port_cap[port] = (!err && !packet_error) ?
+				       MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO
+				       : 0;
+
+   mlx4_free_cmd_mailbox(dev, inmailbox);
+   mlx4_free_cmd_mailbox(dev, outmailbox);
+   return err;
+}
+
 int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port)
 {
struct mlx4_cmd_mailbox *mailbox;
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -82,6 +82,12 @@ enum {
	MLX4_DEV_CAP_FLAG_COUNTERS	= 1LL << 48
 };
 
+#define MLX4_ATTR_EXTENDED_PORT_INFO   cpu_to_be16(0xff90)
+
+enum {
+	MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO	= 1 << 0
+};
+
 enum {
	MLX4_BMME_FLAG_LOCAL_INV	= 1 << 6,
	MLX4_BMME_FLAG_REMOTE_INV	= 1 << 7,
@@ -276,6 +282,7 @@ struct mlx4_caps {
u32 port_mask;
enum mlx4_port_type possible_type[MLX4_MAX_PORTS + 1];
u32 max_counters;
+   u8  ext_port_cap[MLX4_MAX_PORTS + 1];
 };
 
 struct mlx4_buf_list {


[PATCH V2 05/10] RDMA/cxgb4: Add DB Overflow Avoidance.

2011-10-24 Thread Vipul Pandya
- get FULL/EMPTY/DROP events from LLD

- on FULL event, disable normal user mode DB rings.

- add modify_qp semantics to allow user processes to call into
the kernel to ring doorbells without overflowing.

Add DB Full/Empty/Drop stats.

Mark queues when created indicating the doorbell state.

If we're in the middle of db overflow avoidance, then newly created
queues should start out in this mode.
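
As an illustration of that last point, a minimal sketch (not part of the
patch; the create-path plumbing is assumed, while db_state, NORMAL and
t4_disable_wq_db() follow the code in this series) of how a newly created
QP can inherit the current doorbell state:

/*
 * Sketch only: if the device is already in flow-control mode when a new
 * QP is created, disable its user-mode doorbell up front so it starts
 * out in the same state as the queues stopped by stop_queues().
 */
static void c4iw_init_qp_db_state(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
{
	unsigned long flags;

	spin_lock_irqsave(&rhp->lock, flags);
	if (rhp->db_state != NORMAL)
		t4_disable_wq_db(&qhp->wq);
	spin_unlock_irqrestore(&rhp->lock, flags);
}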

Bump the C4IW_UVERBS_ABI_VERSION to 2 so the user mode library can
know if the driver supports the kernel mode db ringing.

Signed-off-by: Vipul Pandya vi...@chelsio.com
Signed-off-by: Steve Wise sw...@opengridcomputing.com
---
V2: Bump C4IW_UVERBS_ABI_VERSION to 2

 drivers/infiniband/hw/cxgb4/device.c   |   84 +--
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h |   37 --
 drivers/infiniband/hw/cxgb4/qp.c   |   51 +++-
 drivers/infiniband/hw/cxgb4/user.h |2 +-
 4 files changed, 162 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/device.c 
b/drivers/infiniband/hw/cxgb4/device.c
index 8483111..9062ed9 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -44,6 +44,12 @@ MODULE_DESCRIPTION("Chelsio T4 RDMA Driver");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(DRV_VERSION);
 
+struct uld_ctx {
+   struct list_head entry;
+   struct cxgb4_lld_info lldi;
+   struct c4iw_dev *dev;
+};
+
 static LIST_HEAD(uld_ctx_list);
 static DEFINE_MUTEX(dev_mutex);
 
@@ -263,6 +269,9 @@ static int stats_show(struct seq_file *seq, void *v)
 	seq_printf(seq, "  OCQPMEM: %10llu %10llu %10llu\n",
 			dev->rdev.stats.ocqp.total, dev->rdev.stats.ocqp.cur,
 			dev->rdev.stats.ocqp.max);
+	seq_printf(seq, "  DB FULL: %10llu\n", dev->rdev.stats.db_full);
+	seq_printf(seq, " DB EMPTY: %10llu\n", dev->rdev.stats.db_empty);
+	seq_printf(seq, "  DB DROP: %10llu\n", dev->rdev.stats.db_drop);
return 0;
 }
 
@@ -283,6 +292,9 @@ static ssize_t stats_clear(struct file *file, const char 
__user *buf,
 	dev->rdev.stats.pbl.max = 0;
 	dev->rdev.stats.rqt.max = 0;
 	dev->rdev.stats.ocqp.max = 0;
+	dev->rdev.stats.db_full = 0;
+	dev->rdev.stats.db_empty = 0;
+	dev->rdev.stats.db_drop = 0;
 	mutex_unlock(&dev->rdev.stats.lock);
return count;
 }
@@ -443,12 +455,6 @@ static void c4iw_rdev_close(struct c4iw_rdev *rdev)
 	c4iw_destroy_resource(&rdev->resource);
 }
 
-struct uld_ctx {
-   struct list_head entry;
-   struct cxgb4_lld_info lldi;
-   struct c4iw_dev *dev;
-};
-
 static void c4iw_dealloc(struct uld_ctx *ctx)
 {
 	c4iw_rdev_close(&ctx->dev->rdev);
@@ -514,6 +520,7 @@ static struct c4iw_dev *c4iw_alloc(const struct 
cxgb4_lld_info *infop)
 	idr_init(&devp->mmidr);
 	spin_lock_init(&devp->lock);
 	mutex_init(&devp->rdev.stats.lock);
+	mutex_init(&devp->db_mutex);
 
if (c4iw_debugfs_root) {
 		devp->debugfs_root = debugfs_create_dir(
@@ -659,11 +666,76 @@ static int c4iw_uld_state_change(void *handle, enum 
cxgb4_state new_state)
return 0;
 }
 
+static int disable_qp_db(int id, void *p, void *data)
+{
+   struct c4iw_qp *qp = p;
+
+	t4_disable_wq_db(&qp->wq);
+   return 0;
+}
+
+static void stop_queues(struct uld_ctx *ctx)
+{
+	spin_lock_irq(&ctx->dev->lock);
+	ctx->dev->db_state = FLOW_CONTROL;
+	idr_for_each(&ctx->dev->qpidr, disable_qp_db, NULL);
+	spin_unlock_irq(&ctx->dev->lock);
+}
+
+static int enable_qp_db(int id, void *p, void *data)
+{
+   struct c4iw_qp *qp = p;
+
+	t4_enable_wq_db(&qp->wq);
+   return 0;
+}
+
+static void resume_queues(struct uld_ctx *ctx)
+{
+	spin_lock_irq(&ctx->dev->lock);
+	ctx->dev->db_state = NORMAL;
+	idr_for_each(&ctx->dev->qpidr, enable_qp_db, NULL);
+	spin_unlock_irq(&ctx->dev->lock);
+}
+
+static int c4iw_uld_control(void *handle, enum cxgb4_control control, ...)
+{
+   struct uld_ctx *ctx = handle;
+
+   switch (control) {
+   case CXGB4_CONTROL_DB_FULL:
+   stop_queues(ctx);
+		mutex_lock(&ctx->dev->rdev.stats.lock);
+		ctx->dev->rdev.stats.db_full++;
+		mutex_unlock(&ctx->dev->rdev.stats.lock);
+   break;
+   case CXGB4_CONTROL_DB_EMPTY:
+   resume_queues(ctx);
+		mutex_lock(&ctx->dev->rdev.stats.lock);
+		ctx->dev->rdev.stats.db_empty++;
+		mutex_unlock(&ctx->dev->rdev.stats.lock);
+   break;
+   case CXGB4_CONTROL_DB_DROP:
+		printk(KERN_WARNING MOD "%s: Fatal DB DROP\n",
+		       pci_name(ctx->lldi.pdev));
+		mutex_lock(&ctx->dev->rdev.stats.lock);
+		ctx->dev->rdev.stats.db_drop++;
+		mutex_unlock(&ctx->dev->rdev.stats.lock);
+   break;
+   default:
+   printk(KERN_WARNING 

Re: [PATCH 07/10] RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues.

2011-10-24 Thread Vipul Pandya


On 21-10-2011 02:27, David Miller wrote:

 From: Steve Wise sw...@opengridcomputing.com
 Date: Thu, 20 Oct 2011 12:28:07 -0500
 
 On 10/20/2011 12:17 PM, Roland Dreier wrote:
 I believe 5 and 7 have build dependencies.
 Right, missed that one too.

 But it seems 4,6,8,9,10 are independent of the rest of the series?

 ie I can trivially apply them and then worry about working out
 the drivers/net / drivers/infiniband interdependency a bit later?


 Some of these might be dependent on prior patches in the series.  But if
 they aren't, yes, you could do that.
 
 So, how do you guys want to do this?  If you give me a list of which
 patches I should put into net-next and leave the rest to the infiniband
 tree, that'd work fine for me as long as net-next is left in a working
 state independent of the infiniband tree.


Hi Dave Miller/Roland,

With respect to the above dependencies we did some experiments and found
the following things:

1. We can apply three cxgb4 patches, 01 02 and 03, on net-next tree
successfully and build it.

2. Out of the 7 RDMA/cxgb4 patches only 04, 08 and 10 can be applied
trivially and the driver can be built successfully. If we try to apply the
remaining patches, 05 06 07 and 09, they either fail to apply or
cause build failures. Moreover patches 05, 06, 07 and 09 can be applied on
top of 04, 08 and 10 cleanly.

Based on the above results we would like to propose the following two options.

1. We would like to recommend that all the patches get included in
Roland's infiniband tree since there are build dependencies.

2. Alternatively,
- Patches 01, 02 and 03 can be included in net-next tree.
- Patches 04, 08 and 10 can be included in Roland's infiniband tree at
present.
- Patches 05, 06, 07 and 09 have to wait till the net-next hits the
3.2-rc1.

Please let us know if you have any other suggestions.

Thanks,
Vipul


[PATCH 1/2] RDMA/cxgb4: Serialize calls to cq's comp_handler by lock

2011-10-24 Thread Kumar Sanghvi
Commit 01e7da6ba53ca4d6189a1eae45607c0331c871f2 introduced a potential
problem wherein the cq's comp_handler can get called simultaneously from
different places in the iw_cxgb4 driver. This does not comply with
Documentation/infiniband/core_locking.txt, which states that at any given
point in time, only one callback per CQ may be active.

This problem was reported by Parav Pandit parav.pan...@emulex.com.
Based on discussion between Parav Pandit and Steve Wise, this patch
corrects the above problem by serializing the calls to the cq's comp_handler
using a spin_lock.

Reported-by: Parav Pandit parav.pan...@emulex.com
Signed-off-by: Kumar Sanghvi kuma...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cq.c   |1 +
 drivers/infiniband/hw/cxgb4/ev.c   |   10 --
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h |1 +
 drivers/infiniband/hw/cxgb4/qp.c   |   15 +--
 4 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index 901c5fb..f35a935 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -818,6 +818,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int 
entries,
 	chp->cq.size--;			/* status page */
 	chp->ibcq.cqe = entries - 2;
 	spin_lock_init(&chp->lock);
+	spin_lock_init(&chp->comp_handler_lock);
 	atomic_set(&chp->refcnt, 1);
 	init_waitqueue_head(&chp->wait);
 	ret = insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid);
diff --git a/drivers/infiniband/hw/cxgb4/ev.c b/drivers/infiniband/hw/cxgb4/ev.c
index c13041a..397cb36 100644
--- a/drivers/infiniband/hw/cxgb4/ev.c
+++ b/drivers/infiniband/hw/cxgb4/ev.c
@@ -42,6 +42,7 @@ static void post_qp_event(struct c4iw_dev *dev, struct 
c4iw_cq *chp,
 {
struct ib_event event;
struct c4iw_qp_attributes attrs;
+   unsigned long flag;
 
 	if ((qhp->attr.state == C4IW_QP_STATE_ERROR) ||
 	    (qhp->attr.state == C4IW_QP_STATE_TERMINATE)) {
@@ -72,7 +73,9 @@ static void post_qp_event(struct c4iw_dev *dev, struct 
c4iw_cq *chp,
 	if (qhp->ibqp.event_handler)
 		(*qhp->ibqp.event_handler)(&event, qhp->ibqp.qp_context);
 
+	spin_lock_irqsave(&chp->comp_handler_lock, flag);
 	(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
+	spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
 }
 
 void c4iw_ev_dispatch(struct c4iw_dev *dev, struct t4_cqe *err_cqe)
@@ -183,11 +186,14 @@ out:
 int c4iw_ev_handler(struct c4iw_dev *dev, u32 qid)
 {
struct c4iw_cq *chp;
+   unsigned long flag;
 
 	chp = get_chp(dev, qid);
-	if (chp)
+	if (chp) {
+		spin_lock_irqsave(&chp->comp_handler_lock, flag);
 		(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
-	else
+		spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
+	} else
 		PDBG("%s unknown cqid 0x%x\n", __func__, qid);
return 0;
 }
diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h 
b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index 62cea0e..1357c5b 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -309,6 +309,7 @@ struct c4iw_cq {
struct c4iw_dev *rhp;
struct t4_cq cq;
spinlock_t lock;
+   spinlock_t comp_handler_lock;
atomic_t refcnt;
wait_queue_head_t wait;
 };
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index b59b56c..a391a4a 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -945,8 +945,11 @@ static void __flush_qp(struct c4iw_qp *qhp, struct c4iw_cq 
*rchp,
 	flushed = c4iw_flush_rq(&qhp->wq, &rchp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&rchp->lock, flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&rchp->comp_handler_lock, flag);
 		(*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context);
+		spin_unlock_irqrestore(&rchp->comp_handler_lock, flag);
+	}
 
/* locking hierarchy: cq lock first, then qp lock. */
 	spin_lock_irqsave(&schp->lock, flag);
@@ -956,13 +959,17 @@ static void __flush_qp(struct c4iw_qp *qhp, struct 
c4iw_cq *rchp,
 	flushed = c4iw_flush_sq(&qhp->wq, &schp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&schp->lock, flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&schp->comp_handler_lock, flag);
 		(*schp->ibcq.comp_handler)(&schp->ibcq, schp->ibcq.cq_context);
+		spin_unlock_irqrestore(&schp->comp_handler_lock, flag);
+	}
 }
 
 static void flush_qp(struct c4iw_qp *qhp)
 {
struct c4iw_cq *rchp, *schp;
+   unsigned long flag;
 
 	rchp = get_chp(qhp->rhp, qhp->attr.rcq);
 	schp = get_chp(qhp->rhp, qhp->attr.scq);
@@ -970,11 +977,15 @@ static void 

[PATCH 2/2] RDMA/cxgb3: Serialize calls to cq's comp_handler by lock

2011-10-24 Thread Kumar Sanghvi
iw_cxgb3 has a potential problem wherein the cq's comp_handler
can get called simultaneously from different places in the iw_cxgb3
driver. This does not comply with Documentation/infiniband/core_locking.txt,
which states that at any given point in time, only one callback per CQ
may be active.

Such a problem was reported by Parav Pandit parav.pan...@emulex.com for
the iw_cxgb4 driver. Based on discussion between Parav Pandit and Steve Wise,
this patch corrects the above problem by serializing the calls to the cq's
comp_handler using a spin_lock.

Signed-off-by: Kumar Sanghvi kuma...@chelsio.com
---
 drivers/infiniband/hw/cxgb3/iwch_ev.c   |6 ++
 drivers/infiniband/hw/cxgb3/iwch_provider.c |1 +
 drivers/infiniband/hw/cxgb3/iwch_provider.h |1 +
 drivers/infiniband/hw/cxgb3/iwch_qp.c   |   14 --
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_ev.c 
b/drivers/infiniband/hw/cxgb3/iwch_ev.c
index 71e0d84..abcc9e7 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_ev.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_ev.c
@@ -46,6 +46,7 @@ static void post_qp_event(struct iwch_dev *rnicp, struct 
iwch_cq *chp,
struct ib_event event;
struct iwch_qp_attributes attrs;
struct iwch_qp *qhp;
+   unsigned long flag;
 
 	spin_lock(&rnicp->lock);
 	qhp = get_qhp(rnicp, CQE_QPID(rsp_msg->cqe));
@@ -94,7 +95,9 @@ static void post_qp_event(struct iwch_dev *rnicp, struct 
iwch_cq *chp,
 	if (qhp->ibqp.event_handler)
 		(*qhp->ibqp.event_handler)(&event, qhp->ibqp.qp_context);
 
+	spin_lock_irqsave(&chp->comp_handler_lock, flag);
 	(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
+	spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
 
 	if (atomic_dec_and_test(&qhp->refcnt))
 		wake_up(&qhp->wait);
@@ -107,6 +110,7 @@ void iwch_ev_dispatch(struct cxio_rdev *rdev_p, struct 
sk_buff *skb)
struct iwch_cq *chp;
struct iwch_qp *qhp;
u32 cqid = RSPQ_CQID(rsp_msg);
+   unsigned long flag;
 
 	rnicp = (struct iwch_dev *) rdev_p->ulp;
 	spin_lock(&rnicp->lock);
@@ -170,7 +174,9 @@ void iwch_ev_dispatch(struct cxio_rdev *rdev_p, struct 
sk_buff *skb)
 */
 	if (qhp->ep && SQ_TYPE(rsp_msg->cqe))
 		dst_confirm(qhp->ep->dst);
+	spin_lock_irqsave(&chp->comp_handler_lock, flag);
 	(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
+	spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
break;
 
case TPT_ERR_STAG:
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index c7d9411..37c224f 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -190,6 +190,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device 
*ibdev, int entries, int ve
 	chp->rhp = rhp;
 	chp->ibcq.cqe = 1 << chp->cq.size_log2;
 	spin_lock_init(&chp->lock);
+	spin_lock_init(&chp->comp_handler_lock);
 	atomic_set(&chp->refcnt, 1);
 	init_waitqueue_head(&chp->wait);
 	if (insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid)) {
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h 
b/drivers/infiniband/hw/cxgb3/iwch_provider.h
index 9a342c9..87c14b0 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.h
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h
@@ -103,6 +103,7 @@ struct iwch_cq {
struct iwch_dev *rhp;
struct t3_cq cq;
spinlock_t lock;
+   spinlock_t comp_handler_lock;
atomic_t refcnt;
wait_queue_head_t wait;
u32 __user *user_rptr_addr;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c 
b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index ecd313f..bea5839 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -822,8 +822,11 @@ static void __flush_qp(struct iwch_qp *qhp, struct iwch_cq 
*rchp,
 	flushed = cxio_flush_rq(&qhp->wq, &rchp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&rchp->lock, *flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&rchp->comp_handler_lock, *flag);
 		(*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context);
+		spin_unlock_irqrestore(&rchp->comp_handler_lock, *flag);
+	}
 
/* locking hierarchy: cq lock first, then qp lock. */
 	spin_lock_irqsave(&schp->lock, *flag);
@@ -833,8 +836,11 @@ static void __flush_qp(struct iwch_qp *qhp, struct iwch_cq 
*rchp,
 	flushed = cxio_flush_sq(&qhp->wq, &schp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&schp->lock, *flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&schp->comp_handler_lock, *flag);
 		(*schp->ibcq.comp_handler)(&schp->ibcq, schp->ibcq.cq_context);
+   

RE: [PATCH] An argument for allowing applications to manually send RMPP packets if desired

2011-10-24 Thread Mike Heinz
I'm not sure I understand the resistance to this patch.  The patch does not 
break existing functionality and can demonstrably improve the performance of 
tools that can take advantage of it.

What's wrong with implementing this patch now and working on an improved 
mechanism for the future?

-Original Message-
From: Hefty, Sean [mailto:sean.he...@intel.com]
Sent: Sunday, September 18, 2011 8:35 PM
To: Jason Gunthorpe; Mike Heinz
Cc: Roland Dreier; linux-rdma@vger.kernel.org; Todd Rimmer
Subject: RE: [PATCH] An argument for allowing applications to manually send 
RMPP packets if desired

 Ultimately I think the scalable/compatible answer is to move these
 RMPP work loads to a verbs QP and we need to have a user space RMPP
 implementation for that anyhow.

Many to one is never scalable.  The applications simply cannot rely on every 
node querying the SA at the same time, especially for large amounts of data.  
And the overhead of MADs and RMPP is too high to be a useful solution to 
scaling.







Re: [PATCH 6/9] ib_srpt: Remove legacy use_port_guid_in_session_name module parameter

2011-10-24 Thread Bart Van Assche
On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 This patch removes the legacy use_port_guid_in_session_name module parameter
 that is no longer required in modern ib_srpt code.

The patch looks fine to me but the description could be improved: the
use_port_guid_in_session_name parameter was introduced in the past to
work around a limitation of the Windows SRP initiator in multipath
setups. Apparently with recent Windows SRP initiator versions setting
that parameter is no longer necessary.

Bart.


Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute

2011-10-24 Thread Nicholas A. Bellinger
On Mon, 2011-10-24 at 18:34 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  This patch converts the srp_max_rdma_size module parameter into a per
  endpoint configfs attribute.  This includes adding the necessary bits
  for show + store attributes w/ min/max bounds checking, and updating
  srpt_get_ioc() to accept a struct srpt_port parameter.
 
  [ ... ]
 
  -static void srpt_get_ioc(struct srpt_device *sdev, u32 slot,
  +static void srpt_get_ioc(struct srpt_port *sport, u32 slot,
  struct ib_dm_mad *mad)
 
 The SRP spec says that there should be only one value for
 iocp->rdma_size per I/O controller. This patch breaks that rule. I'm
 not sure it's a good idea to introduce such behavior changes.
 

Hi Bart,

This is something that Roland asked me to pursue, so I'll have to defer
to his judgment on whether he wants to kick this back out to a module
parameter or not.

--nab





[PATCH V2 1/2] IB/qib: Hold links until tuning data is available

2011-10-24 Thread Mike Marciniszyn
From: Mitko Haralanov mi...@qlogic.com

Hold the link state machine until the tuning data is read
from the QSFP EEPROM so that the correct tuning settings are applied
before the state machine attempts to bring the link up.
The link is also held on cable unplug in case a different cable is
used.

This version avoids scheduling work on non-qsfp cards.

Signed-off-by: Mitko Haralanov mi...@qlogic.com
Signed-off-by: Mike Marciniszyn mike.marcinis...@qlogic.com
---
 drivers/infiniband/hw/qib/qib_iba7322.c |  120 ---
 drivers/infiniband/hw/qib/qib_init.c|4 -
 drivers/infiniband/hw/qib/qib_qsfp.c|   25 --
 drivers/infiniband/hw/qib/qib_qsfp.h|2 +
 4 files changed, 94 insertions(+), 57 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c 
b/drivers/infiniband/hw/qib/qib_iba7322.c
index ebc5266..e3c8ead 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -2380,17 +2380,17 @@ static int qib_7322_bringup_serdes(struct qib_pportdata 
*ppd)
 	ppd->cpspec->ibcctrl_a |= SYM_MASK(IBCCtrlA_0, IBLinkEn);
set_vls(ppd);
 
+   /* Hold the link state machine for mezz boards */
+   qib_set_ib_7322_lstate(ppd, 0,
+  QLOGIC_IB_IBCC_LINKINITCMD_DISABLE);
+
+
/* be paranoid against later code motion, etc. */
 	spin_lock_irqsave(&dd->cspec->rcvmod_lock, flags);
 	ppd->p_rcvctrl |= SYM_MASK(RcvCtrl_0, RcvIBPortEnable);
 	qib_write_kreg_port(ppd, krp_rcvctrl, ppd->p_rcvctrl);
 	spin_unlock_irqrestore(&dd->cspec->rcvmod_lock, flags);
 
-   /* Hold the link state machine for mezz boards */
-   if (IS_QMH(dd) || IS_QME(dd))
-   qib_set_ib_7322_lstate(ppd, 0,
-  QLOGIC_IB_IBCC_LINKINITCMD_DISABLE);
-
/* Also enable IBSTATUSCHG interrupt.  */
val = qib_read_kreg_port(ppd, krp_errmask);
qib_write_kreg_port(ppd, krp_errmask,
@@ -5276,6 +5276,8 @@ static int qib_7322_ib_updown(struct qib_pportdata *ppd, 
int ibup, u64 ibcs)
 QIBL_IB_AUTONEG_INPROG)))
 		set_7322_ibspeed_fast(ppd, ppd->link_speed_enabled);
 	if (!(ppd->lflags & QIBL_IB_AUTONEG_INPROG)) {
+		struct qib_qsfp_data *qd =
+			&ppd->cpspec->qsfp_data;
/* unlock the Tx settings, speed may change */
qib_write_kreg_port(ppd, krp_tx_deemph_override,
SYM_MASK(IBSD_TX_DEEMPHASIS_OVERRIDE_0,
@@ -5283,6 +5285,12 @@ static int qib_7322_ib_updown(struct qib_pportdata *ppd, 
int ibup, u64 ibcs)
qib_cancel_sends(ppd);
/* on link down, ensure sane pcs state */
qib_7322_mini_pcs_reset(ppd);
+   /* schedule the qsfp refresh which should turn the link
+  off */
+		if (ppd->dd->flags & QIB_HAS_QSFP) {
+			qd->t_insert = get_jiffies_64();
+			schedule_work(&qd->work);
+   }
 	spin_lock_irqsave(&ppd->sdma_lock, flags);
if (__qib_sdma_running(ppd))
__qib_sdma_process_event(ppd,
@@ -5638,38 +5646,62 @@ static void qsfp_7322_event(struct work_struct *work)
 
qd = container_of(work, struct qib_qsfp_data, work);
 	ppd = qd->ppd;
-	pwrup = qd->t_insert + msecs_to_jiffies(QSFP_PWR_LAG_MSEC);
+	pwrup = qd->t_insert +
+		msecs_to_jiffies(QSFP_PWR_LAG_MSEC - QSFP_MODPRS_LAG_MSEC);
 
-   /*
-* Some QSFP's not only do not respond until the full power-up
-* time, but may behave badly if we try. So hold off responding
-* to insertion.
-*/
-   while (1) {
-   u64 now = get_jiffies_64();
-   if (time_after64(now, pwrup))
-   break;
-   msleep(20);
-   }
-	ret = qib_refresh_qsfp_cache(ppd, &qd->cache);
-   /*
-* Need to change LE2 back to defaults if we couldn't
-* read the cable type (to handle cable swaps), so do this
-* even on failure to read cable information.  We don't
-* get here for QME, so IS_QME check not needed here.
-*/
-	if (!ret && !ppd->dd->cspec->r1) {
-		if (QSFP_IS_ACTIVE_FAR(qd->cache.tech))
-			le2 = LE2_QME;
-		else if (qd->cache.atten[1] >= qib_long_atten &&
-			 QSFP_IS_CU(qd->cache.tech))
-   le2 = LE2_5m;
-   else
+   /* Delay for 20 msecs to allow ModPrs resistor to setup */
+   mdelay(QSFP_MODPRS_LAG_MSEC);
+
+   if (!qib_qsfp_mod_present(ppd))
+   /* Set the physical link to disabled */
+   qib_set_ib_7322_lstate(ppd, 0,
+  

Re: OpenSM plugin patch request

2011-10-24 Thread Karl Schulz
On Oct 17, 2011, at 5:28 PM, Hal Rosenstock wrote:

 
 A couple of minor comments:
 
 1. Why not move the OSM_EVENT_ID_LIGHT_SWEEP_DONE event to where LIGHT
 SWEEP COMPLETE is indicated in osm_state_mgr.c ? That's actually more
 accurate as all the transactions have completed.
 
 2. The new events should be added to osmeventplugin/src/osmeventplugin.c
 example.
 
 Thanks.
 
 -- Hal
 

Thanks for the comments.  The attached osm_light_sweep.patch addresses the
points above by moving the OSM_EVENT_ID_LIGHT_SWEEP_DONE event to be with its
corresponding LIGHT SWEEP COMPLETE log message.  Also, the light sweep
START/DONE events are included in the example event plugin.

While adding to the example event plugin, I realized that the autoconf setup in 
3.3.9 does not appear to be propagating all user settings down into the default 
plugin.  Consequently, I was not able to build the plugin when OFED headers 
were not in the default location.  In case it's useful (and hasn't already
been addressed), the second attached patch adds $(OSMV_INCLUDES) to the
automake directives for the plugin, which was sufficient in my case to resolve
the necessary headers.

Thanks,

Karl





osm_light_sweep.patch
Description: osm_light_sweep.patch


build.patch
Description: build.patch


[PATCH 1/2] ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib

2011-10-24 Thread Nicholas A. Bellinger
From: Nicholas Bellinger n...@linux-iscsi.org

This patch fixes a possible race with srp_sq_size in srpt_create_ch_ib()
where changing sport->port_attrib.srp_sq_size via configfs could have
unintended consequences.  It uses a local assignment for srp_sq_size to
ensure the values for ib_create_cq() and qp_init->cap.max_send_wr are
consistent.

Reported-by: Bart Van Assche bvanass...@acm.org
Cc: Bart Van Assche bvanass...@acm.org
Cc: Roland Dreier rol...@purestorage.com
Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 drivers/infiniband/ulp/srpt/ib_srpt.c |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 9e1c5e0..e483c54 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2130,6 +2130,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
struct ib_qp_init_attr *qp_init;
 	struct srpt_port *sport = ch->sport;
 	struct srpt_device *sdev = sport->sdev;
+	u32 srp_sq_size = sport->port_attrib.srp_sq_size;
int ret;
 
 	WARN_ON(ch->rq_size < 1);
@@ -2140,11 +2141,11 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
goto out;
 
 	ch->cq = ib_create_cq(sdev->device, srpt_completion, NULL, ch,
-			      ch->rq_size + sport->port_attrib.srp_sq_size, 0);
+			      ch->rq_size + srp_sq_size, 0);
 	if (IS_ERR(ch->cq)) {
 		ret = PTR_ERR(ch->cq);
 		printk(KERN_ERR "failed to create CQ cqe= %d ret= %d\n",
-		       ch->rq_size + sport->port_attrib.srp_sq_size, ret);
+		       ch->rq_size + srp_sq_size, ret);
goto out;
}
 
@@ -2156,7 +2157,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 	qp_init->srq = sdev->srq;
 	qp_init->sq_sig_type = IB_SIGNAL_REQ_WR;
 	qp_init->qp_type = IB_QPT_RC;
-	qp_init->cap.max_send_wr = sport->port_attrib.srp_sq_size;
+	qp_init->cap.max_send_wr = srp_sq_size;
 	qp_init->cap.max_send_sge = SRPT_DEF_SG_PER_WQE;
 
 	ch->qp = ib_create_qp(sdev->pd, qp_init);
-- 
1.7.2.5



[PATCH 2/2] ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work

2011-10-24 Thread Nicholas A. Bellinger
From: Nicholas Bellinger n...@linux-iscsi.org

This patch fixes a possible race with srp_max_rsp_size in
srpt_release_channel_work(), where changing sport->port_attrib.srp_max_rsp_size
via configfs could have unintended consequences.  It adds a new
srpt_rdma_ch->rsp_size field and assigns the value during srpt_cm_req_recv();
that value is used in subsequent calls to srpt_free_ioctx_ring() to ensure
consistency with the original srpt_alloc_ioctx_ring().

Reported-by: Bart Van Assche bvanass...@acm.org
Cc: Bart Van Assche bvanass...@acm.org
Cc: Roland Dreier rol...@purestorage.com
Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
---
 drivers/infiniband/ulp/srpt/ib_srpt.c |   10 --
 drivers/infiniband/ulp/srpt/ib_srpt.h |2 ++
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index e483c54..f7174b3 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2373,8 +2373,7 @@ static void srpt_release_channel_work(struct work_struct 
*w)
 
 	srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_ring,
 			     ch->sport->sdev, ch->rq_size,
-			     ch->sport->port_attrib.srp_max_rsp_size,
-			     DMA_TO_DEVICE);
+			     ch->rsp_size, DMA_TO_DEVICE);
 
 	spin_lock_irq(&sdev->spinlock);
 	list_del(&ch->list);
@@ -2554,12 +2553,12 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
 	spin_lock_init(&ch->spinlock);
 	ch->state = CH_CONNECTING;
 	INIT_LIST_HEAD(&ch->cmd_wait_list);
+	ch->rsp_size = ch->sport->port_attrib.srp_max_rsp_size;
 
 	ch->ioctx_ring = (struct srpt_send_ioctx **)
 		srpt_alloc_ioctx_ring(ch->sport->sdev, ch->rq_size,
 				      sizeof(*ch->ioctx_ring[0]),
-				      ch->sport->port_attrib.srp_max_rsp_size,
-				      DMA_TO_DEVICE);
+				      ch->rsp_size, DMA_TO_DEVICE);
 	if (!ch->ioctx_ring)
goto free_ch;
 
@@ -2667,8 +2666,7 @@ destroy_ib:
 free_ring:
 	srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_ring,
 			     ch->sport->sdev, ch->rq_size,
-			     ch->sport->port_attrib.srp_max_rsp_size,
-			     DMA_TO_DEVICE);
+			     ch->rsp_size, DMA_TO_DEVICE);
 free_ch:
kfree(ch);
 
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h 
b/drivers/infiniband/ulp/srpt/ib_srpt.h
index 1b607ae..068f66d 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
@@ -277,6 +277,7 @@ enum rdma_ch_state {
  * @qp:IB queue pair used for communicating over this channel.
  * @cq:IB completion queue for this channel.
  * @rq_size:   IB receive queue size.
+ * @rsp_size:	IB response message size in bytes.
  * @sq_wr_avail:   number of work requests available in the send queue.
  * @sport: pointer to the information of the HCA port used by this
  * channel.
@@ -307,6 +308,7 @@ struct srpt_rdma_ch {
struct ib_qp*qp;
struct ib_cq*cq;
int rq_size;
+   u32 rsp_size;
atomic_tsq_wr_avail;
struct srpt_port*sport;
u8  i_port_id[16];
-- 
1.7.2.5



Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Bart Van Assche
On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
 +       struct se_portal_group *se_tpg,
 +       const char *page,
 +       size_t count)
 +{
 +       struct srpt_port *sport = container_of(se_tpg, struct srpt_port, 
 port_tpg_1);
 +       unsigned long val;
 +       int ret;
 +
 +       ret = strict_strtoul(page, 0, &val);

If the data that page points at consists only of digits, the above
strict_strtoul() call will trigger a past-end-of-buffer read. Also,
isn't kstrto*() preferred over strict_strto*()?

Bart.


Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Nicholas A. Bellinger
On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
  +   struct se_portal_group *se_tpg,
  +   const char *page,
  +   size_t count)
  +{
  +   struct srpt_port *sport = container_of(se_tpg, struct srpt_port, 
  port_tpg_1);
  +   unsigned long val;
  +   int ret;
  +
  +   ret = strict_strtoul(page, 0, val);
 
 If the data page points at only consists of digits, the above
 strict_strtoul() call will trigger a past-end-of-buffer read.

I don't understand what you mean here.  Can you provide a test case to
demonstrate please..?

 Also, isn't kstrto*() preferred over strict_strto*() ?
 

Not AFAIK.

--nab




Re: [PATCH 5/9] ib_srpt: Fix sport-port_guid formatting code

2011-10-24 Thread Bart Van Assche
On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 +               snprintf(sport->port_guid, sizeof(sport->port_guid),
 +                               "0x%016llx",
 +                               be64_to_cpu(sport->gid.global.interface_id));

If I interpret Roland's e-mail correctly, he asked to use GID table
entry 0 (http://www.spinics.net/lists/linux-rdma/msg09751.html)?

Bart.
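
A minimal sketch of that suggestion (the helper name is hypothetical; it
assumes the 2011-era ib_query_gid() kernel API and omits the caller's
error handling):

static int srpt_format_guid_from_gid0(struct ib_device *dev, u8 port,
				      char *buf, size_t len)
{
	union ib_gid gid;
	int ret;

	/* Read GID table entry 0 for this port. */
	ret = ib_query_gid(dev, port, 0, &gid);
	if (ret)
		return ret;

	/* Format the interface ID (port GUID), as in the snippet above. */
	snprintf(buf, len, "0x%016llx",
		 be64_to_cpu(gid.global.interface_id));
	return 0;
}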


Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Bart Van Assche
On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
  +       struct se_portal_group *se_tpg,
  +       const char *page,
  +       size_t count)
  +{
  +       struct srpt_port *sport = container_of(se_tpg, struct srpt_port, 
  port_tpg_1);
  +       unsigned long val;
  +       int ret;
  +
  +       ret = strict_strtoul(page, 0, val);

 If the data page points at only consists of digits, the above
 strict_strtoul() call will trigger a past-end-of-buffer read.

 I don't understand what you mean here.  Can you provide a test case to
 demonstrate please..?

echo -n 345 > $configfs_path_of_parameter.

Bart.


Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Nicholas A. Bellinger
On Mon, 2011-10-24 at 21:58 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
  On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
  n...@linux-iscsi.org wrote:
   +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
   +   struct se_portal_group *se_tpg,
   +   const char *page,
   +   size_t count)
   +{
   +   struct srpt_port *sport = container_of(se_tpg, struct srpt_port, 
   port_tpg_1);
   +   unsigned long val;
   +   int ret;
   +
   +   ret = strict_strtoul(page, 0, val);
 
  If the data page points at only consists of digits, the above
  strict_strtoul() call will trigger a past-end-of-buffer read.
 
  I don't understand what you mean here.  Can you provide a test case to
  demonstrate please..?
 
 echo -n 345 $configfs_path_of_parameter.
 

Still not sure what you're getting at here..?

root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib#
 ls -la
total 0
drwxr-xr-x 2 root root0 Oct 24 13:01 .
drwxr-xr-x 7 root root0 Oct 24 12:58 ..
-rw-r--r-- 1 root root 4096 Oct 24 12:58 srp_max_rdma_size
-rw-r--r-- 1 root root 4096 Oct 24 13:01 srp_max_rsp_size
-rw-r--r-- 1 root root 4096 Oct 24 12:58 srp_sq_size

root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib#
 cat srp_max_rsp_size
256
root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib#
 echo -n 240 > srp_max_rsp_size
root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib#
 cat srp_max_rsp_size
240




Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Bart Van Assche
On Mon, Oct 24, 2011 at 10:05 PM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 On Mon, 2011-10-24 at 21:58 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
  On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
  n...@linux-iscsi.org wrote:
   +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
   +       struct se_portal_group *se_tpg,
   +       const char *page,
   +       size_t count)
   +{
   +       struct srpt_port *sport = container_of(se_tpg, struct 
   srpt_port, port_tpg_1);
   +       unsigned long val;
   +       int ret;
   +
   +       ret = strict_strtoul(page, 0, val);
 
  If the data page points at only consists of digits, the above
  strict_strtoul() call will trigger a past-end-of-buffer read.
 
  I don't understand what you mean here.  Can you provide a test case to
  demonstrate please..?

 echo -n 345 $configfs_path_of_parameter.

 Still not sure what your getting at here..?

Only the data in page[0..count-1] is guaranteed to be initialized.
strict_strtoul() will read until it either finds whitespace or a
binary zero, so if the data in page[] contains neither whitespace
nor a binary zero then strict_strtoul() will read past the end of the
data in page[]. There may be arbitrary data at page[count], including a
valid digit.

Bart.


Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Bart Van Assche
On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
 Also, isn't kstrto*() preferred over strict_strto*() ?

 Not AFAIK.

This is what I found in the description of commit 33ee3b2:

kstrto*: converting strings to integers done (hopefully) right

1. simple_strto*() do not contain overflow checks and crufty,
   libc way to indicate failure.
2. strict_strto*() also do not have overflow checks but the name and
   comments pretend they do.
3. Both families have only long long and long variants,
   but users want strtou8()
4. Both simple and strict prefixes are wrong:
   Simple doesn't exactly say what's so simple, strict should not exist
   because conversion should be strict by default.

The solution is to use k prefix and add convertors for more types.
Enter
kstrtoull()
...

Bart.
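
For illustration, a sketch of the store method quoted above converted to
kstrtoul(); the MIN_MAX_RSP_SIZE/MAX_MAX_RSP_SIZE bound macros are
placeholders, not names taken from the driver:

static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
	struct se_portal_group *se_tpg,
	const char *page,
	size_t count)
{
	struct srpt_port *sport = container_of(se_tpg, struct srpt_port,
					       port_tpg_1);
	unsigned long val;
	int ret;

	/* kstrtoul() checks for overflow and rejects trailing garbage. */
	ret = kstrtoul(page, 0, &val);
	if (ret < 0)
		return ret;
	if (val < MIN_MAX_RSP_SIZE || val > MAX_MAX_RSP_SIZE)
		return -EINVAL;

	sport->port_attrib.srp_max_rsp_size = val;
	return count;
}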


Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Nicholas A. Bellinger
On Mon, 2011-10-24 at 22:11 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 10:05 PM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  On Mon, 2011-10-24 at 21:58 +0200, Bart Van Assche wrote:
  On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger
  n...@linux-iscsi.org wrote:
   On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
   On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger
   n...@linux-iscsi.org wrote:
+static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
+   struct se_portal_group *se_tpg,
+   const char *page,
+   size_t count)
+{
+   struct srpt_port *sport = container_of(se_tpg, struct 
srpt_port, port_tpg_1);
+   unsigned long val;
+   int ret;
+
+   ret = strict_strtoul(page, 0, val);
  
   If the data page points at only consists of digits, the above
   strict_strtoul() call will trigger a past-end-of-buffer read.
  
   I don't understand what you mean here.  Can you provide a test case to
   demonstrate please..?
 
  echo -n 345 $configfs_path_of_parameter.
 
  Still not sure what your getting at here..?
 
 Only the data in page[0..count-1] is guaranteed to be initialized.
 strict_strtoul() will read until it either finds whitespace or a
 binary zero, so if the data in page[] does neither contain whitespace
 nor a binary zero then strict_strtoul() will read past the end of the
 data in page[]. There may be any data at page[count], including a
 valid digit.
 

That part is obvious.  The point you're missing is that configfs is
already sanitizing the incoming buffer in fs/configfs/file.c to work
with strict_strtoul() here:

static int
fill_write_buffer(struct configfs_buffer * buffer, const char __user * buf, 
size_t count)
{
int error;

	if (!buffer->page)
		buffer->page = (char *)__get_free_pages(GFP_KERNEL, 0);
	if (!buffer->page)
		return -ENOMEM;

	if (count >= SIMPLE_ATTR_SIZE)
		count = SIMPLE_ATTR_SIZE - 1;
	error = copy_from_user(buffer->page,buf,count);
	buffer->needs_read_fill = 1;
	/* if buf is assumed to contain a string, terminate it by \0,
	 * so e.g. sscanf() can scan the string easily */
	buffer->page[count] = 0;
return error ? -EFAULT : count;
}



Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute

2011-10-24 Thread Nicholas A. Bellinger
On Mon, 2011-10-24 at 22:16 +0200, Bart Van Assche wrote:
 On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger
 n...@linux-iscsi.org wrote:
  On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
  Also, isn't kstrto*() preferred over strict_strto*() ?
 
  Not AFAIK.
 
 This is what I found in the description of commit 33ee3b2:
 
 kstrto*: converting strings to integers done (hopefully) right
 
 1. simple_strto*() do not contain overflow checks and crufty,
libc way to indicate failure.
 2. strict_strto*() also do not have overflow checks but the name and
comments pretend they do.
 3. Both families have only long long and long variants,
but users want strtou8()
 4. Both simple and strict prefixes are wrong:
Simple doesn't exactly say what's so simple, strict should not exist
because conversion should be strict by default.
 
 The solution is to use k prefix and add convertors for more types.
 Enter
 kstrtoull()
 ...

Fair enough.  I'll convert this separately after the merge considering
it's a very minor improvement, and not a bugfix.

--nab





Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute

2011-10-24 Thread Nicholas A. Bellinger
On Mon, 2011-10-24 at 05:33 +, Nicholas A. Bellinger wrote:
 From: Nicholas Bellinger n...@linux-iscsi.org
 
 This patch converts the srp_max_rdma_size module parameter into a per
 endpoint configfs attribute.  This includes adding the necessary bits
 for show + store attributes w/ min/max bounds checking, and updating
 srpt_get_ioc() to accept a struct srpt_port parameter.
 
 Reported-by: Roland Dreier rol...@purestorage.com
 Cc: Roland Dreier rol...@purestorage.com
 Cc: Bart Van Assche bvanass...@acm.org
 Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
 ---
  drivers/infiniband/ulp/srpt/ib_srpt.c |   60 
 -
  drivers/infiniband/ulp/srpt/ib_srpt.h |   10 +
  2 files changed, 61 insertions(+), 9 deletions(-)
 

SNIP

 diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h 
 b/drivers/infiniband/ulp/srpt/ib_srpt.h
 index 59ee2d7..c9caa90 100644
 --- a/drivers/infiniband/ulp/srpt/ib_srpt.h
 +++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
 @@ -114,6 +114,7 @@ enum {
   MIN_SRPT_SRQ_SIZE = 4,
   DEFAULT_SRPT_SRQ_SIZE = 4095,
   MAX_SRPT_SRQ_SIZE = 65535,
 + MAX_SRPT_RDMA_SIZE = 256U,
  

Hey btw, what should the proper MAX_SRPT_RDMA_SIZE be here..?

--nab



Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1

2011-10-24 Thread Roland Dreier
On Mon, Oct 24, 2011 at 11:33 AM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:
 - Handle IB completion timeouts. Although the InfiniBand Architecture
 Manual specifies that a HCA must generate an error completion for each
 pending work item when a queue pair reset is issued, an important
 class of HCAs doesn't do this (that includes the HCA in your test
 setup). In the SCST version of this driver such timeouts are handled
 by the function srpt_pending_cmd_timeout().

Hold on... not sure what you mean by a queue pair reset but the IB
spec does not say that a flush error should be generated when a QP is
transitioned to the reset state.  The flush error completions are
required when a QP is transitioned to the error state, and in fact
completion entries may be lost when a QP transitions to reset.

As far as I know every HCA supported by Linux does implement this
correctly.  Which class did you have in mind as not doing that?

 - R.
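
A minimal sketch of the distinction Roland describes, using the kernel
verbs API (the helper name is hypothetical):

/*
 * Transitioning a QP to the error state is what guarantees flush-error
 * completions (IB_WC_WR_FLUSH_ERR) for outstanding work requests;
 * transitioning to reset may discard completion entries instead.
 */
static int drain_qp_via_error_state(struct ib_qp *qp)
{
	struct ib_qp_attr attr = {
		.qp_state = IB_QPS_ERR,
	};

	return ib_modify_qp(qp, &attr, IB_QP_STATE);
}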


Re: [PATCH 07/10] RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues.

2011-10-24 Thread David Miller
From: Vipul Pandya vi...@chelsio.com
Date: Mon, 24 Oct 2011 20:46:31 +0530

 1. We would like to recommend that all the patches get included in
 Roland's infiniband tree since it has build dependencies.

This is fine with me.  It just means that Roland's tree has to have
net-next included in it already, because otherwise the paths to
the cxgb4 driver under drivers/net won't be right.