[PATCH for-3.2 v2 3/5] ib/mlx4: Add extended port capabilities support
An Extended Port Info packet is sent to each hw port during HCA init. If it
returns without error, the port is assumed to support extended port
capabilities.

Signed-off-by: Marcel Apfelbaum <marc...@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <ja...@dev.mellanox.com>
---
Changes from v1:
- method mlx4_get_ext_port_caps changed to mlx4_check_ext_port_caps
- extended port capabilities are assigned inside mlx4_check_ext_port_caps
  rather than in mlx4_setup_hca
- ext_port_cap field is u8 instead of u64
- minor warning notification change in mlx4_setup_hca

 drivers/net/mlx4/main.c     |  7 +++
 drivers/net/mlx4/mlx4.h     |  1 +
 drivers/net/mlx4/port.c     | 42 ++++++++++++++++++++
 include/linux/mlx4/device.h |  7 +++
 4 files changed, 57 insertions(+)

--- a/drivers/net/mlx4/main.c
+++ b/drivers/net/mlx4/main.c
@@ -998,6 +998,13 @@ static int mlx4_setup_hca(struct mlx4_de
 				 "ib capabilities (%d). Continuing with "
 				 "caps = 0\n", port, err);
 		dev->caps.ib_port_def_cap[port] = ib_port_default_caps;
+
+		err = mlx4_check_ext_port_caps(dev, port);
+		if (err)
+			mlx4_warn(dev, "failed to get port %d extended "
+				  "port capabilities support info (%d). "
+				  "Assuming not supported\n", port, err);
+
 		err = mlx4_SET_PORT(dev, port);
 		if (err) {
 			mlx4_err(dev, "Failed to set port %d, aborting\n",
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -450,6 +450,7 @@ void mlx4_init_vlan_table(struct mlx4_de
 int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port);
 int mlx4_get_port_ib_caps(struct mlx4_dev *dev, u8 port, __be32 *caps);
+int mlx4_check_ext_port_caps(struct mlx4_dev *dev, u8 port);

 int mlx4_qp_detach_common(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
 			  enum mlx4_protocol prot, enum mlx4_steer_type steer);
--- a/drivers/net/mlx4/port.c
+++ b/drivers/net/mlx4/port.c
@@ -464,6 +464,48 @@ int mlx4_get_port_ib_caps(struct mlx4_de
 	return err;
 }

+int mlx4_check_ext_port_caps(struct mlx4_dev *dev, u8 port)
+{
+	struct mlx4_cmd_mailbox *inmailbox, *outmailbox;
+	u8 *inbuf, *outbuf;
+	int err, packet_error;
+
+	inmailbox = mlx4_alloc_cmd_mailbox(dev);
+	if (IS_ERR(inmailbox))
+		return PTR_ERR(inmailbox);
+
+	outmailbox = mlx4_alloc_cmd_mailbox(dev);
+	if (IS_ERR(outmailbox)) {
+		mlx4_free_cmd_mailbox(dev, inmailbox);
+		return PTR_ERR(outmailbox);
+	}
+
+	inbuf = inmailbox->buf;
+	outbuf = outmailbox->buf;
+	memset(inbuf, 0, 256);
+	memset(outbuf, 0, 256);
+	inbuf[0] = 1;
+	inbuf[1] = 1;
+	inbuf[2] = 1;
+	inbuf[3] = 1;
+
+	*(__be16 *) (&inbuf[16]) = MLX4_ATTR_EXTENDED_PORT_INFO;
+	*(__be32 *) (&inbuf[20]) = cpu_to_be32(port);
+
+	err = mlx4_cmd_box(dev, inmailbox->dma, outmailbox->dma, port, 3,
+			   MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C);
+
+	packet_error = be16_to_cpu(*(__be16 *) (outbuf + 4));
+
+	dev->caps.ext_port_cap[port] = (!err && !packet_error) ?
+				       MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO
+				       : 0;
+
+	mlx4_free_cmd_mailbox(dev, inmailbox);
+	mlx4_free_cmd_mailbox(dev, outmailbox);
+	return err;
+}
+
 int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port)
 {
 	struct mlx4_cmd_mailbox *mailbox;
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -82,6 +82,12 @@ enum {
 	MLX4_DEV_CAP_FLAG_COUNTERS	= 1LL << 48
 };

+#define MLX4_ATTR_EXTENDED_PORT_INFO	cpu_to_be16(0xff90)
+
+enum {
+	MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO	= 1 << 0
+};
+
 enum {
 	MLX4_BMME_FLAG_LOCAL_INV	= 1 << 6,
 	MLX4_BMME_FLAG_REMOTE_INV	= 1 << 7,
@@ -276,6 +282,7 @@ struct mlx4_caps {
 	u32			port_mask;
 	enum mlx4_port_type	possible_type[MLX4_MAX_PORTS + 1];
 	u32			max_counters;
+	u8			ext_port_cap[MLX4_MAX_PORTS + 1];
 };

 struct mlx4_buf_list {
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V2 05/10] RDMA/cxgb4: Add DB Overflow Avoidance.
- Get FULL/EMPTY/DROP events from the LLD.
- On a FULL event, disable normal user mode DB rings.
- Add modify_qp semantics to allow user processes to call into the kernel
  to ring doorbells without overflowing.

Add DB Full/Empty/Drop stats.

Mark queues when created indicating the doorbell state. If we're in the
middle of db overflow avoidance, then newly created queues should start
out in this mode.

Bump the C4IW_UVERBS_ABI_VERSION to 2 so the user mode library can know
if the driver supports the kernel mode db ringing.

Signed-off-by: Vipul Pandya <vi...@chelsio.com>
Signed-off-by: Steve Wise <sw...@opengridcomputing.com>
---
V2: Bump C4IW_UVERBS_ABI_VERSION to 2

 drivers/infiniband/hw/cxgb4/device.c   | 84 ++++++++++++++++++++++++++--
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 37 ++++++++++--
 drivers/infiniband/hw/cxgb4/qp.c       | 51 +++++++++++++++-
 drivers/infiniband/hw/cxgb4/user.h     |  2 +-
 4 files changed, 162 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
index 8483111..9062ed9 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -44,6 +44,12 @@
 MODULE_DESCRIPTION("Chelsio T4 RDMA Driver");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(DRV_VERSION);

+struct uld_ctx {
+	struct list_head entry;
+	struct cxgb4_lld_info lldi;
+	struct c4iw_dev *dev;
+};
+
 static LIST_HEAD(uld_ctx_list);
 static DEFINE_MUTEX(dev_mutex);

@@ -263,6 +269,9 @@ static int stats_show(struct seq_file *seq, void *v)
 	seq_printf(seq, " OCQPMEM: %10llu %10llu %10llu\n",
 			dev->rdev.stats.ocqp.total, dev->rdev.stats.ocqp.cur,
 			dev->rdev.stats.ocqp.max);
+	seq_printf(seq, " DB FULL: %10llu\n", dev->rdev.stats.db_full);
+	seq_printf(seq, " DB EMPTY: %10llu\n", dev->rdev.stats.db_empty);
+	seq_printf(seq, " DB DROP: %10llu\n", dev->rdev.stats.db_drop);
 	return 0;
 }

@@ -283,6 +292,9 @@ static ssize_t stats_clear(struct file *file, const char __user *buf,
 	dev->rdev.stats.pbl.max = 0;
 	dev->rdev.stats.rqt.max = 0;
 	dev->rdev.stats.ocqp.max = 0;
+	dev->rdev.stats.db_full = 0;
+	dev->rdev.stats.db_empty = 0;
+	dev->rdev.stats.db_drop = 0;
 	mutex_unlock(&dev->rdev.stats.lock);
 	return count;
 }

@@ -443,12 +455,6 @@ static void c4iw_rdev_close(struct c4iw_rdev *rdev)
 	c4iw_destroy_resource(&rdev->resource);
 }

-struct uld_ctx {
-	struct list_head entry;
-	struct cxgb4_lld_info lldi;
-	struct c4iw_dev *dev;
-};
-
 static void c4iw_dealloc(struct uld_ctx *ctx)
 {
 	c4iw_rdev_close(&ctx->dev->rdev);

@@ -514,6 +520,7 @@ static struct c4iw_dev *c4iw_alloc(const struct cxgb4_lld_info *infop)
 	idr_init(&devp->mmidr);
 	spin_lock_init(&devp->lock);
 	mutex_init(&devp->rdev.stats.lock);
+	mutex_init(&devp->db_mutex);

 	if (c4iw_debugfs_root) {
 		devp->debugfs_root = debugfs_create_dir(

@@ -659,11 +666,76 @@ static int c4iw_uld_state_change(void *handle, enum cxgb4_state new_state)
 	return 0;
 }

+static int disable_qp_db(int id, void *p, void *data)
+{
+	struct c4iw_qp *qp = p;
+
+	t4_disable_wq_db(&qp->wq);
+	return 0;
+}
+
+static void stop_queues(struct uld_ctx *ctx)
+{
+	spin_lock_irq(&ctx->dev->lock);
+	ctx->dev->db_state = FLOW_CONTROL;
+	idr_for_each(&ctx->dev->qpidr, disable_qp_db, NULL);
+	spin_unlock_irq(&ctx->dev->lock);
+}
+
+static int enable_qp_db(int id, void *p, void *data)
+{
+	struct c4iw_qp *qp = p;
+
+	t4_enable_wq_db(&qp->wq);
+	return 0;
+}
+
+static void resume_queues(struct uld_ctx *ctx)
+{
+	spin_lock_irq(&ctx->dev->lock);
+	ctx->dev->db_state = NORMAL;
+	idr_for_each(&ctx->dev->qpidr, enable_qp_db, NULL);
+	spin_unlock_irq(&ctx->dev->lock);
+}
+
+static int c4iw_uld_control(void *handle, enum cxgb4_control control, ...)
+{
+	struct uld_ctx *ctx = handle;
+
+	switch (control) {
+	case CXGB4_CONTROL_DB_FULL:
+		stop_queues(ctx);
+		mutex_lock(&ctx->dev->rdev.stats.lock);
+		ctx->dev->rdev.stats.db_full++;
+		mutex_unlock(&ctx->dev->rdev.stats.lock);
+		break;
+	case CXGB4_CONTROL_DB_EMPTY:
+		resume_queues(ctx);
+		mutex_lock(&ctx->dev->rdev.stats.lock);
+		ctx->dev->rdev.stats.db_empty++;
+		mutex_unlock(&ctx->dev->rdev.stats.lock);
+		break;
+	case CXGB4_CONTROL_DB_DROP:
+		printk(KERN_WARNING MOD "%s: Fatal DB DROP\n",
+		       pci_name(ctx->lldi.pdev));
+		mutex_lock(&ctx->dev->rdev.stats.lock);
+		ctx->dev->rdev.stats.db_drop++;
+		mutex_unlock(&ctx->dev->rdev.stats.lock);
+		break;
+	default:
+		printk(KERN_WARNING
Re: [PATCH 07/10] RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues.
On 21-10-2011 02:27, David Miller wrote:
> From: Steve Wise <sw...@opengridcomputing.com>
> Date: Thu, 20 Oct 2011 12:28:07 -0500
>
>> On 10/20/2011 12:17 PM, Roland Dreier wrote:
>>>> I believe 5 and 7 have build dependencies.
>>>
>>> Right, missed that one too. But it seems 4,6,8,9,10 are independent
>>> of the rest of the series? ie I can trivially apply them and then
>>> worry about working out the drivers/net / drivers/infiniband
>>> interdependency a bit later?
>>
>> Some of these might be dependent on prior patches in the series. But
>> if they aren't, yes, you could do that.
>
> So, how do you guys want to do this? If you give me a list of which
> patches I should put into net-next and leave the rest to the
> infiniband tree, that'd work fine for me as long as net-next is left
> in a working state independent of the infiniband tree.

Hi Dave Miller/Roland,

With respect to the above dependencies, we did some experiments and found the following:

1. The three cxgb4 patches, 01, 02 and 03, apply to the net-next tree successfully and it builds.
2. Of the 7 RDMA/cxgb4 patches, only 04, 08 and 10 can be applied trivially with the driver still building successfully. If we try to apply the remaining patches, 05, 06, 07 and 09, they either fail to apply or cause a build failure. Moreover, patches 05, 06, 07 and 09 apply cleanly on top of 04, 08 and 10.

Based on the above results, we would like to propose one of the following:

1. We recommend that all the patches be included in Roland's infiniband tree, since the series has build dependencies.
2. Alternatively:
   - Patches 01, 02 and 03 can be included in the net-next tree.
   - Patches 04, 08 and 10 can be included in Roland's infiniband tree at present.
   - Patches 05, 06, 07 and 09 have to wait until net-next hits 3.2-rc1.

Please let us know if you have any other suggestions.

Thanks,
Vipul
[PATCH 1/2] RDMA/cxgb4: Serialize calls to cq's comp_handler by lock
Commit 01e7da6ba53ca4d6189a1eae45607c0331c871f2 introduced a potential
problem wherein the cq's comp_handler can get called simultaneously from
different places in the iw_cxgb4 driver. This does not comply with
Documentation/infiniband/core_locking.txt, which states that at any given
point of time there should be only one active callback per CQ.

This problem was reported by Parav Pandit <parav.pan...@emulex.com>. Based
on discussion between Parav Pandit and Steve Wise, this patch corrects the
above problem by serializing the calls to the cq's comp_handler using a
spin_lock.

Reported-by: Parav Pandit <parav.pan...@emulex.com>
Signed-off-by: Kumar Sanghvi <kuma...@chelsio.com>
---
 drivers/infiniband/hw/cxgb4/cq.c       |  1 +
 drivers/infiniband/hw/cxgb4/ev.c       | 10 ++++++++--
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h |  1 +
 drivers/infiniband/hw/cxgb4/qp.c       | 15 +++++++++++++--
 4 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index 901c5fb..f35a935 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -818,6 +818,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int entries,
 	chp->cq.size--;			/* status page */
 	chp->ibcq.cqe = entries - 2;
 	spin_lock_init(&chp->lock);
+	spin_lock_init(&chp->comp_handler_lock);
 	atomic_set(&chp->refcnt, 1);
 	init_waitqueue_head(&chp->wait);
 	ret = insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid);
diff --git a/drivers/infiniband/hw/cxgb4/ev.c b/drivers/infiniband/hw/cxgb4/ev.c
index c13041a..397cb36 100644
--- a/drivers/infiniband/hw/cxgb4/ev.c
+++ b/drivers/infiniband/hw/cxgb4/ev.c
@@ -42,6 +42,7 @@ static void post_qp_event(struct c4iw_dev *dev, struct c4iw_cq *chp,
 {
 	struct ib_event event;
 	struct c4iw_qp_attributes attrs;
+	unsigned long flag;

 	if ((qhp->attr.state == C4IW_QP_STATE_ERROR) ||
 	    (qhp->attr.state == C4IW_QP_STATE_TERMINATE)) {
@@ -72,7 +73,9 @@ static void post_qp_event(struct c4iw_dev *dev, struct c4iw_cq *chp,
 	if (qhp->ibqp.event_handler)
 		(*qhp->ibqp.event_handler)(&event, qhp->ibqp.qp_context);

+	spin_lock_irqsave(&chp->comp_handler_lock, flag);
 	(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
+	spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
 }

 void c4iw_ev_dispatch(struct c4iw_dev *dev, struct t4_cqe *err_cqe)
@@ -183,11 +186,14 @@ out:
 int c4iw_ev_handler(struct c4iw_dev *dev, u32 qid)
 {
 	struct c4iw_cq *chp;
+	unsigned long flag;

 	chp = get_chp(dev, qid);
-	if (chp)
+	if (chp) {
+		spin_lock_irqsave(&chp->comp_handler_lock, flag);
 		(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
-	else
+		spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
+	} else
 		PDBG("%s unknown cqid 0x%x\n", __func__, qid);
 	return 0;
 }
diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index 62cea0e..1357c5b 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -309,6 +309,7 @@ struct c4iw_cq {
 	struct c4iw_dev *rhp;
 	struct t4_cq cq;
 	spinlock_t lock;
+	spinlock_t comp_handler_lock;
 	atomic_t refcnt;
 	wait_queue_head_t wait;
 };
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index b59b56c..a391a4a 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -945,8 +945,11 @@ static void __flush_qp(struct c4iw_qp *qhp, struct c4iw_cq *rchp,
 	flushed = c4iw_flush_rq(&qhp->wq, &rchp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&rchp->lock, flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&rchp->comp_handler_lock, flag);
 		(*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context);
+		spin_unlock_irqrestore(&rchp->comp_handler_lock, flag);
+	}

 	/* locking hierarchy: cq lock first, then qp lock. */
 	spin_lock_irqsave(&schp->lock, flag);
@@ -956,13 +959,17 @@ static void __flush_qp(struct c4iw_qp *qhp, struct c4iw_cq *rchp,
 	flushed = c4iw_flush_sq(&qhp->wq, &schp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&schp->lock, flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&schp->comp_handler_lock, flag);
 		(*schp->ibcq.comp_handler)(&schp->ibcq, schp->ibcq.cq_context);
+		spin_unlock_irqrestore(&schp->comp_handler_lock, flag);
+	}
 }

 static void flush_qp(struct c4iw_qp *qhp)
 {
 	struct c4iw_cq *rchp, *schp;
+	unsigned long flag;

 	rchp = get_chp(qhp->rhp, qhp->attr.rcq);
 	schp = get_chp(qhp->rhp, qhp->attr.scq);
@@ -970,11 +977,15 @@ static void
[PATCH 2/2] RDMA/cxgb3: Serialize calls to cq's comp_handler by lock
iw_cxgb3 has a potential problem wherein the cq's comp_handler can get
called simultaneously from different places in the iw_cxgb3 driver. This
does not comply with Documentation/infiniband/core_locking.txt, which
states that at any given point of time there should be only one active
callback per CQ.

Such a problem was reported by Parav Pandit <parav.pan...@emulex.com> for
the iw_cxgb4 driver. Based on discussion between Parav Pandit and Steve
Wise, this patch corrects the above problem by serializing the calls to
the cq's comp_handler using a spin_lock.

Signed-off-by: Kumar Sanghvi <kuma...@chelsio.com>
---
 drivers/infiniband/hw/cxgb3/iwch_ev.c       |  6 ++++++
 drivers/infiniband/hw/cxgb3/iwch_provider.c |  1 +
 drivers/infiniband/hw/cxgb3/iwch_provider.h |  1 +
 drivers/infiniband/hw/cxgb3/iwch_qp.c       | 14 ++++++++++++--
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_ev.c b/drivers/infiniband/hw/cxgb3/iwch_ev.c
index 71e0d84..abcc9e7 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_ev.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_ev.c
@@ -46,6 +46,7 @@ static void post_qp_event(struct iwch_dev *rnicp, struct iwch_cq *chp,
 	struct ib_event event;
 	struct iwch_qp_attributes attrs;
 	struct iwch_qp *qhp;
+	unsigned long flag;

 	spin_lock(&rnicp->lock);
 	qhp = get_qhp(rnicp, CQE_QPID(rsp_msg->cqe));
@@ -94,7 +95,9 @@ static void post_qp_event(struct iwch_dev *rnicp, struct iwch_cq *chp,
 	if (qhp->ibqp.event_handler)
 		(*qhp->ibqp.event_handler)(&event, qhp->ibqp.qp_context);

+	spin_lock_irqsave(&chp->comp_handler_lock, flag);
 	(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
+	spin_unlock_irqrestore(&chp->comp_handler_lock, flag);

 	if (atomic_dec_and_test(&qhp->refcnt))
 		wake_up(&qhp->wait);
@@ -107,6 +110,7 @@ void iwch_ev_dispatch(struct cxio_rdev *rdev_p, struct sk_buff *skb)
 	struct iwch_cq *chp;
 	struct iwch_qp *qhp;
 	u32 cqid = RSPQ_CQID(rsp_msg);
+	unsigned long flag;

 	rnicp = (struct iwch_dev *) rdev_p->ulp;
 	spin_lock(&rnicp->lock);
@@ -170,7 +174,9 @@ void iwch_ev_dispatch(struct cxio_rdev *rdev_p, struct sk_buff *skb)
 		 */
 		if (qhp->ep && SQ_TYPE(rsp_msg->cqe))
 			dst_confirm(qhp->ep->dst);
+		spin_lock_irqsave(&chp->comp_handler_lock, flag);
 		(*chp->ibcq.comp_handler)(&chp->ibcq, chp->ibcq.cq_context);
+		spin_unlock_irqrestore(&chp->comp_handler_lock, flag);
 		break;
 	case TPT_ERR_STAG:
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index c7d9411..37c224f 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -190,6 +190,7 @@ static struct ib_cq *iwch_create_cq(struct ib_device *ibdev, int entries, int ve
 	chp->rhp = rhp;
 	chp->ibcq.cqe = 1 << chp->cq.size_log2;
 	spin_lock_init(&chp->lock);
+	spin_lock_init(&chp->comp_handler_lock);
 	atomic_set(&chp->refcnt, 1);
 	init_waitqueue_head(&chp->wait);
 	if (insert_handle(rhp, &rhp->cqidr, chp, chp->cq.cqid)) {
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h b/drivers/infiniband/hw/cxgb3/iwch_provider.h
index 9a342c9..87c14b0 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.h
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h
@@ -103,6 +103,7 @@ struct iwch_cq {
 	struct iwch_dev *rhp;
 	struct t3_cq cq;
 	spinlock_t lock;
+	spinlock_t comp_handler_lock;
 	atomic_t refcnt;
 	wait_queue_head_t wait;
 	u32 __user *user_rptr_addr;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index ecd313f..bea5839 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -822,8 +822,11 @@ static void __flush_qp(struct iwch_qp *qhp, struct iwch_cq *rchp,
 	flushed = cxio_flush_rq(&qhp->wq, &rchp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&rchp->lock, *flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&rchp->comp_handler_lock, *flag);
 		(*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context);
+		spin_unlock_irqrestore(&rchp->comp_handler_lock, *flag);
+	}

 	/* locking hierarchy: cq lock first, then qp lock. */
 	spin_lock_irqsave(&schp->lock, *flag);
@@ -833,8 +836,11 @@ static void __flush_qp(struct iwch_qp *qhp, struct iwch_cq *rchp,
 	flushed = cxio_flush_sq(&qhp->wq, &schp->cq, count);
 	spin_unlock(&qhp->lock);
 	spin_unlock_irqrestore(&schp->lock, *flag);
-	if (flushed)
+	if (flushed) {
+		spin_lock_irqsave(&schp->comp_handler_lock, *flag);
 		(*schp->ibcq.comp_handler)(&schp->ibcq, schp->ibcq.cq_context);
+
RE: [PATCH] An argument for allowing applications to manually send RMPP packets if desired
I'm not sure I understand the resistance to this patch. The patch does not break existing functionality and can demonstrably improve the performance of tools that can take advantage of it. What's wrong with implementing this patch now and working on an improved mechanism for the future?

-----Original Message-----
From: Hefty, Sean [mailto:sean.he...@intel.com]
Sent: Sunday, September 18, 2011 8:35 PM
To: Jason Gunthorpe; Mike Heinz
Cc: Roland Dreier; linux-rdma@vger.kernel.org; Todd Rimmer
Subject: RE: [PATCH] An argument for allowing applications to manually send RMPP packets if desired

Ultimately, I think the scalable/compatible answer is to move these RMPP workloads to a verbs QP, and we need to have a user space RMPP implementation for that anyhow. Many-to-one is never scalable. The applications simply cannot rely on every node querying the SA at the same time, especially for large amounts of data. And the overhead of MADs and RMPP is too high to be a useful solution to scaling.
Re: [PATCH 6/9] ib_srpt: Remove legacy use_port_guid_in_session_name module parameter
On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger <n...@linux-iscsi.org> wrote:
> This patch removes the legacy use_port_guid_in_session_name module
> parameter that is no longer required in modern ib_srpt code.

The patch looks fine to me, but the description could be improved: use_port_guid_in_session_name had been introduced in the past to work around a limitation of the Windows SRP initiator in multipath setups. Apparently, with recent Windows SRP initiator versions, setting that parameter is no longer necessary.

Bart.
Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
On Mon, 2011-10-24 at 18:34 +0200, Bart Van Assche wrote:
> On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger <n...@linux-iscsi.org> wrote:
>> This patch converts the srp_max_rdma_size module parameter into a per
>> endpoint configfs attribute. This includes adding the necessary bits
>> for show + store attributes w/ min/max bounds checking, and updating
>> srpt_get_ioc() to accept a struct srpt_port parameter.
>
> [ ... ]
>
>> -static void srpt_get_ioc(struct srpt_device *sdev, u32 slot,
>> +static void srpt_get_ioc(struct srpt_port *sport, u32 slot,
>> 			 struct ib_dm_mad *mad)
>
> The SRP spec says that there should be only one value for
> iocp->rdma_size per I/O controller. This patch breaks that rule. I'm
> not sure it's a good idea to introduce such behavior changes.

Hi Bart,

This is something that Roland asked me to pursue, so I'll have to defer to his judgment on whether he wants to kick this back out to a module parameter or not.

--nab
[PATCH V2 1/2] IB/qib: Hold links until tuning data is available
From: Mitko Haralanov <mi...@qlogic.com>

Hold the link state machine until the tuning data is read from the QSFP
EEPROM, so that the correct tuning settings are applied before the state
machine attempts to bring the link up. The link is also held on cable
unplug, in case a different cable is used.

This version avoids scheduling work on non-QSFP cards.

Signed-off-by: Mitko Haralanov <mi...@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marcinis...@qlogic.com>
---
 drivers/infiniband/hw/qib/qib_iba7322.c | 120 +++++++++++++++-----------
 drivers/infiniband/hw/qib/qib_init.c    |   4 -
 drivers/infiniband/hw/qib/qib_qsfp.c    |  25 ++++--
 drivers/infiniband/hw/qib/qib_qsfp.h    |   2 +
 4 files changed, 94 insertions(+), 57 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c
index ebc5266..e3c8ead 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -2380,17 +2380,17 @@ static int qib_7322_bringup_serdes(struct qib_pportdata *ppd)
 	ppd->cpspec->ibcctrl_a |= SYM_MASK(IBCCtrlA_0, IBLinkEn);
 	set_vls(ppd);

+	/* Hold the link state machine for mezz boards */
+	qib_set_ib_7322_lstate(ppd, 0,
+			       QLOGIC_IB_IBCC_LINKINITCMD_DISABLE);
+
 	/* be paranoid against later code motion, etc. */
 	spin_lock_irqsave(&dd->cspec->rcvmod_lock, flags);
 	ppd->p_rcvctrl |= SYM_MASK(RcvCtrl_0, RcvIBPortEnable);
 	qib_write_kreg_port(ppd, krp_rcvctrl, ppd->p_rcvctrl);
 	spin_unlock_irqrestore(&dd->cspec->rcvmod_lock, flags);

-	/* Hold the link state machine for mezz boards */
-	if (IS_QMH(dd) || IS_QME(dd))
-		qib_set_ib_7322_lstate(ppd, 0,
-				       QLOGIC_IB_IBCC_LINKINITCMD_DISABLE);
-
 	/* Also enable IBSTATUSCHG interrupt. */
 	val = qib_read_kreg_port(ppd, krp_errmask);
 	qib_write_kreg_port(ppd, krp_errmask,
@@ -5276,6 +5276,8 @@ static int qib_7322_ib_updown(struct qib_pportdata *ppd, int ibup, u64 ibcs)
 			     QIBL_IB_AUTONEG_INPROG)))
 			set_7322_ibspeed_fast(ppd, ppd->link_speed_enabled);
 		if (!(ppd->lflags & QIBL_IB_AUTONEG_INPROG)) {
+			struct qib_qsfp_data *qd =
+				&ppd->cpspec->qsfp_data;
 			/* unlock the Tx settings, speed may change */
 			qib_write_kreg_port(ppd, krp_tx_deemph_override,
 				SYM_MASK(IBSD_TX_DEEMPHASIS_OVERRIDE_0,
@@ -5283,6 +5285,12 @@ static int qib_7322_ib_updown(struct qib_pportdata *ppd, int ibup, u64 ibcs)
 			qib_cancel_sends(ppd);
 			/* on link down, ensure sane pcs state */
 			qib_7322_mini_pcs_reset(ppd);
+			/* schedule the qsfp refresh which should turn the link
+			   off */
+			if (ppd->dd->flags & QIB_HAS_QSFP) {
+				qd->t_insert = get_jiffies_64();
+				schedule_work(&qd->work);
+			}
 			spin_lock_irqsave(&ppd->sdma_lock, flags);
 			if (__qib_sdma_running(ppd))
 				__qib_sdma_process_event(ppd,
@@ -5638,38 +5646,62 @@ static void qsfp_7322_event(struct work_struct *work)
 	qd = container_of(work, struct qib_qsfp_data, work);
 	ppd = qd->ppd;
-	pwrup = qd->t_insert + msecs_to_jiffies(QSFP_PWR_LAG_MSEC);
+	pwrup = qd->t_insert +
+		msecs_to_jiffies(QSFP_PWR_LAG_MSEC - QSFP_MODPRS_LAG_MSEC);

-	/*
-	 * Some QSFP's not only do not respond until the full power-up
-	 * time, but may behave badly if we try. So hold off responding
-	 * to insertion.
-	 */
-	while (1) {
-		u64 now = get_jiffies_64();
-		if (time_after64(now, pwrup))
-			break;
-		msleep(20);
-	}
-	ret = qib_refresh_qsfp_cache(ppd, &qd->cache);
-	/*
-	 * Need to change LE2 back to defaults if we couldn't
-	 * read the cable type (to handle cable swaps), so do this
-	 * even on failure to read cable information. We don't
-	 * get here for QME, so IS_QME check not needed here.
-	 */
-	if (!ret && !ppd->dd->cspec->r1) {
-		if (QSFP_IS_ACTIVE_FAR(qd->cache.tech))
-			le2 = LE2_QME;
-		else if (qd->cache.atten[1] >= qib_long_atten &&
-			 QSFP_IS_CU(qd->cache.tech))
-			le2 = LE2_5m;
-		else
+	/* Delay for 20 msecs to allow ModPrs resistor to setup */
+	mdelay(QSFP_MODPRS_LAG_MSEC);
+
+	if (!qib_qsfp_mod_present(ppd))
+		/* Set the physical link to disabled */
+		qib_set_ib_7322_lstate(ppd, 0,
+
Re: OpenSM plugin patch request
On Oct 17, 2011, at 5:28 PM, Hal Rosenstock wrote:
> A couple of minor comments:
> 1. Why not move the OSM_EVENT_ID_LIGHT_SWEEP_DONE event to where LIGHT
>    SWEEP COMPLETE is indicated in osm_state_mgr.c? That's actually more
>    accurate as all the transactions have completed.
> 2. The new events should be added to the
>    osmeventplugin/src/osmeventplugin.c example.
>
> Thanks.
> -- Hal

Thanks for the comments. The attached osm_light_sweep.patch addresses the points above by moving the OSM_EVENT_ID_LIGHT_SWEEP_DONE event to be with its corresponding LIGHT SWEEP COMPLETE log message. Also, the light sweep START/DONE events are included in the example event plugin.

While adding to the example event plugin, I realized that the autoconf setup in 3.3.9 does not appear to propagate all user settings down into the default plugin. Consequently, I was not able to build the plugin when the OFED headers were not in the default location. In case it's useful (and hasn't already been addressed), the second attached patch adds $(OSMV_INCLUDES) to the automake directives for the plugin, which in my case was sufficient to resolve the necessary headers.

Thanks,
Karl

Attachments: osm_light_sweep.patch, build.patch
[PATCH 1/2] ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib
From: Nicholas Bellinger <n...@linux-iscsi.org>

This patch fixes a possible race with srp_sq_size in srpt_create_ch_ib(),
where changing sport->port_attrib.srp_sq_size via configfs could have
unintended consequences. It uses a local assignment for srp_sq_size to
ensure the values used for ib_create_cq() and qp_init->cap.max_send_wr
are consistent.

Reported-by: Bart Van Assche <bvanass...@acm.org>
Cc: Bart Van Assche <bvanass...@acm.org>
Cc: Roland Dreier <rol...@purestorage.com>
Signed-off-by: Nicholas A. Bellinger <n...@linux-iscsi.org>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c |  7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 9e1c5e0..e483c54 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2130,6 +2130,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 	struct ib_qp_init_attr *qp_init;
 	struct srpt_port *sport = ch->sport;
 	struct srpt_device *sdev = sport->sdev;
+	u32 srp_sq_size = sport->port_attrib.srp_sq_size;
 	int ret;

 	WARN_ON(ch->rq_size < 1);
@@ -2140,11 +2141,11 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 		goto out;

 	ch->cq = ib_create_cq(sdev->device, srpt_completion, NULL, ch,
-			      ch->rq_size + sport->port_attrib.srp_sq_size, 0);
+			      ch->rq_size + srp_sq_size, 0);
 	if (IS_ERR(ch->cq)) {
 		ret = PTR_ERR(ch->cq);
 		printk(KERN_ERR "failed to create CQ cqe= %d ret= %d\n",
-		       ch->rq_size + sport->port_attrib.srp_sq_size, ret);
+		       ch->rq_size + srp_sq_size, ret);
 		goto out;
 	}

@@ -2156,7 +2157,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 	qp_init->srq = sdev->srq;
 	qp_init->sq_sig_type = IB_SIGNAL_REQ_WR;
 	qp_init->qp_type = IB_QPT_RC;
-	qp_init->cap.max_send_wr = sport->port_attrib.srp_sq_size;
+	qp_init->cap.max_send_wr = srp_sq_size;
 	qp_init->cap.max_send_sge = SRPT_DEF_SG_PER_WQE;

 	ch->qp = ib_create_qp(sdev->pd, qp_init);
--
1.7.2.5
[PATCH 2/2] ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work
From: Nicholas Bellinger n...@linux-iscsi.org This patch fixes a possible race with srp_max_rsp_size in srpt_release_channel_work() when changing sport-port_attrib.srp_max_rsp_size via configfs could have unintended consequences. It uses a new srpt_rdma_ch-rsp_size and assign the value during srpt_cm_req_recv(), which is used in subsequent calls to srpt_free_ioctx_ring() to ensure consistency with the original srpt_alloc_ioctx_ring(). Reported-by: Bart Van Assche bvanass...@acm.org Cc: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/srpt/ib_srpt.c | 10 -- drivers/infiniband/ulp/srpt/ib_srpt.h |2 ++ 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index e483c54..f7174b3 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -2373,8 +2373,7 @@ static void srpt_release_channel_work(struct work_struct *w) srpt_free_ioctx_ring((struct srpt_ioctx **)ch-ioctx_ring, ch-sport-sdev, ch-rq_size, -ch-sport-port_attrib.srp_max_rsp_size, -DMA_TO_DEVICE); +ch-rsp_size, DMA_TO_DEVICE); spin_lock_irq(sdev-spinlock); list_del(ch-list); @@ -2554,12 +2553,12 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, spin_lock_init(ch-spinlock); ch-state = CH_CONNECTING; INIT_LIST_HEAD(ch-cmd_wait_list); + ch-rsp_size = ch-sport-port_attrib.srp_max_rsp_size; ch-ioctx_ring = (struct srpt_send_ioctx **) srpt_alloc_ioctx_ring(ch-sport-sdev, ch-rq_size, sizeof(*ch-ioctx_ring[0]), - ch-sport-port_attrib.srp_max_rsp_size, - DMA_TO_DEVICE); + ch-rsp_size, DMA_TO_DEVICE); if (!ch-ioctx_ring) goto free_ch; @@ -2667,8 +2666,7 @@ destroy_ib: free_ring: srpt_free_ioctx_ring((struct srpt_ioctx **)ch-ioctx_ring, ch-sport-sdev, ch-rq_size, -ch-sport-port_attrib.srp_max_rsp_size, -DMA_TO_DEVICE); +ch-rsp_size, DMA_TO_DEVICE); free_ch: kfree(ch); diff --git 
a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h index 1b607ae..068f66d 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.h +++ b/drivers/infiniband/ulp/srpt/ib_srpt.h @@ -277,6 +277,7 @@ enum rdma_ch_state { * @qp:IB queue pair used for communicating over this channel. * @cq:IB completion queue for this channel. * @rq_size: IB receive queue size. + * @rsp_size IB response message size in bytes. * @sq_wr_avail: number of work requests available in the send queue. * @sport: pointer to the information of the HCA port used by this * channel. @@ -307,6 +308,7 @@ struct srpt_rdma_ch { struct ib_qp*qp; struct ib_cq*cq; int rq_size; + u32 rsp_size; atomic_tsq_wr_avail; struct srpt_port*sport; u8 i_port_id[16]; -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
> +	struct se_portal_group *se_tpg,
> +	const char *page,
> +	size_t count)
> +{
> +	struct srpt_port *sport = container_of(se_tpg, struct srpt_port, port_tpg_1);
> +	unsigned long val;
> +	int ret;
> +
> +	ret = strict_strtoul(page, 0, &val);

If the data that page points at consists only of digits, the above
strict_strtoul() call will trigger a past-end-of-buffer read.

Also, isn't kstrto*() preferred over strict_strto*()?

Bart.
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
> On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> > +static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size(
> > +	struct se_portal_group *se_tpg,
> > +	const char *page,
> > +	size_t count)
> > +{
> > +	struct srpt_port *sport = container_of(se_tpg, struct srpt_port, port_tpg_1);
> > +	unsigned long val;
> > +	int ret;
> > +
> > +	ret = strict_strtoul(page, 0, &val);
>
> If the data that page points at consists only of digits, the above
> strict_strtoul() call will trigger a past-end-of-buffer read.

I don't understand what you mean here. Can you provide a test case to
demonstrate please..?

> Also, isn't kstrto*() preferred over strict_strto*()?

Not AFAIK.

--nab
Re: [PATCH 5/9] ib_srpt: Fix sport-port_guid formatting code
On Mon, Oct 24, 2011 at 7:33 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> +	snprintf(sport->port_guid, sizeof(sport->port_guid),
> +		 "0x%016llx",
> +		 be64_to_cpu(sport->gid.global.interface_id));

If I interpret Roland's e-mail correctly, Roland asked to use GID table
entry 0 (http://www.spinics.net/lists/linux-rdma/msg09751.html)?

Bart.
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
> > If the data that page points at consists only of digits, the above
> > strict_strtoul() call will trigger a past-end-of-buffer read.
>
> I don't understand what you mean here. Can you provide a test case to
> demonstrate please..?

echo -n 345 > $configfs_path_of_parameter

Bart.
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, 2011-10-24 at 21:58 +0200, Bart Van Assche wrote:
> On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> > I don't understand what you mean here. Can you provide a test case to
> > demonstrate please..?
>
> echo -n 345 > $configfs_path_of_parameter

Still not sure what you're getting at here..?

root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib# ls -la
total 0
drwxr-xr-x 2 root root    0 Oct 24 13:01 .
drwxr-xr-x 7 root root    0 Oct 24 12:58 ..
-rw-r--r-- 1 root root 4096 Oct 24 12:58 srp_max_rdma_size
-rw-r--r-- 1 root root 4096 Oct 24 13:01 srp_max_rsp_size
-rw-r--r-- 1 root root 4096 Oct 24 12:58 srp_sq_size
root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib# cat srp_max_rsp_size
256
root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib# echo -n 240 > srp_max_rsp_size
root@tifa:/sys/kernel/config/target/srpt/0x0002c903000e8acd/tpgt_1/attrib# cat srp_max_rsp_size
240
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, Oct 24, 2011 at 10:05 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> On Mon, 2011-10-24 at 21:58 +0200, Bart Van Assche wrote:
> > echo -n 345 > $configfs_path_of_parameter
>
> Still not sure what you're getting at here..?

Only the data in page[0..count-1] is guaranteed to be initialized.
strict_strtoul() will read until it either finds whitespace or a binary
zero, so if the data in page[] contains neither whitespace nor a binary
zero then strict_strtoul() will read past the end of the data in
page[]. The byte at page[count] may hold anything, including a valid
digit.

Bart.
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, Oct 24, 2011 at 9:49 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> On Mon, 2011-10-24 at 21:44 +0200, Bart Van Assche wrote:
> > Also, isn't kstrto*() preferred over strict_strto*()?
>
> Not AFAIK.

This is what I found in the description of commit 33ee3b2:

    kstrto*: converting strings to integers done (hopefully) right

    1. simple_strto*() do not contain overflow checks and crufty,
       libc way to indicate failure.
    2. strict_strto*() also do not have overflow checks but the name
       and comments pretend they do.
    3. Both families have only "long long" and "long" variants, but
       users want strtou8().
    4. Both "simple" and "strict" prefixes are wrong: "simple" doesn't
       exactly say what's so simple, "strict" should not exist because
       conversion should be strict by default.

    The solution is to use "k" prefix and add convertors for more
    types. Enter kstrtoull() ...

Bart.
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, 2011-10-24 at 22:11 +0200, Bart Van Assche wrote:
> Only the data in page[0..count-1] is guaranteed to be initialized.
> strict_strtoul() will read until it either finds whitespace or a
> binary zero, so if the data in page[] contains neither whitespace nor
> a binary zero then strict_strtoul() will read past the end of the
> data in page[]. The byte at page[count] may hold anything, including
> a valid digit.

That part is obvious. The point you're missing is that configfs is
already sanitizing the incoming buffer in fs/configfs/file.c to work
with strict_strtoul() here:

static int
fill_write_buffer(struct configfs_buffer *buffer, const char __user *buf, size_t count)
{
	int error;

	if (!buffer->page)
		buffer->page = (char *)__get_free_pages(GFP_KERNEL, 0);
	if (!buffer->page)
		return -ENOMEM;

	if (count >= SIMPLE_ATTR_SIZE)
		count = SIMPLE_ATTR_SIZE - 1;
	error = copy_from_user(buffer->page, buf, count);
	buffer->needs_read_fill = 1;
	/* if buf is assumed to contain a string, terminate it by \0,
	 * so e.g. sscanf() can scan the string easily */
	buffer->page[count] = 0;
	return error ? -EFAULT : count;
}
Re: [PATCH 8/9] ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
On Mon, 2011-10-24 at 22:16 +0200, Bart Van Assche wrote:
> This is what I found in the description of commit 33ee3b2:
>
>     kstrto*: converting strings to integers done (hopefully) right
>
>     [...]
>
>     The solution is to use "k" prefix and add convertors for more
>     types. Enter kstrtoull() ...

Fair enough. I'll convert this separately after the merge, considering
it's a very minor improvement and not a bugfix.

--nab
Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
On Mon, 2011-10-24 at 05:33 +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger n...@linux-iscsi.org
>
> This patch converts the srp_max_rdma_size module parameter into a per
> endpoint configfs attribute. This includes adding the necessary bits
> for show + store attributes w/ min/max bounds checking, and updating
> srpt_get_ioc() to accept a struct srpt_port parameter.
>
> Reported-by: Roland Dreier rol...@purestorage.com
> Cc: Roland Dreier rol...@purestorage.com
> Cc: Bart Van Assche bvanass...@acm.org
> Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org
> ---
>  drivers/infiniband/ulp/srpt/ib_srpt.c | 60 -
>  drivers/infiniband/ulp/srpt/ib_srpt.h | 10
>  2 files changed, 61 insertions(+), 9 deletions(-)

<SNIP>

> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h
> index 59ee2d7..c9caa90 100644
> --- a/drivers/infiniband/ulp/srpt/ib_srpt.h
> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
> @@ -114,6 +114,7 @@ enum {
>  	MIN_SRPT_SRQ_SIZE = 4,
>  	DEFAULT_SRPT_SRQ_SIZE = 4095,
>  	MAX_SRPT_SRQ_SIZE = 65535,
> +	MAX_SRPT_RDMA_SIZE = 256U,

Hey btw, what should the proper MAX_SRPT_RDMA_SIZE be here..?

--nab
Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1
On Mon, Oct 24, 2011 at 11:33 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:
> - Handle IB completion timeouts. Although the InfiniBand Architecture
>   Manual specifies that a HCA must generate an error completion for
>   each pending work item when a queue pair reset is issued, an
>   important class of HCAs doesn't do this (that includes the HCA in
>   your test setup). In the SCST version of this driver such timeouts
>   are handled by the function srpt_pending_cmd_timeout().

Hold on... not sure what you mean by a "queue pair reset" but the IB
spec does not say that a flush error should be generated when a QP is
transitioned to the reset state. The flush error completions are
required when a QP is transitioned to the error state, and in fact
completion entries may be lost when a QP transitions to reset.

As far as I know every HCA supported by Linux does implement this
correctly. Which class did you have in mind as not doing that?

 - R.
Re: [PATCH 07/10] RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues.
From: Vipul Pandya vi...@chelsio.com
Date: Mon, 24 Oct 2011 20:46:31 +0530

> 1. We would like to recommend that all the patches get included in
>    Roland's infiniband tree since it has build dependencies.

This is fine with me. It just means that Roland's tree has to have
net-next included in it already, because otherwise the paths to the
cxgb4 driver under drivers/net won't be right.