Re: NULL pointer deref in k.o/for-4.5
On Thu, Jan 7, 2016 at 1:31 AM, Chuck Lever wrote: >> On Jan 6, 2016, at 5:25 PM, Or Gerlitz wrote: >> On Wed, Jan 6, 2016 at 9:20 PM, Chuck Lever >> was reported here 2-3 times, this fixes that >> https://patchwork.kernel.org/patch/7929551 > Confirmed, that fixes it. Thanks, I never would have > guessed that was the fix. tried linux-rdma mailing list search on people failing with the for-4.5 bits? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: NULL pointer deref in k.o/for-4.5
On Wed, Jan 6, 2016 at 9:20 PM, Chuck Lever wrote: > And appears to be 100% reproducible. Any debugging > advice welcome! was reported here 2-3 times, this fixes that https://patchwork.kernel.org/patch/7929551 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv3 TRIVIAL] IB/core: Documentation fix to ib_mad_snoop_handler in the MAD header file
On Tue, Jan 5, 2016 at 9:00 PM, Hefty, Sean wrote: >> > There are no in tree users and the only attempt at adding one was >> rejected. >> >> There are no in tree users of this but there is your madeye tool (which >> is out of tree). This is still a useful debug tool for MADs and there >> are people who still use that. > > It's an out of tree tool. Maintain the snooping hooks into the mad layer out > of tree as well. Any real reason not to pick this good tool into the kernel code? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup
On Tue, Jan 5, 2016 at 6:46 PM, Steve Wise wrote: > Hey Doug, I don't see this branch. Which branch has the accepted device attr > change? k.o/for-4.5 on Doug's kernel.org tree -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix
On 1/4/2016 10:44 PM, Hal Rosenstock wrote: ib_mad_snoop_handler uses send_buf rather than send_wr Signed-off-by: Hal Rosenstock Please use higher language in commit titles e.g IB/core: Documentation fix in the MAD header file --- Change since v1: Fixed typo in patch description diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index ec9b44d..2b3573d 100644 --- a/include/rdma/ib_mad.h +++ b/include/rdma/ib_mad.h @@ -424,11 +424,11 @@ typedef void (*ib_mad_send_handler)(struct ib_mad_agent *mad_agent, /** * ib_mad_snoop_handler - Callback handler for snooping sent MADs. * @mad_agent: MAD agent that snooped the MAD. - * @send_wr: Work request information on the sent MAD. + * @send_buf: send MAD data buffer. * @mad_send_wc: Work completion information on the sent MAD. Valid * only for snooping that occurs on a send completion. * - * Clients snooping MADs should not modify data referenced by the @send_wr + * Clients snooping MADs should not modify data referenced by the @send_buf * or @mad_send_wc. */ typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent, -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html __ This email has been scanned by the Symantec Email Security.cloud service. For more information please visit http://www.symanteccloud.com __ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/10] iSER support for remote invalidate
On Wed, Dec 23, 2015 at 10:53 PM, Or Gerlitz wrote: > On Mon, Dec 21, 2015 at 3:29 PM, Sagi Grimberg > wrote: > >>> Sagi, are you going to spin an increment in the initiator version? > >> I don't know if it's worth a driver version update? > > yes, I think it does. Finally we're getting real close to the visions > of the spec authors... do you have some performance data showing the > gain of remote invalidation on certain HWs and IO benchmarks patterns. > Did we make sure to interoperate between patched initiator to > unpatched LIO or other targets which don't have this now (SCST, TGT) > and the other way around (unpatched initiator, patched LIO)? Sagi? >> In case we do increment, I can send an incremental patch instead of >> re-spinning the entire series. > > lets do that. lets -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/sysfs: Fix sparse warning on attr_id
On Mon, Jan 4, 2016 at 5:44 AM, wrote: > From: Ira Weiny > > Attributed ID was declared as an int while the value should really be big > endian 16. > > Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c") > remove this blank line (&& I provided you this comment multiple times in the past) > Reported-by: Bart Van Assche > Signed-off-by: Ira Weiny reverse the order, your S.O.B sig line should come 1st and later the rest of the reporters/ackers/testers/maintainers notes > --- > drivers/infiniband/core/sysfs.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c > index 2daf83276cdf..3de93517efe4 100644 > --- a/drivers/infiniband/core/sysfs.c > +++ b/drivers/infiniband/core/sysfs.c > @@ -77,7 +77,7 @@ struct port_table_attribute { > struct port_attribute attr; > charname[8]; > int index; > - int attr_id; > + __be16 attr_id; > }; > > static ssize_t port_attr_show(struct kobject *kobj, > @@ -413,7 +413,7 @@ struct port_table_attribute port_pma_attr_ext_##_name = { > \ > * Get a Perfmgmt MAD block of data. > * Returns error code or the number of bytes retrieved. > */ > -static int get_perf_mad(struct ib_device *dev, int port_num, int attr, > +static int get_perf_mad(struct ib_device *dev, int port_num, __be16 attr, > void *data, int offset, size_t size) > { > struct ib_mad *in_mad; > -- > 1.8.2 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/ipoib: Expose ioctl command to retrieve SGID of a given socket
On 12/31/2015 4:41 PM, Yuval Shaia wrote: To support security applications, that need to filter out connections based on SGID, an ioctl command to retrieve SGID of a given socket is added. [...] + +found: + if (!(neigh->nud_state & NUD_VALID)) + return -EINVAL; + + gid = (union ib_gid *)(neigh->ha + 4); + *sgid = be64_to_cpu(gid->global.interface_id); + *subnet_prefix = be64_to_cpu(gid->global.subnet_prefix); wait (1st) the neighbour holds a destination address, not source address, so why are you talking on SGID?! wait (2nd) what prevents you from getting this info in user space through netlink from the kernel rtnl routing/neighbour services? root@r-dcs54 ~]# ip r s | grep 192.168.20.0/24 192.168.20.0/24 dev ib0 proto kernel scope link src 192.168.20.17 [root@r-dcs54 ~]# ip n s | grep ib0 192.168.20.18 dev ib0 lladdr 80:00:00:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:01:da:81 DELAY -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
On 12/30/2015 1:04 PM, Moni Shoua wrote: The last hunk that you removed had a role and was by no means dead-code, right? so... (1) why it's correct to remove it? (2) if you want to introduce different way to implement what was done here, why in this patch? maybe add pre-patch for that In a way you are right. This hunk does not insert a bug and even improves correctness but it acutally belongs to an earlier patch (dbf727de7440f73c4b92be4b958cbc24977e8ca2 IB/core: Use GID table in AH creation and dmac resolution) so what's the plan here? avoid deleting it? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 crash during boot
On 12/30/2015 2:04 PM, Bart Van Assche wrote: Hello Christoph, Can you check whether the branch in the subject of this e-mail works fine on your setup (commit 59caaed7a7) ? On my test setup (Dell R430 with two ConnectX-3 adapters) this branch crashes during boot in get_counter_table() (see also the attached screenshot). Can you please check with Hal's fix that was posted here 1-2 days ago, hopefully this should make your system to work Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V3 00/11] Add RoCE v2 support
On 12/30/2015 12:48 PM, Matan Barak wrote: On 12/30/2015 8:04 AM, Or Gerlitz wrote: Hi Matan, I see these two smatch complaints on code added with this series, can you please take a look? drivers/infiniband/core/addr.c:503 rdma_resolve_ip_route() warn: variable dereferenced before check 'src_addr' (see line 500) drivers/infiniband/core/cma_configfs.c:172 make_cma_ports() warn: double check that we're allocating correct size: 8 vs 128 I'll send fixes for both of them. Thanks for posting this. when the same smatch runs on older gcc, it produces more warnings, are they false-positives? drivers/infiniband/core/cma_configfs.c: In function ?default_roce_mode_store?: drivers/infiniband/core/cma_configfs.c:123: warning: ?cma_dev? may be used uninitialized in this function drivers/infiniband/core/cma_configfs.c:124: warning: ?group? may be used uninitialized in this function drivers/infiniband/core/cma_configfs.c: In function ?default_roce_mode_show?: drivers/infiniband/core/cma_configfs.c:102: warning: ?cma_dev? may be used uninitialized in this function drivers/infiniband/core/cma_configfs.c:103: warning: ?group? may be used uninitialized in this function -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support
On 12/30/2015 10:27 AM, Matan Barak wrote: On 12/29/2015 5:19 PM, Or Gerlitz wrote: On 12/29/2015 3:24 PM, Matan Barak wrote: @@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE; MLX4_GET(dev_cap->bmme_flags, outbox, QUERY_DEV_CAP_BMME_FLAGS_OFFSET); +if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2) +dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2; Did you make sure that the query dev cap wrapper unsets this bit when proxing VF queries? In mlx4_dev_cap: if (mlx4_is_mfunc(dev)) { dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT; dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2; mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n"); } mlx4_slave_cap calls mlx4_dev_cap and uses the dev_caps it queried, so we should be safe here. mlx4_slave_cap is part of the Linux VF driver flow, right? So... NO, this is the Linux implementation. You should make things robust against any guest driver. The only way to do that is patch the command wrapper used by the PF to filter out unwanted cap bits, see other filtering we do in mlx4_QUERY_DEV_CAP_wrapper Or. if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP) dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP; MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET); -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2
On 12/30/2015 10:23 AM, Matan Barak wrote: int mlx4_CONFIG_DEV_wrapper(struct mlx4_dev *dev, int slave, struct mlx4_vhcr *vhcr, struct mlx4_cmd_mailbox *inbox, struct mlx4_cmd_mailbox *outbox, struct mlx4_cmd_info *cmd) { int err; u8 get = vhcr->op_modifier; if (get != 1) return -EPERM; err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd); return err; } Only "get" is permitted in multi-function setups. good, thanks for clarifying this out. Anyway, mlx4_config_roce_v2_port is not called for these setups because of this condition: if (mlx4_is_mfunc(dev)) { dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT; dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2; mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n"); } wrong again, you assume your Linux VF driver, but the VM can run other driver. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V3 00/11] Add RoCE v2 support
Hi Matan, I see these two smatch complaints on code added with this series, can you please take a look? drivers/infiniband/core/addr.c:503 rdma_resolve_ip_route() warn: variable dereferenced before check 'src_addr' (see line 500) drivers/infiniband/core/cma_configfs.c:172 make_cma_ports() warn: double check that we're allocating correct size: 8 vs 128 Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 2/3] IB/core: Change per-entry lock in RoCE GID table to one lock
On 10/28/2015 4:52 PM, Matan Barak wrote: @@ -134,16 +138,14 @@ static int write_gid(struct ib_device *ib_dev, u8 port, { int ret = 0; struct net_device *old_net_dev; - unsigned long flags; /* in rdma_cap_roce_gid_table, this funciton should be protected by a * sleep-able lock. */ - write_lock_irqsave(&table->data_vec[ix].lock, flags); if (rdma_cap_roce_gid_table(ib_dev, port)) { table->data_vec[ix].props |= GID_TABLE_ENTRY_INVALID; - write_unlock_irqrestore(&table->data_vec[ix].lock, flags); + write_unlock_irq(&table->rwlock); /* GID_TABLE_WRITE_ACTION_MODIFY currently isn't supported by * RoCE providers and thus only updates the cache. */ @@ -153,7 +155,7 @@ static int write_gid(struct ib_device *ib_dev, u8 port, else if (action == GID_TABLE_WRITE_ACTION_DEL) ret = ib_dev->del_gid(ib_dev, port, ix, &table->data_vec[ix].context); - write_lock_irqsave(&table->data_vec[ix].lock, flags); + write_lock_irq(&table->rwlock); } sparse complains on drivers/infiniband/core/cache.c:186:17: warning: context imbalance in 'write_gid' - unexpected unlock is this false positive? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V3 10/11] IB/core: Initialize UD header structure with IP and UDP headers
On 12/23/2015 2:56 PM, Matan Barak wrote: +__be16 ib_ud_ip4_csum(struct ib_ud_header *header) +{ + struct iphdr iph; + + iph.ihl = 5; + iph.version = 4; + iph.tos = header->ip4.tos; + iph.tot_len = header->ip4.tot_len; + iph.id = header->ip4.id; + iph.frag_off= header->ip4.frag_off; + iph.ttl = header->ip4.ttl; + iph.protocol= header->ip4.protocol; + iph.check = 0; + iph.saddr = header->ip4.saddr; + iph.daddr = header->ip4.daddr; + + return ip_fast_csum((u8 *)&iph, iph.ihl); +} +EXPORT_SYMBOL(ib_ud_ip4_csum); You have introduced here this sparse warning, please fix drivers/infiniband/core/ud_header.c:299:28: warning: incorrect type in return expression (different base types) drivers/infiniband/core/ud_header.c:299:28:expected restricted __be16 drivers/infiniband/core/ud_header.c:299:28:got restricted __sum16 Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
On Tue, Dec 29, 2015 at 3:24 PM, Matan Barak wrote: > @@ -2413,34 +2442,27 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, > struct ib_ud_wr *wr, > > if (is_eth) { > struct in6_addr in6; > - > + u16 ether_type; > u16 pcp = (be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 29) > << 13; > > + ether_type = (!is_udp) ? MLX4_IB_IBOE_ETHERTYPE : > + (ip_version == 4 ? ETH_P_IP : ETH_P_IPV6); > + > mlx->sched_prio = cpu_to_be16(pcp); > > + ether_addr_copy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac); > memcpy(sqp->ud_header.eth.dmac_h, ah->av.eth.mac, 6); > - /* FIXME: cache smac value? */ > memcpy(&ctrl->srcrb_flags16[0], ah->av.eth.mac, 2); > memcpy(&ctrl->imm, ah->av.eth.mac + 2, 4); > memcpy(&in6, sgid.raw, sizeof(in6)); > > - if (!mlx4_is_mfunc(to_mdev(ib_dev)->dev)) { > - u64 mac = > atomic64_read(&to_mdev(ib_dev)->iboe.mac[sqp->qp.port - 1]); > - u8 smac[ETH_ALEN]; > - > - mlx4_u64_to_smac(smac, mac); > - memcpy(sqp->ud_header.eth.smac_h, smac, ETH_ALEN); > - } else { > - /* use the src mac of the tunnel */ > - memcpy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac, > ETH_ALEN); > - } > The last hunk that you removed had a role and was by no means dead-code, right? so... (1) why it's correct to remove it? (2) if you want to introduce different way to implement what was done here, why in this patch? maybe add pre-patch for that -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support
On 12/29/2015 3:24 PM, Matan Barak wrote: Advertise RoCE support in port_immutable according to the hardware capabilities. This enables the verbs stack to use RoCE v2 mode. Advertise RoCE V2 support Signed-off-by: Matan Barak I guess you wanted "IB/mlx4: Advertise RoCE V2 support" for the patch title? since we did advertise RDMA_CORE_PORT_IBA_ROCE prior to this patch. Or. --- drivers/infiniband/hw/mlx4/main.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 44e5699..8cf2575 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -2183,6 +2183,7 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num, struct ib_port_immutable *immutable) { struct ib_port_attr attr; + struct mlx4_ib_dev *mdev = to_mdev(ibdev); int err; err = mlx4_ib_query_port(ibdev, port_num, &attr); @@ -2192,10 +2193,15 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num, immutable->pkey_tbl_len = attr.pkey_tbl_len; immutable->gid_tbl_len = attr.gid_tbl_len; - if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND) + if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND) { immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB; - else - immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE; + } else { + if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) + immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE; + if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) + immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE | + RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP; + } immutable->max_mad_size = IB_MGMT_MAD_SIZE; -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support
On 12/29/2015 3:24 PM, Matan Barak wrote: @@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE; MLX4_GET(dev_cap->bmme_flags, outbox, QUERY_DEV_CAP_BMME_FLAGS_OFFSET); + if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2) + dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2; Did you make sure that the query dev cap wrapper unsets this bit when proxing VF queries? if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP) dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP; MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET); -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 2/7] IB/mlx4: Add RoCE per GID support for add_gid and del_gid
On 12/29/2015 3:24 PM, Matan Barak wrote: [...] We use a new firmware command in order to populate the GID table and store the type along with the GID value. Its a new value to existing command.. so better say we use a new value to the SET_PORT firmware command to do X Also here, break out mlx4_core new functionality e.g the changes to include/linux/mlx4/cmd.h into mlx4_core only patch. You don't need any change to mlx4_core to have it's own patch, I guess one up to three mlx4 core patches would be OK. Did you make sure (at the resource tracker) that VFs can't do this new set port command flavor? Also find some spot to put blank line in the change-log, it's hard to read this way. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] IB/mlx5: Unify CQ create flags check
On 12/29/2015 4:41 PM, Leon Romanovsky wrote: From: Leon Romanovsky The create_cq() can receive creation flags which were used differently by two commits which added create_cq extended command and cross-channel. The merged code caused to not accept any flags at all. This patch unifies the check into one function and one return error code. Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command") Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support") --- Changes from v1: * Remove links to linux-rdma from commit message * Placed change log under git comment section (---) Changes from v0: * Add Fixes tag Signed-off-by: Leon Romanovsky wrong placing. Needs to be before the 1st --- and w.o blank lines after the Fixes: lines please Please use dry runs to get this to run (...) correctly --- drivers/infiniband/hw/mlx5/cq.c | 9 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++- 2 files changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index b14316603e44..7ddc790b1819 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -757,10 +757,6 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq) mlx5_db_free(dev->mdev, &cq->db); } -enum { - CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION -}; - struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, const struct ib_cq_init_attr *attr, struct ib_ucontext *context, @@ -778,13 +774,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, int eqn; int err; - if (check_cq_create_flags(attr->flags)) - return ERR_PTR(-EINVAL); - if (entries < 0) return ERR_PTR(-EINVAL); - if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED) + if (check_cq_create_flags(attr->flags)) return ERR_PTR(-EOPNOTSUPP); entries = roundup_pow_of_two(entries + 1); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index d4b227126265..fbf14a768105 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -692,6 +692,7 @@ static inline u32 check_cq_create_flags(u32 flags) * It returns non-zero value for unsupported CQ * create flags, otherwise it returns zero. */ - return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN); + return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN | + IB_CQ_FLAGS_TIMESTAMP_COMPLETION)); } #endif /* MLX5_IB_H */ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2
On 12/29/2015 3:24 PM, Matan Barak wrote: The mlx4 driver uses a special QP to implement the GSI QP. This kind of QP allows to build the InfiniBand headers in SW to be put before the payload that comes in with the WR. The mlx4 HW builds the packet, calculates the ICRC and puts it at the end of the payload. This ICRC calculation however depends on the QP configuration which is determined when QP is modified (roce_mode during INIT->RTR). On the other hand, ICRC verification when packet is received does to depend on this configuration. I don't understand the part of the sentence saying "when packet is received does to depend on this configuration" maybe some typo/s there? Therefore, using 2 GSI QPs for send (one for each RoCE version) and 1 GSI QP for receive are required. s/2/two/ and s/1/one/ please Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2
On 12/29/2015 3:24 PM, Matan Barak wrote: From: Moni Shoua Some mlx4 adapters are RoCEv2 capable. To enable this feature some hardware configuration is required. This is 1. Set port general parameters 2. Configure the outgoing UDP destination port 3. Configure the QP that work with RoCEv2 Signed-off-by: Moni Shoua --- drivers/infiniband/hw/mlx4/main.c | 19 ++--- drivers/infiniband/hw/mlx4/qp.c | 35 --- drivers/net/ethernet/mellanox/mlx4/fw.c | 16 +- drivers/net/ethernet/mellanox/mlx4/mlx4.h | 7 +-- drivers/net/ethernet/mellanox/mlx4/port.c | 8 +++ drivers/net/ethernet/mellanox/mlx4/qp.c | 28 + include/linux/mlx4/device.h | 1 + include/linux/mlx4/qp.h | 15 +++-- include/rdma/ib_verbs.h | 2 ++ 9 files changed, 120 insertions(+), 11 deletions(-) Better put (please do...) functionality which is plain mlx4 corish (such as new/modified FW commands, new SW/FW fields of structs and such) into mlx4_core patch. diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 988fa33..44e5699 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -384,6 +384,7 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev, int i; int ret; unsigned long flags; + struct ib_gid_attr attr; if (port_num > MLX4_MAX_PORTS) return -EINVAL; @@ -394,10 +395,13 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev, if (!rdma_cap_roce_gid_table(&ibdev->ib_dev, port_num)) return index; - ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, NULL); + ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, &attr); if (ret) return ret; + if (attr.ndev) + dev_put(attr.ndev); + if (!memcmp(&gid, &zgid, sizeof(gid))) return -EINVAL; @@ -405,7 +409,8 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev, port_gid_table = &iboe->gids[port_num - 1]; for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i) - if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid))) { + if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid)) && + attr.gid_type == port_gid_table->gids[i].gid_type) { ctx = port_gid_table->gids[i].ctx; break; } @@ -2481,7 +2486,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) if (mlx4_ib_init_sriov(ibdev)) goto err_mad; - if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) { + if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE || + dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) { if (!iboe->nb.notifier_call) { iboe->nb.notifier_call = mlx4_ib_netdev_event; err = register_netdevice_notifier(&iboe->nb); @@ -2490,6 +2496,13 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) goto err_notif; } } + if (!mlx4_is_slave(dev) && + dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) { + err = mlx4_config_roce_v2_port(dev, ROCE_V2_UDP_DPORT); + if (err) { + goto err_notif; + } + } } for (j = 0; j < ARRAY_SIZE(mlx4_class_attributes); ++j) { diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 8d28059..c0dee79 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1508,6 +1508,24 @@ static int create_qp_lb_counter(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp) return 0; } +enum { + MLX4_QPC_ROCE_MODE_1 = 0, + MLX4_QPC_ROCE_MODE_2 = 2, + MLX4_QPC_ROCE_MODE_MAX = 0xff +}; + +static u8 gid_type_to_qpc(enum ib_gid_type gid_type) +{ + switch (gid_type) { + case IB_GID_TYPE_ROCE: + return MLX4_QPC_ROCE_MODE_1; + case IB_GID_TYPE_ROCE_UDP_ENCAP: + return MLX4_QPC_ROCE_MODE_2; + default: + return MLX4_QPC_ROCE_MODE_MAX; + } +} + static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, const struct ib_qp_attr *attr, int attr_mask, enum ib_qp_state cur_state, enum ib_qp_state new_state) @@ -1651,9 +1669,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, u16 vlan = 0x; u8 smac[ETH_ALEN]; int status = 0; + int is_eth = rdma_cap_eth_ah(&dev->ib_dev, port_num) && + attr->ah_attr.ah_flags & IB_AH_GRH; - if (rdma_cap_eth_ah(&dev->ib_dev, port_nu
Re: [PATCH for-next 4/7] net/mlx4_core: Add handlning of RoCE v2 over IPV4 in attach_flow
On 12/29/2015 3:24 PM, Matan Barak wrote: From: Maor Gottlieb s/handlning/handling/ When attaching multicast for RoCE v2, we need to be able to steer packets to the QPs. Hence, we add support for IPV4 over IB steering. not sure to follow on the change-log, can you clarify it little further... Signed-off-by: Maor Gottlieb --- drivers/net/ethernet/mellanox/mlx4/mcg.c | 14 -- include/linux/mlx4/device.h | 6 ++ 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/mcg.c b/drivers/net/ethernet/mellanox/mlx4/mcg.c index 1d4e2e0..834e60e 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mcg.c +++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c @@ -858,7 +858,9 @@ static int parse_trans_rule(struct mlx4_dev *dev, struct mlx4_spec_list *spec, break; case MLX4_NET_TRANS_RULE_ID_IB: - rule_hw->ib.l3_qpn = spec->ib.l3_qpn; + rule_hw->ib.l3_qpn = spec->ib.l3_qpn | + (spec->ib.roce_type == MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4 ? +(__force __be32)0x80 : (__force __be32)0); maybe avoid using hard coded constants and get meaningful name for them? rule_hw->ib.qpn_mask = spec->ib.qpn_msk; memcpy(&rule_hw->ib.dst_gid, &spec->ib.dst_gid, 16); memcpy(&rule_hw->ib.dst_gid_msk, &spec->ib.dst_gid_msk, 16); @@ -1384,10 +1386,18 @@ int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, struct mlx4_qp *qp, memcpy(spec.eth.dst_mac_msk, &mac_mask, ETH_ALEN); break; + case MLX4_PROT_IB_IPV4: + spec.id = MLX4_NET_TRANS_RULE_ID_IB; + memcpy(spec.ib.dst_gid + 12, gid + 12, 4); + memset(spec.ib.dst_gid_msk + 12, 0xff, 4); + spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4; + break; + case MLX4_PROT_IB_IPV6: spec.id = MLX4_NET_TRANS_RULE_ID_IB; memcpy(spec.ib.dst_gid, gid, 16); - memset(&spec.ib.dst_gid_msk, 0xff, 16); + memset(spec.ib.dst_gid_msk, 0xff, 16); + spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6; break; default: return -EINVAL; diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 0d873f1ae..cdc75b2 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -391,6 +391,11 @@ enum mlx4_protocol { MLX4_PROT_FCOE }; +enum mlx4_flow_roce_type { + MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6 = 0, + MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4 +}; + enum { MLX4_MTT_FLAG_PRESENT = 1 }; @@ -1197,6 +1202,7 @@ struct mlx4_spec_ipv4 { struct mlx4_spec_ib { __be32 l3_qpn; __be32 qpn_msk; + enummlx4_flow_roce_type roce_type; u8 dst_gid[16]; u8 dst_gid_msk[16]; }; -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] IB/core: sysfs.c: Fix PerfMgt ClassPortInfo handling
On 12/29/2015 12:43 PM, Hal Rosenstock wrote: Port number is not part of ClassPortInfo attribute but is still needed as a parameter when invoking process_mad. Please remove the blank line above your 1st sentence. To properly handle this attribute, port_num is added as a parameter to get_counter_table and get_perf_mad was changed not to store port_num in the attribute itself when it's querying the ClassPortInfo attribute. This handles issue pointed out by Matan Barak Fixes: 145d9c541032 ('IB/core: Display extended counter set if available') Signed-off-by: Hal Rosenstock again, remove the blank line after the fixes tag. Also, I am not that the way Doug is setting the branch for pull would preserve commit IDs when the offending patch landed in Linus tree. If this is the case, we should put your patch in 2nd pull request and have the right commit ID there. Please check with Doug. Acked-by: Matan Barak Acked-by: Ira Weiny --- Change from v1: Added fixes line to description So this patch makes mlx4 IB driver on Eth ports workable with the 4.5-rc1 proposed bits? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rdma-next V2 00/32] Soft-RoCE driver
On Thu, Dec 24, 2015 at 12:02 PM, Christoph Hellwig wrote: > On Thu, Dec 24, 2015 at 11:17:46AM +0200, Kamal Heib wrote: >> We've located the driver in the staging subtree. This follows a requirement >> to implement an IB transport library - Soft RoCE is in the same boat like >> the hfi1 >> driver. We need to define and implement a lib to prevent those code >> duplications. > > Given the trainwreck that the staging process is it might seems more > sensible to get it into a stage and then merge it directly. You'll > probably save yourself a lot of work that way. I am not sure what you mean by "get it into a stage and then merge it directly" --i is that not go through staging at all? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table
On 12/24/2015 12:42 PM, Sagi Grimberg wrote: This patch seems to generate a list corruption [1] when I test with Doug's for-4.5 tree. Eran, care to take a look at this? This patch is part from a series that was introduced in 4.3-rc1 [1], Then something else broke it. Can people check their patches on doug's tree? At the moment it's unusable... Yes, I checked the branch up to commit 882f3b3 "Merge branches '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, ibv_rc_pingpong over top of mlx4 VPI) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 0/3] Add cross-channel support
On 12/24/2015 12:00 PM, Christoph Hellwig wrote: On Thu, Dec 24, 2015 at 10:02:29AM +0200, Or Gerlitz wrote: We had consensus among the reviewers that the 1st patch ("IB/core: Align coding style of ib_device_cap_flags structure") is wrong cleanup which basically is (1) unneeded (2) creates more damage (git blame and such, non-applicable to uapi, more) than benefit, etc -- finally Leon was convinced too [1]. It's not really an issue vs uapi. Using the the wierd BIT() macro would have been, but without it I think this cleanup is ok, even if I personally wouldn't have done it. git-blame isn't really a major issue either, as you can blame past revisions. I would personally wouldn't done cleanup either and I managed to convinced Leon to drop it, so we had concensus among the developers, the maintainer didn't have other opinion and he took the wrong step -- so we're asking to fix, that's all. Leon will re-spin in the coming 1-2 hours V2, could please pick it instead of V1, when people agree on direction X and you are not against it, lets do X and not Y. It would be great if we could stop rebasing whats already in the tree for the benefit of everyone building on top of this. For example just finished rebasing my series to move many constants includin this one to the uapi headers, and I'd hate to rebase it once again now that the dust has settled. The root issue here is that nothing was picked before 4.4-rc6, so we're in a situation where rebases are needed in the own-maintainer tree (github) to make things right. No way to avoid that. We should aim that for 4.6 and onward, code for -next will start getting in around rc1-2 and then things will be more robust, etc Or. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table
On 12/24/2015 12:12 PM, Sagi Grimberg wrote: This patch seems to generate a list corruption [1] when I test with Doug's for-4.5 tree. Eran, care to take a look at this? This patch is part from a series that was introduced in 4.3-rc1 [1], did 4.4-rc5/6 worked for you before you uploaded there further patches? Or. [1] fbfb662 IB/mlx4: Add support for blocking multicast loopback QP creation user flag 7b59f0f IB/mlx4: Add counter based implementation for QP multicast loopback block 3ba8e31 IB/mlx4: Add IB counters table 74194fb net/mlx4_en: Implement mcast loopback prevention for ETH qps 9a89283 net/mlx4_core: Add support for filtering multicast loopback ddf9529 IB/core: Allow setting create flags in QP init attribute 6d8a749 IB/core: Extend ib_uverbs_create_qp -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 08/16] i40iw: add files for iwarp interface
On 12/24/2015 9:31 AM, Faisal Latif wrote: On Wed, Dec 23, 2015 at 08:42:01AM -0800, Or Gerlitz wrote: On 12/22/2015 1:13 AM, Faisal Latif wrote: + +enum i40iw_memreg_type { + IW_MEMREG_TYPE_MEM = 0x, + IW_MEMREG_TYPE_QP = 0x0001, + IW_MEMREG_TYPE_CQ = 0x0002, + IW_MEMREG_TYPE_MW = 0x0003, + IW_MEMREG_TYPE_FMR = 0x0004, + IW_MEMREG_TYPE_FMEM = 0x0005, +}; Can't you re-use IB core values or derive that from the actual uverbs command? I did not see anything which will have types that I needed. what do you need? what is the role of this enum? It will be confusing otherwise. I will be reducing number of types from here though. so why some of it can go? is that deal values which aren't used by the code Or. Thanks Faisal -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 15/16] i40iw: add entry in rdma_netlink
On 12/24/2015 9:05 AM, Faisal Latif wrote: >Why the iwarp port mapper implementationhas to be repeated in each >driver? can you join your code in a common place and avoid the duplication? > >root@r-dcs58 hw]# git grep RDMA_NL_ nes >nes/nes.c: [RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb}, >nes/nes.c: [RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb}, >nes/nes.c: [RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = >iwpm_add_and_query_mapping_cb}, >nes/nes.c: [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb}, >nes/nes.c: [RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb}, >nes/nes.c: [RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb}, >nes/nes.c: [RDMA_NL_IWPM_MAPINFO_NUM] = {.dump = >iwpm_ack_mapping_info_cb} >nes/nes.c: if (ibnl_add_client(RDMA_NL_NES, RDMA_NL_IWPM_NUM_OPS, >nes_nl_cb_table)) >nes/nes.c: ret = iwpm_init(RDMA_NL_NES); >nes/nes.c: ibnl_remove_client(RDMA_NL_NES); >nes/nes.c: ibnl_remove_client(RDMA_NL_NES); >nes/nes.c: iwpm_exit(RDMA_NL_NES); >nes/nes_cm.c: &mapped_sockaddr, RDMA_NL_NES); >nes/nes_cm.c: return iwpm_remove_mapping(&local_sockaddr, RDMA_NL_NES); >nes/nes_cm.c: &remote_addr, RDMA_NL_NES); >nes/nes_cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, >RDMA_NL_NES); >nes/nes_cm.c: iwpm_err = iwpm_add_mapping(&pm_msg, >RDMA_NL_NES); >nes/nes_cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_NES); >nes/nes_cm.c: iwpm_err = iwpm_add_and_query_mapping(&pm_msg, >RDMA_NL_NES); > i40iw iwarp driver registers with port mapper and uses its services. Beside that it is not the scope of the patch series. You are asked a question by reviewer and your reply is (1) YES, I did C & P from driver X to my driver (2) explaining why C&P is right goes beyond the scope of reviewing my driver This isn't how things work in upstream. If needed, talk to the upstream Intel networking folks, they can assist you catching up on upstream practices. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num
On 12/24/2015 9:57 AM, Matan Barak wrote: I totally agree that it's better to use the local IP address and not just get a random device by using 127.0.0.1. You could get a specific device by binding it, but then - use its local IP instead of 127.0.0.1. Yes guys, it might be better but the user can do that. However, loopback ala 127.0.0.1 is working in the rdma-cm since the 2.6.18 day one and was broken recently when Eth ports are around. Lets fix. Indeed, this is 4.5 and not 4.4 material. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 0/3] Add cross-channel support
On 12/24/2015 5:31 AM, Doug Ledford wrote: On 12/20/2015 12:16 PM, Leon Romanovsky wrote: Leon Romanovsky (3): IB/core: Align coding style of ib_device_cap_flags structure IB/core: Add cross-channel support IB/mlx5: Add driver cross-channel support drivers/infiniband/core/uverbs_cmd.c | 5 ++- drivers/infiniband/hw/mlx5/cq.c | 7 +++- drivers/infiniband/hw/mlx5/main.c| 3 ++ drivers/infiniband/hw/mlx5/mlx5_ib.h | 16 drivers/infiniband/hw/mlx5/qp.c | 54 ++- include/linux/mlx5/qp.h | 3 ++ include/rdma/ib_verbs.h | 71 +--- 7 files changed, 117 insertions(+), 42 deletions(-) I took the series as is. Please make sure to resubmit the libibverbs portion of these changes with the requested man page updates. Doug, Wait. We had consensus among the reviewers that the 1st patch ("IB/core: Align coding style of ib_device_cap_flags structure") is wrong cleanup which basically is (1) unneeded (2) creates more damage (git blame and such, non-applicable to uapi, more) than benefit, etc -- finally Leon was convinced too [1]. Leon will re-spin in the coming 1-2 hours V2, could please pick it instead of V1, when people agree on direction X and you are not against it, lets do X and not Y. thanks, Or. [1] http://marc.info/?t=14506106803&r=1&w=2 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/10] iSER support for remote invalidate
On Mon, Dec 21, 2015 at 3:29 PM, Sagi Grimberg wrote: >> Sagi, are you going to spin an increment in the initiator version? > I don't know if it's worth a driver version update? yes, I think it does. Finally we're getting real close to the visions of the spec authors... do you have some performance data showing the gain of remote invalidation on certain HWs and IO benchmarks patterns. Did we make sure to interoperate between patched initiator to unpatched LIO or other targets which don't have this now (SCST, TGT) and the other way around (unpatched initiator, patched LIO)? > In case we do increment, I can send an incremental patch instead of > re-spinning the entire series. lets do that. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V2 0/5] User-space time-stamping support for mlx5_ib
On Tue, Dec 15, 2015 at 8:30 PM, Matan Barak wrote: > This patch-set adds user-space support for time-stamping in mlx5_ib. > It implements the necessary API: > (a) ib_create_cq_ex - Add support for CQ creation flags > (b) ib_query_device - return timestamp_mask and hca_core_clock. > We also add support for mmaping the HCA's free running clock. > In order to do so, we use the response of the vendor's extended > part in init_ucontext. This allows us to pass the page offset > of the free running clock register to the user-space driver. > In order to implement it in a future extensible manner, we use the > same mechanism of verbs extensions to the mlx5 vendor part as well. Hi Doug, As Matan is now the maintainer of the mlx5_ib driver and this series doesn't add any IB/core changes, we see no reason that it can't go into 4.5. make sense? Or > Changes from v1: > * Change ib_is_udata_cleared to use memchr_inv. > > Changes from v0: > * Limit mmap PAGE_SIZE to 4K (security wise). > * Optimize ib_is_udata_cleared. > * Pass hca_core_clock_offset in the vendor's response part of init_ucontext. > > Matan Barak (5): > IB/mlx5: Add create_cq extended command > IB/core: Add ib_is_udata_cleared > IB/mlx5: Add support for hca_core_clock and timestamp_mask > IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext > IB/mlx5: Mmap the HCA's core clock register to user-space > > drivers/infiniband/hw/mlx5/cq.c | 7 > drivers/infiniband/hw/mlx5/main.c| 67 > +++- > drivers/infiniband/hw/mlx5/mlx5_ib.h | 7 +++- > drivers/infiniband/hw/mlx5/user.h| 12 +-- > include/linux/mlx5/device.h | 7 ++-- > include/linux/mlx5/mlx5_ifc.h| 9 +++-- > include/rdma/ib_verbs.h | 27 +++ > 7 files changed, 120 insertions(+), 16 deletions(-) > > -- > 2.1.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 00/16] add Intel(R) X722 iWARP driver
On 12/22/2015 1:13 AM, Faisal Latif wrote: This driver provides iWARP RDMA functionality for the Intel(R) X722 Ethernet controller for PCI Physical Functions. Is there any public info on the X722, I didn't manage to find such. It also has support for Virtual Function driver (i40iwvf.ko), which that will be part of separate patch series. can you explain why do you need a separate rdma driver for VFs? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 15/16] i40iw: add entry in rdma_netlink
On 12/22/2015 1:13 AM, Faisal Latif wrote: Add entry for port mapper services. Signed-off-by: Faisal Latif --- include/uapi/rdma/rdma_netlink.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h index c19a5dc..4fa418d 100644 --- a/include/uapi/rdma/rdma_netlink.h +++ b/include/uapi/rdma/rdma_netlink.h @@ -8,6 +8,7 @@ enum { RDMA_NL_NES, RDMA_NL_C4IW, RDMA_NL_LS, /* RDMA Local Services */ + RDMA_NL_I40IW, RDMA_NL_NUM_CLIENTS }; Do you use this value in down-stream patches of the series? if yes, change the order to have this and other IB core changes before the code that use that. Why the iwarp port mapper implementationhas to be repeated in each driver? can you join your code in a common place and avoid the duplication? root@r-dcs58 hw]# git grep RDMA_NL_ nes nes/nes.c: [RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb}, nes/nes.c: [RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb}, nes/nes.c: [RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = iwpm_add_and_query_mapping_cb}, nes/nes.c: [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb}, nes/nes.c: [RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb}, nes/nes.c: [RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb}, nes/nes.c: [RDMA_NL_IWPM_MAPINFO_NUM] = {.dump = iwpm_ack_mapping_info_cb} nes/nes.c: if (ibnl_add_client(RDMA_NL_NES, RDMA_NL_IWPM_NUM_OPS, nes_nl_cb_table)) nes/nes.c: ret = iwpm_init(RDMA_NL_NES); nes/nes.c: ibnl_remove_client(RDMA_NL_NES); nes/nes.c: ibnl_remove_client(RDMA_NL_NES); nes/nes.c: iwpm_exit(RDMA_NL_NES); nes/nes_cm.c: &mapped_sockaddr, RDMA_NL_NES); nes/nes_cm.c: return iwpm_remove_mapping(&local_sockaddr, RDMA_NL_NES); nes/nes_cm.c: &remote_addr, RDMA_NL_NES); nes/nes_cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_NES); nes/nes_cm.c: iwpm_err = iwpm_add_mapping(&pm_msg, RDMA_NL_NES); nes/nes_cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_NES); nes/nes_cm.c: iwpm_err = iwpm_add_and_query_mapping(&pm_msg, RDMA_NL_NES); [root@r-dcs58 hw]# git grep RDMA_NL_ cxgb4/ cxgb4/cm.c: iwpm_remove_mapping(&ep->com.local_addr, RDMA_NL_C4IW); cxgb4/cm.c: &child_ep->com.remote_addr, RDMA_NL_C4IW); cxgb4/cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_C4IW); cxgb4/cm.c: iwpm_err = iwpm_add_and_query_mapping(&pm_msg, RDMA_NL_C4IW); cxgb4/cm.c: &ep->com.mapped_local_addr, RDMA_NL_C4IW)) { cxgb4/cm.c: iwpm_remove_mapping(&ep->com.local_addr, RDMA_NL_C4IW); cxgb4/cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_C4IW); cxgb4/cm.c: iwpm_err = iwpm_add_mapping(&pm_msg, RDMA_NL_C4IW); cxgb4/cm.c: &ep->com.mapped_local_addr, RDMA_NL_C4IW)) { cxgb4/device.c: [RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb}, cxgb4/device.c: [RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb}, cxgb4/device.c: [RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = iwpm_add_and_query_mapping_cb}, cxgb4/device.c: [RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb}, cxgb4/device.c: [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb}, cxgb4/device.c: [RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb}, cxgb4/device.c: [RDMA_NL_IWPM_MAPINFO_NUM] = {.dump = iwpm_ack_mapping_info_cb} cxgb4/device.c: if (ibnl_add_client(RDMA_NL_C4IW, RDMA_NL_IWPM_NUM_OPS, cxgb4/device.c: err = iwpm_init(RDMA_NL_C4IW); cxgb4/device.c: ibnl_remove_client(RDMA_NL_C4IW); cxgb4/device.c: iwpm_exit(RDMA_NL_C4IW); cxgb4/device.c: ibnl_remove_client(RDMA_NL_C4IW); -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 00/16] add Intel(R) X722 iWARP driver
On 12/23/2015 6:35 PM, Faisal Latif wrote: I will provide new patch series for latest k.o. Also will make sure of shallow threading for the series. Please make sure that the cover letter will include the full output of the git generated cover-letter so we can see the location of changes you did to the IB core and the overall LOC volume of the driver. Did you run the driver through 0-day testing to avoid zillion small follow up patches in a later point when this isin? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 08/16] i40iw: add files for iwarp interface
On 12/22/2015 1:13 AM, Faisal Latif wrote: + +enum i40iw_memreg_type { + IW_MEMREG_TYPE_MEM = 0x, + IW_MEMREG_TYPE_QP = 0x0001, + IW_MEMREG_TYPE_CQ = 0x0002, + IW_MEMREG_TYPE_MW = 0x0003, + IW_MEMREG_TYPE_FMR = 0x0004, + IW_MEMREG_TYPE_FMEM = 0x0005, +}; Can't you re-use IB core values or derive that from the actual uverbs command? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 1/2] MAINTAINERS: Assign new maintainers to Mellanox mlx5 core and IB drivers
Matan and Leon step in as co-maintainers to replace Eli Cohen who wrote and maintained the core and IB drivers. Signed-off-by: Or Gerlitz --- MAINTAINERS | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index c6b78b0..c7586a5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7057,27 +7057,26 @@ W: http://linuxtv.org S: Odd Fixes F: drivers/media/radio/radio-miropcm20* -Mellanox MLX5 core VPI driver -M: Eli Cohen +MELLANOX MLX5 core VPI driver +M: Matan Barak +M: Leon Romanovsky L: net...@vger.kernel.org L: linux-rdma@vger.kernel.org W: http://www.mellanox.com Q: http://patchwork.ozlabs.org/project/netdev/list/ -Q: http://patchwork.kernel.org/project/linux-rdma/list/ -T: git git://openfabrics.org/~eli/connect-ib.git S: Supported F: drivers/net/ethernet/mellanox/mlx5/core/ F: include/linux/mlx5/ -Mellanox MLX5 IB driver -M: Eli Cohen +MELLANOX MLX5 IB driver +M: Matan Barak +M: Leon Romanovsky L: linux-rdma@vger.kernel.org W: http://www.mellanox.com Q: http://patchwork.kernel.org/project/linux-rdma/list/ -T: git git://openfabrics.org/~eli/connect-ib.git S: Supported -F: include/linux/mlx5/ F: drivers/infiniband/hw/mlx5/ +F: include/linux/mlx5/ MELEXIS MLX90614 DRIVER M: Crt Mori -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 2/2] MAINTAINERS: Assign maintainer to Mellanox mlx4 core and IB drivers
The driver was written originally by Roland Dreier, currently there's no official maintainer. Yishai steps in as maintainer. Signed-off-by: Or Gerlitz Cc: Roland Dreier --- MAINTAINERS | 19 +++ 1 file changed, 19 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index c7586a5..d9274ca 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7057,6 +7057,25 @@ W: http://linuxtv.org S: Odd Fixes F: drivers/media/radio/radio-miropcm20* +MELLANOX MLX4 core VPI driver +M: Yishai Hadas +L: net...@vger.kernel.org +L: linux-rdma@vger.kernel.org +W: http://www.mellanox.com +Q: http://patchwork.ozlabs.org/project/netdev/list/ +S: Supported +F: drivers/net/ethernet/mellanox/mlx4/ +F: include/linux/mlx4/ + +MELLANOX MLX4 IB driver +M: Yishai Hadas +L: linux-rdma@vger.kernel.org +W: http://www.mellanox.com +Q: http://patchwork.kernel.org/project/linux-rdma/list/ +S: Supported +F: drivers/infiniband/hw/mlx4/ +F: include/linux/mlx4/ + MELLANOX MLX5 core VPI driver M: Matan Barak M: Leon Romanovsky -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 0/2] new maintainers for Mellanox mlx4/mlx5 core and IB drivers
Hi Dave, Doug We're happily assigning new maintainers for mlx4/mlx5 core and IB drivers. This is aligned with Eli (mlx5) and Roland (mlx4). FWIW, I did the whole change to go through netdev. Or. Or Gerlitz (2): MAINTAINERS: Assign new maintainers to Mellanox mlx5 core and IB drivers MAINTAINERS: Assign maintainer to Mellanox mlx4 core and IB drivers MAINTAINERS | 32 +--- 1 file changed, 25 insertions(+), 7 deletions(-) -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V2 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext
On 12/22/2015 11:37 PM, Or Gerlitz wrote: On Tue, Dec 15, 2015 at 8:30 PM, Matan Barak wrote: >Pass hca_core_clock_offset to user-space is mandatory in order to >let the user-space read the free-running clock register from the >right offset in the memory mapped page. >Passing this value is done by changing the vendor's command >and response of init_ucontext to be in extensible form. Is "old" (unpatched) libmlx5 still operational over "new" (patched) mlx5 IB driver? and same question for the other way around as well -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next V2 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext
On Tue, Dec 15, 2015 at 8:30 PM, Matan Barak wrote: > Pass hca_core_clock_offset to user-space is mandatory in order to > let the user-space read the free-running clock register from the > right offset in the memory mapped page. > Passing this value is done by changing the vendor's command > and response of init_ucontext to be in extensible form. Is "old" (unpatched) libmlx5 still operational over "new" (patched) mlx5 IB driver? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num
On 12/22/2015 9:17 AM, Or Gerlitz wrote: On 12/21/2015 5:01 PM, Matan Barak wrote: This patch fixes a bug in VPI systems, where the first port is configured as IB and the second one is configured as Ethernet. In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the first port is a RoCE port and fail. This is fixed by passing the port of the incoming request. OK -- we have another bug down there, cma loopback doesn't work, same reject reason (below).This happens in both VPI and non-VPI configurations. Works well with 4.2.3 I made more checks with the 4.2.3 kernel (before all the IB core/cache changes that went in 4.3) -- rdma-cm loopback does work as long as there are active IB ports in the system. When there are no active IB ports, but rather only Eth ports (VPI or plain Eth), rping fails: # rping -d -v -c -a 127.0.0.1 -C 1 created cm_id 0x6087d0 cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent) rdma_resolve_route: No such device so it seems we had something there before the 4.3 changes. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num
On 12/21/2015 5:01 PM, Matan Barak wrote: Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id didn't have a bounded port, it used the first port of the device. In VPI systems, the first port might be an IB port while the second one could be an Ethernet port. This made requests for unbounded rdma_ids that come from the Ethernet port fail. Fixing this by passing the port of the request and checking this port of the device. Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Signed-off-by: Matan Barak seems that the patch is missing from patchworks, I can't explain that. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup
On Wed, Dec 16, 2015 at 7:53 AM, Or Gerlitz wrote: > On 12/15/2015 9:03 PM, Doug Ledford wrote: >> Or, you specifically asked me to wait until this week. I made my >> initial impressions clear (I don't necessarily like the removal of the >> attr struct, but I like the removal of all of the query calls, and I'm >> inclined to take the patch in spite of not liking the removal of the >> struct). Do you have anything to add or have we beat this horse to death? > Hi Doug, > Lets stop beating, both horses and people. > I do understand that > 1. you don't link the removal of the attr > 2. you do like the removal of all the query calls > > I am proposing to take the path of a patch that > does exactly #2 while avoiding #1. Doug, Did you look on my v1 post and the related discussion there w.r.t udata? You didn't make any comment on my response here nor on the proposed patches. Since we are really short in time w.r.t EOY holidays and we have the udata matter open (see [1]), could we move finalizing this discussion to the 4.6 time-frame? If you do have the time, I think it would be fair to see a response from you on the discussion before you pick any of the two patch sets - so?? Or. [1] Christoph's patch doesn't remove the query_device callback from mlx4 since we report there values to libmlx4 through the udata mechanism. The query_device callback will need to be present in future/current drivers if they decide to use udata as well > What's wrong with that? I haven't heard any reasoning for why its > so good to stash ~50 new fields on the IB device structure except > for the author saying that other subsystems do that and other people > saying they are in favor of this approach while not providing any > reasoning, except for maybe something on bikes. > > Why you or anyone else has to be from now and ever the cache line police > making sure that people don't add new attributes in random locations > over the IB device structure? > > What's wrong with putting fifty attributesin a structure which is a field > of the device struct and have people go there to see what are the d > ifferentattrs and add news ones there? > > This will make the 4.5 merge window extremely complex or even totally > threatened w.r.t to the RDMA subsystem and related drivers by 3.3K LOC > patch. > > Sorry, but, I still don't get it. > > Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num
On 12/21/2015 5:01 PM, Matan Barak wrote: This patch fixes a bug in VPI systems, where the first port is configured as IB and the second one is configured as Ethernet. In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the first port is a RoCE port and fail. This is fixed by passing the port of the incoming request. OK -- we have another bug down there, cma loopback doesn't work, same reject reason (below).This happens in both VPI and non-VPI configurations. Works well with 4.2.3 Or. $ rping -d -v -c -a 127.0.0.1 -C 1 verbose client count 1 created cm_id 0x6087d0 cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent) cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x6087d0 (parent) rdma_resolve_addr - rdma_resolve_route successful created pd 0x60e5f0 created channel 0x608250 created cq 0x608a20 created qp 0x6082e0 rping_setup_buffers called on cb 0x606010 allocated & registered buffers... cq_thread started. wait for CONNECTED state 10 cma_event type RDMA_CM_EVENT_REJECTED cma_id 0x6087d0 (parent) cma event RDMA_CM_EVENT_REJECTED, error 28 connect error -1 rping_free_buffers called on cb 0x606010 destroy cm_id 0x6087d0 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num
On 12/21/2015 5:01 PM, Matan Barak wrote: Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id didn't have a bounded port, it used the first port of the device. maybe prefer a higher then code speak language e.g "if the rdma id didn't have" also below "unbounded rdma ids" In VPI systems, the first port might be an IB port while the second one could be an Ethernet port. This made requests for unbounded rdma_ids that come from the Ethernet port fail. add "to" --> "Ethernet port to fail" Fixing this by passing the port of the request and checking this port of the device. OK, so this fix will work for both ib/eth and eth/ib configs, right? good. Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Reported-by: Or Gerlitz Signed-off-by: Matan Barak Doug, the bug fixes a commit from from 4.3, lets fix it in 4.4 and later we will send it to -stable as well. So for 4.4 there's this one and the kvfree fix [1] Or. [1] https://patchwork.kernel.org/patch/7868481/ --- Hi Doug, This patch fixes a bug in VPI systems, where the first port is configured as IB and the second one is configured as Ethernet. In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the first port is a RoCE port and fail. This is fixed by passing the port of the incoming request. Regards, Matan drivers/infiniband/core/cma.c | 16 +--- 1 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index d2d5d00..c8a265c 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1265,15 +1265,17 @@ static bool cma_protocol_roce(const struct rdma_cm_id *id) return cma_protocol_roce_dev_port(device, port_num); } -static bool cma_match_net_dev(const struct rdma_id_private *id_priv, - const struct net_device *net_dev) +static bool cma_match_net_dev(const struct rdma_cm_id *id, + const struct net_device *net_dev, + u8 port_num) { - const struct rdma_addr *addr = &id_priv->id.route.addr; + const struct rdma_addr *addr = &id->route.addr; if (!net_dev) /* This request is an AF_IB request or a RoCE request */ - return addr->src_addr.ss_family == AF_IB || - cma_protocol_roce(&id_priv->id); + return (!id->port_num || id->port_num == port_num) && + (addr->src_addr.ss_family == AF_IB || + cma_protocol_roce_dev_port(id->device, port_num)); return !addr->dev_addr.bound_dev_if || (net_eq(dev_net(net_dev), addr->dev_addr.net) && @@ -1295,13 +1297,13 @@ static struct rdma_id_private *cma_find_listener( hlist_for_each_entry(id_priv, &bind_list->owners, node) { if (cma_match_private_data(id_priv, ib_event->private_data)) { if (id_priv->id.device == cm_id->device && - cma_match_net_dev(id_priv, net_dev)) + cma_match_net_dev(&id_priv->id, net_dev, req->port)) return id_priv; list_for_each_entry(id_priv_dev, &id_priv->listen_list, listen_list) { if (id_priv_dev->id.device == cm_id->device && - cma_match_net_dev(id_priv_dev, net_dev)) + cma_match_net_dev(&id_priv_dev->id, net_dev, req->port)) return id_priv_dev; } } -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/10] iSER support for remote invalidate
On Mon, Dec 21, 2015 at 6:20 AM, Nicholas A. Bellinger wrote: > Applied to target-pending/for-next as v4.5-rc1 material, along with > Reviewed-by tags from HCH. Hi Nic, thanks for stepping in and picking that. Sagi, are you going to spin an increment in the initiator version? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure
On Mon, Dec 21, 2015 at 9:27 AM, Leon Romanovsky wrote: > On Mon, Dec 21, 2015 at 8:52 AM, Or Gerlitz wrote: >> On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky wrote: >>> On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny wrote: >>>> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote: >>>>> From: Leon Romanovsky >> >>>>> Modify enum ib_device_cap_flags such that other patches which add new >>>>> enum values pass strict checkpatch.pl checks. >> >>>>> - IB_DEVICE_RESERVED = (1<<16), /* old SEND_W_INV */ >>>>> - IB_DEVICE_MEM_WINDOW= (1<<17), >>>>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0), >> >>> 2. Change the whole file => the work with "git blame" will be less >>> straightforward. >> >> Agree. >> >> Leon, I don't think we need to take checkpatch-ing of things to that level. >> >> Indeed, we should make sure that whole new enums and such are done right -- >> but avoid touching core structs/enums in a manner that disallows >> simple git blaming of things, which is very useful for new comers and >> old suffers. > There are no doubts that standalone fixing checkpatch errors is more > suitable to staging subsystem. Agree > In our case, it is part of coming changes in that structure. such > change serves specific goal to minimize the possibility of error > by seeing clean output from static analyser tool. Disagree. What future bugs are you envisioning by let this 10y old header file keep having some checkpatch issues on few of the major enums?! Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure
On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky wrote: > On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny wrote: >> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote: >>> From: Leon Romanovsky >>> Modify enum ib_device_cap_flags such that other patches which add new >>> enum values pass strict checkpatch.pl checks. >>> - IB_DEVICE_RESERVED = (1<<16), /* old SEND_W_INV */ >>> - IB_DEVICE_MEM_WINDOW= (1<<17), >>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0), > 2. Change the whole file => the work with "git blame" will be less > straightforward. Agree. Leon, I don't think we need to take checkpatch-ing of things to that level. Indeed, we should make sure that whole new enums and such are done right -- but avoid touching core structs/enums in a manner that disallows simple git blaming of things, which is very useful for new comers and old suffers. For example, the networking community keeps adding IFF_YYY values to include/linux/netdevice.h :: enum netdev_priv_flags without fixing checkpatch complaint on missing spaces before/after the << shift sign. > I will do the change across whole file, If Doug accepts such change. Please don't... simple git blame is very powerful tool for kernel developers everyday work. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure
On 12/18/2015 6:49 PM, Jason Gunthorpe wrote: On Fri, Dec 18, 2015 at 10:08:41AM +0200, Or Gerlitz wrote: On 12/17/2015 7:41 PM, Jason Gunthorpe wrote: On Thu, Dec 17, 2015 at 03:44:19PM +0200, Sagi Grimberg wrote: + ret = ib_query_device(device, &device->attrs); + if (ret) { + printk(KERN_WARNING "Couldn't query the device attributes\n"); + goto out; + } + I thought we're all for removing the call altogether aren't we? I'd say just call device->query_device() instead. Christoph's patch even got rid of device->query_device(), Wrong. Not really, lots of hunks in Christoph's patch are removing query_device ie: @@ -1305,7 +1299,6 @@ int mthca_register_device(struct mthca_dev *dev) dev->ib_dev.phys_port_cnt= dev->limits.num_ports; dev->ib_dev.num_comp_vectors = 1; dev->ib_dev.dma_device = &dev->pdev->dev; - dev->ib_dev.query_device = mthca_query_device; dev->ib_dev.query_port = mthca_query_port; I know, I wanted to stress the point that under the udata architecture the query_device callbacked can't be just deleted since drivers use that to get/provide data to/from their user-space libraries. Sure, it sticks around in a couple places but it isn't 'query_device' anymore, it is 'query_device_udata' which is reasonable. I see what you mean. But this involved somehow deeper patching to mlx4 which should be applied later on to any ll driver that would like to expose udata for their query device entry (and return back that entry as well after we delete it now in Christoph's patch). Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rdma-next V1 3/7] IB/ulps: Avoid calling ib_query_device
On 12/18/2015 4:10 PM, Bart Van Assche wrote: Shouldn't this patch be split per ULP driver to make review easier ? The patch is very simple and does the same practice in all ULPs. If it helps I can spit off the srp initiator and target bits into a different patch, let me know. If you have a look at the MAINTAINERS file then you will see that you forgot to CC the SRP initiator maintainer. yes, sorry for that, will copy you on further versions if there are such. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH libibverbs 2/3] libibverbs: Add cross-channel QP initialization flags
On 12/20/2015 1:22 PM, Leon Romanovsky wrote: --- include/infiniband/verbs.h | 3 +++ src/cmd.c | 5 - 2 files changed, 7 insertions(+), 1 deletion(-) You need to document the new creation flags in a man page. If this doesn't exist, this should be added to Eran's submission and you will enhance that. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/10] IB: remove the struct ib_phys_buf definition
On 12/20/2015 5:41 PM, Christoph Hellwig wrote: I can't see a useful explanation for removing an unused structure. But if you really want one and come up with a coherent sentence or two I can add it. patch title - IB/core: Remove struct ib_phys_buf change log: Remove struct ib_phys_buf since YYY where you need to fill few words for YYY, e.g not used anywhere ever, not used after you did cleanup ZZZ, something that can serve someone reading your patch -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/20/2015 11:31 AM, Sagi Grimberg wrote: However, I think that the patch from Shani 6b52a12bc3fc ("IB/uverbs: Implement memory windows support in uverbs") is wrong. A memory window allocation should really reference the MR and not the PD (which is referenced by the MR). Otherwise the MR deregistration is allowed with windows bound to it. If this is the case then this field is not useless and we should fix ib_uverbs_alloc_mw()? I tend to agree, lets not remove the field till this is clarified with Haggai and Co Haggai, can you shed some light here? It's Shani's and your code. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/20/2015 11:31 AM, Sagi Grimberg wrote: However, I think that the patch from Shani 6b52a12bc3fc ("IB/uverbs: Implement memory windows support in uverbs") is wrong. A memory window allocation should really reference the MR and not the PD (which is referenced by the MR). Otherwise the MR deregistration is allowed with windows bound to it. If this is the case then this field is not useless and we should fix ib_uverbs_alloc_mw()? I tend to agree, lets not remove the field this is clarified with Haggai and Co Haggai, can you shed some light here? It's Shani's and your code. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/20/2015 9:25 AM, Or Gerlitz wrote: On 12/18/2015 3:55 PM, Christoph Hellwig wrote: diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 284916d..e45776e 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1306,7 +1306,6 @@ struct ib_mr { u64 iova; u32 length; unsigned int page_size; -atomic_t usecnt; /* count number of MWs */ }; This comment is part of Roland's uverbs commit. I saw now that you removed in-kernel support for MWs as a downstream patch of this series. I guess this cleanup needs to go there as the refcount field of kernel MRs relates to MWs. This will also help someone coming in the future and returning the in-kernel MW support to avoid forgetting on doing the refs. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/10] IB: remove the struct ib_phys_buf definition
On 12/18/2015 3:55 PM, Christoph Hellwig wrote: Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Reviewed-by: Jason Gunthorpe [core] Reviewed-by: Steve Wise Here, too, please avoid empty change logs to IB core patches. Or. --- include/rdma/ib_verbs.h | 5 - 1 file changed, 5 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index ea093ee..284916d 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1143,11 +1143,6 @@ enum ib_access_flags { IB_ACCESS_ON_DEMAND = (1<<6), }; -struct ib_phys_buf { - u64 addr; - u64 size; -}; - /* * XXX: these are apparently used for ->rereg_user_mr, no idea why they * are hidden here instead of a uapi header! -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/18/2015 4:14 PM, Bart Van Assche wrote: On 12/18/2015 02:55 PM, Christoph Hellwig wrote: Signed-off-by: Christoph Hellwig Shouldn't the description of this patch be changed into something like "Remove the usecnt field from ib_mr since it is always zero" ? Agree. I would like us to avoid empty changes log for IB core patches and actually all over the rdma subsystem. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RoCE passive side failures on 4.4-rc5
On 12/17/2015 3:58 PM, Or Gerlitz wrote: Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today, I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong works). I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the one with the 4.4-rc5 kernel can act as both (rping) client/server for IB links but only (rping) client for RoCE. I tried both inter-node and loopback runs, in all cases, the client side getsCM reject with reason 28, see [2], tried both iser and rping. Eth (ICMP, TCP) works OK. OK, small progress, when the force Eth link type on my IB port (using mlx4 sysfs), things work. You should be able to reproduce it on your non-VPI systems the other way around, by forcing IB link type on one of the Eth ports and see the failure. I Saw the same behavior with both 4.4-rc2 and 4.4-rc5 Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/18/2015 3:55 PM, Christoph Hellwig wrote: diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 284916d..e45776e 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1306,7 +1306,6 @@ struct ib_mr { u64iova; u32length; unsigned int page_size; - atomic_t usecnt; /* count number of MWs */ }; This comment is part of Roland's uverbs commit. I wonder if LL driver supporting the IB_WR_BIND_MW op ref the MR on port send and deref it on completion? Or. struct ib_mw { -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure
On Fri, Dec 18, 2015 at 7:16 AM, ira.weiny wrote: > More than anything what I hate is all the places that allocate struct > ib_device_attr just to free it after the query call. did you care to look on patches 2-6, this is exactly what they are doing. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next V1 6/7] staging/o2iblnd: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Signed-off-by: Or Gerlitz --- drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 21 + 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 7c730e3..7231ef8 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -2070,32 +2070,13 @@ static int kiblnd_net_init_pools(kib_net_t *net, __u32 *cpts, int ncpts) static int kiblnd_hdev_get_attr(kib_hca_dev_t *hdev) { - struct ib_device_attr *attr; - int rc; - /* It's safe to assume a HCA can handle a page size * matching that of the native system */ hdev->ibh_page_shift = PAGE_SHIFT; hdev->ibh_page_size = 1 << PAGE_SHIFT; hdev->ibh_page_mask = ~((__u64)hdev->ibh_page_size - 1); - LIBCFS_ALLOC(attr, sizeof(*attr)); - if (attr == NULL) { - CERROR("Out of memory\n"); - return -ENOMEM; - } - - rc = ib_query_device(hdev->ibh_ibdev, attr); - if (rc == 0) - hdev->ibh_mr_size = attr->max_mr_size; - - LIBCFS_FREE(attr, sizeof(*attr)); - - if (rc != 0) { - CERROR("Failed to query IB device: %d\n", rc); - return rc; - } - + hdev->ibh_mr_size = hdev->ibh_ibdev->attrs.max_mr_size; if (hdev->ibh_mr_size == ~0ULL) { hdev->ibh_mr_shift = 64; return 0; -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next V1 7/7] IB/core: Remove ib_query_device
The copy of the attributes present on the device is now used by all consumers expect for uverbs in case of serving user-space query, where dev->query_device is called. Signed-off-by: Or Gerlitz --- drivers/infiniband/core/device.c | 19 --- include/rdma/ib_verbs.h | 3 --- 2 files changed, 22 deletions(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 568592e3..6def2f7 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -636,25 +636,6 @@ void ib_dispatch_event(struct ib_event *event) EXPORT_SYMBOL(ib_dispatch_event); /** - * ib_query_device - Query IB device attributes - * @device:Device to query - * @device_attr:Device attributes - * - * ib_query_device() returns the attributes of a device through the - * @device_attr pointer. - */ -int ib_query_device(struct ib_device *device, - struct ib_device_attr *device_attr) -{ - struct ib_udata uhw = {.outlen = 0, .inlen = 0}; - - memset(device_attr, 0, sizeof(*device_attr)); - - return device->query_device(device, device_attr, &uhw); -} -EXPORT_SYMBOL(ib_query_device); - -/** * ib_query_port - Query IB port attributes * @device:Device to query * @port_num:Port number to query diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 730dcfb..0567866 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1913,9 +1913,6 @@ int ib_register_event_handler (struct ib_event_handler *event_handler); int ib_unregister_event_handler(struct ib_event_handler *event_handler); void ib_dispatch_event(struct ib_event *event); -int ib_query_device(struct ib_device *device, - struct ib_device_attr *device_attr); - int ib_query_port(struct ib_device *device, u8 port_num, struct ib_port_attr *port_attr); -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next V1 3/7] IB/ulps: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Signed-off-by: Or Gerlitz --- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 19 --- drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++-- drivers/infiniband/ulp/ipoib/ipoib_main.c| 21 + drivers/infiniband/ulp/iser/iscsi_iser.c | 4 +-- drivers/infiniband/ulp/iser/iscsi_iser.h | 2 -- drivers/infiniband/ulp/iser/iser_memory.c| 9 +++--- drivers/infiniband/ulp/iser/iser_verbs.c | 38 ++ drivers/infiniband/ulp/isert/ib_isert.c | 47 +--- drivers/infiniband/ulp/isert/ib_isert.h | 1 - drivers/infiniband/ulp/srp/ib_srp.c | 32 ++- drivers/infiniband/ulp/srpt/ib_srpt.c| 15 - drivers/infiniband/ulp/srpt/ib_srpt.h| 3 -- 12 files changed, 63 insertions(+), 142 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 3ae9726..94d144d 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1522,8 +1522,7 @@ static void ipoib_cm_create_srq(struct net_device *dev, int max_sge) int ipoib_cm_dev_init(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); - int i, ret; - struct ib_device_attr attr; + int max_srq_sge, i; INIT_LIST_HEAD(&priv->cm.passive_ids); INIT_LIST_HEAD(&priv->cm.reap_list); @@ -1540,19 +1539,13 @@ int ipoib_cm_dev_init(struct net_device *dev) skb_queue_head_init(&priv->cm.skb_queue); - ret = ib_query_device(priv->ca, &attr); - if (ret) { - printk(KERN_WARNING "ib_query_device() failed with %d\n", ret); - return ret; - } - - ipoib_dbg(priv, "max_srq_sge=%d\n", attr.max_srq_sge); + ipoib_dbg(priv, "max_srq_sge=%d\n", priv->ca->attrs.max_srq_sge); - attr.max_srq_sge = min_t(int, IPOIB_CM_RX_SG, attr.max_srq_sge); - ipoib_cm_create_srq(dev, attr.max_srq_sge); + max_srq_sge = min_t(int, IPOIB_CM_RX_SG, priv->ca->attrs.max_srq_sge); + ipoib_cm_create_srq(dev, max_srq_sge); if (ipoib_cm_has_srq(dev)) { - priv->cm.max_cm_mtu = attr.max_srq_sge * PAGE_SIZE - 0x10; - priv->cm.num_frags = attr.max_srq_sge; + priv->cm.max_cm_mtu = max_srq_sge * PAGE_SIZE - 0x10; + priv->cm.num_frags = max_srq_sge; ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_frags=%d\n", priv->cm.max_cm_mtu, priv->cm.num_frags); } else { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c index 078cadd..a53fa5f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c @@ -40,15 +40,11 @@ static void ipoib_get_drvinfo(struct net_device *netdev, struct ethtool_drvinfo *drvinfo) { struct ipoib_dev_priv *priv = netdev_priv(netdev); - struct ib_device_attr *attr; - - attr = kmalloc(sizeof(*attr), GFP_KERNEL); - if (attr && !ib_query_device(priv->ca, attr)) - snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), -"%d.%d.%d", (int)(attr->fw_ver >> 32), -(int)(attr->fw_ver >> 16) & 0x, -(int)attr->fw_ver & 0x); - kfree(attr); + + snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), +"%d.%d.%d", (int)(priv->ca->attrs.fw_ver >> 32), +(int)(priv->ca->attrs.fw_ver >> 16) & 0x, +(int)priv->ca->attrs.fw_ver & 0x); strlcpy(drvinfo->bus_info, dev_name(priv->ca->dma_device), sizeof(drvinfo->bus_info)); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 7d32818..58732c5 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1777,26 +1777,7 @@ int ipoib_add_pkey_attr(struct net_device *dev) int ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca) { - struct ib_device_attr *device_attr; - int result = -ENOMEM; - - device_attr = kmalloc(sizeof *device_attr, GFP_KERNEL); - if (!device_attr) { - printk(KERN_WARNING "%s: allocation of %zu bytes failed\n", - hca->name, sizeof *device_attr); - return result; - } - - result = ib_query_device(hca, device_attr); - if (result) { - printk(KERN_WARNING "%s: ib_query_device failed (ret = %d)\
[PATCH rdma-next V1 1/7] IB/core: Save the device attributes on the device structure
From: Ira Weiny This way both the IB core and upper level drivers can access these cached device attributes rather than querying or caching them on their own. Signed-off-by: Ira Weiny Signed-off-by: Or Gerlitz --- drivers/infiniband/core/device.c | 8 include/rdma/ib_verbs.h | 1 + 2 files changed, 9 insertions(+) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 179e813..568592e3 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -325,6 +325,7 @@ int ib_register_device(struct ib_device *device, { int ret; struct ib_client *client; + struct ib_udata uhw = {.outlen = 0, .inlen = 0}; mutex_lock(&device_mutex); @@ -352,6 +353,13 @@ int ib_register_device(struct ib_device *device, goto out; } + memset(&device->attrs, 0, sizeof(device->attrs)); + ret = device->query_device(device, &device->attrs, &uhw); + if (ret) { + printk(KERN_WARNING "Couldn't query the device attributes\n"); + goto out; + } + ret = ib_device_register_sysfs(device, port_callback); if (ret) { printk(KERN_WARNING "Couldn't register device %s with driver model\n", diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 120da1d..730dcfb 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1823,6 +1823,7 @@ struct ib_device { u16 is_switch:1; u8 node_type; u8 phys_port_cnt; + struct ib_device_attrattrs; /** * The following mandatory functions are used only at device -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next V1 5/7] xprtrdma: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Signed-off-by: Or Gerlitz --- net/sunrpc/xprtrdma/frwr_ops.c | 7 ++--- net/sunrpc/xprtrdma/svc_rdma_transport.c | 48 +--- net/sunrpc/xprtrdma/verbs.c | 24 ++-- net/sunrpc/xprtrdma/xprt_rdma.h | 1 - 4 files changed, 30 insertions(+), 50 deletions(-) diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c index 88cf9e7..8d4864f 100644 --- a/net/sunrpc/xprtrdma/frwr_ops.c +++ b/net/sunrpc/xprtrdma/frwr_ops.c @@ -190,12 +190,11 @@ static int frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep, struct rpcrdma_create_data_internal *cdata) { - struct ib_device_attr *devattr = &ia->ri_devattr; int depth, delta; ia->ri_max_frmr_depth = min_t(unsigned int, RPCRDMA_MAX_DATA_SEGS, - devattr->max_fast_reg_page_list_len); + ia->ri_device->attrs.max_fast_reg_page_list_len); dprintk("RPC: %s: device's max FR page list len = %u\n", __func__, ia->ri_max_frmr_depth); @@ -222,8 +221,8 @@ frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep, } ep->rep_attr.cap.max_send_wr *= depth; - if (ep->rep_attr.cap.max_send_wr > devattr->max_qp_wr) { - cdata->max_requests = devattr->max_qp_wr / depth; + if (ep->rep_attr.cap.max_send_wr > ia->ri_device->attrs.max_qp_wr) { + cdata->max_requests = ia->ri_device->attrs.max_qp_wr / depth; if (!cdata->max_requests) return -EINVAL; ep->rep_attr.cap.max_send_wr = cdata->max_requests * diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c index b348b4a..4d695af 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c @@ -886,10 +886,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) struct rdma_conn_param conn_param; struct ib_cq_init_attr cq_attr = {}; struct ib_qp_init_attr qp_attr; - struct ib_device_attr devattr; + struct ib_device *dev; int uninitialized_var(dma_mr_acc); int need_dma_mr = 0; - int ret; + int ret = 0; int i; listen_rdma = container_of(xprt, struct svcxprt_rdma, sc_xprt); @@ -910,20 +910,15 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) dprintk("svcrdma: newxprt from accept queue = %p, cm_id=%p\n", newxprt, newxprt->sc_cm_id); - ret = ib_query_device(newxprt->sc_cm_id->device, &devattr); - if (ret) { - dprintk("svcrdma: could not query device attributes on " - "device %p, rc=%d\n", newxprt->sc_cm_id->device, ret); - goto errout; - } + dev = newxprt->sc_cm_id->device; /* Qualify the transport resource defaults with the * capabilities of this particular device */ - newxprt->sc_max_sge = min((size_t)devattr.max_sge, + newxprt->sc_max_sge = min((size_t)dev->attrs.max_sge, (size_t)RPCSVC_MAXPAGES); - newxprt->sc_max_sge_rd = min_t(size_t, devattr.max_sge_rd, + newxprt->sc_max_sge_rd = min_t(size_t, dev->attrs.max_sge_rd, RPCSVC_MAXPAGES); - newxprt->sc_max_requests = min((size_t)devattr.max_qp_wr, + newxprt->sc_max_requests = min((size_t)dev->attrs.max_qp_wr, (size_t)svcrdma_max_requests); newxprt->sc_sq_depth = RPCRDMA_SQ_DEPTH_MULT * newxprt->sc_max_requests; @@ -931,16 +926,16 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) * Limit ORD based on client limit, local device limit, and * configured svcrdma limit. */ - newxprt->sc_ord = min_t(size_t, devattr.max_qp_rd_atom, newxprt->sc_ord); + newxprt->sc_ord = min_t(size_t, dev->attrs.max_qp_rd_atom, newxprt->sc_ord); newxprt->sc_ord = min_t(size_t, svcrdma_ord, newxprt->sc_ord); - newxprt->sc_pd = ib_alloc_pd(newxprt->sc_cm_id->device); + newxprt->sc_pd = ib_alloc_pd(dev); if (IS_ERR(newxprt->sc_pd)) { dprintk("svcrdma: error creating PD for connect request\n"); goto errout; } cq_attr.cqe = newxprt->sc_sq_depth; - newxprt->sc_sq_cq = ib_create_cq(newxprt->sc_cm_id->device, + newxprt->sc_sq_cq = ib_create_cq(dev, sq_comp_handler, cq_event_handler,
[PATCH rdma-next V1 4/7] net/rds: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Signed-off-by: Or Gerlitz --- net/rds/ib.c | 34 +++--- net/rds/iw.c | 23 +-- 2 files changed, 16 insertions(+), 41 deletions(-) diff --git a/net/rds/ib.c b/net/rds/ib.c index f222885..9481d55 100644 --- a/net/rds/ib.c +++ b/net/rds/ib.c @@ -122,44 +122,34 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev) static void rds_ib_add_one(struct ib_device *device) { struct rds_ib_device *rds_ibdev; - struct ib_device_attr *dev_attr; /* Only handle IB (no iWARP) devices */ if (device->node_type != RDMA_NODE_IB_CA) return; - dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL); - if (!dev_attr) - return; - - if (ib_query_device(device, dev_attr)) { - rdsdebug("Query device failed for %s\n", device->name); - goto free_attr; - } - rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL, ibdev_to_node(device)); if (!rds_ibdev) - goto free_attr; + return; spin_lock_init(&rds_ibdev->spinlock); atomic_set(&rds_ibdev->refcount, 1); INIT_WORK(&rds_ibdev->free_work, rds_ib_dev_free); - rds_ibdev->max_wrs = dev_attr->max_qp_wr; - rds_ibdev->max_sge = min(dev_attr->max_sge, RDS_IB_MAX_SGE); + rds_ibdev->max_wrs = device->attrs.max_qp_wr; + rds_ibdev->max_sge = min(device->attrs.max_sge, RDS_IB_MAX_SGE); - rds_ibdev->fmr_max_remaps = dev_attr->max_map_per_fmr?: 32; - rds_ibdev->max_1m_fmrs = dev_attr->max_mr ? - min_t(unsigned int, (dev_attr->max_mr / 2), + rds_ibdev->fmr_max_remaps = device->attrs.max_map_per_fmr?: 32; + rds_ibdev->max_1m_fmrs = device->attrs.max_mr ? + min_t(unsigned int, (device->attrs.max_mr / 2), rds_ib_fmr_1m_pool_size) : rds_ib_fmr_1m_pool_size; - rds_ibdev->max_8k_fmrs = dev_attr->max_mr ? - min_t(unsigned int, ((dev_attr->max_mr / 2) * RDS_MR_8K_SCALE), + rds_ibdev->max_8k_fmrs = device->attrs.max_mr ? + min_t(unsigned int, ((device->attrs.max_mr / 2) * RDS_MR_8K_SCALE), rds_ib_fmr_8k_pool_size) : rds_ib_fmr_8k_pool_size; - rds_ibdev->max_initiator_depth = dev_attr->max_qp_init_rd_atom; - rds_ibdev->max_responder_resources = dev_attr->max_qp_rd_atom; + rds_ibdev->max_initiator_depth = device->attrs.max_qp_init_rd_atom; + rds_ibdev->max_responder_resources = device->attrs.max_qp_rd_atom; rds_ibdev->dev = device; rds_ibdev->pd = ib_alloc_pd(device); @@ -183,7 +173,7 @@ static void rds_ib_add_one(struct ib_device *device) } rdsdebug("RDS/IB: max_mr = %d, max_wrs = %d, max_sge = %d, fmr_max_remaps = %d, max_1m_fmrs = %d, max_8k_fmrs = %d\n", -dev_attr->max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge, +device->attrs.max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge, rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_fmrs, rds_ibdev->max_8k_fmrs); @@ -202,8 +192,6 @@ static void rds_ib_add_one(struct ib_device *device) put_dev: rds_ib_dev_put(rds_ibdev); -free_attr: - kfree(dev_attr); } /* diff --git a/net/rds/iw.c b/net/rds/iw.c index 576f182..f4a9fff 100644 --- a/net/rds/iw.c +++ b/net/rds/iw.c @@ -60,30 +60,20 @@ LIST_HEAD(iw_nodev_conns); static void rds_iw_add_one(struct ib_device *device) { struct rds_iw_device *rds_iwdev; - struct ib_device_attr *dev_attr; /* Only handle iwarp devices */ if (device->node_type != RDMA_NODE_RNIC) return; - dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL); - if (!dev_attr) - return; - - if (ib_query_device(device, dev_attr)) { - rdsdebug("Query device failed for %s\n", device->name); - goto free_attr; - } - rds_iwdev = kmalloc(sizeof *rds_iwdev, GFP_KERNEL); if (!rds_iwdev) - goto free_attr; + return; spin_lock_init(&rds_iwdev->spinlock); - rds_iwdev->dma_local_lkey = !!(dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY); - rds_iwdev->max_wrs = dev_attr->max_qp_wr; - rds_iwdev->max_sge = min(dev_attr->max_sge, RDS_IW_MAX_SGE); + rds_iwdev->dma_local_lkey = !!(device->attrs.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY); + rds_iwdev->max_wrs = device->attrs.max_qp_wr; + rds_iwdev->max_sge = min(device->attrs.max_sge, RDS_IW_MAX_SGE); rds_iwdev->dev =
[PATCH rdma-next V1 0/7] dev attr cleanup (less is more)
OK, Doug, this is my suggestion for the dev attr cleanup -- it has the advantages of leaving the attrs on a well defined location, a field in the IB device, the ability to get that through smaller patches, avoid touching any of the HW drivers, etc. Or. changes from V0: - addressed Sagi's comment and removed ib_query_device, this changed patch #1 and added patch #7. - removed ref to Christoph patch in the change-logs of patches 2-6 Ira Weiny (1): IB/core: Save the device attributes on the device structure Or Gerlitz (6): IB/core: Avoid calling ib_query_device IB/ulps: Avoid calling ib_query_device net/rds: Avoid calling ib_query_device xprtrdma: Avoid calling ib_query_device staging/o2iblnd: Avoid calling ib_query_device IB/core: Remove ib_query_device drivers/infiniband/core/cm.c | 12 +- drivers/infiniband/core/cma.c | 8 drivers/infiniband/core/device.c | 27 drivers/infiniband/core/fmr_pool.c | 20 + drivers/infiniband/core/sysfs.c| 14 ++- drivers/infiniband/core/uverbs_cmd.c | 23 ++- drivers/infiniband/core/verbs.c| 8 +--- drivers/infiniband/ulp/ipoib/ipoib_cm.c| 19 +++-- drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 21 +- drivers/infiniband/ulp/iser/iscsi_iser.c | 4 +- drivers/infiniband/ulp/iser/iscsi_iser.h | 2 - drivers/infiniband/ulp/iser/iser_memory.c | 9 ++-- drivers/infiniband/ulp/iser/iser_verbs.c | 38 - drivers/infiniband/ulp/isert/ib_isert.c| 47 +++-- drivers/infiniband/ulp/isert/ib_isert.h| 1 - drivers/infiniband/ulp/srp/ib_srp.c| 32 --- drivers/infiniband/ulp/srpt/ib_srpt.c | 15 +++ drivers/infiniband/ulp/srpt/ib_srpt.h | 3 -- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 21 +- include/rdma/ib_verbs.h| 4 +- net/rds/ib.c | 34 +-- net/rds/iw.c | 23 +++ net/sunrpc/xprtrdma/frwr_ops.c | 7 ++-- net/sunrpc/xprtrdma/svc_rdma_transport.c | 48 +- net/sunrpc/xprtrdma/verbs.c| 24 --- net/sunrpc/xprtrdma/xprt_rdma.h| 1 - 27 files changed, 131 insertions(+), 348 deletions(-) -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next V1 2/7] IB/core: Avoid calling ib_query_device
Use the cached copy of the attributes present on the device, except for the case of a query originating from user-space, where we have to invoke the driver query_device entry, so they can fill in their udata. Signed-off-by: Or Gerlitz --- drivers/infiniband/core/cm.c | 12 +--- drivers/infiniband/core/cma.c| 8 drivers/infiniband/core/fmr_pool.c | 20 ++-- drivers/infiniband/core/sysfs.c | 14 -- drivers/infiniband/core/uverbs_cmd.c | 23 --- drivers/infiniband/core/verbs.c | 8 +--- 6 files changed, 12 insertions(+), 73 deletions(-) diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index 0a26dd6..7fa2b94 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -3731,16 +3731,6 @@ int ib_cm_init_qp_attr(struct ib_cm_id *cm_id, } EXPORT_SYMBOL(ib_cm_init_qp_attr); -static void cm_get_ack_delay(struct cm_device *cm_dev) -{ - struct ib_device_attr attr; - - if (ib_query_device(cm_dev->ib_device, &attr)) - cm_dev->ack_delay = 0; /* acks will rely on packet life time */ - else - cm_dev->ack_delay = attr.local_ca_ack_delay; -} - static ssize_t cm_show_counter(struct kobject *obj, struct attribute *attr, char *buf) { @@ -3852,7 +3842,7 @@ static void cm_add_one(struct ib_device *ib_device) return; cm_dev->ib_device = ib_device; - cm_get_ack_delay(cm_dev); + cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; cm_dev->going_down = 0; cm_dev->device = device_create(&cm_class, &ib_device->dev, MKDEV(0, 0), NULL, diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index d2d5d00..f8dfc63 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1894,7 +1894,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, struct rdma_id_private *listen_id, *conn_id; struct rdma_cm_event event; int ret; - struct ib_device_attr attr; struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr; struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr; @@ -1936,13 +1935,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, memcpy(cma_src_addr(conn_id), laddr, rdma_addr_size(laddr)); memcpy(cma_dst_addr(conn_id), raddr, rdma_addr_size(raddr)); - ret = ib_query_device(conn_id->id.device, &attr); - if (ret) { - mutex_unlock(&conn_id->handler_mutex); - rdma_destroy_id(new_cm_id); - goto out; - } - memset(&event, 0, sizeof event); event.event = RDMA_CM_EVENT_CONNECT_REQUEST; event.param.conn.private_data = iw_event->private_data; diff --git a/drivers/infiniband/core/fmr_pool.c b/drivers/infiniband/core/fmr_pool.c index 9f5ad7c..6ac3683 100644 --- a/drivers/infiniband/core/fmr_pool.c +++ b/drivers/infiniband/core/fmr_pool.c @@ -212,7 +212,6 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, { struct ib_device *device; struct ib_fmr_pool *pool; - struct ib_device_attr *attr; int i; int ret; int max_remaps; @@ -228,25 +227,10 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, return ERR_PTR(-ENOSYS); } - attr = kmalloc(sizeof *attr, GFP_KERNEL); - if (!attr) { - printk(KERN_WARNING PFX "couldn't allocate device attr struct\n"); - return ERR_PTR(-ENOMEM); - } - - ret = ib_query_device(device, attr); - if (ret) { - printk(KERN_WARNING PFX "couldn't query device: %d\n", ret); - kfree(attr); - return ERR_PTR(ret); - } - - if (!attr->max_map_per_fmr) + if (!device->attrs.max_map_per_fmr) max_remaps = IB_FMR_MAX_REMAPS; else - max_remaps = attr->max_map_per_fmr; - - kfree(attr); + max_remaps = device->attrs.max_map_per_fmr; pool = kmalloc(sizeof *pool, GFP_KERNEL); if (!pool) { diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c index b1f37d4..1d5b4b0 100644 --- a/drivers/infiniband/core/sysfs.c +++ b/drivers/infiniband/core/sysfs.c @@ -614,18 +614,12 @@ static ssize_t show_sys_image_guid(struct device *device, struct device_attribute *dev_attr, char *buf) { struct ib_device *dev = container_of(device, struct ib_device, dev); - struct ib_device_attr attr; - ssize_t ret; - - ret = ib_query_device(dev, &attr); - if (ret) - return ret;
Re: RoCE passive side failures on 4.4-rc5
On 12/17/2015 4:18 PM, Sagi Grimberg wrote: I'm using 4.4-rc2+ and I have RoCE working. I tried 4.4-rc2 and I see the same problem. I will make my .config available to you and Moni so you can try it out on your systems Sunday. I see this over both CX2 and CX3-pro, both are in VPI config. BTW getting the 28 (consumer defined) reject reason is something I saw coming and going from time to time in various weird cases over the past years. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure
On 12/17/2015 7:41 PM, Jason Gunthorpe wrote: On Thu, Dec 17, 2015 at 03:44:19PM +0200, Sagi Grimberg wrote: + ret = ib_query_device(device, &device->attrs); + if (ret) { + printk(KERN_WARNING "Couldn't query the device attributes\n"); + goto out; + } + I thought we're all for removing the call altogether aren't we? I'd say just call device->query_device() instead. Christoph's patch even got rid of device->query_device(), Wrong. UDATA is a mechanism added by the founders of the RDMA stack to allow for LL HW driver pass data to LL HW user space library piggy backed on uverbs commands. For example, mlx4 uses that to path details of the HW clock which are needed for libmlx4. Christoph noted that and didn't remove the query_device entry from mlx4, he also madethis somehow inelegant change to uverbs_cmd.c: @@ -3624,26 +3607,29 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file, if (ucore->outlen < resp.response_length) return -ENOSPC; - memset(&attr, 0, sizeof(attr)); - - err = ib_dev->query_device(ib_dev, &attr, uhw); - if (err) - return err; + if (ib_dev->query_device) { + int err = ib_dev->query_device(ib_dev, uhw); + if (err) + return err; + } else { + if (uhw->inlen || uhw->outlen) + return -EINVAL; + } which, IHMO, I prefer to see over this. It would be wrong to disallow LL drivers to pass data back and forth to/from their user space sister libraries, this is part of a 10Y old kernel UAPI. It re-enforces that these values are constants and drivers cannot change them on the fly. yes, we should be looking on how to constify things, but incrementally. I don't see why block this cleanup now as of this point. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rdma-next 0/6] dev attr cleanup (less is more)
On Thu, Dec 17, 2015 at 5:36 PM, Christoph Hellwig wrote: > [...] I heartily disagree with this approach [...] OK, I will remove your name and re-submit that -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RoCE passive side failures on 4.4-rc5
Guys, Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today, I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong works). I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the one with the 4.4-rc5 kernel can act as both (rping) client/server for IB links but only (rping) client for RoCE. I tried both inter-node and loopback runs, in all cases, the client side getsCM reject with reason 28, see [2], tried both iser and rping. Eth (ICMP, TCP) works OK. Or. [1] Linus tree up to commit a5e90b1 "Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm" [2] $ rping -d -v -c -a 192.168.31.17 -C 1 verbose client count 1 created cm_id 0x60cd90 cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x60cd90 (parent) cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x60cd90 (parent) rdma_resolve_addr - rdma_resolve_route successful created pd 0x60cfe0 created channel 0x608510 created cq 0x607f50 created qp 0x60d000 rping_setup_buffers called on cb 0x606010 allocated & registered buffers... cq_thread started. wait for CONNECTED state 10 cma_event type RDMA_CM_EVENT_REJECTED cma_id 0x60cd90 (parent) connect error -1 rping_free_buffers called on cb 0x606010 cma event RDMA_CM_EVENT_REJECTED, error 28 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next 5/6] xprtrdma: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Based on a patch from Christoph Hellwig Signed-off-by: Or Gerlitz --- net/sunrpc/xprtrdma/frwr_ops.c | 7 ++--- net/sunrpc/xprtrdma/svc_rdma_transport.c | 48 +--- net/sunrpc/xprtrdma/verbs.c | 24 ++-- net/sunrpc/xprtrdma/xprt_rdma.h | 1 - 4 files changed, 30 insertions(+), 50 deletions(-) diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c index 88cf9e7..8d4864f 100644 --- a/net/sunrpc/xprtrdma/frwr_ops.c +++ b/net/sunrpc/xprtrdma/frwr_ops.c @@ -190,12 +190,11 @@ static int frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep, struct rpcrdma_create_data_internal *cdata) { - struct ib_device_attr *devattr = &ia->ri_devattr; int depth, delta; ia->ri_max_frmr_depth = min_t(unsigned int, RPCRDMA_MAX_DATA_SEGS, - devattr->max_fast_reg_page_list_len); + ia->ri_device->attrs.max_fast_reg_page_list_len); dprintk("RPC: %s: device's max FR page list len = %u\n", __func__, ia->ri_max_frmr_depth); @@ -222,8 +221,8 @@ frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep, } ep->rep_attr.cap.max_send_wr *= depth; - if (ep->rep_attr.cap.max_send_wr > devattr->max_qp_wr) { - cdata->max_requests = devattr->max_qp_wr / depth; + if (ep->rep_attr.cap.max_send_wr > ia->ri_device->attrs.max_qp_wr) { + cdata->max_requests = ia->ri_device->attrs.max_qp_wr / depth; if (!cdata->max_requests) return -EINVAL; ep->rep_attr.cap.max_send_wr = cdata->max_requests * diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c index b348b4a..4d695af 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c @@ -886,10 +886,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) struct rdma_conn_param conn_param; struct ib_cq_init_attr cq_attr = {}; struct ib_qp_init_attr qp_attr; - struct ib_device_attr devattr; + struct ib_device *dev; int uninitialized_var(dma_mr_acc); int need_dma_mr = 0; - int ret; + int ret = 0; int i; listen_rdma = container_of(xprt, struct svcxprt_rdma, sc_xprt); @@ -910,20 +910,15 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) dprintk("svcrdma: newxprt from accept queue = %p, cm_id=%p\n", newxprt, newxprt->sc_cm_id); - ret = ib_query_device(newxprt->sc_cm_id->device, &devattr); - if (ret) { - dprintk("svcrdma: could not query device attributes on " - "device %p, rc=%d\n", newxprt->sc_cm_id->device, ret); - goto errout; - } + dev = newxprt->sc_cm_id->device; /* Qualify the transport resource defaults with the * capabilities of this particular device */ - newxprt->sc_max_sge = min((size_t)devattr.max_sge, + newxprt->sc_max_sge = min((size_t)dev->attrs.max_sge, (size_t)RPCSVC_MAXPAGES); - newxprt->sc_max_sge_rd = min_t(size_t, devattr.max_sge_rd, + newxprt->sc_max_sge_rd = min_t(size_t, dev->attrs.max_sge_rd, RPCSVC_MAXPAGES); - newxprt->sc_max_requests = min((size_t)devattr.max_qp_wr, + newxprt->sc_max_requests = min((size_t)dev->attrs.max_qp_wr, (size_t)svcrdma_max_requests); newxprt->sc_sq_depth = RPCRDMA_SQ_DEPTH_MULT * newxprt->sc_max_requests; @@ -931,16 +926,16 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) * Limit ORD based on client limit, local device limit, and * configured svcrdma limit. */ - newxprt->sc_ord = min_t(size_t, devattr.max_qp_rd_atom, newxprt->sc_ord); + newxprt->sc_ord = min_t(size_t, dev->attrs.max_qp_rd_atom, newxprt->sc_ord); newxprt->sc_ord = min_t(size_t, svcrdma_ord, newxprt->sc_ord); - newxprt->sc_pd = ib_alloc_pd(newxprt->sc_cm_id->device); + newxprt->sc_pd = ib_alloc_pd(dev); if (IS_ERR(newxprt->sc_pd)) { dprintk("svcrdma: error creating PD for connect request\n"); goto errout; } cq_attr.cqe = newxprt->sc_sq_depth; - newxprt->sc_sq_cq = ib_create_cq(newxprt->sc_cm_id->device, + newxprt->sc_sq_cq = ib_create_cq(dev, sq_comp_handler,
[PATCH rdma-next 3/6] IB/ulps: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Based on a patch from Christoph Hellwig Signed-off-by: Or Gerlitz --- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 19 --- drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++-- drivers/infiniband/ulp/ipoib/ipoib_main.c| 21 + drivers/infiniband/ulp/iser/iscsi_iser.c | 4 +-- drivers/infiniband/ulp/iser/iscsi_iser.h | 2 -- drivers/infiniband/ulp/iser/iser_memory.c| 9 +++--- drivers/infiniband/ulp/iser/iser_verbs.c | 38 ++ drivers/infiniband/ulp/isert/ib_isert.c | 47 +--- drivers/infiniband/ulp/isert/ib_isert.h | 1 - drivers/infiniband/ulp/srp/ib_srp.c | 32 ++- drivers/infiniband/ulp/srpt/ib_srpt.c| 15 - drivers/infiniband/ulp/srpt/ib_srpt.h| 3 -- 12 files changed, 63 insertions(+), 142 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 3ae9726..94d144d 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1522,8 +1522,7 @@ static void ipoib_cm_create_srq(struct net_device *dev, int max_sge) int ipoib_cm_dev_init(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); - int i, ret; - struct ib_device_attr attr; + int max_srq_sge, i; INIT_LIST_HEAD(&priv->cm.passive_ids); INIT_LIST_HEAD(&priv->cm.reap_list); @@ -1540,19 +1539,13 @@ int ipoib_cm_dev_init(struct net_device *dev) skb_queue_head_init(&priv->cm.skb_queue); - ret = ib_query_device(priv->ca, &attr); - if (ret) { - printk(KERN_WARNING "ib_query_device() failed with %d\n", ret); - return ret; - } - - ipoib_dbg(priv, "max_srq_sge=%d\n", attr.max_srq_sge); + ipoib_dbg(priv, "max_srq_sge=%d\n", priv->ca->attrs.max_srq_sge); - attr.max_srq_sge = min_t(int, IPOIB_CM_RX_SG, attr.max_srq_sge); - ipoib_cm_create_srq(dev, attr.max_srq_sge); + max_srq_sge = min_t(int, IPOIB_CM_RX_SG, priv->ca->attrs.max_srq_sge); + ipoib_cm_create_srq(dev, max_srq_sge); if (ipoib_cm_has_srq(dev)) { - priv->cm.max_cm_mtu = attr.max_srq_sge * PAGE_SIZE - 0x10; - priv->cm.num_frags = attr.max_srq_sge; + priv->cm.max_cm_mtu = max_srq_sge * PAGE_SIZE - 0x10; + priv->cm.num_frags = max_srq_sge; ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_frags=%d\n", priv->cm.max_cm_mtu, priv->cm.num_frags); } else { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c index 078cadd..a53fa5f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c @@ -40,15 +40,11 @@ static void ipoib_get_drvinfo(struct net_device *netdev, struct ethtool_drvinfo *drvinfo) { struct ipoib_dev_priv *priv = netdev_priv(netdev); - struct ib_device_attr *attr; - - attr = kmalloc(sizeof(*attr), GFP_KERNEL); - if (attr && !ib_query_device(priv->ca, attr)) - snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), -"%d.%d.%d", (int)(attr->fw_ver >> 32), -(int)(attr->fw_ver >> 16) & 0x, -(int)attr->fw_ver & 0x); - kfree(attr); + + snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), +"%d.%d.%d", (int)(priv->ca->attrs.fw_ver >> 32), +(int)(priv->ca->attrs.fw_ver >> 16) & 0x, +(int)priv->ca->attrs.fw_ver & 0x); strlcpy(drvinfo->bus_info, dev_name(priv->ca->dma_device), sizeof(drvinfo->bus_info)); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 7d32818..58732c5 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1777,26 +1777,7 @@ int ipoib_add_pkey_attr(struct net_device *dev) int ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca) { - struct ib_device_attr *device_attr; - int result = -ENOMEM; - - device_attr = kmalloc(sizeof *device_attr, GFP_KERNEL); - if (!device_attr) { - printk(KERN_WARNING "%s: allocation of %zu bytes failed\n", - hca->name, sizeof *device_attr); - return result; - } - - result = ib_query_device(hca, device_attr); - if (result) { - printk(KERN_WARNING "%s: ib_
[PATCH rdma-next 4/6] net/rds: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Based on a patch from Christoph Hellwig Signed-off-by: Or Gerlitz --- net/rds/ib.c | 34 +++--- net/rds/iw.c | 23 +-- 2 files changed, 16 insertions(+), 41 deletions(-) diff --git a/net/rds/ib.c b/net/rds/ib.c index f222885..9481d55 100644 --- a/net/rds/ib.c +++ b/net/rds/ib.c @@ -122,44 +122,34 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev) static void rds_ib_add_one(struct ib_device *device) { struct rds_ib_device *rds_ibdev; - struct ib_device_attr *dev_attr; /* Only handle IB (no iWARP) devices */ if (device->node_type != RDMA_NODE_IB_CA) return; - dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL); - if (!dev_attr) - return; - - if (ib_query_device(device, dev_attr)) { - rdsdebug("Query device failed for %s\n", device->name); - goto free_attr; - } - rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL, ibdev_to_node(device)); if (!rds_ibdev) - goto free_attr; + return; spin_lock_init(&rds_ibdev->spinlock); atomic_set(&rds_ibdev->refcount, 1); INIT_WORK(&rds_ibdev->free_work, rds_ib_dev_free); - rds_ibdev->max_wrs = dev_attr->max_qp_wr; - rds_ibdev->max_sge = min(dev_attr->max_sge, RDS_IB_MAX_SGE); + rds_ibdev->max_wrs = device->attrs.max_qp_wr; + rds_ibdev->max_sge = min(device->attrs.max_sge, RDS_IB_MAX_SGE); - rds_ibdev->fmr_max_remaps = dev_attr->max_map_per_fmr?: 32; - rds_ibdev->max_1m_fmrs = dev_attr->max_mr ? - min_t(unsigned int, (dev_attr->max_mr / 2), + rds_ibdev->fmr_max_remaps = device->attrs.max_map_per_fmr?: 32; + rds_ibdev->max_1m_fmrs = device->attrs.max_mr ? + min_t(unsigned int, (device->attrs.max_mr / 2), rds_ib_fmr_1m_pool_size) : rds_ib_fmr_1m_pool_size; - rds_ibdev->max_8k_fmrs = dev_attr->max_mr ? - min_t(unsigned int, ((dev_attr->max_mr / 2) * RDS_MR_8K_SCALE), + rds_ibdev->max_8k_fmrs = device->attrs.max_mr ? + min_t(unsigned int, ((device->attrs.max_mr / 2) * RDS_MR_8K_SCALE), rds_ib_fmr_8k_pool_size) : rds_ib_fmr_8k_pool_size; - rds_ibdev->max_initiator_depth = dev_attr->max_qp_init_rd_atom; - rds_ibdev->max_responder_resources = dev_attr->max_qp_rd_atom; + rds_ibdev->max_initiator_depth = device->attrs.max_qp_init_rd_atom; + rds_ibdev->max_responder_resources = device->attrs.max_qp_rd_atom; rds_ibdev->dev = device; rds_ibdev->pd = ib_alloc_pd(device); @@ -183,7 +173,7 @@ static void rds_ib_add_one(struct ib_device *device) } rdsdebug("RDS/IB: max_mr = %d, max_wrs = %d, max_sge = %d, fmr_max_remaps = %d, max_1m_fmrs = %d, max_8k_fmrs = %d\n", -dev_attr->max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge, +device->attrs.max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge, rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_fmrs, rds_ibdev->max_8k_fmrs); @@ -202,8 +192,6 @@ static void rds_ib_add_one(struct ib_device *device) put_dev: rds_ib_dev_put(rds_ibdev); -free_attr: - kfree(dev_attr); } /* diff --git a/net/rds/iw.c b/net/rds/iw.c index 576f182..f4a9fff 100644 --- a/net/rds/iw.c +++ b/net/rds/iw.c @@ -60,30 +60,20 @@ LIST_HEAD(iw_nodev_conns); static void rds_iw_add_one(struct ib_device *device) { struct rds_iw_device *rds_iwdev; - struct ib_device_attr *dev_attr; /* Only handle iwarp devices */ if (device->node_type != RDMA_NODE_RNIC) return; - dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL); - if (!dev_attr) - return; - - if (ib_query_device(device, dev_attr)) { - rdsdebug("Query device failed for %s\n", device->name); - goto free_attr; - } - rds_iwdev = kmalloc(sizeof *rds_iwdev, GFP_KERNEL); if (!rds_iwdev) - goto free_attr; + return; spin_lock_init(&rds_iwdev->spinlock); - rds_iwdev->dma_local_lkey = !!(dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY); - rds_iwdev->max_wrs = dev_attr->max_qp_wr; - rds_iwdev->max_sge = min(dev_attr->max_sge, RDS_IW_MAX_SGE); + rds_iwdev->dma_local_lkey = !!(device->attrs.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY); + rds_iwdev->max_wrs = device->attrs.max_qp_wr; + rds_iwdev->max_sge = min(device-&
[PATCH rdma-next 6/6] staging/o2iblnd: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device. Based on a patch from Christoph Hellwig Signed-off-by: Or Gerlitz --- drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 21 + 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 7c730e3..7231ef8 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -2070,32 +2070,13 @@ static int kiblnd_net_init_pools(kib_net_t *net, __u32 *cpts, int ncpts) static int kiblnd_hdev_get_attr(kib_hca_dev_t *hdev) { - struct ib_device_attr *attr; - int rc; - /* It's safe to assume a HCA can handle a page size * matching that of the native system */ hdev->ibh_page_shift = PAGE_SHIFT; hdev->ibh_page_size = 1 << PAGE_SHIFT; hdev->ibh_page_mask = ~((__u64)hdev->ibh_page_size - 1); - LIBCFS_ALLOC(attr, sizeof(*attr)); - if (attr == NULL) { - CERROR("Out of memory\n"); - return -ENOMEM; - } - - rc = ib_query_device(hdev->ibh_ibdev, attr); - if (rc == 0) - hdev->ibh_mr_size = attr->max_mr_size; - - LIBCFS_FREE(attr, sizeof(*attr)); - - if (rc != 0) { - CERROR("Failed to query IB device: %d\n", rc); - return rc; - } - + hdev->ibh_mr_size = hdev->ibh_ibdev->attrs.max_mr_size; if (hdev->ibh_mr_size == ~0ULL) { hdev->ibh_mr_shift = 64; return 0; -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next 2/6] IB/core: Avoid calling ib_query_device when possible
Instead, use the cached copy of the attributes present on the device. Based on a patch from Christoph Hellwig Signed-off-by: Or Gerlitz --- drivers/infiniband/core/cm.c | 12 +--- drivers/infiniband/core/cma.c| 8 drivers/infiniband/core/fmr_pool.c | 20 ++-- drivers/infiniband/core/sysfs.c | 14 -- drivers/infiniband/core/uverbs_cmd.c | 23 --- drivers/infiniband/core/verbs.c | 8 +--- 6 files changed, 12 insertions(+), 73 deletions(-) diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index 0a26dd6..7fa2b94 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -3731,16 +3731,6 @@ int ib_cm_init_qp_attr(struct ib_cm_id *cm_id, } EXPORT_SYMBOL(ib_cm_init_qp_attr); -static void cm_get_ack_delay(struct cm_device *cm_dev) -{ - struct ib_device_attr attr; - - if (ib_query_device(cm_dev->ib_device, &attr)) - cm_dev->ack_delay = 0; /* acks will rely on packet life time */ - else - cm_dev->ack_delay = attr.local_ca_ack_delay; -} - static ssize_t cm_show_counter(struct kobject *obj, struct attribute *attr, char *buf) { @@ -3852,7 +3842,7 @@ static void cm_add_one(struct ib_device *ib_device) return; cm_dev->ib_device = ib_device; - cm_get_ack_delay(cm_dev); + cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay; cm_dev->going_down = 0; cm_dev->device = device_create(&cm_class, &ib_device->dev, MKDEV(0, 0), NULL, diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index d2d5d00..f8dfc63 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1894,7 +1894,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, struct rdma_id_private *listen_id, *conn_id; struct rdma_cm_event event; int ret; - struct ib_device_attr attr; struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr; struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr; @@ -1936,13 +1935,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, memcpy(cma_src_addr(conn_id), laddr, rdma_addr_size(laddr)); memcpy(cma_dst_addr(conn_id), raddr, rdma_addr_size(raddr)); - ret = ib_query_device(conn_id->id.device, &attr); - if (ret) { - mutex_unlock(&conn_id->handler_mutex); - rdma_destroy_id(new_cm_id); - goto out; - } - memset(&event, 0, sizeof event); event.event = RDMA_CM_EVENT_CONNECT_REQUEST; event.param.conn.private_data = iw_event->private_data; diff --git a/drivers/infiniband/core/fmr_pool.c b/drivers/infiniband/core/fmr_pool.c index 9f5ad7c..6ac3683 100644 --- a/drivers/infiniband/core/fmr_pool.c +++ b/drivers/infiniband/core/fmr_pool.c @@ -212,7 +212,6 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, { struct ib_device *device; struct ib_fmr_pool *pool; - struct ib_device_attr *attr; int i; int ret; int max_remaps; @@ -228,25 +227,10 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, return ERR_PTR(-ENOSYS); } - attr = kmalloc(sizeof *attr, GFP_KERNEL); - if (!attr) { - printk(KERN_WARNING PFX "couldn't allocate device attr struct\n"); - return ERR_PTR(-ENOMEM); - } - - ret = ib_query_device(device, attr); - if (ret) { - printk(KERN_WARNING PFX "couldn't query device: %d\n", ret); - kfree(attr); - return ERR_PTR(ret); - } - - if (!attr->max_map_per_fmr) + if (!device->attrs.max_map_per_fmr) max_remaps = IB_FMR_MAX_REMAPS; else - max_remaps = attr->max_map_per_fmr; - - kfree(attr); + max_remaps = device->attrs.max_map_per_fmr; pool = kmalloc(sizeof *pool, GFP_KERNEL); if (!pool) { diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c index b1f37d4..1d5b4b0 100644 --- a/drivers/infiniband/core/sysfs.c +++ b/drivers/infiniband/core/sysfs.c @@ -614,18 +614,12 @@ static ssize_t show_sys_image_guid(struct device *device, struct device_attribute *dev_attr, char *buf) { struct ib_device *dev = container_of(device, struct ib_device, dev); - struct ib_device_attr attr; - ssize_t ret; - - ret = ib_query_device(dev, &attr); - if (ret) - return ret; return sprintf(buf, "%04x:%04x:%04x:%04x\n", - be16_to_cpu(((__be16 *) &attr.sys_
[PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure
From: Ira Weiny This way both the IB core and upper level drivers can access these cached device attributes rather than querying or caching them on their own. Signed-off-by: Ira Weiny Signed-off-by: Or Gerlitz --- drivers/infiniband/core/device.c | 6 ++ include/rdma/ib_verbs.h | 1 + 2 files changed, 7 insertions(+) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 179e813..398353b 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -352,6 +352,12 @@ int ib_register_device(struct ib_device *device, goto out; } + ret = ib_query_device(device, &device->attrs); + if (ret) { + printk(KERN_WARNING "Couldn't query the device attributes\n"); + goto out; + } + ret = ib_device_register_sysfs(device, port_callback); if (ret) { printk(KERN_WARNING "Couldn't register device %s with driver model\n", diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index d56b39a..75a22533 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1828,6 +1828,7 @@ struct ib_device { u16 is_switch:1; u8 node_type; u8 phys_port_cnt; + struct ib_device_attrattrs; /** * The following mandatory functions are used only at device -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH rdma-next 0/6] dev attr cleanup (less is more)
OK, Doug, this is my suggestion for the dev attr cleanup -- it has the advantages of leaving the attrs on a well defined location, a field in the IB device, the ability to get that through smaller patches, avoid touching any of the HW drivers, etc. Note that in uverbs there's one spot where udata is involved (the extended query device call) so we keep the ib_query_device call there. I used hunks from Christoph's work and mentioned that in the change-logs. This can turn to be his signature, if he wants to. Or. Ira Weiny (1): IB/core: Save the device attributes on the device structure Or Gerlitz (5): IB/core: Avoid calling ib_query_device when possible IB/ulps: Avoid calling ib_query_device net/rds: Avoid calling ib_query_device xprtrdma: Avoid calling ib_query_device staging/o2iblnd: Avoid calling ib_query_device drivers/infiniband/core/cm.c | 12 +- drivers/infiniband/core/cma.c | 8 drivers/infiniband/core/device.c | 6 +++ drivers/infiniband/core/fmr_pool.c | 20 + drivers/infiniband/core/sysfs.c| 14 ++- drivers/infiniband/core/uverbs_cmd.c | 23 ++- drivers/infiniband/core/verbs.c| 8 +--- drivers/infiniband/ulp/ipoib/ipoib_cm.c| 19 +++-- drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 21 +- drivers/infiniband/ulp/iser/iscsi_iser.c | 4 +- drivers/infiniband/ulp/iser/iscsi_iser.h | 2 - drivers/infiniband/ulp/iser/iser_memory.c | 9 ++-- drivers/infiniband/ulp/iser/iser_verbs.c | 38 - drivers/infiniband/ulp/isert/ib_isert.c| 47 +++-- drivers/infiniband/ulp/isert/ib_isert.h| 1 - drivers/infiniband/ulp/srp/ib_srp.c| 32 --- drivers/infiniband/ulp/srpt/ib_srpt.c | 15 +++ drivers/infiniband/ulp/srpt/ib_srpt.h | 3 -- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 21 +- include/rdma/ib_verbs.h| 1 + net/rds/ib.c | 34 +-- net/rds/iw.c | 23 +++ net/sunrpc/xprtrdma/frwr_ops.c | 7 ++-- net/sunrpc/xprtrdma/svc_rdma_transport.c | 48 +- net/sunrpc/xprtrdma/verbs.c| 24 --- net/sunrpc/xprtrdma/xprt_rdma.h| 1 - 27 files changed, 129 insertions(+), 326 deletions(-) -- 2.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup
On 12/16/2015 3:40 PM, Sagi Grimberg wrote: I really don't have a strong preference on either of the approaches. I just want to see this included one way or the other. sure, agree, I will send my patches tomorrow -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup
On 12/15/2015 9:03 PM, Doug Ledford wrote: Or, you specifically asked me to wait until this week. I made my initial impressions clear (I don't necessarily like the removal of the attr struct, but I like the removal of all of the query calls, and I'm inclined to take the patch in spite of not liking the removal of the struct). Do you have anything to add or have we beat this horse to death? Hi Doug, Lets stop beating, both horses and people. I do understand that 1. you don't link the removal of the attr 2. you do like the removal of all the query calls I am proposing to take the path of a patch that does exactly #2 while avoiding #1. What's wrong with that? I haven't heard any reasoning for why its so good to stash ~50 new fields on the IB device structure except for the author saying that other subsystems do that and other people saying they are in favor of this approach while not providing any reasoning, except for maybe something on bikes. Why you or anyone else has to be from now and ever the cache line police making sure that people don't add new attributes in random locations over the IB device structure? What's wrong with putting fifty attributesin a structure which is a field of the device struct and have people go there to see what are the d ifferentattrs and add news ones there? This will make the 4.5 merge window extremely complex or even totally threatened w.r.t to the RDMA subsystem and related drivers by 3.3K LOC patch. Sorry, but, I still don't get it. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH for-next 6/6] IB/mlx5: Add Raw Packet Queue Pair (QP) support
On 12/10/2015 10:09 PM, Majd Dibbiny wrote: This patchs adds support for Raw Packet QP for the mlx5 device. Raw Packet QP, unlike other QP types, has no matching mlx5_core_qp object but rather it is built of RQ/SQ/TIR/TIS/TD mlx5_core object. The Raw Packet QP state changes are implemented by changing the state of the sub-objects. Since the SQ and RQ work-queue (WQ) buffers are not contiguous like other QPs, we allocate separate buffers in the user-space and pass the address of each one of them separately to the kernel. Signed-off-by: Majd Dibbiny --- drivers/infiniband/hw/mlx5/main.c | 15 +- drivers/infiniband/hw/mlx5/mlx5_ib.h | 52 +- drivers/infiniband/hw/mlx5/qp.c| 894 ++--- drivers/infiniband/hw/mlx5/user.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/qp.c | 48 +- drivers/net/ethernet/mellanox/mlx5/core/transobj.c | 39 + include/linux/mlx5/mlx5_ifc.h | 11 +- include/linux/mlx5/qp.h| 3 +- include/linux/mlx5/transobj.h | 4 + 9 files changed, 913 insertions(+), 154 deletions(-) Seems way too big for single patch, seek how to split to 2-3 patches Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)
On Wed, Dec 9, 2015 at 1:13 AM, Or Gerlitz wrote: > On Wed, Dec 9, 2015 at 12:59 AM, Jason Gunthorpe > wrote: >> On Wed, Dec 09, 2015 at 12:47:55AM +0200, Or Gerlitz wrote: > >>> The patch is three liner to add the cached attrs -- >>> http://marc.info/?l=linux-rdma&m=142309296813985&w=2 -- if you are OK >>> with that, I will add a 2nd patch that ports all ULPs to use the >>> cached copy instead of their code which does the query. > >>> Actually, why not start with this approach and later decide if we need >>> to go further of this is enough? > >> Or, can we please stop this bikeshedding. Christoph's patch is done, >> well tested and does a lot more clean up that this hacky three liner. > > Christoph patch is here indeed, it does two things > > 1. remove all the ULP device attr alloc, device query, attr free hassle > 2. adds tons of new fields to struct ib_device > > I think it just goes too much and needlessly adds tons of these new > fields directly to struct ib_device where we can have them all well > scoped into ib_device_attr member or pointer from struct ib_device > >> It is a good patch, > > I didn't say it's bad, I said I think we can achieve #1 above with way > less drastic patch, and I'd like to hear the maintainer option, and > have the chance to argu with the maintainer if needed, yours I heard, > you like it, I know. and I will be OOO for the rest of this week, since this is in the air for ***months***, it would be fair enough to ask the maintainer not to cut it this way or another before next week, Doug, agree? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)
On Wed, Dec 9, 2015 at 12:59 AM, Jason Gunthorpe wrote: > On Wed, Dec 09, 2015 at 12:47:55AM +0200, Or Gerlitz wrote: >> The patch is three liner to add the cached attrs -- >> http://marc.info/?l=linux-rdma&m=142309296813985&w=2 -- if you are OK >> with that, I will add a 2nd patch that ports all ULPs to use the >> cached copy instead of their code which does the query. >> Actually, why not start with this approach and later decide if we need >> to go further of this is enough? > Or, can we please stop this bikeshedding. Christoph's patch is done, > well tested and does a lot more clean up that this hacky three liner. Christoph patch is here indeed, it does two things 1. remove all the ULP device attr alloc, device query, attr free hassle 2. adds tons of new fields to struct ib_device I think it just goes too much and needlessly adds tons of these new fields directly to struct ib_device where we can have them all well scoped into ib_device_attr member or pointer from struct ib_device > It is a good patch, I didn't say it's bad, I said I think we can achieve #1 above with way less drastic patch, and I'd like to hear the maintainer option, and have the chance to argu with the maintainer if needed, yours I heard, you like it, I know. > and although patchworks doesn't have my remarks > from an earlier revision I still think it should go forward. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)
On Wed, Dec 9, 2015 at 12:04 AM, Doug Ledford wrote: > Makes sense. thanks. > Show me what you are talking about (either a link to Ira's > patch you are referring to or your own patch). The patch is three liner to add the cached attrs -- http://marc.info/?l=linux-rdma&m=142309296813985&w=2 -- if you are OK with that, I will add a 2nd patch that ports all ULPs to use the cached copy instead of their code which does the query. Actually, why not start with this approach and later decide if we need to go further of this is enough? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 4.4's rdma plate
On Fri, Oct 23, 2015 at 5:52 PM, Doug Ledford wrote: > On 10/23/2015 02:46 AM, Or Gerlitz wrote: >> Doug, there are few singleton patches to mlxN drivers, just to make >> sure they are not left behind (thanks) >> a small mlx5 enhancement, Signed by Eli C. the driver maintainer >> https://patchwork.kernel.org/patch/7454641 >> two small mlx4 fixes, acked by me... >> https://patchwork.kernel.org/patch/7348911/ >> https://patchwork.kernel.org/patch/7348871/ > Yes, I know. There are quite a few different things for mlx5. I was > waiting on all of them until I knew that RoCEv2 was in as that gates > whether or not mlx5 RoCE can go in. Doug, So RoCEv2 didn't make it for 4.4, but still, the above are changes touching driver only code and I don't see why they should be waiting so long. Can we finally see them in a k.o branch of yours please? either for your 4.4-rc pull or on 4.5 branch. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)
On Tue, Dec 8, 2015 at 8:17 PM, Doug Ledford wrote: > > On 12/08/2015 01:13 PM, Sagi Grimberg wrote: > > > I mentioned this in v1. This patch set is applied over Christoph's > > device attributes patch. Will it go in as well? > > No, that's too big and not the right sort of fix for 4.4-rc. I had it > on my radar for getting into for-next. Doug, re the device attribute patch, I have expressed my opinion that --- we should be going in a slightly different direction of stashing a struct ib_device_attr as a direct or pointer field of struct ib_device and have the device or the core to fill it up just before the device instance creation with the core is to be complete. This way we can remove all the annoying calls from ULPs to query device and avoid adding so many fields to the device structure itself. So there are two suggestions on the table here and we need to hear the maintainer thinking. I do expect that your response will not be "I applied X" but rather you'll allow the time to try and convince you on whatever direction, this is pending for long time and we're on 4.4-rc4, there should be no rush to cut this over night and push to k.o the code for this small cleanup. I didn't submit a patch that follows my suggestion, but Ira did so in one of the OPA submission rounds, if you're OK with that proposal or if you even just want to see the patch to cast your vote/decision, I will rebase it and complete. Makes sense? Or. > > So, it'll have to be fixed up to apply over the top of your patches which > will already be applied. > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
stalled again
Doug, We're against into this... upstream is on 4.4-rc3 while your latest branch in kernel.org (the one that carries thefor-next tag) is rebased to 4.3-rc3... --> our internal build and review systems for patches to linux-rdma can'tmake any use of your tree. People here have to rebase their work against their own clones of Linus tree and can't work with our internal Gerrit rdma-next branch, etc, etc. We have to work in a very inefficient manner. Any reason on earth not to rebase your tree (== create new branch, move their the for-next tag) after an rc1 is out? how long does this take? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] IB/core: correct issue with sge copyin corrupting wr
On Mon, Nov 30, 2015 at 4:34 PM, Mike Marciniszyn wrote: Being a non native English speaker I am not fully sure, but "copyin corrupting" sounds like slang / street talking... proper English please > Commit e622f2f4ad21 ("IB: split struct ib_send_wr") > introduced a regression for HCAs whose user mode post > sends go through ib_uverbs_post_send(). > > The code didn't account for the fact that the first sge is > offset by an operation dependent length. The allocation did, > but the pointer to the sge list is computed without that knowledge. > > Store the operation dependent length in an automatic and > compute the sge list copy in destination using that length. in an automatic what? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 10/10] IB/iser: Support the remote invalidation exception
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg wrote: > > From: Jenny Derzhavetz > > Declare that we support remote invalidation in case we are: > 1. using Fastreg method > 2. always registering memory. decide if you want or don't want to use periods @ the end of these cases, but be consistent... I vote not to use periods > > Detect the invalidated rkey from the work completion info so we > won't invalidate it locally. The spec madates that we must not rely > on the taget remote invalidate our rkey so we must check it upon > a receive (scsi response) completion. > s/madates/mandates/ and run spell check over the change-logs of the whole series please before posting v2 > > Signed-off-by: Jenny Derzhavetz > Signed-off-by: Sagi Grimberg > --- > drivers/infiniband/ulp/iser/iscsi_iser.h | 5 ++- > drivers/infiniband/ulp/iser/iser_initiator.c | 59 > +++- > drivers/infiniband/ulp/iser/iser_memory.c| 4 +- > drivers/infiniband/ulp/iser/iser_verbs.c | 21 +++--- > 4 files changed, 80 insertions(+), 9 deletions(-) > > diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h > b/drivers/infiniband/ulp/iser/iscsi_iser.h > index c79fdba6f969..e399d9a49df2 100644 > --- a/drivers/infiniband/ulp/iser/iscsi_iser.h > +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h > @@ -358,6 +358,7 @@ struct iser_reg_ops { > * cpus and device max completion vectors > * @comps: Dinamically allocated array of completion handlers > * @reg_ops: Registration ops > + * @remote_inv_sup: Remote invalidate is supported on this device > */ > struct iser_device { > struct ib_device *ib_device; > @@ -370,6 +371,7 @@ struct iser_device { > int comps_used; > struct iser_comp *comps; > struct iser_reg_ops *reg_ops; > + bool remote_inv_sup; > }; > > #define ISER_CHECK_GUARD 0xc0 > @@ -519,6 +521,7 @@ struct iser_conn { > u32 num_rx_descs; > unsigned short scsi_sg_tablesize; > unsigned int scsi_max_sectors; > + bool snd_w_inv; > }; > > /** > @@ -602,7 +605,7 @@ int iser_conn_terminate(struct iser_conn *iser_conn); > void iser_release_work(struct work_struct *work); > > void iser_rcv_completion(struct iser_rx_desc *desc, > -unsigned long dto_xfer_len, > +struct ib_wc *wc, > struct ib_conn *ib_conn); > > void iser_snd_completion(struct iser_tx_desc *desc, > diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c > b/drivers/infiniband/ulp/iser/iser_initiator.c > index 6a968e350c14..cd12f89b3365 100644 > --- a/drivers/infiniband/ulp/iser/iser_initiator.c > +++ b/drivers/infiniband/ulp/iser/iser_initiator.c > @@ -563,11 +563,61 @@ send_control_error: > return err; > } > > +static inline void > +iser_inv_desc(struct iser_fr_desc *desc, u32 rkey) > +{ > + if (likely(rkey == desc->rsc.mr->rkey)) > + desc->rsc.mr_valid = 0; > + else if (likely(rkey == desc->pi_ctx->sig_mr->rkey)) > + desc->pi_ctx->sig_mr_valid = 0; > +} > + > +static int > +iser_check_remote_inv(struct iser_conn *iser_conn, > + struct ib_wc *wc, > + struct iscsi_hdr *hdr) > +{ > + if (wc->wc_flags & IB_WC_WITH_INVALIDATE) { > + struct iscsi_task *task; > + u32 rkey = wc->ex.invalidate_rkey; > + > + iser_dbg("conn %p: remote invalidation for rkey %#x\n", > +iser_conn, rkey); > + > + if (unlikely(!iser_conn->snd_w_inv)) { > + iser_err("conn %p: unexepected remote invalidation, " > +"terminating connection\n", iser_conn); > + return -EPROTO; > + } > + > + task = iscsi_itt_to_ctask(iser_conn->iscsi_conn, hdr->itt); > + if (likely(task)) { > + struct iscsi_iser_task *iser_task = task->dd_data; > + struct iser_fr_desc *desc; > + > + if (iser_task->dir[ISER_DIR_IN]) { > + desc = iser_task->rdma_reg[ISER_DIR_IN].mem_h; > + iser_inv_desc(desc, rkey); > + } > + > + if (iser_task->dir[ISER_DIR_OUT]) { > + desc = > iser_task->rdma_reg[ISER_DIR_OUT].mem_h; > + iser_inv_desc(desc, rkey); > + } > + } else { > + iser_err("failed to get task for itt=%d\n", hdr->itt); > + return -EINVAL; > + } > + } > + > + return 0; > +} > + > /** > * iser_rcv_dto_completion - recv DTO completion > */ > vo
Re: [PATCH v1 08/10] iser-target: Support the remote invalidation exception
On Wed, Nov 25, 2015 at 10:48 AM, Sagi Grimberg wrote: > On 25/11/2015 10:41, Or Gerlitz wrote: >> On Wed, Nov 25, 2015 at 9:55 AM, Sagi Grimberg >> wrote: >> I see, so if this is case, can you eliminate one the checks here >>>>> + if (isert_conn->snd_w_inv && isert_cmd->inv_rkey) { > This are *exactly* the checks that enforce what I said above. > If we remove that we'd step on the bug you mentioned. > We do remote invalidate only if: > - we are allowed to (send_w_inv) > - initiator passed us rkey (inv_rkey). yep, should be probably OK. You didn't respond to my comment re adding bools vs bit-fields vs bit-flags Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Future of FMR support, was: Re: [PATCH v1 5/9] xprtrdma: Add ro_unmap_sync method for FMR
On Wed, Nov 25, 2015 at 7:09 PM, santosh shilimkar wrote: >>> As already indicated to Sagi [1], RDS IB FR support is work in >>> progress and I was hoping to get it ready for 4.5. These are really good news! can you please elaborate a bit on the design changes this move introduced in RDS? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 08/10] iser-target: Support the remote invalidation exception
On Wed, Nov 25, 2015 at 9:55 AM, Sagi Grimberg wrote: > For NO_DATA commands the iser specification explicitly states that > read_stag and write_stag should be 0 which means that inv_rkey is 0 > too. We don't do remote invalidate in case inv_rkey is 0. I see, so if this is case, can you eliminate one the checks here >>> + if (isert_conn->snd_w_inv && isert_cmd->inv_rkey) { -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 09/10] IB/iser: Increment the rkey when registering and not when invalidating
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg wrote: > With remote invalidate we won't local invalidate > but we still want to increment the rkey. nit, better to phrase is as "we won't do local invalidate" -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 03/10] IB/iser: Don't register memory for all immediatedata writes
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg wrote: > From: Jenny Derzhavetz > > When all the task data is sent as immeidatedata, we are > allowed to use the local_dma_lkey as it is not sent to > the wire. In the long run we'd really need to rework > the wire. the part below doesn't fit into change-log which will be there forever in the kernel-logs, remove it > In the long run we'd really need to rework > the memory registration flow only when we need rkeys. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 08/10] iser-target: Support the remote invalidation exception
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg wrote: > @@ -1100,7 +1122,14 @@ isert_init_send_wr(struct isert_conn *isert_conn, > struct isert_cmd *isert_cmd, > > isert_cmd->rdma_wr.iser_ib_op = ISER_IB_SEND; > send_wr->wr_id = (uintptr_t)&isert_cmd->tx_desc; > - send_wr->opcode = IB_WR_SEND; > + > + if (isert_conn->snd_w_inv && isert_cmd->inv_rkey) { > + send_wr->opcode = IB_WR_SEND_WITH_INV; > + send_wr->ex.invalidate_rkey = isert_cmd->inv_rkey; > + } else { > + send_wr->opcode = IB_WR_SEND; > + } > + > @@ -1489,6 +1518,7 @@ isert_rx_opcode(struct isert_conn *isert_conn, struct > iser_rx_desc *rx_desc, > isert_cmd->read_va = read_va; > isert_cmd->write_stag = write_stag; > isert_cmd->write_va = write_va; > + isert_cmd->inv_rkey = read_stag ? read_stag : write_stag; bug what happens for commands for which we don't register any memory, TUR and such? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html