Re: NULL pointer deref in k.o/for-4.5

2016-01-06 Thread Or Gerlitz
On Thu, Jan 7, 2016 at 1:31 AM, Chuck Lever  wrote:
>> On Jan 6, 2016, at 5:25 PM, Or Gerlitz  wrote:
>> On Wed, Jan 6, 2016 at 9:20 PM, Chuck Lever 

>> was reported here 2-3 times, this fixes that
>> https://patchwork.kernel.org/patch/7929551

> Confirmed, that fixes it. Thanks, I never would have
> guessed that was the fix.

tried linux-rdma mailing list search on people failing with the for-4.5 bits?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NULL pointer deref in k.o/for-4.5

2016-01-06 Thread Or Gerlitz
On Wed, Jan 6, 2016 at 9:20 PM, Chuck Lever  wrote:
> And appears to be 100% reproducible. Any debugging
> advice welcome!

was reported here 2-3 times, this fixes that
https://patchwork.kernel.org/patch/7929551
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 TRIVIAL] IB/core: Documentation fix to ib_mad_snoop_handler in the MAD header file

2016-01-05 Thread Or Gerlitz
On Tue, Jan 5, 2016 at 9:00 PM, Hefty, Sean  wrote:
>> > There are no in tree users and the only attempt at adding one was
>> rejected.
>>
>> There are no in tree users of this but there is your madeye tool (which
>> is out of tree). This is still a useful debug tool for MADs and there
>> are people who still use that.
>
> It's an out of tree tool.  Maintain the snooping hooks into the mad layer out 
> of tree as well.

Any real reason not to pick this good tool into the kernel code?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup

2016-01-05 Thread Or Gerlitz
On Tue, Jan 5, 2016 at 6:46 PM, Steve Wise  wrote:
> Hey Doug, I don't see this branch.  Which branch has the accepted device attr 
> change?

k.o/for-4.5 on Doug's kernel.org tree
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix

2016-01-04 Thread Or Gerlitz

On 1/4/2016 10:44 PM, Hal Rosenstock wrote:

ib_mad_snoop_handler uses send_buf rather than send_wr

Signed-off-by: Hal Rosenstock 


Please use higher language in commit titles e.g

 IB/core: Documentation fix in the MAD header file


---
Change since v1:
Fixed typo in patch description

diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index ec9b44d..2b3573d 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -424,11 +424,11 @@ typedef void (*ib_mad_send_handler)(struct ib_mad_agent 
*mad_agent,
  /**
   * ib_mad_snoop_handler - Callback handler for snooping sent MADs.
   * @mad_agent: MAD agent that snooped the MAD.
- * @send_wr: Work request information on the sent MAD.
+ * @send_buf: send MAD data buffer.
   * @mad_send_wc: Work completion information on the sent MAD.  Valid
   *   only for snooping that occurs on a send completion.
   *
- * Clients snooping MADs should not modify data referenced by the @send_wr
+ * Clients snooping MADs should not modify data referenced by the @send_buf
   * or @mad_send_wc.
   */
  typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent,
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

__
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
__


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/10] iSER support for remote invalidate

2016-01-03 Thread Or Gerlitz
On Wed, Dec 23, 2015 at 10:53 PM, Or Gerlitz  wrote:
> On Mon, Dec 21, 2015 at 3:29 PM, Sagi Grimberg  
> wrote:
>
>>> Sagi, are you going to spin an increment in the initiator version?
>
>> I don't know if it's worth a driver version update?
>
> yes, I think it does. Finally we're getting real close to the visions
> of the spec authors...  do you have some performance data showing the
> gain of remote invalidation on certain HWs and IO benchmarks patterns.
> Did we make sure to interoperate between patched initiator to
> unpatched LIO or other targets which don't have this now (SCST, TGT)
> and the other way around (unpatched initiator, patched LIO)?


Sagi?


>> In case we do increment, I can send an incremental patch instead of
>> re-spinning the entire series.
>
> lets do that.

lets
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/sysfs: Fix sparse warning on attr_id

2016-01-03 Thread Or Gerlitz
On Mon, Jan 4, 2016 at 5:44 AM,   wrote:
> From: Ira Weiny 
>
> Attributed ID was declared as an int while the value should really be big
> endian 16.
>
> Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")
>

remove this blank line (&& I provided you this comment multiple times
in the past)

> Reported-by: Bart Van Assche 
> Signed-off-by: Ira Weiny 

reverse the order, your S.O.B sig line should come 1st and later the rest of
the reporters/ackers/testers/maintainers notes

> ---
>  drivers/infiniband/core/sysfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index 2daf83276cdf..3de93517efe4 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -77,7 +77,7 @@ struct port_table_attribute {
> struct port_attribute   attr;
> charname[8];
> int index;
> -   int attr_id;
> +   __be16  attr_id;
>  };
>
>  static ssize_t port_attr_show(struct kobject *kobj,
> @@ -413,7 +413,7 @@ struct port_table_attribute port_pma_attr_ext_##_name = { 
>   \
>   * Get a Perfmgmt MAD block of data.
>   * Returns error code or the number of bytes retrieved.
>   */
> -static int get_perf_mad(struct ib_device *dev, int port_num, int attr,
> +static int get_perf_mad(struct ib_device *dev, int port_num, __be16 attr,
> void *data, int offset, size_t size)
>  {
> struct ib_mad *in_mad;
> --
> 1.8.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/ipoib: Expose ioctl command to retrieve SGID of a given socket

2015-12-31 Thread Or Gerlitz

On 12/31/2015 4:41 PM, Yuval Shaia wrote:

To support security applications, that need to filter out connections based
on SGID, an ioctl command to retrieve SGID of a given socket is added.

[...]

+
+found:
+   if (!(neigh->nud_state & NUD_VALID))
+   return -EINVAL;
+
+   gid = (union ib_gid *)(neigh->ha + 4);
+   *sgid = be64_to_cpu(gid->global.interface_id);
+   *subnet_prefix = be64_to_cpu(gid->global.subnet_prefix);


wait (1st)

the neighbour holds a destination address, not source address, so why 
are you talking on SGID?!


wait (2nd)

what prevents you from getting this info in user space through netlink 
from the kernel rtnl routing/neighbour services?



root@r-dcs54 ~]# ip r s  | grep 192.168.20.0/24
192.168.20.0/24 dev ib0  proto kernel  scope link  src 192.168.20.17



[root@r-dcs54 ~]# ip n s  | grep ib0
192.168.20.18 dev ib0 lladdr 
80:00:00:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:01:da:81 DELAY



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers

2015-12-30 Thread Or Gerlitz

On 12/30/2015 1:04 PM, Moni Shoua wrote:

The last hunk that you removed had a role and was by no means
dead-code, right? so... (1) why it's correct to remove it? (2) if you
want to introduce different way to implement what was done here, why
in this patch? maybe add pre-patch for that

In a way you are right. This hunk does not insert a bug and even
improves correctness but it acutally belongs to an earlier patch
(dbf727de7440f73c4b92be4b958cbc24977e8ca2 IB/core: Use GID table in AH
creation and dmac resolution)


so what's the plan here? avoid deleting it?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 crash during boot

2015-12-30 Thread Or Gerlitz

On 12/30/2015 2:04 PM, Bart Van Assche wrote:

Hello Christoph,

Can you check whether the branch in the subject of this e-mail works 
fine on your setup (commit 59caaed7a7) ? On my test setup (Dell R430 
with two ConnectX-3 adapters) this branch crashes during boot in 
get_counter_table() (see also the attached screenshot).


Can you please check with Hal's fix that was posted here 1-2 days ago, 
hopefully this should make your system to work


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next V3 00/11] Add RoCE v2 support

2015-12-30 Thread Or Gerlitz

On 12/30/2015 12:48 PM, Matan Barak wrote:



On 12/30/2015 8:04 AM, Or Gerlitz wrote:

Hi Matan,

I see these two smatch complaints on code added with this series, can
you please take a look?

drivers/infiniband/core/addr.c:503 rdma_resolve_ip_route() warn:
variable dereferenced before check 'src_addr' (see line 500)
drivers/infiniband/core/cma_configfs.c:172 make_cma_ports() warn: double
check that we're allocating correct size: 8 vs 128



I'll send fixes for both of them. Thanks for posting this.


when the same smatch runs on older gcc, it produces more warnings, are 
they false-positives?



drivers/infiniband/core/cma_configfs.c: In function 
?default_roce_mode_store?:
drivers/infiniband/core/cma_configfs.c:123: warning: ?cma_dev? may be 
used uninitialized in this function
drivers/infiniband/core/cma_configfs.c:124: warning: ?group? may be 
used uninitialized in this function
drivers/infiniband/core/cma_configfs.c: In function 
?default_roce_mode_show?:
drivers/infiniband/core/cma_configfs.c:102: warning: ?cma_dev? may be 
used uninitialized in this function
drivers/infiniband/core/cma_configfs.c:103: warning: ?group? may be 
used uninitialized in this function


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support

2015-12-30 Thread Or Gerlitz

On 12/30/2015 10:27 AM, Matan Barak wrote:



On 12/29/2015 5:19 PM, Or Gerlitz wrote:

On 12/29/2015 3:24 PM, Matan Barak wrote:

@@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev,
struct mlx4_dev_cap *dev_cap)
  dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
  MLX4_GET(dev_cap->bmme_flags, outbox,
   QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
+dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;


Did you make sure that the query dev cap wrapper unsets this bit when
proxing VF queries?


In mlx4_dev_cap:
if (mlx4_is_mfunc(dev)) {
dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
}

mlx4_slave_cap calls mlx4_dev_cap and uses the dev_caps it queried, so 
we should be safe here.


mlx4_slave_cap is part of the Linux VF driver flow, right?

So...  NO, this is the Linux implementation.

You should make things robust against any guest driver.

The only way to do that is patch the command wrapper used by the PF
to filter out unwanted cap bits, see other filtering we do in 
mlx4_QUERY_DEV_CAP_wrapper


Or.







  if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
  dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
  MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);





--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2

2015-12-30 Thread Or Gerlitz

On 12/30/2015 10:23 AM, Matan Barak wrote:


int mlx4_CONFIG_DEV_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
struct mlx4_cmd_mailbox *outbox,
struct mlx4_cmd_info *cmd)
{
int err;
u8 get = vhcr->op_modifier;

if (get != 1)
return -EPERM;

err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);

return err;
}

Only "get" is permitted in multi-function setups.


good, thanks for clarifying this out.



Anyway, mlx4_config_roce_v2_port is not called for these setups 
because of this condition:

if (mlx4_is_mfunc(dev)) {
dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
mlx4_dbg(dev, "RoCE V2 is not supported when SR-IOV is enabled\n");
} 


wrong again, you assume your Linux VF driver, but the VM can run other 
driver.



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next V3 00/11] Add RoCE v2 support

2015-12-29 Thread Or Gerlitz

Hi Matan,

I see these two smatch complaints on code added with this series, can 
you please take a look?


drivers/infiniband/core/addr.c:503 rdma_resolve_ip_route() warn: 
variable dereferenced before check 'src_addr' (see line 500)
drivers/infiniband/core/cma_configfs.c:172 make_cma_ports() warn: double 
check that we're allocating correct size: 8 vs 128



Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 2/3] IB/core: Change per-entry lock in RoCE GID table to one lock

2015-12-29 Thread Or Gerlitz

On 10/28/2015 4:52 PM, Matan Barak wrote:

@@ -134,16 +138,14 @@ static int write_gid(struct ib_device *ib_dev, u8 port,
  {
int ret = 0;
struct net_device *old_net_dev;
-   unsigned long flags;
  
  	/* in rdma_cap_roce_gid_table, this funciton should be protected by a

 * sleep-able lock.
 */
-   write_lock_irqsave(&table->data_vec[ix].lock, flags);
  
  	if (rdma_cap_roce_gid_table(ib_dev, port)) {

table->data_vec[ix].props |= GID_TABLE_ENTRY_INVALID;
-   write_unlock_irqrestore(&table->data_vec[ix].lock, flags);
+   write_unlock_irq(&table->rwlock);
/* GID_TABLE_WRITE_ACTION_MODIFY currently isn't supported by
 * RoCE providers and thus only updates the cache.
 */
@@ -153,7 +155,7 @@ static int write_gid(struct ib_device *ib_dev, u8 port,
else if (action == GID_TABLE_WRITE_ACTION_DEL)
ret = ib_dev->del_gid(ib_dev, port, ix,
  &table->data_vec[ix].context);
-   write_lock_irqsave(&table->data_vec[ix].lock, flags);
+   write_lock_irq(&table->rwlock);
}


sparse complains on

drivers/infiniband/core/cache.c:186:17: warning: context imbalance in 
'write_gid' - unexpected unlock


is this false positive?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next V3 10/11] IB/core: Initialize UD header structure with IP and UDP headers

2015-12-29 Thread Or Gerlitz

On 12/23/2015 2:56 PM, Matan Barak wrote:
  
+__be16 ib_ud_ip4_csum(struct ib_ud_header *header)

+{
+   struct iphdr iph;
+
+   iph.ihl = 5;
+   iph.version = 4;
+   iph.tos = header->ip4.tos;
+   iph.tot_len = header->ip4.tot_len;
+   iph.id  = header->ip4.id;
+   iph.frag_off= header->ip4.frag_off;
+   iph.ttl = header->ip4.ttl;
+   iph.protocol= header->ip4.protocol;
+   iph.check   = 0;
+   iph.saddr   = header->ip4.saddr;
+   iph.daddr   = header->ip4.daddr;
+
+   return ip_fast_csum((u8 *)&iph, iph.ihl);
+}
+EXPORT_SYMBOL(ib_ud_ip4_csum);


You have introduced here this sparse warning, please fix

drivers/infiniband/core/ud_header.c:299:28: warning: incorrect type in 
return expression (different base types)

drivers/infiniband/core/ud_header.c:299:28:expected restricted __be16
drivers/infiniband/core/ud_header.c:299:28:got restricted __sum16

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 5/7] IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers

2015-12-29 Thread Or Gerlitz
On Tue, Dec 29, 2015 at 3:24 PM, Matan Barak  wrote:
> @@ -2413,34 +2442,27 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, 
> struct ib_ud_wr *wr,
>
> if (is_eth) {
> struct in6_addr in6;
> -
> +   u16 ether_type;
> u16 pcp = (be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 29) 
> << 13;
>
> +   ether_type = (!is_udp) ? MLX4_IB_IBOE_ETHERTYPE :
> +   (ip_version == 4 ? ETH_P_IP : ETH_P_IPV6);
> +
> mlx->sched_prio = cpu_to_be16(pcp);
>
> +   ether_addr_copy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac);
> memcpy(sqp->ud_header.eth.dmac_h, ah->av.eth.mac, 6);
> -   /* FIXME: cache smac value? */
> memcpy(&ctrl->srcrb_flags16[0], ah->av.eth.mac, 2);
> memcpy(&ctrl->imm, ah->av.eth.mac + 2, 4);
> memcpy(&in6, sgid.raw, sizeof(in6));
>
> -   if (!mlx4_is_mfunc(to_mdev(ib_dev)->dev)) {
> -   u64 mac = 
> atomic64_read(&to_mdev(ib_dev)->iboe.mac[sqp->qp.port - 1]);
> -   u8 smac[ETH_ALEN];
> -
> -   mlx4_u64_to_smac(smac, mac);
> -   memcpy(sqp->ud_header.eth.smac_h, smac, ETH_ALEN);
> -   } else {
> -   /* use the src mac of the tunnel */
> -   memcpy(sqp->ud_header.eth.smac_h, ah->av.eth.s_mac, 
> ETH_ALEN);
> -   }
>

The last hunk that you removed had a role and was by no means
dead-code, right? so... (1) why it's correct to remove it? (2) if you
want to introduce different way to implement what was done here, why
in this patch? maybe add pre-patch for that
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 7/7] IB/mlx4: Advertise RoCE support

2015-12-29 Thread Or Gerlitz

On 12/29/2015 3:24 PM, Matan Barak wrote:

Advertise RoCE support in port_immutable according to the hardware
capabilities. This enables the verbs stack to use RoCE v2 mode.


Advertise RoCE V2 support



Signed-off-by: Matan Barak 


I guess you wanted  "IB/mlx4: Advertise RoCE V2 support" for the patch 
title? since we did

advertise RDMA_CORE_PORT_IBA_ROCE prior to this patch.

Or.

---
  drivers/infiniband/hw/mlx4/main.c | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 44e5699..8cf2575 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2183,6 +2183,7 @@ static int mlx4_port_immutable(struct ib_device *ibdev, 
u8 port_num,
   struct ib_port_immutable *immutable)
  {
struct ib_port_attr attr;
+   struct mlx4_ib_dev *mdev = to_mdev(ibdev);
int err;
  
  	err = mlx4_ib_query_port(ibdev, port_num, &attr);

@@ -2192,10 +2193,15 @@ static int mlx4_port_immutable(struct ib_device *ibdev, 
u8 port_num,
immutable->pkey_tbl_len = attr.pkey_tbl_len;
immutable->gid_tbl_len = attr.gid_tbl_len;
  
-	if (mlx4_ib_port_link_layer(ibdev, port_num) == IB_LINK_LAYER_INFINIBAND)

+   if (mlx4_ib_port_link_layer(ibdev, port_num) == 
IB_LINK_LAYER_INFINIBAND) {
immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
-   else
-   immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+   } else {
+   if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE)
+   immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+   if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
+   immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE |
+   RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+   }
  
  	immutable->max_mad_size = IB_MGMT_MAD_SIZE;
  


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 1/7] IB/mlx4: Query RoCE support

2015-12-29 Thread Or Gerlitz

On 12/29/2015 3:24 PM, Matan Barak wrote:

@@ -905,6 +906,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct 
mlx4_dev_cap *dev_cap)
dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
MLX4_GET(dev_cap->bmme_flags, outbox,
 QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+   if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
+   dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;


Did you make sure that the query dev cap wrapper unsets this bit when 
proxing VF queries?



if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 2/7] IB/mlx4: Add RoCE per GID support for add_gid and del_gid

2015-12-29 Thread Or Gerlitz

On 12/29/2015 3:24 PM, Matan Barak wrote:

[...] We use a new firmware command in order to populate the GID table and 
store the type along with the GID value.


Its a new value to existing command.. so better say we use a new value 
to the SET_PORT firmware command to do X


Also here, break out mlx4_core new functionality e.g the changes to 
include/linux/mlx4/cmd.h into mlx4_core only patch. You don't need any 
change to mlx4_core to have it's own patch, I guess one up to three mlx4 
core patches would be OK.


Did you make sure (at the resource tracker) that VFs can't do this new 
set port command flavor?


Also find some spot to put blank line in the change-log, it's hard to 
read this way.


Or.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Or Gerlitz

On 12/29/2015 4:41 PM, Leon Romanovsky wrote:

From: Leon Romanovsky 

The create_cq() can receive creation flags which were used
differently by two commits which added create_cq extended
command and cross-channel. The merged code caused to not
accept any flags at all.

This patch unifies the check into one function and one return
error code.

Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command")
Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support")

---
Changes from v1:
   * Remove links to linux-rdma from commit message
   * Placed change log under git comment section (---)
Changes from v0:
   * Add Fixes tag

Signed-off-by: Leon Romanovsky 
wrong placing. Needs to be before the 1st --- and w.o blank lines after 
the Fixes: lines please


Please use dry runs to get this to run (...) correctly



---
  drivers/infiniband/hw/mlx5/cq.c  | 9 +
  drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
  2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index b14316603e44..7ddc790b1819 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -757,10 +757,6 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, 
struct mlx5_ib_cq *cq)
mlx5_db_free(dev->mdev, &cq->db);
  }
  
-enum {

-   CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION
-};
-
  struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
@@ -778,13 +774,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
  
-	if (check_cq_create_flags(attr->flags))

-   return ERR_PTR(-EINVAL);
-
if (entries < 0)
return ERR_PTR(-EINVAL);
  
-	if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)

+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EOPNOTSUPP);
  
  	entries = roundup_pow_of_two(entries + 1);

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h 
b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d4b227126265..fbf14a768105 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -692,6 +692,7 @@ static inline u32 check_cq_create_flags(u32 flags)
 * It returns non-zero value for unsupported CQ
 * create flags, otherwise it returns zero.
 */
-   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+   return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN |
+ IB_CQ_FLAGS_TIMESTAMP_COMPLETION));
  }
  #endif /* MLX5_IB_H */


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 6/7] IB/mlx4: Create and use another QP1 for RoCEv2

2015-12-29 Thread Or Gerlitz

On 12/29/2015 3:24 PM, Matan Barak wrote:

The mlx4 driver uses a special QP to implement the GSI QP. This kind
of QP allows to build the InfiniBand headers in SW to be put before
the payload that comes in with the WR. The mlx4 HW builds the packet,
calculates the ICRC and puts it at the end of the payload. This ICRC
calculation however depends on the QP configuration which is
determined when QP is modified (roce_mode during INIT->RTR).
On the other hand, ICRC verification when packet is received does to
depend on this configuration.


I don't understand the part of the sentence saying "when packet is 
received does to depend on this configuration"

maybe some typo/s there?


Therefore, using 2 GSI QPs for send (one for each RoCE version) and 1
GSI QP for receive are required.


s/2/two/ and s/1/one/ please

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 3/7] IB/mlx4: Configure device to work in RoCEv2

2015-12-29 Thread Or Gerlitz

On 12/29/2015 3:24 PM, Matan Barak wrote:

From: Moni Shoua 

Some mlx4 adapters are RoCEv2 capable. To enable this feature some
hardware configuration is required. This is

1. Set port general parameters
2. Configure the outgoing UDP destination port
3. Configure the QP that work with RoCEv2

Signed-off-by: Moni Shoua 
---
  drivers/infiniband/hw/mlx4/main.c | 19 ++---
  drivers/infiniband/hw/mlx4/qp.c   | 35 ---
  drivers/net/ethernet/mellanox/mlx4/fw.c   | 16 +-
  drivers/net/ethernet/mellanox/mlx4/mlx4.h |  7 +--
  drivers/net/ethernet/mellanox/mlx4/port.c |  8 +++
  drivers/net/ethernet/mellanox/mlx4/qp.c   | 28 +
  include/linux/mlx4/device.h   |  1 +
  include/linux/mlx4/qp.h   | 15 +++--
  include/rdma/ib_verbs.h   |  2 ++
  9 files changed, 120 insertions(+), 11 deletions(-)


Better put (please do...) functionality which is plain mlx4 corish (such 
as new/modified FW commands, new SW/FW fields of structs and such) into 
mlx4_core patch.




diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 988fa33..44e5699 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -384,6 +384,7 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev 
*ibdev,
int i;
int ret;
unsigned long flags;
+   struct ib_gid_attr attr;
  
  	if (port_num > MLX4_MAX_PORTS)

return -EINVAL;
@@ -394,10 +395,13 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev 
*ibdev,
if (!rdma_cap_roce_gid_table(&ibdev->ib_dev, port_num))
return index;
  
-	ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, NULL);

+   ret = ib_get_cached_gid(&ibdev->ib_dev, port_num, index, &gid, &attr);
if (ret)
return ret;
  
+	if (attr.ndev)

+   dev_put(attr.ndev);
+
if (!memcmp(&gid, &zgid, sizeof(gid)))
return -EINVAL;
  
@@ -405,7 +409,8 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,

port_gid_table = &iboe->gids[port_num - 1];
  
  	for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i)

-   if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid))) {
+   if (!memcmp(&port_gid_table->gids[i].gid, &gid, sizeof(gid)) &&
+   attr.gid_type == port_gid_table->gids[i].gid_type) {
ctx = port_gid_table->gids[i].ctx;
break;
}
@@ -2481,7 +2486,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
if (mlx4_ib_init_sriov(ibdev))
goto err_mad;
  
-	if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) {

+   if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE ||
+   dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
if (!iboe->nb.notifier_call) {
iboe->nb.notifier_call = mlx4_ib_netdev_event;
err = register_netdevice_notifier(&iboe->nb);
@@ -2490,6 +2496,13 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
goto err_notif;
}
}
+   if (!mlx4_is_slave(dev) &&
+   dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) {
+   err = mlx4_config_roce_v2_port(dev, ROCE_V2_UDP_DPORT);
+   if (err) {
+   goto err_notif;
+   }
+   }
}
  
  	for (j = 0; j < ARRAY_SIZE(mlx4_class_attributes); ++j) {

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 8d28059..c0dee79 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1508,6 +1508,24 @@ static int create_qp_lb_counter(struct mlx4_ib_dev *dev, 
struct mlx4_ib_qp *qp)
return 0;
  }
  
+enum {

+   MLX4_QPC_ROCE_MODE_1 = 0,
+   MLX4_QPC_ROCE_MODE_2 = 2,
+   MLX4_QPC_ROCE_MODE_MAX = 0xff
+};
+
+static u8 gid_type_to_qpc(enum ib_gid_type gid_type)
+{
+   switch (gid_type) {
+   case IB_GID_TYPE_ROCE:
+   return MLX4_QPC_ROCE_MODE_1;
+   case IB_GID_TYPE_ROCE_UDP_ENCAP:
+   return MLX4_QPC_ROCE_MODE_2;
+   default:
+   return MLX4_QPC_ROCE_MODE_MAX;
+   }
+}
+
  static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
   const struct ib_qp_attr *attr, int attr_mask,
   enum ib_qp_state cur_state, enum ib_qp_state 
new_state)
@@ -1651,9 +1669,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
u16 vlan = 0x;
u8 smac[ETH_ALEN];
int status = 0;
+   int is_eth = rdma_cap_eth_ah(&dev->ib_dev, port_num) &&
+   attr->ah_attr.ah_flags & IB_AH_GRH;
  
-		if (rdma_cap_eth_ah(&dev->ib_dev, port_nu

Re: [PATCH for-next 4/7] net/mlx4_core: Add handlning of RoCE v2 over IPV4 in attach_flow

2015-12-29 Thread Or Gerlitz

On 12/29/2015 3:24 PM, Matan Barak wrote:

From: Maor Gottlieb 

s/handlning/handling/



When attaching multicast for RoCE v2, we need to be able to steer
packets to the QPs. Hence, we add support for IPV4 over IB steering.


not sure to follow on the change-log, can you clarify it little further...



Signed-off-by: Maor Gottlieb 
---
  drivers/net/ethernet/mellanox/mlx4/mcg.c | 14 --
  include/linux/mlx4/device.h  |  6 ++
  2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mcg.c 
b/drivers/net/ethernet/mellanox/mlx4/mcg.c
index 1d4e2e0..834e60e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mcg.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c
@@ -858,7 +858,9 @@ static int parse_trans_rule(struct mlx4_dev *dev, struct 
mlx4_spec_list *spec,
break;
  
  	case MLX4_NET_TRANS_RULE_ID_IB:

-   rule_hw->ib.l3_qpn = spec->ib.l3_qpn;
+   rule_hw->ib.l3_qpn = spec->ib.l3_qpn |
+   (spec->ib.roce_type == MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4 
?
+(__force __be32)0x80 : (__force __be32)0);


maybe avoid using hard coded constants and get meaningful name for them?


rule_hw->ib.qpn_mask = spec->ib.qpn_msk;
memcpy(&rule_hw->ib.dst_gid, &spec->ib.dst_gid, 16);
memcpy(&rule_hw->ib.dst_gid_msk, &spec->ib.dst_gid_msk, 16);
@@ -1384,10 +1386,18 @@ int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, 
struct mlx4_qp *qp,
memcpy(spec.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
break;
  
+		case MLX4_PROT_IB_IPV4:

+   spec.id = MLX4_NET_TRANS_RULE_ID_IB;
+   memcpy(spec.ib.dst_gid + 12, gid + 12, 4);
+   memset(spec.ib.dst_gid_msk + 12, 0xff, 4);
+   spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4;
+   break;
+
case MLX4_PROT_IB_IPV6:
spec.id = MLX4_NET_TRANS_RULE_ID_IB;
memcpy(spec.ib.dst_gid, gid, 16);
-   memset(&spec.ib.dst_gid_msk, 0xff, 16);
+   memset(spec.ib.dst_gid_msk, 0xff, 16);
+   spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6;
break;
default:
return -EINVAL;
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 0d873f1ae..cdc75b2 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -391,6 +391,11 @@ enum mlx4_protocol {
MLX4_PROT_FCOE
  };
  
+enum mlx4_flow_roce_type {

+   MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6 = 0,
+   MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4
+};
+
  enum {
MLX4_MTT_FLAG_PRESENT   = 1
  };
@@ -1197,6 +1202,7 @@ struct mlx4_spec_ipv4 {
  struct mlx4_spec_ib {
__be32  l3_qpn;
__be32  qpn_msk;
+   enummlx4_flow_roce_type roce_type;
u8  dst_gid[16];
u8  dst_gid_msk[16];
  };


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] IB/core: sysfs.c: Fix PerfMgt ClassPortInfo handling

2015-12-29 Thread Or Gerlitz

On 12/29/2015 12:43 PM, Hal Rosenstock wrote:

Port number is not part of ClassPortInfo attribute but is
still needed as a parameter when invoking process_mad.


Please remove the blank line above your 1st sentence.



To properly handle this attribute, port_num is added as a
parameter to get_counter_table and get_perf_mad was changed
not to store port_num in the attribute itself when it's
querying the ClassPortInfo attribute.

This handles issue pointed out by Matan Barak 

Fixes: 145d9c541032 ('IB/core: Display extended counter set if available')

Signed-off-by: Hal Rosenstock 

again, remove the blank line after the fixes tag.

Also,  I am not that the way Doug is setting the branch for pull would 
preserve commit IDs when
the offending patch landed in Linus tree. If this is the case, we should 
put your patch in 2nd pull

request and have the right commit ID there. Please check with Doug.

Acked-by: Matan Barak 
Acked-by: Ira Weiny 
---
Change from v1:
Added fixes line to description


So this patch makes mlx4 IB driver on Eth ports workable with the 
4.5-rc1 proposed bits?


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next V2 00/32] Soft-RoCE driver

2015-12-24 Thread Or Gerlitz
On Thu, Dec 24, 2015 at 12:02 PM, Christoph Hellwig  wrote:
> On Thu, Dec 24, 2015 at 11:17:46AM +0200, Kamal Heib wrote:
>> We've located the driver in the staging subtree. This follows a requirement
>> to implement an IB transport library - Soft RoCE is in the same boat like 
>> the hfi1
>> driver. We need to define and implement a lib to prevent those code
>> duplications.
>
> Given the trainwreck that the staging process is it might seems more
> sensible to get it into a stage and then merge it directly.  You'll
> probably save yourself a lot of work that way.

I am not sure what you mean by "get it into a stage and then merge it
directly" --i
is that not go through staging at all?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table

2015-12-24 Thread Or Gerlitz

On 12/24/2015 12:42 PM, Sagi Grimberg wrote:



This patch seems to generate a list corruption [1] when I test
with Doug's for-4.5 tree. Eran, care to take a look at this?


This patch is part from a series that was introduced in 4.3-rc1 [1],


Then something else broke it. Can people check their patches on doug's
tree? At the moment it's unusable...


Yes, I checked the branch up to commit 882f3b3 "Merge branches 
'4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works 
(rping, ibv_rc_pingpong over top of mlx4 VPI)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 0/3] Add cross-channel support

2015-12-24 Thread Or Gerlitz

On 12/24/2015 12:00 PM, Christoph Hellwig wrote:

On Thu, Dec 24, 2015 at 10:02:29AM +0200, Or Gerlitz wrote:

We had consensus among the reviewers that the 1st patch ("IB/core: Align
coding style of ib_device_cap_flags structure") is wrong cleanup which
basically is (1) unneeded (2) creates more damage (git blame and such,
non-applicable to uapi, more) than benefit, etc -- finally Leon was
convinced too [1].

It's not really an issue vs uapi.  Using the the wierd BIT() macro
would have been, but without it I think this cleanup is ok, even if I
personally wouldn't have done it.  git-blame isn't really a major
issue either, as you can blame past revisions.


I would personally wouldn't done cleanup either and I managed to 
convinced Leon to drop it, so we had concensus among the developers, the 
maintainer didn't have other opinion and he took the wrong step -- so 
we're asking to fix, that's all.



Leon will re-spin in the coming 1-2 hours V2, could please pick it instead
of V1, when people agree on direction X and you are not against it, lets do
X and not Y.

It would be great if we could stop rebasing whats already in the tree
for the benefit of everyone building on top of this.  For example just
finished rebasing my series to move many constants includin this one
to the uapi headers, and I'd hate to rebase it once again now that
the dust has settled.


The root issue here is that nothing was picked before 4.4-rc6, so we're 
in a situation where rebases are needed in the own-maintainer tree 
(github) to make things right. No way to avoid that.


We should aim that for 4.6 and onward, code for -next will start getting 
in around rc1-2 and then things will be more robust, etc


Or.

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table

2015-12-24 Thread Or Gerlitz

On 12/24/2015 12:12 PM, Sagi Grimberg wrote:

This patch seems to generate a list corruption [1] when I test
with Doug's for-4.5 tree. Eran, care to take a look at this? 


This patch is part from a series that was introduced in 4.3-rc1 [1], did 
4.4-rc5/6 worked for you before you uploaded there further patches?


Or.

[1]
fbfb662 IB/mlx4: Add support for blocking multicast loopback QP creation 
user flag
7b59f0f IB/mlx4: Add counter based implementation for QP multicast 
loopback block

3ba8e31 IB/mlx4: Add IB counters table
74194fb net/mlx4_en: Implement mcast loopback prevention for ETH qps
9a89283 net/mlx4_core: Add support for filtering multicast loopback
ddf9529 IB/core: Allow setting create flags in QP init attribute
6d8a749 IB/core: Extend ib_uverbs_create_qp


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 08/16] i40iw: add files for iwarp interface

2015-12-24 Thread Or Gerlitz

On 12/24/2015 9:31 AM, Faisal Latif wrote:

On Wed, Dec 23, 2015 at 08:42:01AM -0800, Or Gerlitz wrote:

On 12/22/2015 1:13 AM, Faisal Latif wrote:

+
+enum i40iw_memreg_type {
+   IW_MEMREG_TYPE_MEM = 0x,
+   IW_MEMREG_TYPE_QP = 0x0001,
+   IW_MEMREG_TYPE_CQ = 0x0002,
+   IW_MEMREG_TYPE_MW = 0x0003,
+   IW_MEMREG_TYPE_FMR = 0x0004,
+   IW_MEMREG_TYPE_FMEM = 0x0005,
+};

Can't you re-use IB core values or derive that from the actual uverbs
command?

I did not see anything which will have types that I needed.


what do you need? what is the role of this enum?


It will be confusing otherwise.




I will be reducing number of types from here though.


so why some of it can go? is that deal values which aren't used by the code

Or.


Thanks
Faisal


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 15/16] i40iw: add entry in rdma_netlink

2015-12-24 Thread Or Gerlitz

On 12/24/2015 9:05 AM, Faisal Latif wrote:


>Why the iwarp port mapper implementationhas to be repeated in each
>driver? can you join your code in a common place and avoid the duplication?
>
>root@r-dcs58 hw]# git grep RDMA_NL_ nes
>nes/nes.c:  [RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
>nes/nes.c:  [RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
>nes/nes.c:  [RDMA_NL_IWPM_QUERY_MAPPING] = {.dump =
>iwpm_add_and_query_mapping_cb},
>nes/nes.c:  [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb},
>nes/nes.c:  [RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb},
>nes/nes.c:  [RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb},
>nes/nes.c:  [RDMA_NL_IWPM_MAPINFO_NUM] = {.dump =
>iwpm_ack_mapping_info_cb}
>nes/nes.c:  if (ibnl_add_client(RDMA_NL_NES, RDMA_NL_IWPM_NUM_OPS,
>nes_nl_cb_table))
>nes/nes.c:  ret = iwpm_init(RDMA_NL_NES);
>nes/nes.c:  ibnl_remove_client(RDMA_NL_NES);
>nes/nes.c:  ibnl_remove_client(RDMA_NL_NES);
>nes/nes.c:  iwpm_exit(RDMA_NL_NES);
>nes/nes_cm.c: &mapped_sockaddr, RDMA_NL_NES);
>nes/nes_cm.c:   return iwpm_remove_mapping(&local_sockaddr, RDMA_NL_NES);
>nes/nes_cm.c:   &remote_addr, RDMA_NL_NES);
>nes/nes_cm.c:   iwpm_err = iwpm_register_pid(&pm_reg_msg,
>RDMA_NL_NES);
>nes/nes_cm.c:   iwpm_err = iwpm_add_mapping(&pm_msg,
>RDMA_NL_NES);
>nes/nes_cm.c:   iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_NES);
>nes/nes_cm.c:   iwpm_err = iwpm_add_and_query_mapping(&pm_msg,
>RDMA_NL_NES);
>



i40iw iwarp driver registers with port mapper and uses its services. Beside 
that it is not the scope of the patch series.



You are asked a question by reviewer and your reply is

(1) YES, I did C & P from driver X to my driver

(2) explaining why C&P is right goes beyond the scope of reviewing my driver

This isn't how things work in upstream.

If needed, talk to the upstream Intel networking folks, they can assist 
you catching up on upstream practices.


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-24 Thread Or Gerlitz

On 12/24/2015 9:57 AM, Matan Barak wrote:

I totally agree that it's better to use the local IP address and not
just get a random device by using 127.0.0.1. You could get a specific
device by binding it, but then - use its local IP instead of
127.0.0.1.


Yes guys, it might be better but the user can do that. However, loopback 
ala 127.0.0.1 is working in the rdma-cm since the 2.6.18 day one and was 
broken recently when Eth ports are around.


Lets fix.

Indeed, this is 4.5 and not 4.4 material.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 0/3] Add cross-channel support

2015-12-24 Thread Or Gerlitz

On 12/24/2015 5:31 AM, Doug Ledford wrote:

On 12/20/2015 12:16 PM, Leon Romanovsky wrote:

Leon Romanovsky (3):
   IB/core: Align coding style of ib_device_cap_flags structure
   IB/core: Add cross-channel support
   IB/mlx5: Add driver cross-channel support

  drivers/infiniband/core/uverbs_cmd.c |  5 ++-
  drivers/infiniband/hw/mlx5/cq.c  |  7 +++-
  drivers/infiniband/hw/mlx5/main.c|  3 ++
  drivers/infiniband/hw/mlx5/mlx5_ib.h | 16 
  drivers/infiniband/hw/mlx5/qp.c  | 54 ++-
  include/linux/mlx5/qp.h  |  3 ++
  include/rdma/ib_verbs.h  | 71 +---
  7 files changed, 117 insertions(+), 42 deletions(-)



I took the series as is.  Please make sure to resubmit the libibverbs portion 
of these changes with the requested man page updates.


Doug,

Wait.

We had consensus among the reviewers that the 1st patch ("IB/core: Align 
coding style of ib_device_cap_flags structure") is wrong cleanup which 
basically is (1) unneeded (2) creates more damage (git blame and such,  
non-applicable to uapi, more) than benefit, etc -- finally Leon was 
convinced too [1].


Leon will re-spin in the coming 1-2 hours V2, could please pick it 
instead of V1, when people agree on direction X and you are not against 
it, lets do X and not Y.


thanks,

Or.

[1] http://marc.info/?t=14506106803&r=1&w=2


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/10] iSER support for remote invalidate

2015-12-23 Thread Or Gerlitz
On Mon, Dec 21, 2015 at 3:29 PM, Sagi Grimberg  wrote:

>> Sagi, are you going to spin an increment in the initiator version?

> I don't know if it's worth a driver version update?

yes, I think it does. Finally we're getting real close to the visions
of the spec authors...  do you have some performance data showing the
gain of remote invalidation on certain HWs and IO benchmarks patterns.
Did we make sure to interoperate between patched initiator to
unpatched LIO or other targets which don't have this now (SCST, TGT)
and the other way around (unpatched initiator, patched LIO)?

> In case we do increment, I can send an incremental patch instead of
> re-spinning the entire series.

lets do that.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next V2 0/5] User-space time-stamping support for mlx5_ib

2015-12-23 Thread Or Gerlitz
On Tue, Dec 15, 2015 at 8:30 PM, Matan Barak  wrote:

> This patch-set adds user-space support for time-stamping in mlx5_ib.
> It implements the necessary API:
> (a) ib_create_cq_ex - Add support for CQ creation flags
> (b) ib_query_device - return timestamp_mask and hca_core_clock.

> We also add support for mmaping the HCA's free running clock.
> In order to do so, we use the response of the vendor's extended
> part in init_ucontext. This allows us to pass the page offset
> of the free running clock register to the user-space driver.
> In order to implement it in a future extensible manner, we  use the
> same mechanism of verbs extensions to the mlx5 vendor part as well.

Hi Doug,

As Matan is now the maintainer of the mlx5_ib driver and this series
doesn't add any IB/core changes, we see no reason that it can't go
into 4.5. make sense?

Or


> Changes from v1:
>  * Change ib_is_udata_cleared to use memchr_inv.
>
> Changes from v0:
>  * Limit mmap PAGE_SIZE to 4K (security wise).
>  * Optimize ib_is_udata_cleared.
>  * Pass hca_core_clock_offset in the vendor's response part of init_ucontext.
>
> Matan Barak (5):
>   IB/mlx5: Add create_cq extended command
>   IB/core: Add ib_is_udata_cleared
>   IB/mlx5: Add support for hca_core_clock and timestamp_mask
>   IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext
>   IB/mlx5: Mmap the HCA's core clock register to user-space
>
>  drivers/infiniband/hw/mlx5/cq.c  |  7 
>  drivers/infiniband/hw/mlx5/main.c| 67 
> +++-
>  drivers/infiniband/hw/mlx5/mlx5_ib.h |  7 +++-
>  drivers/infiniband/hw/mlx5/user.h| 12 +--
>  include/linux/mlx5/device.h  |  7 ++--
>  include/linux/mlx5/mlx5_ifc.h|  9 +++--
>  include/rdma/ib_verbs.h  | 27 +++
>  7 files changed, 120 insertions(+), 16 deletions(-)
>
> --
> 2.1.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 00/16] add Intel(R) X722 iWARP driver

2015-12-23 Thread Or Gerlitz

On 12/22/2015 1:13 AM, Faisal Latif wrote:

This driver provides iWARP RDMA functionality for the Intel(R) X722 Ethernet
controller for PCI Physical Functions.


Is there any public info on the X722, I didn't manage to find such.


It also has support for Virtual Function driver (i40iwvf.ko), which that will 
be part of separate patch series.


can you explain why do you need a separate rdma driver for VFs?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 15/16] i40iw: add entry in rdma_netlink

2015-12-23 Thread Or Gerlitz

On 12/22/2015 1:13 AM, Faisal Latif wrote:

Add entry for port mapper services.

Signed-off-by: Faisal Latif 
---
  include/uapi/rdma/rdma_netlink.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h
index c19a5dc..4fa418d 100644
--- a/include/uapi/rdma/rdma_netlink.h
+++ b/include/uapi/rdma/rdma_netlink.h
@@ -8,6 +8,7 @@ enum {
RDMA_NL_NES,
RDMA_NL_C4IW,
RDMA_NL_LS, /* RDMA Local Services */
+   RDMA_NL_I40IW,
RDMA_NL_NUM_CLIENTS
  };
  
Do you use this value in down-stream patches of the series? if yes, 
change the order to have this and other IB core changes before the code 
that use that.


Why the iwarp port mapper implementationhas to be repeated in each 
driver? can you join your code in a common place and avoid the duplication?


root@r-dcs58 hw]# git grep RDMA_NL_ nes
nes/nes.c:  [RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
nes/nes.c:  [RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
nes/nes.c:  [RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = 
iwpm_add_and_query_mapping_cb},

nes/nes.c:  [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb},
nes/nes.c:  [RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb},
nes/nes.c:  [RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb},
nes/nes.c:  [RDMA_NL_IWPM_MAPINFO_NUM] = {.dump = 
iwpm_ack_mapping_info_cb}
nes/nes.c:  if (ibnl_add_client(RDMA_NL_NES, RDMA_NL_IWPM_NUM_OPS, 
nes_nl_cb_table))

nes/nes.c:  ret = iwpm_init(RDMA_NL_NES);
nes/nes.c:  ibnl_remove_client(RDMA_NL_NES);
nes/nes.c:  ibnl_remove_client(RDMA_NL_NES);
nes/nes.c:  iwpm_exit(RDMA_NL_NES);
nes/nes_cm.c: &mapped_sockaddr, RDMA_NL_NES);
nes/nes_cm.c:   return iwpm_remove_mapping(&local_sockaddr, RDMA_NL_NES);
nes/nes_cm.c:   &remote_addr, RDMA_NL_NES);
nes/nes_cm.c:   iwpm_err = iwpm_register_pid(&pm_reg_msg, 
RDMA_NL_NES);
nes/nes_cm.c:   iwpm_err = iwpm_add_mapping(&pm_msg, 
RDMA_NL_NES);

nes/nes_cm.c:   iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_NES);
nes/nes_cm.c:   iwpm_err = iwpm_add_and_query_mapping(&pm_msg, 
RDMA_NL_NES);


[root@r-dcs58 hw]# git grep RDMA_NL_ cxgb4/

cxgb4/cm.c: iwpm_remove_mapping(&ep->com.local_addr, RDMA_NL_C4IW);
cxgb4/cm.c: &child_ep->com.remote_addr, RDMA_NL_C4IW);
cxgb4/cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_C4IW);
cxgb4/cm.c: iwpm_err = iwpm_add_and_query_mapping(&pm_msg, 
RDMA_NL_C4IW);

cxgb4/cm.c: &ep->com.mapped_local_addr, RDMA_NL_C4IW)) {
cxgb4/cm.c: iwpm_remove_mapping(&ep->com.local_addr, RDMA_NL_C4IW);
cxgb4/cm.c: iwpm_err = iwpm_register_pid(&pm_reg_msg, RDMA_NL_C4IW);
cxgb4/cm.c: iwpm_err = iwpm_add_mapping(&pm_msg, RDMA_NL_C4IW);
cxgb4/cm.c: &ep->com.mapped_local_addr, RDMA_NL_C4IW)) {
cxgb4/device.c: [RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
cxgb4/device.c: [RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
cxgb4/device.c: [RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = 
iwpm_add_and_query_mapping_cb},

cxgb4/device.c: [RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb},
cxgb4/device.c: [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb},
cxgb4/device.c: [RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb},
cxgb4/device.c: [RDMA_NL_IWPM_MAPINFO_NUM] = {.dump = 
iwpm_ack_mapping_info_cb}

cxgb4/device.c: if (ibnl_add_client(RDMA_NL_C4IW, RDMA_NL_IWPM_NUM_OPS,
cxgb4/device.c: err = iwpm_init(RDMA_NL_C4IW);
cxgb4/device.c: ibnl_remove_client(RDMA_NL_C4IW);
cxgb4/device.c: iwpm_exit(RDMA_NL_C4IW);
cxgb4/device.c: ibnl_remove_client(RDMA_NL_C4IW);

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 00/16] add Intel(R) X722 iWARP driver

2015-12-23 Thread Or Gerlitz

On 12/23/2015 6:35 PM, Faisal Latif wrote:

I will provide new patch series for latest k.o. Also will make sure of shallow 
threading for the series.


Please make sure that the cover letter will include the full output of 
the git generated cover-letter so we can see the location of changes you 
did to the IB core and the overall LOC volume of the driver.


Did you run the driver through 0-day testing to avoid zillion small 
follow up patches in a later point when this isin?


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 08/16] i40iw: add files for iwarp interface

2015-12-23 Thread Or Gerlitz

On 12/22/2015 1:13 AM, Faisal Latif wrote:

+
+enum i40iw_memreg_type {
+   IW_MEMREG_TYPE_MEM = 0x,
+   IW_MEMREG_TYPE_QP = 0x0001,
+   IW_MEMREG_TYPE_CQ = 0x0002,
+   IW_MEMREG_TYPE_MW = 0x0003,
+   IW_MEMREG_TYPE_FMR = 0x0004,
+   IW_MEMREG_TYPE_FMEM = 0x0005,
+};


Can't you re-use IB core values or derive that from the actual uverbs 
command?


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next 1/2] MAINTAINERS: Assign new maintainers to Mellanox mlx5 core and IB drivers

2015-12-23 Thread Or Gerlitz
Matan and Leon step in as co-maintainers to replace Eli Cohen
who wrote and maintained the core and IB drivers.

Signed-off-by: Or Gerlitz 
---
 MAINTAINERS | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index c6b78b0..c7586a5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7057,27 +7057,26 @@ W:  http://linuxtv.org
 S: Odd Fixes
 F: drivers/media/radio/radio-miropcm20*
 
-Mellanox MLX5 core VPI driver
-M: Eli Cohen 
+MELLANOX MLX5 core VPI driver
+M: Matan Barak 
+M: Leon Romanovsky 
 L: net...@vger.kernel.org
 L: linux-rdma@vger.kernel.org
 W: http://www.mellanox.com
 Q: http://patchwork.ozlabs.org/project/netdev/list/
-Q: http://patchwork.kernel.org/project/linux-rdma/list/
-T: git git://openfabrics.org/~eli/connect-ib.git
 S: Supported
 F: drivers/net/ethernet/mellanox/mlx5/core/
 F: include/linux/mlx5/
 
-Mellanox MLX5 IB driver
-M: Eli Cohen 
+MELLANOX MLX5 IB driver
+M: Matan Barak 
+M: Leon Romanovsky 
 L: linux-rdma@vger.kernel.org
 W: http://www.mellanox.com
 Q: http://patchwork.kernel.org/project/linux-rdma/list/
-T: git git://openfabrics.org/~eli/connect-ib.git
 S: Supported
-F: include/linux/mlx5/
 F: drivers/infiniband/hw/mlx5/
+F: include/linux/mlx5/
 
 MELEXIS MLX90614 DRIVER
 M: Crt Mori 
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next 2/2] MAINTAINERS: Assign maintainer to Mellanox mlx4 core and IB drivers

2015-12-23 Thread Or Gerlitz
The driver was written originally by Roland Dreier, currently there's
no official maintainer. Yishai steps in as maintainer.

Signed-off-by: Or Gerlitz 
Cc: Roland Dreier 
---
 MAINTAINERS | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c7586a5..d9274ca 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7057,6 +7057,25 @@ W:   http://linuxtv.org
 S: Odd Fixes
 F: drivers/media/radio/radio-miropcm20*
 
+MELLANOX MLX4 core VPI driver
+M: Yishai Hadas 
+L: net...@vger.kernel.org
+L: linux-rdma@vger.kernel.org
+W: http://www.mellanox.com
+Q: http://patchwork.ozlabs.org/project/netdev/list/
+S: Supported
+F: drivers/net/ethernet/mellanox/mlx4/
+F: include/linux/mlx4/
+
+MELLANOX MLX4 IB driver
+M: Yishai Hadas 
+L: linux-rdma@vger.kernel.org
+W: http://www.mellanox.com
+Q: http://patchwork.kernel.org/project/linux-rdma/list/
+S: Supported
+F: drivers/infiniband/hw/mlx4/
+F: include/linux/mlx4/
+
 MELLANOX MLX5 core VPI driver
 M: Matan Barak 
 M: Leon Romanovsky 
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next 0/2] new maintainers for Mellanox mlx4/mlx5 core and IB drivers

2015-12-23 Thread Or Gerlitz
Hi Dave, Doug
 
We're happily assigning new maintainers for mlx4/mlx5 core and IB drivers.

This is aligned with Eli (mlx5) and Roland (mlx4).

FWIW, I did the whole change to go through netdev. 

Or.

Or Gerlitz (2):
  MAINTAINERS: Assign new maintainers to Mellanox mlx5 core and IB drivers
  MAINTAINERS: Assign maintainer to Mellanox mlx4 core and IB drivers

 MAINTAINERS | 32 +---
 1 file changed, 25 insertions(+), 7 deletions(-)

-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next V2 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext

2015-12-23 Thread Or Gerlitz

On 12/22/2015 11:37 PM, Or Gerlitz wrote:

On Tue, Dec 15, 2015 at 8:30 PM, Matan Barak  wrote:

>Pass hca_core_clock_offset to user-space is mandatory in order to
>let the user-space read the free-running clock register from the
>right offset in the memory mapped page.
>Passing this value is done by changing the vendor's command
>and response of init_ucontext to be in extensible form.

Is "old" (unpatched) libmlx5 still operational over "new" (patched)
mlx5 IB driver?


and same question for the other way around as well
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next V2 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext

2015-12-22 Thread Or Gerlitz
On Tue, Dec 15, 2015 at 8:30 PM, Matan Barak  wrote:
> Pass hca_core_clock_offset to user-space is mandatory in order to
> let the user-space read the free-running clock register from the
> right offset in the memory mapped page.
> Passing this value is done by changing the vendor's command
> and response of init_ucontext to be in extensible form.

Is "old" (unpatched) libmlx5 still operational over "new" (patched)
mlx5 IB driver?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-22 Thread Or Gerlitz

On 12/22/2015 9:17 AM, Or Gerlitz wrote:

On 12/21/2015 5:01 PM, Matan Barak wrote:
This patch fixes a bug in VPI systems, where the first port is 
configured
as IB and the second one is configured as Ethernet. In this case, if 
the rdma_id isn't bounded to a port, cma_match_net_dev will try to 
verify that the first port is a RoCE port and fail. This is fixed by 
passing the port of the incoming request.


OK -- we have another bug down there, cma loopback doesn't work, same 
reject reason (below).This happens in both VPI and non-VPI 
configurations.


Works well with 4.2.3


I made more checks with the 4.2.3 kernel (before all the IB core/cache 
changes that went in 4.3) -- rdma-cm loopback does work as long as there 
are active IB ports in the system. When there are no active IB ports, 
but rather only Eth ports (VPI or plain Eth), rping fails:


# rping -d -v -c -a 127.0.0.1 -C 1
created cm_id 0x6087d0
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent)
rdma_resolve_route: No such device


so it seems we had something there before the 4.3 changes.

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-22 Thread Or Gerlitz

On 12/21/2015 5:01 PM, Matan Barak wrote:

Previously, cma_match_net_dev called cma_protocol_roce which
tried to verify that the IB device uses RoCE protocol. However,
if rdma_id didn't have a bounded port, it used the first port
of the device.

In VPI systems, the first port might be an IB port while the second
one could be an Ethernet port. This made requests for unbounded rdma_ids
that come from the Ethernet port fail.
Fixing this by passing the port of the request and checking this port
of the device.

Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE')
Signed-off-by: Matan Barak


seems that the patch is missing from patchworks, I can't explain that.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup

2015-12-21 Thread Or Gerlitz
On Wed, Dec 16, 2015 at 7:53 AM, Or Gerlitz  wrote:
> On 12/15/2015 9:03 PM, Doug Ledford wrote:

>> Or, you specifically asked me to wait until this week.  I made my
>> initial impressions clear (I don't necessarily like the removal of the
>> attr struct, but I like the removal of all of the query calls, and I'm
>> inclined to take the patch in spite of not liking the removal of the
>> struct).  Do you have anything to add or have we beat this horse to death?

> Hi Doug,
> Lets stop beating, both horses and people.
> I do understand that
> 1. you don't link the removal of the attr
> 2. you do like the removal of all the query calls
>
> I am proposing to take the path of a patch that
> does exactly #2 while avoiding #1.

Doug,

Did you look on my v1 post and the related discussion there w.r.t udata?

You didn't make any comment on my response here nor on the proposed patches.

Since we are really short in time w.r.t EOY holidays and we have the
udata matter
open (see [1]), could we move finalizing this discussion to the 4.6 time-frame?

If you do have the time, I think it would be fair to see a response
from you on the
discussion before you pick any of the two patch sets - so??

Or.

[1] Christoph's patch doesn't remove the query_device callback from
mlx4 since we
report there values to libmlx4 through the udata mechanism. The
query_device callback
will need to be present in future/current drivers if they decide to
use udata as well


> What's wrong with that? I haven't heard any reasoning for why its
> so good to stash ~50 new fields on the IB device structure except
> for the author saying that other subsystems do that and other people
> saying they are in favor of this approach while not providing any
> reasoning, except for maybe something on bikes.
>
> Why you or anyone else has to be from now and ever the cache line police
> making sure that people don't add new attributes in random locations
> over the IB device structure?
>
> What's wrong with putting fifty attributesin a structure which is a field
> of the device struct and have people go there to see what are the d
> ifferentattrs and add news ones there?
>
> This will make the 4.5 merge window extremely complex or even totally
> threatened  w.r.t to the RDMA subsystem and related drivers by 3.3K LOC
> patch.
>
> Sorry, but, I still don't get it.
>
> Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-21 Thread Or Gerlitz

On 12/21/2015 5:01 PM, Matan Barak wrote:

This patch fixes a bug in VPI systems, where the first port is configured
as IB and the second one is configured as Ethernet. In this case, if the 
rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the 
first port is a RoCE port and fail. This is fixed by passing the port of the 
incoming request.


OK -- we have another bug down there, cma loopback doesn't work, same 
reject reason (below).This happens in both VPI and non-VPI configurations.


Works well with 4.2.3

Or.


$ rping -d -v -c -a 127.0.0.1 -C 1
verbose
client
count 1
created cm_id 0x6087d0
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x6087d0 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x60e5f0
created channel 0x608250
created cq 0x608a20
created qp 0x6082e0
rping_setup_buffers called on cb 0x606010
allocated & registered buffers...
cq_thread started.
wait for CONNECTED state 10
cma_event type RDMA_CM_EVENT_REJECTED cma_id 0x6087d0 (parent)
cma event RDMA_CM_EVENT_REJECTED, error 28
connect error -1
rping_free_buffers called on cb 0x606010
destroy cm_id 0x6087d0


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-21 Thread Or Gerlitz

On 12/21/2015 5:01 PM, Matan Barak wrote:

Previously, cma_match_net_dev called cma_protocol_roce which
tried to verify that the IB device uses RoCE protocol. However,
if rdma_id didn't have a bounded port, it used the first port
of the device.


maybe prefer a higher then code speak language e.g "if the rdma id 
didn't have"  also below "unbounded rdma ids"




In VPI systems, the first port might be an IB port while the second
one could be an Ethernet port. This made requests for unbounded rdma_ids
that come from the Ethernet port fail.


add "to" --> "Ethernet port to fail"


Fixing this by passing the port of the request and checking this port
of the device.


OK, so this fix will work for both ib/eth and eth/ib configs, right? good.




Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE')


Reported-by: Or Gerlitz 



Signed-off-by: Matan Barak 


Doug, the bug fixes a commit from from 4.3, lets fix it in 4.4 and later 
we will send it to -stable as well. So for 4.4 there's this one and the 
kvfree fix [1]


Or.

[1] https://patchwork.kernel.org/patch/7868481/



---
Hi Doug,

This patch fixes a bug in VPI systems, where the first port is configured
as IB and the second one is configured as Ethernet.
In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev
will try to verify that the first port is a RoCE port and fail.
This is fixed by passing the port of the incoming request.

Regards,
Matan

  drivers/infiniband/core/cma.c |   16 +---
  1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d2d5d00..c8a265c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1265,15 +1265,17 @@ static bool cma_protocol_roce(const struct rdma_cm_id 
*id)
return cma_protocol_roce_dev_port(device, port_num);
  }
  
-static bool cma_match_net_dev(const struct rdma_id_private *id_priv,

- const struct net_device *net_dev)
+static bool cma_match_net_dev(const struct rdma_cm_id *id,
+ const struct net_device *net_dev,
+ u8 port_num)
  {
-   const struct rdma_addr *addr = &id_priv->id.route.addr;
+   const struct rdma_addr *addr = &id->route.addr;
  
  	if (!net_dev)

/* This request is an AF_IB request or a RoCE request */
-   return addr->src_addr.ss_family == AF_IB ||
-  cma_protocol_roce(&id_priv->id);
+   return (!id->port_num || id->port_num == port_num) &&
+  (addr->src_addr.ss_family == AF_IB ||
+   cma_protocol_roce_dev_port(id->device, port_num));
  
  	return !addr->dev_addr.bound_dev_if ||

   (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
@@ -1295,13 +1297,13 @@ static struct rdma_id_private *cma_find_listener(
hlist_for_each_entry(id_priv, &bind_list->owners, node) {
if (cma_match_private_data(id_priv, ib_event->private_data)) {
if (id_priv->id.device == cm_id->device &&
-   cma_match_net_dev(id_priv, net_dev))
+   cma_match_net_dev(&id_priv->id, net_dev, req->port))
return id_priv;
list_for_each_entry(id_priv_dev,
&id_priv->listen_list,
listen_list) {
if (id_priv_dev->id.device == cm_id->device &&
-   cma_match_net_dev(id_priv_dev, net_dev))
+   cma_match_net_dev(&id_priv_dev->id, net_dev, 
req->port))
return id_priv_dev;
}
}


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/10] iSER support for remote invalidate

2015-12-21 Thread Or Gerlitz
On Mon, Dec 21, 2015 at 6:20 AM, Nicholas A. Bellinger
 wrote:

> Applied to target-pending/for-next as v4.5-rc1 material, along with
> Reviewed-by tags from HCH.

Hi Nic, thanks for stepping in and picking that.

Sagi, are you going to spin an increment in the initiator version?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-20 Thread Or Gerlitz
On Mon, Dec 21, 2015 at 9:27 AM, Leon Romanovsky  wrote:
> On Mon, Dec 21, 2015 at 8:52 AM, Or Gerlitz  wrote:
>> On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky  wrote:
>>> On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny  wrote:
>>>> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote:
>>>>> From: Leon Romanovsky 
>>
>>>>> Modify enum ib_device_cap_flags such that other patches which add new
>>>>> enum values pass strict checkpatch.pl checks.
>>
>>>>> - IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
>>>>> - IB_DEVICE_MEM_WINDOW= (1<<17),
>>>>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
>>
>>> 2. Change the whole file => the work with "git blame" will be less
>>> straightforward.
>>
>> Agree.
>>
>> Leon, I don't think we need to take checkpatch-ing of things to that level.
>>
>> Indeed, we should make sure that whole new enums and such are done right --
>> but avoid touching core structs/enums in a manner that disallows
>> simple git blaming of things, which is very useful for new comers and
>> old suffers.

> There are no doubts that standalone fixing checkpatch errors is more
> suitable to staging subsystem.

Agree

> In our case, it is part of coming changes in that structure. such
> change serves specific goal to minimize the possibility of error
> by seeing clean output from static analyser tool.

Disagree.

What future bugs are you envisioning by let this 10y old header file
keep having some checkpatch issues on few of the major enums?!

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-20 Thread Or Gerlitz
On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky  wrote:
> On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny  wrote:
>> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote:
>>> From: Leon Romanovsky 

>>> Modify enum ib_device_cap_flags such that other patches which add new
>>> enum values pass strict checkpatch.pl checks.

>>> - IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
>>> - IB_DEVICE_MEM_WINDOW= (1<<17),
>>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0),

> 2. Change the whole file => the work with "git blame" will be less
> straightforward.

Agree.

Leon, I don't think we need to take checkpatch-ing of things to that level.

Indeed, we should make sure that whole new enums and such are done right --
but avoid touching core structs/enums in a manner that disallows
simple git blaming of things, which is very useful for new comers and
old suffers.

For example, the networking community keeps adding IFF_YYY values to
include/linux/netdevice.h :: enum netdev_priv_flags without fixing
checkpatch complaint on missing spaces before/after the << shift sign.

> I will do the change across whole file, If Doug accepts such change.

Please don't... simple git blame is very powerful tool for kernel
developers everyday work.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure

2015-12-20 Thread Or Gerlitz

On 12/18/2015 6:49 PM, Jason Gunthorpe wrote:

On Fri, Dec 18, 2015 at 10:08:41AM +0200, Or Gerlitz wrote:

On 12/17/2015 7:41 PM, Jason Gunthorpe wrote:

On Thu, Dec 17, 2015 at 03:44:19PM +0200, Sagi Grimberg wrote:

+   ret = ib_query_device(device, &device->attrs);
+   if (ret) {
+   printk(KERN_WARNING "Couldn't query the device attributes\n");
+   goto out;
+   }
+

I thought we're all for removing the call altogether aren't we?

I'd say just call device->query_device() instead.

Christoph's patch even got rid of device->query_device(),

Wrong.

Not really, lots of hunks in Christoph's patch are removing query_device ie:

@@ -1305,7 +1299,6 @@  int mthca_register_device(struct mthca_dev *dev)
dev->ib_dev.phys_port_cnt= dev->limits.num_ports;
dev->ib_dev.num_comp_vectors = 1;
dev->ib_dev.dma_device   = &dev->pdev->dev;
-   dev->ib_dev.query_device = mthca_query_device;
dev->ib_dev.query_port   = mthca_query_port;


I know, I wanted to stress the point that under the udata architecture 
the query_device callbacked can't be just deleted since drivers use that 
to get/provide data to/from their user-space libraries.



Sure, it sticks around in a couple places but it isn't 'query_device'
anymore, it is 'query_device_udata' which is reasonable.



I see what you mean. But this involved somehow deeper patching to mlx4 
which should be applied later on to any ll driver that would like to 
expose udata for their query device entry (and return back that entry as 
well after we delete it now in Christoph's patch).


Or.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next V1 3/7] IB/ulps: Avoid calling ib_query_device

2015-12-20 Thread Or Gerlitz

On 12/18/2015 4:10 PM, Bart Van Assche wrote:

Shouldn't this patch be split per ULP driver to make review easier ?


The patch is very simple and does the same practice in all ULPs. If it 
helps I can spit off the srp initiator and target bits into a different 
patch, let me know.




If you have a look at the MAINTAINERS file then you will see that you 
forgot to CC the SRP initiator maintainer. 


yes, sorry for that, will copy you on further versions if there are such.

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH libibverbs 2/3] libibverbs: Add cross-channel QP initialization flags

2015-12-20 Thread Or Gerlitz

On 12/20/2015 1:22 PM, Leon Romanovsky wrote:

---
  include/infiniband/verbs.h | 3 +++
  src/cmd.c  | 5 -
  2 files changed, 7 insertions(+), 1 deletion(-)


You need to document the new creation flags in a man page. If this 
doesn't exist, this should be added to Eran's submission and you will 
enhance that.


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/10] IB: remove the struct ib_phys_buf definition

2015-12-20 Thread Or Gerlitz

On 12/20/2015 5:41 PM, Christoph Hellwig wrote:


I can't see a useful explanation for removing an unused structure.  But if you
really want one and come up with a coherent sentence or two I can add it.


patch title - IB/core: Remove struct ib_phys_buf

change log:

Remove struct ib_phys_buf since YYY

where you need to fill few words for YYY, e.g not used anywhere ever, 
not used after you did

cleanup ZZZ, something that can serve someone reading your patch
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr

2015-12-20 Thread Or Gerlitz

On 12/20/2015 11:31 AM, Sagi Grimberg wrote:

However, I think that the patch from Shani 6b52a12bc3fc ("IB/uverbs:
Implement memory windows support in uverbs") is wrong. A memory
window allocation should really reference the MR and not the PD (which
is referenced by the MR). Otherwise the MR deregistration is allowed
with windows bound to it. If this is the case then this field is not
useless and we should fix ib_uverbs_alloc_mw()?


I tend to agree, lets not remove the field till this is clarified with 
Haggai and Co




Haggai, can you shed some light here? It's Shani's and your code. 


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr

2015-12-20 Thread Or Gerlitz

On 12/20/2015 11:31 AM, Sagi Grimberg wrote:

However, I think that the patch from Shani 6b52a12bc3fc ("IB/uverbs:
Implement memory windows support in uverbs") is wrong. A memory
window allocation should really reference the MR and not the PD (which
is referenced by the MR). Otherwise the MR deregistration is allowed
with windows bound to it. If this is the case then this field is not
useless and we should fix ib_uverbs_alloc_mw()?


I tend to agree, lets not remove the field this is clarified with Haggai 
and Co




Haggai, can you shed some light here? It's Shani's and your code. 


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr

2015-12-19 Thread Or Gerlitz

On 12/20/2015 9:25 AM, Or Gerlitz wrote:

On 12/18/2015 3:55 PM, Christoph Hellwig wrote:

  diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 284916d..e45776e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1306,7 +1306,6 @@ struct ib_mr {
  u64   iova;
  u32   length;
  unsigned int   page_size;
-atomic_t   usecnt; /* count number of MWs */
  };


This comment is part of Roland's uverbs commit. 


I saw now that you removed in-kernel support for MWs as a downstream patch
of this series. I guess this cleanup needs to go there as the refcount 
field of

kernel MRs relates to MWs. This will also help someone coming in the future
and returning the in-kernel MW support to avoid forgetting on doing the 
refs.


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/10] IB: remove the struct ib_phys_buf definition

2015-12-19 Thread Or Gerlitz

On 12/18/2015 3:55 PM, Christoph Hellwig wrote:


Signed-off-by: Christoph Hellwig 
Reviewed-by: Sagi Grimberg 
Reviewed-by: Jason Gunthorpe  [core]
Reviewed-by: Steve Wise 


Here, too, please avoid empty change logs to IB core patches.

Or.

---
  include/rdma/ib_verbs.h | 5 -
  1 file changed, 5 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index ea093ee..284916d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1143,11 +1143,6 @@ enum ib_access_flags {
IB_ACCESS_ON_DEMAND = (1<<6),
  };
  
-struct ib_phys_buf {

-   u64  addr;
-   u64  size;
-};
-
  /*
   * XXX: these are apparently used for ->rereg_user_mr, no idea why they
   * are hidden here instead of a uapi header!


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr

2015-12-19 Thread Or Gerlitz

On 12/18/2015 4:14 PM, Bart Van Assche wrote:

On 12/18/2015 02:55 PM, Christoph Hellwig wrote:

Signed-off-by: Christoph Hellwig 


Shouldn't the description of this patch be changed into something like 
"Remove the usecnt field from ib_mr since it is always zero" ?



Agree.

I would like us to avoid empty changes log for IB core patches and 
actually all over the rdma subsystem.


Or.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RoCE passive side failures on 4.4-rc5

2015-12-19 Thread Or Gerlitz

On 12/17/2015 3:58 PM, Or Gerlitz wrote:

Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today,
I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong 
works).


I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the 
one with the 4.4-rc5 kernel can act as both (rping) client/server for 
IB links but only (rping) client for RoCE.


I tried both inter-node and loopback runs, in all cases, the client 
side getsCM
reject with reason 28, see [2], tried both iser and rping. Eth (ICMP, 
TCP) works OK.


OK, small progress, when the force Eth link type on my IB port (using 
mlx4 sysfs), things work.


You should be able to reproduce it on your non-VPI systems the other way 
around, by

forcing IB link type on one of the Eth ports and see the failure.

I Saw the same behavior with both 4.4-rc2 and 4.4-rc5

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr

2015-12-19 Thread Or Gerlitz

On 12/18/2015 3:55 PM, Christoph Hellwig wrote:
  
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h

index 284916d..e45776e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1306,7 +1306,6 @@ struct ib_mr {
u64iova;
u32length;
unsigned int   page_size;
-   atomic_t   usecnt; /* count number of MWs */
  };


This comment is part of Roland's uverbs commit.

I wonder if LL driver supporting the  IB_WR_BIND_MW op
ref the MR on port send and deref it on completion?

Or.


  
  struct ib_mw {

-- 1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure

2015-12-18 Thread Or Gerlitz
On Fri, Dec 18, 2015 at 7:16 AM, ira.weiny  wrote:

> More than anything what I hate is all the places that allocate struct
> ib_device_attr just to free it after the query call.

did you care to look on patches 2-6, this is exactly what they are doing.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next V1 6/7] staging/o2iblnd: Avoid calling ib_query_device

2015-12-18 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 7c730e3..7231ef8 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -2070,32 +2070,13 @@ static int kiblnd_net_init_pools(kib_net_t *net, __u32 
*cpts, int ncpts)
 
 static int kiblnd_hdev_get_attr(kib_hca_dev_t *hdev)
 {
-   struct ib_device_attr *attr;
-   int rc;
-
/* It's safe to assume a HCA can handle a page size
 * matching that of the native system */
hdev->ibh_page_shift = PAGE_SHIFT;
hdev->ibh_page_size  = 1 << PAGE_SHIFT;
hdev->ibh_page_mask  = ~((__u64)hdev->ibh_page_size - 1);
 
-   LIBCFS_ALLOC(attr, sizeof(*attr));
-   if (attr == NULL) {
-   CERROR("Out of memory\n");
-   return -ENOMEM;
-   }
-
-   rc = ib_query_device(hdev->ibh_ibdev, attr);
-   if (rc == 0)
-   hdev->ibh_mr_size = attr->max_mr_size;
-
-   LIBCFS_FREE(attr, sizeof(*attr));
-
-   if (rc != 0) {
-   CERROR("Failed to query IB device: %d\n", rc);
-   return rc;
-   }
-
+   hdev->ibh_mr_size = hdev->ibh_ibdev->attrs.max_mr_size;
if (hdev->ibh_mr_size == ~0ULL) {
hdev->ibh_mr_shift = 64;
return 0;
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next V1 7/7] IB/core: Remove ib_query_device

2015-12-18 Thread Or Gerlitz
The copy of the attributes present on the device is now used by all consumers
expect for uverbs in case of serving user-space query, where dev->query_device
is called.

Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/device.c | 19 ---
 include/rdma/ib_verbs.h  |  3 ---
 2 files changed, 22 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 568592e3..6def2f7 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -636,25 +636,6 @@ void ib_dispatch_event(struct ib_event *event)
 EXPORT_SYMBOL(ib_dispatch_event);
 
 /**
- * ib_query_device - Query IB device attributes
- * @device:Device to query
- * @device_attr:Device attributes
- *
- * ib_query_device() returns the attributes of a device through the
- * @device_attr pointer.
- */
-int ib_query_device(struct ib_device *device,
-   struct ib_device_attr *device_attr)
-{
-   struct ib_udata uhw = {.outlen = 0, .inlen = 0};
-
-   memset(device_attr, 0, sizeof(*device_attr));
-
-   return device->query_device(device, device_attr, &uhw);
-}
-EXPORT_SYMBOL(ib_query_device);
-
-/**
  * ib_query_port - Query IB port attributes
  * @device:Device to query
  * @port_num:Port number to query
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 730dcfb..0567866 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1913,9 +1913,6 @@ int ib_register_event_handler  (struct ib_event_handler 
*event_handler);
 int ib_unregister_event_handler(struct ib_event_handler *event_handler);
 void ib_dispatch_event(struct ib_event *event);
 
-int ib_query_device(struct ib_device *device,
-   struct ib_device_attr *device_attr);
-
 int ib_query_port(struct ib_device *device,
  u8 port_num, struct ib_port_attr *port_attr);
 
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next V1 3/7] IB/ulps: Avoid calling ib_query_device

2015-12-18 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c  | 19 ---
 drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++--
 drivers/infiniband/ulp/ipoib/ipoib_main.c| 21 +
 drivers/infiniband/ulp/iser/iscsi_iser.c |  4 +--
 drivers/infiniband/ulp/iser/iscsi_iser.h |  2 --
 drivers/infiniband/ulp/iser/iser_memory.c|  9 +++---
 drivers/infiniband/ulp/iser/iser_verbs.c | 38 ++
 drivers/infiniband/ulp/isert/ib_isert.c  | 47 +---
 drivers/infiniband/ulp/isert/ib_isert.h  |  1 -
 drivers/infiniband/ulp/srp/ib_srp.c  | 32 ++-
 drivers/infiniband/ulp/srpt/ib_srpt.c| 15 -
 drivers/infiniband/ulp/srpt/ib_srpt.h|  3 --
 12 files changed, 63 insertions(+), 142 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 3ae9726..94d144d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1522,8 +1522,7 @@ static void ipoib_cm_create_srq(struct net_device *dev, 
int max_sge)
 int ipoib_cm_dev_init(struct net_device *dev)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
-   int i, ret;
-   struct ib_device_attr attr;
+   int max_srq_sge, i;
 
INIT_LIST_HEAD(&priv->cm.passive_ids);
INIT_LIST_HEAD(&priv->cm.reap_list);
@@ -1540,19 +1539,13 @@ int ipoib_cm_dev_init(struct net_device *dev)
 
skb_queue_head_init(&priv->cm.skb_queue);
 
-   ret = ib_query_device(priv->ca, &attr);
-   if (ret) {
-   printk(KERN_WARNING "ib_query_device() failed with %d\n", ret);
-   return ret;
-   }
-
-   ipoib_dbg(priv, "max_srq_sge=%d\n", attr.max_srq_sge);
+   ipoib_dbg(priv, "max_srq_sge=%d\n", priv->ca->attrs.max_srq_sge);
 
-   attr.max_srq_sge = min_t(int, IPOIB_CM_RX_SG, attr.max_srq_sge);
-   ipoib_cm_create_srq(dev, attr.max_srq_sge);
+   max_srq_sge = min_t(int, IPOIB_CM_RX_SG, priv->ca->attrs.max_srq_sge);
+   ipoib_cm_create_srq(dev, max_srq_sge);
if (ipoib_cm_has_srq(dev)) {
-   priv->cm.max_cm_mtu = attr.max_srq_sge * PAGE_SIZE - 0x10;
-   priv->cm.num_frags  = attr.max_srq_sge;
+   priv->cm.max_cm_mtu = max_srq_sge * PAGE_SIZE - 0x10;
+   priv->cm.num_frags  = max_srq_sge;
ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_frags=%d\n",
  priv->cm.max_cm_mtu, priv->cm.num_frags);
} else {
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
index 078cadd..a53fa5f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
@@ -40,15 +40,11 @@ static void ipoib_get_drvinfo(struct net_device *netdev,
  struct ethtool_drvinfo *drvinfo)
 {
struct ipoib_dev_priv *priv = netdev_priv(netdev);
-   struct ib_device_attr *attr;
-
-   attr = kmalloc(sizeof(*attr), GFP_KERNEL);
-   if (attr && !ib_query_device(priv->ca, attr))
-   snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
-"%d.%d.%d", (int)(attr->fw_ver >> 32),
-(int)(attr->fw_ver >> 16) & 0x,
-(int)attr->fw_ver & 0x);
-   kfree(attr);
+
+   snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+"%d.%d.%d", (int)(priv->ca->attrs.fw_ver >> 32),
+(int)(priv->ca->attrs.fw_ver >> 16) & 0x,
+(int)priv->ca->attrs.fw_ver & 0x);
 
strlcpy(drvinfo->bus_info, dev_name(priv->ca->dma_device),
sizeof(drvinfo->bus_info));
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7d32818..58732c5 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1777,26 +1777,7 @@ int ipoib_add_pkey_attr(struct net_device *dev)
 
 int ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca)
 {
-   struct ib_device_attr *device_attr;
-   int result = -ENOMEM;
-
-   device_attr = kmalloc(sizeof *device_attr, GFP_KERNEL);
-   if (!device_attr) {
-   printk(KERN_WARNING "%s: allocation of %zu bytes failed\n",
-  hca->name, sizeof *device_attr);
-   return result;
-   }
-
-   result = ib_query_device(hca, device_attr);
-   if (result) {
-   printk(KERN_WARNING "%s: ib_query_device failed (ret = %d)\

[PATCH rdma-next V1 1/7] IB/core: Save the device attributes on the device structure

2015-12-18 Thread Or Gerlitz
From: Ira Weiny 

This way both the IB core and upper level drivers can access these cached
device attributes rather than querying or caching them on their own.

Signed-off-by: Ira Weiny 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/device.c | 8 
 include/rdma/ib_verbs.h  | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 179e813..568592e3 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -325,6 +325,7 @@ int ib_register_device(struct ib_device *device,
 {
int ret;
struct ib_client *client;
+   struct ib_udata uhw = {.outlen = 0, .inlen = 0};
 
mutex_lock(&device_mutex);
 
@@ -352,6 +353,13 @@ int ib_register_device(struct ib_device *device,
goto out;
}
 
+   memset(&device->attrs, 0, sizeof(device->attrs));
+   ret = device->query_device(device, &device->attrs, &uhw);
+   if (ret) {
+   printk(KERN_WARNING "Couldn't query the device attributes\n");
+   goto out;
+   }
+
ret = ib_device_register_sysfs(device, port_callback);
if (ret) {
printk(KERN_WARNING "Couldn't register device %s with driver 
model\n",
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 120da1d..730dcfb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1823,6 +1823,7 @@ struct ib_device {
u16  is_switch:1;
u8   node_type;
u8   phys_port_cnt;
+   struct ib_device_attrattrs;
 
/**
 * The following mandatory functions are used only at device
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next V1 5/7] xprtrdma: Avoid calling ib_query_device

2015-12-18 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz 
---
 net/sunrpc/xprtrdma/frwr_ops.c   |  7 ++---
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 48 +---
 net/sunrpc/xprtrdma/verbs.c  | 24 ++--
 net/sunrpc/xprtrdma/xprt_rdma.h  |  1 -
 4 files changed, 30 insertions(+), 50 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 88cf9e7..8d4864f 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -190,12 +190,11 @@ static int
 frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep,
 struct rpcrdma_create_data_internal *cdata)
 {
-   struct ib_device_attr *devattr = &ia->ri_devattr;
int depth, delta;
 
ia->ri_max_frmr_depth =
min_t(unsigned int, RPCRDMA_MAX_DATA_SEGS,
- devattr->max_fast_reg_page_list_len);
+ ia->ri_device->attrs.max_fast_reg_page_list_len);
dprintk("RPC:   %s: device's max FR page list len = %u\n",
__func__, ia->ri_max_frmr_depth);
 
@@ -222,8 +221,8 @@ frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep,
}
 
ep->rep_attr.cap.max_send_wr *= depth;
-   if (ep->rep_attr.cap.max_send_wr > devattr->max_qp_wr) {
-   cdata->max_requests = devattr->max_qp_wr / depth;
+   if (ep->rep_attr.cap.max_send_wr > ia->ri_device->attrs.max_qp_wr) {
+   cdata->max_requests = ia->ri_device->attrs.max_qp_wr / depth;
if (!cdata->max_requests)
return -EINVAL;
ep->rep_attr.cap.max_send_wr = cdata->max_requests *
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c 
b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index b348b4a..4d695af 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -886,10 +886,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
struct rdma_conn_param conn_param;
struct ib_cq_init_attr cq_attr = {};
struct ib_qp_init_attr qp_attr;
-   struct ib_device_attr devattr;
+   struct ib_device *dev;
int uninitialized_var(dma_mr_acc);
int need_dma_mr = 0;
-   int ret;
+   int ret = 0;
int i;
 
listen_rdma = container_of(xprt, struct svcxprt_rdma, sc_xprt);
@@ -910,20 +910,15 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
dprintk("svcrdma: newxprt from accept queue = %p, cm_id=%p\n",
newxprt, newxprt->sc_cm_id);
 
-   ret = ib_query_device(newxprt->sc_cm_id->device, &devattr);
-   if (ret) {
-   dprintk("svcrdma: could not query device attributes on "
-   "device %p, rc=%d\n", newxprt->sc_cm_id->device, ret);
-   goto errout;
-   }
+   dev = newxprt->sc_cm_id->device;
 
/* Qualify the transport resource defaults with the
 * capabilities of this particular device */
-   newxprt->sc_max_sge = min((size_t)devattr.max_sge,
+   newxprt->sc_max_sge = min((size_t)dev->attrs.max_sge,
  (size_t)RPCSVC_MAXPAGES);
-   newxprt->sc_max_sge_rd = min_t(size_t, devattr.max_sge_rd,
+   newxprt->sc_max_sge_rd = min_t(size_t, dev->attrs.max_sge_rd,
   RPCSVC_MAXPAGES);
-   newxprt->sc_max_requests = min((size_t)devattr.max_qp_wr,
+   newxprt->sc_max_requests = min((size_t)dev->attrs.max_qp_wr,
   (size_t)svcrdma_max_requests);
newxprt->sc_sq_depth = RPCRDMA_SQ_DEPTH_MULT * newxprt->sc_max_requests;
 
@@ -931,16 +926,16 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
 * Limit ORD based on client limit, local device limit, and
 * configured svcrdma limit.
 */
-   newxprt->sc_ord = min_t(size_t, devattr.max_qp_rd_atom, 
newxprt->sc_ord);
+   newxprt->sc_ord = min_t(size_t, dev->attrs.max_qp_rd_atom, 
newxprt->sc_ord);
newxprt->sc_ord = min_t(size_t, svcrdma_ord, newxprt->sc_ord);
 
-   newxprt->sc_pd = ib_alloc_pd(newxprt->sc_cm_id->device);
+   newxprt->sc_pd = ib_alloc_pd(dev);
if (IS_ERR(newxprt->sc_pd)) {
dprintk("svcrdma: error creating PD for connect request\n");
goto errout;
}
cq_attr.cqe = newxprt->sc_sq_depth;
-   newxprt->sc_sq_cq = ib_create_cq(newxprt->sc_cm_id->device,
+   newxprt->sc_sq_cq = ib_create_cq(dev,
 sq_comp_handler,
 cq_event_handler,

[PATCH rdma-next V1 4/7] net/rds: Avoid calling ib_query_device

2015-12-18 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz 
---
 net/rds/ib.c | 34 +++---
 net/rds/iw.c | 23 +--
 2 files changed, 16 insertions(+), 41 deletions(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index f222885..9481d55 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -122,44 +122,34 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
 static void rds_ib_add_one(struct ib_device *device)
 {
struct rds_ib_device *rds_ibdev;
-   struct ib_device_attr *dev_attr;
 
/* Only handle IB (no iWARP) devices */
if (device->node_type != RDMA_NODE_IB_CA)
return;
 
-   dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
-   if (!dev_attr)
-   return;
-
-   if (ib_query_device(device, dev_attr)) {
-   rdsdebug("Query device failed for %s\n", device->name);
-   goto free_attr;
-   }
-
rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL,
 ibdev_to_node(device));
if (!rds_ibdev)
-   goto free_attr;
+   return;
 
spin_lock_init(&rds_ibdev->spinlock);
atomic_set(&rds_ibdev->refcount, 1);
INIT_WORK(&rds_ibdev->free_work, rds_ib_dev_free);
 
-   rds_ibdev->max_wrs = dev_attr->max_qp_wr;
-   rds_ibdev->max_sge = min(dev_attr->max_sge, RDS_IB_MAX_SGE);
+   rds_ibdev->max_wrs = device->attrs.max_qp_wr;
+   rds_ibdev->max_sge = min(device->attrs.max_sge, RDS_IB_MAX_SGE);
 
-   rds_ibdev->fmr_max_remaps = dev_attr->max_map_per_fmr?: 32;
-   rds_ibdev->max_1m_fmrs = dev_attr->max_mr ?
-   min_t(unsigned int, (dev_attr->max_mr / 2),
+   rds_ibdev->fmr_max_remaps = device->attrs.max_map_per_fmr?: 32;
+   rds_ibdev->max_1m_fmrs = device->attrs.max_mr ?
+   min_t(unsigned int, (device->attrs.max_mr / 2),
  rds_ib_fmr_1m_pool_size) : rds_ib_fmr_1m_pool_size;
 
-   rds_ibdev->max_8k_fmrs = dev_attr->max_mr ?
-   min_t(unsigned int, ((dev_attr->max_mr / 2) * RDS_MR_8K_SCALE),
+   rds_ibdev->max_8k_fmrs = device->attrs.max_mr ?
+   min_t(unsigned int, ((device->attrs.max_mr / 2) * 
RDS_MR_8K_SCALE),
  rds_ib_fmr_8k_pool_size) : rds_ib_fmr_8k_pool_size;
 
-   rds_ibdev->max_initiator_depth = dev_attr->max_qp_init_rd_atom;
-   rds_ibdev->max_responder_resources = dev_attr->max_qp_rd_atom;
+   rds_ibdev->max_initiator_depth = device->attrs.max_qp_init_rd_atom;
+   rds_ibdev->max_responder_resources = device->attrs.max_qp_rd_atom;
 
rds_ibdev->dev = device;
rds_ibdev->pd = ib_alloc_pd(device);
@@ -183,7 +173,7 @@ static void rds_ib_add_one(struct ib_device *device)
}
 
rdsdebug("RDS/IB: max_mr = %d, max_wrs = %d, max_sge = %d, 
fmr_max_remaps = %d, max_1m_fmrs = %d, max_8k_fmrs = %d\n",
-dev_attr->max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge,
+device->attrs.max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge,
 rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_fmrs,
 rds_ibdev->max_8k_fmrs);
 
@@ -202,8 +192,6 @@ static void rds_ib_add_one(struct ib_device *device)
 
 put_dev:
rds_ib_dev_put(rds_ibdev);
-free_attr:
-   kfree(dev_attr);
 }
 
 /*
diff --git a/net/rds/iw.c b/net/rds/iw.c
index 576f182..f4a9fff 100644
--- a/net/rds/iw.c
+++ b/net/rds/iw.c
@@ -60,30 +60,20 @@ LIST_HEAD(iw_nodev_conns);
 static void rds_iw_add_one(struct ib_device *device)
 {
struct rds_iw_device *rds_iwdev;
-   struct ib_device_attr *dev_attr;
 
/* Only handle iwarp devices */
if (device->node_type != RDMA_NODE_RNIC)
return;
 
-   dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
-   if (!dev_attr)
-   return;
-
-   if (ib_query_device(device, dev_attr)) {
-   rdsdebug("Query device failed for %s\n", device->name);
-   goto free_attr;
-   }
-
rds_iwdev = kmalloc(sizeof *rds_iwdev, GFP_KERNEL);
if (!rds_iwdev)
-   goto free_attr;
+   return;
 
spin_lock_init(&rds_iwdev->spinlock);
 
-   rds_iwdev->dma_local_lkey = !!(dev_attr->device_cap_flags & 
IB_DEVICE_LOCAL_DMA_LKEY);
-   rds_iwdev->max_wrs = dev_attr->max_qp_wr;
-   rds_iwdev->max_sge = min(dev_attr->max_sge, RDS_IW_MAX_SGE);
+   rds_iwdev->dma_local_lkey = !!(device->attrs.device_cap_flags & 
IB_DEVICE_LOCAL_DMA_LKEY);
+   rds_iwdev->max_wrs = device->attrs.max_qp_wr;
+   rds_iwdev->max_sge = min(device->attrs.max_sge, RDS_IW_MAX_SGE);
 
rds_iwdev->dev = 

[PATCH rdma-next V1 0/7] dev attr cleanup (less is more)

2015-12-18 Thread Or Gerlitz
OK, Doug, this is my suggestion for the dev attr cleanup -- it 
has the advantages of leaving the attrs on a well defined location, 
a field in the IB device, the ability to  get that through smaller 
patches, avoid touching any of the HW drivers, etc.

Or.

changes from V0:
 - addressed Sagi's comment and removed ib_query_device,
   this changed patch #1 and added patch #7.
 - removed ref to Christoph patch in the change-logs of patches 2-6

Ira Weiny (1):
  IB/core: Save the device attributes on the device structure

Or Gerlitz (6):
  IB/core: Avoid calling ib_query_device
  IB/ulps: Avoid calling ib_query_device
  net/rds: Avoid calling ib_query_device
  xprtrdma: Avoid calling ib_query_device
  staging/o2iblnd: Avoid calling ib_query_device
  IB/core: Remove ib_query_device

 drivers/infiniband/core/cm.c   | 12 +-
 drivers/infiniband/core/cma.c  |  8 
 drivers/infiniband/core/device.c   | 27 
 drivers/infiniband/core/fmr_pool.c | 20 +
 drivers/infiniband/core/sysfs.c| 14 ++-
 drivers/infiniband/core/uverbs_cmd.c   | 23 ++-
 drivers/infiniband/core/verbs.c|  8 +---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c| 19 +++--
 drivers/infiniband/ulp/ipoib/ipoib_ethtool.c   | 14 +++
 drivers/infiniband/ulp/ipoib/ipoib_main.c  | 21 +-
 drivers/infiniband/ulp/iser/iscsi_iser.c   |  4 +-
 drivers/infiniband/ulp/iser/iscsi_iser.h   |  2 -
 drivers/infiniband/ulp/iser/iser_memory.c  |  9 ++--
 drivers/infiniband/ulp/iser/iser_verbs.c   | 38 -
 drivers/infiniband/ulp/isert/ib_isert.c| 47 +++--
 drivers/infiniband/ulp/isert/ib_isert.h|  1 -
 drivers/infiniband/ulp/srp/ib_srp.c| 32 ---
 drivers/infiniband/ulp/srpt/ib_srpt.c  | 15 +++
 drivers/infiniband/ulp/srpt/ib_srpt.h  |  3 --
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 21 +-
 include/rdma/ib_verbs.h|  4 +-
 net/rds/ib.c   | 34 +--
 net/rds/iw.c   | 23 +++
 net/sunrpc/xprtrdma/frwr_ops.c |  7 ++--
 net/sunrpc/xprtrdma/svc_rdma_transport.c   | 48 +-
 net/sunrpc/xprtrdma/verbs.c| 24 ---
 net/sunrpc/xprtrdma/xprt_rdma.h|  1 -
 27 files changed, 131 insertions(+), 348 deletions(-)

-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next V1 2/7] IB/core: Avoid calling ib_query_device

2015-12-18 Thread Or Gerlitz
Use the cached copy of the attributes present on the device, except for
the case of a query originating from user-space, where we have to invoke
the driver query_device entry, so they can fill in their udata.

Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/cm.c | 12 +---
 drivers/infiniband/core/cma.c|  8 
 drivers/infiniband/core/fmr_pool.c   | 20 ++--
 drivers/infiniband/core/sysfs.c  | 14 --
 drivers/infiniband/core/uverbs_cmd.c | 23 ---
 drivers/infiniband/core/verbs.c  |  8 +---
 6 files changed, 12 insertions(+), 73 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 0a26dd6..7fa2b94 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3731,16 +3731,6 @@ int ib_cm_init_qp_attr(struct ib_cm_id *cm_id,
 }
 EXPORT_SYMBOL(ib_cm_init_qp_attr);
 
-static void cm_get_ack_delay(struct cm_device *cm_dev)
-{
-   struct ib_device_attr attr;
-
-   if (ib_query_device(cm_dev->ib_device, &attr))
-   cm_dev->ack_delay = 0; /* acks will rely on packet life time */
-   else
-   cm_dev->ack_delay = attr.local_ca_ack_delay;
-}
-
 static ssize_t cm_show_counter(struct kobject *obj, struct attribute *attr,
   char *buf)
 {
@@ -3852,7 +3842,7 @@ static void cm_add_one(struct ib_device *ib_device)
return;
 
cm_dev->ib_device = ib_device;
-   cm_get_ack_delay(cm_dev);
+   cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay;
cm_dev->going_down = 0;
cm_dev->device = device_create(&cm_class, &ib_device->dev,
   MKDEV(0, 0), NULL,
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d2d5d00..f8dfc63 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1894,7 +1894,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
struct rdma_id_private *listen_id, *conn_id;
struct rdma_cm_event event;
int ret;
-   struct ib_device_attr attr;
struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
 
@@ -1936,13 +1935,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
memcpy(cma_src_addr(conn_id), laddr, rdma_addr_size(laddr));
memcpy(cma_dst_addr(conn_id), raddr, rdma_addr_size(raddr));
 
-   ret = ib_query_device(conn_id->id.device, &attr);
-   if (ret) {
-   mutex_unlock(&conn_id->handler_mutex);
-   rdma_destroy_id(new_cm_id);
-   goto out;
-   }
-
memset(&event, 0, sizeof event);
event.event = RDMA_CM_EVENT_CONNECT_REQUEST;
event.param.conn.private_data = iw_event->private_data;
diff --git a/drivers/infiniband/core/fmr_pool.c 
b/drivers/infiniband/core/fmr_pool.c
index 9f5ad7c..6ac3683 100644
--- a/drivers/infiniband/core/fmr_pool.c
+++ b/drivers/infiniband/core/fmr_pool.c
@@ -212,7 +212,6 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd 
*pd,
 {
struct ib_device   *device;
struct ib_fmr_pool *pool;
-   struct ib_device_attr *attr;
int i;
int ret;
int max_remaps;
@@ -228,25 +227,10 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd   
  *pd,
return ERR_PTR(-ENOSYS);
}
 
-   attr = kmalloc(sizeof *attr, GFP_KERNEL);
-   if (!attr) {
-   printk(KERN_WARNING PFX "couldn't allocate device attr 
struct\n");
-   return ERR_PTR(-ENOMEM);
-   }
-
-   ret = ib_query_device(device, attr);
-   if (ret) {
-   printk(KERN_WARNING PFX "couldn't query device: %d\n", ret);
-   kfree(attr);
-   return ERR_PTR(ret);
-   }
-
-   if (!attr->max_map_per_fmr)
+   if (!device->attrs.max_map_per_fmr)
max_remaps = IB_FMR_MAX_REMAPS;
else
-   max_remaps = attr->max_map_per_fmr;
-
-   kfree(attr);
+   max_remaps = device->attrs.max_map_per_fmr;
 
pool = kmalloc(sizeof *pool, GFP_KERNEL);
if (!pool) {
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index b1f37d4..1d5b4b0 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -614,18 +614,12 @@ static ssize_t show_sys_image_guid(struct device *device,
   struct device_attribute *dev_attr, char *buf)
 {
struct ib_device *dev = container_of(device, struct ib_device, dev);
-   struct ib_device_attr attr;
-   ssize_t ret;
-
-   ret = ib_query_device(dev, &attr);
-   if (ret)
-   return ret;
 
   

Re: RoCE passive side failures on 4.4-rc5

2015-12-18 Thread Or Gerlitz

On 12/17/2015 4:18 PM, Sagi Grimberg wrote:
I'm using 4.4-rc2+ and I have RoCE working. 


I tried 4.4-rc2 and I see the same problem. I will make my .config
available to you and Moni so you can try it out on your systems Sunday.

I see this over both CX2 and CX3-pro, both are in VPI config.

BTW getting the 28 (consumer defined) reject reason is something I saw 
coming

and going from time to time in various weird cases over the past years.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure

2015-12-18 Thread Or Gerlitz

On 12/17/2015 7:41 PM, Jason Gunthorpe wrote:

On Thu, Dec 17, 2015 at 03:44:19PM +0200, Sagi Grimberg wrote:

+   ret = ib_query_device(device, &device->attrs);
+   if (ret) {
+   printk(KERN_WARNING "Couldn't query the device attributes\n");
+   goto out;
+   }
+

I thought we're all for removing the call altogether aren't we?

I'd say just call device->query_device() instead.

Christoph's patch even got rid of device->query_device(),


Wrong.

UDATA is a mechanism added by the founders of the RDMA stack to allow 
for LL HW
driver pass data to LL HW user space library piggy backed on uverbs 
commands.


For example, mlx4 uses that to path details of the HW clock which are 
needed for libmlx4.


Christoph noted that and didn't remove the query_device entry from mlx4, he
also madethis somehow inelegant change to uverbs_cmd.c:

@@ -3624,26 +3607,29 @@ int ib_uverbs_ex_query_device(struct 
ib_uverbs_file *file,

if (ucore->outlen < resp.response_length)
return -ENOSPC;

-   memset(&attr, 0, sizeof(attr));
-
-   err = ib_dev->query_device(ib_dev, &attr, uhw);
-   if (err)
-   return err;
+   if (ib_dev->query_device) {
+   int err = ib_dev->query_device(ib_dev, uhw);
+   if (err)
+   return err;
+   } else {
+   if (uhw->inlen || uhw->outlen)
+   return -EINVAL;
+   }



which, IHMO, I prefer to see over this.


It would be wrong to disallow LL drivers to pass data back and forth to/from
their user space sister libraries, this is part of a 10Y old kernel UAPI.


It re-enforces that these values are constants and drivers cannot change them 
on the fly.


yes, we should be looking on how to constify things, but incrementally.
I don't see why block this cleanup now as of this point.

Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH rdma-next 0/6] dev attr cleanup (less is more)

2015-12-17 Thread Or Gerlitz
On Thu, Dec 17, 2015 at 5:36 PM, Christoph Hellwig  wrote:

> [...] I heartily disagree with this approach [...]

OK, I will remove your name and re-submit that
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RoCE passive side failures on 4.4-rc5

2015-12-17 Thread Or Gerlitz

Guys,

Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today,
I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong 
works).


I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the 
one with the 4.4-rc5 kernel can act as both (rping) client/server for IB 
links but only (rping) client for RoCE.


I tried both inter-node and loopback runs, in all cases, the client side 
getsCM
reject with reason 28, see [2], tried both iser and rping. Eth (ICMP, 
TCP) works OK.


Or.

[1] Linus tree up to commit a5e90b1 "Merge branch 'fixes' of 
git://ftp.arm.linux.org.uk/~rmk/linux-arm"


[2] $ rping -d -v -c -a 192.168.31.17 -C 1
verbose
client
count 1
created cm_id 0x60cd90
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x60cd90 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x60cd90 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x60cfe0
created channel 0x608510
created cq 0x607f50
created qp 0x60d000
rping_setup_buffers called on cb 0x606010
allocated & registered buffers...
cq_thread started.
wait for CONNECTED state 10
cma_event type RDMA_CM_EVENT_REJECTED cma_id 0x60cd90 (parent)
connect error -1
rping_free_buffers called on cb 0x606010
cma event RDMA_CM_EVENT_REJECTED, error 28

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next 5/6] xprtrdma: Avoid calling ib_query_device

2015-12-17 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Based on a patch from Christoph Hellwig 

Signed-off-by: Or Gerlitz 
---
 net/sunrpc/xprtrdma/frwr_ops.c   |  7 ++---
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 48 +---
 net/sunrpc/xprtrdma/verbs.c  | 24 ++--
 net/sunrpc/xprtrdma/xprt_rdma.h  |  1 -
 4 files changed, 30 insertions(+), 50 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 88cf9e7..8d4864f 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -190,12 +190,11 @@ static int
 frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep,
 struct rpcrdma_create_data_internal *cdata)
 {
-   struct ib_device_attr *devattr = &ia->ri_devattr;
int depth, delta;
 
ia->ri_max_frmr_depth =
min_t(unsigned int, RPCRDMA_MAX_DATA_SEGS,
- devattr->max_fast_reg_page_list_len);
+ ia->ri_device->attrs.max_fast_reg_page_list_len);
dprintk("RPC:   %s: device's max FR page list len = %u\n",
__func__, ia->ri_max_frmr_depth);
 
@@ -222,8 +221,8 @@ frwr_op_open(struct rpcrdma_ia *ia, struct rpcrdma_ep *ep,
}
 
ep->rep_attr.cap.max_send_wr *= depth;
-   if (ep->rep_attr.cap.max_send_wr > devattr->max_qp_wr) {
-   cdata->max_requests = devattr->max_qp_wr / depth;
+   if (ep->rep_attr.cap.max_send_wr > ia->ri_device->attrs.max_qp_wr) {
+   cdata->max_requests = ia->ri_device->attrs.max_qp_wr / depth;
if (!cdata->max_requests)
return -EINVAL;
ep->rep_attr.cap.max_send_wr = cdata->max_requests *
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c 
b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index b348b4a..4d695af 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -886,10 +886,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
struct rdma_conn_param conn_param;
struct ib_cq_init_attr cq_attr = {};
struct ib_qp_init_attr qp_attr;
-   struct ib_device_attr devattr;
+   struct ib_device *dev;
int uninitialized_var(dma_mr_acc);
int need_dma_mr = 0;
-   int ret;
+   int ret = 0;
int i;
 
listen_rdma = container_of(xprt, struct svcxprt_rdma, sc_xprt);
@@ -910,20 +910,15 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
dprintk("svcrdma: newxprt from accept queue = %p, cm_id=%p\n",
newxprt, newxprt->sc_cm_id);
 
-   ret = ib_query_device(newxprt->sc_cm_id->device, &devattr);
-   if (ret) {
-   dprintk("svcrdma: could not query device attributes on "
-   "device %p, rc=%d\n", newxprt->sc_cm_id->device, ret);
-   goto errout;
-   }
+   dev = newxprt->sc_cm_id->device;
 
/* Qualify the transport resource defaults with the
 * capabilities of this particular device */
-   newxprt->sc_max_sge = min((size_t)devattr.max_sge,
+   newxprt->sc_max_sge = min((size_t)dev->attrs.max_sge,
  (size_t)RPCSVC_MAXPAGES);
-   newxprt->sc_max_sge_rd = min_t(size_t, devattr.max_sge_rd,
+   newxprt->sc_max_sge_rd = min_t(size_t, dev->attrs.max_sge_rd,
   RPCSVC_MAXPAGES);
-   newxprt->sc_max_requests = min((size_t)devattr.max_qp_wr,
+   newxprt->sc_max_requests = min((size_t)dev->attrs.max_qp_wr,
   (size_t)svcrdma_max_requests);
newxprt->sc_sq_depth = RPCRDMA_SQ_DEPTH_MULT * newxprt->sc_max_requests;
 
@@ -931,16 +926,16 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt 
*xprt)
 * Limit ORD based on client limit, local device limit, and
 * configured svcrdma limit.
 */
-   newxprt->sc_ord = min_t(size_t, devattr.max_qp_rd_atom, 
newxprt->sc_ord);
+   newxprt->sc_ord = min_t(size_t, dev->attrs.max_qp_rd_atom, 
newxprt->sc_ord);
newxprt->sc_ord = min_t(size_t, svcrdma_ord, newxprt->sc_ord);
 
-   newxprt->sc_pd = ib_alloc_pd(newxprt->sc_cm_id->device);
+   newxprt->sc_pd = ib_alloc_pd(dev);
if (IS_ERR(newxprt->sc_pd)) {
dprintk("svcrdma: error creating PD for connect request\n");
goto errout;
}
cq_attr.cqe = newxprt->sc_sq_depth;
-   newxprt->sc_sq_cq = ib_create_cq(newxprt->sc_cm_id->device,
+   newxprt->sc_sq_cq = ib_create_cq(dev,
 sq_comp_handler,
   

[PATCH rdma-next 3/6] IB/ulps: Avoid calling ib_query_device

2015-12-17 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Based on a patch from Christoph Hellwig 

Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c  | 19 ---
 drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 14 +++--
 drivers/infiniband/ulp/ipoib/ipoib_main.c| 21 +
 drivers/infiniband/ulp/iser/iscsi_iser.c |  4 +--
 drivers/infiniband/ulp/iser/iscsi_iser.h |  2 --
 drivers/infiniband/ulp/iser/iser_memory.c|  9 +++---
 drivers/infiniband/ulp/iser/iser_verbs.c | 38 ++
 drivers/infiniband/ulp/isert/ib_isert.c  | 47 +---
 drivers/infiniband/ulp/isert/ib_isert.h  |  1 -
 drivers/infiniband/ulp/srp/ib_srp.c  | 32 ++-
 drivers/infiniband/ulp/srpt/ib_srpt.c| 15 -
 drivers/infiniband/ulp/srpt/ib_srpt.h|  3 --
 12 files changed, 63 insertions(+), 142 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 3ae9726..94d144d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1522,8 +1522,7 @@ static void ipoib_cm_create_srq(struct net_device *dev, 
int max_sge)
 int ipoib_cm_dev_init(struct net_device *dev)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
-   int i, ret;
-   struct ib_device_attr attr;
+   int max_srq_sge, i;
 
INIT_LIST_HEAD(&priv->cm.passive_ids);
INIT_LIST_HEAD(&priv->cm.reap_list);
@@ -1540,19 +1539,13 @@ int ipoib_cm_dev_init(struct net_device *dev)
 
skb_queue_head_init(&priv->cm.skb_queue);
 
-   ret = ib_query_device(priv->ca, &attr);
-   if (ret) {
-   printk(KERN_WARNING "ib_query_device() failed with %d\n", ret);
-   return ret;
-   }
-
-   ipoib_dbg(priv, "max_srq_sge=%d\n", attr.max_srq_sge);
+   ipoib_dbg(priv, "max_srq_sge=%d\n", priv->ca->attrs.max_srq_sge);
 
-   attr.max_srq_sge = min_t(int, IPOIB_CM_RX_SG, attr.max_srq_sge);
-   ipoib_cm_create_srq(dev, attr.max_srq_sge);
+   max_srq_sge = min_t(int, IPOIB_CM_RX_SG, priv->ca->attrs.max_srq_sge);
+   ipoib_cm_create_srq(dev, max_srq_sge);
if (ipoib_cm_has_srq(dev)) {
-   priv->cm.max_cm_mtu = attr.max_srq_sge * PAGE_SIZE - 0x10;
-   priv->cm.num_frags  = attr.max_srq_sge;
+   priv->cm.max_cm_mtu = max_srq_sge * PAGE_SIZE - 0x10;
+   priv->cm.num_frags  = max_srq_sge;
ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_frags=%d\n",
  priv->cm.max_cm_mtu, priv->cm.num_frags);
} else {
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
index 078cadd..a53fa5f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
@@ -40,15 +40,11 @@ static void ipoib_get_drvinfo(struct net_device *netdev,
  struct ethtool_drvinfo *drvinfo)
 {
struct ipoib_dev_priv *priv = netdev_priv(netdev);
-   struct ib_device_attr *attr;
-
-   attr = kmalloc(sizeof(*attr), GFP_KERNEL);
-   if (attr && !ib_query_device(priv->ca, attr))
-   snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
-"%d.%d.%d", (int)(attr->fw_ver >> 32),
-(int)(attr->fw_ver >> 16) & 0x,
-(int)attr->fw_ver & 0x);
-   kfree(attr);
+
+   snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+"%d.%d.%d", (int)(priv->ca->attrs.fw_ver >> 32),
+(int)(priv->ca->attrs.fw_ver >> 16) & 0x,
+(int)priv->ca->attrs.fw_ver & 0x);
 
strlcpy(drvinfo->bus_info, dev_name(priv->ca->dma_device),
sizeof(drvinfo->bus_info));
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7d32818..58732c5 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1777,26 +1777,7 @@ int ipoib_add_pkey_attr(struct net_device *dev)
 
 int ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca)
 {
-   struct ib_device_attr *device_attr;
-   int result = -ENOMEM;
-
-   device_attr = kmalloc(sizeof *device_attr, GFP_KERNEL);
-   if (!device_attr) {
-   printk(KERN_WARNING "%s: allocation of %zu bytes failed\n",
-  hca->name, sizeof *device_attr);
-   return result;
-   }
-
-   result = ib_query_device(hca, device_attr);
-   if (result) {
-   printk(KERN_WARNING "%s: ib_

[PATCH rdma-next 4/6] net/rds: Avoid calling ib_query_device

2015-12-17 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Based on a patch from Christoph Hellwig 

Signed-off-by: Or Gerlitz 
---
 net/rds/ib.c | 34 +++---
 net/rds/iw.c | 23 +--
 2 files changed, 16 insertions(+), 41 deletions(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index f222885..9481d55 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -122,44 +122,34 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
 static void rds_ib_add_one(struct ib_device *device)
 {
struct rds_ib_device *rds_ibdev;
-   struct ib_device_attr *dev_attr;
 
/* Only handle IB (no iWARP) devices */
if (device->node_type != RDMA_NODE_IB_CA)
return;
 
-   dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
-   if (!dev_attr)
-   return;
-
-   if (ib_query_device(device, dev_attr)) {
-   rdsdebug("Query device failed for %s\n", device->name);
-   goto free_attr;
-   }
-
rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL,
 ibdev_to_node(device));
if (!rds_ibdev)
-   goto free_attr;
+   return;
 
spin_lock_init(&rds_ibdev->spinlock);
atomic_set(&rds_ibdev->refcount, 1);
INIT_WORK(&rds_ibdev->free_work, rds_ib_dev_free);
 
-   rds_ibdev->max_wrs = dev_attr->max_qp_wr;
-   rds_ibdev->max_sge = min(dev_attr->max_sge, RDS_IB_MAX_SGE);
+   rds_ibdev->max_wrs = device->attrs.max_qp_wr;
+   rds_ibdev->max_sge = min(device->attrs.max_sge, RDS_IB_MAX_SGE);
 
-   rds_ibdev->fmr_max_remaps = dev_attr->max_map_per_fmr?: 32;
-   rds_ibdev->max_1m_fmrs = dev_attr->max_mr ?
-   min_t(unsigned int, (dev_attr->max_mr / 2),
+   rds_ibdev->fmr_max_remaps = device->attrs.max_map_per_fmr?: 32;
+   rds_ibdev->max_1m_fmrs = device->attrs.max_mr ?
+   min_t(unsigned int, (device->attrs.max_mr / 2),
  rds_ib_fmr_1m_pool_size) : rds_ib_fmr_1m_pool_size;
 
-   rds_ibdev->max_8k_fmrs = dev_attr->max_mr ?
-   min_t(unsigned int, ((dev_attr->max_mr / 2) * RDS_MR_8K_SCALE),
+   rds_ibdev->max_8k_fmrs = device->attrs.max_mr ?
+   min_t(unsigned int, ((device->attrs.max_mr / 2) * 
RDS_MR_8K_SCALE),
  rds_ib_fmr_8k_pool_size) : rds_ib_fmr_8k_pool_size;
 
-   rds_ibdev->max_initiator_depth = dev_attr->max_qp_init_rd_atom;
-   rds_ibdev->max_responder_resources = dev_attr->max_qp_rd_atom;
+   rds_ibdev->max_initiator_depth = device->attrs.max_qp_init_rd_atom;
+   rds_ibdev->max_responder_resources = device->attrs.max_qp_rd_atom;
 
rds_ibdev->dev = device;
rds_ibdev->pd = ib_alloc_pd(device);
@@ -183,7 +173,7 @@ static void rds_ib_add_one(struct ib_device *device)
}
 
rdsdebug("RDS/IB: max_mr = %d, max_wrs = %d, max_sge = %d, 
fmr_max_remaps = %d, max_1m_fmrs = %d, max_8k_fmrs = %d\n",
-dev_attr->max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge,
+device->attrs.max_fmr, rds_ibdev->max_wrs, rds_ibdev->max_sge,
 rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_fmrs,
 rds_ibdev->max_8k_fmrs);
 
@@ -202,8 +192,6 @@ static void rds_ib_add_one(struct ib_device *device)
 
 put_dev:
rds_ib_dev_put(rds_ibdev);
-free_attr:
-   kfree(dev_attr);
 }
 
 /*
diff --git a/net/rds/iw.c b/net/rds/iw.c
index 576f182..f4a9fff 100644
--- a/net/rds/iw.c
+++ b/net/rds/iw.c
@@ -60,30 +60,20 @@ LIST_HEAD(iw_nodev_conns);
 static void rds_iw_add_one(struct ib_device *device)
 {
struct rds_iw_device *rds_iwdev;
-   struct ib_device_attr *dev_attr;
 
/* Only handle iwarp devices */
if (device->node_type != RDMA_NODE_RNIC)
return;
 
-   dev_attr = kmalloc(sizeof *dev_attr, GFP_KERNEL);
-   if (!dev_attr)
-   return;
-
-   if (ib_query_device(device, dev_attr)) {
-   rdsdebug("Query device failed for %s\n", device->name);
-   goto free_attr;
-   }
-
rds_iwdev = kmalloc(sizeof *rds_iwdev, GFP_KERNEL);
if (!rds_iwdev)
-   goto free_attr;
+   return;
 
spin_lock_init(&rds_iwdev->spinlock);
 
-   rds_iwdev->dma_local_lkey = !!(dev_attr->device_cap_flags & 
IB_DEVICE_LOCAL_DMA_LKEY);
-   rds_iwdev->max_wrs = dev_attr->max_qp_wr;
-   rds_iwdev->max_sge = min(dev_attr->max_sge, RDS_IW_MAX_SGE);
+   rds_iwdev->dma_local_lkey = !!(device->attrs.device_cap_flags & 
IB_DEVICE_LOCAL_DMA_LKEY);
+   rds_iwdev->max_wrs = device->attrs.max_qp_wr;
+   rds_iwdev->max_sge = min(device-&

[PATCH rdma-next 6/6] staging/o2iblnd: Avoid calling ib_query_device

2015-12-17 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Based on a patch from Christoph Hellwig 

Signed-off-by: Or Gerlitz 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 7c730e3..7231ef8 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -2070,32 +2070,13 @@ static int kiblnd_net_init_pools(kib_net_t *net, __u32 
*cpts, int ncpts)
 
 static int kiblnd_hdev_get_attr(kib_hca_dev_t *hdev)
 {
-   struct ib_device_attr *attr;
-   int rc;
-
/* It's safe to assume a HCA can handle a page size
 * matching that of the native system */
hdev->ibh_page_shift = PAGE_SHIFT;
hdev->ibh_page_size  = 1 << PAGE_SHIFT;
hdev->ibh_page_mask  = ~((__u64)hdev->ibh_page_size - 1);
 
-   LIBCFS_ALLOC(attr, sizeof(*attr));
-   if (attr == NULL) {
-   CERROR("Out of memory\n");
-   return -ENOMEM;
-   }
-
-   rc = ib_query_device(hdev->ibh_ibdev, attr);
-   if (rc == 0)
-   hdev->ibh_mr_size = attr->max_mr_size;
-
-   LIBCFS_FREE(attr, sizeof(*attr));
-
-   if (rc != 0) {
-   CERROR("Failed to query IB device: %d\n", rc);
-   return rc;
-   }
-
+   hdev->ibh_mr_size = hdev->ibh_ibdev->attrs.max_mr_size;
if (hdev->ibh_mr_size == ~0ULL) {
hdev->ibh_mr_shift = 64;
return 0;
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next 2/6] IB/core: Avoid calling ib_query_device when possible

2015-12-17 Thread Or Gerlitz
Instead, use the cached copy of the attributes present on the device.

Based on a patch from Christoph Hellwig 

Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/cm.c | 12 +---
 drivers/infiniband/core/cma.c|  8 
 drivers/infiniband/core/fmr_pool.c   | 20 ++--
 drivers/infiniband/core/sysfs.c  | 14 --
 drivers/infiniband/core/uverbs_cmd.c | 23 ---
 drivers/infiniband/core/verbs.c  |  8 +---
 6 files changed, 12 insertions(+), 73 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 0a26dd6..7fa2b94 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3731,16 +3731,6 @@ int ib_cm_init_qp_attr(struct ib_cm_id *cm_id,
 }
 EXPORT_SYMBOL(ib_cm_init_qp_attr);
 
-static void cm_get_ack_delay(struct cm_device *cm_dev)
-{
-   struct ib_device_attr attr;
-
-   if (ib_query_device(cm_dev->ib_device, &attr))
-   cm_dev->ack_delay = 0; /* acks will rely on packet life time */
-   else
-   cm_dev->ack_delay = attr.local_ca_ack_delay;
-}
-
 static ssize_t cm_show_counter(struct kobject *obj, struct attribute *attr,
   char *buf)
 {
@@ -3852,7 +3842,7 @@ static void cm_add_one(struct ib_device *ib_device)
return;
 
cm_dev->ib_device = ib_device;
-   cm_get_ack_delay(cm_dev);
+   cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay;
cm_dev->going_down = 0;
cm_dev->device = device_create(&cm_class, &ib_device->dev,
   MKDEV(0, 0), NULL,
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d2d5d00..f8dfc63 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1894,7 +1894,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
struct rdma_id_private *listen_id, *conn_id;
struct rdma_cm_event event;
int ret;
-   struct ib_device_attr attr;
struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
 
@@ -1936,13 +1935,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
memcpy(cma_src_addr(conn_id), laddr, rdma_addr_size(laddr));
memcpy(cma_dst_addr(conn_id), raddr, rdma_addr_size(raddr));
 
-   ret = ib_query_device(conn_id->id.device, &attr);
-   if (ret) {
-   mutex_unlock(&conn_id->handler_mutex);
-   rdma_destroy_id(new_cm_id);
-   goto out;
-   }
-
memset(&event, 0, sizeof event);
event.event = RDMA_CM_EVENT_CONNECT_REQUEST;
event.param.conn.private_data = iw_event->private_data;
diff --git a/drivers/infiniband/core/fmr_pool.c 
b/drivers/infiniband/core/fmr_pool.c
index 9f5ad7c..6ac3683 100644
--- a/drivers/infiniband/core/fmr_pool.c
+++ b/drivers/infiniband/core/fmr_pool.c
@@ -212,7 +212,6 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd 
*pd,
 {
struct ib_device   *device;
struct ib_fmr_pool *pool;
-   struct ib_device_attr *attr;
int i;
int ret;
int max_remaps;
@@ -228,25 +227,10 @@ struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd   
  *pd,
return ERR_PTR(-ENOSYS);
}
 
-   attr = kmalloc(sizeof *attr, GFP_KERNEL);
-   if (!attr) {
-   printk(KERN_WARNING PFX "couldn't allocate device attr 
struct\n");
-   return ERR_PTR(-ENOMEM);
-   }
-
-   ret = ib_query_device(device, attr);
-   if (ret) {
-   printk(KERN_WARNING PFX "couldn't query device: %d\n", ret);
-   kfree(attr);
-   return ERR_PTR(ret);
-   }
-
-   if (!attr->max_map_per_fmr)
+   if (!device->attrs.max_map_per_fmr)
max_remaps = IB_FMR_MAX_REMAPS;
else
-   max_remaps = attr->max_map_per_fmr;
-
-   kfree(attr);
+   max_remaps = device->attrs.max_map_per_fmr;
 
pool = kmalloc(sizeof *pool, GFP_KERNEL);
if (!pool) {
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index b1f37d4..1d5b4b0 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -614,18 +614,12 @@ static ssize_t show_sys_image_guid(struct device *device,
   struct device_attribute *dev_attr, char *buf)
 {
struct ib_device *dev = container_of(device, struct ib_device, dev);
-   struct ib_device_attr attr;
-   ssize_t ret;
-
-   ret = ib_query_device(dev, &attr);
-   if (ret)
-   return ret;
 
return sprintf(buf, "%04x:%04x:%04x:%04x\n",
-  be16_to_cpu(((__be16 *) &attr.sys_

[PATCH rdma-next 1/6] IB/core: Save the device attributes on the device structure

2015-12-17 Thread Or Gerlitz
From: Ira Weiny 

This way both the IB core and upper level drivers can access these cached
device attributes rather than querying or caching them on their own.

Signed-off-by: Ira Weiny 
Signed-off-by: Or Gerlitz 
---
 drivers/infiniband/core/device.c | 6 ++
 include/rdma/ib_verbs.h  | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 179e813..398353b 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -352,6 +352,12 @@ int ib_register_device(struct ib_device *device,
goto out;
}
 
+   ret = ib_query_device(device, &device->attrs);
+   if (ret) {
+   printk(KERN_WARNING "Couldn't query the device attributes\n");
+   goto out;
+   }
+
ret = ib_device_register_sysfs(device, port_callback);
if (ret) {
printk(KERN_WARNING "Couldn't register device %s with driver 
model\n",
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d56b39a..75a22533 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1828,6 +1828,7 @@ struct ib_device {
u16  is_switch:1;
u8   node_type;
u8   phys_port_cnt;
+   struct ib_device_attrattrs;
 
/**
 * The following mandatory functions are used only at device
-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH rdma-next 0/6] dev attr cleanup (less is more)

2015-12-17 Thread Or Gerlitz
OK, Doug, this is my suggestion for the dev attr cleanup -- it 
has the advantages of leaving the attrs on a well defined location, 
a field in the IB device, the ability to  get that through smaller 
patches, avoid touching any of the HW drivers, etc.

Note that in uverbs there's one spot where udata is involved 
(the extended query device call) so we keep the ib_query_device 
call there.

I used hunks from Christoph's work and mentioned that in the 
change-logs. This can turn to be his signature, if he wants to.

Or.

Ira Weiny (1):
  IB/core: Save the device attributes on the device structure

Or Gerlitz (5):
  IB/core: Avoid calling ib_query_device when possible
  IB/ulps: Avoid calling ib_query_device
  net/rds: Avoid calling ib_query_device
  xprtrdma: Avoid calling ib_query_device
  staging/o2iblnd: Avoid calling ib_query_device

 drivers/infiniband/core/cm.c   | 12 +-
 drivers/infiniband/core/cma.c  |  8 
 drivers/infiniband/core/device.c   |  6 +++
 drivers/infiniband/core/fmr_pool.c | 20 +
 drivers/infiniband/core/sysfs.c| 14 ++-
 drivers/infiniband/core/uverbs_cmd.c   | 23 ++-
 drivers/infiniband/core/verbs.c|  8 +---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c| 19 +++--
 drivers/infiniband/ulp/ipoib/ipoib_ethtool.c   | 14 +++
 drivers/infiniband/ulp/ipoib/ipoib_main.c  | 21 +-
 drivers/infiniband/ulp/iser/iscsi_iser.c   |  4 +-
 drivers/infiniband/ulp/iser/iscsi_iser.h   |  2 -
 drivers/infiniband/ulp/iser/iser_memory.c  |  9 ++--
 drivers/infiniband/ulp/iser/iser_verbs.c   | 38 -
 drivers/infiniband/ulp/isert/ib_isert.c| 47 +++--
 drivers/infiniband/ulp/isert/ib_isert.h|  1 -
 drivers/infiniband/ulp/srp/ib_srp.c| 32 ---
 drivers/infiniband/ulp/srpt/ib_srpt.c  | 15 +++
 drivers/infiniband/ulp/srpt/ib_srpt.h  |  3 --
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 21 +-
 include/rdma/ib_verbs.h|  1 +
 net/rds/ib.c   | 34 +--
 net/rds/iw.c   | 23 +++
 net/sunrpc/xprtrdma/frwr_ops.c |  7 ++--
 net/sunrpc/xprtrdma/svc_rdma_transport.c   | 48 +-
 net/sunrpc/xprtrdma/verbs.c| 24 ---
 net/sunrpc/xprtrdma/xprt_rdma.h|  1 -
 27 files changed, 129 insertions(+), 326 deletions(-)

-- 
2.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup

2015-12-16 Thread Or Gerlitz

On 12/16/2015 3:40 PM, Sagi Grimberg wrote:

I really don't have a strong preference on either of the approaches. I
just want to see this included one way or the other. 


sure, agree, I will send my patches tomorrow
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup

2015-12-15 Thread Or Gerlitz

On 12/15/2015 9:03 PM, Doug Ledford wrote:


Or, you specifically asked me to wait until this week.  I made my
initial impressions clear (I don't necessarily like the removal of the
attr struct, but I like the removal of all of the query calls, and I'm
inclined to take the patch in spite of not liking the removal of the
struct).  Do you have anything to add or have we beat this horse to death?


Hi Doug,

Lets stop beating, both horses and people.

I do understand that

1. you don't link the removal of the attr
2. you do like the removal of all the query calls

I am proposing to take the path of a patch that
does exactly #2 while avoiding #1.

What's wrong with that? I haven't heard any reasoning for why its
so good to stash ~50 new fields on the IB device structure except
for the author saying that other subsystems do that and other people
saying they are in favor of this approach while not providing any
reasoning, except for maybe something on bikes.

Why you or anyone else has to be from now and ever the cache line police
making sure that people don't add new attributes in random locations
over the IB device structure?

What's wrong with putting fifty attributesin a structure which is a field
of the device struct and have people go there to see what are the d
ifferentattrs and add news ones there?

This will make the 4.5 merge window extremely complex or even totally
threatened  w.r.t to the RDMA subsystem and related drivers by 3.3K LOC 
patch.


Sorry, but, I still don't get it.

Or.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for-next 6/6] IB/mlx5: Add Raw Packet Queue Pair (QP) support

2015-12-15 Thread Or Gerlitz

On 12/10/2015 10:09 PM, Majd Dibbiny wrote:

This patchs adds support for Raw Packet QP for the mlx5 device.

Raw Packet QP, unlike other QP types, has no matching mlx5_core_qp
object but rather it is built of RQ/SQ/TIR/TIS/TD mlx5_core object.

The Raw Packet QP state changes are implemented by changing the
state of the sub-objects.

Since the SQ and RQ work-queue (WQ) buffers are not contiguous like
other QPs, we allocate separate buffers in the user-space and pass
the address of each one of them separately to the kernel.

Signed-off-by: Majd Dibbiny
---
  drivers/infiniband/hw/mlx5/main.c  |  15 +-
  drivers/infiniband/hw/mlx5/mlx5_ib.h   |  52 +-
  drivers/infiniband/hw/mlx5/qp.c| 894 ++---
  drivers/infiniband/hw/mlx5/user.h  |   1 +
  drivers/net/ethernet/mellanox/mlx5/core/qp.c   |  48 +-
  drivers/net/ethernet/mellanox/mlx5/core/transobj.c |  39 +
  include/linux/mlx5/mlx5_ifc.h  |  11 +-
  include/linux/mlx5/qp.h|   3 +-
  include/linux/mlx5/transobj.h  |   4 +
  9 files changed, 913 insertions(+), 154 deletions(-)


Seems way too big for single patch, seek how to split to 2-3 patches

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)

2015-12-08 Thread Or Gerlitz
On Wed, Dec 9, 2015 at 1:13 AM, Or Gerlitz  wrote:
> On Wed, Dec 9, 2015 at 12:59 AM, Jason Gunthorpe
>  wrote:
>> On Wed, Dec 09, 2015 at 12:47:55AM +0200, Or Gerlitz wrote:
>
>>> The patch is three liner to add the cached attrs --
>>> http://marc.info/?l=linux-rdma&m=142309296813985&w=2 -- if you are OK
>>> with that, I will add a 2nd patch that ports all ULPs to use the
>>> cached copy instead of their code which does the query.
>
>>> Actually, why not start with this approach and later decide if we need
>>> to go further of this is enough?
>
>> Or, can we please stop this bikeshedding. Christoph's patch is done,
>> well tested and does a lot more clean up that this hacky three liner.
>
> Christoph patch is here indeed, it does two things
>
> 1. remove all the ULP device attr alloc, device query, attr free hassle
> 2. adds tons of new fields to struct ib_device
>
> I think it just goes too much and needlessly adds tons of these new
> fields directly to struct ib_device where we can have them all well
> scoped into ib_device_attr member or pointer from struct ib_device
>
>> It is a good patch,
>
> I didn't say it's bad, I said I think we can achieve #1 above with way
> less drastic patch, and I'd like to hear the maintainer option, and
> have the chance to argu with the maintainer if needed, yours I heard,
> you like it, I know.

and I will be OOO for the rest of this week, since this is in the air
for ***months***, it would be fair enough to ask the maintainer not to
cut it this way or another before next week, Doug, agree?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)

2015-12-08 Thread Or Gerlitz
On Wed, Dec 9, 2015 at 12:59 AM, Jason Gunthorpe
 wrote:
> On Wed, Dec 09, 2015 at 12:47:55AM +0200, Or Gerlitz wrote:

>> The patch is three liner to add the cached attrs --
>> http://marc.info/?l=linux-rdma&m=142309296813985&w=2 -- if you are OK
>> with that, I will add a 2nd patch that ports all ULPs to use the
>> cached copy instead of their code which does the query.

>> Actually, why not start with this approach and later decide if we need
>> to go further of this is enough?

> Or, can we please stop this bikeshedding. Christoph's patch is done,
> well tested and does a lot more clean up that this hacky three liner.

Christoph patch is here indeed, it does two things

1. remove all the ULP device attr alloc, device query, attr free hassle
2. adds tons of new fields to struct ib_device

I think it just goes too much and needlessly adds tons of these new
fields directly to struct ib_device where we can have them all well
scoped into ib_device_attr member or pointer from struct ib_device

> It is a good patch,

I didn't say it's bad, I said I think we can achieve #1 above with way
less drastic patch, and I'd like to hear the maintainer option, and
have the chance to argu with the maintainer if needed, yours I heard,
you like it, I know.

> and although patchworks doesn't have my remarks
> from an earlier revision I still think it should go forward.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)

2015-12-08 Thread Or Gerlitz
On Wed, Dec 9, 2015 at 12:04 AM, Doug Ledford  wrote:
> Makes sense.

thanks.

> Show me what you are talking about (either a link to Ira's
> patch you are referring to or your own patch).

The patch is three liner to add the cached attrs --
http://marc.info/?l=linux-rdma&m=142309296813985&w=2 -- if you are OK
with that, I will add a 2nd patch that ports all ULPs to use the
cached copy instead of their code which does the query.

Actually, why not start with this approach and later decide if we need
to go further of this is enough?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 4.4's rdma plate

2015-12-08 Thread Or Gerlitz
On Fri, Oct 23, 2015 at 5:52 PM, Doug Ledford  wrote:
> On 10/23/2015 02:46 AM, Or Gerlitz wrote:

>> Doug, there are few singleton patches to mlxN drivers, just to make
>> sure they are not left behind (thanks)

>> a small mlx5 enhancement, Signed by Eli C. the driver maintainer
>> https://patchwork.kernel.org/patch/7454641

>> two small mlx4 fixes, acked by me...
>> https://patchwork.kernel.org/patch/7348911/
>> https://patchwork.kernel.org/patch/7348871/

> Yes, I know.  There are quite a few different things for mlx5.  I was
> waiting on all of them until I knew that RoCEv2 was in as that gates
> whether or not mlx5 RoCE can go in.

Doug,

So RoCEv2 didn't make it for 4.4, but still, the above are changes
touching driver only code and I don't see why they should be waiting
so long.

Can we finally see them in a k.o branch of yours please? either for
your 4.4-rc pull or on 4.5 branch.

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device attr cleanup (was: Handle mlx4 max_sge_rd correctly)

2015-12-08 Thread Or Gerlitz
On Tue, Dec 8, 2015 at 8:17 PM, Doug Ledford  wrote:
>
> On 12/08/2015 01:13 PM, Sagi Grimberg wrote:


>
> > I mentioned this in v1. This patch set is applied over Christoph's
> > device attributes patch. Will it go in as well?
>
> No, that's too big and not the right sort of fix for 4.4-rc.  I had it
> on my radar for getting into for-next.



Doug, re the device attribute patch, I have expressed my opinion that
--- we should be going in a slightly different direction of stashing a
struct ib_device_attr as a direct or pointer field of struct ib_device
and have the device or the core to fill it up just before the device
instance creation with the core is to be complete. This way we can
remove all the annoying calls from ULPs to query device and avoid
adding so many fields to the device structure itself.

So there are two suggestions on the table here and we need to hear the
maintainer thinking. I do expect that your response will not be "I
applied X" but rather you'll allow the time to try and convince you on
whatever direction, this is pending for long time and we're on
4.4-rc4, there should be no rush to cut this over night and push to
k.o the code for this small cleanup.

I didn't submit a patch that follows my suggestion, but Ira did so in
one of the OPA submission rounds, if you're OK with that proposal or
if you even just want to see the patch to cast your vote/decision, I
will rebase it and complete.

Makes sense?

Or.

>
> So, it'll have to be fixed up to apply over the top of your patches which 
> will already be applied.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


stalled again

2015-12-01 Thread Or Gerlitz

Doug,

We're against into this... upstream is on 4.4-rc3 while your latest 
branch in kernel.org (the one that carries thefor-next tag) is rebased 
to 4.3-rc3...


--> our internal build and review systems for patches to linux-rdma 
can'tmake any use of your tree. People here have to rebase their work 
against their own clones of Linus tree and can't work with our internal 
Gerrit rdma-next branch, etc, etc.


We have to work in a very inefficient manner.

Any reason on earth not to rebase your tree (== create new branch, move 
their the for-next tag) after an rc1 is out? how long does this take?


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] IB/core: correct issue with sge copyin corrupting wr

2015-11-30 Thread Or Gerlitz
On Mon, Nov 30, 2015 at 4:34 PM, Mike Marciniszyn
 wrote:

Being a non native English speaker I am not fully sure, but "copyin
corrupting" sounds like slang / street talking... proper English
please

> Commit e622f2f4ad21 ("IB: split struct ib_send_wr")
> introduced a regression for HCAs whose user mode post
> sends go through ib_uverbs_post_send().
>
> The code didn't account for the fact that the first sge is
> offset by an operation dependent length.  The allocation did,
> but the pointer to the sge list is computed without that knowledge.
>
> Store the operation dependent length in an automatic and
> compute the sge list copy in destination using that length.

in an automatic what?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 10/10] IB/iser: Support the remote invalidation exception

2015-11-30 Thread Or Gerlitz
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg  wrote:
>
> From: Jenny Derzhavetz 
>
> Declare that we support remote invalidation in case we are:
> 1. using Fastreg method
> 2. always registering memory.

decide if you want or don't want to use periods @ the end of these
cases, but be consistent... I vote not to use periods

>
> Detect the invalidated rkey from the work completion info so we
> won't invalidate it locally. The spec madates that we must not rely
> on the taget remote invalidate our rkey so we must check it upon
> a receive (scsi response) completion.
>

s/madates/mandates/ and run spell check over the change-logs of the
whole series please before posting v2



>
> Signed-off-by: Jenny Derzhavetz 
> Signed-off-by: Sagi Grimberg 
> ---
>  drivers/infiniband/ulp/iser/iscsi_iser.h |  5 ++-
>  drivers/infiniband/ulp/iser/iser_initiator.c | 59 
> +++-
>  drivers/infiniband/ulp/iser/iser_memory.c|  4 +-
>  drivers/infiniband/ulp/iser/iser_verbs.c | 21 +++---
>  4 files changed, 80 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h 
> b/drivers/infiniband/ulp/iser/iscsi_iser.h
> index c79fdba6f969..e399d9a49df2 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.h
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
> @@ -358,6 +358,7 @@ struct iser_reg_ops {
>   * cpus and device max completion vectors
>   * @comps: Dinamically allocated array of completion handlers
>   * @reg_ops:   Registration ops
> + * @remote_inv_sup: Remote invalidate is supported on this device
>   */
>  struct iser_device {
> struct ib_device *ib_device;
> @@ -370,6 +371,7 @@ struct iser_device {
> int  comps_used;
> struct iser_comp *comps;
> struct iser_reg_ops  *reg_ops;
> +   bool remote_inv_sup;
>  };
>
>  #define ISER_CHECK_GUARD   0xc0
> @@ -519,6 +521,7 @@ struct iser_conn {
> u32  num_rx_descs;
> unsigned short   scsi_sg_tablesize;
> unsigned int scsi_max_sectors;
> +   bool snd_w_inv;
>  };
>
>  /**
> @@ -602,7 +605,7 @@ int iser_conn_terminate(struct iser_conn *iser_conn);
>  void iser_release_work(struct work_struct *work);
>
>  void iser_rcv_completion(struct iser_rx_desc *desc,
> -unsigned long dto_xfer_len,
> +struct ib_wc *wc,
>  struct ib_conn *ib_conn);
>
>  void iser_snd_completion(struct iser_tx_desc *desc,
> diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c 
> b/drivers/infiniband/ulp/iser/iser_initiator.c
> index 6a968e350c14..cd12f89b3365 100644
> --- a/drivers/infiniband/ulp/iser/iser_initiator.c
> +++ b/drivers/infiniband/ulp/iser/iser_initiator.c
> @@ -563,11 +563,61 @@ send_control_error:
> return err;
>  }
>
> +static inline void
> +iser_inv_desc(struct iser_fr_desc *desc, u32 rkey)
> +{
> +   if (likely(rkey == desc->rsc.mr->rkey))
> +   desc->rsc.mr_valid = 0;
> +   else if (likely(rkey == desc->pi_ctx->sig_mr->rkey))
> +   desc->pi_ctx->sig_mr_valid = 0;
> +}
> +
> +static int
> +iser_check_remote_inv(struct iser_conn *iser_conn,
> + struct ib_wc *wc,
> + struct iscsi_hdr *hdr)
> +{
> +   if (wc->wc_flags & IB_WC_WITH_INVALIDATE) {
> +   struct iscsi_task *task;
> +   u32 rkey = wc->ex.invalidate_rkey;
> +
> +   iser_dbg("conn %p: remote invalidation for rkey %#x\n",
> +iser_conn, rkey);
> +
> +   if (unlikely(!iser_conn->snd_w_inv)) {
> +   iser_err("conn %p: unexepected remote invalidation, "
> +"terminating connection\n", iser_conn);
> +   return -EPROTO;
> +   }
> +
> +   task = iscsi_itt_to_ctask(iser_conn->iscsi_conn, hdr->itt);
> +   if (likely(task)) {
> +   struct iscsi_iser_task *iser_task = task->dd_data;
> +   struct iser_fr_desc *desc;
> +
> +   if (iser_task->dir[ISER_DIR_IN]) {
> +   desc = iser_task->rdma_reg[ISER_DIR_IN].mem_h;
> +   iser_inv_desc(desc, rkey);
> +   }
> +
> +   if (iser_task->dir[ISER_DIR_OUT]) {
> +   desc = 
> iser_task->rdma_reg[ISER_DIR_OUT].mem_h;
> +   iser_inv_desc(desc, rkey);
> +   }
> +   } else {
> +   iser_err("failed to get task for itt=%d\n", hdr->itt);
> +   return -EINVAL;
> +   }
> +   }
> +
> +   return 0;
> +}
> +
>  /**
>   * iser_rcv_dto_completion - recv DTO completion
>   */
>  vo

Re: [PATCH v1 08/10] iser-target: Support the remote invalidation exception

2015-11-25 Thread Or Gerlitz
On Wed, Nov 25, 2015 at 10:48 AM, Sagi Grimberg
 wrote:
> On 25/11/2015 10:41, Or Gerlitz wrote:
>> On Wed, Nov 25, 2015 at 9:55 AM, Sagi Grimberg 
>> wrote:

>> I see, so if this is case,  can you eliminate one the checks here

>>>>> +   if (isert_conn->snd_w_inv && isert_cmd->inv_rkey) {

> This are *exactly* the checks that enforce what I said above.
> If we remove that we'd step on the bug you mentioned.

> We do remote invalidate only if:
> - we are allowed to (send_w_inv)
> - initiator passed us rkey (inv_rkey).

yep, should be probably OK.

You didn't respond to my comment re adding bools vs bit-fields vs bit-flags

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Future of FMR support, was: Re: [PATCH v1 5/9] xprtrdma: Add ro_unmap_sync method for FMR

2015-11-25 Thread Or Gerlitz
On Wed, Nov 25, 2015 at 7:09 PM, santosh shilimkar
 wrote:
>>> As already indicated to Sagi [1], RDS IB FR support is work in
>>> progress and I was hoping to get it ready for 4.5.

These are really good news! can you please elaborate a bit on the
design changes this move introduced in RDS?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 08/10] iser-target: Support the remote invalidation exception

2015-11-25 Thread Or Gerlitz
On Wed, Nov 25, 2015 at 9:55 AM, Sagi Grimberg  wrote:

> For NO_DATA commands the iser specification explicitly states that
> read_stag and write_stag should be 0 which means that inv_rkey is 0
> too. We don't do remote invalidate in case inv_rkey is 0.

I see, so if this is case,  can you eliminate one the checks here

>>> +   if (isert_conn->snd_w_inv && isert_cmd->inv_rkey) {
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 09/10] IB/iser: Increment the rkey when registering and not when invalidating

2015-11-24 Thread Or Gerlitz
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg  wrote:
> With remote invalidate we won't local invalidate
> but we still want to increment the rkey.

nit, better to phrase is as "we won't do local invalidate"
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 03/10] IB/iser: Don't register memory for all immediatedata writes

2015-11-24 Thread Or Gerlitz
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg  wrote:
> From: Jenny Derzhavetz 
>
> When all the task data is sent as immeidatedata, we are
> allowed to use the local_dma_lkey as it is not sent to
> the wire. In the long run we'd really need to rework
> the wire.

the part below doesn't fit into change-log which will be there forever
in the kernel-logs, remove it

> In the long run we'd really need to rework
> the memory registration flow only when we need rkeys.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 08/10] iser-target: Support the remote invalidation exception

2015-11-24 Thread Or Gerlitz
On Tue, Nov 24, 2015 at 6:23 PM, Sagi Grimberg  wrote:
> @@ -1100,7 +1122,14 @@ isert_init_send_wr(struct isert_conn *isert_conn, 
> struct isert_cmd *isert_cmd,
>
> isert_cmd->rdma_wr.iser_ib_op = ISER_IB_SEND;
> send_wr->wr_id = (uintptr_t)&isert_cmd->tx_desc;
> -   send_wr->opcode = IB_WR_SEND;
> +
> +   if (isert_conn->snd_w_inv && isert_cmd->inv_rkey) {
> +   send_wr->opcode  = IB_WR_SEND_WITH_INV;
> +   send_wr->ex.invalidate_rkey = isert_cmd->inv_rkey;
> +   } else {
> +   send_wr->opcode = IB_WR_SEND;
> +   }
> +

> @@ -1489,6 +1518,7 @@ isert_rx_opcode(struct isert_conn *isert_conn, struct 
> iser_rx_desc *rx_desc,
> isert_cmd->read_va = read_va;
> isert_cmd->write_stag = write_stag;
> isert_cmd->write_va = write_va;
> +   isert_cmd->inv_rkey = read_stag ? read_stag : write_stag;

bug

what happens for commands for which we don't register any memory, TUR and such?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   4   5   6   7   8   9   10   >