Re: [PATCH v2] staging/rdma/hfi1: check for ARMED->ACTIVE transition in receive interrupt

2016-01-04 Thread Leon Romanovsky
On Mon, Jan 04, 2016 at 11:21:19AM -0500, Jubin John wrote:
> From: Jim Snow 
> 
>   } else {
> + /* Auto activate link on non-SC15 packet receive */
> + if (unlikely(rcd->ppd->host_link_state ==
> +  HLS_UP_ARMED))
> + if (set_armed_to_active(rcd, packet, dd))
> + goto bail;

What is the advantage of a double "if" over a single "if"?
Something like this:
+ if (unlikely(rcd->ppd->host_link_state == HLS_UP_ARMED) &&
+     set_armed_to_active(rcd, packet, dd))
+   goto bail;

>   last = process_rcv_packet(&packet, thread);
>   }
>  
> @@ -984,6 +1020,42 @@ bail:
>  }
>  
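
For reference, the single-condition form suggested above relies on the
short-circuit rule of "&&": the right-hand call is evaluated only when the
left-hand test is true, so both forms should behave the same. A minimal
sketch follows; the enum, the helper try_activate() and the call counter
are hypothetical stand-ins, not hfi1 code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's link states. */
enum link_state { LINK_DOWN, LINK_UP_ARMED, LINK_UP_ACTIVE };

/* Counts calls so the short-circuit behaviour can be observed. */
static int activate_calls;

/* Hypothetical helper: reports failure when 'fail' is set. */
static bool try_activate(bool fail)
{
	activate_calls++;
	return fail;
}

/* Nested form, shaped like the quoted patch. */
static bool should_bail_nested(enum link_state st, bool fail)
{
	if (st == LINK_UP_ARMED)
		if (try_activate(fail))
			return true;
	return false;
}

/* Single-condition form: the call runs only if the state test is true. */
static bool should_bail_single(enum link_state st, bool fail)
{
	if (st == LINK_UP_ARMED && try_activate(fail))
		return true;
	return false;
}
```

In both variants try_activate() is never reached unless the state is
LINK_UP_ARMED, so the choice is about readability only.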
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH TRIVIAL] IB/core: ib_mad.h ib_mad_snoop_handler documentation fix

2016-01-04 Thread Leon Romanovsky
On Mon, Jan 04, 2016 at 11:04:53AM -0500, Hal Rosenstock wrote:
> ib_mad_snoop_handler ues send_buf rather than send_wr

ues --> uses


Re: [PATCH 3/6] IB/uapi: expose uverbs WC opcodes

2015-12-30 Thread Leon Romanovsky
On Wed, Dec 30, 2015 at 10:44:29AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 29, 2015 at 01:02:54PM +0200, Sagi Grimberg wrote:
> >> As you did in the first patch, just don't assign values after IB_WC_LOCAL_INV.
> >> The compiler will handle the IB_UVERBS_WC_SEND_END + X calculations by itself.
> >
> > I disagree, I'd say its better to keep the code verbosity level here...
> 
> I really don't like enum auto assignment for constants that are fixed
> as part of an ABI.  There is too much chance of things going wrong.

It is up to you, just be consistent in your enum assignments.

Thanks.
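
To illustrate the concern about ABI constants, here is a minimal sketch of
explicit assignment backed by compile-time checks. The enum name and its
values are hypothetical and are not taken from the uapi headers; the point
is only that reordering or inserting an entry cannot silently renumber the
others.

```c
/* Hypothetical ABI opcodes; names and numbers are illustrative only. */
enum demo_abi_opcode {
	DEMO_ABI_SEND       = 0,
	DEMO_ABI_RDMA_WRITE = 1,
	DEMO_ABI_RDMA_READ  = 2,
	DEMO_ABI_END        = 3,
};

/* Compile-time guards: the build fails if a published value changes. */
_Static_assert(DEMO_ABI_SEND == 0, "ABI value changed");
_Static_assert(DEMO_ABI_RDMA_READ == 2, "ABI value changed");

/* Accepts only opcodes below the end marker. */
static int demo_opcode_is_valid(unsigned int op)
{
	return op < DEMO_ABI_END;
}
```

Whichever style is picked, using it for every entry of an enum keeps the
header easy to audit.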



Re: [PATCH v1 ib-next 0/3] IB core uverbs support for leagacy commands

2015-12-30 Thread Leon Romanovsky
On Tue, Nov 10, 2015 at 08:00:07PM +0200, Eli Cohen wrote:
> Hi Doug,
> 
> this patchset addresses comments from the community. Specifically, if the
> verb is not supported by a hardware driver, we return -EOPNOTSUPP.
> 
> Eli
> 
> Eli Cohen (3):
>   IB/core: Avoid duplicate code
>   IB/core: IB/core: Allow legacy verbs through extended interfaces
>   IB/core: Modify conditional on ucontext existence
> 
>  drivers/infiniband/core/uverbs_main.c | 70 +--
>  1 file changed, 34 insertions(+), 36 deletions(-)

Doug,
I remember that you had issues with your email setup, and I
wonder whether this patchset was left behind.
https://patchwork.kernel.org/patch/7591731/
https://patchwork.kernel.org/patch/7591741/
https://patchwork.kernel.org/patch/7591751/

It is an important patchset that enables the create_qp_ex() interface
for already merged patches.

Thanks.
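
As a rough sketch of the behaviour described in the cover letter (return
-EOPNOTSUPP when the hardware driver does not implement a verb), the
dispatch could look like the following. The ops structure, the callback
type and all demo_* names are hypothetical and do not mirror
uverbs_main.c.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical verb callback type. */
typedef int (*demo_verb_fn)(int arg);

/* Hypothetical per-driver table; a NULL entry means "not implemented". */
struct demo_driver_ops {
	demo_verb_fn create_thing;
	demo_verb_fn query_thing;
};

/* Example implementation a driver might provide. */
static int demo_create(int arg)
{
	return arg + 1;
}

/* Dispatch helper: reports -EOPNOTSUPP for missing callbacks. */
static int demo_call_verb(demo_verb_fn fn, int arg)
{
	if (!fn)
		return -EOPNOTSUPP;
	return fn(arg);
}
```

A caller can then tell "the driver lacks this verb" apart from other
failures such as -EINVAL.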

> 
> -- 
> 2.6.3
> 


Re: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/k.o/for-4.5 crash during boot

2015-12-30 Thread Leon Romanovsky
On Wed, Dec 30, 2015 at 02:22:23PM +0200, Or Gerlitz wrote:
> On 12/30/2015 2:04 PM, Bart Van Assche wrote:
> >Hello Christoph,
> >
> >Can you check whether the branch in the subject of this e-mail works fine
> >on your setup (commit 59caaed7a7) ? On my test setup (Dell R430 with two
> >ConnectX-3 adapters) this branch crashes during boot in
> >get_counter_table() (see also the attached screenshot).
> 
> Can you please check with Hal's fix that was posted here 1-2 days ago,
> hopefully this should make your system to work

Or was referring to this patch [1]; it should fix your crash.

[1] https://patchwork.kernel.org/patch/7929551/

> 
> Or.


Re: [PATCH] IB/core: Remove a set-but-not-used variable from ib_sg_to_pages()

2015-12-29 Thread Leon Romanovsky
On Tue, Dec 29, 2015 at 10:45:03AM +0100, Bart Van Assche wrote:
> Detected this by building the IB core with W=1. See also patch
> "IB core: Fix ib_sg_to_pages()" (commit 8f5ba10ed40a).

Reviewed-by: Leon Romanovsky <leon.romanov...@mellanox.com>

> 
> Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
> Cc: Sagi Grimberg <sa...@mellanox.com>
> Cc: Christoph Hellwig <h...@lst.de>
> ---
>  drivers/infiniband/core/verbs.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 545906d..c90ed29 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1530,7 +1530,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
>  int (*set_page)(struct ib_mr *, u64))
>  {
>   struct scatterlist *sg;
> - u64 last_end_dma_addr = 0, last_page_addr = 0;
> + u64 last_end_dma_addr = 0;
>   unsigned int last_page_off = 0;
>   u64 page_mask = ~((u64)mr->page_size - 1);
>   int i, ret;
> @@ -1572,7 +1572,6 @@ next_page:
>  
>   mr->length += dma_len;
>   last_end_dma_addr = end_dma_addr;
> - last_page_addr = end_dma_addr & page_mask;
>   last_page_off = end_dma_addr & ~page_mask;
>   }
>  
> -- 
> 2.1.4
> 
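
For readers following the quoted hunk: the two masked expressions split an
address into its page-aligned part and its offset within the page, and only
the offset is still consumed after the patch. A small stand-alone sketch of
that arithmetic, using hypothetical names and assuming a power-of-two page
size:

```c
#include <stdint.h>

/* Hypothetical result pair: page-aligned base and offset within the page. */
struct demo_split {
	uint64_t page_addr;
	uint64_t page_off;
};

/* page_size is assumed to be a power of two. */
static struct demo_split demo_split_addr(uint64_t addr, uint64_t page_size)
{
	uint64_t page_mask = ~(page_size - 1);
	struct demo_split s = { addr & page_mask, addr & ~page_mask };

	return s;
}
```

With a 4096-byte page, address 0x12345 splits into base 0x12000 and
offset 0x345.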


[PATCH V2] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

create_cq() can receive creation flags, which were handled
differently by the two commits that added the create_cq extended
command and cross-channel support. After both were merged, the
code accepted no flags at all.

This patch unifies the check into one function and one return
error code.

Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command")
Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support")

---
Changes from v1:
  * Remove links to linux-rdma from commit message
  * Placed change log under git comment section (---)
Changes from v0:
  * Add Fixes tag

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c  | 9 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index b14316603e44..7ddc790b1819 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -757,10 +757,6 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq)
mlx5_db_free(dev->mdev, &cq->db);
 }
 
-enum {
-   CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION
-};
-
 struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
@@ -778,13 +774,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (check_cq_create_flags(attr->flags))
-   return ERR_PTR(-EINVAL);
-
if (entries < 0)
return ERR_PTR(-EINVAL);
 
-   if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EOPNOTSUPP);
 
entries = roundup_pow_of_two(entries + 1);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d4b227126265..fbf14a768105 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -692,6 +692,7 @@ static inline u32 check_cq_create_flags(u32 flags)
 * It returns non-zero value for unsupported CQ
 * create flags, otherwise it returns zero.
 */
-   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+   return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN |
+ IB_CQ_FLAGS_TIMESTAMP_COMPLETION));
 }
 #endif /* MLX5_IB_H */
-- 
1.7.12.4
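
A stand-alone sketch of why two separate single-flag checks reject every
flag, and how one combined mask avoids that. The DEMO_* names and bit
positions are hypothetical stand-ins for the IB_CQ_FLAGS_* values, which
are not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical flag bits standing in for the two supported CQ flags. */
#define DEMO_CQ_FLAG_TIMESTAMP      (1u << 0)
#define DEMO_CQ_FLAG_IGNORE_OVERRUN (1u << 1)
#define DEMO_CQ_FLAGS_SUPPORTED \
	(DEMO_CQ_FLAG_TIMESTAMP | DEMO_CQ_FLAG_IGNORE_OVERRUN)

/*
 * Pre-patch shape: two independent checks, each allowing one flag only.
 * Any non-zero flag value trips at least one of them.
 */
static inline uint32_t demo_check_split(uint32_t flags)
{
	return (flags & ~DEMO_CQ_FLAG_IGNORE_OVERRUN) |
	       (flags & ~DEMO_CQ_FLAG_TIMESTAMP);
}

/* Unified shape: non-zero only when an unsupported bit is set. */
static inline uint32_t demo_check_flags(uint32_t flags)
{
	return flags & ~DEMO_CQ_FLAGS_SUPPORTED;
}
```

With the unified helper, a caller needs only one test and one error code
for unsupported flags.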



Re: [PATCH V2] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Leon Romanovsky
On Tue, Dec 29, 2015 at 04:03:41PM +0200, Leon Romanovsky wrote:
> On Tue, Dec 29, 2015 at 03:51:47PM +0200, Sagi Grimberg wrote:
> > >From: Leon Romanovsky <leo...@mellanox.com>
> > >
> > >The create_cq() can receive creation flags which were used
> > >differently by two following commits [1] and [2]. The current
> > >code caused to not accept any flags at all.
> > 
> > We can skip referencing the linux-rdma mailing list.
> > 
> > >
> > >This patch unifies the check into one function and one return
> > >error code.
> > >
> > >Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command")
> > >Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support")
> > >
> > >[1] http://www.spinics.net/lists/linux-rdma/msg31430.html
> > >[2] http://www.spinics.net/lists/linux-rdma/msg31658.html
> > >
> > >Changes from v0:
> > >   * Add Fixes tag
> > 
> > Umm, the above ([1], [2], Changes) usually go under the "---"
> > separator so we won't see them in git log forever.
> 
> Doug,
> Do you want me to resend the patch?

I sent a new version.
Thanks.

> 
> > 
> > >
> > >Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
> > >---
> > >  drivers/infiniband/hw/mlx5/cq.c  | 9 +
> > >  drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
> > >  2 files changed, 3 insertions(+), 9 deletions(-)


[PATCH] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

create_cq() can receive creation flags, which were handled
differently by the following two commits, [1] and [2].

This patch unifies the check into one function and one return
error code.

[1] commit 972ecb821379 ("IB/mlx5: Add create_cq extended command")
[2] commit 051f263098a9 ("IB/mlx5: Add driver cross-channel support")

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c  | 9 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index b14316603e44..7ddc790b1819 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -757,10 +757,6 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq)
mlx5_db_free(dev->mdev, &cq->db);
 }
 
-enum {
-   CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION
-};
-
 struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
@@ -778,13 +774,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (check_cq_create_flags(attr->flags))
-   return ERR_PTR(-EINVAL);
-
if (entries < 0)
return ERR_PTR(-EINVAL);
 
-   if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EOPNOTSUPP);
 
entries = roundup_pow_of_two(entries + 1);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d4b227126265..fbf14a768105 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -692,6 +692,7 @@ static inline u32 check_cq_create_flags(u32 flags)
 * It returns non-zero value for unsupported CQ
 * create flags, otherwise it returns zero.
 */
-   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+   return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN |
+ IB_CQ_FLAGS_TIMESTAMP_COMPLETION));
 }
 #endif /* MLX5_IB_H */
-- 
1.7.12.4



[PATCH V2] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

create_cq() can receive creation flags, which were handled
differently by the following two commits, [1] and [2]. As a result,
the current code does not accept any flags at all.

This patch unifies the check into one function and one return
error code.

Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command")
Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support")

[1] http://www.spinics.net/lists/linux-rdma/msg31430.html
[2] http://www.spinics.net/lists/linux-rdma/msg31658.html

Changes from v0:
  * Add Fixes tag

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c  | 9 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index b14316603e44..7ddc790b1819 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -757,10 +757,6 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq)
mlx5_db_free(dev->mdev, &cq->db);
 }
 
-enum {
-   CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION
-};
-
 struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
@@ -778,13 +774,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (check_cq_create_flags(attr->flags))
-   return ERR_PTR(-EINVAL);
-
if (entries < 0)
return ERR_PTR(-EINVAL);
 
-   if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EOPNOTSUPP);
 
entries = roundup_pow_of_two(entries + 1);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d4b227126265..fbf14a768105 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -692,6 +692,7 @@ static inline u32 check_cq_create_flags(u32 flags)
 * It returns non-zero value for unsupported CQ
 * create flags, otherwise it returns zero.
 */
-   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+   return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN |
+ IB_CQ_FLAGS_TIMESTAMP_COMPLETION));
 }
 #endif /* MLX5_IB_H */
-- 
1.7.12.4



[PATCH V3] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

create_cq() can receive creation flags, which were handled
differently by the two commits that added the create_cq extended
command and cross-channel support. After both were merged, the
code accepted no flags at all.

This patch unifies the check into one function and one return
error code.

Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command")
Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support")
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>

---
Changes from v2:
  * Remove blank line after Fixes
  * Place changes below Signed-off-by
Changes from v1:
  * Remove links to linux-rdma from commit message
  * Placed change log under git comment section (---)
Changes from v0:
  * Add Fixes tag

 drivers/infiniband/hw/mlx5/cq.c  | 9 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index b14316603e44..7ddc790b1819 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -757,10 +757,6 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq)
mlx5_db_free(dev->mdev, &cq->db);
 }
 
-enum {
-   CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION
-};
-
 struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
@@ -778,13 +774,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (check_cq_create_flags(attr->flags))
-   return ERR_PTR(-EINVAL);
-
if (entries < 0)
return ERR_PTR(-EINVAL);
 
-   if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EOPNOTSUPP);
 
entries = roundup_pow_of_two(entries + 1);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d4b227126265..fbf14a768105 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -692,6 +692,7 @@ static inline u32 check_cq_create_flags(u32 flags)
 * It returns non-zero value for unsupported CQ
 * create flags, otherwise it returns zero.
 */
-   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+   return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN |
+ IB_CQ_FLAGS_TIMESTAMP_COMPLETION));
 }
 #endif /* MLX5_IB_H */
-- 
1.7.12.4



Re: [PATCH V2] IB/mlx5: Unify CQ create flags check

2015-12-29 Thread Leon Romanovsky
On Tue, Dec 29, 2015 at 03:51:47PM +0200, Sagi Grimberg wrote:
> >From: Leon Romanovsky <leo...@mellanox.com>
> >
> >The create_cq() can receive creation flags which were used
> >differently by two following commits [1] and [2]. The current
> >code caused to not accept any flags at all.
> 
> We can skip referencing the linux-rdma mailing list.
> 
> >
> >This patch unifies the check into one function and one return
> >error code.
> >
> >Fixes: 972ecb821379 ("IB/mlx5: Add create_cq extended command")
> >Fixes: 051f263098a9 ("IB/mlx5: Add driver cross-channel support")
> >
> >[1] http://www.spinics.net/lists/linux-rdma/msg31430.html
> >[2] http://www.spinics.net/lists/linux-rdma/msg31658.html
> >
> >Changes from v0:
> >   * Add Fixes tag
> 
> Umm, the above ([1], [2], Changes) usually go under the "---"
> separator so we won't see them in git log forever.

Doug,
Do you want me to resend the patch?

> 
> >
> >Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
> >---
> >  drivers/infiniband/hw/mlx5/cq.c  | 9 +
> >  drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
> >  2 files changed, 3 insertions(+), 9 deletions(-)


Re: [PATCH v2 01/36] IB/rdmavt: Create module framework and handle driver registration

2015-12-28 Thread Leon Romanovsky
On Mon, Dec 28, 2015 at 12:59:45PM -0800, Dennis Dalessandro wrote:
> This patch introduces the basics for a new module called rdma_vt. This new
> driver is a software implementation of the InfiniBand verbs and aims to
> replace the multiple implementations that exist and duplicate each others'
> code.
> 
> While the call to actually register the device with the IB core happens in
> rdma_vt, most of the work is still done in the drivers themselves. This
> will be changing in a follow on patch this is just laying the groundwork
> for this infrastructure.
> 
> Reviewed-by: Ira Weiny 
> Reviewed-by: Mike Marciniszyn 
> Signed-off-by: Dennis Dalessandro 
> ---
> Changes since v1:
> remove rdmavt specific driver version
> move copyright above license text
> 
>  MAINTAINERS   |6 ++
>  drivers/infiniband/Kconfig|2 +
>  drivers/infiniband/Makefile   |1 
>  drivers/infiniband/sw/Makefile|1 
>  drivers/infiniband/sw/rdmavt/Kconfig  |6 ++
>  drivers/infiniband/sw/rdmavt/Makefile |   10 
>  drivers/infiniband/sw/rdmavt/vt.c |   83 +
>  drivers/infiniband/sw/rdmavt/vt.h |   53 +
>  include/rdma/rdma_vt.h|   70 
>  9 files changed, 232 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/infiniband/sw/Makefile
>  create mode 100644 drivers/infiniband/sw/rdmavt/Kconfig
>  create mode 100644 drivers/infiniband/sw/rdmavt/Makefile
>  create mode 100644 drivers/infiniband/sw/rdmavt/vt.c
>  create mode 100644 drivers/infiniband/sw/rdmavt/vt.h
>  create mode 100644 include/rdma/rdma_vt.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 69c8a9c..37fc192 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8880,6 +8880,12 @@ L: rds-de...@oss.oracle.com (moderated for non-subscribers)
>  S:   Supported
>  F:   net/rds/
>  
> +RDMAVT - RDMA verbs software
> +M:   Dennis Dalessandro 
> +L:   linux-rdma@vger.kernel.org
> +S:   Supported
> +F:   drivers/infiniband/sw/rdmavt
> +
>  READ-COPY UPDATE (RCU)
>  M:   "Paul E. McKenney" 
>  M:   Josh Triplett 
> diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
> index aa26f3c..1b00bb8 100644
> --- a/drivers/infiniband/Kconfig
> +++ b/drivers/infiniband/Kconfig
> @@ -72,4 +72,6 @@ source "drivers/infiniband/ulp/srpt/Kconfig"
>  source "drivers/infiniband/ulp/iser/Kconfig"
>  source "drivers/infiniband/ulp/isert/Kconfig"
>  
> +source "drivers/infiniband/sw/rdmavt/Kconfig"
> +
>  endif # INFINIBAND
> diff --git a/drivers/infiniband/Makefile b/drivers/infiniband/Makefile
> index dc21836..fad0b44 100644
> --- a/drivers/infiniband/Makefile
> +++ b/drivers/infiniband/Makefile
> @@ -1,3 +1,4 @@
>  obj-$(CONFIG_INFINIBAND) += core/
>  obj-$(CONFIG_INFINIBAND) += hw/
>  obj-$(CONFIG_INFINIBAND) += ulp/
> +obj-$(CONFIG_INFINIBAND) += sw/
> diff --git a/drivers/infiniband/sw/Makefile b/drivers/infiniband/sw/Makefile
> new file mode 100644
> index 000..988b6a0
> --- /dev/null
> +++ b/drivers/infiniband/sw/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_INFINIBAND_RDMAVT)  += rdmavt/
> diff --git a/drivers/infiniband/sw/rdmavt/Kconfig b/drivers/infiniband/sw/rdmavt/Kconfig
> new file mode 100644
> index 000..11aa6a3
> --- /dev/null
> +++ b/drivers/infiniband/sw/rdmavt/Kconfig
> @@ -0,0 +1,6 @@
> +config INFINIBAND_RDMAVT
> + tristate "RDMA verbs transport library"
> + depends on 64BIT
> + default m
> + ---help---
> + This is a common software verbs provider for RDMA networks.
> diff --git a/drivers/infiniband/sw/rdmavt/Makefile b/drivers/infiniband/sw/rdmavt/Makefile
> new file mode 100644
> index 000..98a664d
> --- /dev/null
> +++ b/drivers/infiniband/sw/rdmavt/Makefile
> @@ -0,0 +1,10 @@
> +#
> +# rdmavt driver
> +#
> +#
> +#
> +# Called from the kernel module build system.
> +#
> +obj-$(CONFIG_INFINIBAND_RDMAVT) += rdmavt.o
> +
> +rdmavt-y := vt.o
> diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
> new file mode 100644
> index 000..aa325db
> --- /dev/null
> +++ b/drivers/infiniband/sw/rdmavt/vt.c
> @@ -0,0 +1,83 @@
> +/*
> + * Copyright(c) 2015 Intel Corporation.
> + *
> + * This file is provided under a dual BSD/GPLv2 license.  When using or
> + * redistributing this file, you may do so under either license.
> + *
> + * GPL LICENSE SUMMARY
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of version 2 of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * 

Re: [PATCH 0/2] InfiniBand-iSER: Refactoring for two function implementations

2015-12-27 Thread Leon Romanovsky
On Sun, Dec 27, 2015 at 01:36:30PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Sun, 27 Dec 2015 13:12:10 +0100
> Subject: [PATCH 0/2] InfiniBand-iSER: Refactoring for two function 
> implementations
Just a note for future submissions (no need to respin): we use a
different subject prefix, InfiniBand-iSER: --> IB/iser:

> 
> I suggest to return directly instead of using the jump label "err"
> in two functions (which are working without clean-up there).
> 
> Markus Elfring (2):
>   One jump label less in iser_reg_sig_mr()
>   One jump label less in isert_reg_sig_mr()
> 
>  drivers/infiniband/ulp/iser/iser_memory.c | 5 ++---
>  drivers/infiniband/ulp/isert/ib_isert.c   | 7 +++
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> -- 
> 2.6.3
> 


Re: [PATCH 0/2] InfiniBand-iSER: Refactoring for two function implementations

2015-12-27 Thread Leon Romanovsky
On Sun, Dec 27, 2015 at 01:36:30PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Sun, 27 Dec 2015 13:12:10 +0100
> Subject: [PATCH 0/2] InfiniBand-iSER: Refactoring for two function 
> implementations
> 
> I suggest to return directly instead of using the jump label "err"
> in two functions (which are working without clean-up there).
> 
> Markus Elfring (2):
>   One jump label less in iser_reg_sig_mr()
>   One jump label less in isert_reg_sig_mr()
Looks good,
Reviewed-by: Leon Romanovsky <leo...@mellanox.com>

> 
>  drivers/infiniband/ulp/iser/iser_memory.c | 5 ++---
>  drivers/infiniband/ulp/isert/ib_isert.c   | 7 +++
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> -- 
> 2.6.3
> 
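
For context, the refactoring proposed in the cover letter replaces a jump to
a label that does nothing but return with a direct return. A minimal sketch
with hypothetical functions (these are not the iser/isert routines):

```c
#include <errno.h>

/* Hypothetical step that can fail. */
static int demo_step(int x)
{
	return x < 0 ? -EINVAL : 0;
}

/* Label style: the "err" label performs no clean-up. */
static int demo_reg_with_label(int x)
{
	int ret;

	ret = demo_step(x);
	if (ret)
		goto err;
	return 0;
err:
	return ret;
}

/* Direct style: same result without the label. */
static int demo_reg_direct(int x)
{
	int ret = demo_step(x);

	if (ret)
		return ret;
	return 0;
}
```

The label form remains the better fit once a function acquires resources
that must be released on the error path.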


Re: [PATCH 1/6] IB/uapi: expose uverbs WR opcodes

2015-12-26 Thread Leon Romanovsky
On Thu, Dec 24, 2015 at 03:39:22PM +0100, Christoph Hellwig wrote:

>  #define MLX5_IB_SEND_UMR_UPDATE_MTT (IB_SEND_RESERVED_START << 2)
>  #define MLX5_IB_QPT_REG_UMR  IB_QPT_RESERVED1
> -#define MLX5_IB_WR_UMR   IB_WR_RESERVED1
> +#define MLX5_IB_WR_UMR   (IB_WR_END + 0)

>  
> - IB_WR_RESERVED10,
> +
> + /* drivers may use any value greater or equal to IB_WR_END internally */
> + IB_WR_END,
IB_WR_END --> IB_WR_KERNEL_LAST
>  };
>  

> + IB_UVERBS_WR_END= 9,
I think the name IB_UVERBS_WR_LAST describes this better than IB_UVERBS_WR_END.
> +};
> +
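
The scheme under discussion, in which the core enum ends with a marker and
drivers number their private opcodes from that marker upwards, can be
sketched as follows. All demo_* and DEMO_* names are hypothetical; which
marker name is chosen does not change the mechanics.

```c
/* Hypothetical core opcodes, ending with a marker entry. */
enum demo_wr_opcode {
	DEMO_WR_SEND,
	DEMO_WR_RDMA_WRITE,
	DEMO_WR_RDMA_READ,
	/* Drivers may use any value >= DEMO_WR_END internally. */
	DEMO_WR_END,
};

/* Hypothetical driver-private opcodes, numbered from the marker. */
#define DEMO_DRV_WR_UMR   (DEMO_WR_END + 0)
#define DEMO_DRV_WR_OTHER (DEMO_WR_END + 1)

/* True for opcodes defined by the core enum, false for private ones. */
static int demo_is_core_opcode(int op)
{
	return op >= 0 && op < DEMO_WR_END;
}
```

Adding a core opcode before the marker shifts the private range
automatically, which is why the private values are written relative to it.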


Re: [PATCH 3/6] IB/uapi: expose uverbs WC opcodes

2015-12-26 Thread Leon Romanovsky
On Thu, Dec 24, 2015 at 03:39:24PM +0100, Christoph Hellwig wrote:
> This exposes the WC opcodes supported by uverbs as part of the uapi
> headers.  It follows the same scheme as the WR opcodes.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/rdma/ib_verbs.h  | 29 +
>  include/uapi/rdma/ib_verbs.h | 16 
>  2 files changed, 29 insertions(+), 16 deletions(-)
> 
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 5dccc6a..7dce204 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -819,22 +819,19 @@ enum ib_wc_status {
>  const char *__attribute_const__ ib_wc_status_msg(enum ib_wc_status status);
>  
>  enum ib_wc_opcode {
> - IB_WC_SEND,
> - IB_WC_RDMA_WRITE,
> - IB_WC_RDMA_READ,
> - IB_WC_COMP_SWAP,
> - IB_WC_FETCH_ADD,
> - IB_WC_LSO,
> - IB_WC_LOCAL_INV,
> - IB_WC_REG_MR,
> - IB_WC_MASKED_COMP_SWAP,
> - IB_WC_MASKED_FETCH_ADD,
> -/*
> - * Set value of IB_WC_RECV so consumers can test if a completion is a
> - * receive by testing (opcode & IB_WC_RECV).
> - */
> - IB_WC_RECV  = 1 << 7,
> - IB_WC_RECV_RDMA_WITH_IMM
> + IB_WC_SEND  = IB_UVERBS_WC_SEND,
> + IB_WC_RDMA_WRITE= IB_UVERBS_WC_RDMA_WRITE,
> + IB_WC_RDMA_READ = IB_UVERBS_WC_RDMA_READ,
> + IB_WC_COMP_SWAP = IB_UVERBS_WC_COMP_SWAP,
> + IB_WC_FETCH_ADD = IB_UVERBS_WC_FETCH_ADD,
> + IB_WC_LSO   = IB_UVERBS_WC_SEND_END,
> + IB_WC_LOCAL_INV = IB_UVERBS_WC_SEND_END + 1,
> + IB_WC_REG_MR= IB_UVERBS_WC_SEND_END + 2,
> + IB_WC_MASKED_COMP_SWAP  = IB_UVERBS_WC_SEND_END + 3,
> + IB_WC_MASKED_FETCH_ADD  = IB_UVERBS_WC_SEND_END + 4,
As you did in the first patch, just don't assign values after IB_WC_LOCAL_INV.
The compiler will handle the IB_UVERBS_WC_SEND_END + X calculations by itself.
> +
> + IB_WC_RECV  = IB_UVERBS_WC_RECV,
> + IB_WC_RECV_RDMA_WITH_IMM = IB_UVERBS_WC_RECV_END,
>  };
>  
>  enum ib_wc_flags {
> diff --git a/include/uapi/rdma/ib_verbs.h b/include/uapi/rdma/ib_verbs.h
> index 3be3152..fd7a393 100644
> --- a/include/uapi/rdma/ib_verbs.h
> +++ b/include/uapi/rdma/ib_verbs.h
> @@ -29,4 +29,20 @@ enum ib_uverbs_send_flags {
>   IB_UVERBS_SEND_END  = (1 << 5),
>  };
>  
> +enum ib_uverbs_wc_opcode {
> + IB_UVERBS_WC_SEND   = 0,
> + IB_UVERBS_WC_RDMA_WRITE = 1,
> + IB_UVERBS_WC_RDMA_READ  = 2,
> + IB_UVERBS_WC_COMP_SWAP  = 3,
> + IB_UVERBS_WC_FETCH_ADD  = 4,
> + IB_UVERBS_WC_SEND_END   = 5,
> +
> + /*
> +  * Set value of IB_WC_RECV so consumers can test if a completion is a
> +  * receive by testing (opcode & IB_WC_RECV).
> +  */
> + IB_UVERBS_WC_RECV   = 1 << 7,
> + IB_UVERBS_WC_RECV_END   = (1 << 7) + 1,
> +};
> +
>  #endif /* _UAPI_RDMA_IB_VERBS_H */
> -- 
> 1.9.1
> 
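
On the remark about letting the compiler compute the
IB_UVERBS_WC_SEND_END + X values: in C, an enumerator without an initializer
takes the previous value plus one. A sketch with hypothetical names; the
base value 5 mirrors the quoted uapi hunk.

```c
/* Hypothetical stand-in for the uapi end marker (5 in the quoted hunk). */
enum demo_uverbs_wc {
	DEMO_UVERBS_WC_SEND_END = 5,
};

/* Kernel-only opcodes continue from the marker. */
enum demo_wc_opcode {
	DEMO_WC_LSO       = DEMO_UVERBS_WC_SEND_END,
	DEMO_WC_LOCAL_INV = DEMO_UVERBS_WC_SEND_END + 1,
	DEMO_WC_REG_MR,            /* previous + 1 */
	DEMO_WC_MASKED_COMP_SWAP,  /* previous + 1 */
	DEMO_WC_MASKED_FETCH_ADD,  /* previous + 1 */
};

/* Compile-time confirmation that auto-assignment matches "+ 4". */
_Static_assert(DEMO_WC_MASKED_FETCH_ADD == DEMO_UVERBS_WC_SEND_END + 4,
	       "auto-assigned value differs from explicit form");
```

Both spellings produce the same numbers; the difference is only in how
much of the arithmetic is visible in the header.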


Re: [PATCH 4/6] IB/uapi: expose uverbs WC flags

2015-12-26 Thread Leon Romanovsky
On Thu, Dec 24, 2015 at 03:39:25PM +0100, Christoph Hellwig wrote:
> This exposes the WC flags supported by uverbs as part of the uapi
> headers.  It follows the same scheme as the WR opcodes.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/rdma/ib_verbs.h  | 14 +++---
>  include/uapi/rdma/ib_verbs.h | 10 ++
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 7dce204..337db70 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -835,13 +835,13 @@ enum ib_wc_opcode {
>  };
>  
>  enum ib_wc_flags {
> - IB_WC_GRH   = 1,
> - IB_WC_WITH_IMM  = (1<<1),
> - IB_WC_WITH_INVALIDATE   = (1<<2),
> - IB_WC_IP_CSUM_OK= (1<<3),
> - IB_WC_WITH_SMAC = (1<<4),
> - IB_WC_WITH_VLAN = (1<<5),
> - IB_WC_WITH_NETWORK_HDR_TYPE = (1<<6),
> + IB_WC_GRH   = IB_UVERBS_WC_GRH,
> + IB_WC_WITH_IMM  = IB_UVERBS_WC_WITH_IMM,
> + IB_WC_WITH_INVALIDATE   = IB_UVERBS_WC_WITH_INVALIDATE,
> + IB_WC_IP_CSUM_OK= IB_UVERBS_WC_IP_CSUM_OK,
> + IB_WC_WITH_SMAC = IB_UVERBS_WC_WITH_SMAC,
> + IB_WC_WITH_VLAN = IB_UVERBS_WC_WITH_VLAN,
> + IB_WC_WITH_NETWORK_HDR_TYPE = IB_UVERBS_WC_WITH_NETWORK_HDR_TYPE,
>  };
>  
>  struct ib_wc {
> diff --git a/include/uapi/rdma/ib_verbs.h b/include/uapi/rdma/ib_verbs.h
> index fd7a393..c40c00b 100644
> --- a/include/uapi/rdma/ib_verbs.h
> +++ b/include/uapi/rdma/ib_verbs.h
> @@ -45,4 +45,14 @@ enum ib_uverbs_wc_opcode {
>   IB_UVERBS_WC_RECV_END   = (1 << 7) + 1,
>  };
>  
> +enum ib_uverbs_wc_flags {
> + IB_UVERBS_WC_GRH= (1 << 0),
> + IB_UVERBS_WC_WITH_IMM   = (1 << 1),
> + IB_UVERBS_WC_WITH_INVALIDATE= (1 << 2),
> + IB_UVERBS_WC_IP_CSUM_OK = (1 << 3),
> + IB_UVERBS_WC_WITH_SMAC  = (1 << 4),
> + IB_UVERBS_WC_WITH_VLAN  = (1 << 5),
> + IB_UVERBS_WC_WITH_NETWORK_HDR_TYPE  = (1 << 6),
> +};
It would be great to add a _FLAGS_ marker to the names and a _LAST entry too.
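A rough sketch of that suggestion, for illustration only. The _FLAGS_ names, the meaning given to _LAST (highest flag currently defined), and the helper wc_flags_valid() are all assumptions and not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical renaming; values are the ones from the quoted patch. */
enum ib_uverbs_wc_flags {
	IB_UVERBS_WC_FLAGS_GRH				= (1 << 0),
	IB_UVERBS_WC_FLAGS_WITH_IMM			= (1 << 1),
	IB_UVERBS_WC_FLAGS_WITH_INVALIDATE		= (1 << 2),
	IB_UVERBS_WC_FLAGS_IP_CSUM_OK			= (1 << 3),
	IB_UVERBS_WC_FLAGS_WITH_SMAC			= (1 << 4),
	IB_UVERBS_WC_FLAGS_WITH_VLAN			= (1 << 5),
	IB_UVERBS_WC_FLAGS_WITH_NETWORK_HDR_TYPE	= (1 << 6),
	/* Assumed meaning of _LAST: the highest flag bit defined so far. */
	IB_UVERBS_WC_FLAGS_LAST = IB_UVERBS_WC_FLAGS_WITH_NETWORK_HDR_TYPE,
};

static_assert(IB_UVERBS_WC_FLAGS_LAST == (1 << 6),
	      "_LAST must track the highest defined flag");

/* Hypothetical helper: reject any bit above the last defined flag. */
static bool wc_flags_valid(unsigned int flags)
{
	unsigned int known = ((unsigned int)IB_UVERBS_WC_FLAGS_LAST << 1) - 1;

	return (flags & ~known) == 0;
}
```

With a _LAST entry, a range check like this does not need to be touched when a new flag is appended, only _LAST itself.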
> +
>  #endif /* _UAPI_RDMA_IB_VERBS_H */
> -- 
> 1.9.1
> 


Re: [PATCH 5/6] IB/uapi: expose uverbs WC status codes

2015-12-26 Thread Leon Romanovsky
On Thu, Dec 24, 2015 at 03:39:26PM +0100, Christoph Hellwig wrote:
> This exposes the WC status codes supported by uverbs as part of the uapi
> headers.  It follows the same scheme as the WR opcodes.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/rdma/ib_verbs.h  | 44 ++--
>  include/uapi/rdma/ib_verbs.h | 25 +
>  2 files changed, 47 insertions(+), 22 deletions(-)
> 
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 337db70..48bfcf5 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -792,28 +792,28 @@ struct ib_ah_attr {
>  };
>  
>  enum ib_wc_status {
> - IB_WC_SUCCESS,
> - IB_WC_LOC_LEN_ERR,
> - IB_WC_LOC_QP_OP_ERR,
> - IB_WC_LOC_EEC_OP_ERR,
> - IB_WC_LOC_PROT_ERR,
> - IB_WC_WR_FLUSH_ERR,
> - IB_WC_MW_BIND_ERR,
> - IB_WC_BAD_RESP_ERR,
> - IB_WC_LOC_ACCESS_ERR,
> - IB_WC_REM_INV_REQ_ERR,
> - IB_WC_REM_ACCESS_ERR,
> - IB_WC_REM_OP_ERR,
> - IB_WC_RETRY_EXC_ERR,
> - IB_WC_RNR_RETRY_EXC_ERR,
> - IB_WC_LOC_RDD_VIOL_ERR,
> - IB_WC_REM_INV_RD_REQ_ERR,
> - IB_WC_REM_ABORT_ERR,
> - IB_WC_INV_EECN_ERR,
> - IB_WC_INV_EEC_STATE_ERR,
> - IB_WC_FATAL_ERR,
> - IB_WC_RESP_TIMEOUT_ERR,
> - IB_WC_GENERAL_ERR
> + IB_WC_SUCCESS   = IB_UVERBS_WC_SUCCESS,
> + IB_WC_LOC_LEN_ERR   = IB_UVERBS_WC_LOC_LEN_ERR,
> + IB_WC_LOC_QP_OP_ERR = IB_UVERBS_WC_LOC_QP_OP_ERR,
> + IB_WC_LOC_EEC_OP_ERR= IB_UVERBS_WC_LOC_EEC_OP_ERR,
> + IB_WC_LOC_PROT_ERR  = IB_UVERBS_WC_LOC_PROT_ERR,
> + IB_WC_WR_FLUSH_ERR  = IB_UVERBS_WC_WR_FLUSH_ERR,
> + IB_WC_MW_BIND_ERR   = IB_UVERBS_WC_MW_BIND_ERR,
> + IB_WC_BAD_RESP_ERR  = IB_UVERBS_WC_BAD_RESP_ERR,
> + IB_WC_LOC_ACCESS_ERR= IB_UVERBS_WC_LOC_ACCESS_ERR,
> + IB_WC_REM_INV_REQ_ERR   = IB_UVERBS_WC_REM_INV_REQ_ERR,
> + IB_WC_REM_ACCESS_ERR= IB_UVERBS_WC_REM_ACCESS_ERR,
> + IB_WC_REM_OP_ERR= IB_UVERBS_WC_REM_OP_ERR,
> + IB_WC_RETRY_EXC_ERR = IB_UVERBS_WC_RETRY_EXC_ERR,
> + IB_WC_RNR_RETRY_EXC_ERR = IB_UVERBS_WC_RNR_RETRY_EXC_ERR,
> + IB_WC_LOC_RDD_VIOL_ERR  = IB_UVERBS_WC_LOC_RDD_VIOL_ERR,
> + IB_WC_REM_INV_RD_REQ_ERR= IB_UVERBS_WC_REM_INV_RD_REQ_ERR,
> + IB_WC_REM_ABORT_ERR = IB_UVERBS_WC_REM_ABORT_ERR,
> + IB_WC_INV_EECN_ERR  = IB_UVERBS_WC_INV_EECN_ERR,
> + IB_WC_INV_EEC_STATE_ERR = IB_UVERBS_WC_INV_EEC_STATE_ERR,
> + IB_WC_FATAL_ERR = IB_UVERBS_WC_FATAL_ERR,
> + IB_WC_RESP_TIMEOUT_ERR  = IB_UVERBS_WC_RESP_TIMEOUT_ERR,
> + IB_WC_GENERAL_ERR   = IB_UVERBS_WC_GENERAL_ERR,
>  };
>  
>  const char *__attribute_const__ ib_wc_status_msg(enum ib_wc_status status);
> diff --git a/include/uapi/rdma/ib_verbs.h b/include/uapi/rdma/ib_verbs.h
> index c40c00b..fbadb29 100644
> --- a/include/uapi/rdma/ib_verbs.h
> +++ b/include/uapi/rdma/ib_verbs.h
> @@ -55,4 +55,29 @@ enum ib_uverbs_wc_flags {
>   IB_UVERBS_WC_WITH_NETWORK_HDR_TYPE  = (1 << 6),
>  };
>  
> +enum ib_uverbs_wc_status {
> + IB_UVERBS_WC_SUCCESS,
> + IB_UVERBS_WC_LOC_LEN_ERR,
> + IB_UVERBS_WC_LOC_QP_OP_ERR,
> + IB_UVERBS_WC_LOC_EEC_OP_ERR,
> + IB_UVERBS_WC_LOC_PROT_ERR,
> + IB_UVERBS_WC_WR_FLUSH_ERR,
> + IB_UVERBS_WC_MW_BIND_ERR,
> + IB_UVERBS_WC_BAD_RESP_ERR,
> + IB_UVERBS_WC_LOC_ACCESS_ERR,
> + IB_UVERBS_WC_REM_INV_REQ_ERR,
> + IB_UVERBS_WC_REM_ACCESS_ERR,
> + IB_UVERBS_WC_REM_OP_ERR,
> + IB_UVERBS_WC_RETRY_EXC_ERR,
> + IB_UVERBS_WC_RNR_RETRY_EXC_ERR,
> + IB_UVERBS_WC_LOC_RDD_VIOL_ERR,
> + IB_UVERBS_WC_REM_INV_RD_REQ_ERR,
> + IB_UVERBS_WC_REM_ABORT_ERR,
> + IB_UVERBS_WC_INV_EECN_ERR,
> + IB_UVERBS_WC_INV_EEC_STATE_ERR,
> + IB_UVERBS_WC_FATAL_ERR,
> + IB_UVERBS_WC_RESP_TIMEOUT_ERR,
> + IB_UVERBS_WC_GENERAL_ERR
> +};
Same as in the previous mail: add a _STATUS_ marker to the names.
> +
>  #endif /* _UAPI_RDMA_IB_VERBS_H */
> -- 
> 1.9.1
> 


Re: [PATCH rdma-next V2 01/32] IB/core: Macro for RoCEv2 UDP port

2015-12-26 Thread Leon Romanovsky
On Thu, Dec 24, 2015 at 11:17:47AM +0200, Kamal Heib wrote:
> From: Amir Vadai 
> 
> Adding a macro for RoCEv2 UDP destination port.
You are adding a define, not a macro.

> 
> Signed-off-by: Amir Vadai 
> Signed-off-by: Kamal Heib 
> ---
>  include/rdma/ib_verbs.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 368fc22..9904394 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -79,6 +79,8 @@ enum ib_gid_type {
>   IB_GID_TYPE_SIZE
>  };
>  
> +#define ROCE_V2_UDP_DPORT  4791
> +
>  struct ib_gid_attr {
>   enum ib_gid_typegid_type;
>   struct net_device   *ndev;
> -- 
> 1.8.3.1
> 


[PATCH V2 0/2] Add cross-channel support

2015-12-24 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The following set of patches implements cross-channel (CC) support
in the RDMA core and in the mlx5 infiniband driver.

The cross-channel feature allows execution of WQEs that involve
cross-channel synchronization of I/O operations on different QPs.

Complex applications usually require synchronization of I/O
operations from multiple sources before continuing their
execution. To implement this, the host software
needs to handle completions from each of the receive
queues (arriving in arbitrary order), process the data after
the last message arrives, and only then post a work request on the
send queue to send the combined data to its destination.

Execution of such an operation generates multiple interrupts at
unpredictable times, with a large overhead for interrupt handling
and context switches.

The current submission adds synchronization primitives that make it
possible to perform conditional flows; a following submission
will introduce arithmetic calculation offload.

Synchronization abilities combined with arithmetic calculations
will allow programming complex flows with a single function call,
thereby significantly reducing the overhead associated with I/O processing.

Patch #1 adds CQ and QP initialization flags to RDMA core structures
in order to support the cross-channel feature.
Patch #2 implements the mlx5 InfiniBand driver configuration logic.

Changes from v1:
  * Enrich cover message and commit messages.
  * Drop patch with checkpatch warning fix.

Changes from v0:
  * Set UAR to be the same for QP and CQ.

Leon Romanovsky (2):
  IB/core: Add cross-channel support
  IB/mlx5: Add driver cross-channel support

 drivers/infiniband/core/uverbs_cmd.c |  5 +++-
 drivers/infiniband/hw/mlx5/cq.c  |  7 -
 drivers/infiniband/hw/mlx5/main.c|  3 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 16 +++
 drivers/infiniband/hw/mlx5/qp.c  | 54 +---
 include/linux/mlx5/qp.h  |  3 ++
 include/rdma/ib_verbs.h  | 11 
 7 files changed, 87 insertions(+), 12 deletions(-)

-- 
1.7.12.4



[PATCH V2 2/2] IB/mlx5: Add driver cross-channel support

2015-12-24 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Add support for cross-channel functionality to the mlx5
driver. This includes the ability to ignore overrun for a CQ
that is intended for cross-channel use, exporting the device capability, and
configuring the QP as a sync master or sync slave queue.

A cross-channel enabled QP supports a combination of
three possible properties:
  * WQE processing on the receive queue of this QP
  * WQE processing on the send queue of this QP
  * WQE are supported on the send queue

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c  |  7 -
 drivers/infiniband/hw/mlx5/main.c|  3 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 16 +++
 drivers/infiniband/hw/mlx5/qp.c  | 54 +---
 include/linux/mlx5/qp.h  |  3 ++
 5 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 3dfd287256d6..c363b71c4b0b 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -760,7 +760,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (attr->flags)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EINVAL);
 
if (entries < 0)
@@ -779,6 +779,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
spin_lock_init(&cq->lock);
cq->resize_buf = NULL;
cq->resize_umem = NULL;
+   cq->create_flags = attr->flags;
 
if (context) {
err = create_cq_user(dev, udata, context, cq, entries,
@@ -796,6 +797,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
 
cq->cqe_size = cqe_size;
cqb->ctx.cqe_sz_flags = cqe_sz_to_mlx_sz(cqe_size) << 5;
+
+   if (cq->create_flags & IB_CQ_FLAGS_IGNORE_OVERRUN)
+   cqb->ctx.cqe_sz_flags |= (1 << 1);
+
cqb->ctx.log_sz_usr_page = cpu_to_be32((ilog2(entries) << 24) | index);
err = mlx5_vector2eqn(dev->mdev, vector, &eqn, &irqn);
if (err)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index a51b5943019e..41ebecd24422 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -294,6 +294,9 @@ static int mlx5_ib_init_device_flags(struct ib_device *ibdev)
ibdev->odp_caps = dev->odp_caps;
 #endif
 
+   if (MLX5_CAP_GEN(mdev, cd))
+   props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
+
return 0;
 }
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 633347260b79..fdbd761fd7fd 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -85,6 +85,10 @@ enum mlx5_ib_mad_ifc_flags {
MLX5_MAD_IFC_NET_VIEW   = 4,
 };
 
+enum {
+   MLX5_CROSS_CHANNEL_UUAR = 0,
+};
+
 struct mlx5_ib_ucontext {
struct ib_ucontext  ibucontext;
struct list_headdb_page_list;
@@ -242,6 +246,9 @@ struct mlx5_ib_cq_buf {
 enum mlx5_ib_qp_flags {
MLX5_IB_QP_BLOCK_MULTICAST_LOOPBACK = 1 << 0,
MLX5_IB_QP_SIGNATURE_HANDLING   = 1 << 1,
+   MLX5_IB_QP_CROSS_CHANNEL= 1 << 2,
+   MLX5_IB_QP_MANAGED_SEND = 1 << 3,
+   MLX5_IB_QP_MANAGED_RECV = 1 << 4,
 };
 
 struct mlx5_umr_wr {
@@ -284,6 +291,7 @@ struct mlx5_ib_cq {
struct mlx5_ib_cq_buf  *resize_buf;
struct ib_umem *resize_umem;
int cqe_size;
+   u32 create_flags;
 };
 
 struct mlx5_ib_srq {
@@ -662,4 +670,12 @@ static inline int is_qp1(enum ib_qp_type qp_type)
 #define MLX5_MAX_UMR_SHIFT 16
 #define MLX5_MAX_UMR_PAGES (1 << MLX5_MAX_UMR_SHIFT)
 
+static inline u32 check_cq_create_flags(u32 flags)
+{
+   /*
+* It returns non-zero value for unsupported CQ
+* create flags, otherwise it returns zero.
+*/
+   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+}
 #endif /* MLX5_IB_H */
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 307bdbca8938..c18bf42c51ac 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -615,18 +615,23 @@ static int create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
/*
 * TBD: should come from the verbs when we have the API
 */
-   uuarn = alloc_uuar(&context->uuari, MLX5_IB_LATENCY_CLASS_HIGH);
-   if (uuarn < 0) {
-   mlx5_ib_dbg(dev, "failed to allocate low latency UUAR\n");
-   mlx5_ib_dbg(dev, "reverting to medium latency\n");
-   uuarn = alloc_uuar(&context->uuari, MLX5_IB_LATENCY_CLASS_MEDIUM);
+   if (qp->fla

[PATCH V2 1/2] IB/core: Add cross-channel support

2015-12-24 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The cross-channel feature allows execution of WQEs that involve
synchronization of I/O operations on different QPs.

This capability makes it possible to program complex flows with a single
function call, thereby significantly reducing the overhead associated
with I/O processing.

The queue pairs can be configured to work as a “sync master queue”
or “sync slave queues”.

The added flags are:

1. Device capability flag IB_DEVICE_CROSS_CHANNEL for the
   devices that can perform cross-channel operations.

2. CQ property flag IB_CQ_FLAGS_IGNORE_OVERRUN to disable the CQ overrun
   check. In cross-channel mode, the send/receive queues will forward
   their completions to the managing QP.

3. QP property flags to indicate whether queues are slave or master:
   * IB_QP_CREATE_MANAGED_SEND indicates that posted send work requests
     will not be executed immediately and require enabling.
   * IB_QP_CREATE_MANAGED_RECV indicates that posted receive work
     requests will not be executed immediately and require enabling.
   * IB_QP_CREATE_CROSS_CHANNEL declares the QP to work in cross-channel
     mode. If IB_QP_CREATE_MANAGED_SEND and IB_QP_CREATE_MANAGED_RECV are
     not provided, this QP will be a sync master queue; otherwise it will
     be a sync slave.
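
The master/slave rule in item 3 can be sketched as follows. This is one reading of the description, not code from the patch: the enum cc_role and cc_role_from_flags() are hypothetical, "not provided" is read as "neither managed flag is set", and treating a QP without IB_QP_CREATE_CROSS_CHANNEL as "no role" is an assumption. The flag values are local copies of the ones added below.

```c
#include <assert.h>

/* Local copies of the flag values added by this patch. */
enum {
	IB_QP_CREATE_CROSS_CHANNEL	= 1 << 2,
	IB_QP_CREATE_MANAGED_SEND	= 1 << 3,
	IB_QP_CREATE_MANAGED_RECV	= 1 << 4,
};

/* Hypothetical classification used only for this illustration. */
enum cc_role {
	CC_ROLE_NONE,
	CC_ROLE_SYNC_MASTER,
	CC_ROLE_SYNC_SLAVE,
};

/* The three flags must be distinct bits for the tests below to be valid. */
static_assert((IB_QP_CREATE_CROSS_CHANNEL &
	       (IB_QP_CREATE_MANAGED_SEND | IB_QP_CREATE_MANAGED_RECV)) == 0,
	      "cross-channel flags must not overlap");

static enum cc_role cc_role_from_flags(unsigned int create_flags)
{
	/* Assumption: without CROSS_CHANNEL the QP has no sync role. */
	if (!(create_flags & IB_QP_CREATE_CROSS_CHANNEL))
		return CC_ROLE_NONE;

	/* Any managed queue makes the QP a sync slave. */
	if (create_flags &
	    (IB_QP_CREATE_MANAGED_SEND | IB_QP_CREATE_MANAGED_RECV))
		return CC_ROLE_SYNC_SLAVE;

	return CC_ROLE_SYNC_MASTER;
}
```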

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c |  5 -
 include/rdma/ib_verbs.h  | 11 +++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 1add53653825..63ddd6a122b0 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1829,7 +1829,10 @@ static int create_qp(struct ib_uverbs_file *file,
  sizeof(cmd->create_flags))
attr.create_flags = cmd->create_flags;
 
-   if (attr.create_flags & ~IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) {
+   if (attr.create_flags & ~(IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK |
+   IB_QP_CREATE_CROSS_CHANNEL |
+   IB_QP_CREATE_MANAGED_SEND |
+   IB_QP_CREATE_MANAGED_RECV)) {
ret = -EINVAL;
goto err_put;
}
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a7dbbfc9d202..d00f3e2e39ac 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -141,6 +141,13 @@ enum ib_device_cap_flags {
IB_DEVICE_MEM_WINDOW_TYPE_2B= (1<<24),
IB_DEVICE_RC_IP_CSUM= (1<<25),
IB_DEVICE_RAW_IP_CSUM   = (1<<26),
+   /*
+* Devices should set IB_DEVICE_CROSS_CHANNEL if they
+* support execution of WQEs that involve synchronization
+* of I/O operations with single completion queue managed
+* by hardware.
+*/
+   IB_DEVICE_CROSS_CHANNEL = (1<<27),
IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29),
IB_DEVICE_SIGNATURE_HANDOVER= (1<<30),
IB_DEVICE_ON_DEMAND_PAGING  = (1<<31),
@@ -186,6 +193,7 @@ struct ib_odp_caps {
 
 enum ib_cq_creation_flags {
IB_CQ_FLAGS_TIMESTAMP_COMPLETION   = 1 << 0,
+   IB_CQ_FLAGS_IGNORE_OVERRUN = 1 << 1,
 };
 
 struct ib_cq_init_attr {
@@ -823,6 +831,9 @@ enum ib_qp_type {
 enum ib_qp_create_flags {
IB_QP_CREATE_IPOIB_UD_LSO   = 1 << 0,
IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK   = 1 << 1,
+   IB_QP_CREATE_CROSS_CHANNEL  = 1 << 2,
+   IB_QP_CREATE_MANAGED_SEND   = 1 << 3,
+   IB_QP_CREATE_MANAGED_RECV   = 1 << 4,
IB_QP_CREATE_NETIF_QP   = 1 << 5,
IB_QP_CREATE_SIGNATURE_EN   = 1 << 6,
IB_QP_CREATE_USE_GFP_NOIO   = 1 << 7,
-- 
1.7.12.4



Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-23 Thread Leon Romanovsky
On Wed, Dec 23, 2015 at 12:00:41PM +0200, Or Gerlitz wrote:
> On 12/21/2015 9:53 AM, Leon Romanovsky wrote:
> >On Mon, Dec 21, 2015 at 9:40 AM, Or Gerlitz <gerlitz...@gmail.com> wrote:
> >>On Mon, Dec 21, 2015 at 9:27 AM, Leon Romanovsky <l...@leon.nu> wrote:
> >>>On Mon, Dec 21, 2015 at 8:52 AM, Or Gerlitz <gerlitz...@gmail.com> wrote:
> >>>>On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky <l...@leon.nu> wrote:
> >>>>>On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny <ira.we...@intel.com> wrote:
> >>>>>>On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote:
> >>>>>>>From: Leon Romanovsky <leo...@mellanox.com>
> >>>>>>>Modify enum ib_device_cap_flags such that other patches which add new
> >>>>>>>enum values pass strict checkpatch.pl checks.
> >>>>>>>- IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
> >>>>>>>- IB_DEVICE_MEM_WINDOW= (1<<17),
> >>>>>>>+ IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
> >>>>>2. Change the whole file => the work with "git blame" will be less
> >>>>>straightforward.
> >>>>Agree.
> >>>>
> >>>>Leon, I don't think we need to take checkpatch-ing of things to that level.
> >>>>
> >>>>Indeed, we should make sure that whole new enums and such are done right --
> >>>>but avoid touching core structs/enums in a manner that disallows
> >>>>simple git blaming of things, which is very useful for new comers and
> >>>>old suffers.
> >>>There are no doubts that standalone fixing checkpatch errors is more
> >>>suitable to staging subsystem.
> >>Agree
> >>
> >>>In our case, it is part of coming changes in that structure. such
> >>>change serves specific goal to minimize the possibility of error
> >>>by seeing clean output from static analyser tool.
> >>Disagree.
> >>
> >>What future bugs are you envisioning by let this 10y old header file
> >>keep having some checkpatch issues on few of the major enums?!
> >If I knew the future, I would be able to answer.
> 
> Use your common-sense and experience and maybe even some credit that you can
> give to the 10x larger and super active networking community, you should be
> able to provide some answer if you believe this is the  right way to go.
My common sense and experience tell me that the proposed patch
isn't worth investing so much time in. I'll drop it in the next version
of this patchset.

> 
> >I think that you expressed your opinion very clearly, let's wait for Doug's response on such changes.
> >
> 
> I expressed my opinion and I ask you Qs. Christoph also made more comments,
> if you think this is the way to go, respond.


Re: [PATCH for-next V3 01/11] IB/core: Add gid_type to gid attribute

2015-12-23 Thread Leon Romanovsky
On Wed, Dec 23, 2015 at 02:56:47PM +0200, Matan Barak wrote:
> In order to support multiple GID types, we need to store the gid_type
> with each GID. This is also aligned with the RoCE v2 annex "RoCEv2 PORT
> GID table entries shall have a "GID type" attribute that denotes the L3
> Address type". The currently supported GID is IB_GID_TYPE_IB which is
> also RoCE v1 GID type.
> 
> This implies that gid_type should be added to roce_gid_table meta-data.
> 
> Signed-off-by: Matan Barak 
> ---
>  drivers/infiniband/core/cache.c   |  144 +++-
>  drivers/infiniband/core/cm.c  |2 +-
>  drivers/infiniband/core/cma.c |3 +-
>  drivers/infiniband/core/core_priv.h   |4 +
>  drivers/infiniband/core/device.c  |9 ++-
>  drivers/infiniband/core/multicast.c   |2 +-
>  drivers/infiniband/core/roce_gid_mgmt.c   |   60 ++--
>  drivers/infiniband/core/sa_query.c|5 +-
>  drivers/infiniband/core/uverbs_marshall.c |1 +
>  drivers/infiniband/core/verbs.c   |1 +
>  include/rdma/ib_cache.h   |4 +
>  include/rdma/ib_sa.h  |1 +
>  include/rdma/ib_verbs.h   |   11 ++-
>  13 files changed, 185 insertions(+), 62 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index 097e9df..566fd8f 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -64,6 +64,7 @@ enum gid_attr_find_mask {
>   GID_ATTR_FIND_MASK_GID  = 1UL << 0,
>   GID_ATTR_FIND_MASK_NETDEV   = 1UL << 1,
>   GID_ATTR_FIND_MASK_DEFAULT  = 1UL << 2,
> + GID_ATTR_FIND_MASK_GID_TYPE = 1UL << 3,
>  };
>  
>  enum gid_table_entry_props {
> @@ -125,6 +126,19 @@ static void dispatch_gid_change_event(struct ib_device *ib_dev, u8 port)
>   }
>  }
>  
> +static const char * const gid_type_str[] = {
   ^^ ^^
IMHO, the whitespace here is a little confusing to read.

> + [IB_GID_TYPE_IB]= "IB/RoCE v1",
> +};
> +
> +const char *ib_cache_gid_type_str(enum ib_gid_type gid_type)
> +{
> + if (gid_type < ARRAY_SIZE(gid_type_str) && gid_type_str[gid_type])
Why do you need to check the second condition?
> + return gid_type_str[gid_type];
> +
> + return "Invalid GID type";
> +}
> +EXPORT_SYMBOL(ib_cache_gid_type_str);
> +
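
On the second condition questioned above: one possible reason, sketched here with hypothetical names and not taken from the patch, is that a table built with designated initializers leaves every unnamed slot as NULL. The quoted table has a single entry today, so the check would only matter once the enum gains a value without a matching string. ARRAY_SIZE() below is a local stand-in for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical enum where value 1 has no string assigned. */
enum demo_gid_type {
	DEMO_GID_TYPE_A		= 0,
	DEMO_GID_TYPE_HOLE	= 1,
	DEMO_GID_TYPE_C		= 2,
};

static const char * const demo_gid_type_str[] = {
	[DEMO_GID_TYPE_A] = "type A",
	[DEMO_GID_TYPE_C] = "type C",
};

/* The array spans indexes 0..2, so index 1 exists and holds NULL. */
static_assert(ARRAY_SIZE(demo_gid_type_str) == 3,
	      "designated initializers size the array by the highest index");

static const char *demo_gid_type_name(unsigned int t)
{
	/* The range check alone would return NULL for the hole at index 1. */
	if (t < ARRAY_SIZE(demo_gid_type_str) && demo_gid_type_str[t])
		return demo_gid_type_str[t];

	return "Invalid GID type";
}
```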


Re: [PATCH 1/2] Isolate common list remove code

2015-12-21 Thread Leon Romanovsky
On Mon, Dec 21, 2015 at 08:42:53AM -0600, Christoph Lameter wrote:
> Code cleanup to remove multicast specific code from ipoib_main.c
> 
> The removal of a list of multicast groups occurs in three places.
> Create a new function ipoib_mcast_remove_list(). Use this new
> function in ipoib_main.c too.
> That in turn allows the dropping of two functions that were
> exported from ipoib_multicast.c for expiration of mc groups.
> 
> Reviewed-by: Iraq Weiny 
Iraq Weiny --> Ira Weiny


> +void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list)
Would it be beneficial to inline this function?
> +{
> + struct ipoib_mcast *mcast, *tmcast;
> +
> + list_for_each_entry_safe(mcast, tmcast, remove_list, list) {
> + ipoib_mcast_leave(dev, mcast);
> + ipoib_mcast_free(mcast);
> + }
> +}
> +


Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-20 Thread Leon Romanovsky
On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny <ira.we...@intel.com> wrote:
> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote:
>> From: Leon Romanovsky <leo...@mellanox.com>
>>
>> Modify enum ib_device_cap_flags such that other patches which add new
>> enum values pass strict checkpatch.pl checks.
>>

>> - IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
>> - IB_DEVICE_MEM_WINDOW= (1<<17),
>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
>
> NIT: Shouldn't we just use the BIT macro?
>
> IB_DEVICE_RESIZE_MAX_WR = BIT(0),
You are right, and it is my preferred way too. However, such a change
has one of two downsides:
1. Change only this structure => we will have a mix of BIT() and
shift styles in the same file. IMHO it looks awful.
2. Change the whole file => working with "git blame" will be less
straightforward.

I will make the change across the whole file if Doug accepts it.
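
For reference, a small sketch showing that the two spellings discussed above produce the same values. BIT() here is a local stand-in mirroring the kernel macro, and the DEMO_* names and demo_bit() are hypothetical; only bit positions 0 and 17 from the quoted hunk are used.

```c
#include <assert.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(nr) (1UL << (nr))

/* Shift style, as in the patch under discussion. */
enum demo_cap_flags_shift {
	DEMO_SHIFT_RESIZE_MAX_WR	= (1 << 0),
	DEMO_SHIFT_MEM_WINDOW		= (1 << 17),
};

/* BIT() style, as suggested in the review. */
enum demo_cap_flags_bit {
	DEMO_BIT_RESIZE_MAX_WR		= BIT(0),
	DEMO_BIT_MEM_WINDOW		= BIT(17),
};

static_assert(DEMO_SHIFT_RESIZE_MAX_WR == DEMO_BIT_RESIZE_MAX_WR,
	      "bit 0 must match in both styles");
static_assert(DEMO_SHIFT_MEM_WINDOW == DEMO_BIT_MEM_WINDOW,
	      "bit 17 must match in both styles");

/* Hypothetical helper so the macro can also be checked at run time. */
static unsigned long demo_bit(unsigned int nr)
{
	return BIT(nr);
}
```

So the choice between the two is purely about style and "git blame" churn, not about the resulting values.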


[PATCH V1] libmlx5: Add gitignore file to the project

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Add gitignore file to the libmlx5 project.

Changes from v0:
 * Add temporary files created by etags/ctags/gtags/cscope.
 * Add Reviewed-by signature

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 .gitignore | 33 +
 1 file changed, 33 insertions(+)
 create mode 100644 .gitignore

diff --git a/.gitignore b/.gitignore
new file mode 100644
index ..7878db2d068e
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,33 @@
+*.o
+*.lo
+*.swp
+configure
+Makefile.in
+autom4te.cache
+aclocal.m4
+stamp-h.in
+config.h.in
+config.h.in~
+config.log
+config.h
+.libs
+.deps
+libmlx5.spec
+Makefile
+config.status
+stamp-h1
+libtool
+TAGS
+.TAGS
+!TAGS/
+tags
+.tags
+!tags/
+gtags.files
+GTAGS
+GRTAGS
+GPATH
+cscope.files
+cscope.out
+cscope.in.out
+cscope.po.out
-- 
1.7.12.4



[PATCH libibverbs 2/3] libibverbs: Add cross-channel QP initialization flags

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The cross-channel feature allows execution of WQEs that involve
synchronization of I/O operations on different QPs.

These QPs should be configured to work as a "sync master queue" or
"sync slave queues".

The QP property flags that indicate whether queues are slave or master:
   * IBV_QP_CREATE_MANAGED_SEND indicates that posted send work requests
     will not be executed immediately and require enabling.
   * IBV_QP_CREATE_MANAGED_RECV indicates that posted receive work
     requests will not be executed immediately and require enabling.
   * IBV_QP_CREATE_CROSS_CHANNEL declares the QP to work in cross-channel
     mode. If IBV_QP_CREATE_MANAGED_SEND and IBV_QP_CREATE_MANAGED_RECV are
     not provided, this QP will be a sync master queue; otherwise it will
     be a sync slave.
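
The patch below extends the mask of create flags that the library accepts. A minimal sketch of how such a mask is typically applied, using local copies of the values from the diff; the helper check_qp_create_flags() and its EINVAL return value are assumptions and do not claim to reproduce what src/cmd.c does:

```c
#include <assert.h>
#include <errno.h>

/* Local copies of the values from include/infiniband/verbs.h as patched. */
enum ibv_qp_create_flags {
	IBV_QP_CREATE_BLOCK_SELF_MCAST_LB	= 1 << 1,
	IBV_QP_CREATE_CROSS_CHANNEL		= 1 << 2,
	IBV_QP_CREATE_MANAGED_SEND		= 1 << 3,
	IBV_QP_CREATE_MANAGED_RECV		= 1 << 4
};

enum {
	CREATE_QP_EX2_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB |
					 IBV_QP_CREATE_CROSS_CHANNEL |
					 IBV_QP_CREATE_MANAGED_SEND |
					 IBV_QP_CREATE_MANAGED_RECV
};

/* Bits 1..4 are supported after the patch: 2 + 4 + 8 + 16 = 0x1e. */
static_assert(CREATE_QP_EX2_SUP_CREATE_FLAGS == 0x1e,
	      "supported mask should cover bits 1 through 4");

/* Hypothetical helper: 0 when all flags are supported, EINVAL otherwise. */
static int check_qp_create_flags(unsigned int create_flags)
{
	if (create_flags & ~(unsigned int)CREATE_QP_EX2_SUP_CREATE_FLAGS)
		return EINVAL;

	return 0;
}
```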

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 include/infiniband/verbs.h | 3 +++
 src/cmd.c  | 5 -
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index d8069d52ae13..2b9b50c87f45 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -769,6 +769,9 @@ enum ibv_qp_init_attr_mask {
 
 enum ibv_qp_create_flags {
IBV_QP_CREATE_BLOCK_SELF_MCAST_LB   = 1 << 1,
+   IBV_QP_CREATE_CROSS_CHANNEL = 1 << 2,
+   IBV_QP_CREATE_MANAGED_SEND  = 1 << 3,
+   IBV_QP_CREATE_MANAGED_RECV  = 1 << 4
 };
 
 struct ibv_qp_init_attr_ex {
diff --git a/src/cmd.c b/src/cmd.c
index 675777a8ee5a..ae33491befb4 100644
--- a/src/cmd.c
+++ b/src/cmd.c
@@ -815,7 +815,10 @@ static void create_qp_handle_resp_common(struct ibv_context *context,
 }
 
 enum {
-   CREATE_QP_EX2_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB,
+   CREATE_QP_EX2_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB |
+   IBV_QP_CREATE_CROSS_CHANNEL |
+   IBV_QP_CREATE_MANAGED_SEND |
+   IBV_QP_CREATE_MANAGED_RECV
 };
 
 int ibv_cmd_create_qp_ex2(struct ibv_context *context,
-- 
1.7.12.4



[PATCH libibverbs 3/3] libibverbs: Export cross-channel capability flag

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The device implements the capability to execute complex
I/O operations across multiple I/O channels: it synchronizes
execution of I/O operations on one I/O channel with
execution of I/O operations on other channel(s).

Support for cross-channel operations is indicated by HCA_CAP.cd.
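
A consumer would presumably test the new bit in the device capability flags. A minimal sketch under that assumption: the constant is a local copy of the value added below, and device_supports_cross_channel() is a hypothetical helper that takes the flags word directly instead of querying a real device.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the value added to enum ibv_device_cap_flags below. */
enum {
	IBV_DEVICE_CROSS_CHANNEL = 1 << 27,
};

static_assert(IBV_DEVICE_CROSS_CHANNEL == 0x08000000,
	      "cross-channel capability is bit 27");

/* Hypothetical helper: true when the capability bit is set. */
static bool device_supports_cross_channel(unsigned int device_cap_flags)
{
	return (device_cap_flags & IBV_DEVICE_CROSS_CHANNEL) != 0;
}
```

In an application the flags word would normally come from the device_cap_flags field filled in by ibv_query_device().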

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 include/infiniband/verbs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index 2b9b50c87f45..879f4cca8173 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -121,6 +121,7 @@ enum ibv_device_cap_flags {
IBV_DEVICE_XRC  = 1 << 20,
IBV_DEVICE_RC_IP_CSUM   = 1 << 25,
IBV_DEVICE_RAW_IP_CSUM  = 1 << 26,
+   IBV_DEVICE_CROSS_CHANNEL= 1 << 27,
IBV_DEVICE_MANAGED_FLOW_STEERING = 1 << 29
 };
 
-- 
1.7.12.4



[PATCH libmlx5 0/2] Add cross-channel support

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

This patchset adds the libmlx5 part of cross-channel support [1].

These patches were added on top of "Completion timestamping" [2].

[1] http://thread.gmane.org/gmane.linux.drivers.rdma/31715
[2] http://www.spinics.net/lists/linux-rdma/msg30940.html

Leon Romanovsky (2):
  libmlx5: Add cross-channel primitives
  libmlx5: Add CQ ignore overrun creation flag

 src/mlx5-abi.h | 14 +
 src/mlx5.h |  4 
 src/verbs.c| 62 +-
 3 files changed, 66 insertions(+), 14 deletions(-)

-- 
1.7.12.4



[PATCH libmlx5 2/2] libmlx5: Add CQ ignore overrun creation flag

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

In order to use a CQ for cross-channel operations, the user
needs to create the CQ with the ignore-overrun creation flag.

This patch adds support for it to the libmlx5 library.

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 src/mlx5-abi.h | 14 +
 src/verbs.c| 62 +-
 2 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/src/mlx5-abi.h b/src/mlx5-abi.h
index 769ea811d26b..85f6ee3f691e 100644
--- a/src/mlx5-abi.h
+++ b/src/mlx5-abi.h
@@ -91,6 +91,20 @@ struct mlx5_create_cq_resp {
__u32   cqn;
 };
 
+struct mlx5_create_cq_ex {
+   struct ibv_create_cq_ex ibv_cmd;
+   __u64   buf_addr;
+   __u64   db_addr;
+   __u32   cqe_size;
+   __u32   comp_mask;
+};
+
+struct mlx5_create_cq_resp_ex {
+   struct ibv_create_cq_resp_exibv_resp;
+   __u32   cqn;
+   __u32   comp_mask;
+};
+
 struct mlx5_create_srq {
struct ibv_create_srq   ibv_cmd;
__u64   buf_addr;
diff --git a/src/verbs.c b/src/verbs.c
index 1dbee60a6f9a..94a1197351a2 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -250,17 +250,31 @@ enum {
 };
 
 enum {
-   CREATE_CQ_SUPPORTED_FLAGS = IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP
+   CREATE_CQ_SUPPORTED_FLAGS = IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP |
+   IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN
+};
+
+enum cmd_type {
+   LEGACY_CMD,
+   EXTENDED_CMD
 };
 
 static struct ibv_cq *create_cq(struct ibv_context *context,
-   const struct ibv_create_cq_attr_ex *cq_attr)
+   struct ibv_create_cq_attr_ex *cq_attr,
+   enum cmd_type ctype)
 {
struct mlx5_create_cq   cmd;
+   struct mlx5_create_cq_excmd_ex;
struct mlx5_create_cq_resp  resp;
+   struct mlx5_create_cq_resp_ex   resp_ex;
struct mlx5_cq *cq;
int cqe_sz;
-   int ret;
+   /*
+* Initialize to a non-zero value, because
+* ibv_cmd_* should set it to zero in case
+* of successful execution.
+*/
+   int ret = -1;
int ncqe;
 #ifdef MLX5_DEBUG
FILE *fp = to_mctx(context)->dbg_fp;
@@ -299,7 +313,6 @@ static struct ibv_cq *create_cq(struct ibv_context *context,
return NULL;
}
 
-   memset(, 0, sizeof cmd);
cq->cons_index = 0;
 
if (mlx5_spinlock_init(&cq->lock))
@@ -342,14 +355,30 @@ static struct ibv_cq *create_cq(struct ibv_context *context,
cq->arm_sn  = 0;
cq->cqe_sz  = cqe_sz;
 
-   cmd.buf_addr = (uintptr_t) cq->buf_a.buf;
-   cmd.db_addr  = (uintptr_t) cq->dbrec;
-   cmd.cqe_size = cqe_sz;
+   if (ctype == LEGACY_CMD) {
+   memset(&cmd, 0, sizeof(cmd));
+   cmd.buf_addr = (uintptr_t) cq->buf_a.buf;
+   cmd.db_addr  = (uintptr_t) cq->dbrec;
+   cmd.cqe_size = cqe_sz;
+
+   ret = ibv_cmd_create_cq(context, ncqe - 1, cq_attr->channel,
+   cq_attr->comp_vector,
+   &cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd,
+   &resp.ibv_resp, sizeof resp);
+   }
+   if (ctype == EXTENDED_CMD) {
+   memset(&cmd_ex, 0, sizeof(cmd_ex));
+   cmd_ex.buf_addr = (uintptr_t) cq->buf_a.buf;
+   cmd_ex.db_addr  = (uintptr_t) cq->dbrec;
+   cmd_ex.cqe_size = cqe_sz;
+
+   ret = ibv_cmd_create_cq_ex(context, cq_attr,
+   &cq->ibv_cq, &cmd_ex.ibv_cmd,
+   sizeof(cmd_ex.ibv_cmd), sizeof(cmd_ex),
+   &resp_ex.ibv_resp,
+   sizeof(resp_ex.ibv_resp), sizeof(resp_ex));
+   }
 
-   ret = ibv_cmd_create_cq(context, ncqe - 1, cq_attr->channel,
-   cq_attr->comp_vector,
-   &cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd,
-   &resp.ibv_resp, sizeof resp);
if (ret) {
mlx5_dbg(fp, MLX5_DBG_CQ, "ret %d\n", ret);
goto err_db;
@@ -357,7 +386,12 @@ static struct ibv_cq *create_cq(struct ibv_context *context,
 
cq->active_buf = &cq->buf_a;
cq->resize_buf = NULL;
-   cq->cqn = resp.cqn;
+
+   if (ctype == LEGACY_CMD)
+   cq->cqn = resp.cqn;
+   if (ctype == EXTENDED_CMD)
+   cq->cqn = resp_ex.cqn;

[PATCH libibverbs 0/3] Add cross-channel support

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

This patchset adds the libibverbs part of cross-channel support [1].

The patches apply on top of the "Completion timestamping" [2]
and "Expose QP block self multicast loopback creation flag" [3]
series.

[1] http://thread.gmane.org/gmane.linux.drivers.rdma/31715
[2] https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg28895.html
[3] http://comments.gmane.org/gmane.linux.drivers.rdma/30158

Leon Romanovsky (3):
  libibverbs: Add CQ ignore overrun creation flag
  libibverbs: Add cross-channel QP initialization flags
  libibverbs: Export cross-channel capability flag

 include/infiniband/verbs.h | 11 ---
 man/ibv_create_cq_ex.3 | 10 --
 src/cmd.c  |  5 -
 3 files changed, 20 insertions(+), 6 deletions(-)

-- 
1.7.12.4



[PATCH libmlx5 1/2] libmlx5: Add cross-channel primitives

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The cross-channel feature relies on special primitives to
send and receive work requests.

* WAIT on CQ WR - This WR holds execution of subsequent
  requests on that queue until the producer index of a CQ
  reaches the specified value.

* SEND_EN WR - This WR specifies the value of the producer index
  on the controlled send queue. It enables the execution of all
  WQEs up to this producer index.

* RECEIVE_EN WR - Same as SEND_EN, but for a receive
  queue.
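
For readers new to these primitives, here is a rough toy model of
the semantics described above. It is only a sketch and not libmlx5
code: the struct, the function names, the clamping in toy_enable()
and the lack of index wraparound handling are all assumptions made
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model (hypothetical, not driver code): a managed queue only
 * executes WQEs below enabled_pi, and an enable request in the style
 * of SEND_EN / RECEIVE_EN advances that index.
 */
struct toy_queue {
	uint32_t posted_pi;  /* producer index: WQEs posted by software */
	uint32_t enabled_pi; /* WQEs below this index may execute */
	uint32_t executed;   /* WQEs that have already executed */
};

/* Stand-in for SEND_EN / RECEIVE_EN: enable all WQEs up to pi. */
void toy_enable(struct toy_queue *q, uint32_t pi)
{
	if (pi > q->posted_pi)
		pi = q->posted_pi; /* assumption: cannot enable unposted WQEs */
	if (pi > q->enabled_pi)
		q->enabled_pi = pi;
}

/* Execute every enabled WQE and return how many ran in this call. */
uint32_t toy_run(struct toy_queue *q)
{
	uint32_t ran = q->enabled_pi - q->executed;

	q->executed = q->enabled_pi;
	return ran;
}

/* Stand-in for WAIT: proceed once the CQ producer index reaches wait_pi. */
bool toy_wait_satisfied(uint32_t cq_pi, uint32_t wait_pi)
{
	return cq_pi >= wait_pi;
}
```

In this model, posting four WQEs and enabling up to index 3 lets
exactly three of them run; the fourth stays pending until a later
enable request covers it.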

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 src/mlx5.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mlx5.h b/src/mlx5.h
index 0c0b0278f66e..49e3374b4439 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -208,6 +208,10 @@ enum {
MLX5_OPCODE_LOCAL_INVAL = 0x1b,
MLX5_OPCODE_CONFIG_CMD  = 0x1f,
 
+   MLX5_OPCODE_SEND_ENABLE = 0x17,
+   MLX5_OPCODE_RECV_ENABLE = 0x16,
+   MLX5_OPCODE_CQE_WAIT= 0x0f,
+
MLX5_RECV_OPCODE_RDMA_WRITE_IMM = 0x00,
MLX5_RECV_OPCODE_SEND   = 0x01,
MLX5_RECV_OPCODE_SEND_IMM   = 0x02,
-- 
1.7.12.4



[PATCH V1 2/3] IB/core: Add cross-channel support

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The cross-channel feature allows executing WQEs that involve
synchronization of I/O operations on different QPs.

This capability makes it possible to program complex flows with a
single function call, thereby significantly reducing the overhead
associated with I/O processing.

Support for cross-channel operations is indicated by the HCA
capability information.

Queue pairs can be configured to work as a “sync master queue”
or as “sync slave queues”.

The added flags are:

1. Device capability flag IB_DEVICE_CROSS_CHANNEL for
   devices that can perform cross-channel operations.

2. CQ property flag IB_CQ_FLAGS_IGNORE_OVERRUN to disable the CQ
   overrun check. This check is unnecessary in the cross-channel
   scenario.

3. QP property flags to indicate whether queues are slave or master:
   * IB_QP_CREATE_MANAGED_SEND indicates that posted send work requests
     will not be executed immediately and require enabling.
   * IB_QP_CREATE_MANAGED_RECV indicates that posted receive work
     requests will not be executed immediately and require enabling.
   * IB_QP_CREATE_CROSS_CHANNEL declares that the QP works in
     cross-channel mode. If neither IB_QP_CREATE_MANAGED_SEND nor
     IB_QP_CREATE_MANAGED_RECV is provided, this QP will be a sync
     master queue; otherwise it will be a sync slave.
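
The master/slave rule in item 3 can be sketched as a small
classifier. This is an illustration only and not part of the patch:
the flag values are copied from the ib_qp_create_flags hunk below,
while the enum qp_sync_role type, the classify_qp() name and the
handling of managed flags without the cross-channel flag are
assumptions.

```c
#include <stdint.h>

/* Flag values copied from the ib_qp_create_flags hunk in this patch. */
enum {
	QP_CREATE_CROSS_CHANNEL = 1 << 2,
	QP_CREATE_MANAGED_SEND  = 1 << 3,
	QP_CREATE_MANAGED_RECV  = 1 << 4,
};

/* Hypothetical result type, introduced only for this sketch. */
enum qp_sync_role {
	QP_ROLE_NONE,
	QP_ROLE_SYNC_MASTER,
	QP_ROLE_SYNC_SLAVE,
};

enum qp_sync_role classify_qp(uint32_t create_flags)
{
	/* Assumption: without the cross-channel flag there is no sync role. */
	if (!(create_flags & QP_CREATE_CROSS_CHANNEL))
		return QP_ROLE_NONE;

	/* Either managed flag makes the cross-channel QP a sync slave. */
	if (create_flags & (QP_CREATE_MANAGED_SEND | QP_CREATE_MANAGED_RECV))
		return QP_ROLE_SYNC_SLAVE;

	/* Cross-channel with no managed queues: sync master. */
	return QP_ROLE_SYNC_MASTER;
}
```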

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c |  5 -
 include/rdma/ib_verbs.h  | 11 +++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 94816ae..0e9711f 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1843,7 +1843,10 @@ static int create_qp(struct ib_uverbs_file *file,
  sizeof(cmd->create_flags))
attr.create_flags = cmd->create_flags;
 
-   if (attr.create_flags & ~IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) {
+   if (attr.create_flags & ~(IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK |
+   IB_QP_CREATE_CROSS_CHANNEL |
+   IB_QP_CREATE_MANAGED_SEND |
+   IB_QP_CREATE_MANAGED_RECV)) {
ret = -EINVAL;
goto err_put;
}
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index bcf40ad..370fbdf 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -139,6 +139,13 @@ enum ib_device_cap_flags {
IB_DEVICE_MEM_WINDOW_TYPE_2B= (1 << 24),
IB_DEVICE_RC_IP_CSUM= (1 << 25),
IB_DEVICE_RAW_IP_CSUM   = (1 << 26),
+   /*
+* Devices should set IB_DEVICE_CROSS_CHANNEL if they
+* support execution of WQEs that involve synchronization
+* of I/O operations with single completion queue managed
+* by hardware.
+*/
+   IB_DEVICE_CROSS_CHANNEL = (1 << 27),
IB_DEVICE_MANAGED_FLOW_STEERING = (1 << 29),
IB_DEVICE_SIGNATURE_HANDOVER= (1 << 30),
IB_DEVICE_ON_DEMAND_PAGING  = (1 << 31),
@@ -184,6 +191,7 @@ struct ib_odp_caps {
 
 enum ib_cq_creation_flags {
IB_CQ_FLAGS_TIMESTAMP_COMPLETION   = 1 << 0,
+   IB_CQ_FLAGS_IGNORE_OVERRUN = 1 << 1,
 };
 
 struct ib_cq_init_attr {
@@ -866,6 +874,9 @@ enum ib_qp_type {
 enum ib_qp_create_flags {
IB_QP_CREATE_IPOIB_UD_LSO   = 1 << 0,
IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK   = 1 << 1,
+   IB_QP_CREATE_CROSS_CHANNEL  = 1 << 2,
+   IB_QP_CREATE_MANAGED_SEND   = 1 << 3,
+   IB_QP_CREATE_MANAGED_RECV   = 1 << 4,
IB_QP_CREATE_NETIF_QP   = 1 << 5,
IB_QP_CREATE_SIGNATURE_EN   = 1 << 6,
IB_QP_CREATE_USE_GFP_NOIO   = 1 << 7,
-- 
1.7.12.4



[PATCH V1 0/3] Add cross-channel support

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

This patchset adds cross-channel support.

The cross-channel feature allows executing WQEs that involve
synchronization of I/O operations on different QPs.

This capability makes it possible to program complex flows with a
single function call, thereby significantly reducing the overhead
associated with I/O processing.

Changes from v0:
  * Set UAR to be the same for QP and CQ.

Leon Romanovsky (3):
  IB/core: Align coding style of ib_device_cap_flags structure
  IB/core: Add cross-channel support
  IB/mlx5: Add driver cross-channel support

 drivers/infiniband/core/uverbs_cmd.c |  5 ++-
 drivers/infiniband/hw/mlx5/cq.c  |  7 +++-
 drivers/infiniband/hw/mlx5/main.c|  3 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 16 
 drivers/infiniband/hw/mlx5/qp.c  | 54 ++-
 include/linux/mlx5/qp.h  |  3 ++
 include/rdma/ib_verbs.h  | 71 +---
 7 files changed, 117 insertions(+), 42 deletions(-)

-- 
1.7.12.4



[PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Modify enum ib_device_cap_flags such that other patches which add new
enum values pass strict checkpatch.pl checks.

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 include/rdma/ib_verbs.h | 60 -
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 9a68a19..bcf40ad 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -105,24 +105,24 @@ enum rdma_link_layer {
 };
 
 enum ib_device_cap_flags {
-   IB_DEVICE_RESIZE_MAX_WR = 1,
-   IB_DEVICE_BAD_PKEY_CNTR = (1<<1),
-   IB_DEVICE_BAD_QKEY_CNTR = (1<<2),
-   IB_DEVICE_RAW_MULTI = (1<<3),
-   IB_DEVICE_AUTO_PATH_MIG = (1<<4),
-   IB_DEVICE_CHANGE_PHY_PORT   = (1<<5),
-   IB_DEVICE_UD_AV_PORT_ENFORCE= (1<<6),
-   IB_DEVICE_CURR_QP_STATE_MOD = (1<<7),
-   IB_DEVICE_SHUTDOWN_PORT = (1<<8),
-   IB_DEVICE_INIT_TYPE = (1<<9),
-   IB_DEVICE_PORT_ACTIVE_EVENT = (1<<10),
-   IB_DEVICE_SYS_IMAGE_GUID= (1<<11),
-   IB_DEVICE_RC_RNR_NAK_GEN= (1<<12),
-   IB_DEVICE_SRQ_RESIZE= (1<<13),
-   IB_DEVICE_N_NOTIFY_CQ   = (1<<14),
-   IB_DEVICE_LOCAL_DMA_LKEY= (1<<15),
-   IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
-   IB_DEVICE_MEM_WINDOW= (1<<17),
+   IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
+   IB_DEVICE_BAD_PKEY_CNTR = (1 << 1),
+   IB_DEVICE_BAD_QKEY_CNTR = (1 << 2),
+   IB_DEVICE_RAW_MULTI = (1 << 3),
+   IB_DEVICE_AUTO_PATH_MIG = (1 << 4),
+   IB_DEVICE_CHANGE_PHY_PORT   = (1 << 5),
+   IB_DEVICE_UD_AV_PORT_ENFORCE= (1 << 6),
+   IB_DEVICE_CURR_QP_STATE_MOD = (1 << 7),
+   IB_DEVICE_SHUTDOWN_PORT = (1 << 8),
+   IB_DEVICE_INIT_TYPE = (1 << 9),
+   IB_DEVICE_PORT_ACTIVE_EVENT = (1 << 10),
+   IB_DEVICE_SYS_IMAGE_GUID= (1 << 11),
+   IB_DEVICE_RC_RNR_NAK_GEN= (1 << 12),
+   IB_DEVICE_SRQ_RESIZE= (1 << 13),
+   IB_DEVICE_N_NOTIFY_CQ   = (1 << 14),
+   IB_DEVICE_LOCAL_DMA_LKEY= (1 << 15),
+   IB_DEVICE_RESERVED  = (1 << 16), /* old SEND_W_INV */
+   IB_DEVICE_MEM_WINDOW= (1 << 17),
/*
 * Devices should set IB_DEVICE_UD_IP_SUM if they support
 * insertion of UDP and TCP checksum on outgoing UD IPoIB
@@ -130,18 +130,18 @@ enum ib_device_cap_flags {
 * incoming messages.  Setting this flag implies that the
 * IPoIB driver may set NETIF_F_IP_CSUM for datagram mode.
 */
-   IB_DEVICE_UD_IP_CSUM= (1<<18),
-   IB_DEVICE_UD_TSO= (1<<19),
-   IB_DEVICE_XRC   = (1<<20),
-   IB_DEVICE_MEM_MGT_EXTENSIONS= (1<<21),
-   IB_DEVICE_BLOCK_MULTICAST_LOOPBACK = (1<<22),
-   IB_DEVICE_MEM_WINDOW_TYPE_2A= (1<<23),
-   IB_DEVICE_MEM_WINDOW_TYPE_2B= (1<<24),
-   IB_DEVICE_RC_IP_CSUM= (1<<25),
-   IB_DEVICE_RAW_IP_CSUM   = (1<<26),
-   IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29),
-   IB_DEVICE_SIGNATURE_HANDOVER= (1<<30),
-   IB_DEVICE_ON_DEMAND_PAGING  = (1<<31),
+   IB_DEVICE_UD_IP_CSUM= (1 << 18),
+   IB_DEVICE_UD_TSO= (1 << 19),
+   IB_DEVICE_XRC   = (1 << 20),
+   IB_DEVICE_MEM_MGT_EXTENSIONS= (1 << 21),
+   IB_DEVICE_BLOCK_MULTICAST_LOOPBACK = (1 << 22),
+   IB_DEVICE_MEM_WINDOW_TYPE_2A= (1 << 23),
+   IB_DEVICE_MEM_WINDOW_TYPE_2B= (1 << 24),
+   IB_DEVICE_RC_IP_CSUM= (1 << 25),
+   IB_DEVICE_RAW_IP_CSUM   = (1 << 26),
+   IB_DEVICE_MANAGED_FLOW_STEERING = (1 << 29),
+   IB_DEVICE_SIGNATURE_HANDOVER= (1 << 30),
+   IB_DEVICE_ON_DEMAND_PAGING  = (1 << 31),
 };
 
 enum ib_signature_prot_cap {
-- 
1.7.12.4



[PATCH V1 3/3] IB/mlx5: Add driver cross-channel support

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Add cross-channel support to the mlx5 driver. This includes
the ability to ignore overrun for a CQ that is intended for
cross-channel use, exporting the device capability, and
configuring a QP to be a sync master or sync slave queue.

A cross-channel enabled QP supports a combination of
three possible properties:
* WQE processing on the receive queue of this QP
* WQE processing on the send queue of this QP
* WQEs are supported on the send queue

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c  |  7 -
 drivers/infiniband/hw/mlx5/main.c|  3 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 16 +++
 drivers/infiniband/hw/mlx5/qp.c  | 54 +---
 include/linux/mlx5/qp.h  |  3 ++
 5 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 3dfd287..c363b71 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -760,7 +760,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (attr->flags)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EINVAL);
 
if (entries < 0)
@@ -779,6 +779,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
spin_lock_init(&cq->lock);
cq->resize_buf = NULL;
cq->resize_umem = NULL;
+   cq->create_flags = attr->flags;
 
if (context) {
err = create_cq_user(dev, udata, context, cq, entries,
@@ -796,6 +797,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
 
cq->cqe_size = cqe_size;
cqb->ctx.cqe_sz_flags = cqe_sz_to_mlx_sz(cqe_size) << 5;
+
+   if (cq->create_flags & IB_CQ_FLAGS_IGNORE_OVERRUN)
+   cqb->ctx.cqe_sz_flags |= (1 << 1);
+
cqb->ctx.log_sz_usr_page = cpu_to_be32((ilog2(entries) << 24) | index);
err = mlx5_vector2eqn(dev->mdev, vector, &eqn, &irqn);
if (err)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index bdd60a6..be0d994 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -300,6 +300,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
props->odp_caps = dev->odp_caps;
 #endif
 
+   if (MLX5_CAP_GEN(mdev, cd))
+   props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
+
return 0;
 }
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 6333472..fdbd761 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -85,6 +85,10 @@ enum mlx5_ib_mad_ifc_flags {
MLX5_MAD_IFC_NET_VIEW   = 4,
 };
 
+enum {
+   MLX5_CROSS_CHANNEL_UUAR = 0,
+};
+
 struct mlx5_ib_ucontext {
struct ib_ucontext  ibucontext;
struct list_headdb_page_list;
@@ -242,6 +246,9 @@ struct mlx5_ib_cq_buf {
 enum mlx5_ib_qp_flags {
MLX5_IB_QP_BLOCK_MULTICAST_LOOPBACK = 1 << 0,
MLX5_IB_QP_SIGNATURE_HANDLING   = 1 << 1,
+   MLX5_IB_QP_CROSS_CHANNEL= 1 << 2,
+   MLX5_IB_QP_MANAGED_SEND = 1 << 3,
+   MLX5_IB_QP_MANAGED_RECV = 1 << 4,
 };
 
 struct mlx5_umr_wr {
@@ -284,6 +291,7 @@ struct mlx5_ib_cq {
struct mlx5_ib_cq_buf  *resize_buf;
struct ib_umem *resize_umem;
int cqe_size;
+   u32 create_flags;
 };
 
 struct mlx5_ib_srq {
@@ -662,4 +670,12 @@ static inline int is_qp1(enum ib_qp_type qp_type)
 #define MLX5_MAX_UMR_SHIFT 16
 #define MLX5_MAX_UMR_PAGES (1 << MLX5_MAX_UMR_SHIFT)
 
+static inline u32 check_cq_create_flags(u32 flags)
+{
+   /*
+* It returns non-zero value for unsupported CQ
+* create flags, otherwise it returns zero.
+*/
+   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+}
 #endif /* MLX5_IB_H */
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 307bdbc..c18bf42 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -615,18 +615,23 @@ static int create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
/*
 * TBD: should come from the verbs when we have the API
 */
-   uuarn = alloc_uuar(&context->uuari, MLX5_IB_LATENCY_CLASS_HIGH);
-   if (uuarn < 0) {
-   mlx5_ib_dbg(dev, "failed to allocate low latency UUAR\n");
-   mlx5_ib_dbg(dev, "reverting to medium latency\n");
-   uuarn = alloc_uuar(&context->uuari, MLX5_IB_LATENCY_CLASS_MEDIUM);
+   if (qp->flags & MLX5_IB_QP_CROSS_CHANNEL)
+

[PATCH libibverbs 1/3] libibverbs: Add CQ ignore overrun creation flag

2015-12-20 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

A CQ overrun is checked while posting a completion, and if
encountered, the QP is transferred to the appropriate error
state.

CQ update (and error discovery) are not synchronized with
WQE execution. Thus, a positive acknowledgement can be sent
to the remote request, and subsequently a CQ overrun may be
detected. Hardware may not write the CQE as a result of the
error, and the CQ will be transitioned to an error state as
well as the QP that caused the CQ error.

To avoid these errors, software usually ensures that the
number of WQEs posted to a QP does not exceed the number
of entries in the CQ associated with this QP. When multiple
QPs post completions into the same CQ, this check is cumulative.
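
The cumulative check described above can be sketched as follows.
This is an illustration only, not libibverbs code: cq_capacity_ok()
and its parameters are hypothetical names, and the sketch assumes
that every posted WQE may generate one completion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the worst-case number of
 * outstanding completions from all QPs sharing a CQ fits into it.
 * qp_max_wqes[i] is the maximum number of WQEs QP i may have posted.
 */
bool cq_capacity_ok(const uint32_t *qp_max_wqes, size_t nqps,
		    uint32_t cq_entries)
{
	uint64_t total = 0; /* 64-bit sum so the addition cannot wrap */

	for (size_t i = 0; i < nqps; i++)
		total += qp_max_wqes[i];

	return total <= cq_entries;
}
```

Two QPs with 64 WQEs each fit a CQ with 128 entries but not one
with 127, which is the bookkeeping the ignore overrun flag is
meant to make unnecessary in the cross-channel case.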

The CQ ignore overrun flag tells the HW that doorbells will not be
rung on updates of the CQ consumer counter (poll for completion) or
on requests for completion notifications (Arm CQ).

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 include/infiniband/verbs.h |  7 ---
 man/ibv_create_cq_ex.3 | 10 --
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index c3e863850d4e..d8069d52ae13 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -1201,12 +1201,13 @@ struct ibv_context {
 };
 
 enum ibv_create_cq_attr {
-   IBV_CREATE_CQ_ATTR_FLAGS= 1 << 0,
-   IBV_CREATE_CQ_ATTR_RESERVED = 1 << 1
+   IBV_CREATE_CQ_ATTR_FLAGS= 1 << 0,
+   IBV_CREATE_CQ_ATTR_RESERVED = 1 << 1
 };
 
 enum ibv_create_cq_attr_flags {
-   IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP = 1 << 0,
+   IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP = 1 << 0,
+   IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN   = 1 << 1
 };
 
 struct ibv_create_cq_attr_ex {
diff --git a/man/ibv_create_cq_ex.3 b/man/ibv_create_cq_ex.3
index 9f9e049b0d43..8670bd0a828b 100644
--- a/man/ibv_create_cq_ex.3
+++ b/man/ibv_create_cq_ex.3
@@ -42,13 +42,19 @@ enum ibv_wc_flags_ex {
 IBV_WC_EX_WITH_SLID  = 1 << 7,  /* Require slid in WC */
 IBV_WC_EX_WITH_SL= 1 << 8,  /* Require sl in WC */
 IBV_WC_EX_WITH_DLID_PATH_BITS= 1 << 9,  /* Require dlid path bits in WC */
-IBV_WC_EX_WITH_COMPLETION_TIMESTAMP  = 1 << 10, /* Require completion timestamp in WC /*
+IBV_WC_EX_WITH_COMPLETION_TIMESTAMP  = 1 << 10, /* Require completion timestamp in WC */
 };
 
+enum ibv_create_cq_attr_flags {
+   IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP = 1 << 0,
+   IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN   = 1 << 1
+};
 
 enum ibv_create_cq_attr {
-IBV_CREATE_CQ_ATTR_FLAGS = 1 << 0,
+   IBV_CREATE_CQ_ATTR_FLAGS= 1 << 0,
+   IBV_CREATE_CQ_ATTR_RESERVED = 1 << 1
 };
+
 .SH "RETURN VALUE"
 .B ibv_create_cq_ex()
 returns a pointer to the CQ, or NULL if the request fails.
-- 
1.7.12.4



Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-20 Thread Leon Romanovsky
On Mon, Dec 21, 2015 at 8:52 AM, Or Gerlitz <gerlitz...@gmail.com> wrote:
> On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky <l...@leon.nu> wrote:
>> On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny <ira.we...@intel.com> wrote:
>>> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote:
>>>> From: Leon Romanovsky <leo...@mellanox.com>
>
>>>> Modify enum ib_device_cap_flags such that other patches which add new
>>>> enum values pass strict checkpatch.pl checks.
>
>>>> - IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
>>>> - IB_DEVICE_MEM_WINDOW= (1<<17),
>>>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
>
>> 2. Change the whole file => the work with "git blame" will be less
>> straightforward.
>
> Agree.
>
> Leon, I don't think we need to take checkpatch-ing of things to that level.
>
> Indeed, we should make sure that whole new enums and such are done right --
> but avoid touching core structs/enums in a manner that disallows
> simple git blaming of things, which is very useful for new comers and
> old suffers.
There is no doubt that fixing checkpatch errors in a standalone patch
is better suited to the staging subsystem.
In our case, it is part of upcoming changes to that structure. Such a
change serves a specific goal: minimizing the possibility of error by
getting clean output from the static analyser tool.

For newcomers and old sufferers, the checkpatch tool is extremely useful too.

>
>> I will do the change across whole file, If Doug accepts such change.
>
> Please don't... simple git blame is very powerful tool for kernel
> developers everyday work.
It is an open question which we prefer: more history with a chance of
a mistake sneaking in, or less history with fewer chances of a mistake.

>
> Or.


Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-20 Thread Leon Romanovsky
On Mon, Dec 21, 2015 at 9:40 AM, Or Gerlitz <gerlitz...@gmail.com> wrote:
> On Mon, Dec 21, 2015 at 9:27 AM, Leon Romanovsky <l...@leon.nu> wrote:
>> On Mon, Dec 21, 2015 at 8:52 AM, Or Gerlitz <gerlitz...@gmail.com> wrote:
>>> On Mon, Dec 21, 2015 at 8:37 AM, Leon Romanovsky <l...@leon.nu> wrote:
>>>> On Mon, Dec 21, 2015 at 8:22 AM, ira.weiny <ira.we...@intel.com> wrote:
>>>>> On Sun, Dec 20, 2015 at 12:16:09PM +0200, Leon Romanovsky wrote:
>>>>>> From: Leon Romanovsky <leo...@mellanox.com>
>>>
>>>>>> Modify enum ib_device_cap_flags such that other patches which add new
>>>>>> enum values pass strict checkpatch.pl checks.
>>>
>>>>>> - IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
>>>>>> - IB_DEVICE_MEM_WINDOW= (1<<17),
>>>>>> + IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
>>>
>>>> 2. Change the whole file => the work with "git blame" will be less
>>>> straightforward.
>>>
>>> Agree.
>>>
>>> Leon, I don't think we need to take checkpatch-ing of things to that level.
>>>
>>> Indeed, we should make sure that whole new enums and such are done right --
>>> but avoid touching core structs/enums in a manner that disallows
>>> simple git blaming of things, which is very useful for new comers and
>>> old suffers.
>
>> There are no doubts that standalone fixing checkpatch errors is more
>> suitable to staging subsystem.
>
> Agree
>
>> In our case, it is part of coming changes in that structure. such
>> change serves specific goal to minimize the possibility of error
>> by seeing clean output from static analyser tool.
>
> Disagree.
>
> What future bugs are you envisioning by let this 10y old header file
> keep having some checkpatch issues on few of the major enums?!
If I knew the future, I would be able to answer.

I think that you expressed your opinion very clearly. Let's wait for
Doug's response on such changes.

>
> Or.


[PATCH] libmlx5: Add gitignore file to the project

2015-12-19 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Add gitignore file to the libmlx5 project.

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 .gitignore | 20 
 1 file changed, 20 insertions(+)
 create mode 100644 .gitignore

diff --git a/.gitignore b/.gitignore
new file mode 100644
index ..be8e0f03eb93
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,20 @@
+*.o
+*.lo
+*.swp
+configure
+Makefile.in
+autom4te.cache
+aclocal.m4
+stamp-h.in
+config.h.in
+config.h.in~
+config.log
+config.h
+.libs
+.deps
+libmlx5.spec
+Makefile
+config.status
+stamp-h1
+libtool
+tags
-- 
1.7.12.4



Re: [PATCH 3/3] bject: IB Core: Display extended counter set if available

2015-12-17 Thread Leon Romanovsky
There are a number of minor issues:
Subject: Re: [PATCH 3/3] bject: IB Core: Display extended counter set if
^^^

On Thu, Dec 17, 2015 at 01:52:58PM -0600, Christoph Lameter wrote:
> - in_mad->data[41] = port_num;/* PortSelect field */
> + if (port_num)
> + in_mad->data[41] = port_num;/* PortSelect field */
in_mad is allocated with kzalloc(), so all fields are zeroed by default
and you can drop the if (port_num) condition.
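
A userspace sketch of the same point, with calloc() standing in for
kzalloc(). The toy_mad structure, its 64-byte size and the
toy_mad_alloc() name are assumptions for illustration; only the
offset 41 comes from the quoted patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the MAD buffer; the size is illustrative. */
struct toy_mad {
	uint8_t data[64];
};

struct toy_mad *toy_mad_alloc(uint8_t port_num)
{
	/* calloc() zeroes the buffer, as kzalloc() does in the kernel. */
	struct toy_mad *mad = calloc(1, sizeof(*mad));

	if (!mad)
		return NULL;

	/*
	 * Unconditional store: writing 0 into an already zeroed byte
	 * changes nothing, so no "if (port_num)" guard is needed.
	 */
	mad->data[41] = port_num; /* PortSelect field */
	return mad;
}
```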


[PATCH 2/2] IB/mlx4: Convert kmalloc to be kmalloc_array to fix checkpatch warnings

2015-12-16 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Convert kmalloc to kmalloc_array to fix the checkpatch warnings below:

WARNING: Prefer kmalloc_array over kmalloc with multiply
+   qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(u64),

WARNING: Prefer kmalloc_array over kmalloc with multiply
+   qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof(u64),

WARNING: Prefer kmalloc_array over kmalloc with multiply
+   srq->wrid = kmalloc(srq->msrq.max * sizeof(u64),
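
For context, the usual reason for preferring an array allocator is
that it checks the element count times element size multiplication
for overflow before allocating. A userspace sketch of that idea
follows; alloc_array() is a hypothetical name and this is not the
kernel implementation of kmalloc_array().

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of an array allocator: refuse the
 * request when n * size would overflow size_t, instead of silently
 * allocating a buffer that is too small.
 */
void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL; /* n * size would overflow */

	return malloc(n * size);
}
```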

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
Reviewed-by: Or Gerlitz <ogerl...@mellanox.com>
---
 drivers/infiniband/hw/mlx4/qp.c  | 4 ++--
 drivers/infiniband/hw/mlx4/srq.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index dc86975fe1a9..70de13ed9da7 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -796,12 +796,12 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
if (err)
goto err_mtt;
 
-   qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(u64),
+   qp->sq.wrid = kmalloc_array(qp->sq.wqe_cnt, sizeof(u64),
gfp | __GFP_NOWARN);
if (!qp->sq.wrid)
qp->sq.wrid = __vmalloc(qp->sq.wqe_cnt * sizeof(u64),
gfp, PAGE_KERNEL);
-   qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof(u64),
+   qp->rq.wrid = kmalloc_array(qp->rq.wqe_cnt, sizeof(u64),
gfp | __GFP_NOWARN);
if (!qp->rq.wrid)
qp->rq.wrid = __vmalloc(qp->rq.wqe_cnt * sizeof(u64),
diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c
index f416c7463827..68d5a5fda271 100644
--- a/drivers/infiniband/hw/mlx4/srq.c
+++ b/drivers/infiniband/hw/mlx4/srq.c
@@ -171,7 +171,7 @@ struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd,
if (err)
goto err_mtt;
 
-   srq->wrid = kmalloc(srq->msrq.max * sizeof(u64),
+   srq->wrid = kmalloc_array(srq->msrq.max, sizeof(u64),
GFP_KERNEL | __GFP_NOWARN);
if (!srq->wrid) {
srq->wrid = __vmalloc(srq->msrq.max * sizeof(u64),
-- 
1.7.12.4



[PATCH 1/2] IB/mlx4: Suppress memory allocations warnings in kmalloc->__vmalloc flows

2015-12-16 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

A failed kmalloc() allocation prints a warning. Such warnings are no
longer needed, since commit 0ef2f05c7e02 ("IB/mlx4: Use vmalloc for
WR buffers when needed") added a fallback from kmalloc() to
__vmalloc().
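
The fallback pattern can be sketched in userspace as below. This is
only an analogue, not the mlx4 code: toy_contig_alloc() mimics a
contiguous allocator that fails quietly for large requests, plain
malloc() stands in for __vmalloc(), and the 4096-byte limit and all
names are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_CONTIG_LIMIT 4096 /* hypothetical size limit */

/* Mimics a contiguous allocator that fails for large requests. */
static void *toy_contig_alloc(size_t size)
{
	return size <= TOY_CONTIG_LIMIT ? malloc(size) : NULL;
}

/*
 * Allocate a work request ID array. The first attempt is allowed to
 * fail silently (the role of __GFP_NOWARN), because the fallback
 * handles the failure. *used_fallback reports which path was taken.
 */
void *toy_alloc_wrid(size_t count, int *used_fallback)
{
	size_t size = count * sizeof(uint64_t);
	void *p = toy_contig_alloc(size);

	*used_fallback = 0;
	if (!p) {
		p = malloc(size); /* stands in for __vmalloc() */
		*used_fallback = 1;
	}
	return p;
}
```

Since the caller only sees a failure when both attempts fail, a
warning on the first attempt would be noise.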

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
Reviewed-by: Or Gerlitz <ogerl...@mellanox.com>
---
 drivers/infiniband/hw/mlx4/qp.c  | 6 --
 drivers/infiniband/hw/mlx4/srq.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 13eaaf45288f..dc86975fe1a9 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -796,11 +796,13 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
if (err)
goto err_mtt;
 
-   qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(u64), gfp);
+   qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(u64),
+   gfp | __GFP_NOWARN);
if (!qp->sq.wrid)
qp->sq.wrid = __vmalloc(qp->sq.wqe_cnt * sizeof(u64),
gfp, PAGE_KERNEL);
-   qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof(u64), gfp);
+   qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof(u64),
+   gfp | __GFP_NOWARN);
if (!qp->rq.wrid)
qp->rq.wrid = __vmalloc(qp->rq.wqe_cnt * sizeof(u64),
gfp, PAGE_KERNEL);
diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c
index 8d133c40fa0e..f416c7463827 100644
--- a/drivers/infiniband/hw/mlx4/srq.c
+++ b/drivers/infiniband/hw/mlx4/srq.c
@@ -171,7 +171,8 @@ struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd,
if (err)
goto err_mtt;
 
-   srq->wrid = kmalloc(srq->msrq.max * sizeof (u64), GFP_KERNEL);
+   srq->wrid = kmalloc(srq->msrq.max * sizeof(u64),
+   GFP_KERNEL | __GFP_NOWARN);
if (!srq->wrid) {
srq->wrid = __vmalloc(srq->msrq.max * sizeof(u64),
  GFP_KERNEL, PAGE_KERNEL);
-- 
1.7.12.4



[PATCH 2/3] IB/core: Add cross-channel support

2015-12-12 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The cross-channel feature allows executing WQEs that involve
synchronization of I/O operations on different QPs.

This capability makes it possible to program complex flows with a
single function call, thereby significantly reducing the overhead
associated with I/O processing.

Support for cross-channel operations is indicated by the HCA
capability information.

Queue pairs can be configured to work as a “sync master queue”
or as “sync slave queues”.

The added flags are:

1. Device capability flag IB_DEVICE_CROSS_CHANNEL for
   devices that can perform cross-channel operations.

2. CQ property flag IB_CQ_FLAGS_IGNORE_OVERRUN to disable the CQ
   overrun check. This check is unnecessary in the cross-channel
   scenario.

3. QP property flags to indicate whether queues are slave or master:
   * IB_QP_CREATE_MANAGED_SEND indicates that posted send work requests
     will not be executed immediately and require enabling.
   * IB_QP_CREATE_MANAGED_RECV indicates that posted receive work
     requests will not be executed immediately and require enabling.
   * IB_QP_CREATE_CROSS_CHANNEL declares that the QP works in
     cross-channel mode. If neither IB_QP_CREATE_MANAGED_SEND nor
     IB_QP_CREATE_MANAGED_RECV is provided, this QP will be a sync
     master queue; otherwise it will be a sync slave.
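
The create_qp() hunk below extends the set of creation flags that
userspace may pass. A standalone sketch of that mask check follows,
for illustration only: the flag values are copied from the
ib_qp_create_flags hunk, while QP_CREATE_USER_ALLOWED and
check_create_flags() are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values copied from the ib_qp_create_flags hunk in this patch. */
enum {
	QP_CREATE_BLOCK_MULTICAST_LOOPBACK = 1 << 1,
	QP_CREATE_CROSS_CHANNEL            = 1 << 2,
	QP_CREATE_MANAGED_SEND             = 1 << 3,
	QP_CREATE_MANAGED_RECV             = 1 << 4,
};

/* Hypothetical name for the mask accepted from userspace. */
#define QP_CREATE_USER_ALLOWED (QP_CREATE_BLOCK_MULTICAST_LOOPBACK | \
				QP_CREATE_CROSS_CHANNEL | \
				QP_CREATE_MANAGED_SEND | \
				QP_CREATE_MANAGED_RECV)

/* Returns 0 if only allowed bits are set, -EINVAL otherwise. */
int check_create_flags(uint32_t flags)
{
	if (flags & ~(uint32_t)QP_CREATE_USER_ALLOWED)
		return -EINVAL;
	return 0;
}
```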

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c |  5 -
 include/rdma/ib_verbs.h  | 11 +++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 94816ae..0e9711f 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1843,7 +1843,10 @@ static int create_qp(struct ib_uverbs_file *file,
  sizeof(cmd->create_flags))
attr.create_flags = cmd->create_flags;
 
-   if (attr.create_flags & ~IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) {
+   if (attr.create_flags & ~(IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK |
+   IB_QP_CREATE_CROSS_CHANNEL |
+   IB_QP_CREATE_MANAGED_SEND |
+   IB_QP_CREATE_MANAGED_RECV)) {
ret = -EINVAL;
goto err_put;
}
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index bcf40ad..370fbdf 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -139,6 +139,13 @@ enum ib_device_cap_flags {
IB_DEVICE_MEM_WINDOW_TYPE_2B= (1 << 24),
IB_DEVICE_RC_IP_CSUM= (1 << 25),
IB_DEVICE_RAW_IP_CSUM   = (1 << 26),
+   /*
+* Devices should set IB_DEVICE_CROSS_CHANNEL if they
+* support execution of WQEs that involve synchronization
+* of I/O operations with single completion queue managed
+* by hardware.
+*/
+   IB_DEVICE_CROSS_CHANNEL = (1 << 27),
IB_DEVICE_MANAGED_FLOW_STEERING = (1 << 29),
IB_DEVICE_SIGNATURE_HANDOVER= (1 << 30),
IB_DEVICE_ON_DEMAND_PAGING  = (1 << 31),
@@ -184,6 +191,7 @@ struct ib_odp_caps {
 
 enum ib_cq_creation_flags {
IB_CQ_FLAGS_TIMESTAMP_COMPLETION   = 1 << 0,
+   IB_CQ_FLAGS_IGNORE_OVERRUN = 1 << 1,
 };
 
 struct ib_cq_init_attr {
@@ -866,6 +874,9 @@ enum ib_qp_type {
 enum ib_qp_create_flags {
IB_QP_CREATE_IPOIB_UD_LSO   = 1 << 0,
IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK   = 1 << 1,
+   IB_QP_CREATE_CROSS_CHANNEL  = 1 << 2,
+   IB_QP_CREATE_MANAGED_SEND   = 1 << 3,
+   IB_QP_CREATE_MANAGED_RECV   = 1 << 4,
IB_QP_CREATE_NETIF_QP   = 1 << 5,
IB_QP_CREATE_SIGNATURE_EN   = 1 << 6,
IB_QP_CREATE_USE_GFP_NOIO   = 1 << 7,
-- 
1.7.12.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] Add cross-channel support

2015-12-12 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

This patchset adds cross-channel support.

The cross-channel feature allows executing WQEs that involve
synchronization of I/O operations on different QPs.

This capability makes it possible to program complex flows with a single
function call, thereby significantly reducing the overhead associated
with I/O processing.

Leon Romanovsky (3):
  IB/core: Align coding style of ib_device_cap_flags structure
  IB/core: Add cross-channel support
  IB/mlx5: Add driver cross-channel support

 drivers/infiniband/core/uverbs_cmd.c |  5 ++-
 drivers/infiniband/hw/mlx5/cq.c  |  7 +++-
 drivers/infiniband/hw/mlx5/main.c|  3 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 12 ++
 drivers/infiniband/hw/mlx5/qp.c  | 29 +++
 include/linux/mlx5/qp.h  |  3 ++
 include/rdma/ib_verbs.h  | 71 +---
 7 files changed, 98 insertions(+), 32 deletions(-)

-- 
1.7.12.4



[PATCH 3/3] IB/mlx5: Add driver cross-channel support

2015-12-12 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Add support for cross-channel functionality to the mlx5
driver. This includes the ability to ignore overrun for a CQ
which is intended for cross-channel use, exporting the device
capability, and configuring the QP to be a sync master/slave queue.

A cross-channel enabled QP supports a combination of
three possible properties:
* WQE processing on the receive queue of this QP
* WQE processing on the send queue of this QP
* WQE are supported on the send queue

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c  |  7 ++-
 drivers/infiniband/hw/mlx5/main.c|  3 +++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 12 
 drivers/infiniband/hw/mlx5/qp.c  | 29 +
 include/linux/mlx5/qp.h  |  3 +++
 5 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 3dfd287..c363b71 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -760,7 +760,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
int eqn;
int err;
 
-   if (attr->flags)
+   if (check_cq_create_flags(attr->flags))
return ERR_PTR(-EINVAL);
 
if (entries < 0)
@@ -779,6 +779,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
spin_lock_init(&cq->lock);
cq->resize_buf = NULL;
cq->resize_umem = NULL;
+   cq->create_flags = attr->flags;
 
if (context) {
err = create_cq_user(dev, udata, context, cq, entries,
@@ -796,6 +797,10 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
 
cq->cqe_size = cqe_size;
cqb->ctx.cqe_sz_flags = cqe_sz_to_mlx_sz(cqe_size) << 5;
+
+   if (cq->create_flags & IB_CQ_FLAGS_IGNORE_OVERRUN)
+   cqb->ctx.cqe_sz_flags |= (1 << 1);
+
cqb->ctx.log_sz_usr_page = cpu_to_be32((ilog2(entries) << 24) | index);
err = mlx5_vector2eqn(dev->mdev, vector, , );
if (err)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index bdd60a6..be0d994 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -300,6 +300,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
props->odp_caps = dev->odp_caps;
 #endif
 
+   if (MLX5_CAP_GEN(mdev, cd))
+   props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
+
return 0;
 }
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 6333472..7f9c6bf 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -242,6 +242,9 @@ struct mlx5_ib_cq_buf {
 enum mlx5_ib_qp_flags {
MLX5_IB_QP_BLOCK_MULTICAST_LOOPBACK = 1 << 0,
MLX5_IB_QP_SIGNATURE_HANDLING   = 1 << 1,
+   MLX5_IB_QP_CROSS_CHANNEL= 1 << 2,
+   MLX5_IB_QP_MANAGED_SEND = 1 << 3,
+   MLX5_IB_QP_MANAGED_RECV = 1 << 4,
 };
 
 struct mlx5_umr_wr {
@@ -284,6 +287,7 @@ struct mlx5_ib_cq {
struct mlx5_ib_cq_buf  *resize_buf;
struct ib_umem *resize_umem;
int cqe_size;
+   u32 create_flags;
 };
 
 struct mlx5_ib_srq {
@@ -662,4 +666,12 @@ static inline int is_qp1(enum ib_qp_type qp_type)
 #define MLX5_MAX_UMR_SHIFT 16
 #define MLX5_MAX_UMR_PAGES (1 << MLX5_MAX_UMR_SHIFT)
 
+static inline u32 check_cq_create_flags(u32 flags)
+{
+   /*
+* It returns non-zero value for unsupported CQ
+* create flags, otherwise it returns zero.
+*/
+   return (flags & ~IB_CQ_FLAGS_IGNORE_OVERRUN);
+}
 #endif /* MLX5_IB_H */
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 307bdbc..54ceba8 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -880,6 +880,21 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
}
}
 
+   if (init_attr->create_flags &
+   (IB_QP_CREATE_CROSS_CHANNEL |
+IB_QP_CREATE_MANAGED_SEND |
+IB_QP_CREATE_MANAGED_RECV)) {
+   if (!MLX5_CAP_GEN(mdev, cd)) {
+   mlx5_ib_dbg(dev, "cross-channel isn't supported\n");
+   return -EINVAL;
+   }
+   if (init_attr->create_flags & IB_QP_CREATE_CROSS_CHANNEL)
+   qp->flags |= MLX5_IB_QP_CROSS_CHANNEL;
+   if (init_attr->create_flags & IB_QP_CREATE_MANAGED_SEND)
+   qp->flags |= MLX5_IB_QP_MANAGED_SEND;
+   if (init_attr->create_flags &

[PATCH 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-12 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

Modify enum ib_device_cap_flags such that other patches which add new
enum values pass strict checkpatch.pl checks.

Reviewed-by: Sagi Grimberg <sa...@mellanox.com>
Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
---
 include/rdma/ib_verbs.h | 60 -
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 9a68a19..bcf40ad 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -105,24 +105,24 @@ enum rdma_link_layer {
 };
 
 enum ib_device_cap_flags {
-   IB_DEVICE_RESIZE_MAX_WR = 1,
-   IB_DEVICE_BAD_PKEY_CNTR = (1<<1),
-   IB_DEVICE_BAD_QKEY_CNTR = (1<<2),
-   IB_DEVICE_RAW_MULTI = (1<<3),
-   IB_DEVICE_AUTO_PATH_MIG = (1<<4),
-   IB_DEVICE_CHANGE_PHY_PORT   = (1<<5),
-   IB_DEVICE_UD_AV_PORT_ENFORCE= (1<<6),
-   IB_DEVICE_CURR_QP_STATE_MOD = (1<<7),
-   IB_DEVICE_SHUTDOWN_PORT = (1<<8),
-   IB_DEVICE_INIT_TYPE = (1<<9),
-   IB_DEVICE_PORT_ACTIVE_EVENT = (1<<10),
-   IB_DEVICE_SYS_IMAGE_GUID= (1<<11),
-   IB_DEVICE_RC_RNR_NAK_GEN= (1<<12),
-   IB_DEVICE_SRQ_RESIZE= (1<<13),
-   IB_DEVICE_N_NOTIFY_CQ   = (1<<14),
-   IB_DEVICE_LOCAL_DMA_LKEY= (1<<15),
-   IB_DEVICE_RESERVED  = (1<<16), /* old SEND_W_INV */
-   IB_DEVICE_MEM_WINDOW= (1<<17),
+   IB_DEVICE_RESIZE_MAX_WR = (1 << 0),
+   IB_DEVICE_BAD_PKEY_CNTR = (1 << 1),
+   IB_DEVICE_BAD_QKEY_CNTR = (1 << 2),
+   IB_DEVICE_RAW_MULTI = (1 << 3),
+   IB_DEVICE_AUTO_PATH_MIG = (1 << 4),
+   IB_DEVICE_CHANGE_PHY_PORT   = (1 << 5),
+   IB_DEVICE_UD_AV_PORT_ENFORCE= (1 << 6),
+   IB_DEVICE_CURR_QP_STATE_MOD = (1 << 7),
+   IB_DEVICE_SHUTDOWN_PORT = (1 << 8),
+   IB_DEVICE_INIT_TYPE = (1 << 9),
+   IB_DEVICE_PORT_ACTIVE_EVENT = (1 << 10),
+   IB_DEVICE_SYS_IMAGE_GUID= (1 << 11),
+   IB_DEVICE_RC_RNR_NAK_GEN= (1 << 12),
+   IB_DEVICE_SRQ_RESIZE= (1 << 13),
+   IB_DEVICE_N_NOTIFY_CQ   = (1 << 14),
+   IB_DEVICE_LOCAL_DMA_LKEY= (1 << 15),
+   IB_DEVICE_RESERVED  = (1 << 16), /* old SEND_W_INV */
+   IB_DEVICE_MEM_WINDOW= (1 << 17),
/*
 * Devices should set IB_DEVICE_UD_IP_SUM if they support
 * insertion of UDP and TCP checksum on outgoing UD IPoIB
@@ -130,18 +130,18 @@ enum ib_device_cap_flags {
 * incoming messages.  Setting this flag implies that the
 * IPoIB driver may set NETIF_F_IP_CSUM for datagram mode.
 */
-   IB_DEVICE_UD_IP_CSUM= (1<<18),
-   IB_DEVICE_UD_TSO= (1<<19),
-   IB_DEVICE_XRC   = (1<<20),
-   IB_DEVICE_MEM_MGT_EXTENSIONS= (1<<21),
-   IB_DEVICE_BLOCK_MULTICAST_LOOPBACK = (1<<22),
-   IB_DEVICE_MEM_WINDOW_TYPE_2A= (1<<23),
-   IB_DEVICE_MEM_WINDOW_TYPE_2B= (1<<24),
-   IB_DEVICE_RC_IP_CSUM= (1<<25),
-   IB_DEVICE_RAW_IP_CSUM   = (1<<26),
-   IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29),
-   IB_DEVICE_SIGNATURE_HANDOVER= (1<<30),
-   IB_DEVICE_ON_DEMAND_PAGING  = (1<<31),
+   IB_DEVICE_UD_IP_CSUM= (1 << 18),
+   IB_DEVICE_UD_TSO= (1 << 19),
+   IB_DEVICE_XRC   = (1 << 20),
+   IB_DEVICE_MEM_MGT_EXTENSIONS= (1 << 21),
+   IB_DEVICE_BLOCK_MULTICAST_LOOPBACK = (1 << 22),
+   IB_DEVICE_MEM_WINDOW_TYPE_2A= (1 << 23),
+   IB_DEVICE_MEM_WINDOW_TYPE_2B= (1 << 24),
+   IB_DEVICE_RC_IP_CSUM= (1 << 25),
+   IB_DEVICE_RAW_IP_CSUM   = (1 << 26),
+   IB_DEVICE_MANAGED_FLOW_STEERING = (1 << 29),
+   IB_DEVICE_SIGNATURE_HANDOVER= (1 << 30),
+   IB_DEVICE_ON_DEMAND_PAGING  = (1 << 31),
 };
 
 enum ib_signature_prot_cap {
-- 
1.7.12.4



Re: [PATCH] IB/usnic: Handle 0 counts in resource allocation

2015-12-11 Thread Leon Romanovsky
> Thanks for looking at the code.  I am still not understanding your point.
>> Old code:
>> usnic_vnic_res_free_cnt(vnic, type) == 0 and cnt == 1 will return EINVAL
> Yes:
> if (0 < 1 || 1 < 1 || !owner)
> return -EINVAL;
>> New code
>> usnic_vnic_res_free_cnt(vnic, type) == 0 and cnt == 1 will pass and will
>> pass the "if (cnt > 0)" check below and will decrease free_cnt variable
>> to be below zero.
> This I don't understand.  The following still fails with -EINVAL.
> if (0 < 1 || 1 < 0 || !owner)
> return -EINVAL;

Thank you for clarifying it. I don't know why I missed the first comparison.
Sorry for bothering you.
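
For the record, the two conditions discussed above can be compared side by
side with a small user-space sketch. The helper names old_check_rejects()
and new_check_rejects() are made up for this example; only the boolean
expressions are taken from the quoted patch, with the free count passed in
as a plain integer instead of calling usnic_vnic_res_free_cnt().

```c
#include <stdbool.h>

/* Old condition: rejects cnt == 0 as well as over-subscription. */
static bool old_check_rejects(int free_cnt, int cnt, const void *owner)
{
	return free_cnt < cnt || cnt < 1 || !owner;
}

/* New condition: cnt == 0 is accepted, negative counts are not. */
static bool new_check_rejects(int free_cnt, int cnt, const void *owner)
{
	return free_cnt < cnt || cnt < 0 || !owner;
}
```

With free_cnt == 0 and cnt == 1, both versions reject the request because
of the first comparison; the only case that changes is cnt == 0.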


Re: [PATCH 03/37] IB/rdmavt: Add protection domain to rdmavt.

2015-12-10 Thread Leon Romanovsky
On Thu, Dec 10, 2015 at 11:40:48AM -0500, Dennis Dalessandro wrote:
> On Tue, Dec 08, 2015 at 08:28:17AM +0200, Leon Romanovsky wrote:
> >On Mon, Dec 07, 2015 at 03:43:10PM -0500, Dennis Dalessandro wrote:
> >>+
> >>+/*
> >>+ * Things that are driver specific, module parameters in hfi1 and qib
> >>+ */
> >>+struct rvt_driver_params {
> >>+   int max_pds;
> >Can it be a negative value?
> >>+};
> 
> If so no protection domains would ever get allocated. I don't think anything
> else would break though if a driver were to do that.
In that case, it should be "unsigned int".

> 
> -Denny


Re: [PATCH 02/37] IB/rdmavt: Consolidate dma ops in rdmavt.

2015-12-10 Thread Leon Romanovsky
On Thu, Dec 10, 2015 at 11:17:09AM -0500, Dennis Dalessandro wrote:
> On Tue, Dec 08, 2015 at 08:08:21AM +0200, Leon Romanovsky wrote:
> >On Mon, Dec 07, 2015 at 03:43:06PM -0500, Dennis Dalessandro wrote:
> >>+
> >>+#define BAD_DMA_ADDRESS ((u64)0)
> >What is the advantage of using u64 values directly instead of
> >pointers? With pointers you get NULL and functions which return
> >pointers without the need for casting.
> >
> >...
> >>+static u64 rvt_dma_map_single(struct ib_device *dev, void *cpu_addr,
> >>+ size_t size, enum dma_data_direction direction)
> >>+{
> >>+   if (WARN_ON(!valid_dma_direction(direction)))
> >>+   return BAD_DMA_ADDRESS;
> >>+
> >>+   return (u64)cpu_addr;
> >>+}
> >An example of such a function.
> 
> Honestly I'm not really sure why it's done this way. We are just following
> the signature of the function in struct ib_dma_mapping_ops.
Is it worth considering implementing these functions with pointers?

> 
> -Denny


Re: Ping: [PATCH 1/1] Fix deadlock when running run_iter_bw_infinitely

2015-12-09 Thread Leon Romanovsky
On Wed, Dec 09, 2015 at 10:11:55AM +0100, Hans Westgaard Ry wrote:
> Sorry but I didn't see that "something" gobbled part of the mail.
> The correct lines are  (as in the original):
> 
> src/perftest_resources.c | 22 --
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/src/perftest_resources.c b/src/perftest_resources.c
> index cb915e0..4d8e6c0 100755
> --- a/src/perftest_resources.c
> +++ b/src/perftest_resources.c
> @@ -25,6 +25,12 @@ static enum ibv_wr_opcode opcode_atomic_array[] = 
> {IBV_WR_ATOMIC_CMP_AND_SWP,IBV
>  struct perftest_parameters* duration_param;
>  struct check_alive_data check_alive_data;
> +/*
> + * Flag to trig call to print_report_bw
> + * The flag is set in signal_handler and tested/reset in 
> run_iter_bw_infinitely
> + */
> +static volatile int trig_print_report_bw = 0;
> +
>  
> /**
>   * Beginning
> @@ -3214,7 +3220,15 @@ int run_iter_bw_infinitely(struct pingpong_context 
> *ctx,struct perftest_paramete
>   /* main loop for posting */
>   while (1) {
> -
> + if (trig_print_report_bw) {
> + duration_param->tcompleted[0] = get_cycles();
> + print_report_bw(duration_param,NULL);
> + duration_param->iters = 0;
> + alarm(duration_param->duration);
> + duration_param->tposted[0] = get_cycles();
> + trig_print_report_bw = 0;
> + }
> + 
>   /* main loop to run over all the qps and post each time n 
> messages */
>   for (index =0 ; index < num_of_qps ; index++) {
> @@ -4213,11 +4227,7 @@ void check_alive(int sig)
>   
> **/
>  void catch_alarm_infintely(int sig)
>  {
> - duration_param->tcompleted[0] = get_cycles();
> - print_report_bw(duration_param,NULL);
> - duration_param->iters = 0;
> - alarm(duration_param->duration);
> - duration_param->tposted[0] = get_cycles();
> + trig_print_report_bw = 1;
>  }
Thanks, it looks good,
Reviewed-by: Leon Romanovsky <leo...@mellanox.com>
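
The pattern used by the patch (the handler only sets a flag, the main loop
does the real work) can be sketched in isolation as below. This is a
minimal user-space illustration, not perftest code: report_pending,
reports_printed, on_alarm(), poll_report() and demo() are names invented
for the example, the counter stands in for print_report_bw(), and the flag
is declared as volatile sig_atomic_t here whereas the patch uses
"volatile int".

```c
#include <signal.h>

/* Set from the signal handler, tested and reset from the main loop. */
static volatile sig_atomic_t report_pending;

/* Stand-in for the work print_report_bw() would do. */
static int reports_printed;

/* The handler does nothing except raise the flag. */
static void on_alarm(int sig)
{
	(void)sig;
	report_pending = 1;
}

/* One pass of the main loop: do the reporting outside signal context. */
static void poll_report(void)
{
	if (report_pending) {
		report_pending = 0;
		reports_printed++;
	}
}

/* Deliver one SIGALRM and run two loop passes; only one report results. */
static int demo(void)
{
	signal(SIGALRM, on_alarm);
	raise(SIGALRM);
	poll_report();
	poll_report();
	return reports_printed;
}
```

Because nothing but the flag store runs in signal context, the handler
cannot contend with whatever the interrupted code was doing, which is the
kind of deadlock the subject line refers to.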

> 
> 
> 
> On 12/09/2015 09:52 AM, Leon Romanovsky wrote:
> >On Wed, Dec 09, 2015 at 09:33:11AM +0100, Hans Westgaard Ry wrote:
> >>Any comments on this patch ?
> >>
> >>
> >>
> >>On 11/18/2015 01:27 PM, Hans Westgaard Ry wrote:
> >>>The test uses a signal-handler to periodically print a line of report.
> >>>+  if (trig_print_report_bw) {
> >>>+  duration_param->tcompleted[0] =et_cycles();
> >>>+  print_report_bw(duration_param,NULL);
> >>>+  duration_param->iters =;
> >Excuse me, what did you mean by the line above?
> >
> >>>+  alarm(duration_param->duration);
> >>>+  duration_param->tposted[0] =et_cycles();
> >>>+  trig_print_report_bw =;
> >And this line too.
> >
> 


Re: Ping: [PATCH 1/1] Fix deadlock when running run_iter_bw_infinitely

2015-12-09 Thread Leon Romanovsky
On Wed, Dec 09, 2015 at 09:33:11AM +0100, Hans Westgaard Ry wrote:
> Any comments on this patch ?
> 
> 
> 
> On 11/18/2015 01:27 PM, Hans Westgaard Ry wrote:
> >The test uses a signal-handler to periodically print a line of report.
> >+if (trig_print_report_bw) {
> >+duration_param->tcompleted[0] =et_cycles();
> >+print_report_bw(duration_param,NULL);
> >+duration_param->iters =;
Excuse me, what did you mean by the line above?

> >+alarm(duration_param->duration);
> >+duration_param->tposted[0] =et_cycles();
> >+trig_print_report_bw =;
And this line too.



Re: [PATCH] IB/usnic: Handle 0 counts in resource allocation

2015-12-09 Thread Leon Romanovsky
On Wed, Dec 09, 2015 at 10:42:19AM -0800, Nelson Escobar wrote:
> - if (usnic_vnic_res_free_cnt(vnic, type) < cnt || cnt < 1 || !owner)
> + if (usnic_vnic_res_free_cnt(vnic, type) < cnt || cnt < 0 || !owner)
Before this change you returned EINVAL if no free_cnt were available;
now you will continue. Is this behaviour expected?

>   return ERR_PTR(-EINVAL);
>  
>   ret = kzalloc(sizeof(*ret), GFP_ATOMIC);
> @@ -247,26 +247,28 @@ usnic_vnic_get_resources(struct usnic_vnic *vnic, enum 
> usnic_vnic_res_type type,
>   return ERR_PTR(-ENOMEM);
>   }
>  
> - ret->res = kzalloc(sizeof(*(ret->res))*cnt, GFP_ATOMIC);
> - if (!ret->res) {
> - usnic_err("Failed to allocate resources for %s. Out of 
> memory\n",
> - usnic_vnic_pci_name(vnic));
> - kfree(ret);
> - return ERR_PTR(-ENOMEM);
> - }
> + if (cnt > 0) {
> + ret->res = kcalloc(cnt, sizeof(*(ret->res)), GFP_ATOMIC);
> + if (!ret->res) {
> + usnic_err("Failed to allocate resources for %s. Out of 
> memory\n",
> + usnic_vnic_pci_name(vnic));
You don't need to print OOM messages; a memory allocation failure is very
hard to miss.
> + kfree(ret);
> + return ERR_PTR(-ENOMEM);
> + }
>  
> - spin_lock(&vnic->res_lock);
> - src = &vnic->chunks[type];
> - for (i = 0; i < src->cnt && ret->cnt < cnt; i++) {
> - res = src->res[i];
> - if (!res->owner) {
> - src->free_cnt--;
> - res->owner = owner;
> - ret->res[ret->cnt++] = res;
> + spin_lock(&vnic->res_lock);
> + src = &vnic->chunks[type];
> + for (i = 0; i < src->cnt && ret->cnt < cnt; i++) {
> + res = src->res[i];
> + if (!res->owner) {
> + src->free_cnt--;
It will become negative, because the usnic_vnic_res_free_cnt check was
skipped before.
> + res->owner = owner;
> + ret->res[ret->cnt++] = res;
> + }
>   }
> - }
>  
> - spin_unlock(&vnic->res_lock);
> + spin_unlock(&vnic->res_lock);
> + }
>   ret->type = type;
>   ret->vnic = vnic;
>   WARN_ON(ret->cnt != cnt);
> @@ -281,14 +283,16 @@ void usnic_vnic_put_resources(struct 
> usnic_vnic_res_chunk *chunk)
>   int i;
>   struct usnic_vnic *vnic = chunk->vnic;
>  
> - spin_lock(&vnic->res_lock);
> - while ((i = --chunk->cnt) >= 0) {
> - res = chunk->res[i];
> - chunk->res[i] = NULL;
> - res->owner = NULL;
> - vnic->chunks[res->type].free_cnt++;
> + if (chunk->cnt > 0) {
> + spin_lock(&vnic->res_lock);
> + while ((i = --chunk->cnt) >= 0) {
> + res = chunk->res[i];
> + chunk->res[i] = NULL;
> + res->owner = NULL;
> + vnic->chunks[res->type].free_cnt++;
> + }
> + spin_unlock(&vnic->res_lock);
>   }
> - spin_unlock(&vnic->res_lock);
>  
>   kfree(chunk->res);
>   kfree(chunk);
> -- 
> 2.4.3
> 


Re: [PATCH 07/37] IB/rdmavt: Add query and modify port stubs

2015-12-08 Thread Leon Romanovsky
On Mon, Dec 07, 2015 at 03:43:28PM -0500, Dennis Dalessandro wrote:
> +/**
> + * rvt_query_port: Passes the query port call to the driver
> + * ibdev: Verbs IB dev
> + * port: port number
> + * props: structure to hold returned properties
> + *
> + * Returns 0 on success
> + */
...

> +/**
> + * rvt_modify_port
> + * @ibdev: Verbs IB dev
> + * @port: Port number
> + * @port_modify_mask: How to change the port
> + * @props: Structure to fill in
> + *
> + * Returns 0 on success
> + */

The comments differ in their format: one prefixes the parameter names
with @ and the other does not.


Re: [PATCH 02/37] IB/rdmavt: Consolidate dma ops in rdmavt.

2015-12-07 Thread Leon Romanovsky
On Mon, Dec 07, 2015 at 03:43:06PM -0500, Dennis Dalessandro wrote:
> +
> +#define BAD_DMA_ADDRESS ((u64)0)
What is the advantage of using u64 values directly instead of
pointers? With pointers you get NULL and functions which return
pointers without the need for casting.

...
> +static u64 rvt_dma_map_single(struct ib_device *dev, void *cpu_addr,
> +   size_t size, enum dma_data_direction direction)
> +{
> + if (WARN_ON(!valid_dma_direction(direction)))
> + return BAD_DMA_ADDRESS;
> +
> + return (u64)cpu_addr;
> +}
An example of such a function.


Re: [PATCH 01/37] IB/rdmavt: Create module framework and handle driver registration

2015-12-07 Thread Leon Romanovsky
On Mon, Dec 07, 2015 at 03:43:02PM -0500, Dennis Dalessandro wrote:
> +#include 
> +#include 
> +#include "vt.h"
> +
> +#define RDMAVT_DRIVER_VERSION "0.1"
Do we really need a driver version?

> +
> +MODULE_LICENSE("Dual BSD/GPL");
> +MODULE_DESCRIPTION("RDMA Verbs Transport Library");
> +MODULE_VERSION(RDMAVT_DRIVER_VERSION);
> +
> 


Re: [PATCH 03/37] IB/rdmavt: Add protection domain to rdmavt.

2015-12-07 Thread Leon Romanovsky
On Mon, Dec 07, 2015 at 03:43:10PM -0500, Dennis Dalessandro wrote:
> +
> +/*
> + * Things that are driver specific, module parameters in hfi1 and qib
> + */
> +struct rvt_driver_params {
> + int max_pds;
Can it be a negative value?
> +};


Re: [PATCH 03/37] IB/rdmavt: Add protection domain to rdmavt.

2015-12-07 Thread Leon Romanovsky
On Tue, Dec 08, 2015 at 08:28:17AM +0200, Leon Romanovsky wrote:
> On Mon, Dec 07, 2015 at 03:43:10PM -0500, Dennis Dalessandro wrote:
> > +
> > +/*
> > + * Things that are driver specific, module parameters in hfi1 and qib
> > + */
> > +struct rvt_driver_params {
> > +   int max_pds;
> Can it be a negative value?
> > +};
Forget my question. I see you removed this variable in the following patch.


Re: [PATCH 06/37] IB/rdmavt: Add query and modify device stubs

2015-12-07 Thread Leon Romanovsky
On Mon, Dec 07, 2015 at 02:39:04PM -0700, Jason Gunthorpe wrote:
> On Mon, Dec 07, 2015 at 09:26:11PM +, Hefty, Sean wrote:
> > > +static int rvt_query_device(struct ib_device *ibdev,
> > > + struct ib_device_attr *props,
> > > + struct ib_udata *uhw)
> > > +{
> > > + /*
> > > +  * Return rvt_dev_info.props contents
> > > +  */
> > > + return -EINVAL;
> > 
> > ENOSYS on all of the function templates.  This and subsequent patches.
> 
> We recently had a long discussion about what the correct answer here
> is.
> 
> ENOSYS and EINVAL are both wrong for different reasons.. Can't recall
> if something else was settled on? I think I was suggesting one of the
> ENOTSUP variants?
You suggested EOPNOTSUPP [1]

[1] http://www.spinics.net/lists/linux-rdma/msg30176.html
> 
> Jason


Re: [PATCH] IB/mlx4: use list_for_each_entry_safe in search_relocate_mgid0_group

2015-12-07 Thread Leon Romanovsky
On Mon, Dec 07, 2015 at 11:04:43PM +0800, Geliang Tang wrote:
> Simplify the code with list_for_each_entry_safe().
> 
> Signed-off-by: Geliang Tang <geliangt...@163.com>
> ---
>  drivers/infiniband/hw/mlx4/mcg.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx4/mcg.c 
> b/drivers/infiniband/hw/mlx4/mcg.c
> index 99451d8..ebdca2b 100644
> --- a/drivers/infiniband/hw/mlx4/mcg.c
> +++ b/drivers/infiniband/hw/mlx4/mcg.c
> @@ -747,14 +747,11 @@ static struct mcast_group 
> *search_relocate_mgid0_group(struct mlx4_ib_demux_ctx
>  __be64 tid,
>  union ib_gid *new_mgid)
>  {
> - struct mcast_group *group = NULL, *cur_group;
> + struct mcast_group *group = NULL, *cur_group, *n;
>   struct mcast_req *req;
> - struct list_head *pos;
> - struct list_head *n;
>  
>   mutex_lock(>mcg_table_lock);
> - list_for_each_safe(pos, n, >mcg_mgid0_list) {
> - group = list_entry(pos, struct mcast_group, mgid0_list);
> + list_for_each_entry_safe(group, n, >mcg_mgid0_list, mgid0_list) {
>   mutex_lock(&group->lock);
>   if (group->last_req_tid == tid) {
>   if (memcmp(new_mgid, &mgid0, sizeof mgid0)) {
Thanks,
Reviewed-by: Leon Romanovsky <leo...@mellanox.com>

> -- 
> 2.5.0
> 
> 


Re: [PATCH] staging/rdma/hfi1: Reduce number of parameters passed to send handlers

2015-11-11 Thread Leon Romanovsky
On Wed, Nov 11, 2015 at 08:39:08AM -0500, Dennis Dalessandro wrote:
> On Wed, Nov 11, 2015 at 08:25:35AM +0200, Leon Romanovsky wrote:
> >On Wed, Nov 11, 2015 at 12:34:37AM -0500, ira.we...@intel.com wrote:
> >>From: Dennis Dalessandro <dennis.dalessan...@intel.com>
> >>
> >>+int snoop_send_dma_handler(struct hfi1_qp *qp, struct hfi1_pkt_state *ps,
> >>+  u64 pbc)
> >> {
> >>-   pr_alert("Snooping/Capture of  Send DMA Packets Is Not Supported!\n");
> >>+   pr_alert("Snooping/Capture of Send DMA Packets Is Not Supported!\n");
> >>snoop_dbg("Unsupported Operation");
> >Is it really necessary to do both print alert to message log (pr_alert) and
> >to trace buffer (snoop_dbg)?
> 
> Well yes, and no. They serve two different purposes. One for users of the
> snoop feature the other for developers.
> 
> However, I don't think we really even need to keep this function around
> anymore. How about we just remove it in a follow on patch, and let this one
> stand as is?
Sure, I'm fine with that.
> 
> -Denny


Re: [PATCH] staging/rdma/hfi1: Reduce number of parameters passed to send handlers

2015-11-10 Thread Leon Romanovsky
On Wed, Nov 11, 2015 at 12:34:37AM -0500, ira.we...@intel.com wrote:
> From: Dennis Dalessandro 
> 
> +int snoop_send_dma_handler(struct hfi1_qp *qp, struct hfi1_pkt_state *ps,
> +u64 pbc)
>  {
> - pr_alert("Snooping/Capture of  Send DMA Packets Is Not Supported!\n");
> + pr_alert("Snooping/Capture of Send DMA Packets Is Not Supported!\n");
>   snoop_dbg("Unsupported Operation");
Is it really necessary to both print an alert to the message log (pr_alert)
and write to the trace buffer (snoop_dbg)?


Re: [PATCH for-next 3/4] IB/mlx5: Add support querying timestamp related fields in query_device

2015-11-09 Thread Leon Romanovsky
On Mon, Nov 09, 2015 at 06:30:56PM +0200, Matan Barak wrote:
> +
> + if (uhw->outlen) {
> + err = ib_copy_to_udata(uhw, &resp, resp.response_length);
> + if (err)
> + return err;
> + }
> +
>   return 0;
What do you think about rewriting this part of the code to something
like this?
+   int ret = 0;
.
+   if (uhw->outlen)
+   ret = ib_copy_to_udata(uhw, &resp, resp.response_length);
+   return ret;


Re: [PATCH for-next 2/4] IB/core: Add ib_is_udata_cleared

2015-11-09 Thread Leon Romanovsky
On Mon, Nov 09, 2015 at 06:30:55PM +0200, Matan Barak wrote:
>  
> +static inline bool ib_is_udata_cleared(struct ib_udata *udata,
> +char cleared_char,
> +size_t offset,
> +size_t len)
> +{
> + short i;
> +
> + for (i = 0; i < len; i++) {
You are comparing "len", which is declared as size_t (an unsigned type),
with "i", which is "short".
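
To make the point about the index type concrete, here is a user-space
sketch of the same scan with the index declared as size_t. It is only an
illustration under stated assumptions: buf_is_cleared(), demo_buf and
demo_last_byte_detected() are invented names, the function checks a plain
in-memory buffer rather than a struct ib_udata, and the 40000-byte size is
chosen to exceed what a 16-bit short can hold.

```c
#include <stdbool.h>
#include <stddef.h>

/* Index has the same type as len, so every byte up to len is reachable. */
static bool buf_is_cleared(const unsigned char *buf,
			   unsigned char cleared_char, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] != cleared_char)
			return false;

	return true;
}

/* Larger than SHRT_MAX on platforms where short is 16 bits wide. */
static unsigned char demo_buf[40000];

/* Dirty only the last byte and check that the scan still notices it. */
static bool demo_last_byte_detected(void)
{
	demo_buf[sizeof(demo_buf) - 1] = 1;
	return !buf_is_cleared(demo_buf, 0, sizeof(demo_buf));
}
```

A "short" index could not count up to such a length, so matching the index
type to the length type avoids the mismatch.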


Re: [PATCH 1/1] staging: rdma: hfi1 : Prefer using the BIT macro

2015-11-05 Thread Leon Romanovsky
On Thu, Nov 05, 2015 at 05:28:03PM +0530, Sunny Kumar wrote:
> This patch replaces bit shifting on 1 with the BIT(x) macro
> 
> Signed-off-by: Sunny Kumar 
> ---
>  drivers/staging/rdma/hfi1/user_sdma.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
>  /* Last packet in the request */
> -#define TXREQ_FLAGS_REQ_LAST_PKT   (1 << 0)
> -#define TXREQ_FLAGS_IOVEC_LAST_PKT (1 << 0)
> +#define TXREQ_FLAGS_REQ_LAST_PKT   BIT(1 << 0)
> +#define TXREQ_FLAGS_IOVEC_LAST_PKT BIT(1 << 0)
>  
It is wrong. It actually creates 0x2 instead of 0x1.

> -#define SDMA_PKT_Q_INACTIVE (1 << 0)
> -#define SDMA_PKT_Q_ACTIVE   (1 << 1)
> -#define SDMA_PKT_Q_DEFERRED (1 << 2)
> +#define SDMA_PKT_Q_INACTIVE BIT(1 << 0)
> +#define SDMA_PKT_Q_ACTIVE   BIT(1 << 1)
> +#define SDMA_PKT_Q_DEFERRED BIT(1 << 2)
The same comment as above applies here: BIT(1 << 1) is 0x4 and BIT(1 << 2) is 0x10.
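
A small userspace sketch of the arithmetic, assuming BIT(nr) is defined as
"1UL << (nr)" the way the kernel headers define it. The WRONG_ and RIGHT_
prefixes are mine, added only to put both variants side by side.

```c
#include <assert.h>

/* Local stand-in for the kernel's BIT() macro (assumed definition). */
#define BIT(nr) (1UL << (nr))

/* What the patch produces: the old mask value is used as the bit number. */
#define WRONG_SDMA_PKT_Q_INACTIVE BIT(1 << 0)	/* BIT(1) */
#define WRONG_SDMA_PKT_Q_ACTIVE   BIT(1 << 1)	/* BIT(2) */
#define WRONG_SDMA_PKT_Q_DEFERRED BIT(1 << 2)	/* BIT(4) */

/* A value-preserving conversion keeps only the shift count. */
#define RIGHT_SDMA_PKT_Q_INACTIVE BIT(0)
#define RIGHT_SDMA_PKT_Q_ACTIVE   BIT(1)
#define RIGHT_SDMA_PKT_Q_DEFERRED BIT(2)
```

In other words, "(1 << n)" should become "BIT(n)", not "BIT(1 << n)".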


Re: [PATCH] IB/mlx5: Postpone remove_keys under knowledge of coming preemption

2015-11-02 Thread Leon Romanovsky
On Wed, Oct 21, 2015 at 9:21 AM, Leon Romanovsky <l...@leon.nu> wrote:
> From: Leon Romanovsky <leo...@mellanox.com>
>
> The remove_keys() logic is performed as garbage collection task. Such
> task is intended to be run when no other active processes are running.
>
> The need_resched() will return TRUE if there are user tasks to be
> activated in near future.
>
> In such case, we don't execute remove_keys() and postpone
> the garbage collection work to try to run in next cycle,
> in order to free CPU resources to other tasks.
>
> The possible pseudo-code to trigger such scenario:
> 1. Allocate a lot of MR to fill the cache above the limit.
> 2. Wait a small amount of time "to calm" the system.
> 3. Start CPU extensive operations on multi-node cluster.
> 4. Expect performance degradation during MR cache shrink operation.
>
> Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
> Signed-off-by: Eli Cohen <e...@mellanox.com>
> ---
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 54a15b5..45ce00e 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -381,7 +381,19 @@
> }
> }
> } else if (ent->cur > 2 * ent->limit) {
> -   if (!someone_adding(cache) &&
> +   /*
> +* The remove_keys() logic is performed as garbage collection
> +* task. Such task is intended to be run when no other active
> +* processes are running.
> +*
> +* The need_resched() will return TRUE if there are user tasks
> +* to be activated in near future.
> +*
> +* In such case, we don't execute remove_keys() and postpone
> +* the garbage collection work to try to run in next cycle,
> +* in order to free CPU resources to other tasks.
> +*/
> +   if (!need_resched() && !someone_adding(cache) &&
> time_after(jiffies, cache->last_add + 300 * HZ)) {
> remove_keys(dev, i, 1);
> if (ent->cur > ent->limit)

Hi Doug,
Is it possible to take this patch before the RoCE patches come in?
This is a small change in the mlx5 driver which gives us a significant
performance improvement.

Thanks.


Re: [PATCH 1/4 v2] staging: ipath: ipath_driver: Use setup_timer

2015-10-27 Thread Leon Romanovsky
On Tue, Oct 27, 2015 at 11:19 AM, Dan Carpenter
<dan.carpen...@oracle.com> wrote:
> On Sun, Oct 25, 2015 at 01:21:11PM +0200, Leon Romanovsky wrote:
>> On Sun, Oct 25, 2015 at 12:17 PM, Muhammad Falak R Wani
>> <falakre...@gmail.com> wrote:
>> Please follow standard naming convention for the patches.
>> It should be [PATCH v2 1/4] and not [PATCH 1/4 v2].
>
> Does this matter?  It's in a thread so it sorts fine either way.
It would be wise for people to read the guides and follow the examples.

[1] https://www.kernel.org/doc/Documentation/SubmittingPatches


>
> regards,
> dan carpenter
>


Re: [PATCH libibverbs] Expose QP block self multicast loopback creation flag

2015-10-27 Thread Leon Romanovsky
On Tue, Oct 27, 2015 at 02:53:01PM +0200, Eran Ben Elisha wrote:
...
> +enum ibv_qp_create_flags {
> + IBV_QP_CREATE_BLOCK_SELF_MCAST_LB   = 1 << 1,
>  };
>  
I'm sure that I'm missing something important, but why did it start
from shift 1 and not shift 0?



Re: [PATCH 2195/2195] Staging: rdma: amso1100: Fix multiple assignments on single line.

2015-10-25 Thread Leon Romanovsky
On Sun, Oct 25, 2015 at 12:50 PM, Weston Silbaugh
 wrote:
> Multiple assignments should be avoided.
>
> For example:
>
> cm_event.ird = cm_event.ord = 128;
>
> Should be:
>
> cm_event.ird = 128;
> cm_event.ord = 128;
>
> Signed-off-by: Weston Silbaugh 
> ---
>  drivers/staging/rdma/amso1100/c2_ae.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/rdma/amso1100/c2_ae.c 
> b/drivers/staging/rdma/amso1100/c2_ae.c
> index eb7a92b..20d98b7 100644
> --- a/drivers/staging/rdma/amso1100/c2_ae.c
> +++ b/drivers/staging/rdma/amso1100/c2_ae.c
> @@ -173,7 +173,8 @@ void c2_ae_event(struct c2_dev *c2dev, u32 mq_index)
> resource_user_context =
> (void *) (unsigned long) wr->ae.ae_generic.user_context;
>
> -   status = cm_event.status = c2_convert_cm_status(c2_wr_get_result(wr));
> +   status = c2_convert_cm_status(c2_wr_get_result(wr));
> +   cm_event.status = c2_convert_cm_status(c2_wr_get_result(wr));
This is wrong for two reasons:
1. If the status is invalid, it will print the same error line twice:
"Unable to convert CM status".
2. A better solution would be to rewrite the patch to remove the "status"
variable altogether.
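
A minimal userspace sketch of the first point. convert_status() is a
hypothetical stand-in for c2_convert_cm_status(); the counter replaces the
log line and the error value is arbitrary. It only shows that splitting the
chained assignment by repeating the call also repeats the side effect.

```c
#include <assert.h>

static int error_lines_printed;	/* stands in for the kernel log */

/* Hypothetical converter: complains about any non-zero raw status. */
static int convert_status(int raw)
{
	if (raw == 0)
		return 0;
	error_lines_printed++;	/* "Unable to convert CM status" */
	return -5;		/* arbitrary error value for the demo */
}

struct event {
	int status;
};

/* As in the patch: the conversion and its side effect run twice. */
static int split_by_calling_twice(struct event *ev, int raw)
{
	int status = convert_status(raw);

	ev->status = convert_status(raw);
	return status;
}

/* Splitting the assignment without repeating the call. */
static int split_by_copying(struct event *ev, int raw)
{
	ev->status = convert_status(raw);
	return ev->status;
}
```

The second variant also hints at the cleanup suggested above: once
cm_event.status holds the value, a separate local copy is not needed.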

>
> pr_debug("event received c2_dev=%p, event_id=%d, "
> "resource_indicator=%d, user_context=%p, status = %d\n",
> @@ -294,7 +295,8 @@ void c2_ae_event(struct c2_dev *c2dev, u32 mq_index)
>  * Until ird/ord negotiation via MPAv2 support is added, send
>  * max supported values
>  */
> -   cm_event.ird = cm_event.ord = 128;
> +   cm_event.ird = 128;
> +   cm_event.ord = 128;
>
> if (cm_id->event_handler)
> cm_id->event_handler(cm_id, &cm_event);
> --
> 2.4.3
>
> ___
> devel mailing list
> de...@linuxdriverproject.org
> http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH 1/4 v2] staging: ipath: ipath_driver: Use setup_timer

2015-10-25 Thread Leon Romanovsky
On Sun, Oct 25, 2015 at 12:17 PM, Muhammad Falak R Wani
 wrote:
Please follow standard naming convention for the patches.
It should be [PATCH v2 1/4] and not [PATCH 1/4 v2].


[PATCH] IB/mlx5: Postpone remove_keys under knowledge of coming preemption

2015-10-21 Thread Leon Romanovsky
From: Leon Romanovsky <leo...@mellanox.com>

The remove_keys() logic is performed as garbage collection task. Such
task is intended to be run when no other active processes are running.

The need_resched() will return TRUE if there are user tasks to be
activated in near future.

In such case, we don't execute remove_keys() and postpone
the garbage collection work to try to run in next cycle,
in order to free CPU resources to other tasks.

The possible pseudo-code to trigger such scenario:
1. Allocate a lot of MR to fill the cache above the limit.
2. Wait a small amount of time "to calm" the system.
3. Start CPU extensive operations on multi-node cluster.
4. Expect performance degradation during MR cache shrink operation.

Signed-off-by: Leon Romanovsky <leo...@mellanox.com>
Signed-off-by: Eli Cohen <e...@mellanox.com>
---

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 54a15b5..45ce00e 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -381,7 +381,19 @@
}
}
} else if (ent->cur > 2 * ent->limit) {
-   if (!someone_adding(cache) &&
+   /*
+* The remove_keys() logic is performed as garbage collection
+* task. Such task is intended to be run when no other active
+* processes are running.
+*
+* The need_resched() will return TRUE if there are user tasks
+* to be activated in near future.
+*
+* In such case, we don't execute remove_keys() and postpone
+* the garbage collection work to try to run in next cycle,
+* in order to free CPU resources to other tasks.
+*/
+   if (!need_resched() && !someone_adding(cache) &&
time_after(jiffies, cache->last_add + 300 * HZ)) {
remove_keys(dev, i, 1);
if (ent->cur > ent->limit)
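
For anyone who wants to reason about the new condition outside the kernel,
here is a rough userspace model of it. The function names, the HZ value and
the flat argument list are assumptions made for this sketch; in the driver
the inputs come from need_resched(), someone_adding(), jiffies and the
cache entry.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 100UL	/* assumed tick rate for this sketch */

/* Wrap-safe "a is after b", in the style of the kernel's time_after(). */
static bool time_is_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical pure-function model of the patched shrink condition. */
static bool should_remove_keys(unsigned long cur, unsigned long limit,
			       bool resched_pending, bool someone_adding,
			       unsigned long now, unsigned long last_add)
{
	if (cur <= 2 * limit)
		return false;	/* cache is not above the shrink threshold */
	if (resched_pending)
		return false;	/* postpone: another task wants the CPU */
	if (someone_adding)
		return false;	/* the cache is still being filled */
	return time_is_after(now, last_add + 300 * HZ);
}
```

The only behavioural change from the patch is the resched_pending check:
when it is set, the garbage collection is skipped for this cycle and tried
again on the next one.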


Re: [PATCH] fix return value error

2015-10-14 Thread Leon Romanovsky
On Wed, Oct 14, 2015 at 5:17 PM, Marciniszyn, Mike
 wrote:
>> > Subject: [PATCH] fix return value error
>> >
>>
>> I checked returns in configfs (-ENOMEM), proc (-ENOENT), proc-sys (-
>> ENOMEM), ramfs (-ENOSPC), vfs (-ENOMEM).
>>
>> Not entirely consistent but this matches the majority.
>>
>> I agree -EPERM is pretty misleading.
>>
>> Acked-by: Mike Marciniszyn 
>
> Thanks Or on catching the bad directory in the patch.
Thanks Or

>
> Doug, can you fix this up or do you want it resubmitted?
>
> Mike


Re: [PATCH] IB/mlx4: Use vmalloc for WR buffers when needed

2015-10-08 Thread Leon Romanovsky
On Fri, Sep 25, 2015 at 08:51:22AM +0800, Wengang Wang wrote:
> Hi Or,
> 
> On 2015-09-24 19:57, Or Gerlitz wrote:
> >On Thu, Sep 24, 2015 at 1:49 PM, Wengang Wang  
> >wrote:
> >>@@ -786,8 +787,14 @@ static int create_qp_common(struct mlx4_ib_dev *dev, 
> >>struct ib_pd *pd,
> >> if (err)
> >> goto err_mtt;
> >>
> >>-   qp->sq.wrid  = kmalloc(qp->sq.wqe_cnt * sizeof (u64), gfp);
> >>-   qp->rq.wrid  = kmalloc(qp->rq.wqe_cnt * sizeof (u64), gfp);
> >>+   qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(u64), gfp);
> >>+   if (!qp->sq.wrid)
> >>+   qp->sq.wrid = __vmalloc(qp->sq.wqe_cnt * 
> >>sizeof(u64),
> >>+   gfp, PAGE_KERNEL);
> >On other spots of mlx4, we're using vmalloc and not __vmalloc, any
> >pros/cons for going that way too here?
> 
> vmalloc is just using GFP_KERNEL | __GFP_HIGHMEM, we can't pass in the flag
> gfp with it.  We should respect the original code, which needs to pass the flag.
Additionally, I want to draw Or's attention to the following discussion
in the MM subsystem about kmalloc/vmalloc and a general function to fall
back from one to the other.

[1] [PATCH 2/7] mm: introduce kvmalloc and kvmalloc_node
https://lkml.org/lkml/2015/7/7/548
[2] [PATCH 0/7] mm: reliable memory allocation with kvmalloc
https://lkml.org/lkml/2015/7/7/545
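
The fallback idea itself can be sketched in userspace as follows. This is
only a model: contig_alloc() and noncontig_alloc() are hypothetical
stand-ins for kmalloc() and vmalloc(), both backed by malloc(), and
CONTIG_LIMIT is an invented cap that makes the first allocator fail for
large requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CONTIG_LIMIT 4096	/* hypothetical cap on the "contiguous" path */

/* Stand-in for kmalloc(): pretend large contiguous requests fail. */
static void *contig_alloc(size_t size)
{
	return size <= CONTIG_LIMIT ? malloc(size) : NULL;
}

/* Stand-in for vmalloc(): no contiguity requirement in this model. */
static void *noncontig_alloc(size_t size)
{
	return malloc(size);
}

/* Try the cheap allocator first and fall back only when it fails. */
static void *alloc_with_fallback(size_t size, bool *used_fallback)
{
	void *p = contig_alloc(size);

	*used_fallback = false;
	if (!p) {
		p = noncontig_alloc(size);
		*used_fallback = (p != NULL);
	}
	return p;
}
```

In the kernel the free path also has to pick kfree() or vfree() to match,
which this model glosses over because both stand-ins use malloc().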

> 
> thanks,
> wengang
> 


Re: [PATCH] IB/mlx4: Use vmalloc for WR buffers when needed

2015-10-08 Thread Leon Romanovsky
On Thu, Oct 8, 2015 at 9:14 AM, Or Gerlitz <ogerl...@mellanox.com> wrote:
> On 10/8/2015 9:06 AM, Leon Romanovsky wrote:
>>
>> Additionally, I want to spot Or's attention on the following discussion
>> in MM-subsystem about kmalloc/vmalloc and general function to fallback
>> from one to another.
>>
>>
>
> too busy to read them now, if you have review comments speak now and provide
> them to the author.
My comment to the author is that, from my point of view, this patch is a
correct fix for the current behaviour.

A more general solution (I doubt it is feasible) is to decrease
the dependency on high-order allocations.

>


Re: [PATCH] IPoIB: serialize changing on tx_outstanding

2015-10-08 Thread Leon Romanovsky
>>> +   spin_lock_irqsave(&priv->lock, flags);
>>> +   --priv->tx_outstanding;
>>> +   if (netif_queue_stopped(dev))
>>> +   netif_wake_queue(dev);
>>> +   spin_unlock_irqrestore(&priv->lock, flags);
>>
>> Why are you locking the netif_* calls?
>
>
> Yes, I intended to do that.   This make the accessing on tx_outstanding and
> the reopening of the send queue in the same atomic session which is the
> expected behavior.
> Otherwise,  we may have the following problem:
> #time order
>
> thread1(on cpu1) thread2(on cpu2)
> lock
> modify/check tx_outstanding
> unlock
>
>
> lock
> modify/check tx_outstanding
> unlock
>
> reopen queue
>
>
>stop queue
>
>
> So that we actually want reopen the send queue, but the result is we stopped
> it.
Thanks for the explanation.
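
For the archive, here is how I read the time-order diagram above, written
as a sequential userspace model rather than real threads. The struct, the
function names and SENDQ_SIZE are invented for this sketch; each function
replays one fixed interleaving of a send path and an error path.

```c
#include <assert.h>
#include <stdbool.h>

#define SENDQ_SIZE 4	/* hypothetical ring size */

struct txq {
	int outstanding;
	bool stopped;
};

/* Only the counter is updated under the lock, so the wake from the
 * error path can land before the stop from the send path. */
static struct txq counter_only_locked(void)
{
	struct txq q = { SENDQ_SIZE - 1, false };
	bool full;

	full = (++q.outstanding == SENDQ_SIZE);	/* sender: lock, count, unlock */
	--q.outstanding;			/* error path: lock, count, unlock */
	q.stopped = false;			/* error path: reopen queue */
	if (full)
		q.stopped = true;		/* sender: stop queue, too late */
	return q;
}

/* The same two operations, each done as one critical section. */
static struct txq counter_and_queue_locked(void)
{
	struct txq q = { SENDQ_SIZE - 1, false };

	if (++q.outstanding == SENDQ_SIZE)	/* sender: count and stop */
		q.stopped = true;
	--q.outstanding;			/* error path: count ... */
	if (q.stopped)
		q.stopped = false;		/* ... and wake */
	return q;
}
```

In the first replay the ring ends up with a free slot but a stopped queue,
which is the outcome described above. In the second the counter and the
queue state stay in step.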

>
> thanks,
> wengang
>


Re: [PATCH] IPoIB: serialize changing on tx_outstanding

2015-10-07 Thread Leon Romanovsky
On Mon, Sep 28, 2015 at 01:42:10PM +0800, Wengang Wang wrote:
> The changing on tx_outstanding should be protected by spinlock or to be
> atomic operations.
> 
> Such log is found in dmesg:
> 
> Sep 16 14:20:53 naep11x06 kernel: ib0: queue stopped 1, tx_head 1034733, 
> tx_tail 1034733, tx_outstanding 359 ipoib_sendq_size: 512
> Sep 16 14:21:33 naep11x06 kernel: ib0: transmit timeout: latency 9560 msecs
> Sep 16 14:21:33 naep11x06 kernel: ib0: queue stopped 1, tx_head 1034854, 
> tx_tail 1034854, tx_outstanding 511 ipoib_sendq_size: 512
> Sep 16 14:21:38 naep11x06 kernel: ib0: transmit timeout: latency 14568 msecs
> Sep 16 14:21:38 naep11x06 kernel: ib0: queue stopped 1, tx_head 1034854, 
> tx_tail 1034854, tx_outstanding 511 ipoib_sendq_size: 512
> 
> And the send queue of ib0 kept full. When transmit timeout is reported,
> queue is reported as "stopped", but the IPoIB stuff tx_head and tx_tail
> points to same value. I am not able to see such numbers in ipoib_cm_tx
> (for CM) because I have no vmcore. Though I am not quite sure it's caused
> by parallel access of tx_outstanding (send path vs. interrupt path), we really
> need to serialize the changes to tx_outstanding.
> 
> This patch also make sure the increase of tx_outstanding prior to the
> calling of post_send to avoid the possible decreasing before increasing in
> case the running of increasing is scheduled later than the interrupt
> handler.
> 
> Signed-off-by: Wengang Wang 
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_cm.c | 40 
> +++--
>  drivers/infiniband/ulp/ipoib/ipoib_ib.c | 24 ++--
>  2 files changed, 50 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
> b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> index c78dc16..044da94 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> @@ -710,6 +710,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff 
> *skb, struct ipoib_cm_
>   struct ipoib_dev_priv *priv = netdev_priv(dev);
>   struct ipoib_tx_buf *tx_req;
>   int rc;
> + unsigned long flags;
>  
>   if (unlikely(skb->len > tx->mtu)) {
>   ipoib_warn(priv, "packet len %d (> %d) too long to send, 
> dropping\n",
> @@ -742,27 +743,36 @@ void ipoib_cm_send(struct net_device *dev, struct 
> sk_buff *skb, struct ipoib_cm_
>   skb_orphan(skb);
>   skb_dst_drop(skb);
>  
> + spin_lock_irqsave(&priv->lock, flags);
> + if (++priv->tx_outstanding == ipoib_sendq_size) {
> + ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net 
> queue\n",
> +   tx->qp->qp_num);
> + netif_stop_queue(dev);
> + }
> + spin_unlock_irqrestore(&priv->lock, flags);
> + if (netif_queue_stopped(dev)) {
> + rc = ib_req_notify_cq(priv->send_cq,
> + IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
> + if (rc < 0)
> + ipoib_warn(priv, "request notify on send CQ failed\n");
> + else if (rc)
> + ipoib_send_comp_handler(priv->send_cq, dev);
> + }
> +
>   rc = post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), tx_req);
>   if (unlikely(rc)) {
>   ipoib_warn(priv, "post_send failed, error %d\n", rc);
>   ++dev->stats.tx_errors;
> + spin_lock_irqsave(&priv->lock, flags);
> + --priv->tx_outstanding;
> + if (netif_queue_stopped(dev))
> + netif_wake_queue(dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
Why are you locking the netif_* calls?
>   ipoib_dma_unmap_tx(priv, tx_req);
>   dev_kfree_skb_any(skb);
>   } else {
>   dev->trans_start = jiffies;
>   ++tx->tx_head;
> -
> - if (++priv->tx_outstanding == ipoib_sendq_size) {
> - ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net 
> queue\n",
> -   tx->qp->qp_num);
> - netif_stop_queue(dev);
> - rc = ib_req_notify_cq(priv->send_cq,
> - IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
> - if (rc < 0)
> - ipoib_warn(priv, "request notify on send CQ 
> failed\n");
> - else if (rc)
> - ipoib_send_comp_handler(priv->send_cq, dev);
> - }
>   }
>  }
>  
> @@ -796,10 +806,13 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, 
> struct ib_wc *wc)
>   netif_tx_lock(dev);
>  
>   ++tx->tx_tail;
> +
> + spin_lock_irqsave(&priv->lock, flags);
>   if (unlikely(--priv->tx_outstanding == ipoib_sendq_size >> 1) &&
>   netif_queue_stopped(dev) &&
>   test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
>   netif_wake_queue(dev);
> + spin_unlock_irqrestore(&priv->lock, flags);
>  
>   if 

Re: [PATCH v2 16/16] NFS: Enable client side NFSv4.1 backchannel to use other transports

2015-10-07 Thread Leon Romanovsky
On Tue, Oct 6, 2015 at 6:00 PM, Chuck Lever  wrote:
> Pass the correct backchannel transport class to svc_create_xprt()
> when setting up an NFSv4.1 backchannel transport.
>
> Signed-off-by: Chuck Lever 
> ---
>  fs/nfs/callback.c   |   33 +
>  include/linux/sunrpc/xprt.h |1 +
>  net/sunrpc/xprtrdma/transport.c |1 +
>  net/sunrpc/xprtsock.c   |1 +
>  4 files changed, 24 insertions(+), 12 deletions(-)
>
> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> index 75f7c0a..46ed2c5 100644
> --- a/fs/nfs/callback.c
> +++ b/fs/nfs/callback.c
> @@ -99,15 +99,22 @@ nfs4_callback_up(struct svc_serv *serv)
>  }
>
>  #if defined(CONFIG_NFS_V4_1)
> -static int nfs41_callback_up_net(struct svc_serv *serv, struct net *net)
> +/*
> + * Create an svc_sock for the back channel service that shares the
> + * fore channel connection.
> + * Returns the input port (0) and sets the svc_serv bc_xprt on success
> + */
> +static int nfs41_callback_up_net(struct svc_serv *serv, struct net *net,
> +struct rpc_xprt *xprt)
>  {
> -   /*
> -* Create an svc_sock for the back channel service that shares the
> -* fore channel connection.
> -* Returns the input port (0) and sets the svc_serv bc_xprt on success
> -*/
> -   return svc_create_xprt(serv, "tcp-bc", net, PF_INET, 0,
> - SVC_SOCK_ANONYMOUS);
> +   int ret = -EPROTONOSUPPORT;
> +
> +   if (xprt->bc_name)
> +   ret = svc_create_xprt(serv, xprt->bc_name, net, PF_INET, 0,
> + SVC_SOCK_ANONYMOUS);
> +   dprintk("NFS: svc_create_xprt(%s) returned %d\n",
> +   xprt->bc_name, ret);

Maybe we should convert it to the more general and common pr_debug(..)
interface [1], even though the other parts of the code already use the
dprintk macro?

> +   return ret;
>  }
>
>  /*
> @@ -184,7 +191,8 @@ static inline void nfs_callback_bc_serv(u32 minorversion, 
> struct rpc_xprt *xprt,
> xprt->bc_serv = serv;
>  }
>  #else
> -static int nfs41_callback_up_net(struct svc_serv *serv, struct net *net)
> +static int nfs41_callback_up_net(struct svc_serv *serv, struct net *net,
> +struct rpc_xprt *xprt)
>  {
> return 0;
>  }
> @@ -259,7 +267,8 @@ static void nfs_callback_down_net(u32 minorversion, 
> struct svc_serv *serv, struc
> svc_shutdown_net(serv, net);
>  }
>
> -static int nfs_callback_up_net(int minorversion, struct svc_serv *serv, 
> struct net *net)
> +static int nfs_callback_up_net(int minorversion, struct svc_serv *serv,
> +  struct net *net, struct rpc_xprt *xprt)
>  {
> struct nfs_net *nn = net_generic(net, nfs_net_id);
> int ret;
> @@ -281,7 +290,7 @@ static int nfs_callback_up_net(int minorversion, struct 
> svc_serv *serv, struct n
> break;
> case 1:
> case 2:
> -   ret = nfs41_callback_up_net(serv, net);
> +   ret = nfs41_callback_up_net(serv, net, xprt);
> break;
> default:
> printk(KERN_ERR "NFS: unknown callback version: %d\n",
It can be pr_err(..) [1].

> @@ -364,7 +373,7 @@ int nfs_callback_up(u32 minorversion, struct rpc_xprt 
> *xprt)
> goto err_create;
> }
>
> -   ret = nfs_callback_up_net(minorversion, serv, net);
> +   ret = nfs_callback_up_net(minorversion, serv, net, xprt);
> if (ret < 0)
> goto err_net;
>
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index 82c0839..88bae18 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -170,6 +170,7 @@ struct rpc_xprt {
> struct sockaddr_storage addr;   /* server address */
> size_t  addrlen;/* size of server address */
> int prot;   /* IP protocol */
> +   char*bc_name;   /* backchannel transport */
>
> unsigned long   cong;   /* current congestion */
> unsigned long   cwnd;   /* congestion window */
> diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
> index 845278e..c60bbc8 100644
> --- a/net/sunrpc/xprtrdma/transport.c
> +++ b/net/sunrpc/xprtrdma/transport.c
> @@ -337,6 +337,7 @@ xprt_setup_rdma(struct xprt_create *args)
> /* Ensure xprt->addr holds valid server TCP (not RDMA)
>  * address, for any side protocols which peek at it */
> xprt->prot = IPPROTO_TCP;
> +   xprt->bc_name = "rdma-bc";
> xprt->addrlen = args->addrlen;
> memcpy(&xprt->addr, sap, xprt->addrlen);
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index