Re: Seeing WARN_ON in ib_dealloc_pd from ipoib in kernel 4.3-rc1-debug

2015-10-12 Thread Sagi Grimberg



The following fixup patch is needed:



Subject: ipoib: For sendonly join free the multicast group on leave

When we leave the multicast group on expiration of a neighbor we
do not free the mcast structure. This results in a memory leak.

Signed-off-by: Christoph Lameter 

Index: linux/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- linux.orig/drivers/infiniband/ulp/ipoib/ipoib.h
+++ linux/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -495,6 +495,7 @@ void ipoib_dev_cleanup(struct net_device
  void ipoib_mcast_join_task(struct work_struct *work);
  void ipoib_mcast_carrier_on_task(struct work_struct *work);
  void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb);
+void ipoib_mcast_free(struct ipoib_mcast *mc);

  void ipoib_mcast_restart_task(struct work_struct *work);
  int ipoib_mcast_start_thread(struct net_device *dev);
Index: linux/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- linux.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ linux/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1207,8 +1207,10 @@ static void __ipoib_reap_neigh(struct ip

  out_unlock:
spin_unlock_irqrestore(&priv->lock, flags);
-   list_for_each_entry_safe(mcast, tmcast, &remove_list, list)
+   list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
ipoib_mcast_leave(dev, mcast);
+   ipoib_mcast_free(mcast);
+   }
  }

  static void ipoib_reap_neigh(struct work_struct *work)
Index: linux/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===
--- linux.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ linux/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -106,7 +106,7 @@ static void __ipoib_mcast_schedule_join_
queue_delayed_work(priv->wq, &priv->mcast_task, 0);
  }

-static void ipoib_mcast_free(struct ipoib_mcast *mcast)
+void ipoib_mcast_free(struct ipoib_mcast *mcast)
  {
struct net_device *dev = mcast->dev;
int tx_dropped = 0;




Hey Christoph,

Thanks for the quick patch. When you re-spin this as
a proper patch you can add my:

Tested-by: Sagi Grimberg 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Possible circular locking when unloading device driver

2015-10-12 Thread Sagi Grimberg

Hey,

I stepped on this lockdep circular locking complaint on 4.3-rc when
unloading the device driver (mlx5 in my case). Has anyone seen this?

I have seen such warnings with kernfs_mutex when deleting iscsi
devices on the fly.

I wonder if kernfs_remove() should use mutex_lock_nested?

output:
kernel:  (s_active#78){.+}, at: [] 
kernfs_remove+0x27/0x40

kernel:
but task is already holding lock:
kernel:  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
kernel:
which lock already depends on the new lock.
kernel:
the existing dependency chain (in reverse order) is:
kernel:
-> #1 (rtnl_mutex){+.+.+.}:
kernel:[] __lock_acquire+0xc1f/0x1090
kernel:[] lock_acquire+0xd3/0x1f0
kernel:[] mutex_lock_nested+0x60/0x3a0
kernel:[] rtnl_lock+0x17/0x20
kernel:[] ipoib_set_mode+0x96/0xf0 [ib_ipoib]
kernel:[] set_mode+0x3b/0x80 [ib_ipoib]
kernel:[] dev_attr_store+0x20/0x30
kernel:[] sysfs_kf_write+0x4f/0x70
kernel:[] kernfs_fop_write+0x153/0x180
kernel:[] __vfs_write+0x34/0xf0
kernel:[] vfs_write+0xaa/0x120
kernel:[] SyS_write+0x5d/0xc0
kernel:[] entry_SYSCALL_64_fastpath+0x12/0x76
kernel:
-> #0 (s_active#78){.+}:
kernel:[] check_prev_add+0x527/0x560
kernel:[] __lock_acquire+0xc1f/0x1090
kernel:[] lock_acquire+0xd3/0x1f0
kernel:[] __kernfs_remove+0x2b3/0x390
kernel:[] kernfs_remove+0x27/0x40
kernel:[] sysfs_remove_dir+0x5a/0x90
kernel:[] kobject_del+0x22/0x60
kernel:[] device_del+0x192/0x220
kernel:[] netdev_unregister_kobject+0x71/0x80
kernel:[] rollback_registered_many+0x1e1/0x2c0
kernel:[] rollback_registered+0x31/0x40
kernel:[] unregister_netdevice_queue+0x58/0xb0
kernel:[] unregister_netdev+0x20/0x30
kernel:[] ipoib_remove_one+0xa1/0xe0 [ib_ipoib]
kernel:[] ib_unregister_device+0xc1/0x160 
[ib_core]

kernel:[] mlx5_ib_remove+0x19/0x50 [mlx5_ib]
kernel:[] mlx5_remove_device+0x68/0x80 [mlx5_core]
kernel:[] mlx5_unregister_interface+0x3e/0x70 
[mlx5_core]

kernel:[] mlx5_ib_cleanup+0x10/0x814 [mlx5_ib]
kernel:[] SyS_delete_module+0x17a/0x1c0
kernel:[] entry_SYSCALL_64_fastpath+0x12/0x76
kernel:
other info that might help us debug this:
kernel:  Possible unsafe locking scenario:
kernel:        CPU0                    CPU1
kernel:        ----                    ----
kernel:   lock(rtnl_mutex);
kernel:                                lock(s_active#78);
kernel:                                lock(rtnl_mutex);
kernel:   lock(s_active#78);
kernel:
 *** DEADLOCK ***
kernel: 4 locks held by modprobe/1662:
kernel:  #0:  (intf_mutex){+.+.+.}, at: [] 
mlx5_unregister_interface+0x1d/0x70 [mlx5_core]
kernel:  #1:  (device_mutex){+.+.+.}, at: [] 
ib_unregister_device+0x2f/0x160 [ib_core]
kernel:  #2:  (lists_rwsem){+.}, at: [] 
ib_unregister_device+0x43/0x160 [ib_core]
kernel:  #3:  (rtnl_mutex){+.+.+.}, at: [] 
rtnl_lock+0x17/0x20

kernel:
stack backtrace:
kernel: CPU: 3 PID: 1662 Comm: modprobe Tainted: G L 
4.3.0-rc3-debug+ #67

kernel: Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013
kernel:  820fb120 88080b62f998 8129915b 
kernel:  8215fe60 88080b62f9e8 810bd7dd 880810ee2d00
kernel:  88080b62fa08 880810ee3458 880810ee3430 880810ee3458
kernel: Call Trace:
kernel:  [] dump_stack+0x4f/0x74
kernel:  [] print_circular_bug+0x20d/0x310
kernel:  [] check_prev_add+0x527/0x560
kernel:  [] __lock_acquire+0xc1f/0x1090
kernel:  [] lock_acquire+0xd3/0x1f0
kernel:  [] ? kernfs_remove+0x27/0x40
kernel:  [] ? trace_hardirqs_on+0xd/0x10
kernel:  [] __kernfs_remove+0x2b3/0x390
kernel:  [] ? kernfs_remove+0x27/0x40
kernel:  [] ? trace_hardirqs_on+0xd/0x10
kernel:  [] ? kernfs_remove+0x1f/0x40
kernel:  [] ? sysfs_remove_dir+0x3e/0x90
kernel:  [] ? __mutex_unlock_slowpath+0xc7/0x190
kernel:  [] kernfs_remove+0x27/0x40
kernel:  [] sysfs_remove_dir+0x5a/0x90
kernel:  [] kobject_del+0x22/0x60
kernel:  [] device_del+0x192/0x220
kernel:  [] netdev_unregister_kobject+0x71/0x80
kernel:  [] rollback_registered_many+0x1e1/0x2c0
kernel:  [] rollback_registered+0x31/0x40
kernel:  [] unregister_netdevice_queue+0x58/0xb0
kernel:  [] unregister_netdev+0x20/0x30
kernel:  [] ipoib_remove_one+0xa1/0xe0 [ib_ipoib]
kernel:  [] ib_unregister_device+0xc1/0x160 [ib_core]
kernel:  [] mlx5_ib_remove+0x19/0x50 [mlx5_ib]
kernel:  [] mlx5_remove_device+0x68/0x80 [mlx5_core]
kernel:  [] mlx5_unregister_interface+0x3e/0x70 
[mlx5_core]

kernel:  [] mlx5_ib_cleanup+0x10/0x814 [mlx5_ib]
kernel:  [] SyS_delete_module+0x17a/0x1c0
kernel:  [] ? trace_hardirqs_on_thunk+0x17/0x19
kernel:  [] ? generic_show_options+0x180/0x180
kernel:  [] entry_SYSCALL_64_fastpath+0x12/0x76

[PATCH] IB: merge struct ib_device_attr into struct ib_device

2015-10-12 Thread Christoph Hellwig
Avoid the need to query for device attributes and store them in a
separate structure by merging struct ib_device_attr into struct
ib_device.  This matches how the device structures are used in most
Linux subsystems.

Signed-off-by: Christoph Hellwig 
---
 drivers/infiniband/core/cm.c   |  12 +-
 drivers/infiniband/core/cma.c  |   8 -
 drivers/infiniband/core/device.c   |  20 ---
 drivers/infiniband/core/fmr_pool.c |  20 +--
 drivers/infiniband/core/sysfs.c|  14 +-
 drivers/infiniband/core/uverbs_cmd.c   | 128 +++-
 drivers/infiniband/core/verbs.c|   8 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c|  60 +++-
 drivers/infiniband/hw/cxgb4/provider.c |  64 +++-
 drivers/infiniband/hw/mlx4/main.c  | 169 -
 drivers/infiniband/hw/mlx5/main.c  | 116 ++
 drivers/infiniband/hw/mthca/mthca_provider.c   |  77 +-
 drivers/infiniband/hw/nes/nes_verbs.c  |  94 +---
 drivers/infiniband/hw/ocrdma/ocrdma_main.c |  40 -
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c|  49 --
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h|   2 -
 drivers/infiniband/hw/qib/qib_verbs.c  |  86 +--
 drivers/infiniband/hw/usnic/usnic_ib_main.c|   3 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c   |  50 ++
 drivers/infiniband/hw/usnic/usnic_ib_verbs.h   |   4 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c|  19 +--
 drivers/infiniband/ulp/ipoib/ipoib_ethtool.c   |  14 +-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |  21 +--
 drivers/infiniband/ulp/iser/iscsi_iser.c   |   4 +-
 drivers/infiniband/ulp/iser/iscsi_iser.h   |   2 -
 drivers/infiniband/ulp/iser/iser_memory.c  |   9 +-
 drivers/infiniband/ulp/iser/iser_verbs.c   |  38 ++---
 drivers/infiniband/ulp/isert/ib_isert.c|  43 ++
 drivers/infiniband/ulp/isert/ib_isert.h|   1 -
 drivers/infiniband/ulp/srp/ib_srp.c|  32 ++--
 drivers/infiniband/ulp/srpt/ib_srpt.c  |  15 +-
 drivers/infiniband/ulp/srpt/ib_srpt.h  |   3 -
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c|  21 +--
 drivers/staging/rdma/amso1100/c2.h |   3 -
 drivers/staging/rdma/amso1100/c2_pd.c  |   6 +-
 drivers/staging/rdma/amso1100/c2_provider.c|  23 +--
 drivers/staging/rdma/amso1100/c2_rnic.c|  63 +++-
 drivers/staging/rdma/ehca/ehca_hca.c   |  78 +-
 drivers/staging/rdma/ehca/ehca_iverbs.h|   3 +-
 drivers/staging/rdma/ehca/ehca_main.c  |   3 +-
 drivers/staging/rdma/hfi1/verbs.c  |  89 +--
 drivers/staging/rdma/ipath/ipath_verbs.c   |  90 +--
 include/rdma/ib_verbs.h|  98 ++--
 net/rds/ib.c   |  28 +---
 net/rds/iw.c   |  23 +--
 net/sunrpc/xprtrdma/frwr_ops.c |   7 +-
 net/sunrpc/xprtrdma/svc_rdma_transport.c   |  48 +++---
 net/sunrpc/xprtrdma/verbs.c|  24 +--
 net/sunrpc/xprtrdma/xprt_rdma.h|   1 -
 49 files changed, 725 insertions(+), 1108 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index ea4db9c..56c7a70 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3749,16 +3749,6 @@ int ib_cm_init_qp_attr(struct ib_cm_id *cm_id,
 }
 EXPORT_SYMBOL(ib_cm_init_qp_attr);
 
-static void cm_get_ack_delay(struct cm_device *cm_dev)
-{
-   struct ib_device_attr attr;
-
-   if (ib_query_device(cm_dev->ib_device, &attr))
-   cm_dev->ack_delay = 0; /* acks will rely on packet life time */
-   else
-   cm_dev->ack_delay = attr.local_ca_ack_delay;
-}
-
 static ssize_t cm_show_counter(struct kobject *obj, struct attribute *attr,
   char *buf)
 {
@@ -3870,7 +3860,7 @@ static void cm_add_one(struct ib_device *ib_device)
return;
 
cm_dev->ib_device = ib_device;
-   cm_get_ack_delay(cm_dev);
+   cm_dev->ack_delay = ib_device->local_ca_ack_delay;
cm_dev->going_down = 0;
cm_dev->device = device_create(&cm_class, &ib_device->dev,
   MKDEV(0, 0), NULL,
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index b1ab13f..077c4e2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1847,7 +1847,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
struct rdma_id_private *listen_id, *conn_id;
struct rdma_cm_event event;
int ret;
-   struct ib_device_attr attr;
struct sockaddr *laddr = (struct 

merge struct ib_device_attr into struct ib_device V2

2015-10-12 Thread Christoph Hellwig
This patch gets rid of struct ib_device_attr and cleans up drivers nicely.

It goes on top of my send_wr cleanups and the memory registration updates
from Sagi.

Changes since V1:
 - rebased on top of Sagi's latest reg_api.6 branch



Re: merge struct ib_device_attr into struct ib_device V2

2015-10-12 Thread Sagi Grimberg

On 10/12/2015 9:57 AM, Christoph Hellwig wrote:

This patch gets rid of struct ib_device_attr and cleans up drivers nicely.

It goes on top of my send_wr cleanups and the memory registration updates
from Sagi.

Changes since V1:
  - rebased on top of Sagi's latest reg_api.6 branch



Christoph,

First go with this looks OK for mlx4. mlx5 needs the below incremental
patch to be folded in.

We need dev->ib_dev.max_pkeys set when get_port_caps() is called.

--
diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c

index 67b979f..5b73322 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1321,6 +1321,10 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)

dev->mdev = mdev;

+   err = mlx5_ib_init_device_flags(&dev->ib_dev);
+   if (err)
+   goto err_dealloc;
+
err = get_port_caps(dev);
if (err)
goto err_dealloc;
@@ -1433,10 +1437,6 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
if (err)
goto err_rsrc;

-   err = mlx5_ib_init_device_flags(&dev->ib_dev);
-   if (err)
-   goto err_rsrc;
-
err = ib_register_device(&dev->ib_dev, NULL);
if (err)
goto err_odp;
--


[PATCH v4 04/26] IB/mlx4: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in mlx4_ib_mr and populating it when
mlx4_ib_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by setting the exact WQE as IB_WR_FAST_REG_MR, just take the
needed information from different places:
- page_size, iova, length, access flags (ib_mr)
- page array (mlx4_ib_mr)
- key (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once all
the ULPs are converted.

Signed-off-by: Sagi Grimberg 
Tested-by: Christoph Hellwig 
---
 drivers/infiniband/hw/mlx4/cq.c  |   1 +
 drivers/infiniband/hw/mlx4/main.c|   1 +
 drivers/infiniband/hw/mlx4/mlx4_ib.h |  10 
 drivers/infiniband/hw/mlx4/mr.c  | 101 ---
 drivers/infiniband/hw/mlx4/qp.c  |  25 +
 5 files changed, 132 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 5fd49f9435f9..2ea4125b7903 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -819,6 +819,7 @@ repoll:
break;
case MLX4_OPCODE_FMR:
wc->opcode= IB_WC_FAST_REG_MR;
+   /* TODO: wc->opcode= IB_WC_REG_MR; */
break;
case MLX4_OPCODE_LOCAL_INVAL:
wc->opcode= IB_WC_LOCAL_INV;
diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 38be8dc2932e..19191ac0783c 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2249,6 +2249,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.rereg_user_mr = mlx4_ib_rereg_user_mr;
ibdev->ib_dev.dereg_mr  = mlx4_ib_dereg_mr;
ibdev->ib_dev.alloc_mr  = mlx4_ib_alloc_mr;
+   ibdev->ib_dev.map_mr_sg = mlx4_ib_map_mr_sg;
ibdev->ib_dev.alloc_fast_reg_page_list = 
mlx4_ib_alloc_fast_reg_page_list;
ibdev->ib_dev.free_fast_reg_page_list  = 
mlx4_ib_free_fast_reg_page_list;
ibdev->ib_dev.attach_mcast  = mlx4_ib_mcg_attach;
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h 
b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 1e7b23bb2eb0..d6214577ecf8 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -129,10 +129,17 @@ struct mlx4_ib_cq {
struct list_headrecv_qp_list;
 };
 
+#define MLX4_MR_PAGES_ALIGN 0x40
+
 struct mlx4_ib_mr {
struct ib_mribmr;
+   __be64  *pages;
+   dma_addr_t  page_map;
+   u32 npages;
+   u32 max_pages;
struct mlx4_mr  mmr;
struct ib_umem *umem;
+   void*pages_alloc;
 };
 
 struct mlx4_ib_mw {
@@ -706,6 +713,9 @@ int mlx4_ib_dealloc_mw(struct ib_mw *mw);
 struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd,
   enum ib_mr_type mr_type,
   u32 max_num_sg);
+int mlx4_ib_map_mr_sg(struct ib_mr *ibmr,
+ struct scatterlist *sg,
+ unsigned int sg_nents);
 struct ib_fast_reg_page_list *mlx4_ib_alloc_fast_reg_page_list(struct 
ib_device *ibdev,
   int 
page_list_len);
 void mlx4_ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list);
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 5bba176e9dfa..96fc7ed99fb8 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -59,7 +59,7 @@ struct ib_mr *mlx4_ib_get_dma_mr(struct ib_pd *pd, int acc)
struct mlx4_ib_mr *mr;
int err;
 
-   mr = kmalloc(sizeof *mr, GFP_KERNEL);
+   mr = kzalloc(sizeof(*mr), GFP_KERNEL);
if (!mr)
return ERR_PTR(-ENOMEM);
 
@@ -140,7 +140,7 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 
start, u64 length,
int err;
int n;
 
-   mr = kmalloc(sizeof *mr, GFP_KERNEL);
+   mr = kzalloc(sizeof(*mr), GFP_KERNEL);
if (!mr)
return ERR_PTR(-ENOMEM);
 
@@ -271,11 +271,59 @@ release_mpt_entry:
return err;
 }
 
+static int
+mlx4_alloc_priv_pages(struct ib_device *device,
+ struct mlx4_ib_mr *mr,
+ int max_pages)
+{
+   int size = max_pages * sizeof(u64);
+   int add_size;
+   int ret;
+
+   add_size = max_t(int, MLX4_MR_PAGES_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
+
+   mr->pages_alloc = kzalloc(size + add_size, GFP_KERNEL);
+   if (!mr->pages_alloc)
+   return -ENOMEM;
+
+   mr->pages = PTR_ALIGN(mr->pages_alloc, MLX4_MR_PAGES_ALIGN);
+
+   mr->page_map = dma_map_single(device->dma_device, mr->pages,
+ size, DMA_TO_DEVICE);
+
+   if 

[PATCH v4 00/26] New fast registration API

2015-10-12 Thread Sagi Grimberg
Hi all,

As discussed on the linux-rdma list, there is plenty of room for
improvement in our memory registration APIs. We keep finding
ULPs that duplicate code, sometimes use wrong strategies,
and misuse our current API.

As a first step, this patch set replaces the fast registration API
with one that accepts a kernel-common struct scatterlist and takes
care of the page vector construction in the core layer, with hooks
for the drivers' HW-specific assignments. This removes code
duplication that used to be repeated in each and every ULP driver.

Changes from v3:
- Addressed some xprtrdma comments (Chuck)
- Removed xprtrdma change-log paragraph (Or)

Changes from v2:
- Fixed alignment for page lists allocations in mlx4, mlx5 (Bart)
- Rebased against Doug's for-4.4 tree (4.3.0-rc1) + 4.3-rc fixes
- Added Acked/Tested tags

Changes from v1:
- Add ib_map_mr_sg_zbva() for RDS which uses it (preferred it over
  polluting the API).
- Replaced coherent allocations in mlx4, mlx5 with DMA streaming
  APIs (Bart)
- Changed ib_map_mr_sg description (Bart)
- Split SRP driver patches (Bart)
- Added missing wr->next = NULL from various ULPs (Steve, Santosh)
- Fixed 0-day testing errors in nes driver, xprtrdma and svcrdma
- Fixed checkpatch issues

Changes from v0:
- Rebased on top of 4.3-rc1 + Christoph's ib_send_wr conversion patches
- Allow the ULP to pass page_size argument to ib_map_mr_sg in order
  to have it work better in some specific workloads. This suggestion
  came from Bart Van Assche, who pointed out that some applications
  might use page sizes significantly smaller than the system PAGE_SIZE
  of specific architectures
- Fixed some logical bugs in ib_sg_to_pages
- Added a set_page function pointer for drivers to pass to ib_sg_to_pages
  so some drivers (e.g mlx4, mlx5, nes) can avoid keeping a second page
  vector and/or re-iterate on the page vector in order to perform HW specific
  assignments (big/little endian conversion, extra flags)
- Converted SRP initiator and RDS iwarp ULPs to the new API
- Removed fast registration code from hfi1 driver (as it isn't supported
  anyway). I assume that the correct place to get the support back would
  be in a shared SW library (hfi1, qib, rxe).
- Updated the change logs

The code is available at: https://github.com/sagigrimberg/linux/tree/reg_api.6

Sagi Grimberg (26):
  IB/core: Introduce new fast registration API
  IB/mlx5: Remove dead fmr code
  IB/mlx5: Support the new memory registration API
  IB/mlx4: Support the new memory registration API
  RDMA/ocrdma: Support the new memory registration API
  RDMA/cxgb3: Support the new memory registration API
  iw_cxgb4: Support the new memory registration API
  IB/qib: Support the new memory registration API
  RDMA/nes: Support the new memory registration API
  IB/iser: Port to new fast registration API
  iser-target: Port to new memory registration API
  xprtrdma: Port to new memory registration API
  svcrdma: Port to new memory registration API
  RDS/IW: Convert to new memory registration API
  IB/srp: Split srp_map_sg
  IB/srp: Convert to new registration API
  IB/srp: Remove srp_finish_mapping
  IB/srp: Don't allocate a page vector when using fast_reg
  IB/mlx5: Remove old FRWR API support
  IB/mlx4: Remove old FRWR API support
  RDMA/ocrdma: Remove old FRWR API
  RDMA/cxgb3: Remove old FRWR API
  iw_cxgb4: Remove old FRWR API
  IB/qib: Remove old FRWR API
  RDMA/nes: Remove old FRWR API
  IB/core: Remove old fast registration API

 drivers/infiniband/core/verbs.c | 132 +++---
 drivers/infiniband/hw/cxgb3/iwch_cq.c   |   2 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c |  39 +++--
 drivers/infiniband/hw/cxgb3/iwch_provider.h |   2 +
 drivers/infiniband/hw/cxgb3/iwch_qp.c   |  37 ++--
 drivers/infiniband/hw/cxgb4/cq.c|   2 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h  |  25 +--
 drivers/infiniband/hw/cxgb4/mem.c   |  61 +++
 drivers/infiniband/hw/cxgb4/provider.c  |   3 +-
 drivers/infiniband/hw/cxgb4/qp.c|  47 +++--
 drivers/infiniband/hw/mlx4/cq.c |   2 +-
 drivers/infiniband/hw/mlx4/main.c   |   3 +-
 drivers/infiniband/hw/mlx4/mlx4_ib.h|  25 ++-
 drivers/infiniband/hw/mlx4/mr.c | 149 ++--
 drivers/infiniband/hw/mlx4/qp.c |  34 ++--
 drivers/infiniband/hw/mlx5/cq.c |   4 +-
 drivers/infiniband/hw/mlx5/main.c   |   3 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h|  48 +-
 drivers/infiniband/hw/mlx5/mr.c | 135 ++-
 drivers/infiniband/hw/mlx5/qp.c | 140 +++
 drivers/infiniband/hw/nes/nes_hw.h  |   6 -
 drivers/infiniband/hw/nes/nes_verbs.c   | 163 +++---
 drivers/infiniband/hw/nes/nes_verbs.h   |   4 +
 drivers/infiniband/hw/ocrdma/ocrdma.h   |   2 +
 drivers/infiniband/hw/ocrdma/ocrdma_main.c  |   3 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 154 -
 

RE: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC

2015-10-12 Thread Hefty, Sean
> When IP based addressing was introduced, ib_create_ah_from_wc was
> changed in order to support a suitable AH. This AH now
> contains the DMAC (which isn't a simple derivative of the GID).
> In order to find the DMAC, an ARP sometimes has to be sent, and
> sending that ARP is a sleeping context.

Wait - are you saying that the CM may now be waiting for an ARP response before 
it can send a message?



[PATCH v4 16/26] IB/srp: Convert to new registration API

2015-10-12 Thread Sagi Grimberg
Instead of constructing a page list, call ib_map_mr_sg
and post a new ib_reg_wr. srp_map_finish_fr now returns
the number of sg elements registered.

Remove srp_finish_mapping since no one is calling it.

Signed-off-by: Sagi Grimberg 
Tested-by: Bart Van Assche 
---
 drivers/infiniband/ulp/srp/ib_srp.c | 125 ++--
 drivers/infiniband/ulp/srp/ib_srp.h |  11 +++-
 2 files changed, 71 insertions(+), 65 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 3ec94c109e1b..6d399378928d 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -340,8 +340,6 @@ static void srp_destroy_fr_pool(struct srp_fr_pool *pool)
return;
 
for (i = 0, d = &pool->desc[0]; i < pool->size; i++, d++) {
-   if (d->frpl)
-   ib_free_fast_reg_page_list(d->frpl);
if (d->mr)
ib_dereg_mr(d->mr);
}
@@ -362,7 +360,6 @@ static struct srp_fr_pool *srp_create_fr_pool(struct 
ib_device *device,
struct srp_fr_pool *pool;
struct srp_fr_desc *d;
struct ib_mr *mr;
-   struct ib_fast_reg_page_list *frpl;
int i, ret = -EINVAL;
 
if (pool_size <= 0)
@@ -385,12 +382,6 @@ static struct srp_fr_pool *srp_create_fr_pool(struct 
ib_device *device,
goto destroy_pool;
}
d->mr = mr;
-   frpl = ib_alloc_fast_reg_page_list(device, max_page_list_len);
-   if (IS_ERR(frpl)) {
-   ret = PTR_ERR(frpl);
-   goto destroy_pool;
-   }
-   d->frpl = frpl;
list_add_tail(&d->entry, &pool->free_list);
}
 
@@ -1321,23 +1312,24 @@ static int srp_map_finish_fr(struct srp_map_state 
*state,
struct srp_target_port *target = ch->target;
struct srp_device *dev = target->srp_host->srp_dev;
struct ib_send_wr *bad_wr;
-   struct ib_fast_reg_wr wr;
+   struct ib_reg_wr wr;
struct srp_fr_desc *desc;
u32 rkey;
-   int err;
+   int n, err;
 
if (state->fr.next >= state->fr.end)
return -ENOMEM;
 
WARN_ON_ONCE(!dev->use_fast_reg);
 
-   if (state->npages == 0)
+   if (state->sg_nents == 0)
return 0;
 
-   if (state->npages == 1 && target->global_mr) {
-   srp_map_desc(state, state->base_dma_addr, state->dma_len,
+   if (state->sg_nents == 1 && target->global_mr) {
+   srp_map_desc(state, sg_dma_address(state->sg),
+sg_dma_len(state->sg),
 target->global_mr->rkey);
-   goto reset_state;
+   return 1;
}
 
desc = srp_fr_pool_get(ch->fr_pool);
@@ -1347,37 +1339,33 @@ static int srp_map_finish_fr(struct srp_map_state 
*state,
rkey = ib_inc_rkey(desc->mr->rkey);
ib_update_fast_reg_key(desc->mr, rkey);
 
-   memcpy(desc->frpl->page_list, state->pages,
-  sizeof(state->pages[0]) * state->npages);
+   n = ib_map_mr_sg(desc->mr, state->sg, state->sg_nents,
+dev->mr_page_size);
+   if (unlikely(n < 0))
+   return n;
 
-   memset(&wr, 0, sizeof(wr));
-   wr.wr.opcode = IB_WR_FAST_REG_MR;
+   wr.wr.next = NULL;
+   wr.wr.opcode = IB_WR_REG_MR;
wr.wr.wr_id = FAST_REG_WR_ID_MASK;
-   wr.iova_start = state->base_dma_addr;
-   wr.page_list = desc->frpl;
-   wr.page_list_len = state->npages;
-   wr.page_shift = ilog2(dev->mr_page_size);
-   wr.length = state->dma_len;
-   wr.access_flags = (IB_ACCESS_LOCAL_WRITE |
-  IB_ACCESS_REMOTE_READ |
-  IB_ACCESS_REMOTE_WRITE);
-   wr.rkey = desc->mr->lkey;
+   wr.wr.num_sge = 0;
+   wr.wr.send_flags = 0;
+   wr.mr = desc->mr;
+   wr.key = desc->mr->rkey;
+   wr.access = (IB_ACCESS_LOCAL_WRITE |
+IB_ACCESS_REMOTE_READ |
+IB_ACCESS_REMOTE_WRITE);
 
*state->fr.next++ = desc;
state->nmdesc++;
 
-   srp_map_desc(state, state->base_dma_addr, state->dma_len,
-desc->mr->rkey);
+   srp_map_desc(state, desc->mr->iova,
+desc->mr->length, desc->mr->rkey);
 
err = ib_post_send(ch->qp, &wr.wr, &bad_wr);
-   if (err)
+   if (unlikely(err))
return err;
 
-reset_state:
-   state->npages = 0;
-   state->dma_len = 0;
-
-   return 0;
+   return n;
 }
 
 static int srp_finish_mapping(struct srp_map_state *state,
@@ -1407,7 +1395,7 @@ static int srp_map_sg_entry(struct srp_map_state *state,
while (dma_len) {
unsigned offset = dma_addr & ~dev->mr_page_mask;
if (state->npages == dev->max_pages_per_mr || offset != 0) {
-

[PATCH v4 02/26] IB/mlx5: Remove dead fmr code

2015-10-12 Thread Sagi Grimberg
Just function declarations - no need for those
lying around. If for some reason someone wants
FMR support in mlx5, it should be easy enough to restore
a few structs.

Signed-off-by: Sagi Grimberg 
Reviewed-by: Bart Van Assche 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 25 -
 1 file changed, 25 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h 
b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 29f3ecdbe790..f789a3e6c215 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -364,20 +364,6 @@ enum {
MLX5_FMR_BUSY,
 };
 
-struct mlx5_ib_fmr {
-   struct ib_fmr   ibfmr;
-   struct mlx5_core_mr mr;
-   int access_flags;
-   int state;
-   /* protect fmr state
-*/
-   spinlock_t  lock;
-   u64 wrid;
-   struct ib_send_wr   wr[2];
-   u8  page_shift;
-   struct ib_fast_reg_page_listpage_list;
-};
-
 struct mlx5_cache_ent {
struct list_headhead;
/* sync access to the cahce entry
@@ -462,11 +448,6 @@ static inline struct mlx5_ib_dev *to_mdev(struct ib_device 
*ibdev)
return container_of(ibdev, struct mlx5_ib_dev, ib_dev);
 }
 
-static inline struct mlx5_ib_fmr *to_mfmr(struct ib_fmr *ibfmr)
-{
-   return container_of(ibfmr, struct mlx5_ib_fmr, ibfmr);
-}
-
 static inline struct mlx5_ib_cq *to_mcq(struct ib_cq *ibcq)
 {
return container_of(ibcq, struct mlx5_ib_cq, ibcq);
@@ -582,12 +563,6 @@ struct ib_mr *mlx5_ib_alloc_mr(struct ib_pd *pd,
 struct ib_fast_reg_page_list *mlx5_ib_alloc_fast_reg_page_list(struct 
ib_device *ibdev,
   int 
page_list_len);
 void mlx5_ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list);
-struct ib_fmr *mlx5_ib_fmr_alloc(struct ib_pd *pd, int acc,
-struct ib_fmr_attr *fmr_attr);
-int mlx5_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int npages, u64 iova);
-int mlx5_ib_unmap_fmr(struct list_head *fmr_list);
-int mlx5_ib_fmr_dealloc(struct ib_fmr *ibfmr);
 int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
const struct ib_wc *in_wc, const struct ib_grh *in_grh,
const struct ib_mad_hdr *in, size_t in_mad_size,
-- 
1.8.4.3



[PATCH v4 17/26] IB/srp: Remove srp_finish_mapping

2015-10-12 Thread Sagi Grimberg
No callers left, remove it.

Signed-off-by: Sagi Grimberg 
Tested-by: Bart Van Assche 
---
 drivers/infiniband/ulp/srp/ib_srp.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 6d399378928d..d4a5a9b86390 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1368,16 +1368,6 @@ static int srp_map_finish_fr(struct srp_map_state *state,
return n;
 }
 
-static int srp_finish_mapping(struct srp_map_state *state,
- struct srp_rdma_ch *ch)
-{
-   struct srp_target_port *target = ch->target;
-   struct srp_device *dev = target->srp_host->srp_dev;
-
-   return dev->use_fast_reg ? srp_map_finish_fr(state, ch) :
-  srp_map_finish_fmr(state, ch);
-}
-
 static int srp_map_sg_entry(struct srp_map_state *state,
struct srp_rdma_ch *ch,
struct scatterlist *sg, int sg_index)
-- 
1.8.4.3



[PATCH v4 03/26] IB/mlx5: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in mlx5_ib_mr and populating it when
mlx5_ib_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by setting the exact WQE as IB_WR_FAST_REG_MR, just take the
needed information from different places:
- page_size, iova, length, access flags (ib_mr)
- page array (mlx5_ib_mr)
- key (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once all
the ULPs are converted.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/mlx5/cq.c  |  3 ++
 drivers/infiniband/hw/mlx5/main.c|  1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  9 
 drivers/infiniband/hw/mlx5/mr.c  | 93 
 drivers/infiniband/hw/mlx5/qp.c  | 83 
 5 files changed, 189 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 2d0dbbf38ceb..206930096d56 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -109,6 +109,9 @@ static enum ib_wc_opcode get_umr_comp(struct mlx5_ib_wq 
*wq, int idx)
case IB_WR_LOCAL_INV:
return IB_WC_LOCAL_INV;
 
+   case IB_WR_REG_MR:
+   return IB_WC_REG_MR;
+
case IB_WR_FAST_REG_MR:
return IB_WC_FAST_REG_MR;
 
diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index f1ccd40beae9..7e93044ea6ce 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1425,6 +1425,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
dev->ib_dev.detach_mcast= mlx5_ib_mcg_detach;
dev->ib_dev.process_mad = mlx5_ib_process_mad;
dev->ib_dev.alloc_mr= mlx5_ib_alloc_mr;
+   dev->ib_dev.map_mr_sg   = mlx5_ib_map_mr_sg;
dev->ib_dev.alloc_fast_reg_page_list = mlx5_ib_alloc_fast_reg_page_list;
dev->ib_dev.free_fast_reg_page_list  = mlx5_ib_free_fast_reg_page_list;
dev->ib_dev.check_mr_status = mlx5_ib_check_mr_status;
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index f789a3e6c215..72672ae48296 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -319,6 +319,11 @@ enum mlx5_ib_mtt_access_flags {
 
 struct mlx5_ib_mr {
struct ib_mribmr;
+   void*descs;
+   dma_addr_t  desc_map;
+   int ndescs;
+   int max_descs;
+   int desc_size;
struct mlx5_core_mr mmr;
struct ib_umem *umem;
struct mlx5_shared_mr_info  *smr_info;
@@ -330,6 +335,7 @@ struct mlx5_ib_mr {
struct mlx5_create_mkey_mbox_out out;
struct mlx5_core_sig_ctx*sig;
int live;
+   void*descs_alloc;
 };
 
 struct mlx5_ib_fast_reg_page_list {
@@ -560,6 +566,9 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr);
 struct ib_mr *mlx5_ib_alloc_mr(struct ib_pd *pd,
   enum ib_mr_type mr_type,
   u32 max_num_sg);
+int mlx5_ib_map_mr_sg(struct ib_mr *ibmr,
+ struct scatterlist *sg,
+ unsigned int sg_nents);
 struct ib_fast_reg_page_list *mlx5_ib_alloc_fast_reg_page_list(struct ib_device *ibdev,
   int page_list_len);
 void mlx5_ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list);
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 6a2fed3015ab..cabfe4190657 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1153,6 +1153,52 @@ error:
return err;
 }
 
+static int
+mlx5_alloc_priv_descs(struct ib_device *device,
+ struct mlx5_ib_mr *mr,
+ int ndescs,
+ int desc_size)
+{
+   int size = ndescs * desc_size;
+   int add_size;
+   int ret;
+
+   add_size = max_t(int, MLX5_UMR_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
+
+   mr->descs_alloc = kzalloc(size + add_size, GFP_KERNEL);
+   if (!mr->descs_alloc)
+   return -ENOMEM;
+
+   mr->descs = PTR_ALIGN(mr->descs_alloc, MLX5_UMR_ALIGN);
+
+   mr->desc_map = dma_map_single(device->dma_device, mr->descs,
+ size, DMA_TO_DEVICE);
+   if (dma_mapping_error(device->dma_device, mr->desc_map)) {
+   ret = -ENOMEM;
+   goto err;
+   }
+
+   return 0;
+err:
+   kfree(mr->descs_alloc);
+
+   return ret;
+}
+
+static void
+mlx5_free_priv_descs(struct mlx5_ib_mr *mr)
+{
+   if (mr->descs) {
+   struct ib_device *device = mr->ibmr.device;
+   int size = mr->max_descs * mr->desc_size;
+

[PATCH v4 26/26] IB/core: Remove old fast registration API

2015-10-12 Thread Sagi Grimberg
No callers and no providers left, go ahead and remove it.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/core/verbs.c | 25 ---
 include/rdma/ib_verbs.h | 54 -
 2 files changed, 79 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e18a8bce8130..ddfdd02ac35d 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1253,31 +1253,6 @@ struct ib_mr *ib_alloc_mr(struct ib_pd *pd,
 }
 EXPORT_SYMBOL(ib_alloc_mr);
 
-struct ib_fast_reg_page_list *ib_alloc_fast_reg_page_list(struct ib_device *device,
- int max_page_list_len)
-{
-   struct ib_fast_reg_page_list *page_list;
-
-   if (!device->alloc_fast_reg_page_list)
-   return ERR_PTR(-ENOSYS);
-
-   page_list = device->alloc_fast_reg_page_list(device, max_page_list_len);
-
-   if (!IS_ERR(page_list)) {
-   page_list->device = device;
-   page_list->max_page_list_len = max_page_list_len;
-   }
-
-   return page_list;
-}
-EXPORT_SYMBOL(ib_alloc_fast_reg_page_list);
-
-void ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list)
-{
-   page_list->device->free_fast_reg_page_list(page_list);
-}
-EXPORT_SYMBOL(ib_free_fast_reg_page_list);
-
 /* Memory windows */
 
 struct ib_mw *ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 24bb87f16afb..f5d706ea2e19 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -738,7 +738,6 @@ enum ib_wc_opcode {
IB_WC_BIND_MW,
IB_WC_LSO,
IB_WC_LOCAL_INV,
-   IB_WC_FAST_REG_MR,
IB_WC_REG_MR,
IB_WC_MASKED_COMP_SWAP,
IB_WC_MASKED_FETCH_ADD,
@@ -1030,7 +1029,6 @@ enum ib_wr_opcode {
IB_WR_SEND_WITH_INV,
IB_WR_RDMA_READ_WITH_INV,
IB_WR_LOCAL_INV,
-   IB_WR_FAST_REG_MR,
IB_WR_REG_MR,
IB_WR_MASKED_ATOMIC_CMP_AND_SWP,
IB_WR_MASKED_ATOMIC_FETCH_AND_ADD,
@@ -1069,12 +1067,6 @@ struct ib_sge {
u32 lkey;
 };
 
-struct ib_fast_reg_page_list {
-   struct ib_device   *device;
-   u64*page_list;
-   unsigned intmax_page_list_len;
-};
-
 /**
  * struct ib_mw_bind_info - Parameters for a memory window bind operation.
  * @mr: A memory region to bind the memory window to.
@@ -1149,22 +1141,6 @@ static inline struct ib_ud_wr *ud_wr(struct ib_send_wr *wr)
return container_of(wr, struct ib_ud_wr, wr);
 }
 
-struct ib_fast_reg_wr {
-   struct ib_send_wr   wr;
-   u64 iova_start;
-   struct ib_fast_reg_page_list *page_list;
-   unsigned intpage_shift;
-   unsigned intpage_list_len;
-   u32 length;
-   int access_flags;
-   u32 rkey;
-};
-
-static inline struct ib_fast_reg_wr *fast_reg_wr(struct ib_send_wr *wr)
-{
-   return container_of(wr, struct ib_fast_reg_wr, wr);
-}
-
 struct ib_reg_wr {
struct ib_send_wr   wr;
struct ib_mr*mr;
@@ -1779,9 +1755,6 @@ struct ib_device {
int(*map_mr_sg)(struct ib_mr *mr,
struct scatterlist *sg,
unsigned int sg_nents);
-   struct ib_fast_reg_page_list * (*alloc_fast_reg_page_list)(struct ib_device *device,
-  int page_list_len);
-   void   (*free_fast_reg_page_list)(struct ib_fast_reg_page_list *page_list);
int(*rereg_phys_mr)(struct ib_mr *mr,
int mr_rereg_mask,
struct ib_pd *pd,
@@ -2890,33 +2863,6 @@ struct ib_mr *ib_alloc_mr(struct ib_pd *pd,
  u32 max_num_sg);
 
 /**
- * ib_alloc_fast_reg_page_list - Allocates a page list array
- * @device - ib device pointer.
- * @page_list_len - size of the page list array to be allocated.
- *
- * This allocates and returns a struct ib_fast_reg_page_list * and a
- * page_list array that is at least page_list_len in size.  The actual
- * size is returned in max_page_list_len.  The caller is responsible
- * for initializing the contents of the page_list array before posting
- * a send work request with the IB_WC_FAST_REG_MR opcode.
- *
- * The page_list array entries must be translated using one of the
- * ib_dma_*() functions just like the addresses passed to
- * ib_map_phys_fmr().  Once the ib_post_send() is issued, the struct
- * ib_fast_reg_page_list must not be modified by the caller until the
- * IB_WC_FAST_REG_MR work request completes.
- 

[PATCH v4 07/26] iw_cxgb4: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in c4iw_mr and populating it when
c4iw_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by duplicating build_fastreg, just taking the needed
information from different places:
- page_size, iova, length (ib_mr)
- page array (c4iw_mr)
- key, access flags (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once
all the ULPs have been converted.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
Tested-by: Steve Wise 
---
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h |  7 
 drivers/infiniband/hw/cxgb4/mem.c  | 38 +
 drivers/infiniband/hw/cxgb4/provider.c |  1 +
 drivers/infiniband/hw/cxgb4/qp.c   | 74 ++
 4 files changed, 120 insertions(+)

diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index c7bb38c931a5..032f90aa8ac9 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -386,6 +386,10 @@ struct c4iw_mr {
struct c4iw_dev *rhp;
u64 kva;
struct tpt_attributes attr;
+   u64 *mpl;
+   dma_addr_t mpl_addr;
+   u32 max_mpl_len;
+   u32 mpl_len;
 };
 
 static inline struct c4iw_mr *to_c4iw_mr(struct ib_mr *ibmr)
@@ -973,6 +977,9 @@ struct ib_fast_reg_page_list *c4iw_alloc_fastreg_pbl(
 struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd,
enum ib_mr_type mr_type,
u32 max_num_sg);
+int c4iw_map_mr_sg(struct ib_mr *ibmr,
+  struct scatterlist *sg,
+  unsigned int sg_nents);
 int c4iw_dealloc_mw(struct ib_mw *mw);
 struct ib_mw *c4iw_alloc_mw(struct ib_pd *pd, enum ib_mw_type type);
 struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start,
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 026b91ebd5e2..86ec65721797 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -863,6 +863,7 @@ struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd,
u32 mmid;
u32 stag = 0;
int ret = 0;
+   int length = roundup(max_num_sg * sizeof(u64), 32);
 
if (mr_type != IB_MR_TYPE_MEM_REG ||
max_num_sg > t4_max_fr_depth(use_dsgl))
@@ -876,6 +877,14 @@ struct ib_mr *c4iw_alloc_mr(struct ib_pd *pd,
goto err;
}
 
+   mhp->mpl = dma_alloc_coherent(>rdev.lldi.pdev->dev,
+ length, >mpl_addr, GFP_KERNEL);
+   if (!mhp->mpl) {
+   ret = -ENOMEM;
+   goto err_mpl;
+   }
+   mhp->max_mpl_len = length;
+
mhp->rhp = rhp;
ret = alloc_pbl(mhp, max_num_sg);
if (ret)
@@ -905,11 +914,37 @@ err2:
c4iw_pblpool_free(>rhp->rdev, mhp->attr.pbl_addr,
  mhp->attr.pbl_size << 3);
 err1:
+   dma_free_coherent(>rhp->rdev.lldi.pdev->dev,
+ mhp->max_mpl_len, mhp->mpl, mhp->mpl_addr);
+err_mpl:
kfree(mhp);
 err:
return ERR_PTR(ret);
 }
 
+static int c4iw_set_page(struct ib_mr *ibmr, u64 addr)
+{
+   struct c4iw_mr *mhp = to_c4iw_mr(ibmr);
+
+   if (unlikely(mhp->mpl_len == mhp->max_mpl_len))
+   return -ENOMEM;
+
+   mhp->mpl[mhp->mpl_len++] = addr;
+
+   return 0;
+}
+
+int c4iw_map_mr_sg(struct ib_mr *ibmr,
+  struct scatterlist *sg,
+  unsigned int sg_nents)
+{
+   struct c4iw_mr *mhp = to_c4iw_mr(ibmr);
+
+   mhp->mpl_len = 0;
+
+   return ib_sg_to_pages(ibmr, sg, sg_nents, c4iw_set_page);
+}
+
 struct ib_fast_reg_page_list *c4iw_alloc_fastreg_pbl(struct ib_device *device,
 int page_list_len)
 {
@@ -970,6 +1005,9 @@ int c4iw_dereg_mr(struct ib_mr *ib_mr)
rhp = mhp->rhp;
mmid = mhp->attr.stag >> 8;
remove_handle(rhp, >mmidr, mmid);
+   if (mhp->mpl)
+   dma_free_coherent(>rhp->rdev.lldi.pdev->dev,
+ mhp->max_mpl_len, mhp->mpl, mhp->mpl_addr);
dereg_mem(>rdev, mhp->attr.stag, mhp->attr.pbl_size,
   mhp->attr.pbl_addr);
if (mhp->attr.pbl_size)
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 7746113552e7..55dedadcffaa 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -557,6 +557,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
dev->ibdev.bind_mw = c4iw_bind_mw;
dev->ibdev.dealloc_mw = c4iw_dealloc_mw;
dev->ibdev.alloc_mr = c4iw_alloc_mr;
+   dev->ibdev.map_mr_sg = c4iw_map_mr_sg;
dev->ibdev.alloc_fast_reg_page_list = c4iw_alloc_fastreg_pbl;
dev->ibdev.free_fast_reg_page_list = c4iw_free_fastreg_pbl;
dev->ibdev.attach_mcast = c4iw_multicast_attach;
diff 

[PATCH v4 14/26] RDS/IW: Convert to new memory registration API

2015-10-12 Thread Sagi Grimberg
Get rid of the fast_reg page list and its construction.
Instead, just pass the RDS sg list to ib_map_mr_sg
and post the new ib_reg_wr.

This is done both for server IW RDMA_READ registration
and the client remote key registration.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
Acked-by: Santosh Shilimkar 
---
 net/rds/iw.h  |   5 +--
 net/rds/iw_rdma.c | 128 +++---
 net/rds/iw_send.c |  57 
 3 files changed, 75 insertions(+), 115 deletions(-)

diff --git a/net/rds/iw.h b/net/rds/iw.h
index fe858e5dd8d1..5af01d1758b3 100644
--- a/net/rds/iw.h
+++ b/net/rds/iw.h
@@ -74,13 +74,12 @@ struct rds_iw_send_work {
struct rm_rdma_op   *s_op;
struct rds_iw_mapping   *s_mapping;
struct ib_mr*s_mr;
-   struct ib_fast_reg_page_list *s_page_list;
unsigned char   s_remap_count;
 
union {
struct ib_send_wr   s_send_wr;
struct ib_rdma_wr   s_rdma_wr;
-   struct ib_fast_reg_wr   s_fast_reg_wr;
+   struct ib_reg_wrs_reg_wr;
};
struct ib_sge   s_sge[RDS_IW_MAX_SGE];
unsigned long   s_queued;
@@ -199,7 +198,7 @@ struct rds_iw_device {
 
 /* Magic WR_ID for ACKs */
 #define RDS_IW_ACK_WR_ID   ((u64)0xULL)
-#define RDS_IW_FAST_REG_WR_ID  ((u64)0xefefefefefefefefULL)
+#define RDS_IW_REG_WR_ID   ((u64)0xefefefefefefefefULL)
 #define RDS_IW_LOCAL_INV_WR_ID ((u64)0xdfdfdfdfdfdfdfdfULL)
 
 struct rds_iw_statistics {
diff --git a/net/rds/iw_rdma.c b/net/rds/iw_rdma.c
index f8a612cc69e6..47bd68451ff7 100644
--- a/net/rds/iw_rdma.c
+++ b/net/rds/iw_rdma.c
@@ -47,7 +47,6 @@ struct rds_iw_mr {
struct rdma_cm_id   *cm_id;
 
struct ib_mr*mr;
-   struct ib_fast_reg_page_list *page_list;
 
struct rds_iw_mapping   mapping;
unsigned char   remap_count;
@@ -77,8 +76,8 @@ struct rds_iw_mr_pool {
 
 static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all);
 static void rds_iw_mr_pool_flush_worker(struct work_struct *work);
-static int rds_iw_init_fastreg(struct rds_iw_mr_pool *pool, struct rds_iw_mr *ibmr);
-static int rds_iw_map_fastreg(struct rds_iw_mr_pool *pool,
+static int rds_iw_init_reg(struct rds_iw_mr_pool *pool, struct rds_iw_mr *ibmr);
+static int rds_iw_map_reg(struct rds_iw_mr_pool *pool,
  struct rds_iw_mr *ibmr,
  struct scatterlist *sg, unsigned int nents);
 static void rds_iw_free_fastreg(struct rds_iw_mr_pool *pool, struct rds_iw_mr *ibmr);
@@ -258,19 +257,18 @@ static void rds_iw_set_scatterlist(struct rds_iw_scatterlist *sg,
sg->bytes = 0;
 }
 
-static u64 *rds_iw_map_scatterlist(struct rds_iw_device *rds_iwdev,
-   struct rds_iw_scatterlist *sg)
+static int rds_iw_map_scatterlist(struct rds_iw_device *rds_iwdev,
+ struct rds_iw_scatterlist *sg)
 {
struct ib_device *dev = rds_iwdev->dev;
-   u64 *dma_pages = NULL;
-   int i, j, ret;
+   int i, ret;
 
WARN_ON(sg->dma_len);
 
sg->dma_len = ib_dma_map_sg(dev, sg->list, sg->len, DMA_BIDIRECTIONAL);
if (unlikely(!sg->dma_len)) {
printk(KERN_WARNING "RDS/IW: dma_map_sg failed!\n");
-   return ERR_PTR(-EBUSY);
+   return -EBUSY;
}
 
sg->bytes = 0;
@@ -303,31 +301,14 @@ static u64 *rds_iw_map_scatterlist(struct rds_iw_device *rds_iwdev,
if (sg->dma_npages > fastreg_message_size)
goto out_unmap;
 
-   dma_pages = kmalloc(sizeof(u64) * sg->dma_npages, GFP_ATOMIC);
-   if (!dma_pages) {
-   ret = -ENOMEM;
-   goto out_unmap;
-   }
 
-   for (i = j = 0; i < sg->dma_len; ++i) {
-   unsigned int dma_len = ib_sg_dma_len(dev, >list[i]);
-   u64 dma_addr = ib_sg_dma_address(dev, >list[i]);
-   u64 end_addr;
 
-   end_addr = dma_addr + dma_len;
-   dma_addr &= ~PAGE_MASK;
-   for (; dma_addr < end_addr; dma_addr += PAGE_SIZE)
-   dma_pages[j++] = dma_addr;
-   BUG_ON(j > sg->dma_npages);
-   }
-
-   return dma_pages;
+   return 0;
 
 out_unmap:
ib_dma_unmap_sg(rds_iwdev->dev, sg->list, sg->len, DMA_BIDIRECTIONAL);
sg->dma_len = 0;
-   kfree(dma_pages);
-   return ERR_PTR(ret);
+   return ret;
 }
 
 
@@ -440,7 +421,7 @@ static struct rds_iw_mr *rds_iw_alloc_mr(struct rds_iw_device *rds_iwdev)
INIT_LIST_HEAD(>mapping.m_list);
ibmr->mapping.m_mr = ibmr;
 
-   err = rds_iw_init_fastreg(pool, ibmr);
+   err = rds_iw_init_reg(pool, ibmr);
if (err)
goto out_no_cigar;
 
@@ -622,7 +603,7 @@ void *rds_iw_get_mr(struct scatterlist 

[PATCH v4 18/26] IB/srp: Don't allocate a page vector when using fast_reg

2015-10-12 Thread Sagi Grimberg
The new fast registration API does not require a page vector,
so we can avoid allocating it.

Signed-off-by: Sagi Grimberg 
Tested-by: Bart Van Assche 
---
 drivers/infiniband/ulp/srp/ib_srp.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index d4a5a9b86390..d00f819c09b0 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -840,11 +840,12 @@ static void srp_free_req_data(struct srp_target_port *target,
 
for (i = 0; i < target->req_ring_size; ++i) {
req = >req_ring[i];
-   if (dev->use_fast_reg)
+   if (dev->use_fast_reg) {
kfree(req->fr_list);
-   else
+   } else {
kfree(req->fmr_list);
-   kfree(req->map_page);
+   kfree(req->map_page);
+   }
if (req->indirect_dma_addr) {
ib_dma_unmap_single(ibdev, req->indirect_dma_addr,
target->indirect_size,
@@ -878,14 +879,15 @@ static int srp_alloc_req_data(struct srp_rdma_ch *ch)
  GFP_KERNEL);
if (!mr_list)
goto out;
-   if (srp_dev->use_fast_reg)
+   if (srp_dev->use_fast_reg) {
req->fr_list = mr_list;
-   else
+   } else {
req->fmr_list = mr_list;
-   req->map_page = kmalloc(srp_dev->max_pages_per_mr *
-   sizeof(void *), GFP_KERNEL);
-   if (!req->map_page)
-   goto out;
+   req->map_page = kmalloc(srp_dev->max_pages_per_mr *
+   sizeof(void *), GFP_KERNEL);
+   if (!req->map_page)
+   goto out;
+   }
req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
if (!req->indirect_desc)
goto out;
-- 
1.8.4.3



[PATCH v4 13/26] svcrdma: Port to new memory registration API

2015-10-12 Thread Sagi Grimberg
Instead of maintaining a fastreg page list, keep an sg table
and convert an array of pages to an sg list. Then call ib_map_mr_sg
and construct an ib_reg_wr.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
Tested-by: Steve Wise 
Tested-by: Selvin Xavier 
---
 include/linux/sunrpc/svc_rdma.h  |  6 +--
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  | 76 ++--
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 34 +-
 3 files changed, 55 insertions(+), 61 deletions(-)

diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
index 7ccc961f33e9..e8147d535588 100644
--- a/include/linux/sunrpc/svc_rdma.h
+++ b/include/linux/sunrpc/svc_rdma.h
@@ -105,11 +105,9 @@ struct svc_rdma_chunk_sge {
 };
 struct svc_rdma_fastreg_mr {
struct ib_mr *mr;
-   void *kva;
-   struct ib_fast_reg_page_list *page_list;
-   int page_list_len;
+   struct scatterlist *sg;
+   unsigned int sg_nents;
unsigned long access_flags;
-   unsigned long map_len;
enum dma_data_direction direction;
struct list_head frmr_list;
 };
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 7be42d0da19e..303f194970f9 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -220,12 +220,12 @@ int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
 {
struct ib_rdma_wr read_wr;
struct ib_send_wr inv_wr;
-   struct ib_fast_reg_wr fastreg_wr;
+   struct ib_reg_wr reg_wr;
u8 key;
-   int pages_needed = PAGE_ALIGN(*page_offset + rs_length) >> PAGE_SHIFT;
+   unsigned int nents = PAGE_ALIGN(*page_offset + rs_length) >> PAGE_SHIFT;
struct svc_rdma_op_ctxt *ctxt = svc_rdma_get_context(xprt);
struct svc_rdma_fastreg_mr *frmr = svc_rdma_get_frmr(xprt);
-   int ret, read, pno;
+   int ret, read, pno, dma_nents, n;
u32 pg_off = *page_offset;
u32 pg_no = *page_no;
 
@@ -234,16 +234,14 @@ int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
 
ctxt->direction = DMA_FROM_DEVICE;
ctxt->frmr = frmr;
-   pages_needed = min_t(int, pages_needed, xprt->sc_frmr_pg_list_len);
-   read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+   nents = min_t(unsigned int, nents, xprt->sc_frmr_pg_list_len);
+   read = min_t(int, nents << PAGE_SHIFT, rs_length);
 
-   frmr->kva = page_address(rqstp->rq_arg.pages[pg_no]);
frmr->direction = DMA_FROM_DEVICE;
frmr->access_flags = (IB_ACCESS_LOCAL_WRITE|IB_ACCESS_REMOTE_WRITE);
-   frmr->map_len = pages_needed << PAGE_SHIFT;
-   frmr->page_list_len = pages_needed;
+   frmr->sg_nents = nents;
 
-   for (pno = 0; pno < pages_needed; pno++) {
+   for (pno = 0; pno < nents; pno++) {
int len = min_t(int, rs_length, PAGE_SIZE - pg_off);
 
head->arg.pages[pg_no] = rqstp->rq_arg.pages[pg_no];
@@ -251,17 +249,12 @@ int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
head->arg.len += len;
if (!pg_off)
head->count++;
+
+   sg_set_page(>sg[pno], rqstp->rq_arg.pages[pg_no],
+   len, pg_off);
+
rqstp->rq_respages = >rq_arg.pages[pg_no+1];
rqstp->rq_next_page = rqstp->rq_respages + 1;
-   frmr->page_list->page_list[pno] =
-   ib_dma_map_page(xprt->sc_cm_id->device,
-   head->arg.pages[pg_no], 0,
-   PAGE_SIZE, DMA_FROM_DEVICE);
-   ret = ib_dma_mapping_error(xprt->sc_cm_id->device,
-  frmr->page_list->page_list[pno]);
-   if (ret)
-   goto err;
-   atomic_inc(>sc_dma_used);
 
/* adjust offset and wrap to next page if needed */
pg_off += len;
@@ -277,28 +270,42 @@ int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
else
clear_bit(RDMACTXT_F_LAST_CTXT, >flags);
 
+   dma_nents = ib_dma_map_sg(xprt->sc_cm_id->device,
+ frmr->sg, frmr->sg_nents,
+ frmr->direction);
+   if (!dma_nents) {
+   pr_err("svcrdma: failed to dma map sg %p\n",
+  frmr->sg);
+   return -ENOMEM;
+   }
+   atomic_inc(>sc_dma_used);
+
+   n = ib_map_mr_sg(frmr->mr, frmr->sg, frmr->sg_nents, PAGE_SIZE);
+   if (unlikely(n != frmr->sg_nents)) {
+   pr_err("svcrdma: failed to map mr %p (%d/%d elements)\n",
+  frmr->mr, n, frmr->sg_nents);
+   return n < 0 ? n : -EINVAL;
+   }
+
/* Bump the key */
key = (u8)(frmr->mr->lkey & 0x00FF);

[PATCH v4 22/26] RDMA/cxgb3: Remove old FRWR API

2015-10-12 Thread Sagi Grimberg
No ULP uses it anymore, go ahead and remove it.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/cxgb3/iwch_cq.c   |  2 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c | 24 ---
 drivers/infiniband/hw/cxgb3/iwch_qp.c   | 47 -
 3 files changed, 1 insertion(+), 72 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_cq.c b/drivers/infiniband/hw/cxgb3/iwch_cq.c
index cf5474ae68ff..cfe404925a39 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_cq.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cq.c
@@ -123,7 +123,7 @@ static int iwch_poll_cq_one(struct iwch_dev *rhp, struct iwch_cq *chp,
wc->opcode = IB_WC_LOCAL_INV;
break;
case T3_FAST_REGISTER:
-   wc->opcode = IB_WC_FAST_REG_MR;
+   wc->opcode = IB_WC_REG_MR;
break;
default:
printk(KERN_ERR MOD "Unexpected opcode %d "
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index ee3d5ca7de6c..99ae2ab14b9e 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -884,28 +884,6 @@ static int iwch_map_mr_sg(struct ib_mr *ibmr,
return ib_sg_to_pages(ibmr, sg, sg_nents, iwch_set_page);
 }
 
-static struct ib_fast_reg_page_list *iwch_alloc_fastreg_pbl(
-   struct ib_device *device,
-   int page_list_len)
-{
-   struct ib_fast_reg_page_list *page_list;
-
-   page_list = kmalloc(sizeof *page_list + page_list_len * sizeof(u64),
-   GFP_KERNEL);
-   if (!page_list)
-   return ERR_PTR(-ENOMEM);
-
-   page_list->page_list = (u64 *)(page_list + 1);
-   page_list->max_page_list_len = page_list_len;
-
-   return page_list;
-}
-
-static void iwch_free_fastreg_pbl(struct ib_fast_reg_page_list *page_list)
-{
-   kfree(page_list);
-}
-
 static int iwch_destroy_qp(struct ib_qp *ib_qp)
 {
struct iwch_dev *rhp;
@@ -1483,8 +1461,6 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dealloc_mw = iwch_dealloc_mw;
dev->ibdev.alloc_mr = iwch_alloc_mr;
dev->ibdev.map_mr_sg = iwch_map_mr_sg;
-   dev->ibdev.alloc_fast_reg_page_list = iwch_alloc_fastreg_pbl;
-   dev->ibdev.free_fast_reg_page_list = iwch_free_fastreg_pbl;
dev->ibdev.attach_mcast = iwch_multicast_attach;
dev->ibdev.detach_mcast = iwch_multicast_detach;
dev->ibdev.process_mad = iwch_process_mad;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index a09ea538e990..d0548fc6395e 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -189,48 +189,6 @@ static int build_memreg(union t3_wr *wqe, struct ib_reg_wr *wr,
return 0;
 }
 
-static int build_fastreg(union t3_wr *wqe, struct ib_send_wr *send_wr,
-   u8 *flit_cnt, int *wr_cnt, struct t3_wq *wq)
-{
-   struct ib_fast_reg_wr *wr = fast_reg_wr(send_wr);
-   int i;
-   __be64 *p;
-
-   if (wr->page_list_len > T3_MAX_FASTREG_DEPTH)
-   return -EINVAL;
-   *wr_cnt = 1;
-   wqe->fastreg.stag = cpu_to_be32(wr->rkey);
-   wqe->fastreg.len = cpu_to_be32(wr->length);
-   wqe->fastreg.va_base_hi = cpu_to_be32(wr->iova_start >> 32);
-   wqe->fastreg.va_base_lo_fbo = cpu_to_be32(wr->iova_start & 0x);
-   wqe->fastreg.page_type_perms = cpu_to_be32(
-   V_FR_PAGE_COUNT(wr->page_list_len) |
-   V_FR_PAGE_SIZE(wr->page_shift-12) |
-   V_FR_TYPE(TPT_VATO) |
-   V_FR_PERMS(iwch_ib_to_tpt_access(wr->access_flags)));
-   p = >fastreg.pbl_addrs[0];
-   for (i = 0; i < wr->page_list_len; i++, p++) {
-
-   /* If we need a 2nd WR, then set it up */
-   if (i == T3_MAX_FASTREG_FRAG) {
-   *wr_cnt = 2;
-   wqe = (union t3_wr *)(wq->queue +
-   Q_PTR2IDX((wq->wptr+1), wq->size_log2));
-   build_fw_riwrh((void *)wqe, T3_WR_FASTREG, 0,
-  Q_GENBIT(wq->wptr + 1, wq->size_log2),
-  0, 1 + wr->page_list_len - T3_MAX_FASTREG_FRAG,
-  T3_EOP);
-
-   p = >pbl_frag.pbl_addrs[0];
-   }
-   *p = cpu_to_be64((u64)wr->page_list->page_list[i]);
-   }
-   *flit_cnt = 5 + wr->page_list_len;
-   if (*flit_cnt > 15)
-   *flit_cnt = 15;
-   return 0;
-}
-
 static int build_inv_stag(union t3_wr *wqe, struct ib_send_wr *wr,
u8 *flit_cnt)
 {
@@ -457,11 +415,6 @@ int iwch_post_send(struct ib_qp *ibqp, 

[PATCH v4 21/26] RDMA/ocrdma: Remove old FRWR API

2015-10-12 Thread Sagi Grimberg
No ULP uses it anymore, go ahead and remove it.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/ocrdma/ocrdma_main.c  |   2 -
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 104 +---
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h |   4 --
 3 files changed, 1 insertion(+), 109 deletions(-)

diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 874beb4b07a1..9bf430ef8eb6 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -183,8 +183,6 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
 
dev->ibdev.alloc_mr = ocrdma_alloc_mr;
dev->ibdev.map_mr_sg = ocrdma_map_mr_sg;
-   dev->ibdev.alloc_fast_reg_page_list = ocrdma_alloc_frmr_page_list;
-   dev->ibdev.free_fast_reg_page_list = ocrdma_free_frmr_page_list;
 
/* mandatory to support user space verbs consumer. */
dev->ibdev.alloc_ucontext = ocrdma_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 66cc15a78f63..0b8598c9d56b 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -2133,41 +2133,6 @@ static void ocrdma_build_read(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr,
ext_rw->len = hdr->total_len;
 }
 
-static void build_frmr_pbes(struct ib_fast_reg_wr *wr,
-   struct ocrdma_pbl *pbl_tbl,
-   struct ocrdma_hw_mr *hwmr)
-{
-   int i;
-   u64 buf_addr = 0;
-   int num_pbes;
-   struct ocrdma_pbe *pbe;
-
-   pbe = (struct ocrdma_pbe *)pbl_tbl->va;
-   num_pbes = 0;
-
-   /* go through the OS phy regions & fill hw pbe entries into pbls. */
-   for (i = 0; i < wr->page_list_len; i++) {
-   /* number of pbes can be more for one OS buf, when
-* buffers are of different sizes.
-* split the ib_buf to one or more pbes.
-*/
-   buf_addr = wr->page_list->page_list[i];
-   pbe->pa_lo = cpu_to_le32((u32) (buf_addr & PAGE_MASK));
-   pbe->pa_hi = cpu_to_le32((u32) upper_32_bits(buf_addr));
-   num_pbes += 1;
-   pbe++;
-
-   /* if the pbl is full storing the pbes,
-* move to next pbl.
-   */
-   if (num_pbes == (hwmr->pbl_size/sizeof(u64))) {
-   pbl_tbl++;
-   pbe = (struct ocrdma_pbe *)pbl_tbl->va;
-   }
-   }
-   return;
-}
-
 static int get_encoded_page_size(int pg_sz)
 {
/* Max size is 256M 4096 << 16 */
@@ -2234,50 +2199,6 @@ static int ocrdma_build_reg(struct ocrdma_qp *qp,
return 0;
 }
 
-static int ocrdma_build_fr(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr,
-  struct ib_send_wr *send_wr)
-{
-   u64 fbo;
-   struct ib_fast_reg_wr *wr = fast_reg_wr(send_wr);
-   struct ocrdma_ewqe_fr *fast_reg = (struct ocrdma_ewqe_fr *)(hdr + 1);
-   struct ocrdma_mr *mr;
-   struct ocrdma_dev *dev = get_ocrdma_dev(qp->ibqp.device);
-   u32 wqe_size = sizeof(*fast_reg) + sizeof(*hdr);
-
-   wqe_size = roundup(wqe_size, OCRDMA_WQE_ALIGN_BYTES);
-
-   if (wr->page_list_len > dev->attr.max_pages_per_frmr)
-   return -EINVAL;
-
-   hdr->cw |= (OCRDMA_FR_MR << OCRDMA_WQE_OPCODE_SHIFT);
-   hdr->cw |= ((wqe_size / OCRDMA_WQE_STRIDE) << OCRDMA_WQE_SIZE_SHIFT);
-
-   if (wr->page_list_len == 0)
-   BUG();
-   if (wr->access_flags & IB_ACCESS_LOCAL_WRITE)
-   hdr->rsvd_lkey_flags |= OCRDMA_LKEY_FLAG_LOCAL_WR;
-   if (wr->access_flags & IB_ACCESS_REMOTE_WRITE)
-   hdr->rsvd_lkey_flags |= OCRDMA_LKEY_FLAG_REMOTE_WR;
-   if (wr->access_flags & IB_ACCESS_REMOTE_READ)
-   hdr->rsvd_lkey_flags |= OCRDMA_LKEY_FLAG_REMOTE_RD;
-   hdr->lkey = wr->rkey;
-   hdr->total_len = wr->length;
-
-   fbo = wr->iova_start - (wr->page_list->page_list[0] & PAGE_MASK);
-
-   fast_reg->va_hi = upper_32_bits(wr->iova_start);
-   fast_reg->va_lo = (u32) (wr->iova_start & 0x);
-   fast_reg->fbo_hi = upper_32_bits(fbo);
-   fast_reg->fbo_lo = (u32) fbo & 0x;
-   fast_reg->num_sges = wr->page_list_len;
-   fast_reg->size_sge =
-   get_encoded_page_size(1 << wr->page_shift);
-   mr = (struct ocrdma_mr *) (unsigned long)
-   dev->stag_arr[(hdr->lkey >> 8) & (OCRDMA_MAX_STAG - 1)];
-   build_frmr_pbes(wr, mr->hwmr.pbl_table, >hwmr);
-   return 0;
-}
-
 static void ocrdma_ring_sq_db(struct ocrdma_qp *qp)
 {
u32 val = qp->sq.dbid | (1 << OCRDMA_DB_SQ_SHIFT);
@@ -2357,9 +2278,6 @@ int ocrdma_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,

[PATCH v4 25/26] RDMA/nes: Remove old FRWR API

2015-10-12 Thread Sagi Grimberg
No ULP uses it anymore, go ahead and remove it.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/nes/nes_hw.h|   6 --
 drivers/infiniband/hw/nes/nes_verbs.c | 162 +-
 2 files changed, 1 insertion(+), 167 deletions(-)

diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h
index d748e4b31b8d..c9080208aad2 100644
--- a/drivers/infiniband/hw/nes/nes_hw.h
+++ b/drivers/infiniband/hw/nes/nes_hw.h
@@ -1200,12 +1200,6 @@ struct nes_fast_mr_wqe_pbl {
dma_addr_t  paddr;
 };
 
-struct nes_ib_fast_reg_page_list {
-   struct ib_fast_reg_page_listibfrpl;
-   struct nes_fast_mr_wqe_pbl  nes_wqe_pbl;
-   u64 pbl;
-};
-
 struct nes_listener {
struct work_struct  work;
struct workqueue_struct *wq;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index f5b0cc972403..c58091a6635f 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -486,76 +486,6 @@ static int nes_map_mr_sg(struct ib_mr *ibmr,
return ib_sg_to_pages(ibmr, sg, sg_nents, nes_set_page);
 }
 
-/*
- * nes_alloc_fast_reg_page_list
- */
-static struct ib_fast_reg_page_list *nes_alloc_fast_reg_page_list(
-   struct ib_device *ibdev,
-   int page_list_len)
-{
-   struct nes_vnic *nesvnic = to_nesvnic(ibdev);
-   struct nes_device *nesdev = nesvnic->nesdev;
-   struct ib_fast_reg_page_list *pifrpl;
-   struct nes_ib_fast_reg_page_list *pnesfrpl;
-
-   if (page_list_len > (NES_4K_PBL_CHUNK_SIZE / sizeof(u64)))
-   return ERR_PTR(-E2BIG);
-   /*
-* Allocate the ib_fast_reg_page_list structure, the
-* nes_fast_bpl structure, and the PLB table.
-*/
-   pnesfrpl = kmalloc(sizeof(struct nes_ib_fast_reg_page_list) +
-  page_list_len * sizeof(u64), GFP_KERNEL);
-
-   if (!pnesfrpl)
-   return ERR_PTR(-ENOMEM);
-
-   pifrpl = >ibfrpl;
-   pifrpl->page_list = >pbl;
-   pifrpl->max_page_list_len = page_list_len;
-   /*
-* Allocate the WQE PBL
-*/
-   pnesfrpl->nes_wqe_pbl.kva = pci_alloc_consistent(nesdev->pcidev,
-page_list_len * sizeof(u64),
-&pnesfrpl->nes_wqe_pbl.paddr);
-
-   if (!pnesfrpl->nes_wqe_pbl.kva) {
-   kfree(pnesfrpl);
-   return ERR_PTR(-ENOMEM);
-   }
-   nes_debug(NES_DBG_MR, "nes_alloc_fast_reg_pbl: nes_frpl = %p, "
- "ibfrpl = %p, ibfrpl.page_list = %p, pbl.kva = %p, "
- "pbl.paddr = %llx\n", pnesfrpl, >ibfrpl,
- pnesfrpl->ibfrpl.page_list, pnesfrpl->nes_wqe_pbl.kva,
- (unsigned long long) pnesfrpl->nes_wqe_pbl.paddr);
-
-   return pifrpl;
-}
-
-/*
- * nes_free_fast_reg_page_list
- */
-static void nes_free_fast_reg_page_list(struct ib_fast_reg_page_list *pifrpl)
-{
-   struct nes_vnic *nesvnic = to_nesvnic(pifrpl->device);
-   struct nes_device *nesdev = nesvnic->nesdev;
-   struct nes_ib_fast_reg_page_list *pnesfrpl;
-
-   pnesfrpl = container_of(pifrpl, struct nes_ib_fast_reg_page_list, 
ibfrpl);
-   /*
-* Free the WQE PBL.
-*/
-   pci_free_consistent(nesdev->pcidev,
-   pifrpl->max_page_list_len * sizeof(u64),
-   pnesfrpl->nes_wqe_pbl.kva,
-   pnesfrpl->nes_wqe_pbl.paddr);
-   /*
-* Free the PBL structure
-*/
-   kfree(pnesfrpl);
-}
-
 /**
  * nes_query_device
  */
@@ -3470,94 +3400,6 @@ static int nes_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *ib_wr,

NES_IWARP_SQ_LOCINV_WQE_INV_STAG_IDX,
ib_wr->ex.invalidate_rkey);
break;
-   case IB_WR_FAST_REG_MR:
-   {
-   int i;
-   struct ib_fast_reg_wr *fwr = fast_reg_wr(ib_wr);
-   int flags = fwr->access_flags;
-   struct nes_ib_fast_reg_page_list *pnesfrpl =
-   container_of(fwr->page_list,
-struct nes_ib_fast_reg_page_list,
-ibfrpl);
-   u64 *src_page_list = pnesfrpl->ibfrpl.page_list;
-   u64 *dst_page_list = pnesfrpl->nes_wqe_pbl.kva;
-
-   if (fwr->page_list_len >
-   (NES_4K_PBL_CHUNK_SIZE / sizeof(u64))) {
-   nes_debug(NES_DBG_IW_TX, "SQ_FMR: bad 
page_list_len\n");
-  

[PATCH v4 20/26] IB/mlx4: Remove old FRWR API support

2015-10-12 Thread Sagi Grimberg
No ULP uses it anymore, go ahead and remove it.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/mlx4/cq.c  |  3 +--
 drivers/infiniband/hw/mlx4/main.c|  2 --
 drivers/infiniband/hw/mlx4/mlx4_ib.h | 15 ---
 drivers/infiniband/hw/mlx4/mr.c  | 48 
 drivers/infiniband/hw/mlx4/qp.c  | 31 ---
 5 files changed, 1 insertion(+), 98 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 2ea4125b7903..b88fc8f5ab18 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -818,8 +818,7 @@ repoll:
wc->opcode= IB_WC_LSO;
break;
case MLX4_OPCODE_FMR:
-   wc->opcode= IB_WC_FAST_REG_MR;
-   /* TODO: wc->opcode= IB_WC_REG_MR; */
+   wc->opcode= IB_WC_REG_MR;
break;
case MLX4_OPCODE_LOCAL_INVAL:
wc->opcode= IB_WC_LOCAL_INV;
diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 19191ac0783c..0d7698e11329 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2250,8 +2250,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.dereg_mr  = mlx4_ib_dereg_mr;
ibdev->ib_dev.alloc_mr  = mlx4_ib_alloc_mr;
ibdev->ib_dev.map_mr_sg = mlx4_ib_map_mr_sg;
-   ibdev->ib_dev.alloc_fast_reg_page_list = 
mlx4_ib_alloc_fast_reg_page_list;
-   ibdev->ib_dev.free_fast_reg_page_list  = 
mlx4_ib_free_fast_reg_page_list;
ibdev->ib_dev.attach_mcast  = mlx4_ib_mcg_attach;
ibdev->ib_dev.detach_mcast  = mlx4_ib_mcg_detach;
ibdev->ib_dev.process_mad   = mlx4_ib_process_mad;
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h 
b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index d6214577ecf8..4c6924791771 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -147,12 +147,6 @@ struct mlx4_ib_mw {
struct mlx4_mw  mmw;
 };
 
-struct mlx4_ib_fast_reg_page_list {
-   struct ib_fast_reg_page_listibfrpl;
-   __be64 *mapped_page_list;
-   dma_addr_t  map;
-};
-
 struct mlx4_ib_fmr {
struct ib_fmr   ibfmr;
struct mlx4_fmr mfmr;
@@ -645,11 +639,6 @@ static inline struct mlx4_ib_mw *to_mmw(struct ib_mw *ibmw)
return container_of(ibmw, struct mlx4_ib_mw, ibmw);
 }
 
-static inline struct mlx4_ib_fast_reg_page_list *to_mfrpl(struct 
ib_fast_reg_page_list *ibfrpl)
-{
-   return container_of(ibfrpl, struct mlx4_ib_fast_reg_page_list, ibfrpl);
-}
-
 static inline struct mlx4_ib_fmr *to_mfmr(struct ib_fmr *ibfmr)
 {
return container_of(ibfmr, struct mlx4_ib_fmr, ibfmr);
@@ -716,10 +705,6 @@ struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd,
 int mlx4_ib_map_mr_sg(struct ib_mr *ibmr,
  struct scatterlist *sg,
  unsigned int sg_nents);
-struct ib_fast_reg_page_list *mlx4_ib_alloc_fast_reg_page_list(struct 
ib_device *ibdev,
-  int 
page_list_len);
-void mlx4_ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list);
-
 int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period);
 int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata);
 struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 96fc7ed99fb8..33f7ffac7b19 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -443,54 +443,6 @@ err_free:
return ERR_PTR(err);
 }
 
-struct ib_fast_reg_page_list *mlx4_ib_alloc_fast_reg_page_list(struct 
ib_device *ibdev,
-  int 
page_list_len)
-{
-   struct mlx4_ib_dev *dev = to_mdev(ibdev);
-   struct mlx4_ib_fast_reg_page_list *mfrpl;
-   int size = page_list_len * sizeof (u64);
-
-   if (page_list_len > MLX4_MAX_FAST_REG_PAGES)
-   return ERR_PTR(-EINVAL);
-
-   mfrpl = kmalloc(sizeof *mfrpl, GFP_KERNEL);
-   if (!mfrpl)
-   return ERR_PTR(-ENOMEM);
-
-   mfrpl->ibfrpl.page_list = kmalloc(size, GFP_KERNEL);
-   if (!mfrpl->ibfrpl.page_list)
-   goto err_free;
-
-   mfrpl->mapped_page_list = dma_alloc_coherent(&dev->dev->persist->
-                                                pdev->dev,
-                                                size, &mfrpl->map,
-                                                GFP_KERNEL);
-   if (!mfrpl->mapped_page_list)
-   goto err_free;
-
-   WARN_ON(mfrpl->map & 0x3f);
-
-   

[PATCH v4 19/26] IB/mlx5: Remove old FRWR API support

2015-10-12 Thread Sagi Grimberg
No ULP uses it anymore, go ahead and remove it.
Keep only the local invalidate part of the handlers.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/mlx5/cq.c  |  3 --
 drivers/infiniband/hw/mlx5/main.c|  2 -
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 14 --
 drivers/infiniband/hw/mlx5/mr.c  | 42 
 drivers/infiniband/hw/mlx5/qp.c  | 97 
 5 files changed, 9 insertions(+), 149 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 206930096d56..3dfd287256d6 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -112,9 +112,6 @@ static enum ib_wc_opcode get_umr_comp(struct mlx5_ib_wq 
*wq, int idx)
case IB_WR_REG_MR:
return IB_WC_REG_MR;
 
-   case IB_WR_FAST_REG_MR:
-   return IB_WC_FAST_REG_MR;
-
default:
pr_warn("unknown completion status\n");
return 0;
diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index 7e93044ea6ce..bdd60a69be2d 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1426,8 +1426,6 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
dev->ib_dev.process_mad = mlx5_ib_process_mad;
dev->ib_dev.alloc_mr= mlx5_ib_alloc_mr;
dev->ib_dev.map_mr_sg   = mlx5_ib_map_mr_sg;
-   dev->ib_dev.alloc_fast_reg_page_list = mlx5_ib_alloc_fast_reg_page_list;
-   dev->ib_dev.free_fast_reg_page_list  = mlx5_ib_free_fast_reg_page_list;
dev->ib_dev.check_mr_status = mlx5_ib_check_mr_status;
dev->ib_dev.get_port_immutable  = mlx5_port_immutable;
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h 
b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 72672ae48296..86bf036887d0 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -338,12 +338,6 @@ struct mlx5_ib_mr {
void*descs_alloc;
 };
 
-struct mlx5_ib_fast_reg_page_list {
-   struct ib_fast_reg_page_listibfrpl;
-   __be64 *mapped_page_list;
-   dma_addr_t  map;
-};
-
 struct mlx5_ib_umr_context {
enum ib_wc_status   status;
struct completion   done;
@@ -494,11 +488,6 @@ static inline struct mlx5_ib_mr *to_mmr(struct ib_mr *ibmr)
return container_of(ibmr, struct mlx5_ib_mr, ibmr);
 }
 
-static inline struct mlx5_ib_fast_reg_page_list *to_mfrpl(struct 
ib_fast_reg_page_list *ibfrpl)
-{
-   return container_of(ibfrpl, struct mlx5_ib_fast_reg_page_list, ibfrpl);
-}
-
 struct mlx5_ib_ah {
struct ib_ahibah;
struct mlx5_av  av;
@@ -569,9 +558,6 @@ struct ib_mr *mlx5_ib_alloc_mr(struct ib_pd *pd,
 int mlx5_ib_map_mr_sg(struct ib_mr *ibmr,
  struct scatterlist *sg,
  unsigned int sg_nents);
-struct ib_fast_reg_page_list *mlx5_ib_alloc_fast_reg_page_list(struct 
ib_device *ibdev,
-  int 
page_list_len);
-void mlx5_ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list);
 int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
const struct ib_wc *in_wc, const struct ib_grh *in_grh,
const struct ib_mad_hdr *in, size_t in_mad_size,
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index cabfe4190657..277499c4f156 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1383,48 +1383,6 @@ err_free:
return ERR_PTR(err);
 }
 
-struct ib_fast_reg_page_list *mlx5_ib_alloc_fast_reg_page_list(struct 
ib_device *ibdev,
-  int 
page_list_len)
-{
-   struct mlx5_ib_fast_reg_page_list *mfrpl;
-   int size = page_list_len * sizeof(u64);
-
-   mfrpl = kmalloc(sizeof(*mfrpl), GFP_KERNEL);
-   if (!mfrpl)
-   return ERR_PTR(-ENOMEM);
-
-   mfrpl->ibfrpl.page_list = kmalloc(size, GFP_KERNEL);
-   if (!mfrpl->ibfrpl.page_list)
-   goto err_free;
-
-   mfrpl->mapped_page_list = dma_alloc_coherent(ibdev->dma_device,
-                                                size, &mfrpl->map,
-                                                GFP_KERNEL);
-   if (!mfrpl->mapped_page_list)
-   goto err_free;
-
-   WARN_ON(mfrpl->map & 0x3f);
-
-   return &mfrpl->ibfrpl;
-
-err_free:
-   kfree(mfrpl->ibfrpl.page_list);
-   kfree(mfrpl);
-   return ERR_PTR(-ENOMEM);
-}
-
-void mlx5_ib_free_fast_reg_page_list(struct ib_fast_reg_page_list *page_list)
-{
-   struct mlx5_ib_fast_reg_page_list *mfrpl = to_mfrpl(page_list);
-   struct mlx5_ib_dev *dev = to_mdev(page_list->device);
- 

[PATCH v4 10/26] IB/iser: Port to new fast registration API

2015-10-12 Thread Sagi Grimberg
Remove the fastreg page list allocation, as the page vector
is now private to the provider. Instead of constructing the
page list and a fast_reg work request, call ib_map_mr_sg and
construct an ib_reg_wr.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/ulp/iser/iscsi_iser.h  |  8 ++---
 drivers/infiniband/ulp/iser/iser_memory.c | 54 ++-
 drivers/infiniband/ulp/iser/iser_verbs.c  | 16 +
 3 files changed, 27 insertions(+), 51 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h 
b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 2484bee993ec..271aa71e827c 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -297,7 +297,7 @@ struct iser_tx_desc {
u8   wr_idx;
union iser_wr {
struct ib_send_wr   send;
-   struct ib_fast_reg_wr   fast_reg;
+   struct ib_reg_wrfast_reg;
struct ib_sig_handover_wr   sig;
} wrs[ISER_MAX_WRS];
struct iser_mem_reg  data_reg;
@@ -412,7 +412,6 @@ struct iser_device {
  *
  * @mr: memory region
  * @fmr_pool:   pool of fmrs
- * @frpl:   fast reg page list used by frwrs
  * @page_vec:   fast reg page list used by fmr pool
  * @mr_valid:   is mr valid indicator
  */
@@ -421,10 +420,7 @@ struct iser_reg_resources {
struct ib_mr *mr;
struct ib_fmr_pool   *fmr_pool;
};
-   union {
-   struct ib_fast_reg_page_list *frpl;
-   struct iser_page_vec *page_vec;
-   };
+   struct iser_page_vec *page_vec;
u8mr_valid:1;
 };
 
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c 
b/drivers/infiniband/ulp/iser/iser_memory.c
index b29fda3e8e74..ea765fb9664d 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -472,7 +472,7 @@ iser_reg_sig_mr(struct iscsi_iser_task *iser_task,
sig_reg->sge.addr = 0;
sig_reg->sge.length = scsi_transfer_length(iser_task->sc);
 
-   iser_dbg("sig reg: lkey: 0x%x, rkey: 0x%x, addr: 0x%llx, length: %u\n",
+   iser_dbg("lkey=0x%x rkey=0x%x addr=0x%llx length=%u\n",
 sig_reg->sge.lkey, sig_reg->rkey, sig_reg->sge.addr,
 sig_reg->sge.length);
 err:
@@ -484,47 +484,41 @@ static int iser_fast_reg_mr(struct iscsi_iser_task 
*iser_task,
struct iser_reg_resources *rsc,
struct iser_mem_reg *reg)
 {
-   struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
-   struct iser_device *device = ib_conn->device;
-   struct ib_mr *mr = rsc->mr;
-   struct ib_fast_reg_page_list *frpl = rsc->frpl;
struct iser_tx_desc *tx_desc = &iser_task->desc;
-   struct ib_fast_reg_wr *wr;
-   int offset, size, plen;
-
-   plen = iser_sg_to_page_vec(mem, device->ib_device, frpl->page_list,
-                              &offset, &size);
-   if (plen * SIZE_4K < size) {
-   iser_err("fast reg page_list too short to hold this SG\n");
-   return -EINVAL;
-   }
+   struct ib_mr *mr = rsc->mr;
+   struct ib_reg_wr *wr;
+   int n;
 
if (!rsc->mr_valid)
iser_inv_rkey(iser_tx_next_wr(tx_desc), mr);
 
-   wr = fast_reg_wr(iser_tx_next_wr(tx_desc));
-   wr->wr.opcode = IB_WR_FAST_REG_MR;
+   n = ib_map_mr_sg(mr, mem->sg, mem->size, SIZE_4K);
+   if (unlikely(n != mem->size)) {
+   iser_err("failed to map sg (%d/%d)\n",
+n, mem->size);
+   return n < 0 ? n : -EINVAL;
+   }
+
+   wr = reg_wr(iser_tx_next_wr(tx_desc));
+   wr->wr.opcode = IB_WR_REG_MR;
wr->wr.wr_id = ISER_FASTREG_LI_WRID;
wr->wr.send_flags = 0;
-   wr->iova_start = frpl->page_list[0] + offset;
-   wr->page_list = frpl;
-   wr->page_list_len = plen;
-   wr->page_shift = SHIFT_4K;
-   wr->length = size;
-   wr->rkey = mr->rkey;
-   wr->access_flags = (IB_ACCESS_LOCAL_WRITE  |
-   IB_ACCESS_REMOTE_WRITE |
-   IB_ACCESS_REMOTE_READ);
+   wr->wr.num_sge = 0;
+   wr->mr = mr;
+   wr->key = mr->rkey;
+   wr->access = IB_ACCESS_LOCAL_WRITE  |
+IB_ACCESS_REMOTE_WRITE |
+IB_ACCESS_REMOTE_READ;
+
rsc->mr_valid = 0;
 
reg->sge.lkey = mr->lkey;
reg->rkey = mr->rkey;
-   reg->sge.addr = frpl->page_list[0] + offset;
-   reg->sge.length = size;
+   reg->sge.addr = mr->iova;
+   reg->sge.length = mr->length;
 
-   iser_dbg("fast reg: lkey=0x%x, rkey=0x%x, addr=0x%llx,"
-" length=0x%x\n", reg->sge.lkey, reg->rkey,
-reg->sge.addr, 

[PATCH v4 24/26] IB/qib: Remove old FRWR API

2015-10-12 Thread Sagi Grimberg
No ULP uses it anymore, go ahead and remove it.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/qib/qib_keys.c  | 56 ---
 drivers/infiniband/hw/qib/qib_mr.c| 32 +---
 drivers/infiniband/hw/qib/qib_verbs.c |  8 -
 drivers/infiniband/hw/qib/qib_verbs.h |  7 -
 4 files changed, 1 insertion(+), 102 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_keys.c 
b/drivers/infiniband/hw/qib/qib_keys.c
index 95b8b9110fc6..d725c565518d 100644
--- a/drivers/infiniband/hw/qib/qib_keys.c
+++ b/drivers/infiniband/hw/qib/qib_keys.c
@@ -336,62 +336,6 @@ bail:
 }
 
 /*
- * Initialize the memory region specified by the work reqeust.
- */
-int qib_fast_reg_mr(struct qib_qp *qp, struct ib_send_wr *send_wr)
-{
-   struct ib_fast_reg_wr *wr = fast_reg_wr(send_wr);
-   struct qib_lkey_table *rkt = &to_idev(qp->ibqp.device)->lk_table;
-   struct qib_pd *pd = to_ipd(qp->ibqp.pd);
-   struct qib_mregion *mr;
-   u32 rkey = wr->rkey;
-   unsigned i, n, m;
-   int ret = -EINVAL;
-   unsigned long flags;
-   u64 *page_list;
-   size_t ps;
-
-   spin_lock_irqsave(&rkt->lock, flags);
-   if (pd->user || rkey == 0)
-   goto bail;
-
-   mr = rcu_dereference_protected(
-   rkt->table[(rkey >> (32 - ib_qib_lkey_table_size))],
-   lockdep_is_held(>lock));
-   if (unlikely(mr == NULL || qp->ibqp.pd != mr->pd))
-   goto bail;
-
-   if (wr->page_list_len > mr->max_segs)
-   goto bail;
-
-   ps = 1UL << wr->page_shift;
-   if (wr->length > ps * wr->page_list_len)
-   goto bail;
-
-   mr->user_base = wr->iova_start;
-   mr->iova = wr->iova_start;
-   mr->lkey = rkey;
-   mr->length = wr->length;
-   mr->access_flags = wr->access_flags;
-   page_list = wr->page_list->page_list;
-   m = 0;
-   n = 0;
-   for (i = 0; i < wr->page_list_len; i++) {
-   mr->map[m]->segs[n].vaddr = (void *) page_list[i];
-   mr->map[m]->segs[n].length = ps;
-   if (++n == QIB_SEGSZ) {
-   m++;
-   n = 0;
-   }
-   }
-
-   ret = 0;
-bail:
-   spin_unlock_irqrestore(&rkt->lock, flags);
-   return ret;
-}
-
-/*
  * Initialize the memory region specified by the work request.
  */
 int qib_reg_mr(struct qib_qp *qp, struct ib_reg_wr *wr)
diff --git a/drivers/infiniband/hw/qib/qib_mr.c 
b/drivers/infiniband/hw/qib/qib_mr.c
index 0fa4b0de8074..73f78c0f9522 100644
--- a/drivers/infiniband/hw/qib/qib_mr.c
+++ b/drivers/infiniband/hw/qib/qib_mr.c
@@ -324,7 +324,7 @@ out:
 
 /*
  * Allocate a memory region usable with the
- * IB_WR_FAST_REG_MR send work request.
+ * IB_WR_REG_MR send work request.
  *
  * Return the memory region on success, otherwise return an errno.
  */
@@ -375,36 +375,6 @@ int qib_map_mr_sg(struct ib_mr *ibmr,
return ib_sg_to_pages(ibmr, sg, sg_nents, qib_set_page);
 }
 
-struct ib_fast_reg_page_list *
-qib_alloc_fast_reg_page_list(struct ib_device *ibdev, int page_list_len)
-{
-   unsigned size = page_list_len * sizeof(u64);
-   struct ib_fast_reg_page_list *pl;
-
-   if (size > PAGE_SIZE)
-   return ERR_PTR(-EINVAL);
-
-   pl = kzalloc(sizeof(*pl), GFP_KERNEL);
-   if (!pl)
-   return ERR_PTR(-ENOMEM);
-
-   pl->page_list = kzalloc(size, GFP_KERNEL);
-   if (!pl->page_list)
-   goto err_free;
-
-   return pl;
-
-err_free:
-   kfree(pl);
-   return ERR_PTR(-ENOMEM);
-}
-
-void qib_free_fast_reg_page_list(struct ib_fast_reg_page_list *pl)
-{
-   kfree(pl->page_list);
-   kfree(pl);
-}
-
 /**
  * qib_alloc_fmr - allocate a fast memory region
  * @pd: the protection domain for this memory region
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c 
b/drivers/infiniband/hw/qib/qib_verbs.c
index a1e53d7b662b..de6cb6fcda8d 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -365,9 +365,6 @@ static int qib_post_one_send(struct qib_qp *qp, struct 
ib_send_wr *wr,
if (wr->opcode == IB_WR_REG_MR) {
if (qib_reg_mr(qp, reg_wr(wr)))
goto bail_inval;
-   } else if (wr->opcode == IB_WR_FAST_REG_MR) {
-   if (qib_fast_reg_mr(qp, wr))
-   goto bail_inval;
} else if (qp->ibqp.qp_type == IB_QPT_UC) {
if ((unsigned) wr->opcode >= IB_WR_RDMA_READ)
goto bail_inval;
@@ -407,9 +404,6 @@ static int qib_post_one_send(struct qib_qp *qp, struct 
ib_send_wr *wr,
else if (wr->opcode == IB_WR_REG_MR)
memcpy(>reg_wr, reg_wr(wr),
sizeof(wqe->reg_wr));
-   else if (wr->opcode == IB_WR_FAST_REG_MR)
-   memcpy(>fast_reg_wr, fast_reg_wr(wr),
-   

[PATCH v4 15/26] IB/srp: Split srp_map_sg

2015-10-12 Thread Sagi Grimberg
This is a preparation patch for the new registration API
conversion. It splits srp_map_sg per registration strategy
(srp_map_sg_[fmr|fr|dma]). On its own it adds some code
duplication, but it makes the API switch easier to comprehend.

Signed-off-by: Sagi Grimberg 
Tested-by: Bart Van Assche 
---
 drivers/infiniband/ulp/srp/ib_srp.c | 157 
 1 file changed, 106 insertions(+), 51 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 1390f99ca76b..3ec94c109e1b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1286,6 +1286,17 @@ static int srp_map_finish_fmr(struct srp_map_state 
*state,
if (state->fmr.next >= state->fmr.end)
return -ENOMEM;
 
+   WARN_ON_ONCE(!dev->use_fmr);
+
+   if (state->npages == 0)
+   return 0;
+
+   if (state->npages == 1 && target->global_mr) {
+   srp_map_desc(state, state->base_dma_addr, state->dma_len,
+target->global_mr->rkey);
+   goto reset_state;
+   }
+
fmr = ib_fmr_pool_map_phys(ch->fmr_pool, state->pages,
   state->npages, io_addr);
if (IS_ERR(fmr))
@@ -1297,6 +1308,10 @@ static int srp_map_finish_fmr(struct srp_map_state 
*state,
srp_map_desc(state, state->base_dma_addr & ~dev->mr_page_mask,
 state->dma_len, fmr->fmr->rkey);
 
+reset_state:
+   state->npages = 0;
+   state->dma_len = 0;
+
return 0;
 }
 
@@ -1309,10 +1324,22 @@ static int srp_map_finish_fr(struct srp_map_state 
*state,
struct ib_fast_reg_wr wr;
struct srp_fr_desc *desc;
u32 rkey;
+   int err;
 
if (state->fr.next >= state->fr.end)
return -ENOMEM;
 
+   WARN_ON_ONCE(!dev->use_fast_reg);
+
+   if (state->npages == 0)
+   return 0;
+
+   if (state->npages == 1 && target->global_mr) {
+   srp_map_desc(state, state->base_dma_addr, state->dma_len,
+target->global_mr->rkey);
+   goto reset_state;
+   }
+
desc = srp_fr_pool_get(ch->fr_pool);
if (!desc)
return -ENOMEM;
@@ -1342,7 +1369,15 @@ static int srp_map_finish_fr(struct srp_map_state *state,
srp_map_desc(state, state->base_dma_addr, state->dma_len,
 desc->mr->rkey);
 
-   return ib_post_send(ch->qp, &wr.wr, &bad_wr);
+   err = ib_post_send(ch->qp, &wr.wr, &bad_wr);
+   if (err)
+   return err;
+
+reset_state:
+   state->npages = 0;
+   state->dma_len = 0;
+
+   return 0;
 }
 
 static int srp_finish_mapping(struct srp_map_state *state,
@@ -1350,26 +1385,9 @@ static int srp_finish_mapping(struct srp_map_state 
*state,
 {
struct srp_target_port *target = ch->target;
struct srp_device *dev = target->srp_host->srp_dev;
-   int ret = 0;
 
-   WARN_ON_ONCE(!dev->use_fast_reg && !dev->use_fmr);
-
-   if (state->npages == 0)
-   return 0;
-
-   if (state->npages == 1 && target->global_mr)
-   srp_map_desc(state, state->base_dma_addr, state->dma_len,
-target->global_mr->rkey);
-   else
-   ret = dev->use_fast_reg ? srp_map_finish_fr(state, ch) :
-   srp_map_finish_fmr(state, ch);
-
-   if (ret == 0) {
-   state->npages = 0;
-   state->dma_len = 0;
-   }
-
-   return ret;
+   return dev->use_fast_reg ? srp_map_finish_fr(state, ch) :
+  srp_map_finish_fmr(state, ch);
 }
 
 static int srp_map_sg_entry(struct srp_map_state *state,
@@ -1415,47 +1433,79 @@ static int srp_map_sg_entry(struct srp_map_state *state,
return ret;
 }
 
-static int srp_map_sg(struct srp_map_state *state, struct srp_rdma_ch *ch,
- struct srp_request *req, struct scatterlist *scat,
- int count)
+static int srp_map_sg_fmr(struct srp_map_state *state, struct srp_rdma_ch *ch,
+ struct srp_request *req, struct scatterlist *scat,
+ int count)
 {
-   struct srp_target_port *target = ch->target;
-   struct srp_device *dev = target->srp_host->srp_dev;
struct scatterlist *sg;
int i, ret;
 
-   state->desc = req->indirect_desc;
-   state->pages= req->map_page;
-   if (dev->use_fast_reg) {
-   state->fr.next = req->fr_list;
-   state->fr.end = req->fr_list + target->cmd_sg_cnt;
-   } else if (dev->use_fmr) {
-   state->fmr.next = req->fmr_list;
-   state->fmr.end = req->fmr_list + target->cmd_sg_cnt;
+   state->desc = req->indirect_desc;
+   state->pages = req->map_page;
+   state->fmr.next = req->fmr_list;
+   state->fmr.end = req->fmr_list + 

[PATCH v4 05/26] RDMA/ocrdma: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in ocrdma_mr and populating it when
ocrdma_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by duplicating IB_WR_FAST_REG_MR, but taking the needed
information from different places:
- page_size, iova, length, access flags (ib_mr)
- page array (ocrdma_mr)
- key (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once
all the ULPs are converted.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/ocrdma/ocrdma.h   |  2 +
 drivers/infiniband/hw/ocrdma/ocrdma_main.c  |  1 +
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 90 +
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h |  3 +
 4 files changed, 96 insertions(+)

diff --git a/drivers/infiniband/hw/ocrdma/ocrdma.h 
b/drivers/infiniband/hw/ocrdma/ocrdma.h
index b4091ab48db0..c2f3af5d5194 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma.h
@@ -193,6 +193,8 @@ struct ocrdma_mr {
struct ib_mr ibmr;
struct ib_umem *umem;
struct ocrdma_hw_mr hwmr;
+   u64 *pages;
+   u32 npages;
 };
 
 struct ocrdma_stats {
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c 
b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 87aa55df7c82..874beb4b07a1 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -182,6 +182,7 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
dev->ibdev.reg_user_mr = ocrdma_reg_user_mr;
 
dev->ibdev.alloc_mr = ocrdma_alloc_mr;
+   dev->ibdev.map_mr_sg = ocrdma_map_mr_sg;
dev->ibdev.alloc_fast_reg_page_list = ocrdma_alloc_frmr_page_list;
dev->ibdev.free_fast_reg_page_list = ocrdma_free_frmr_page_list;
 
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c 
b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index eb09e224acb9..66cc15a78f63 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -1013,6 +1013,7 @@ int ocrdma_dereg_mr(struct ib_mr *ib_mr)
 
(void) ocrdma_mbx_dealloc_lkey(dev, mr->hwmr.fr_mr, mr->hwmr.lkey);
 
+   kfree(mr->pages);
ocrdma_free_mr_pbl_tbl(dev, &mr->hwmr);
 
/* it could be user registered memory. */
@@ -2177,6 +2178,61 @@ static int get_encoded_page_size(int pg_sz)
return i;
 }
 
+static int ocrdma_build_reg(struct ocrdma_qp *qp,
+   struct ocrdma_hdr_wqe *hdr,
+   struct ib_reg_wr *wr)
+{
+   u64 fbo;
+   struct ocrdma_ewqe_fr *fast_reg = (struct ocrdma_ewqe_fr *)(hdr + 1);
+   struct ocrdma_mr *mr = get_ocrdma_mr(wr->mr);
+   struct ocrdma_pbl *pbl_tbl = mr->hwmr.pbl_table;
+   struct ocrdma_pbe *pbe;
+   u32 wqe_size = sizeof(*fast_reg) + sizeof(*hdr);
+   int num_pbes = 0, i;
+
+   wqe_size = roundup(wqe_size, OCRDMA_WQE_ALIGN_BYTES);
+
+   hdr->cw |= (OCRDMA_FR_MR << OCRDMA_WQE_OPCODE_SHIFT);
+   hdr->cw |= ((wqe_size / OCRDMA_WQE_STRIDE) << OCRDMA_WQE_SIZE_SHIFT);
+
+   if (wr->access & IB_ACCESS_LOCAL_WRITE)
+   hdr->rsvd_lkey_flags |= OCRDMA_LKEY_FLAG_LOCAL_WR;
+   if (wr->access & IB_ACCESS_REMOTE_WRITE)
+   hdr->rsvd_lkey_flags |= OCRDMA_LKEY_FLAG_REMOTE_WR;
+   if (wr->access & IB_ACCESS_REMOTE_READ)
+   hdr->rsvd_lkey_flags |= OCRDMA_LKEY_FLAG_REMOTE_RD;
+   hdr->lkey = wr->key;
+   hdr->total_len = mr->ibmr.length;
+
+   fbo = mr->ibmr.iova - mr->pages[0];
+
+   fast_reg->va_hi = upper_32_bits(mr->ibmr.iova);
+   fast_reg->va_lo = (u32) (mr->ibmr.iova & 0x);
+   fast_reg->fbo_hi = upper_32_bits(fbo);
+   fast_reg->fbo_lo = (u32) fbo & 0x;
+   fast_reg->num_sges = mr->npages;
+   fast_reg->size_sge = get_encoded_page_size(mr->ibmr.page_size);
+
+   pbe = pbl_tbl->va;
+   for (i = 0; i < mr->npages; i++) {
+   u64 buf_addr = mr->pages[i];
+
+   pbe->pa_lo = cpu_to_le32((u32) (buf_addr & PAGE_MASK));
+   pbe->pa_hi = cpu_to_le32((u32) upper_32_bits(buf_addr));
+   num_pbes += 1;
+   pbe++;
+
+   /* if the pbl is full storing the pbes,
+* move to next pbl.
+   */
+   if (num_pbes == (mr->hwmr.pbl_size/sizeof(u64))) {
+   pbl_tbl++;
+   pbe = (struct ocrdma_pbe *)pbl_tbl->va;
+   }
+   }
+
+   return 0;
+}
 
 static int ocrdma_build_fr(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr,
   struct ib_send_wr *send_wr)
@@ -2304,6 +2360,9 @@ int ocrdma_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,
case IB_WR_FAST_REG_MR:
status = ocrdma_build_fr(qp, hdr, wr);
break;
+   case IB_WR_REG_MR:
+   

[PATCH v4 09/26] RDMA/nes: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in nes_mr and populating it when
nes_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by duplicating the IB_WR_FAST_REG_MR handling and taking
the needed information from different places:
- page_size, iova, length (ib_mr)
- page array (nes_mr)
- key, access flags (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once
all the ULPs are converted.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/nes/nes_verbs.c | 117 +-
 drivers/infiniband/hw/nes/nes_verbs.h |   4 ++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/nes/nes_verbs.c 
b/drivers/infiniband/hw/nes/nes_verbs.c
index f71b37b75f82..f5b0cc972403 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -51,6 +51,7 @@ atomic_t qps_created;
 atomic_t sw_qps_destroyed;
 
 static void nes_unregister_ofa_device(struct nes_ib_device *nesibdev);
+static int nes_dereg_mr(struct ib_mr *ib_mr);
 
 /**
  * nes_alloc_mw
@@ -443,9 +444,46 @@ static struct ib_mr *nes_alloc_mr(struct ib_pd *ibpd,
} else {
kfree(nesmr);
nes_free_resource(nesadapter, nesadapter->allocated_mrs, 
stag_index);
-   ibmr = ERR_PTR(-ENOMEM);
+   return ERR_PTR(-ENOMEM);
}
+
+   nesmr->pages = pci_alloc_consistent(nesdev->pcidev,
+   max_num_sg * sizeof(u64),
+                                       &nesmr->paddr);
+   if (!nesmr->paddr)
+   goto err;
+
+   nesmr->max_pages = max_num_sg;
+
return ibmr;
+
+err:
+   nes_dereg_mr(ibmr);
+
+   return ERR_PTR(-ENOMEM);
+}
+
+static int nes_set_page(struct ib_mr *ibmr, u64 addr)
+{
+   struct nes_mr *nesmr = to_nesmr(ibmr);
+
+   if (unlikely(nesmr->npages == nesmr->max_pages))
+   return -ENOMEM;
+
+   nesmr->pages[nesmr->npages++] = cpu_to_le64(addr);
+
+   return 0;
+}
+
+static int nes_map_mr_sg(struct ib_mr *ibmr,
+struct scatterlist *sg,
+unsigned int sg_nents)
+{
+   struct nes_mr *nesmr = to_nesmr(ibmr);
+
+   nesmr->npages = 0;
+
+   return ib_sg_to_pages(ibmr, sg, sg_nents, nes_set_page);
 }
 
 /*
@@ -2683,6 +2721,13 @@ static int nes_dereg_mr(struct ib_mr *ib_mr)
u16 major_code;
u16 minor_code;
 
+
+   if (nesmr->pages)
+   pci_free_consistent(nesdev->pcidev,
+   nesmr->max_pages * sizeof(u64),
+   nesmr->pages,
+   nesmr->paddr);
+
if (nesmr->region) {
ib_umem_release(nesmr->region);
}
@@ -3513,6 +3558,75 @@ static int nes_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *ib_wr,
  wqe_misc);
break;
}
+   case IB_WR_REG_MR:
+   {
+   struct nes_mr *mr = to_nesmr(reg_wr(ib_wr)->mr);
+   int page_shift = ilog2(reg_wr(ib_wr)->mr->page_size);
+   int flags = reg_wr(ib_wr)->access;
+
+   if (mr->npages > (NES_4K_PBL_CHUNK_SIZE / sizeof(u64))) 
{
+   nes_debug(NES_DBG_IW_TX, "SQ_FMR: bad 
page_list_len\n");
+   err = -EINVAL;
+   break;
+   }
+   wqe_misc = NES_IWARP_SQ_OP_FAST_REG;
+   set_wqe_64bit_value(wqe->wqe_words,
+   NES_IWARP_SQ_FMR_WQE_VA_FBO_LOW_IDX,
+   mr->ibmr.iova);
+   set_wqe_32bit_value(wqe->wqe_words,
+   NES_IWARP_SQ_FMR_WQE_LENGTH_LOW_IDX,
+   mr->ibmr.length);
+   set_wqe_32bit_value(wqe->wqe_words,
+   
NES_IWARP_SQ_FMR_WQE_LENGTH_HIGH_IDX, 0);
+   set_wqe_32bit_value(wqe->wqe_words,
+   NES_IWARP_SQ_FMR_WQE_MR_STAG_IDX,
+   reg_wr(ib_wr)->key);
+
+   if (page_shift == 12) {
+   wqe_misc |= NES_IWARP_SQ_FMR_WQE_PAGE_SIZE_4K;
+   } else if (page_shift == 21) {
+   wqe_misc |= NES_IWARP_SQ_FMR_WQE_PAGE_SIZE_2M;
+   } else {
+   nes_debug(NES_DBG_IW_TX, "Invalid page shift,"
+ " ib_wr=%u, max=1\n", ib_wr->num_sge);
+   err = -EINVAL;
+   break;
+   }
+
+ 

RE: [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free

2015-10-12 Thread Hefty, Sean
> ib_send_cm_sidr_rep could sometimes erase the node from the sidr
> (depending on errors in the process). Since ib_send_cm_sidr_rep is
> called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv

This should clarify that it is the app calling ib_send_cm_sidr_rep from the 
callback, and not a direct call from cm_sidr_req_handler.

> could be either erased from the rb_tree twice or not erased at all.

In an error case, I can see why it would be left in the rbtree, but I don't see 
how it can be removed twice.


> Fixing that by making sure it's erased only once before freeing
> cm_id_priv.
> 
> Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
> Signed-off-by: Doron Tsur 
> Signed-off-by: Matan Barak 
> ---
> 
> Hi Doug,
> This patch fixes a bug in the CM. In some flow, rb-tree could be
> freed twice or used after it was freed. This bug was picked by
> our regression tests and this fix was verified.
> 
> Thanks,
> Doron and Matan
> 
>  drivers/infiniband/core/cm.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index f5cf1c4..56ff0f3 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -844,6 +844,11 @@ retest:
>   case IB_CM_SIDR_REQ_RCVD:
>   spin_unlock_irq(&cm_id_priv->lock);
>   cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
> + spin_lock_irq(&cm.lock);
> + if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node))
> + rb_erase(&cm_id_priv->sidr_id_node,
> +  &cm.remote_sidr_table);
> + spin_unlock_irq(&cm.lock);

We should be able to use a return value from cm_reject_sidr_req() -- passed 
through from ib_send_cm_sidr_rep() to determine if the id was removed from the 
tree.

>   break;
>   case IB_CM_REQ_SENT:
>   case IB_CM_MRA_REQ_RCVD:
> @@ -3210,7 +3215,10 @@ int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
>   spin_unlock_irqrestore(&cm_id_priv->lock, flags);
> 
>   spin_lock_irqsave(&cm.lock, flags);
> - rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
> + if (!RB_EMPTY_NODE(&cm_id_priv->sidr_id_node)) {
> + rb_erase(&cm_id_priv->sidr_id_node, &cm.remote_sidr_table);
> + RB_CLEAR_NODE(&cm_id_priv->sidr_id_node);
> + }
>   spin_unlock_irqrestore(&cm.lock, flags);

Something is very wrong in this function if the id is not in the tree at this 
point.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 01/26] IB/core: Introduce new fast registration API

2015-10-12 Thread Sagi Grimberg
The new fast registration verb ib_map_mr_sg receives a scatterlist
and converts it to a page list under the verbs API, thus hiding
the HW-specific mapping details from the consumer.

The provider drivers are given a generic helper, ib_sg_to_pages,
that converts a scatterlist into a vector of page addresses. The
drivers can still perform any HW-specific page address setting
by passing a set_page function pointer, which is invoked for
each page address. This allows drivers to avoid keeping shadow
page vectors and converting them to HW-specific translations
with extra copies.

This API will allow ULPs to remove the duplicated code of constructing
a page vector from a given sg list.

The send work request ib_reg_wr also shrinks, as it now carries
only the mr, key and access flags.

Signed-off-by: Sagi Grimberg 
Tested-by: Christoph Hellwig 
---
 drivers/infiniband/core/verbs.c | 107 
 include/rdma/ib_verbs.h |  44 +
 2 files changed, 151 insertions(+)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e1f2c9887f3f..e18a8bce8130 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1469,3 +1469,110 @@ int ib_check_mr_status(struct ib_mr *mr, u32 check_mask,
mr->device->check_mr_status(mr, check_mask, mr_status) : 
-ENOSYS;
 }
 EXPORT_SYMBOL(ib_check_mr_status);
+
+/**
+ * ib_map_mr_sg() - Map the largest prefix of a dma mapped SG list
+ * and set it to the memory region.
+ * @mr:        memory region
+ * @sg:        dma mapped scatterlist
+ * @sg_nents:  number of entries in sg
+ * @page_size: page vector desired page size
+ *
+ * Constraints:
+ * - The first sg element is allowed to have an offset.
+ * - Each sg element must be aligned to page_size (or physically
+ *   contiguous to the previous element). In case an sg element has a
+ *   non contiguous offset, the mapping prefix will not include it.
+ * - The last sg element is allowed to have length less than page_size.
+ * - If sg_nents total byte length exceeds the mr max_num_sge * page_size
+ *   then only max_num_sg entries will be mapped.
+ *
+ * Returns the number of sg elements that were mapped to the memory region.
+ *
+ * After this completes successfully, the  memory region
+ * is ready for registration.
+ */
+int ib_map_mr_sg(struct ib_mr *mr,
+struct scatterlist *sg,
+unsigned int sg_nents,
+unsigned int page_size)
+{
+   if (unlikely(!mr->device->map_mr_sg))
+   return -ENOSYS;
+
+   mr->page_size = page_size;
+
+   return mr->device->map_mr_sg(mr, sg, sg_nents);
+}
+EXPORT_SYMBOL(ib_map_mr_sg);
+
+/**
+ * ib_sg_to_pages() - Convert the largest prefix of a sg list
+ * to a page vector
+ * @mr:        memory region
+ * @sgl:       dma mapped scatterlist
+ * @sg_nents:  number of entries in sg
+ * @set_page:  driver page assignment function pointer
+ *
+ * Core service helper for drivers to convert the largest
+ * prefix of a given sg list to a page vector. The sg list
+ * prefix converted is the prefix that meets the requirements
+ * of ib_map_mr_sg.
+ *
+ * Returns the number of sg elements that were assigned to
+ * a page vector.
+ */
+int ib_sg_to_pages(struct ib_mr *mr,
+  struct scatterlist *sgl,
+  unsigned int sg_nents,
+  int (*set_page)(struct ib_mr *, u64))
+{
+   struct scatterlist *sg;
+   u64 last_end_dma_addr = 0, last_page_addr = 0;
+   unsigned int last_page_off = 0;
+   u64 page_mask = ~((u64)mr->page_size - 1);
+   int i;
+
+   mr->iova = sg_dma_address(&sgl[0]);
+   mr->length = 0;
+
+   for_each_sg(sgl, sg, sg_nents, i) {
+   u64 dma_addr = sg_dma_address(sg);
+   unsigned int dma_len = sg_dma_len(sg);
+   u64 end_dma_addr = dma_addr + dma_len;
+   u64 page_addr = dma_addr & page_mask;
+
+   if (i && page_addr != dma_addr) {
+   if (last_end_dma_addr != dma_addr) {
+   /* gap */
+   goto done;
+
+   } else if (last_page_off + dma_len <= mr->page_size) {
+   /* chunk this fragment with the last */
+   mr->length += dma_len;
+   last_end_dma_addr += dma_len;
+   last_page_off += dma_len;
+   continue;
+   } else {
+   /* map starting from the next page */
+   page_addr = last_page_addr + mr->page_size;
+   dma_len -= mr->page_size - last_page_off;
+   }
+   }
+
+   do {
+   if 

[PATCH v4 06/26] RDMA/cxgb3: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in iwch_mr and populating it when
iwch_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by duplicating build_fastreg, just taking the needed information
from different places:
- page_size, iova, length (ib_mr)
- page array (iwch_mr)
- key, access flags (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once
all the ULPs are converted.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/cxgb3/iwch_provider.c | 33 
 drivers/infiniband/hw/cxgb3/iwch_provider.h |  2 ++
 drivers/infiniband/hw/cxgb3/iwch_qp.c   | 48 +
 3 files changed, 83 insertions(+)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 93308c45f298..ee3d5ca7de6c 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -463,6 +463,7 @@ static int iwch_dereg_mr(struct ib_mr *ib_mr)
return -EINVAL;
 
mhp = to_iwch_mr(ib_mr);
+   kfree(mhp->pages);
rhp = mhp->rhp;
mmid = mhp->attr.stag >> 8;
cxio_dereg_mem(&rhp->rdev, mhp->attr.stag, mhp->attr.pbl_size,
@@ -821,6 +822,12 @@ static struct ib_mr *iwch_alloc_mr(struct ib_pd *pd,
if (!mhp)
goto err;
 
+   mhp->pages = kcalloc(max_num_sg, sizeof(u64), GFP_KERNEL);
+   if (!mhp->pages) {
+   ret = -ENOMEM;
+   goto pl_err;
+   }
+
mhp->rhp = rhp;
ret = iwch_alloc_pbl(mhp, max_num_sg);
if (ret)
@@ -847,11 +854,36 @@ err3:
 err2:
iwch_free_pbl(mhp);
 err1:
+   kfree(mhp->pages);
+pl_err:
kfree(mhp);
 err:
return ERR_PTR(ret);
 }
 
+static int iwch_set_page(struct ib_mr *ibmr, u64 addr)
+{
+   struct iwch_mr *mhp = to_iwch_mr(ibmr);
+
+   if (unlikely(mhp->npages == mhp->attr.pbl_size))
+   return -ENOMEM;
+
+   mhp->pages[mhp->npages++] = addr;
+
+   return 0;
+}
+
+static int iwch_map_mr_sg(struct ib_mr *ibmr,
+ struct scatterlist *sg,
+ unsigned int sg_nents)
+{
+   struct iwch_mr *mhp = to_iwch_mr(ibmr);
+
+   mhp->npages = 0;
+
+   return ib_sg_to_pages(ibmr, sg, sg_nents, iwch_set_page);
+}
+
 static struct ib_fast_reg_page_list *iwch_alloc_fastreg_pbl(
struct ib_device *device,
int page_list_len)
@@ -1450,6 +1482,7 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.bind_mw = iwch_bind_mw;
dev->ibdev.dealloc_mw = iwch_dealloc_mw;
dev->ibdev.alloc_mr = iwch_alloc_mr;
+   dev->ibdev.map_mr_sg = iwch_map_mr_sg;
dev->ibdev.alloc_fast_reg_page_list = iwch_alloc_fastreg_pbl;
dev->ibdev.free_fast_reg_page_list = iwch_free_fastreg_pbl;
dev->ibdev.attach_mcast = iwch_multicast_attach;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h 
b/drivers/infiniband/hw/cxgb3/iwch_provider.h
index 87c14b0c5ac0..2ac85b86a680 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.h
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h
@@ -77,6 +77,8 @@ struct iwch_mr {
struct iwch_dev *rhp;
u64 kva;
struct tpt_attributes attr;
+   u64 *pages;
+   u32 npages;
 };
 
 typedef struct iwch_mw iwch_mw_handle;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c 
b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index bac0508fedd9..a09ea538e990 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -146,6 +146,49 @@ static int build_rdma_read(union t3_wr *wqe, struct ib_send_wr *wr,
return 0;
 }
 
+static int build_memreg(union t3_wr *wqe, struct ib_reg_wr *wr,
+ u8 *flit_cnt, int *wr_cnt, struct t3_wq *wq)
+{
+   struct iwch_mr *mhp = to_iwch_mr(wr->mr);
+   int i;
+   __be64 *p;
+
+   if (mhp->npages > T3_MAX_FASTREG_DEPTH)
+   return -EINVAL;
+   *wr_cnt = 1;
+   wqe->fastreg.stag = cpu_to_be32(wr->key);
+   wqe->fastreg.len = cpu_to_be32(mhp->ibmr.length);
+   wqe->fastreg.va_base_hi = cpu_to_be32(mhp->ibmr.iova >> 32);
+   wqe->fastreg.va_base_lo_fbo =
+   cpu_to_be32(mhp->ibmr.iova & 0xffffffff);
+   wqe->fastreg.page_type_perms = cpu_to_be32(
+   V_FR_PAGE_COUNT(mhp->npages) |
+   V_FR_PAGE_SIZE(ilog2(wr->mr->page_size) - 12) |
+   V_FR_TYPE(TPT_VATO) |
+   V_FR_PERMS(iwch_ib_to_tpt_access(wr->access)));
+   p = &wqe->fastreg.pbl_addrs[0];
+   for (i = 0; i < mhp->npages; i++, p++) {
+
+   /* If we need a 2nd WR, then set it up */
+   if (i == T3_MAX_FASTREG_FRAG) {
+   *wr_cnt = 2;
+   wqe = (union t3_wr *)(wq->queue +
+ 

[PATCH v4 08/26] IB/qib: Support the new memory registration API

2015-10-12 Thread Sagi Grimberg
Support the new memory registration API by allocating a
private page list array in qib_mr and populating it when
qib_map_mr_sg is invoked. Also, support IB_WR_REG_MR
by duplicating qib_fastreg_mr, just taking the needed information
from different places:
- page_size, iova, length (ib_mr)
- page array (qib_mr)
- key, access flags (ib_reg_wr)

The IB_WR_FAST_REG_MR handlers will be removed later, once
all the ULPs are converted.

Signed-off-by: Sagi Grimberg 
Acked-by: Christoph Hellwig 
---
 drivers/infiniband/hw/qib/qib_keys.c  | 56 +++
 drivers/infiniband/hw/qib/qib_mr.c| 32 
 drivers/infiniband/hw/qib/qib_verbs.c |  9 +-
 drivers/infiniband/hw/qib/qib_verbs.h |  8 +
 4 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/qib/qib_keys.c 
b/drivers/infiniband/hw/qib/qib_keys.c
index eaf139a33b2e..95b8b9110fc6 100644
--- a/drivers/infiniband/hw/qib/qib_keys.c
+++ b/drivers/infiniband/hw/qib/qib_keys.c
@@ -390,3 +390,59 @@ bail:
spin_unlock_irqrestore(&rkt->lock, flags);
return ret;
 }
+
+/*
+ * Initialize the memory region specified by the work request.
+ */
+int qib_reg_mr(struct qib_qp *qp, struct ib_reg_wr *wr)
+{
+   struct qib_lkey_table *rkt = &to_idev(qp->ibqp.device)->lk_table;
+   struct qib_pd *pd = to_ipd(qp->ibqp.pd);
+   struct qib_mr *mr = to_imr(wr->mr);
+   struct qib_mregion *mrg;
+   u32 key = wr->key;
+   unsigned i, n, m;
+   int ret = -EINVAL;
+   unsigned long flags;
+   u64 *page_list;
+   size_t ps;
+
+   spin_lock_irqsave(&rkt->lock, flags);
+   if (pd->user || key == 0)
+   goto bail;
+
+   mrg = rcu_dereference_protected(
+   rkt->table[(key >> (32 - ib_qib_lkey_table_size))],
lockdep_is_held(&rkt->lock));
+   if (unlikely(mrg == NULL || qp->ibqp.pd != mrg->pd))
+   goto bail;
+
+   if (mr->npages > mrg->max_segs)
+   goto bail;
+
+   ps = mr->ibmr.page_size;
+   if (mr->ibmr.length > ps * mr->npages)
+   goto bail;
+
+   mrg->user_base = mr->ibmr.iova;
+   mrg->iova = mr->ibmr.iova;
+   mrg->lkey = key;
+   mrg->length = mr->ibmr.length;
+   mrg->access_flags = wr->access;
+   page_list = mr->pages;
+   m = 0;
+   n = 0;
+   for (i = 0; i < mr->npages; i++) {
+   mrg->map[m]->segs[n].vaddr = (void *) page_list[i];
+   mrg->map[m]->segs[n].length = ps;
+   if (++n == QIB_SEGSZ) {
+   m++;
+   n = 0;
+   }
+   }
+
+   ret = 0;
+bail:
+   spin_unlock_irqrestore(&rkt->lock, flags);
+   return ret;
+}
diff --git a/drivers/infiniband/hw/qib/qib_mr.c 
b/drivers/infiniband/hw/qib/qib_mr.c
index 19220dcb9a3b..0fa4b0de8074 100644
--- a/drivers/infiniband/hw/qib/qib_mr.c
+++ b/drivers/infiniband/hw/qib/qib_mr.c
@@ -303,6 +303,7 @@ int qib_dereg_mr(struct ib_mr *ibmr)
int ret = 0;
unsigned long timeout;
 
+   kfree(mr->pages);
qib_free_lkey(&mr->mr);
 
qib_put_mr(&mr->mr); /* will set completion if last */
@@ -340,7 +341,38 @@ struct ib_mr *qib_alloc_mr(struct ib_pd *pd,
if (IS_ERR(mr))
return (struct ib_mr *)mr;
 
+   mr->pages = kcalloc(max_num_sg, sizeof(u64), GFP_KERNEL);
+   if (!mr->pages)
+   goto err;
+
return &mr->ibmr;
+
+err:
+   qib_dereg_mr(&mr->ibmr);
+   return ERR_PTR(-ENOMEM);
+}
+
+static int qib_set_page(struct ib_mr *ibmr, u64 addr)
+{
+   struct qib_mr *mr = to_imr(ibmr);
+
+   if (unlikely(mr->npages == mr->mr.max_segs))
+   return -ENOMEM;
+
+   mr->pages[mr->npages++] = addr;
+
+   return 0;
+}
+
+int qib_map_mr_sg(struct ib_mr *ibmr,
+ struct scatterlist *sg,
+ unsigned int sg_nents)
+{
+   struct qib_mr *mr = to_imr(ibmr);
+
+   mr->npages = 0;
+
+   return ib_sg_to_pages(ibmr, sg, sg_nents, qib_set_page);
 }
 
 struct ib_fast_reg_page_list *
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c 
b/drivers/infiniband/hw/qib/qib_verbs.c
index a6b0b098ff30..a1e53d7b662b 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -362,7 +362,10 @@ static int qib_post_one_send(struct qib_qp *qp, struct ib_send_wr *wr,
 * undefined operations.
 * Make sure buffer is large enough to hold the result for atomics.
 */
-   if (wr->opcode == IB_WR_FAST_REG_MR) {
+   if (wr->opcode == IB_WR_REG_MR) {
+   if (qib_reg_mr(qp, reg_wr(wr)))
+   goto bail_inval;
+   } else if (wr->opcode == IB_WR_FAST_REG_MR) {
if (qib_fast_reg_mr(qp, wr))
goto bail_inval;
} else if (qp->ibqp.qp_type == IB_QPT_UC) {
@@ -401,6 +404,9 @@ static int qib_post_one_send(struct qib_qp *qp, struct ib_send_wr *wr,

Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity

2015-10-12 Thread Theodore Ts'o
Hi Christoph,

Do you think this is still an issue that would be worth discussing at
the kernel summit as a technical topic?  If so, would you be willing
to be responsible for kicking off the discussion for this topic?

Thanks,

- Ted



On Wed, Jul 15, 2015 at 05:07:08AM -0700, Christoph Hellwig wrote:
> Many years ago we decided to move setting of IRQ-to-core affinities to
> userspace with the irqbalance daemon.
> 
> These days we have systems with lots of MSI-X vectors, and we have
> hardware and subsystem support for per-CPU I/O queues in the block
> layer, the RDMA subsystem and probably the network stack (I'm not too
> familiar with the recent developments there).  It would really help the
> out of the box performance and experience if we could allow such
> subsystems to bind interrupt vectors to the node that the queue is
> configured on.
> 
> I'd like to discuss whether the rationale for moving the IRQ affinity
> setting fully to userspace is still correct in today's world, and any
> pitfalls we'll have to learn from in irqbalance and the old in-kernel
> affinity code.
> ___
> Ksummit-discuss mailing list
> ksummit-disc...@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss


[PATCH infiniband-diags] perfquery.c: Fix smp_query_via return value checks

2015-10-12 Thread Hal Rosenstock

smp_query_via returns a pointer, so a < 0 comparison is wrong:
src/perfquery.c: In function 'is_rsfec_mode_active':
src/perfquery.c:481: warning: ordered comparison of pointer with integer zero
src/perfquery.c: In function 'main':
src/perfquery.c:919: warning: ordered comparison of pointer with integer zero
src/perfquery.c:928: warning: ordered comparison of pointer with integer zero
 
Reported-by: David Binderman  
Signed-off-by: Hal Rosenstock 
---
Fix for OFA Bugzilla #2572

diff --git a/src/perfquery.c b/src/perfquery.c
index 9e3a307..948ce52 100644
--- a/src/perfquery.c
+++ b/src/perfquery.c
@@ -477,8 +477,8 @@ static uint8_t is_rsfec_mode_active(ib_portid_t * portid, int port,
return 0;
}
 
-   if (smp_query_via(data, portid, IB_ATTR_PORT_INFO_EXT, port, 0,
- srcport) < 0)
+   if (!smp_query_via(data, portid, IB_ATTR_PORT_INFO_EXT, port, 0,
+  srcport))
IBEXIT("smp query portinfo extended failed");
 
mad_decode_field(data, IB_PORT_EXT_CAPMASK_F, _capmask);
@@ -915,8 +915,8 @@ int main(int argc, char **argv)
 
 
if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) 
{
-	if (smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0,
-			  srcport) < 0)
+	if (!smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0,
+			   srcport))
IBEXIT("smp query nodeinfo failed");
node_type = mad_get_field(data, 0, IB_NODE_TYPE_F);
mad_decode_field(data, IB_NODE_NPORTS_F, &num_ports);
@@ -924,8 +924,8 @@ int main(int argc, char **argv)
IBEXIT("smp query nodeinfo: num ports invalid");
 
if (node_type == IB_NODE_SWITCH) {
-		if (smp_query_via(data, &portid, IB_ATTR_SWITCH_INFO,
-				  0, 0, srcport) < 0)
+		if (!smp_query_via(data, &portid, IB_ATTR_SWITCH_INFO,
+				   0, 0, srcport))
IBEXIT("smp query nodeinfo failed");
enhancedport0 =
mad_get_field(data, 0, IB_SW_ENHANCED_PORT0_F);


[PATCH TRIVIAL infiniband-diags] ibdiag_common.c: Move static to beginning of get_build_version declaration

2015-10-12 Thread Hal Rosenstock

Eliminate compiler warning:
src/ibdiag_common.c:85: warning: 'static' is not at beginning of declaration

Signed-off-by: Hal Rosenstock 
---
diff --git a/src/ibdiag_common.c b/src/ibdiag_common.c
index e09623d..5424845 100644
--- a/src/ibdiag_common.c
+++ b/src/ibdiag_common.c
@@ -82,7 +82,7 @@ static const char **prog_examples;
 static struct option *long_opts = NULL;
 static const struct ibdiag_opt *opts_map[256];
 
-const static char *get_build_version(void)
+static const char *get_build_version(void)
 {
return "BUILD VERSION: " IBDIAG_VERSION " Build date: " __DATE__ " "
__TIME__;


Re: merge struct ib_device_attr into struct ib_device V2

2015-10-12 Thread Christoph Hellwig
On Mon, Oct 12, 2015 at 12:26:06PM +0300, Sagi Grimberg wrote:
> First go with this looks OK for mlx4. mlx5 needs the below incremental
> patch to be folded in.
>
> we need dev->ib_dev.max_pkeys set when get_port_caps() is called.

Thanks, I've folded your patch and force pushed out the updated tree.


RE: [PATCH for-next 0/2] RDMA/cxgb4: Add iWARP support for T6 adapter

2015-10-12 Thread Steve Wise


> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On 
> Behalf Of Hariprasad Shenai
> Sent: Wednesday, September 23, 2015 6:49 AM
> To: linux-rdma@vger.kernel.org; net...@vger.kernel.org
> Cc: dledf...@redhat.com; da...@davemloft.net; sw...@opengridcomputing.com; 
> lee...@chelsio.com; nirran...@chelsio.com;
> Hariprasad Shenai
> Subject: [PATCH for-next 0/2] RDMA/cxgb4: Add iWARP support for T6 adapter
> 
> Hi,
> 
> PATCH 1/2 adds changes like new register, structure and functions in cxgb4
> driver for iw_cxgb4 driver, and PATCH 2/2 adds iw_cxgb4 specific code to
> support T6 adapter.
> 
> This patch series has been created against Doug's linux tree and includes
> patches on cxgb4 and iw_cxgb4 driver.
> 
> We have included all the maintainers of respective drivers. Kindly review
> the change and let us know in case of any review comments.
> 
> Thanks
> 

These look ok to me. 

Series Reviewed-by: Steve Wise 

Doug, should these get staged through your tree?

Steve.



Re: [PATCH] svcrdma: Fix NFS server crash triggered by 1MB NFS WRITE

2015-10-12 Thread J. Bruce Fields
On Mon, Oct 12, 2015 at 10:53:39AM -0400, Chuck Lever wrote:
> Now that the NFS server advertises a maximum payload size of 1MB
> for RPC/RDMA again, it crashes in svc_process_common() when NFS
> client sends a 1MB NFS WRITE on an NFS/RDMA mount.
> 
> The server has set up a 259 element array of struct page pointers
> in rq_pages[] for each incoming request. The last element of the
> array is NULL.
> 
> When an incoming request has been completely received,
> rdma_read_complete() attempts to set the starting page of the
> incoming page vector:
> 
>   rqstp->rq_arg.pages = &rqstp->rq_pages[head->hdr_count];
> 
> and the page to use for the reply:
> 
>   rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> 
> But the value of page_no has already accounted for head->hdr_count.
> Thus rq_respages now points past the end of the incoming pages.
> 
> For NFS WRITE operations smaller than the maximum, this is harmless.
> But when the NFS WRITE operation is as large as the server's max
> payload size, rq_respages now points at the last entry in rq_pages,
> which is NULL.
> 
> Fixes: cc9a903d915c ('svcrdma: Change maximum server payload . . .')
> BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=270
> Signed-off-by: Chuck Lever 
> Reviewed-by: Sagi Grimberg 
> Reviewed-by: Steve Wise 
> Reviewed-by: Shirley Ma 
> ---
> 
> Hi Bruce-
> 
> This is a regression in 4.3. Can you send this to Linus?

OK, queuing for 4.3, thanks. --b.

> 
> 
>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
> b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index cb51742..37b4341 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -531,7 +531,7 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
>   rqstp->rq_arg.page_base = head->arg.page_base;
>  
>   /* rq_respages starts after the last arg page */
> - rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> + rqstp->rq_respages = &rqstp->rq_pages[page_no];
>   rqstp->rq_next_page = rqstp->rq_respages + 1;
>  
>   /* Rebuild rq_arg head and tail. */


Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity

2015-10-12 Thread Christoph Hellwig
On Mon, Oct 12, 2015 at 12:09:48PM -0400, Theodore Ts'o wrote:
> Hi Christoph,
> 
> Do you think this is still an issue that would be worth discussing at
> the kernel summit as a technical topic?  If so, would you be willing
> to be responsible for kicking off the discussion for this topic?

Hi Ted,

while we have a high level agreement there's still some discussion
needed.  I can prepare a few slides for 10 minute discussion and then
take it to the hallways with the interested people.


Re: [PATCH TRIVIAL infiniband-diags] ibdiag_common.c: Move static to beginning of get_build_version declaration

2015-10-12 Thread ira.weiny
On Mon, Oct 12, 2015 at 08:31:23AM -0400, Hal Rosenstock wrote:
> 
> Eliminate compiler warning:
> src/ibdiag_common.c:85: warning: 'static' is not at beginning of declaration
> 
> Signed-off-by: Hal Rosenstock 

Thanks applied,
Ira

> ---
> diff --git a/src/ibdiag_common.c b/src/ibdiag_common.c
> index e09623d..5424845 100644
> --- a/src/ibdiag_common.c
> +++ b/src/ibdiag_common.c
> @@ -82,7 +82,7 @@ static const char **prog_examples;
>  static struct option *long_opts = NULL;
>  static const struct ibdiag_opt *opts_map[256];
>  
> -const static char *get_build_version(void)
> +static const char *get_build_version(void)
>  {
>   return "BUILD VERSION: " IBDIAG_VERSION " Build date: " __DATE__ " "
>   __TIME__;


Re: [PATCH infiniband-diags] perfquery.c: Fix smp_query_via return value checks

2015-10-12 Thread ira.weiny
On Mon, Oct 12, 2015 at 08:30:31AM -0400, Hal Rosenstock wrote:
> 
> smp_query_via returns a pointer, so a < 0 comparison is wrong:
> src/perfquery.c: In function 'is_rsfec_mode_active':
> src/perfquery.c:481: warning: ordered comparison of pointer with integer zero
> src/perfquery.c: In function 'main':
> src/perfquery.c:919: warning: ordered comparison of pointer with integer zero
> src/perfquery.c:928: warning: ordered comparison of pointer with integer zero
>  
> Reported-by: David Binderman  
> Signed-off-by: Hal Rosenstock 

There is also a call in ibdiag_common.c which is wrong:

diff --git a/src/ibdiag_common.c b/src/ibdiag_common.c
index 54248455bac4..5ec0167f87de 100644
--- a/src/ibdiag_common.c
+++ b/src/ibdiag_common.c
@@ -507,7 +507,7 @@ int is_port_info_extended_supported(ib_portid_t * dest, int port,
uint32_t cap_mask;
uint16_t cap_mask2;
 
-   if (smp_query_via(data, dest, IB_ATTR_PORT_INFO, port, 0, srcport) < 0)
+   if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO, port, 0, srcport))
IBEXIT("port info query failed");
 
mad_decode_field(data, IB_PORT_CAPMASK_F, &cap_mask);


I went ahead and added this chunk to this patch and accepted.

Thanks,
Ira


> ---
> Fix for OFA Bugzilla #2572
> 
> diff --git a/src/perfquery.c b/src/perfquery.c
> index 9e3a307..948ce52 100644
> --- a/src/perfquery.c
> +++ b/src/perfquery.c
> @@ -477,8 +477,8 @@ static uint8_t is_rsfec_mode_active(ib_portid_t * portid, int port,
>   return 0;
>   }
>  
> - if (smp_query_via(data, portid, IB_ATTR_PORT_INFO_EXT, port, 0,
> -   srcport) < 0)
> + if (!smp_query_via(data, portid, IB_ATTR_PORT_INFO_EXT, port, 0,
> +srcport))
>   IBEXIT("smp query portinfo extended failed");
>  
>   mad_decode_field(data, IB_PORT_EXT_CAPMASK_F, _capmask);
> @@ -915,8 +915,8 @@ int main(int argc, char **argv)
>  
>  
>   if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) 
> {
> -	if (smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0,
> -			  srcport) < 0)
> +	if (!smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0,
> +			   srcport))
>   IBEXIT("smp query nodeinfo failed");
>   node_type = mad_get_field(data, 0, IB_NODE_TYPE_F);
>   mad_decode_field(data, IB_NODE_NPORTS_F, &num_ports);
> @@ -924,8 +924,8 @@ int main(int argc, char **argv)
>   IBEXIT("smp query nodeinfo: num ports invalid");
>  
>   if (node_type == IB_NODE_SWITCH) {
> -		if (smp_query_via(data, &portid, IB_ATTR_SWITCH_INFO,
> -				  0, 0, srcport) < 0)
> +		if (!smp_query_via(data, &portid, IB_ATTR_SWITCH_INFO,
> +				   0, 0, srcport))
>   IBEXIT("smp query nodeinfo failed");
>   enhancedport0 =
>   mad_get_field(data, 0, IB_SW_ENHANCED_PORT0_F);


Re: Create a common verbs transport library

2015-10-12 Thread Moni Shoua
Hi Denny,

We initially thought to implement a shared library that contains the
transport logic. However, it seems that a SW Verbs transport driver
would allow better code sharing.

In fact, the VT driver would need only a single user-space driver for
all "backends". Any direct HW access from user-space should be exposed
by the corresponding backend driver and accessed by a different
library (e.g., psm).

At a high level, it seems that we should do as follows:

- Decide on an initial code base for VT (rxe/hfi/qib), clone it, and
rename to VT

- Split the code to VT and backend and create the initial backend APIs, e.g.:

-- Send packet

-- Deliver packet (receive)

-- Attach multicast

-- Packet buffer allocation

-- Notify when more send space is available

- In parallel, prepare the backends of other drivers while enhancing
VT as needed.

Do you have any preferences for the initial code base?
Do you already have some code that we can look at?

Please advise, as we are starting to develop a VT driver for RoCE now.
I suggest that we set up common user+kernel git repos for the initial work.

Thanks,

-Moni

On Tue, Sep 29, 2015 at 3:56 PM, Dennis Dalessandro
 wrote:
> Hi All,
>
> One of the conditions to move the hfi1 driver from staging into the normal
> drivers/infiniband/hw directory is to handle the code duplication in our
> verbs layer. This is going to be done by creating a new kmod which we will
> call rdmavt, for RDMA verbs transport. This will eventually live in the
> existing drivers/infiniband tree in a new sw directory:
> drivers/infiniband/sw/vt. This new directory can serve as a home for soft
> roce when its ready as well.
>
> The verbs library will start out life in drivers/staging/rdma/vt alongside
> hfi1. We (Intel) will push incremental patches to keep the community
> apprised of the development and allow for early and more continuous
> feedback. Once complete the plan would be to move out of staging along with
> hfi1.
>
> The current verbs support in the IB core should not need to be modified,
> rdmavt is just another verbs provider. Drivers will not use rdmavt directly.
> Rather, rdmavt will use the drivers to abstract away the hardware
> differences. Here is a diagram of what this will look like.
>
>   +---+
>   |Ib Core|
>   +---+
>   +
>   |
> +--v+
> |Verbs Transport|
> +-+--+--+
>  |  |
>  |  |
> +-v--+ +-v--+
> |qib | |hfi1|
> ++ ++
>
> -Denny


[ANNOUNCE] infiniband-diags 1.6.6 release

2015-10-12 Thread ira.weiny
There is a new release of infiniband-diags at:

https://www.openfabrics.org/downloads/management/infiniband-diags-1.6.6.tar.gz

md5sum:

b855ca3b98afefc2ad6a2de378ab71dd  infiniband-diags-1.6.6.tar.gz


Dependencies:

1) libibmad >= 1.3.12
2) libibumad >= 1.3.7
3) opensm-libs >= 3.3.10
4) ib_umad kernel module
5) glib2


Release notes v1.6.5 => 1.6.6

   1) bug fixes


Authors since 1.6.5

Ana Guerrero López (1):
  rdma-ndd: fix compiler warnings.

Dan Ben Yosef (2):
  perfquery -T (print Extended Speed Counters) times out on nodes supporting
  libibnetdisc: Avoid pushing same pointer to the hash table

Hal Rosenstock (5):
  perfquery.c: Change format of capability mask in IBWARN for consistency
  ibdiag_sa.c: In sa_get_handle, handle umad_open_port and umad_register fai
  iblinkinfo.c: Close additional file descriptor in advance
  ibdiag_common.c: Move static to beginning of get_build_version declaration
  perfquery.c: Fix smp_query_via return value checks

Ira Weiny (3):
  infiniband-diags/rdma-ndd: Fix issues with install
  infiniband-diags/rdma-ndd: add --pidfile option
  infiniband-diags: rdma-ndd: remove udev logging when not supported

Michal Schmidt (2):
  rdma-ndd: never use udev_get_sys_path()
  build-sys: avoid overlinking to libudev

Vladimir Koushnir (10):
  ibqueryerrors: Resource leak in path_record_query
  Remove redundant umad file descriptor from libibnetdisc
  query_smp.c: Avoid busy looping in process_one_recv
  dump_fts: Open global file descriptor after calling ibnd_discover_fabric
  ibqueryerrors: code improvement
  ibqueryerrors: Fix crash when no SM is running
  ibqueryerrors: Close global file descriptor before running ibnd_discover_f
  ibqueryerrors: improve code related to DR option
  vendstat: mad_rpc_close_port not called in corner cases
  saquery.c: Fix saquery -D option

