Re: [dpdk-dev] [PATCH v7 1/5] vfio: extend data structure for multi container

2018-04-16 Thread Wang, Xiao W
Hi Anatoly,

> -Original Message-
> From: Burakov, Anatoly
> Sent: Monday, April 16, 2018 6:03 PM
> To: Wang, Xiao W ; Yigit, Ferruh
> 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; Wang, Zhihong
> ; Bie, Tiwei ; Tan, Jianfeng
> ; Liang, Cunming ; Daly,
> Dan ; tho...@monjalon.net; Chen, Junjie J
> 
> Subject: Re: [PATCH v7 1/5] vfio: extend data structure for multi container
> 
> On 15-Apr-18 4:33 PM, Xiao Wang wrote:
> > Currently eal vfio framework binds vfio group fd to the default
> > container fd during rte_vfio_setup_device, while in some cases,
> > e.g. vDPA (vhost data path acceleration), we want to put vfio group
> > to a separate container and program IOMMU via this container.
> >
> > This patch extends the vfio_config structure to contain per-container
> > user_mem_maps and defines an array of vfio_config. The next patch will
> > base on this to add container API.
> >
> > Signed-off-by: Junjie Chen 
> > Signed-off-by: Xiao Wang 
> > Reviewed-by: Maxime Coquelin 
> > Reviewed-by: Ferruh Yigit 
> > ---
> >   config/common_base |   1 +
> >   lib/librte_eal/linuxapp/eal/eal_vfio.c | 407 ++---
> 
> >   lib/librte_eal/linuxapp/eal/eal_vfio.h |  19 +-
> >   3 files changed, 275 insertions(+), 152 deletions(-)
> >
> > diff --git a/config/common_base b/config/common_base
> > index c4236fd1f..4a76d2f14 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -87,6 +87,7 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
> >   CONFIG_RTE_EAL_IGB_UIO=n
> >   CONFIG_RTE_EAL_VFIO=n
> >   CONFIG_RTE_MAX_VFIO_GROUPS=64
> > +CONFIG_RTE_MAX_VFIO_CONTAINERS=64
> >   CONFIG_RTE_MALLOC_DEBUG=n
> >   CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n
> >   CONFIG_RTE_USE_LIBBSD=n
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > index 589d7d478..46fba2d8d 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > @@ -22,8 +22,46 @@
> >
> >   #define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
> >
> > +/*
> > + * we don't need to store device fd's anywhere since they can be obtained
> from
> > + * the group fd via an ioctl() call.
> > + */
> > +struct vfio_group {
> > +   int group_no;
> > +   int fd;
> > +   int devices;
> > +};
> 
> What is the purpose of moving this into .c file? Seems like an
> unnecessary change.

Yes, we can let vfio_group stay at .h, and move vfio_config into .c

> 
> > +
> > +/* hot plug/unplug of VFIO groups may cause all DMA maps to be dropped.
> we can
> > + * recreate the mappings for DPDK segments, but we cannot do so for
> memory that
> > + * was registered by the user themselves, so we need to store the user
> mappings
> > + * somewhere, to recreate them later.
> > + */
> > +#define VFIO_MAX_USER_MEM_MAPS 256
> > +struct user_mem_map {
> > +   uint64_t addr;
> > +   uint64_t iova;
> > +   uint64_t len;
> > +};
> > +
> 
> <...>
> 
> > +static struct vfio_config *
> > +get_vfio_cfg_by_group_no(int iommu_group_no)
> > +{
> > +   struct vfio_config *vfio_cfg;
> > +   int i, j;
> > +
> > +   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
> > +   vfio_cfg = &vfio_cfgs[i];
> > +   for (j = 0; j < VFIO_MAX_GROUPS; j++) {
> > +   if (vfio_cfg->vfio_groups[j].group_no ==
> > +   iommu_group_no)
> > +   return vfio_cfg;
> > +   }
> > +   }
> > +
> > +   return default_vfio_cfg;
> 
> Here and in other places: i'm not sure returning default vfio config if
> group not found is such a good idea. It would be better if calling code
> explicitly handled case of group not existing yet.

Agree. It would be explicit.

> 
> > +}
> > +
> > +static struct vfio_config *
> > +get_vfio_cfg_by_group_fd(int vfio_group_fd)
> > +{
> > +   struct vfio_config *vfio_cfg;
> > +   int i, j;
> > +
> > +   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
> > +   vfio_cfg = &vfio_cfgs[i];
> > +   for (j = 0; j < VFIO_MAX_GROUPS; j++)
> > +   if (vfio_cfg->vfio_groups[j].fd == vfio_group_fd)
> > +   return vfio_cfg;
> > +   }
> >
> 
> <...>
> 
> > -   for (i = 0; i < VFIO_MAX_GROUPS; i++) {
> > -   vfio_cfg.vfio_groups[i].fd = -1;
> > -   vfio_cfg.vfio_groups[i].group_no = -1;
> > -   vfio_cfg.vfio_groups[i].devices = 0;
> > +   rte_spinlock_recursive_t lock =
> RTE_SPINLOCK_RECURSIVE_INITIALIZER;
> > +
> > +   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
> > +   vfio_cfgs[i].vfio_container_fd = -1;
> > +   vfio_cfgs[i].vfio_active_groups = 0;
> > +   vfio_cfgs[i].vfio_iommu_type = NULL;
> > +   vfio_cfgs[i].mem_maps.lock = lock;
> 
> Nitpick - why copy, instead of straight up initializing with
> RTE_SPINLOCK_RECURSIVE_INITIALIZER?

I tried but compiler doesn't allow this assignment.
RTE_SPINLOCK_RECURSIVE_INITIALIZER could only be used for initialization.

Thanks for the comme

Re: [dpdk-dev] [PATCH v7 1/5] vfio: extend data structure for multi container

2018-04-16 Thread Burakov, Anatoly

On 15-Apr-18 4:33 PM, Xiao Wang wrote:

Currently eal vfio framework binds vfio group fd to the default
container fd during rte_vfio_setup_device, while in some cases,
e.g. vDPA (vhost data path acceleration), we want to put vfio group
to a separate container and program IOMMU via this container.

This patch extends the vfio_config structure to contain per-container
user_mem_maps and defines an array of vfio_config. The next patch will
base on this to add container API.

Signed-off-by: Junjie Chen 
Signed-off-by: Xiao Wang 
Reviewed-by: Maxime Coquelin 
Reviewed-by: Ferruh Yigit 
---
  config/common_base |   1 +
  lib/librte_eal/linuxapp/eal/eal_vfio.c | 407 ++---
  lib/librte_eal/linuxapp/eal/eal_vfio.h |  19 +-
  3 files changed, 275 insertions(+), 152 deletions(-)

diff --git a/config/common_base b/config/common_base
index c4236fd1f..4a76d2f14 100644
--- a/config/common_base
+++ b/config/common_base
@@ -87,6 +87,7 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
  CONFIG_RTE_EAL_IGB_UIO=n
  CONFIG_RTE_EAL_VFIO=n
  CONFIG_RTE_MAX_VFIO_GROUPS=64
+CONFIG_RTE_MAX_VFIO_CONTAINERS=64
  CONFIG_RTE_MALLOC_DEBUG=n
  CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n
  CONFIG_RTE_USE_LIBBSD=n
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 589d7d478..46fba2d8d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -22,8 +22,46 @@
  
  #define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
  
+/*

+ * we don't need to store device fd's anywhere since they can be obtained from
+ * the group fd via an ioctl() call.
+ */
+struct vfio_group {
+   int group_no;
+   int fd;
+   int devices;
+};


What is the purpose of moving this into .c file? Seems like an 
unnecessary change.



+
+/* hot plug/unplug of VFIO groups may cause all DMA maps to be dropped. we can
+ * recreate the mappings for DPDK segments, but we cannot do so for memory that
+ * was registered by the user themselves, so we need to store the user mappings
+ * somewhere, to recreate them later.
+ */
+#define VFIO_MAX_USER_MEM_MAPS 256
+struct user_mem_map {
+   uint64_t addr;
+   uint64_t iova;
+   uint64_t len;
+};
+


<...>


+static struct vfio_config *
+get_vfio_cfg_by_group_no(int iommu_group_no)
+{
+   struct vfio_config *vfio_cfg;
+   int i, j;
+
+   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
+   vfio_cfg = &vfio_cfgs[i];
+   for (j = 0; j < VFIO_MAX_GROUPS; j++) {
+   if (vfio_cfg->vfio_groups[j].group_no ==
+   iommu_group_no)
+   return vfio_cfg;
+   }
+   }
+
+   return default_vfio_cfg;


Here and in other places: i'm not sure returning default vfio config if 
group not found is such a good idea. It would be better if calling code 
explicitly handled case of group not existing yet.



+}
+
+static struct vfio_config *
+get_vfio_cfg_by_group_fd(int vfio_group_fd)
+{
+   struct vfio_config *vfio_cfg;
+   int i, j;
+
+   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
+   vfio_cfg = &vfio_cfgs[i];
+   for (j = 0; j < VFIO_MAX_GROUPS; j++)
+   if (vfio_cfg->vfio_groups[j].fd == vfio_group_fd)
+   return vfio_cfg;
+   }
  


<...>


-   for (i = 0; i < VFIO_MAX_GROUPS; i++) {
-   vfio_cfg.vfio_groups[i].fd = -1;
-   vfio_cfg.vfio_groups[i].group_no = -1;
-   vfio_cfg.vfio_groups[i].devices = 0;
+   rte_spinlock_recursive_t lock = RTE_SPINLOCK_RECURSIVE_INITIALIZER;
+
+   for (i = 0; i < VFIO_MAX_CONTAINERS; i++) {
+   vfio_cfgs[i].vfio_container_fd = -1;
+   vfio_cfgs[i].vfio_active_groups = 0;
+   vfio_cfgs[i].vfio_iommu_type = NULL;
+   vfio_cfgs[i].mem_maps.lock = lock;


Nitpick - why copy, instead of straight up initializing with 
RTE_SPINLOCK_RECURSIVE_INITIALIZER?



+
+   for (j = 0; j < VFIO_MAX_GROUPS; j++) {
+   vfio_cfgs[i].vfio_groups[j].fd = -1;
+   vfio_cfgs[i].vfio_groups[j].group_no = -1;
+   vfio_cfgs[i].vfio_groups[j].devices = 0;
+   }
}
  
  	/* inform the user that we are probing for VFIO */

@@ -841,12 +971,12 @@ rte_vfio_enable(const char *modname)
return 0;
}


<...>

--
Thanks,
Anatoly