Re: [dpdk-users] attach/detach on secondary process

2017-12-13 Thread Thomas Monjalon
13/12/2017 22:10, Stephen Hemminger:
> On Wed, 13 Dec 2017 22:00:48 +0100
> Thomas Monjalon  wrote:
> 
> > 13/12/2017 18:09, Stephen Hemminger:
> > > Many DPDK drivers require that setup and initialization be done by
> > > the primary process. This is mostly to avoid dealing with concurrency 
> > > since
> > > there can be multiple secondary processes.  
> > 
> > I think we should consider this limitation as a bug.
> > We must allow a secondary process to initialize a device.
> > The race in device creation must be fixed.
> > 
> 
> Secondary processes should be able to do setup.
> But it is up to the application not to do it concurrently from multiple
> processes.

Yes there can be synchronization between processes.
But I think it is safer to fix the device creation race in ethdev.
Note that I am not talking about configuration concurrency,
but just race in probing.


Re: [dpdk-users] attach/detach on secondary process

2017-12-13 Thread Stephen Hemminger
On Wed, 13 Dec 2017 22:00:48 +0100
Thomas Monjalon  wrote:

> 13/12/2017 18:09, Stephen Hemminger:
> > Many DPDK drivers require that setup and initialization be done by
> > the primary process. This is mostly to avoid dealing with concurrency since
> > there can be multiple secondary processes.  
> 
> I think we should consider this limitation as a bug.
> We must allow a secondary process to initialize a device.
> The race in device creation must be fixed.
> 

Secondary processes should be able to do setup.
But it is up to the application not to do it concurrently from multiple
processes.


Re: [dpdk-users] attach/detach on secondary process

2017-12-13 Thread Thomas Monjalon
13/12/2017 18:09, Stephen Hemminger:
> Many DPDK drivers require that setup and initialization be done by
> the primary process. This is mostly to avoid dealing with concurrency since
> there can be multiple secondary processes.

I think we should consider this limitation as a bug.
We must allow a secondary process to initialize a device.
The race in device creation must be fixed.



Re: [dpdk-users] attach/detach on secondary process

2017-12-13 Thread Stephen Hemminger
On Wed, 13 Dec 2017 17:58:53 +0100
Ricardo Roldan  wrote:

> Hi,
> 
> We have a multi-process application and we need to support 
> attaching/detaching of ports. We are using the 17.11 version with the 
> Intel x520 (ixgbe driver) and virtio.
> 
> At the time we initialize our processes there are not any devices binded 
> with the DPDK drivers, so we initialize all processes (primary and 
> secondaries) with 0 ports.
> 
> This seems to work fine only on the primary processes, but on the 
> secondary processes we see some problems. In the following paragraphs I 
> describe the procedure used to attach/detach interfaces with DPDK.
> 
> For the attach procedure (all processes initially have no devices 
> attached):
> 
> - Bind the devices we want to attach to the DPDK driver (with the script 
> dpdk-devbind, from external process)
> 
> - Primary process: Call rte_eth_dev_attach
> 
> - Primary process: Configure ports using ...
> 
> - Secondary processes: Call to rte_eth_dev_attach
> 
> 
> Start to send/receive packets from all processes.
> 
> 
> For the detach procedure:
> 
> - Secondary processes: For each port, call rte_eth_dev_stop(port), 
> rte_eth_dev_close(port) and rte_eth_dev_detach(port, dev).
> 
> - Primary process: After the secondary processes have detach all their 
> ports, for each port call rte_eth_dev_stop(port), 
> rte_eth_dev_close(port) and rte_eth_dev_detach(port, dev).
> 
> - Bind the device to the original Linux driver (with the script 
> dpdk-devbind, from external process)
> 
> 
> With this approach we have noticed that when the secondary processes 
> call rte_dev_detach there is an error, because it calls the remove 
> operation, which ends up calling eth_ixgbe_dev_uninit that returns 
> -EPERM (because it does not allow a non primary process to uninitialize 
> the driver).
> 
> 
> Therefore, the port attach never works again on the secondary processes 
> as the function rte_eal_hotplug_add fails because it cannot find the 
> device.
> 
> 
> dev = bus->find_device(NULL, cmp_detached_dev_name, devname);
>      if (dev == NULL) {
>      RTE_LOG(ERR, EAL, "Cannot find unplugged device (%s)\n",
>      devname);
>      ret = -ENODEV;
>      goto err_devarg;
>      }
> 
> 
> This happens because in order to check unplugged devices, the function 
> cmp_detached_dev_name  checks if there is a pointer to the driver and 
> fails if it is already set, because the detach procedure never set the 
> driver variable to NULL.
> 
> static int cmp_detached_dev_name(const struct rte_device *dev,
>      const void *_name)
> {
>      const char *name = _name;
> 
>      /* skip attached devices */
>      RTE_LOG(ERR, EAL, "cmp_detached_dev_name dev %p name %s driver %p"
>      " search %s\n",
>      dev, dev->name, dev->driver, name);
>      if (dev->driver != NULL)
>      return 1;
> 
>      return strcmp(dev->name, name);
> }
> 
> To fix this behavior we have done the following changes on the DPDK code.
> 
> First, in order to prevent cmp_detached_dev_name from failing, 
> rte_eal_dev_detach sets driver to NULL.
> 
> diff --git a/lib/librte_eal/common/eal_common_dev.c 
> b/lib/librte_eal/common/eal_common_dev.c
> 
> index dda8f5835..9a363dcf7 100644
> --- a/lib/librte_eal/common/eal_common_dev.c
> +++ b/lib/librte_eal/common/eal_common_dev.c
> @@ -114,6 +114,7 @@ int rte_eal_dev_detach(struct rte_device *dev)
>      if (ret)
>      RTE_LOG(ERR, EAL, "Driver cannot detach the device (%s)\n",
>      dev->name);
> +   dev->driver = NULL;
>      return ret;
>   }/*
> */
> 
> 
> Then, in the rte_eth_dev_pci_generic_remove function, the call to 
> dev_uninit does not consider -EPERM an error, because when detaching a 
> port, some drivers return 0 and other drivers return -EPERM to indicate 
> that it is called from a secondary process.
> 
> diff --git a/lib/librte_ether/rte_ethdev_pci.h 
> b/lib/librte_ether/rte_ethdev_pci.h
> @@ -184,7 +184,7 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device 
> *pci_dev,
> 
>      if (dev_uninit) {
>      ret = dev_uninit(eth_dev);
> -   if (ret)
> +   if (ret && ret != -EPERM)
>      return ret;
>      }
> 
> Finally, in the rte_eth_dev_pci_release function, only the fields in the 
> shared memory region are reset if called from a primary process.
> 
> /**/diff --git a/lib/librte_ether/rte_ethdev_pci.h 
> b/lib/librte_ether/rte_ethdev_pci.h
> index 722075e09..a79188fbf 100644
> --- a/lib/librte_ether/rte_ethdev_pci.h
> +++ b/lib/librte_ether/rte_ethdev_pci.h
> @@ -125,16 +125,16 @@ rte_eth_dev_pci_release(struct rte_eth_dev *eth_dev)
>      /* free ether device */
>      rte_eth_dev_release_port(eth_dev);
> 
> -   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> rte_free(eth_dev->data

[dpdk-users] attach/detach on secondary process

2017-12-13 Thread Ricardo Roldan

Hi,

We have a multi-process application and we need to support 
attaching/detaching of ports. We are using the 17.11 version with the 
Intel x520 (ixgbe driver) and virtio.


At the time we initialize our processes there are not any devices binded 
with the DPDK drivers, so we initialize all processes (primary and 
secondaries) with 0 ports.


This seems to work fine only on the primary processes, but on the 
secondary processes we see some problems. In the following paragraphs I 
describe the procedure used to attach/detach interfaces with DPDK.


For the attach procedure (all processes initially have no devices 
attached):


- Bind the devices we want to attach to the DPDK driver (with the script 
dpdk-devbind, from external process)


- Primary process: Call rte_eth_dev_attach

- Primary process: Configure ports using ...

- Secondary processes: Call to rte_eth_dev_attach


Start to send/receive packets from all processes.


For the detach procedure:

- Secondary processes: For each port, call rte_eth_dev_stop(port), 
rte_eth_dev_close(port) and rte_eth_dev_detach(port, dev).


- Primary process: After the secondary processes have detach all their 
ports, for each port call rte_eth_dev_stop(port), 
rte_eth_dev_close(port) and rte_eth_dev_detach(port, dev).


- Bind the device to the original Linux driver (with the script 
dpdk-devbind, from external process)



With this approach we have noticed that when the secondary processes 
call rte_dev_detach there is an error, because it calls the remove 
operation, which ends up calling eth_ixgbe_dev_uninit that returns 
-EPERM (because it does not allow a non primary process to uninitialize 
the driver).



Therefore, the port attach never works again on the secondary processes 
as the function rte_eal_hotplug_add fails because it cannot find the 
device.



dev = bus->find_device(NULL, cmp_detached_dev_name, devname);
    if (dev == NULL) {
    RTE_LOG(ERR, EAL, "Cannot find unplugged device (%s)\n",
    devname);
    ret = -ENODEV;
    goto err_devarg;
    }


This happens because in order to check unplugged devices, the function 
cmp_detached_dev_name  checks if there is a pointer to the driver and 
fails if it is already set, because the detach procedure never set the 
driver variable to NULL.


static int cmp_detached_dev_name(const struct rte_device *dev,
    const void *_name)
{
    const char *name = _name;

    /* skip attached devices */
    RTE_LOG(ERR, EAL, "cmp_detached_dev_name dev %p name %s driver %p"
    " search %s\n",
    dev, dev->name, dev->driver, name);
    if (dev->driver != NULL)
    return 1;

    return strcmp(dev->name, name);
}

To fix this behavior we have done the following changes on the DPDK code.

First, in order to prevent cmp_detached_dev_name from failing, 
rte_eal_dev_detach sets driver to NULL.


diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c


index dda8f5835..9a363dcf7 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -114,6 +114,7 @@ int rte_eal_dev_detach(struct rte_device *dev)
    if (ret)
    RTE_LOG(ERR, EAL, "Driver cannot detach the device (%s)\n",
    dev->name);
+   dev->driver = NULL;
    return ret;
 }/*
*/


Then, in the rte_eth_dev_pci_generic_remove function, the call to 
dev_uninit does not consider -EPERM an error, because when detaching a 
port, some drivers return 0 and other drivers return -EPERM to indicate 
that it is called from a secondary process.


diff --git a/lib/librte_ether/rte_ethdev_pci.h 
b/lib/librte_ether/rte_ethdev_pci.h
@@ -184,7 +184,7 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device 
*pci_dev,


    if (dev_uninit) {
    ret = dev_uninit(eth_dev);
-   if (ret)
+   if (ret && ret != -EPERM)
    return ret;
    }

Finally, in the rte_eth_dev_pci_release function, only the fields in the 
shared memory region are reset if called from a primary process.


/**/diff --git a/lib/librte_ether/rte_ethdev_pci.h 
b/lib/librte_ether/rte_ethdev_pci.h

index 722075e09..a79188fbf 100644
--- a/lib/librte_ether/rte_ethdev_pci.h
+++ b/lib/librte_ether/rte_ethdev_pci.h
@@ -125,16 +125,16 @@ rte_eth_dev_pci_release(struct rte_eth_dev *eth_dev)
    /* free ether device */
    rte_eth_dev_release_port(eth_dev);

-   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
rte_free(eth_dev->data->dev_private);
+ eth_dev->data->dev_private = NULL;

-   eth_dev->data->dev_private = NULL;
-
-   /*
-    * Secondary process will check the name to attach.
-    * Clear this field to avoid attaching a released ports.
-    */
-   eth_dev->data->name[0] = '\0';
+   /*
+    * Secondary process