Hi Qi,

I wanted to push this old series, but I still have some questions
and comments. Please let's fix them quickly.

Note: I did not review the IPC mechanism and rollback. I trust you :)

28/09/2018 06:23, Qi Zhang:
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -67,6 +67,12 @@ New Features
>    SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
>    vdev_netvsc, tap, and failsafe drivers combination.
>  
> +* **Support device multi-process hotplug.**
> +
> +  Hotplug and hot-unplug for devices will now be supported in multiprocessing
> +  scenario. Any ethdev devices created in the primary process will be 
> regarded
> +  as shared and will be available for all DPDK processes. Synchronization
> +  between processes will be done using DPDK IPC.
>  

A blank line is missing here.

>  API Changes
>  -----------
> @@ -91,6 +97,11 @@ API Changes
>    flag the MAC can be properly configured in any case. This is particularly
>    important for bonding.
>  
> +* eal: scope of rte_eal_hotplug_add and rte_eal_hotplug_remove is extended.
> +
> +  In primary-secondary process model, ``rte_eal_hotplug_add`` will guarantee
> +  that device be attached on all processes, while ``rte_eal_hotplug_remove``
> +  will guarantee device be detached on all processes.
>  

Here too, double blank line before next heading.

> --- a/lib/librte_eal/common/eal_common_dev.c
> +++ b/lib/librte_eal/common/eal_common_dev.c
> -int
> -rte_eal_hotplug_add(const char *busname, const char *devname,
> -                    const char *drvargs)
> +/* help funciton to build devargs, caller should free the memory */

"help funciton" -> "helper function"

> +static char *
> +build_devargs(const char *busname, const char *devname,
> +           const char *drvargs)
>  {
>       char *devargs = NULL;
>       int size, length = -1;
> @@ -140,19 +143,33 @@ rte_eal_hotplug_add(const char *busname, const char 
> *devname,
>               if (length >= size)
>                       devargs = malloc(length + 1);
>               if (devargs == NULL)
> -                     return -ENOMEM;
> +                     break;
>       } while (size == 0);

It is an old code, please rebase on master.

> -int __rte_experimental
> -rte_dev_remove(struct rte_device *dev)
> +/* remove device at local process. */
> +int
> +local_dev_remove(struct rte_device *dev)
>  {
>       struct rte_bus *bus;
>       int ret;
> @@ -248,7 +268,194 @@ rte_dev_remove(struct rte_device *dev)
>       if (ret)
>               RTE_LOG(ERR, EAL, "Driver cannot detach the device (%s)\n",
>                       dev->name);
> -     rte_devargs_remove(dev->devargs);
> +     else
> +             rte_devargs_remove(dev->devargs);

It looks you are fixing a bug here. Good catch!

> +int __rte_experimental
> +rte_dev_probe(const char *devargs)
> +{
> +     struct eal_dev_mp_req req;
> +     struct rte_device *dev;
> +     int ret;
> +
> +     memset(&req, 0, sizeof(req));
> +     req.t = EAL_DEV_REQ_TYPE_ATTACH;
> +     strlcpy(req.devargs, devargs, EAL_DEV_MP_DEV_ARGS_MAX_LEN);
> +
> +     if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +             /**
> +              * If in secondary process, just send IPC request to
> +              * primary process.
> +              */
> +             ret = eal_dev_hotplug_request_to_primary(&req);
> +             if (ret) {
> +                     RTE_LOG(ERR, EAL,
> +                             "Failed to send hotplug request to primary\n");
> +                     return -ENOMSG;
> +             }
> +             if (req.result)
> +                     RTE_LOG(ERR, EAL,
> +                             "Failed to hotplug add device\n");
> +             return req.result;
> +     }
> +
> +     /* attach a shared device from primary start from here: */
> +
> +     /* primary attach the new device itself. */
> +     ret = local_dev_probe(devargs, &dev);
> +
> +     if (ret) {
> +             RTE_LOG(ERR, EAL,
> +                     "Failed to attach device on primary process\n");
> +
> +             /**
> +              * it is possible that secondary process failed to attached a
> +              * device that primary process have during initialization,
> +              * so for -EEXIST case, we still need to sync with secondary
> +              * process.
> +              */
> +             if (ret != -EEXIST)
> +                     return ret;
> +     }
> +
> +     /* primary send attach sync request to secondary. */
> +     ret = eal_dev_hotplug_request_to_secondary(&req);
> +
> +     /* if any commnunication error, we need to rollback. */

typo: communication

> +     if (ret) {
> +             RTE_LOG(ERR, EAL,
> +                     "Failed to send hotplug add request to secondary\n");
> +             ret = -ENOMSG;
> +             goto rollback;
> +     }
> +
> +     /**
> +      * if any secondary failed to attach, we need to consider if rollback
> +      * is necessary.
> +      */
> +     if (req.result) {
> +             RTE_LOG(ERR, EAL,
> +                     "Failed to attach device on secondary process\n");
> +             ret = req.result;
> +
> +             /* for -EEXIST, we don't need to rollback. */
> +             if (ret == -EEXIST)
> +                     return ret;
> +             goto rollback;
> +     }
> +
> +     return 0;
> +
> +rollback:
> +     req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK;
> +
> +     /* primary send rollback request to secondary. */
> +     if (eal_dev_hotplug_request_to_secondary(&req))

For all occurences of "if (function())", the coding style is requesting
an explicit check of the return value: if (function() != 0)

> +             RTE_LOG(WARNING, EAL,
> +                     "Failed to rollback device attach on secondary."
> +                     "Devices in secondary may not sync with primary\n");
> +
> +     /* primary rollback itself. */
> +     if (local_dev_remove(dev))
> +             RTE_LOG(WARNING, EAL,
> +                     "Failed to rollback device attach on primary."
> +                     "Devices in secondary may not sync with primary\n");
> +
> +     return ret;
> +}
> +
> +int __rte_experimental
> +rte_dev_remove(struct rte_device *dev)
> +{
> +     struct eal_dev_mp_req req;
> +     char *devargs;
> +     int ret;
> +
> +     devargs = build_devargs(dev->devargs->bus->name, dev->name, "");
> +     if (devargs == NULL)
> +             return -ENOMEM;
> +
> +     memset(&req, 0, sizeof(req));
> +     req.t = EAL_DEV_REQ_TYPE_DETACH;
> +     strlcpy(req.devargs, devargs, EAL_DEV_MP_DEV_ARGS_MAX_LEN);
> +     free(devargs);
> +
> +     if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +             /**
> +              * If in secondary process, just send IPC request to
> +              * primary process.
> +              */
> +             ret = eal_dev_hotplug_request_to_primary(&req);
> +             if (ret) {
> +                     RTE_LOG(ERR, EAL,
> +                             "Failed to send hotplug request to primary\n");
> +                     return -ENOMSG;
> +             }
> +             if (req.result)
> +                     RTE_LOG(ERR, EAL,
> +                             "Failed to hotplug remove device\n");
> +             return req.result;
> +     }
> +
> +     /* detach a device from primary start from here: */
> +
> +     /* primary send detach sync request to secondary */
> +     ret = eal_dev_hotplug_request_to_secondary(&req);
> +
> +     /**
> +      * if communication error, we need to rollback, because it is possible
> +      * part of the secondary processes still detached it successfully.
> +      */
> +     if (ret) {

ret is not a boolean, please do explicit check != 0.

> +             RTE_LOG(ERR, EAL,
> +                     "Failed to send device detach request to secondary\n");
> +             ret = -ENOMSG;
> +             goto rollback;
> +     }
> +
> +     /**
> +      * if any secondary failed to detach, we need to consider if rollback
> +      * is necessary.
> +      */
> +     if (req.result) {

result is not a boolean, please do explicit check != 0.

> +             RTE_LOG(ERR, EAL,
> +                     "Failed to detach device on secondary process\n");
> +             ret = req.result;
> +             /**
> +              * if -ENOENT, we don't need to rollback, since devices is
> +              * already detached on secondary process.
> +              */
> +             if (ret != -ENOENT)
> +                     goto rollback;
> +     }
> +
> +     /* primary detach the device itself. */
> +     ret = local_dev_remove(dev);
> +
> +     /* if primary failed, still need to consider if rollback is necessary */
> +     if (ret) {
> +             RTE_LOG(ERR, EAL,
> +                     "Failed to detach device on primary process\n");
> +             /* if -ENOENT, we don't need to rollback */
> +             if (ret == -ENOENT)
> +                     return ret;
> +             goto rollback;
> +     }
> +
> +     return 0;
> +
> +rollback:
> +     req.t = EAL_DEV_REQ_TYPE_DETACH_ROLLBACK;
> +
> +     /* primary send rollback request to secondary. */
> +     if (eal_dev_hotplug_request_to_secondary(&req))
> +             RTE_LOG(WARNING, EAL,
> +                     "Failed to rollback device detach on secondary."
> +                     "Devices in secondary may not sync with primary\n");
> +
>       return ret;
>  }

> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> +/**
> + * Register all mp action callbacks for hotplug.
> + *
> + * @return
> + *   0 on success, negative on error.
> + */
> +int rte_dev_hotplug_mp_init(void);

This function is called by the init, so it should not be private.
The app should be free to build its own init routine.

> --- /dev/null
> +++ b/lib/librte_eal/common/hotplug_mp.h
> +#include <rte_dev.h>
> +#include <rte_bus.h>

I think EAL headers should be included with quotes.

> +/**
> + * this is a synchronous wrapper for secondary process send

Missing uppercase at the beggining of sentence.

> + * request to primary process, this is invoked when an attach
> + * or detach request issued from primary process.

Missing "is" before "issued": request is issued.

> --- a/lib/librte_eal/common/include/rte_dev.h
> +++ b/lib/librte_eal/common/include/rte_dev.h
> @@ -190,6 +190,9 @@ int rte_eal_dev_detach(struct rte_device *dev);
>  
>  /**
>   * Hotplug add a given device to a specific bus.
> + * In multi-process, this function will inform all other processes
> + * to hotplug add the same device. Any failure on other process rollback
> + * the action.

Better to leave a blank line before this comment.
Small reword:

 * Hotplug add a given device to a specific bus.
 *
 * In multi-process, it will request other processes to add the same device.
 * A failure, in any process, will rollback the action.

You should add the same comment for rte_dev_probe().

> @@ -219,6 +222,9 @@ int __rte_experimental rte_dev_probe(const char *devargs);
>  
>  /**
>   * Hotplug remove a given device from a specific bus.
> + * In multi-process, this function will inform all other processes
> + * to hotplug remove the same device. Any failure on other process
> + * will rollback the action.

Same reword:

 * Hotplug remove a given device from a specific bus.
 *
 * In multi-process, it will request other processes to remove the same device.
 * A failure, in any process, will rollback the action.

[...]
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> +     /* register mp action callbacks for hotplug */
> +     if (rte_dev_hotplug_mp_init() < 0) {

In the comment, better to say "multi-process" instead of the cryptic "mp".




Reply via email to