Re: [dpdk-dev] [PATCH 0/8] Remove IPC threads

2018-06-26 Thread Zhang, Qi Z


> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Tuesday, June 26, 2018 9:19 AM
> To: 'Anatoly Burakov' ; dev@dpdk.org
> Cc: Ananyev, Konstantin ;
> tho...@monjalon.net; Richardson, Bruce 
> Subject: RE: [dpdk-dev] [PATCH 0/8] Remove IPC threads
> 
> Hi Anatoly and Thomas:
> 
> Sorry for raising this late, but merging the mp thread into the interrupt
> thread seems to cause a problem for enabling hotplug on secondary [1].
>
> The issue is that when a secondary wants to attach a shared device, it sends
> a request to the primary. The primary then runs in the mp callback (mp
> thread) to attach the device, and it will call rte_malloc, which may need to
> grow the heap, and that requires sync IPC. As you know, we cannot do sync IPC
> from the mp thread itself. So the solution is to move the real work to a
> separate thread that has no such limitation, and the interrupt thread is a
> good candidate, because we just need to call rte_eal_alarm_set and we don't
> need to worry about the execution sequence.
>
> But if we merge the mp thread into the interrupt thread, this solution will
> not work. We may need to create a specific temporary thread to handle the
> callback, but this looks like rebuilding something we already have.

OK, actually this method looks good to me :). I will send a v4 with this,
but please let me know if you have a better idea.

> So I think we need to revisit whether to merge the threads before we have a
> good solution for this kind of issue.
> 
> Thanks
> Qi
> 
> [1] https://mails.dpdk.org/archives/dev/2018-June/105018.html
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Anatoly Burakov
> > Sent: Friday, June 15, 2018 10:25 PM
> > To: dev@dpdk.org
> > Cc: Ananyev, Konstantin ;
> > tho...@monjalon.net; Richardson, Bruce 
> > Subject: [dpdk-dev] [PATCH 0/8] Remove IPC threads
> >
> > As previously discussed [1], IPC threads need to be removed and their
> > workload moved to interrupt thread.
> >
> > FreeBSD did not have an interrupt thread, nor did it support alarm
> > API. This patchset adds support for both on FreeBSD. FreeBSD interrupt
> > thread is based on kevent, FreeBSD's native event multiplexing
> > mechanism similar to Linux's epoll.
> >
> > The patchset makes FreeBSD's interrupts and alarm work just enough to
> > suffice for purposes of IPC, however there are really weird problems
> > observed.
> > Specifically, FreeBSD's kevent timers are really slow to trigger for
> > some reason, sleeping on a 10ms timer as much as 200ms before waking
> > up. Interrupt handling on fd's is also a bit flaky.
> >
> > It has also been observed that both problems go away if we do not
> > affinitize master lcore (by commenting relevant code out [2]). It is
> > not known why these problems are observed, nor is it clear what a
> > solution might entail.
> >
> > For the purposes of making IPC work and having rudimentary support for
> > alarm and interrupt API's, this patchset works fine. However, because
> > of the above described issues, documentation will not be updated to
> > indicate support for interrupts on FreeBSD at this time.
> >
> > [1] http://dpdk.org/dev/patchwork/patch/36579/
> > [2]
> > http://dpdk.org/browse/dpdk/tree/lib/librte_eal/bsdapp/eal/eal.c#n729
> >
> > Anatoly Burakov (4):
> >   ipc: remove IPC thread for async requests
> >   eal/bsdapp: add interrupt thread
> >   eal/bsdapp: add alarm support
> >   ipc: remove main IPC thread
> >
> > Jianfeng Tan (4):
> >   eal/linux: use glibc malloc in alarm
> >   eal/linux: use glibc malloc in interrupt handling
> >   eal: bring forward init of interrupt handling
> >   eal: add IPC type for interrupt thread
> >
> >  lib/librte_eal/bsdapp/eal/eal.c   |  10 +-
> >  lib/librte_eal/bsdapp/eal/eal_alarm.c | 299 +++-
> >  lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 +
> >  lib/librte_eal/bsdapp/eal/eal_interrupts.c| 460 +-
> >  lib/librte_eal/common/eal_common_proc.c   | 243 -
> >  .../common/include/rte_eal_interrupts.h   |   1 +
> >  lib/librte_eal/linuxapp/eal/eal.c |  10 +-
> >  lib/librte_eal/linuxapp/eal/eal_alarm.c   |   9 +-
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c  |  19 +-
> >  test/test/test_interrupts.c   |  29 +-
> >  10 files changed, 899 insertions(+), 200 deletions(-)
> >  create mode 100644 lib/librte_eal/bsdapp/eal/eal_alarm_private.h
> >
> > --
> > 2.17.1


[dpdk-dev] [PATCH v4 00/24] enable hotplug on multi-process

2018-06-26 Thread Qi Zhang
v4:
- since the mp thread will be merged into the interrupt thread, the v3 fix
  for the sync IPC deadlock will not work. The new version enables a
  mechanism to invoke an mp action callback in a temporary thread to
  avoid the IPC deadlock. With this, the secondary-to-primary request
  implementation is also simplified, since we can use a sync request
  directly in a separate thread.

v3:
- enable mp init callback registration to help non-EAL modules initialize
  their mp channel during rte_eal_init
- fix attaching a shared device from secondary.
  1) deadlock due to sync IPC being invoked in rte_malloc in the primary
     process when handling a secondary request to attach a device; the
     solution is for the primary process to issue shared device
     attach/detach in the interrupt thread.
  2) returned port_id not correct.
- check nb_sent and nb_received in sync IPC.
- fix memory leak during error handling in attach_on_secondary.
- improve clean_lock_callback to only lock/unlock spinlock once
- improve error code return in check-reply during async IPC.
- remove rte_ prefix of internal function in ethdev_mp.c
- sample code improvement.
  1) rename sample to "hotplug_mp", and move to example/multi-process.
  2) cleanup header include.
  3) call rte_eal_cleanup before exit.

v2:
- rename rte_ethdev_mp.* to ethdev_mp.*
- rename rte_ethdev_lock.* to ethdev_lock.*
- move internal function to ethdev_private.h
- separate rte_eth_dev_[un]lock into rte_eth_dev_[un]lock and
  rte_eth_dev_[un]lock_with_callback
- lock callbacks will be removed automatically after device is detached.
- add experimental tag for all new APIs.
- fix coding style issue.
- fix wrong license header in sample code.
- fix spelling 
- fix meson.build.
- improve comments. 

Background:
===========

Currently a secondary process only syncs ethdev from the primary
process at the init stage, but it will not be aware if a device
is attached/detached on the primary process at runtime.

However, there is a requirement from applications that use the
primary-secondary process model: the primary process works as a
resource management process that creates/destroys virtual devices
at runtime, while the secondary processes deal with networking
on these devices.

Solution:
=========

So the original intention is to fix this gap, but beyond that
the patch set provides a more comprehensive solution to handle
different hotplug cases in a multi-process situation. It covers the
scenarios below:

1. Attach a share device from primary
2. Detach a share device from primary
3. Attach a share device from secondary
4. Detach a share device from secondary
5. Attach a private device from secondary
6. Detach a private device from secondary
7. Detach a share device from secondary privately
8. Attach a share device from secondary privately

In the primary-secondary process model, we assume ethernet devices are
shared by default. That means an attach or detach of a device on any process
is broadcast to all other processes through the mp channel, so that device
information is synchronized on all processes.

Any failure during the attach process will cause inconsistent status
between processes, so a proper rollback action should be considered.
Also, it is not safe to detach a shared device while another process is
still using it, so a handshake mechanism is introduced.

Scenario for Case 1, 2:

attach device from primary
a) primary attach the new device if failed goto h).
b) primary send attach sync request to all secondary.
c) secondary receive request and attach device and send reply.
d) primary check the reply if all success go to i).
e) primary send attach rollback sync request to all secondary.
f) secondary receive the request and detach device and send reply.
g) primary receive the reply and detach device as rollback action.
h) attach fail
i) attach success

detach device from primary
a) primary perform pre-detach check, if device is locked, goto i).
b) primary send pre-detach sync request to all secondary.
c) secondary perform pre-detach check and send reply.
d) primary check the reply if any fail goto i).
e) primary send detach sync request to all secondary
f) secondary detach the device and send reply (assume no fail)
g) primary detach the device.
h) detach success
i) detach failed
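
The attach-from-primary steps a) to i) above can be sketched as a small
single-process simulation. This is only an illustration of the rollback flow,
not code from the patch set: the sim_* names are hypothetical, and each
"secondary" is a plain struct rather than a peer reached over the mp channel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one secondary process. */
struct sim_secondary {
    bool fail_attach; /* force this peer's attach to fail */
    bool attached;    /* this peer's local view of the device */
};

/* Steps c) and f): what a secondary does when it handles a request. */
static int sim_secondary_attach(struct sim_secondary *s)
{
    if (s->fail_attach)
        return -1;
    s->attached = true;
    return 0;
}

static void sim_secondary_detach(struct sim_secondary *s)
{
    s->attached = false;
}

/* Returns 0 on attach success (step i), -1 on attach failure (step h). */
static int sim_primary_attach(bool primary_fail, bool *primary_attached,
                              struct sim_secondary *peers, size_t n)
{
    bool any_failed = false;
    size_t i;

    /* a) attach locally first */
    if (primary_fail)
        return -1;
    *primary_attached = true;

    /* b), c) broadcast the attach request and collect the replies */
    for (i = 0; i < n; i++)
        if (sim_secondary_attach(&peers[i]) != 0)
            any_failed = true;

    /* d) every reply succeeded: done */
    if (!any_failed)
        return 0;

    /* e), f) ask every secondary to roll back */
    for (i = 0; i < n; i++)
        sim_secondary_detach(&peers[i]);

    /* g) roll back the primary itself */
    *primary_attached = false;
    return -1;
}
```

With two peers where one has fail_attach set, sim_primary_attach returns -1
and leaves the primary and both peers detached, which is the consistent state
the rollback is meant to restore.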

Scenario for case 3, 4:

attach device from secondary:
a) secondary send async request to primary and wait on a condition
   which will be released by matched response from primary.
b) primary receive the request and attach the new device if failed
   goto i).
c) primary forward attach request to all secondary as async request
   (because this is in mp thread context, use sync request will deadlock,
   same reason for all following async request.)
d) secondary receive request and attach device and send reply.
e) primary check the reply if all success go to j).
f) primary send attach rollback async request to all secondary.
g) secondary receive the request and detach device and send reply.
h) primary receive the reply and detach device as rollback action.
i) send fail response to s

[dpdk-dev] [PATCH v4 01/24] eal: introduce one device scan

2018-06-26 Thread Qi Zhang
When hotplugging a new device, it is not necessary to scan everything
on the bus since the devname and devargs are already known. So a new
rte_bus op "scan_one" is introduced; a bus driver can implement this
function to simplify the hotplug process.

Signed-off-by: Qi Zhang 
---

 lib/librte_eal/common/eal_common_dev.c  | 17 +
 lib/librte_eal/common/include/rte_bus.h | 16 
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 61cb3b162..1ad033536 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -147,11 +147,20 @@ int __rte_experimental rte_eal_hotplug_add(const char *busname, const char *devn
if (ret)
goto err_devarg;
 
-   ret = bus->scan();
-   if (ret)
-   goto err_devarg;
+   /**
+* if bus support to scan specific device by devargs,
+* we don't need to scan all devices on the bus.
+*/
+   if (bus->scan_one) {
+   dev = bus->scan_one(da);
+   } else {
+   ret = bus->scan();
+   if (ret)
+   goto err_devarg;
+
+   dev = bus->find_device(NULL, cmp_detached_dev_name, devname);
+   }
 
-   dev = bus->find_device(NULL, cmp_detached_dev_name, devname);
if (dev == NULL) {
RTE_LOG(ERR, EAL, "Cannot find unplugged device (%s)\n",
devname);
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index eb9eded4e..3269ef78b 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -84,6 +84,21 @@ enum rte_iova_mode {
 typedef int (*rte_bus_scan_t)(void);
 
 /**
+ * Bus specific scan for one specific device attached on the bus.
+ * For each bus object, the scan would be responsible for finding the specific
+ * device and adding it to its private device list, and the device object will
+ * be return also.
+ *
+ * @param devargs
+ * Device arguments be used to identify the device.
+ *
+ * @return
+ * !NULL for successful scan
+ * NULL for unsuccessful scan
+ */
+typedef struct rte_device *(*rte_bus_scan_one_t)(struct rte_devargs *devargs);
+
+/**
  * Implementation specific probe function which is responsible for linking
  * devices on that bus with applicable drivers.
  *
@@ -204,6 +219,7 @@ struct rte_bus {
TAILQ_ENTRY(rte_bus) next;   /**< Next bus object in linked list */
const char *name;/**< Name of the bus */
rte_bus_scan_t scan; /**< Scan for devices attached to bus */
+   rte_bus_scan_one_t scan_one; /**< Scan one device using devargs */
rte_bus_probe_t probe;   /**< Probe devices on bus */
rte_bus_find_device_t find_device; /**< Find a device on the bus */
rte_bus_plug_t plug; /**< Probe single device for drivers */
-- 
2.13.6
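
The optional-op pattern in the hunk above (use scan_one when the bus provides
it, otherwise fall back to a full scan followed by a lookup) can be sketched
outside of DPDK as follows. All toy_* names are hypothetical and the device
table is made up for illustration; only the control flow mirrors the patch.

```c
#include <stddef.h>
#include <string.h>

struct toy_device {
    const char *name;
};

/* A bus with a mandatory full scan and an optional single-device scan. */
struct toy_bus {
    int (*scan)(void);
    struct toy_device *(*scan_one)(const char *devname); /* may be NULL */
    struct toy_device *(*find_device)(const char *devname);
};

static struct toy_device toy_devs[] = { {"net_null0"}, {"net_null1"} };
static int full_scan_calls; /* counts how often the expensive path runs */

static int toy_scan(void)
{
    full_scan_calls++;
    return 0;
}

static struct toy_device *toy_find(const char *devname)
{
    size_t i;

    for (i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++)
        if (strcmp(toy_devs[i].name, devname) == 0)
            return &toy_devs[i];
    return NULL;
}

/* Prefer the targeted scan; otherwise scan everything and then look up. */
static struct toy_device *toy_hotplug_lookup(const struct toy_bus *bus,
                                             const char *devname)
{
    if (bus->scan_one != NULL)
        return bus->scan_one(devname);

    if (bus->scan() != 0)
        return NULL;
    return bus->find_device(devname);
}
```

A bus that leaves scan_one as NULL keeps its old behaviour, so existing bus
drivers do not need to change; only buses that set the op skip the full scan.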



[dpdk-dev] [PATCH v4 02/24] bus/vdev: enable one device scan

2018-06-26 Thread Qi Zhang
This patch implements the scan_one op for the vdev bus. It gives two benefits:
1. Improved scan efficiency when a device is attached via hotplug, since there
is no need to populate a new device by iterating all devargs in devargs_list.
2. It also avoids invoking sync IPC (which happens in vdev->scan on a secondary
process). This removes the potential deadlock in the case
when a secondary process receives a request from the primary process to attach a
new device, since vdev->scan would be invoked on the mp thread itself in that
case.

Signed-off-by: Qi Zhang 
---

 drivers/bus/vdev/vdev.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 6139dd551..cdbd77df0 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -467,6 +467,35 @@ vdev_scan(void)
return 0;
 }
 
+static struct rte_device *vdev_scan_one(struct rte_devargs *devargs)
+{
+   struct rte_vdev_device *dev = NULL;
+
+   dev = calloc(1, sizeof(*dev));
+   if (!dev) {
+   VDEV_LOG(ERR, "failed to allocate memory for new device");
+   return NULL;
+   }
+
+   rte_spinlock_recursive_lock(&vdev_device_list_lock);
+
+   if (find_vdev(devargs->name)) {
+   VDEV_LOG(ERR, "device %s already exist", devargs->name);
+   free(dev);
+   rte_spinlock_recursive_unlock(&vdev_device_list_lock);
+   return NULL;
+   }
+
+   dev->device.devargs = devargs;
+   dev->device.numa_node = SOCKET_ID_ANY;
+   dev->device.name = devargs->name;
+   TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
+
+   rte_spinlock_recursive_unlock(&vdev_device_list_lock);
+
+   return &dev->device;
+}
+
 static int
 vdev_probe(void)
 {
@@ -531,6 +560,7 @@ vdev_unplug(struct rte_device *dev)
 
 static struct rte_bus rte_vdev_bus = {
.scan = vdev_scan,
+   .scan_one = vdev_scan_one,
.probe = vdev_probe,
.find_device = vdev_find_device,
.plug = vdev_plug,
-- 
2.13.6



[dpdk-dev] [PATCH v4 04/24] eal: enable multi process init callback

2018-06-26 Thread Qi Zhang
Introduce new API rte_eal_register_mp_init that helps to register
a callback function which will be invoked right after the multi-process
channel is established (rte_mp_channel_init). Typically the API
will be used by other modules that want their mp channel action callbacks
to be registered automatically during rte_eal_init.

Signed-off-by: Qi Zhang 
---
 lib/librte_eal/common/eal_common_proc.c | 51 -
 lib/librte_eal/common/eal_private.h |  5 
 lib/librte_eal/common/include/rte_eal.h | 34 ++
 lib/librte_eal/linuxapp/eal/eal.c   |  2 ++
 4 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index 707d8ab30..fc0eb4d17 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -619,6 +619,42 @@ unlink_sockets(const char *filter)
return 0;
 }
 
+struct mp_init_entry {
+   TAILQ_ENTRY(mp_init_entry) next;
+   rte_eal_mp_init_callback_t callback;
+};
+
+TAILQ_HEAD(mp_init_entry_list, mp_init_entry);
+static struct mp_init_entry_list mp_init_entry_list =
+   TAILQ_HEAD_INITIALIZER(mp_init_entry_list);
+
+static int process_mp_init_callbacks(void)
+{
+   struct mp_init_entry *entry;
+   int ret;
+
+   TAILQ_FOREACH(entry, &mp_init_entry_list, next) {
+   ret = entry->callback();
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int __rte_experimental
+rte_eal_register_mp_init(rte_eal_mp_init_callback_t callback)
+{
+   struct mp_init_entry *entry = calloc(1, sizeof(struct mp_init_entry));
+
+   if (entry == NULL)
+   return -ENOMEM;
+
+   entry->callback = callback;
+   TAILQ_INSERT_TAIL(&mp_init_entry_list, entry, next);
+
+   return 0;
+}
+
 int
 rte_mp_channel_init(void)
 {
@@ -686,7 +722,20 @@ rte_mp_channel_init(void)
flock(dir_fd, LOCK_UN);
close(dir_fd);
 
-   return 0;
+   return process_mp_init_callbacks();
+}
+
+void rte_mp_init_callback_cleanup(void)
+{
+   struct mp_init_entry *entry;
+
+   while (!TAILQ_EMPTY(&mp_init_entry_list)) {
+   TAILQ_FOREACH(entry, &mp_init_entry_list, next) {
+   TAILQ_REMOVE(&mp_init_entry_list, entry, next);
+   free(entry);
+   break;
+   }
+   }
 }
 
 /**
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index bdadc4d50..bc230ee23 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -247,6 +247,11 @@ struct rte_bus *rte_bus_find_by_device_name(const char *str);
 int rte_mp_channel_init(void);
 
 /**
+ * Cleanup all mp channel init callbacks.
+ */
+void rte_mp_init_callback_cleanup(void);
+
+/**
  * Internal Executes all the user application registered callbacks for
  * the specific device. It is for DPDK internal user only. User
  * application should not call it directly.
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 8de5d69e8..506f17f34 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -512,6 +512,40 @@ __rte_deprecated
 const char *
 rte_eal_mbuf_default_mempool_ops(void);
 
+/**
+ * Callback function right after multi-process channel be established.
+ * Typical implementation of these functions is to register mp channel
+ * action callbacks
+ *
+ * @return
+ *  - 0 on success.
+ *  - (<0) on failure.
+ */
+typedef int (*rte_eal_mp_init_callback_t)(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Register a callback function that will be invoked right after
+ * multi-process channel be established (rte_mp_channel_init). Typically
+ * the function is used by other module that want it's mp channel
+ * action callbacks can be registered during rte_eal_init automatically.
+ *
+ * @note
+ *   This function only take effect when be called before rte_eal_init,
+ *   and all registered callback will be clear during rte_eal_cleanup.
+ *
+ * @param callback
+ *   function be called at that moment.
+ *
+ * @return
+ *  - 0 on success.
+ *  - (<0) on failure.
+ */
+int __rte_experimental
+rte_eal_register_mp_init(rte_eal_mp_init_callback_t callback);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..45cccff7e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -1048,6 +1048,8 @@ int __rte_experimental
 rte_eal_cleanup(void)
 {
rte_service_finalize();
+   rte_mp_init_callback_cleanup();
+
return 0;
 }
 
-- 
2.13.6
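
For readers unfamiliar with the sys/queue.h macros, the register / run /
cleanup cycle of the patch above can be sketched standalone as below. The
init_* names and the two example callbacks are hypothetical. The cleanup loop
here pops the head with TAILQ_FIRST until the list is empty, which is one
common way to drain a TAILQ and is not necessarily how the final DPDK code
looks.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/queue.h>

typedef int (*init_cb_t)(void);

struct init_entry {
    TAILQ_ENTRY(init_entry) next;
    init_cb_t cb;
};

TAILQ_HEAD(init_list, init_entry);
static struct init_list init_list = TAILQ_HEAD_INITIALIZER(init_list);

/* Append a callback; it runs later, in registration order. */
static int init_register(init_cb_t cb)
{
    struct init_entry *e = calloc(1, sizeof(*e));

    if (e == NULL)
        return -ENOMEM;
    e->cb = cb;
    TAILQ_INSERT_TAIL(&init_list, e, next);
    return 0;
}

/* Run callbacks in order and stop at the first failure. */
static int init_run_all(void)
{
    struct init_entry *e;
    int ret;

    TAILQ_FOREACH(e, &init_list, next) {
        ret = e->cb();
        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Drain the list by repeatedly removing the head entry. */
static void init_cleanup(void)
{
    struct init_entry *e;

    while ((e = TAILQ_FIRST(&init_list)) != NULL) {
        TAILQ_REMOVE(&init_list, e, next);
        free(e);
    }
}

/* Example callbacks: count invocations, one succeeds and one fails. */
static int cb_calls;
static int cb_ok(void)   { cb_calls++; return 0; }
static int cb_fail(void) { cb_calls++; return -1; }
```

Registering cb_ok, cb_fail, cb_ok and then calling init_run_all stops at the
second callback and returns its error, so the third callback never runs. That
matches the early return in process_mp_init_callbacks.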



[dpdk-dev] [PATCH v4 03/24] ethdev: add function to release port in local process

2018-06-26 Thread Qi Zhang
Add driver API rte_eth_dev_release_port_private to support the
case where an ethdev is released only on a secondary process:
only the local state is set to unused, and shared data is not
reset, so the primary process can still use it.

Signed-off-by: Qi Zhang 
---

 lib/librte_ethdev/rte_ethdev.c| 24 +---
 lib/librte_ethdev/rte_ethdev_driver.h | 13 +
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index a9977df97..205b2ee33 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -359,6 +359,23 @@ rte_eth_dev_attach_secondary(const char *name)
 }
 
 int
+rte_eth_dev_release_port_private(struct rte_eth_dev *eth_dev)
+{
+   if (eth_dev == NULL)
+   return -EINVAL;
+
+   _rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
+
+   rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+   eth_dev->state = RTE_ETH_DEV_UNUSED;
+
+   rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
+   return 0;
+}
+
+int
 rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 {
if (eth_dev == NULL)
@@ -370,9 +387,10 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
-   eth_dev->state = RTE_ETH_DEV_UNUSED;
-
-   memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+   if (eth_dev->state != RTE_ETH_DEV_UNUSED) {
+   eth_dev->state = RTE_ETH_DEV_UNUSED;
+   memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+   }
 
rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
 
diff --git a/lib/librte_ethdev/rte_ethdev_driver.h b/lib/librte_ethdev/rte_ethdev_driver.h
index c9c825e3f..49c27223d 100644
--- a/lib/librte_ethdev/rte_ethdev_driver.h
+++ b/lib/librte_ethdev/rte_ethdev_driver.h
@@ -70,6 +70,19 @@ int rte_eth_dev_release_port(struct rte_eth_dev *eth_dev);
 
 /**
  * @internal
+ * Release the specified ethdev port in local process, only set to ethdev
+ * state to unused, but not reset share data since it assume other process
+ * is still using it, typically it is called by secondary process.
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eth_dev_release_port_private(struct rte_eth_dev *eth_dev);
+
+/**
+ * @internal
  * Release device queues and clear its configuration to force the user
  * application to reconfigure it. It is for internal use only.
  *
-- 
2.13.6



[dpdk-dev] [PATCH v4 05/24] eal: support mp task be invoked in a separate task

2018-06-26 Thread Qi Zhang
Sync IPC can't be invoked in the mp handler itself, since that
would cause a deadlock. This patch introduces a new API,
rte_eal_mp_task_add, to let an mp handler delegate its work to a
separate task.

Signed-off-by: Qi Zhang 
---
 lib/librte_eal/common/eal_common_proc.c | 67 +
 lib/librte_eal/common/include/rte_eal.h | 31 +++
 2 files changed, 98 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index fc0eb4d17..166bb0951 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "eal_private.h"
 #include "eal_filesystem.h"
@@ -738,6 +739,72 @@ void rte_mp_init_callback_cleanup(void)
}
 }
 
+struct mp_task {
+   TAILQ_ENTRY(mp_task) next;
+   rte_eal_mp_task task;
+   void *args;
+};
+
+TAILQ_HEAD(mp_task_list, mp_task);
+static struct mp_task_list mp_task_list =
+   TAILQ_HEAD_INITIALIZER(mp_task_list);
+static rte_spinlock_t mp_task_lock = RTE_SPINLOCK_INITIALIZER;
+
+static void *schedule_mp_task(void *args __rte_unused)
+{
+   struct mp_task *task;
+
+   rte_spinlock_lock(&mp_task_lock);
+   while (!TAILQ_EMPTY(&mp_task_list)) {
+
+   task = TAILQ_FIRST(&mp_task_list);
+   rte_spinlock_unlock(&mp_task_lock);
+
+   task->task(task->args);
+
+   rte_spinlock_lock(&mp_task_lock);
+   TAILQ_REMOVE(&mp_task_list, task, next);
+   if (task->args)
+   free(task->args);
+   free(task);
+   }
+
+   rte_spinlock_unlock(&mp_task_lock);
+   return NULL;
+}
+
+int __rte_experimental
+rte_eal_mp_task_add(rte_eal_mp_task task, void *args)
+{
+   struct mp_task *t = calloc(1, sizeof(struct mp_task));
+   pthread_t tid;
+
+   if (t == NULL)
+   return -ENOMEM;
+
+   t->task = task;
+   t->args = args;
+
+   rte_spinlock_lock(&mp_task_lock);
+
+   if (TAILQ_EMPTY(&mp_task_list)) {
+   TAILQ_INSERT_TAIL(&mp_task_list, t, next);
+   rte_spinlock_unlock(&mp_task_lock);
+
+   if (rte_ctrl_thread_create(&tid,
+   "rte_mp_handle", NULL, schedule_mp_task, NULL) < 0) {
+   RTE_LOG(ERR, EAL, "failed to create mp thead: %s\n",
+   strerror(errno));
+   }
+   return 0;
+   }
+
+   TAILQ_INSERT_TAIL(&mp_task_list, t, next);
+   rte_spinlock_unlock(&mp_task_lock);
+
+   return 0;
+}
+
 /**
  * Return -1, as fail to send message and it's caused by the local side.
  * Return 0, as fail to send message and it's caused by the remote side.
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 506f17f34..0ce49668c 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -546,6 +546,37 @@ typedef int (*rte_eal_mp_init_callback_t)(void);
 int __rte_experimental
 rte_eal_register_mp_init(rte_eal_mp_init_callback_t callback);
 
+/**
+ * Function to perform the task that handle mp request,
+ * it will be scheduled on a separate task.
+ *
+ * @param args
+ *   argument parse to the function point.
+ */
+typedef void (*rte_eal_mp_task)(void *args);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Add a rte_eal_mp_task into a task list, it will invoked in a
+ * separate task, the purpose is to prevent deadlock if sync IPC
+ * is required in the task.
+ *
+ * @param task
+ *   function point to perform the task.
+ *
+ * @param args
+ *   argument parse to the function point.
+ *
+ * @return
+ *  - 0 on success.
+ *  - (<0) on failure.
+ */
+int __rte_experimental
+rte_eal_mp_task_add(rte_eal_mp_task task, void *args);
+
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.13.6



[dpdk-dev] [PATCH v4 06/24] ethdev: enable hotplug on multi-process

2018-06-26 Thread Qi Zhang
This patch introduces a solution to handle different hotplug
cases in a multi-process situation. It includes the scenarios below:

1. Attach a share device from primary
2. Detach a share device from primary
3. Attach a share device from secondary
4. Detach a share device from secondary
5. Attach a private device from secondary
6. Detach a private device from secondary
7. Detach a share device from secondary privately
8. Attach a share device from secondary privately

In the primary-secondary process model, we assume a device is shared by default.
That means an attach or detach of a device on any process is broadcast to
all other processes through the mp channel, so that device information is
synchronized on all processes.

Any failure during the attach process will cause inconsistent status
between processes, so a proper rollback action should be considered.
Also, it is not safe to detach a shared device while another process is still
using it, so a handshake mechanism is introduced.

This patch covers the implementation of cases 1, 2, 5, 6, 7, 8.
Cases 3, 4 will be implemented in a separate patch, along with the handshake
mechanism.

Scenario for Case 1, 2:

attach device
a) primary attach the new device if failed goto h).
b) primary send attach sync request to all secondary.
c) secondary receive request and attach device and send reply.
d) primary check the reply if all success go to i).
e) primary send attach rollback sync request to all secondary.
f) secondary receive the request and detach device and send reply.
g) primary receive the reply and detach device as rollback action.
h) attach fail
i) attach success

detach device
a) primary perform pre-detach check, if device is locked, goto i).
b) primary send pre-detach sync request to all secondary.
c) secondary perform pre-detach check and send reply.
d) primary check the reply if any fail goto i).
e) primary send detach sync request to all secondary
f) secondary detach the device and send reply (assume no fail)
g) primary detach the device.
h) detach success
i) detach failed

Case 5, 6:
A secondary process can attach a private device which is only visible to
itself. In this case no IPC is involved. The primary process is not allowed
to have private devices so far.

Case 7, 8:
A secondary process can also temporarily detach a shared device "privately"
and then attach it back later. This action also does not impact other
processes.

API changes:

rte_eth_dev_attach and rte_eth_dev_detach are extended to support
shared device attach/detach in the primary-secondary process model; they will
be called in cases 1, 2, 3, 4.

New APIs rte_eth_dev_attach_private and rte_eth_dev_detach_private are
introduced to cover cases 5, 6, 7, 8; these APIs can only be invoked in a
secondary process.

Signed-off-by: Qi Zhang 
---
 lib/librte_ethdev/Makefile  |   1 +
 lib/librte_ethdev/ethdev_mp.c   | 248 
 lib/librte_ethdev/ethdev_mp.h   |  41 ++
 lib/librte_ethdev/ethdev_private.h  |  39 ++
 lib/librte_ethdev/meson.build   |   1 +
 lib/librte_ethdev/rte_ethdev.c  | 190 ---
 lib/librte_ethdev/rte_ethdev.h  |  45 +++
 lib/librte_ethdev/rte_ethdev_core.h |   5 +
 8 files changed, 553 insertions(+), 17 deletions(-)
 create mode 100644 lib/librte_ethdev/ethdev_mp.c
 create mode 100644 lib/librte_ethdev/ethdev_mp.h
 create mode 100644 lib/librte_ethdev/ethdev_private.h

diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
index c2f2f7d82..d0a059b83 100644
--- a/lib/librte_ethdev/Makefile
+++ b/lib/librte_ethdev/Makefile
@@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map
 LIBABIVER := 9
 
 SRCS-y += rte_ethdev.c
+SRCS-y += ethdev_mp.c
 SRCS-y += rte_flow.c
 SRCS-y += rte_tm.c
 SRCS-y += rte_mtr.c
diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c
new file mode 100644
index 0..87fc430bf
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_mp.c
@@ -0,0 +1,248 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2018 Intel Corporation
+ */
+
+#include 
+#include "rte_ethdev_driver.h"
+#include "ethdev_mp.h"
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+struct mp_reply_bundle {
+   struct rte_mp_msg msg;
+   const void *peer;
+};
+
+static int detach_on_secondary(uint16_t port_id)
+{
+   struct rte_device *dev;
+   struct rte_bus *bus;
+   int ret = 0;
+
+   if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) {
+   ethdev_log(ERR, "detach on secondary: invalid port %d\n",
+  port_id);
+   return -ENODEV;
+   }
+
+   dev = rte_eth_devices[port_id].device;
+   if (dev == NULL)
+   return -EINVAL;
+
+   bus = rte_bus_find_by_device(dev);
+   if (bus == NULL)
+   return -ENOENT;
+
+   ret = rte_eal_hotplug_remove(bus->name, dev->name);
+   if (ret) {
+   ethdev_log(ERR, "failed to hot unplug bus: %s, device:%s\n",
+  bus->name, dev->name);
+ 

[dpdk-dev] [PATCH v4 07/24] ethdev: introduce device lock

2018-06-26 Thread Qi Zhang
Introduce API rte_eth_dev_lock and rte_eth_dev_unlock to let an
application lock or unlock a specific ethdev. A locked device
can't be detached; this helps the application prevent unexpected
device detaching, especially in a multi-process environment.

Also introduce the new API rte_eth_dev_lock_with_callback and
rte_eth_dev_unlock_with_callback to let an application register
a callback function which will be invoked before a device is going
to be detached. The return value of the function decides whether the
device will continue to be detached or not; this lets the application
do a condition check at runtime.
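
A minimal sketch of the "callback decides whether detach may continue" idea
follows. The names, the fixed-size table, and the return convention (0 allows
the detach, nonzero vetoes it) are assumptions made for illustration; the
patch itself keeps a ref-counted TAILQ and defines its own callback type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: return 0 to allow detach, nonzero to veto. */
typedef int (*lock_cb_t)(uint16_t port_id, void *user_args);

#define MAX_LOCKS 8

struct lock_slot {
    bool used;
    uint16_t port_id;
    lock_cb_t cb;
    void *user_args;
};

static struct lock_slot locks[MAX_LOCKS];

/* Register a veto callback for a port; -1 if the table is full. */
static int lock_port(uint16_t port_id, lock_cb_t cb, void *user_args)
{
    size_t i;

    for (i = 0; i < MAX_LOCKS; i++) {
        if (!locks[i].used) {
            locks[i].used = true;
            locks[i].port_id = port_id;
            locks[i].cb = cb;
            locks[i].user_args = user_args;
            return 0;
        }
    }
    return -1;
}

/* Pre-detach check: every callback registered for the port must agree. */
static bool can_detach(uint16_t port_id)
{
    size_t i;

    for (i = 0; i < MAX_LOCKS; i++) {
        if (locks[i].used && locks[i].port_id == port_id &&
            locks[i].cb(port_id, locks[i].user_args) != 0)
            return false;
    }
    return true;
}

/* Example callback: veto while the int behind user_args is nonzero. */
static int veto_if_busy(uint16_t port_id, void *user_args)
{
    (void)port_id;
    return *(int *)user_args ? -1 : 0;
}
```

Because the callback is evaluated at detach time, the application can keep one
registration in place and simply flip its own "busy" flag as conditions
change.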

Signed-off-by: Qi Zhang 
---
 lib/librte_ethdev/Makefile  |   1 +
 lib/librte_ethdev/ethdev_lock.c | 140 
 lib/librte_ethdev/ethdev_lock.h |  31 +
 lib/librte_ethdev/ethdev_mp.c   |   3 +-
 lib/librte_ethdev/meson.build   |   1 +
 lib/librte_ethdev/rte_ethdev.c  |  60 -
 lib/librte_ethdev/rte_ethdev.h  | 124 +++
 7 files changed, 358 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_ethdev/ethdev_lock.c
 create mode 100644 lib/librte_ethdev/ethdev_lock.h

diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
index d0a059b83..62bef03fc 100644
--- a/lib/librte_ethdev/Makefile
+++ b/lib/librte_ethdev/Makefile
@@ -20,6 +20,7 @@ LIBABIVER := 9
 
 SRCS-y += rte_ethdev.c
 SRCS-y += ethdev_mp.c
+SRCS-y += ethdev_lock.c
 SRCS-y += rte_flow.c
 SRCS-y += rte_tm.c
 SRCS-y += rte_mtr.c
diff --git a/lib/librte_ethdev/ethdev_lock.c b/lib/librte_ethdev/ethdev_lock.c
new file mode 100644
index 0..6379519e3
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_lock.c
@@ -0,0 +1,140 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+#include "ethdev_lock.h"
+
+struct lock_entry {
+   TAILQ_ENTRY(lock_entry) next;
+   rte_eth_dev_lock_callback_t callback;
+   uint16_t port_id;
+   void *user_args;
+   int ref_count;
+};
+
+TAILQ_HEAD(lock_entry_list, lock_entry);
+static struct lock_entry_list lock_entry_list =
+   TAILQ_HEAD_INITIALIZER(lock_entry_list);
+static rte_spinlock_t lock_entry_lock = RTE_SPINLOCK_INITIALIZER;
+
+int
+register_lock_callback(uint16_t port_id,
+   rte_eth_dev_lock_callback_t callback,
+   void *user_args)
+{
+   struct lock_entry *le;
+
+   rte_spinlock_lock(&lock_entry_lock);
+
+   TAILQ_FOREACH(le, &lock_entry_list, next) {
+   if (le->port_id == port_id &&
+   le->callback == callback &&
+   le->user_args == user_args)
+   break;
+   }
+
+   if (le == NULL) {
+   le = calloc(1, sizeof(struct lock_entry));
+   if (le == NULL) {
+   rte_spinlock_unlock(&lock_entry_lock);
+   return -ENOMEM;
+   }
+   le->callback = callback;
+   le->port_id = port_id;
+   le->user_args = user_args;
+   TAILQ_INSERT_TAIL(&lock_entry_list, le, next);
+   }
+   le->ref_count++;
+
+   rte_spinlock_unlock(&lock_entry_lock);
+   return 0;
+}
+
+int
+unregister_lock_callback(uint16_t port_id,
+   rte_eth_dev_lock_callback_t callback,
+   void *user_args)
+{
+   struct lock_entry *le;
+   int ret = 0;
+
+   rte_spinlock_lock(&lock_entry_lock);
+
+   TAILQ_FOREACH(le, &lock_entry_list, next) {
+   if (le->port_id == port_id &&
+   le->callback == callback &&
+   le->user_args == user_args)
+   break;
+   }
+
+   if (le != NULL) {
+   le->ref_count--;
+   if (le->ref_count == 0) {
+   TAILQ_REMOVE(&lock_entry_list, le, next);
+   free(le);
+   }
+   } else {
+   ret = -ENOENT;
+   }
+
+   rte_spinlock_unlock(&lock_entry_lock);
+   return ret;
+}
+
+static int clean_lock_callback_one(uint16_t port_id)
+{
+   struct lock_entry *le;
+   int ret = 0;
+
+   TAILQ_FOREACH(le, &lock_entry_list, next) {
+   if (le->port_id == port_id)
+   break;
+   }
+
+   if (le != NULL) {
+   le->ref_count--;
+   if (le->ref_count == 0) {
+   TAILQ_REMOVE(&lock_entry_list, le, next);
+   free(le);
+   }
+   } else {
+   ret = -ENOENT;
+   }
+
+   return ret;
+
+}
+
+void clean_lock_callback(uint16_t port_id)
+{
+   int ret;
+
+   rte_spinlock_lock(&lock_entry_lock);
+
+   for (;;) {
+   ret = clean_lock_callback_one(port_id);
+   if (ret == -ENOENT)
+   break;
+   }
+
+   rte_spinlock_unlock(&lock_entry_lock);
+}
+
+int process_lock_callbacks(uint16_t port_id)
+{
+  

[dpdk-dev] [PATCH v4 08/24] ethdev: support attach or detach share device from secondary

2018-06-26 Thread Qi Zhang
This patch covers the multi-process hotplug case where an attach/detach
request for a shared device is issued from a secondary process.

device attach on secondary:
a) secondary sends a sync request to primary.
b) primary receives the request and attaches the new device; if that
   fails, goto i).
c) primary forwards the attach sync request to all secondaries.
d) each secondary receives the request, attaches the device and sends
   a reply.
e) primary checks the replies; if all succeeded, goto j).
f) primary sends an attach rollback sync request to all secondaries.
g) each secondary receives the request, detaches the device and sends
   a reply.
h) primary receives the replies and detaches the device as a rollback
   action.
i) primary sends a fail reply to the requesting secondary, goto k).
j) primary sends a success reply to the requesting secondary.
k) the requesting secondary receives the reply to step a) and returns.
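
The attach steps above can be sketched, on the primary side, as a small
standalone simulation. This is only an illustration: struct
sketch_attach_sim and sketch_attach are hypothetical names, and the IPC
round trips are replaced by plain array accesses.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NB_SECONDARY 3

/* Simulated state: the *_ok flags say whether an attach would succeed. */
struct sketch_attach_sim {
	bool primary_attach_ok;
	bool secondary_attach_ok[SKETCH_NB_SECONDARY];
	bool primary_attached;
	bool secondary_attached[SKETCH_NB_SECONDARY];
};

/* Returns 0 for the success reply (step j), -1 for the fail reply (step i). */
int sketch_attach(struct sketch_attach_sim *s)
{
	bool all_ok = true;
	size_t i;

	/* b) primary attaches first; on failure go straight to i) */
	if (!s->primary_attach_ok)
		return -1;
	s->primary_attached = true;

	/* c), d) forward the request; every secondary tries to attach */
	for (i = 0; i < SKETCH_NB_SECONDARY; i++) {
		s->secondary_attached[i] = s->secondary_attach_ok[i];
		if (!s->secondary_attached[i])
			all_ok = false;
	}

	/* e) all replies are good: go to j) */
	if (all_ok)
		return 0;

	/* f), g) rollback: every secondary detaches again */
	for (i = 0; i < SKETCH_NB_SECONDARY; i++)
		s->secondary_attached[i] = false;

	/* h) primary detaches as the last rollback action, then i) */
	s->primary_attached = false;
	return -1;
}
```

Note the all-or-nothing outcome: after a failure on any secondary, no
process is left with the device attached.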

device detach on secondary:
a) secondary sends a sync request to primary.
b) primary receives the request and performs the pre-detach check; if
   the device is locked, goto j).
c) primary sends a pre-detach sync request to all secondaries.
d) each secondary performs the pre-detach check and sends a reply.
e) primary checks the replies; if any failed, goto j).
f) primary sends a detach sync request to all secondaries.
g) each secondary detaches the device and sends a reply.
h) primary detaches the device.
i) primary sends a success reply to the requesting secondary, goto k).
j) primary sends a fail reply to the requesting secondary.
k) the requesting secondary receives the reply to step a) and returns.
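
The detach steps above amount to a two-phase sequence: check everywhere
first, detach only if nobody objects. A minimal standalone simulation,
with hypothetical names (struct sketch_detach_sim, sketch_detach) and
with the IPC replaced by array accesses, could look like this:

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NB_SEC 3

/* Simulated state: a "locked" flag makes the pre-detach check fail. */
struct sketch_detach_sim {
	bool primary_locked;
	bool secondary_locked[SKETCH_NB_SEC];
	bool primary_attached;
	bool secondary_attached[SKETCH_NB_SEC];
};

/* Returns 0 for the success reply (step i), -1 for the fail reply (step j). */
int sketch_detach(struct sketch_detach_sim *s)
{
	size_t i;

	/* b) pre-detach check on the primary */
	if (s->primary_locked)
		return -1;

	/* c), d), e) pre-detach check on every secondary */
	for (i = 0; i < SKETCH_NB_SEC; i++) {
		if (s->secondary_locked[i])
			return -1;
	}

	/* f), g) nobody objected: every secondary detaches */
	for (i = 0; i < SKETCH_NB_SEC; i++)
		s->secondary_attached[i] = false;

	/* h) the primary detaches last */
	s->primary_attached = false;
	return 0;
}
```

Because the check phase completes before any process detaches, a locked
device leaves every process untouched and no rollback is needed.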

Signed-off-by: Qi Zhang 
---
 lib/librte_ethdev/ethdev_mp.c | 155 --
 1 file changed, 149 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c
index b94bd9501..7afbd4cf2 100644
--- a/lib/librte_ethdev/ethdev_mp.c
+++ b/lib/librte_ethdev/ethdev_mp.c
@@ -4,8 +4,44 @@
 
 #include 
 #include "rte_ethdev_driver.h"
+
 #include "ethdev_mp.h"
 #include "ethdev_lock.h"
+#include "ethdev_private.h"
+
+/**
+ *
+ * secondary to primary request.
+ * start from function eth_dev_request_to_primary.
+ *
+ * device attach on secondary:
+ * a) secondary sends a sync request to primary.
+ * b) primary receives the request and attaches the new device;
+ *    if that fails, goto i).
+ * c) primary forwards the attach sync request to all secondaries.
+ * d) each secondary receives the request, attaches and replies.
+ * e) primary checks the replies; if all succeeded, goto j).
+ * f) primary sends an attach rollback sync request to all secondaries.
+ * g) each secondary receives the request, detaches and replies.
+ * h) primary receives the replies and detaches as a rollback action.
+ * i) primary sends a fail sync reply to the secondary, goto k).
+ * j) primary sends a success sync reply to the secondary.
+ * k) the secondary receives the reply to step a) and returns.
+ *
+ * device detach on secondary:
+ * a) secondary sends a detach sync request to primary.
+ * b) primary receives the request and performs the pre-detach check;
+ *    if the device is locked, goto j).
+ * c) primary sends a pre-detach sync request to all secondaries.
+ * d) each secondary performs the pre-detach check and replies.
+ * e) primary checks the replies; if any failed, goto j).
+ * f) primary sends a detach sync request to all secondaries.
+ * g) each secondary detaches the device and replies.
+ * h) primary detaches the device.
+ * i) primary sends a success sync reply to the secondary, goto k).
+ * j) primary sends a fail sync reply to the secondary.
+ * k) the secondary receives the reply to step a) and returns.
+ */
 
 #define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
 
@@ -83,11 +119,98 @@ static int attach_on_secondary(const char *devargs, uint16_t port_id)
 }
 
 static int
-handle_secondary_request(const struct rte_mp_msg *msg, const void *peer)
+send_response_to_secondary(const struct eth_dev_mp_req *req,
+   int result,
+   const void *peer)
+{
+   struct rte_mp_msg mp_resp;
+   struct eth_dev_mp_req *resp =
+   (struct eth_dev_mp_req *)mp_resp.param;
+   int ret;
+
+   memset(&mp_resp, 0, sizeof(mp_resp));
+   mp_resp.len_param = sizeof(*resp);
+   strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST);
+   memcpy(resp, req, sizeof(*req));
+   resp->result = result;
+
+   ret = rte_mp_reply(&mp_resp, peer);
+   if (ret)
+   ethdev_log(ERR, "failed to send response to secondary\n");
+
+   return ret;
+}
+
+int eth_dev_request_to_secondary(struct eth_dev_mp_req *req);
+
+static void
+__handle_secondary_request(void *param)
 {
-   RTE_SET_USED(msg);
-   RTE_SET_USED(peer);
-   return -ENOTSUP;
+   struct mp_reply_bundle *bundle = param;
+   const struct rte_mp_msg *msg = &bundle->msg;
+   const struct eth_dev_mp_req *req =
+   (const struct eth_dev_mp_req *)msg->param;
+   struct eth_dev_mp_req tmp_req;
+   uint16_t port_id;
+   int ret = 0;
+
+   tmp_req = *req;
+
+   if (req->t == REQ_TYPE_ATTACH) {
+   ret = do_eth_dev_attach(req->devargs, &port_id);
+   if (!ret) {
+   tmp_req.port_id = port_id;
+   ret 

[dpdk-dev] [PATCH v4 12/24] net/igb: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.
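
The idea can be pictured with a small standalone sketch. It assumes,
based only on the function name and its use in the diffs of this
series, that rte_eth_dev_release_port_private drops process-local state
and leaves the shared state to the primary; its real behaviour is not
shown here. All sketch_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_proc_type { SKETCH_PROC_PRIMARY, SKETCH_PROC_SECONDARY };

struct sketch_port {
	int local_state;  /* per-process view of the port */
	int shared_state; /* state shared by all processes */
};

/* Mirrors the shape of the remove callbacks touched by this series. */
int sketch_pci_remove(struct sketch_port *port, enum sketch_proc_type type)
{
	if (port == NULL)
		return -ENODEV;

	if (type != SKETCH_PROC_PRIMARY) {
		/* secondary: release only the local view (assumed behaviour) */
		port->local_state = 0;
		return 0;
	}

	/* primary: full teardown, shared state included */
	port->local_state = 0;
	port->shared_state = 0;
	return 0;
}
```

With this split, a secondary that detaches no longer tears down data
the primary still relies on, which is what made a later re-attach fail.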

Signed-off-by: Qi Zhang 
---
 drivers/net/e1000/igb_ethdev.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index edc7be319..db07a83e3 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1089,6 +1089,15 @@ static int eth_igb_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 static int eth_igb_pci_remove(struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev =
+   rte_eth_dev_allocated(pci_dev->device.name);
+
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_remove(pci_dev, eth_igb_dev_uninit);
 }
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 10/24] net/ixgbe: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 87d2ad090..f9d560835 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1792,6 +1792,9 @@ static int eth_ixgbe_pci_remove(struct rte_pci_device *pci_dev)
if (!ethdev)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
if (ethdev->data->dev_flags & RTE_ETH_DEV_REPRESENTOR)
return rte_eth_dev_destroy(ethdev, ixgbe_vf_representor_uninit);
else
@@ -1809,6 +1812,15 @@ static struct rte_pci_driver rte_ixgbe_pmd = {
 static int eth_ixgbevf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev;
+
+   ethdev = rte_eth_dev_allocated(pci_dev->device.name);
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_probe(pci_dev,
sizeof(struct ixgbe_adapter), eth_ixgbevf_dev_init);
 }
-- 
2.13.6



[dpdk-dev] [PATCH v4 11/24] net/e1000: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/e1000/em_ethdev.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 7039dc100..e6b7ce63a 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -349,6 +349,15 @@ static int eth_em_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 static int eth_em_pci_remove(struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev =
+   rte_eth_dev_allocated(pci_dev->device.name);
+
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_remove(pci_dev, eth_em_dev_uninit);
 }
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 09/24] net/i40e: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/i40e/i40e_ethdev.c| 2 ++
 drivers/net/i40e/i40e_ethdev_vf.c | 9 +
 2 files changed, 11 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 13c5d3296..7d1f98422 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -678,6 +678,8 @@ static int eth_i40e_pci_remove(struct rte_pci_device *pci_dev)
if (!ethdev)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
 
if (ethdev->data->dev_flags & RTE_ETH_DEV_REPRESENTOR)
return rte_eth_dev_destroy(ethdev, i40e_vf_representor_uninit);
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 804e44530..fc6f079d5 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1500,6 +1500,15 @@ static int eth_i40evf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 static int eth_i40evf_pci_remove(struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev;
+   ethdev = rte_eth_dev_allocated(pci_dev->device.name);
+
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_remove(pci_dev, i40evf_dev_uninit);
 }
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 13/24] net/fm10k: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/fm10k/fm10k_ethdev.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 3ff1b0e0f..f73301182 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -3264,6 +3264,15 @@ static int eth_fm10k_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 static int eth_fm10k_pci_remove(struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev =
+   rte_eth_dev_allocated(pci_dev->device.name);
+
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_remove(pci_dev, eth_fm10k_dev_uninit);
 }
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 17/24] net/kni: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/kni/rte_eth_kni.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index ab63ea427..e5679c76a 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -419,6 +419,7 @@ eth_kni_probe(struct rte_vdev_device *vdev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &eth_kni_ops;
+   eth_dev->device = &vdev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -463,6 +464,16 @@ eth_kni_remove(struct rte_vdev_device *vdev)
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(vdev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
eth_kni_dev_stop(eth_dev);
 
internals = eth_dev->data->dev_private;
-- 
2.13.6



[dpdk-dev] [PATCH v4 16/24] net/failsafe: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/failsafe/failsafe.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index eafbb75df..c5e8651f6 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -328,6 +328,7 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &failsafe_ops;
+   eth_dev->device = &vdev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -338,10 +339,25 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
 static int
 rte_pmd_failsafe_remove(struct rte_vdev_device *vdev)
 {
+   struct rte_eth_dev *eth_dev;
const char *name;
 
name = rte_vdev_device_name(vdev);
INFO("Uninitializing " FAILSAFE_DRIVER_NAME " for %s", name);
+
+   eth_dev = rte_eth_dev_allocated(name);
+   if (!eth_dev)
+   return -ENODEV;
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(vdev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario.
+*/
+   }
+
return fs_rte_eth_free(name);
 }
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 15/24] net/bonding: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index f155ff779..da45ba9ba 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3062,6 +3062,7 @@ bond_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &default_dev_ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -3168,6 +3169,16 @@ bond_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
RTE_ASSERT(eth_dev->device == &dev->device);
 
internals = eth_dev->data->dev_private;
-- 
2.13.6



[dpdk-dev] [PATCH v4 14/24] net/af_packet: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index ea47abbf8..33ac19de8 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -935,6 +935,7 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -986,6 +987,16 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
internals = eth_dev->data->dev_private;
for (q = 0; q < internals->nb_queues; q++) {
rte_free(internals->rx_queue[q].rd);
-- 
2.13.6



[dpdk-dev] [PATCH v4 18/24] net/null: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/null/rte_eth_null.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 1d2e6b9e9..2f040729b 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -623,6 +623,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -667,18 +668,31 @@ static int
 rte_pmd_null_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
+   const char *name;
 
if (!dev)
return -EINVAL;
 
+   name = rte_vdev_device_name(dev);
+
PMD_LOG(INFO, "Closing null ethdev on numa socket %u",
rte_socket_id());
 
/* find the ethdev entry */
-   eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+   eth_dev = rte_eth_dev_allocated(name);
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
rte_free(eth_dev->data->dev_private);
 
rte_eth_dev_release_port(eth_dev);
-- 
2.13.6



[dpdk-dev] [PATCH v4 19/24] net/octeontx: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/octeontx/octeontx_ethdev.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/octeontx/octeontx_ethdev.c b/drivers/net/octeontx/octeontx_ethdev.c
index 1eb453b21..497bacdc6 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1016,6 +1016,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, uint8_t evdev,
 
eth_dev->tx_pkt_burst = octeontx_xmit_pkts;
eth_dev->rx_pkt_burst = octeontx_recv_pkts;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1138,6 +1139,18 @@ octeontx_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0) {
+   rte_eth_dev_release_port_private(eth_dev);
+   continue;
+   }
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
nic = octeontx_pmd_priv(eth_dev);
rte_event_dev_stop(nic->evdev);
PMD_INIT_LOG(INFO, "Closing octeontx device %s", octtx_name);
@@ -1148,6 +1161,9 @@ octeontx_remove(struct rte_vdev_device *dev)
rte_event_dev_close(nic->evdev);
}
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
/* Free FC resource */
octeontx_pko_fc_free();
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 20/24] net/pcap: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/pcap/rte_eth_pcap.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 6bd4a7d79..6cc20c2b2 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -925,6 +925,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1016,6 +1017,7 @@ static int
 pmd_pcap_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
+   const char *name;
 
PMD_LOG(INFO, "Closing pcap ethdev on numa socket %d",
rte_socket_id());
@@ -1023,11 +1025,22 @@ pmd_pcap_remove(struct rte_vdev_device *dev)
if (!dev)
return -1;
 
+   name = rte_vdev_device_name(dev);
/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+   eth_dev = rte_eth_dev_allocated(name);
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
rte_free(eth_dev->data->dev_private);
 
rte_eth_dev_release_port(eth_dev);
-- 
2.13.6



[dpdk-dev] [PATCH v4 21/24] net/softnic: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/softnic/rte_eth_softnic.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 6b3c13e5c..a45a7b0dd 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -750,6 +750,7 @@ pmd_probe(struct rte_vdev_device *vdev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &pmd_ops;
+   eth_dev->device = &vdev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -803,17 +804,29 @@ pmd_remove(struct rte_vdev_device *vdev)
 {
struct rte_eth_dev *dev = NULL;
struct pmd_internals *p;
+   const char *name;
 
if (!vdev)
return -EINVAL;
 
-   PMD_LOG(INFO, "Removing device \"%s\"",
-   rte_vdev_device_name(vdev));
+   name = rte_vdev_device_name(vdev);
+   PMD_LOG(INFO, "Removing device \"%s\"", name);
 
/* Find the ethdev entry */
-   dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+   dev = rte_eth_dev_allocated(name);
if (dev == NULL)
return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(vdev)) == 0)
+   return rte_eth_dev_release_port_private(dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
p = dev->data->dev_private;
 
/* Free device data structures*/
-- 
2.13.6



[dpdk-dev] [PATCH v4 22/24] net/tap: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/tap/rte_eth_tap.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index df396bfde..bb5f20b01 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1759,6 +1759,7 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1827,12 +1828,24 @@ rte_pmd_tap_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
struct pmd_internals *internals;
+   const char *name;
int i;
 
+   name = rte_vdev_device_name(dev);
/* find the ethdev entry */
-   eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+   eth_dev = rte_eth_dev_allocated(name);
if (!eth_dev)
-   return 0;
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
 
internals = eth_dev->data->dev_private;
 
-- 
2.13.6



[dpdk-dev] [PATCH v4 23/24] net/vhost: enable port detach on secondary process

2018-06-26 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/vhost/rte_eth_vhost.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index ba9d768a0..f773711b4 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1353,6 +1353,7 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1435,6 +1436,16 @@ rte_pmd_vhost_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
eth_dev_close(eth_dev);
 
rte_free(vring_states[eth_dev->data->port_id]);
-- 
2.13.6



[dpdk-dev] [PATCH v4 24/24] examples/multi_process: add hotplug sample

2018-06-26 Thread Qi Zhang
The sample code demonstrates device (ethdev only) management
in a multi-process environment. The user can attach/detach a device
on the primary process and see it synced to the secondary process
automatically. The user can also lock a device to prevent it from
being detached, or unlock it to go back to the default behaviour.

How to start?
./hotplug_mp --proc-type=auto

Command Line Example:

>help
>list

/* attach a af_packet vdev */
>attach net_af_packet,iface=eth0

/* detach port 0 */
>detach 0

/* attach a private af_packet vdev (secondary process only)*/
>attachp net_af_packet,iface=eth0

/* detach a private device (secondary process only) */
>detachp 0

/* lock port 0 */
>lock 0

/* unlock port 0 */
>unlock 0

Signed-off-by: Qi Zhang 
---
 examples/multi_process/Makefile  |   1 +
 examples/multi_process/hotplug_mp/Makefile   |  23 ++
 examples/multi_process/hotplug_mp/commands.c | 356 +++
 examples/multi_process/hotplug_mp/commands.h |  10 +
 examples/multi_process/hotplug_mp/main.c |  41 +++
 5 files changed, 431 insertions(+)
 create mode 100644 examples/multi_process/hotplug_mp/Makefile
 create mode 100644 examples/multi_process/hotplug_mp/commands.c
 create mode 100644 examples/multi_process/hotplug_mp/commands.h
 create mode 100644 examples/multi_process/hotplug_mp/main.c

diff --git a/examples/multi_process/Makefile b/examples/multi_process/Makefile
index a6708b7e4..b76b02fcb 100644
--- a/examples/multi_process/Makefile
+++ b/examples/multi_process/Makefile
@@ -13,5 +13,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += client_server_mp
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += simple_mp
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += symmetric_mp
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += hotplug_mp
 
 include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/multi_process/hotplug_mp/Makefile b/examples/multi_process/hotplug_mp/Makefile
new file mode 100644
index 0..c09a57bfa
--- /dev/null
+++ b/examples/multi_process/hotplug_mp/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = hotplug_mp
+
+# all source are stored in SRCS-y
+SRCS-y := main.c commands.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
new file mode 100644
index 0..31f9e2e15
--- /dev/null
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -0,0 +1,356 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/**/
+
+struct cmd_help_result {
+   cmdline_fixed_string_t help;
+};
+
+static void cmd_help_parsed(__attribute__((unused)) void *parsed_result,
+   struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   cmdline_printf(cl,
+  "commands:\n"
+  "- attach \n"
+  "- detach \n"
+  "- attachp \n"
+  "- detachp \n"
+  "- lock \n"
+  "- unlock \n"
+  "- list\n\n");
+}
+
+cmdline_parse_token_string_t cmd_help_help =
+   TOKEN_STRING_INITIALIZER(struct cmd_help_result, help, "help");
+
+cmdline_parse_inst_t cmd_help = {
+   .f = cmd_help_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "show help",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_help_help,
+   NULL,
+   },
+};
+
+/**/
+
+struct cmd_quit_result {
+   cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+   struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+   TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+   .f = cmd_quit_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "quit",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_quit_quit,
+   NULL,
+   },
+};
+
+/**/
+
+struct cmd_list_result {
+   cmdline_fixed_string_t list;
+};
+
+static void cmd_

[dpdk-dev] DPDK techboard minutes of June 20

2018-06-26 Thread Ananyev, Konstantin


Meeting notes for the DPDK technical board meeting held on 2018-06-20

Attendees:
- Bruce Richardson
- Ferruh Yigit
- Hemant Agrawal
- Jerin Jacob
- Konstantin Ananyev
- Olivier Matz
- Stephen Hemminger
- Thomas Monjalon
  

1) next-net-intel maintainer:
Qi Zhang will be the new maintainer instead of Helin Zhang.
Beilei Xing will be the backup maintainer of the subtree.

2) maintainer of 17.11 LTS
Yuanhan Liu is stepping away from LTS maintainer role.
Yongseok Koh will be new maintainer for 17.11 LTS

3) A new design for dpdk.org website.
www.dpdk.org is now managed by LF.
The new site core.dpdk.org is the technical part for DPDK itself (not the side projects).
core.dpdk.org will now use Hugo instead of plain HTML as before.
The git repository for core.dpdk.org will remain at http://git.dpdk.org/tools/dpdk-web/
and the old HTML-based version will still be available as a branch.

4) LF proposal to move to GitHub with pull requests for managing content of core.dpdk.org.
The decision was made to stay with the existing mail-based patch process.

5) Hyper-V series status update
v11 was sent last week, Thomas plans to apply them soon.

6) Discussion about minimal supported kernel version
No final decision made.
Agreed to continue the discussion on the ML and return to the subject at the next TB meeting.


Next meeting will be on July 4th and Olivier will chair it


Re: [dpdk-dev] [PATCH 0/4] support for write combining

2018-06-26 Thread Rafał Kozik
Hello Thomas,

I would like to kindly remind you about my question regarding the write
combining support patch set:
https://mails.dpdk.org/archives/dev/2018-April/096749.html

It got an ack from Bruce Richardson.
What is the next step to get it committed to the DPDK source?

Best regards,
Rafal Kozik


2018-06-11 11:32 GMT+02:00 Rafał Kozik :
> Hello Thomas,
>
> I have a question about support for write combining patch set.
> It got an ack from Bruce Richardson more than a month ago.
> Also no one has any further comments about it.
> What is the next step to commit them to DPDK source?
>
> Best regards,
> Rafal Kozik


Re: [dpdk-dev] [PATCH 1/1] ena: fix SIGFPE with 0 rx queues

2018-06-26 Thread Michał Krawczyk
Hi Daria,

please see my comments below and either answer them or apply the fixes and
send a 2nd version of the patch. You can do that by adding the -v2 flag to
the git format-patch command.

Please also send the new version in response to this email. You can
do that by adding --in-reply-to 'msgid' to git send-email. The message
ID can be read from the raw version of the email.

Thanks,
Michal

2018-06-25 15:40 GMT+02:00 Daria Kolistratova :
> When  he number of rx queues is 0
Please fix the typo (' he' -> 'the').
Please also add information that it happens when the application is
also requesting ETH_MQ_RX_RSS_FLAG in the
rte_dev->data->dev_conf.rxmode.mq_mode.

> (what can be when application does not receive)
> failed with SIGFPE.
> Fixed adding zero check before division.
>
> Signed-off-by: Daria Kolistratova 
> ---
>  drivers/net/ena/ena_ethdev.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>  mode change 100644 => 100755 drivers/net/ena/ena_ethdev.c
>
> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> old mode 100644
> new mode 100755
> index 9ae73e331..89004c903
> --- a/drivers/net/ena/ena_ethdev.c
> +++ b/drivers/net/ena/ena_ethdev.c
> @@ -684,7 +684,11 @@ static int ena_rss_init_default(struct ena_adapter *adapter)
> }
>
> for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; i++) {
> -   val = i % nb_rx_queues;
> +   if (nb_rx_queues != 0)
> +   val = i % nb_rx_queues;
> +   else
> +   val = 0;
> +
This change is not needed if you are adding the below change. This
function should not be called if the nb_rx_queues == 0, so there is no
need to perform additional check.

> rc = ena_com_indirect_table_fill_entry(ena_dev, i,
>ENA_IO_RXQ_IDX(val));
> if (unlikely(rc && (rc != ENA_COM_UNSUPPORTED))) {
> @@ -1052,7 +1056,7 @@ static int ena_start(struct rte_eth_dev *dev)
> return rc;
>
> if (adapter->rte_dev->data->dev_conf.rxmode.mq_mode &
> -   ETH_MQ_RX_RSS_FLAG) {
> +   ETH_MQ_RX_RSS_FLAG && adapter->rte_dev->data->nb_rx_queues > 0) {
> rc = ena_rss_init_default(adapter);
> if (rc)
> return rc;
> --
> 2.14.4
>
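
To illustrate the review comment above, here is a minimal standalone sketch of
guarding the call site instead of the modulo itself. It is not the ena driver
code: RSS_TABLE_SIZE, MQ_RX_RSS_FLAG, rss_fill_table() and start_rss_if_needed()
are hypothetical stand-ins for ENA_RX_RSS_TABLE_SIZE, ETH_MQ_RX_RSS_FLAG,
ena_rss_init_default() and the check in ena_start().

```c
#include <assert.h>
#include <stdint.h>

#define RSS_TABLE_SIZE 128   /* hypothetical indirection table size */
#define MQ_RX_RSS_FLAG 0x1u  /* stand-in for ETH_MQ_RX_RSS_FLAG */

/* Spread table entries round-robin over the Rx queues.
 * Precondition: nb_rx_queues != 0, otherwise i % nb_rx_queues divides by zero. */
static void rss_fill_table(uint16_t *table, uint16_t nb_rx_queues)
{
	assert(nb_rx_queues != 0);
	for (uint16_t i = 0; i < RSS_TABLE_SIZE; i++)
		table[i] = (uint16_t)(i % nb_rx_queues);
}

/* Caller-side guard: initialise RSS only when it is requested and at least
 * one Rx queue exists. Returns 1 if the table was filled, 0 if skipped. */
static int start_rss_if_needed(uint32_t mq_mode, uint16_t nb_rx_queues,
			       uint16_t *table)
{
	if ((mq_mode & MQ_RX_RSS_FLAG) && nb_rx_queues > 0) {
		rss_fill_table(table, nb_rx_queues);
		return 1;
	}
	return 0;
}
```

With the guard at the call site, the fill loop never sees a zero divisor, so
the extra branch inside the loop becomes unnecessary.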


Re: [dpdk-dev] [PATCH v2 1/4] mbuf: add accessor function for private data area

2018-06-26 Thread Olivier Matz
Hi Dan,

On Mon, Jun 18, 2018 at 04:35:34PM -0700, Dan Gora wrote:
> Add an inline accessor function to return the starting address of
> the private data area in the supplied mbuf.
> 
> This allows applications to easily access the private data area between
> the struct rte_mbuf and the data buffer in the specified mbuf without
> creating private macros or accessor functions.
> 
> No checks are made to ensure that a private data area actually exists
> in the buffer.
> 
> Signed-off-by: Dan Gora 

Thank you for this patch.

A few (late) comments on your previous questions:

- about rte_mbuf vs rte_pktmbuf, as Andrew said pktmbuf was used in
  the past when there was a ctrlmbuf. This one has been removed now, so
  mbuf should be used.

- I agree that removing the test (m->priv_size == 0) is better for
  the reasons mentioned, and also because it would add a test in the
  dataplane area, which would sometimes be useless: the application creates
  the mbuf pools, so it can know that all mbufs have a priv area.


Acked-by: Olivier Matz 
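
For readers following the thread, the layout discussed above (header, then
private area, then data buffer) can be sketched with a toy structure. This is
an illustration only: struct toy_mbuf, toy_mbuf_to_priv() and toy_mbuf_init()
are hypothetical names and do not reflect the real struct rte_mbuf layout or
the accessor added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct rte_mbuf; not the real DPDK layout. */
struct toy_mbuf {
	void *buf_addr;      /* start of the data buffer */
	uint16_t priv_size;  /* size of the private area, may be 0 */
	uint16_t buf_len;    /* size of the data buffer */
};

/* Return the address right after the header, where the private area starts.
 * As in the discussion, priv_size is not checked here: the application
 * created the pool and knows whether a private area exists. */
static inline void *toy_mbuf_to_priv(struct toy_mbuf *m)
{
	return (uint8_t *)m + sizeof(struct toy_mbuf);
}

/* Lay out header | private area | data buffer inside caller-provided memory. */
static struct toy_mbuf *toy_mbuf_init(void *mem, uint16_t priv_size,
				      uint16_t buf_len)
{
	struct toy_mbuf *m = mem;

	assert(mem != NULL);
	m->priv_size = priv_size;
	m->buf_len = buf_len;
	m->buf_addr = (uint8_t *)mem + sizeof(*m) + priv_size;
	return m;
}
```

The accessor is plain pointer arithmetic, which is why keeping a branch out of
it matters on the data path.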


Re: [dpdk-dev] [PATCH v2 1/8] vhost: announce VIRTIO_F_IN_ORDER support

2018-06-26 Thread Maxime Coquelin




On 06/25/2018 05:17 PM, Marvin Liu wrote:

If a device always uses descriptors in the same order in which they have
been made available, it can offer the VIRTIO_F_IN_ORDER feature. If
negotiated, this knowledge allows the device to notify the virtio driver
that a batch of buffers has been used by writing only the used ring index.

The vhost-user device supports this feature by default. If vhost dequeue
zero copy is enabled, VIRTIO_F_IN_ORDER should be disabled, as vhost can't
assure that descriptors returned from the NIC are in order.

Signed-off-by: Marvin Liu 


Reviewed-by: Maxime Coquelin 

Thanks,
Maxime
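
The feature handling described in the commit message can be sketched roughly
as below. This is a standalone illustration, not the vhost library code: the
toy_* names are hypothetical, and the bit number 35 for VIRTIO_F_IN_ORDER is
taken from the virtio 1.1 specification and should be treated as an assumption
of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number of VIRTIO_F_IN_ORDER (assumed from the virtio 1.1 spec). */
#define TOY_VIRTIO_F_IN_ORDER 35

/* Features the backend offers: everything it supports, minus IN_ORDER when
 * dequeue zero copy is enabled, since buffers may then return out of order. */
static uint64_t toy_offered_features(uint64_t supported, bool dequeue_zero_copy)
{
	if (dequeue_zero_copy)
		supported &= ~(1ULL << TOY_VIRTIO_F_IN_ORDER);
	return supported;
}

/* The negotiated set is the intersection of what both sides accept. */
static uint64_t toy_negotiate(uint64_t device, uint64_t driver)
{
	return device & driver;
}

/* Test a single feature bit in a 64-bit feature word. */
static bool toy_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & (1ULL << bit)) != 0;
}
```

Because the bit is cleared before it is offered, the driver can never
negotiate IN_ORDER against a zero-copy backend.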


Re: [dpdk-dev] [PATCH v2 0/8] support VIRTIO_F_IN_ORDER feature

2018-06-26 Thread Maxime Coquelin

Hi,

On 06/25/2018 05:17 PM, Marvin Liu wrote:

In latest virtio-spec, new feature bit VIRTIO_F_IN_ORDER was introduced.
When this feature has been negotiated, virtio driver will use
descriptors in ring order: starting from offset 0 in the table, and
wrapping around at the end of the table. Vhost devices will always use
descriptors in the same order in which they have been made available.
This can reduce virtio accesses to used ring.

Based on the updated virtio-spec, this series implements an IN_ORDER prototype
in the virtio driver. Because new [RT]x paths are added to the selection, it
also adds two new parameters, mrg_rxbuf and in_order, to the virtio-user vdev
parameter list. These allow the user to configure feature bits and thereby
influence [RT]x path selection.

Performance of virtio user with IN_ORDER feature:

 Platform: Purley
 CPU: Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
 DPDK baseline: 18.05
 Setup: testpmd with vhost vdev + testpmd with virtio vdev

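 +--------------+----------+----------+---------+
 |Vhost->Virtio |1 Queue   |2 Queues  |4 Queues |
 +--------------+----------+----------+---------+
 |Inorder       |12.0Mpps  |24.2Mpps  |26.0Mpps |
 |Normal        |12.1Mpps  |18.5Mpps  |18.9Mpps |
 +--------------+----------+----------+---------+

 +--------------+----------+----------------+---------+
 |Virtio->Vhost |1 Queue   |2 Queues        |4 Queues |
 +--------------+----------+----------------+---------+
 |Inorder       |13.8Mpps  |10.7 ~ 15.2Mpps |11.5Mpps |
 |Normal        |13.3Mpps  |9.8 ~ 14Mpps    |10.5Mpps |
 +--------------+----------+----------------+---------+

 +---------+----------+----------------+----------------+
 |Loopback |1 Queue   |2 Queues        |4 Queues        |
 +---------+----------+----------------+----------------+
 |Inorder  |7.4Mpps   |9.1 ~ 11.6Mpps  |10.5 ~ 11.3Mpps |
 +---------+----------+----------------+----------------+
 |Normal   |7.5Mpps   |7.7 ~ 9.0Mpps   |7.6 ~ 7.8Mpps   |
 +---------+----------+----------------+----------------+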

v2:
- merge to latest dpdk-net-virtio
- not use in_direct for normal xmit packets
- update available ring for each descriptor
- clean up IN_ORDER xmit function
- unmask feature bits when in_order or mrg_rxbuf is disabled
- extract common part between IN_ORDER and normal functions
- update performance result

Marvin Liu (8):
   vhost: announce VIRTIO_F_IN_ORDER support
   net/virtio: add VIRTIO_F_IN_ORDER definition
   net/virtio-user: add mrg_rxbuf and in_order vdev parameters
   net/virtio: free IN_ORDER descriptors before device start
   net/virtio: extract common part for IN_ORDER functions
   net/virtio: support IN_ORDER Rx and Tx
   net/virtio: add IN_ORDER Rx/Tx into selection
   net/virtio: announce VIRTIO_F_IN_ORDER support


I haven't checked, but I guess the titles don't pass the check-git-log.sh
script as they contain underscores.


  drivers/net/virtio/virtio_ethdev.c|  31 +-
  drivers/net/virtio/virtio_ethdev.h|   7 +
  drivers/net/virtio/virtio_pci.h   |   8 +
  drivers/net/virtio/virtio_rxtx.c  | 635 --
  .../net/virtio/virtio_user/virtio_user_dev.c  |  14 +-
  .../net/virtio/virtio_user/virtio_user_dev.h  |   3 +-
  drivers/net/virtio/virtio_user_ethdev.c   |  33 +-
  drivers/net/virtio/virtqueue.c|   8 +
  drivers/net/virtio/virtqueue.h|   2 +
  lib/librte_vhost/socket.c |   6 +
  lib/librte_vhost/vhost.h  |  10 +-
  11 files changed, 688 insertions(+), 69 deletions(-)



Re: [dpdk-dev] [PATCH v2 2/8] net/virtio: add VIRTIO_F_IN_ORDER definition

2018-06-26 Thread Maxime Coquelin




On 06/25/2018 05:17 PM, Marvin Liu wrote:

If VIRTIO_F_IN_ORDER has been negotiated, driver will use descriptors in
ring order: starting from offset 0 in the table, and wrapping around at
the end of the table. Also introduce use_inorder_[rt]x flag for
selection of IN_ORDER [RT]x handlers.


Reviewed-by: Maxime Coquelin 

Thanks,
Maxime
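
As a rough illustration of the ring-order behaviour described in the commit
message, the sketch below hands out descriptor slots in ring order and
reclaims a whole batch from a single reported index. It is not the virtio PMD
code: the toy_* names and the ring size are hypothetical, and the full-ring
case is deliberately not handled.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u  /* hypothetical ring size, must be a power of two */

struct toy_inorder_ring {
	uint16_t next_avail;  /* next descriptor slot the driver will use */
	uint16_t next_used;   /* next slot the driver expects to get back */
};

/* Driver side: hand out slots 0, 1, 2, ... and wrap at the end of the table. */
static uint16_t toy_take_desc(struct toy_inorder_ring *r)
{
	uint16_t idx = r->next_avail;

	r->next_avail = (uint16_t)((r->next_avail + 1) & (TOY_RING_SIZE - 1));
	return idx;
}

/* Driver side: the device reports only the last slot of a batch. Because
 * buffers are used in order, every slot from next_used up to and including
 * last_idx is finished. Returns the number of slots reclaimed. */
static uint16_t toy_reclaim_batch(struct toy_inorder_ring *r, uint16_t last_idx)
{
	uint16_t n;

	assert(last_idx < TOY_RING_SIZE);
	n = (uint16_t)(((last_idx - r->next_used) & (TOY_RING_SIZE - 1)) + 1);
	r->next_used = (uint16_t)((last_idx + 1) & (TOY_RING_SIZE - 1));
	return n;
}
```

A single reported index stands in for one used-ring entry per buffer, which
is where the reduced used-ring traffic comes from.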


Re: [dpdk-dev] [PATCH v2 0/8] support VIRTIO_F_IN_ORDER feature

2018-06-26 Thread Liu, Yong


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Tuesday, June 26, 2018 3:56 PM
> To: Liu, Yong ; Bie, Tiwei 
> Cc: Wang, Zhihong ; dev@dpdk.org
> Subject: Re: [PATCH v2 0/8] support VIRTIO_F_IN_ORDER feature
> 
> Hi,
> 
> On 06/25/2018 05:17 PM, Marvin Liu wrote:
> > In latest virtio-spec, new feature bit VIRTIO_F_IN_ORDER was introduced.
> > When this feature has been negotiated, virtio driver will use
> > descriptors in ring order: starting from offset 0 in the table, and
> > wrapping around at the end of the table. Vhost devices will always use
> > descriptors in the same order in which they have been made available.
> > This can reduce virtio accesses to used ring.
> >
> > Based on updated virtio-spec, this series realized IN_ORDER prototype
> > in virtio driver. Due to new [RT]x path added into selection, also add
> > two new parameters mrg_rx and in_order into virtio-user vdev parameters
> > list. This will allow user to configure feature bits thus can impact
> > [RT]x path selection.
> >
> > Performance of virtio user with IN_ORDER feature:
> >
> >  Platform: Purley
> >  CPU: Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> >  DPDK baseline: 18.05
> >  Setup: testpmd with vhost vdev + testpmd with virtio vdev
> >
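> >  +--------------+----------+----------+---------+
> >  |Vhost->Virtio |1 Queue   |2 Queues  |4 Queues |
> >  +--------------+----------+----------+---------+
> >  |Inorder       |12.0Mpps  |24.2Mpps  |26.0Mpps |
> >  |Normal        |12.1Mpps  |18.5Mpps  |18.9Mpps |
> >  +--------------+----------+----------+---------+
> >
> >  +--------------+----------+----------------+---------+
> >  |Virtio->Vhost |1 Queue   |2 Queues        |4 Queues |
> >  +--------------+----------+----------------+---------+
> >  |Inorder       |13.8Mpps  |10.7 ~ 15.2Mpps |11.5Mpps |
> >  |Normal        |13.3Mpps  |9.8 ~ 14Mpps    |10.5Mpps |
> >  +--------------+----------+----------------+---------+
> >
> >  +---------+----------+----------------+----------------+
> >  |Loopback |1 Queue   |2 Queues        |4 Queues        |
> >  +---------+----------+----------------+----------------+
> >  |Inorder  |7.4Mpps   |9.1 ~ 11.6Mpps  |10.5 ~ 11.3Mpps |
> >  +---------+----------+----------------+----------------+
> >  |Normal   |7.5Mpps   |7.7 ~ 9.0Mpps   |7.6 ~ 7.8Mpps   |
> >  +---------+----------+----------------+----------------+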
> >
> > v2:
> > - merge to latest dpdk-net-virtio
> > - not use in_direct for normal xmit packets
> > - update available ring for each descriptor
> > - clean up IN_ORDER xmit function
> > - unmask feature bits when in_order or mrg_rxbuf is disabled
> > - extract common part between IN_ORDER and normal functions
> > - update performance result
> >
> > Marvin Liu (8):
> >vhost: announce VIRTIO_F_IN_ORDER support
> >net/virtio: add VIRTIO_F_IN_ORDER definition
> >net/virtio-user: add mrg_rxbuf and in_order vdev parameters
> >net/virtio: free IN_ORDER descriptors before device start
> >net/virtio: extract common part for IN_ORDER functions
> >net/virtio: support IN_ORDER Rx and Tx
> >net/virtio: add IN_ORDER Rx/Tx into selection
> >net/virtio: announce VIRTIO_F_IN_ORDER support
> 
> I haven't checked but guess the titles doesn't pass the check-git-log.sh
> script as they contains underscores.

Thanks for the reminder, Maxime. I will replace "IN_ORDER" with "in order",
which passes the check.
> 
> >   drivers/net/virtio/virtio_ethdev.c|  31 +-
> >   drivers/net/virtio/virtio_ethdev.h|   7 +
> >   drivers/net/virtio/virtio_pci.h   |   8 +
> >   drivers/net/virtio/virtio_rxtx.c  | 635 --
> >   .../net/virtio/virtio_user/virtio_user_dev.c  |  14 +-
> >   .../net/virtio/virtio_user/virtio_user_dev.h  |   3 +-
> >   drivers/net/virtio/virtio_user_ethdev.c   |  33 +-
> >   drivers/net/virtio/virtqueue.c|   8 +
> >   drivers/net/virtio/virtqueue.h|   2 +
> >   lib/librte_vhost/socket.c |   6 +
> >   lib/librte_vhost/vhost.h  |  10 +-
> >   11 files changed, 688 insertions(+), 69 deletions(-)
> >


Re: [dpdk-dev] [PATCH] net/virtio-user: add unsupported features mask

2018-06-26 Thread Maxime Coquelin




On 06/25/2018 03:10 PM, Marvin Liu wrote:

This patch introduces unsupported features mask for virtio-user device.
For virtio-user server mode, when reconnecting virtio-user will
retrieve vhost devcie features as base and then unmask unsupported

s/devcie/device/

features.


I am not sure I understand why you are doing it like this.

Shouldn't you just:
 1. Not advertise the features you don't want to support
 2. In server mode, save the negotiated features, and re-use them when
a reconnect happens?

Also, I find "unmask" a bit misleading, why not something like "unsupp"
or "unsupported"?

Thanks,
Maxime
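
The two-step alternative suggested above could look roughly like the sketch
below. It is only an illustration of the idea, not virtio-user code: struct
toy_vu_dev, toy_connect() and toy_reconnect() are hypothetical, and rejecting
a backend that lacks an already negotiated feature is an assumption made for
this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_vu_dev {
	uint64_t advertised;  /* only the features this frontend wants to support */
	uint64_t negotiated;  /* result of the first negotiation */
	bool has_saved;       /* true once negotiated holds a valid value */
};

/* First connection: negotiate against the backend and remember the result. */
static uint64_t toy_connect(struct toy_vu_dev *d, uint64_t backend)
{
	d->negotiated = d->advertised & backend;
	d->has_saved = true;
	return d->negotiated;
}

/* Reconnect in server mode: re-use the saved feature set instead of deriving
 * it again. Returns 0 on success, -1 if the new backend lacks a feature that
 * is already in use (assumed policy for this sketch). */
static int toy_reconnect(const struct toy_vu_dev *d, uint64_t backend)
{
	assert(d->has_saved);
	if ((d->negotiated & ~backend) != 0)
		return -1;
	return 0;
}
```

With this shape no separate mask of unsupported features is needed, because
unwanted bits are never advertised in the first place.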


[dpdk-dev] [PATCH v3 0/3] crypto/qat: move files to drivers/common directory

2018-06-26 Thread Tomasz Jozwiak
This patchset depends on the QAT dynamic logging patchset and is targeted at
18.08.
The patchset refactors the PMD so that its files are split between two
locations: common and crypto.
A new drivers/common/qat directory is added and the files are split between
the two locations.

Changes for v2:
  -  removed drivers/common/qat/qat
  -  updated meson.build files
  -  added description into qat.rst
  -  updated MAINTAINERS file 

Changes for v3:
  -  removed libcrypto detection from Makefile
  -  removed description about libcrypto detection from doc.
  -  renamed CONFIG_LIBCRYPTO_QAT define into BUILD_QAT_SYM

Tomasz Jozwiak (3):
  crypto/qat: add weak functions
  crypto/qat: re-organise build file content
  crypto/qat: move common qat files to common dir

 MAINTAINERS|  1 +
 drivers/Makefile   |  2 +
 drivers/common/meson.build |  2 +-
 drivers/common/qat/Makefile| 49 ++
 drivers/common/qat/meson.build | 14 +++
 .../qat/qat_adf/adf_transport_access_macros.h  |  0
 .../{crypto => common}/qat/qat_adf/icp_qat_fw.h|  0
 .../{crypto => common}/qat/qat_adf/icp_qat_fw_la.h |  0
 .../{crypto => common}/qat/qat_adf/icp_qat_hw.h|  0
 drivers/{crypto => common}/qat/qat_common.c|  0
 drivers/{crypto => common}/qat/qat_common.h|  0
 drivers/{crypto => common}/qat/qat_device.c| 39 -
 drivers/{crypto => common}/qat/qat_device.h| 20 +
 drivers/{crypto => common}/qat/qat_logs.c  |  0
 drivers/{crypto => common}/qat/qat_logs.h  |  0
 drivers/{crypto => common}/qat/qat_qp.c|  0
 drivers/{crypto => common}/qat/qat_qp.h|  0
 drivers/crypto/Makefile|  1 -
 drivers/crypto/qat/Makefile| 40 --
 drivers/crypto/qat/README  |  8 
 drivers/crypto/qat/meson.build | 32 --
 drivers/crypto/qat/qat_asym_pmd.c  | 17 
 drivers/crypto/qat/qat_asym_pmd.h  | 15 ---
 drivers/crypto/qat/qat_comp_pmd.c  | 18 
 drivers/crypto/qat/qat_comp_pmd.h  | 29 -
 drivers/crypto/qat/qat_sym.h   |  8 
 drivers/crypto/qat/qat_sym_pmd.h   |  6 ++-
 27 files changed, 163 insertions(+), 138 deletions(-)
 create mode 100644 drivers/common/qat/Makefile
 create mode 100644 drivers/common/qat/meson.build
 rename drivers/{crypto => common}/qat/qat_adf/adf_transport_access_macros.h (100%)
 rename drivers/{crypto => common}/qat/qat_adf/icp_qat_fw.h (100%)
 rename drivers/{crypto => common}/qat/qat_adf/icp_qat_fw_la.h (100%)
 rename drivers/{crypto => common}/qat/qat_adf/icp_qat_hw.h (100%)
 rename drivers/{crypto => common}/qat/qat_common.c (100%)
 rename drivers/{crypto => common}/qat/qat_common.h (100%)
 rename drivers/{crypto => common}/qat/qat_device.c (88%)
 rename drivers/{crypto => common}/qat/qat_device.h (80%)
 rename drivers/{crypto => common}/qat/qat_logs.c (100%)
 rename drivers/{crypto => common}/qat/qat_logs.h (100%)
 rename drivers/{crypto => common}/qat/qat_qp.c (100%)
 rename drivers/{crypto => common}/qat/qat_qp.h (100%)
 delete mode 100644 drivers/crypto/qat/Makefile
 create mode 100644 drivers/crypto/qat/README
 delete mode 100644 drivers/crypto/qat/qat_asym_pmd.c
 delete mode 100644 drivers/crypto/qat/qat_asym_pmd.h
 delete mode 100644 drivers/crypto/qat/qat_comp_pmd.c
 delete mode 100644 drivers/crypto/qat/qat_comp_pmd.h

-- 
2.7.4



Re: [dpdk-dev] [PATCH 5/6] cryptodev: remove old get session size functions

2018-06-26 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Verma, Shally [mailto:shally.ve...@cavium.com]
> Sent: Tuesday, June 26, 2018 6:28 AM
> To: De Lara Guarch, Pablo ; Akhil Goyal
> ; Doherty, Declan ;
> ravi1.ku...@amd.com; Jacob, Jerin ;
> Zhang, Roy Fan ; Trahe, Fiona
> ; t...@semihalf.com; jianjay.z...@huawei.com
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 5/6] cryptodev: remove old get session size
> functions
> 
> 
> 
> >-Original Message-
> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.gua...@intel.com]
> >Sent: 25 June 2018 22:10
> >To: Verma, Shally ; Akhil Goyal
> >; Doherty, Declan ;
> >ravi1.ku...@amd.com; Jacob, Jerin
> >; Zhang, Roy Fan
> >; Trahe, Fiona ;
> >t...@semihalf.com; jianjay.z...@huawei.com
> >Cc: dev@dpdk.org
> >Subject: RE: [dpdk-dev] [PATCH 5/6] cryptodev: remove old get session
> >size functions
> >
> >External Email
> >
> >> -Original Message-
> >> From: Verma, Shally [mailto:shally.ve...@cavium.com]
> >> Sent: Friday, June 22, 2018 6:02 PM
> >> To: Akhil Goyal ; De Lara Guarch, Pablo
> >> ; Doherty, Declan
> >> ; ravi1.ku...@amd.com; Jacob, Jerin
> >> ; Zhang, Roy Fan
> >> ; Trahe, Fiona ;
> >> t...@semihalf.com; jianjay.z...@huawei.com
> >> Cc: dev@dpdk.org
> >> Subject: RE: [dpdk-dev] [PATCH 5/6] cryptodev: remove old get session
> >> size functions
> >>
> >> Hi Pablo
> >>
> >> >-Original Message-
> >> >From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Akhil Goyal
> >> >Sent: 21 June 2018 18:29
> >> >To: Pablo de Lara ;
> >> >declan.dohe...@intel.com; ravi1.ku...@amd.com; Jacob, Jerin
> >> >; roy.fan.zh...@intel.com;
> >> >fiona.tr...@intel.com; t...@semihalf.com; jianjay.z...@huawei.com
> >> >Cc: dev@dpdk.org
> >> >Subject: Re: [dpdk-dev] [PATCH 5/6] cryptodev: remove old get
> >> >session size functions
> >> >
> >> >External Email
> >> >
> >> >Hi Pablo,
> >> >
> >> >
> >> >On 6/9/2018 3:32 AM, Pablo de Lara wrote:
> >> >> Removed rte_cryptodev_get_header_session_size
> >> >> and rte_cryptodev_get_private_session_size functions, as they have
> >> >> been substituted with functions specific for symmetric operations,
> >> >> with _sym_ word after "rte_cryptodev_".
> >> >>
> >> >> Signed-off-by: Pablo de Lara 
> >> >> ---
> >
> >...
> >
> >> >> +
> >> >> +  - ``rte_cryptodev_get_header_session_size`` is replaced with
> >> >> +``rte_cryptodev_sym_get_header_session_size``
> >> >> +  - ``rte_cryptodev_get_private_session_size`` is replaced with
> >> >> +``rte_cryptodev_sym_get_private_session_size``
> >> >> +
> >> >rte_cryptodev_get_private_session_size is not removed in this patch.
> >> >I think you missed it in your patch.
> >
> >Right Akhil, thanks for spotting this. Will fix in next version.
> >
> >> >
> >> >-Akhil
> >> >>
> >> >>   ABI Changes
> >> >>   ---
> >> >> diff --git a/lib/librte_cryptodev/rte_cryptodev.c
> >> >> b/lib/librte_cryptodev/rte_cryptodev.c
> >> >> index a07904fb9..40e249e79 100644
> >> >> --- a/lib/librte_cryptodev/rte_cryptodev.c
> >> >> +++ b/lib/librte_cryptodev/rte_cryptodev.c
> >> >> @@ -1181,12 +1181,6 @@ rte_cryptodev_sym_session_free(struct
> >> rte_cryptodev_sym_session *sess)
> >> >>   return 0;
> >> >>   }
> >> >>
> >> >> -unsigned int
> >> >> -rte_cryptodev_get_header_session_size(void)
> >> >> -{
> >> >> - return rte_cryptodev_sym_get_header_session_size();
> >> >> -}
> >> >> -
> >> >>   unsigned int
> >> >>   rte_cryptodev_sym_get_header_session_size(void)
> >> >>   {
> >>
> >> [Shally] I missed this before. I think this implementation either
> >> should change to use nb_drivers which support symmetric or else I am
> >> not seeing a need for separate sym specific API for header_size since
> >> it will always be same for both sym and asym.
> >
> >The implementation is already using nb_drivers to calculate the size, right?
> [Shally] I meant change it to nb_sym_drivers, where nb_sym_drivers = number
> of drivers that have symmetric capability

Ok, I see now. Well, this would overcomplicate things:
rte_cryptodev_allocate_driver would need to be split into two functions,
one for symmetric and another for asymmetric, causing an API breakage.
I think as long as the session creation/initialization functions check if a PMD
supports symmetric and/or asymmetric, we should be OK.

We might need some changes in the current symmetric implementation to perform
those checks, and in the new asymmetric implementation.

Thanks,
Pablo

> 
> >Anyway, I understand that the way asymmetric sessions are done, the API
> >will be the same, but this could change in the future and since we have
> >already deprecated the generic function (get_header_session_size), I think we
> should continue and have both _sym and _asym_ functions.
> >
> [Shally] Ok.
> >Thanks,
> >Pablo
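
The per-session capability check proposed in this reply might be sketched as
follows. This is a standalone illustration rather than cryptodev code: the
TOY_FF_* flags, struct toy_crypto_drv and both init functions are hypothetical
names, and returning -ENOTSUP is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability flags for this sketch. */
#define TOY_FF_SYMMETRIC  (1ULL << 0)
#define TOY_FF_ASYMMETRIC (1ULL << 1)

struct toy_crypto_drv {
	uint64_t feature_flags;  /* capabilities advertised by the PMD */
};

/* Symmetric session init: refuse drivers without symmetric support. */
static int toy_sym_session_init(const struct toy_crypto_drv *drv)
{
	assert(drv != NULL);
	if ((drv->feature_flags & TOY_FF_SYMMETRIC) == 0)
		return -ENOTSUP;
	return 0;
}

/* Asymmetric session init: refuse drivers without asymmetric support. */
static int toy_asym_session_init(const struct toy_crypto_drv *drv)
{
	assert(drv != NULL);
	if ((drv->feature_flags & TOY_FF_ASYMMETRIC) == 0)
		return -ENOTSUP;
	return 0;
}
```

Checking at session init keeps a single driver registration path, so the
driver count used for the header size does not have to be split per service.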



[dpdk-dev] [PATCH v3 2/3] crypto/qat: re-organise build file content

2018-06-26 Thread Tomasz Jozwiak
This patch groups sources and related dependencies into
common and sym sections in build files.

Signed-off-by: Tomasz Jozwiak 
Acked-by: Fiona Trahe 
---
 drivers/crypto/qat/Makefile  | 25 ++---
 drivers/crypto/qat/meson.build   | 14 --
 drivers/crypto/qat/qat_sym.h |  2 +-
 drivers/crypto/qat/qat_sym_pmd.h |  3 +--
 4 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/qat/Makefile b/drivers/crypto/qat/Makefile
index 64f39fd..a939eca 100644
--- a/drivers/crypto/qat/Makefile
+++ b/drivers/crypto/qat/Makefile
@@ -15,19 +15,22 @@ CFLAGS += -O3
 
 # external library include paths
 CFLAGS += -I$(SRCDIR)/qat_adf
-LDLIBS += -lcrypto
-LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
+
+# library common source files
+SRCS-y += qat_device.c
+SRCS-y += qat_common.c
+SRCS-y += qat_logs.c
+SRCS-y += qat_qp.c
+
+# library symetric crypto source files
 LDLIBS += -lrte_cryptodev
-LDLIBS += -lrte_pci -lrte_bus_pci
+LDLIBS += -lcrypto
+SRCS-y += qat_sym.c
+SRCS-y += qat_sym_session.c
+SRCS-y += qat_sym_pmd.c
 
-# library source files
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_sym.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_device.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_qp.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_sym_session.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_common.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_logs.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_sym_pmd.c
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_pci -lrte_bus_pci
 
 # export include files
 SYMLINK-y-include +=
diff --git a/drivers/crypto/qat/meson.build b/drivers/crypto/qat/meson.build
index 6d01dac..db4af2c 100644
--- a/drivers/crypto/qat/meson.build
+++ b/drivers/crypto/qat/meson.build
@@ -2,15 +2,17 @@
 # Copyright(c) 2017-2018 Intel Corporation
 
 dep = dependency('libcrypto', required: false)
-if not dep.found()
-   build = false
-endif
+
 sources = files('qat_common.c',
'qat_qp.c',
'qat_device.c',
-   'qat_logs.c',
-   'qat_sym_pmd.c', 'qat_sym.c', 'qat_sym_session.c')
+   'qat_logs.c')
+
+if dep.found()
+   sources += files('qat_sym_pmd.c', 'qat_sym.c', 'qat_sym_session.c')
+   pkgconfig_extra_libs += '-lcrypto'
+endif
+
 includes += include_directories('qat_adf')
 deps += ['bus_pci']
 ext_deps += dep
-pkgconfig_extra_libs += '-lcrypto'
diff --git a/drivers/crypto/qat/qat_sym.h b/drivers/crypto/qat/qat_sym.h
index f9e72a6..126c191 100644
--- a/drivers/crypto/qat/qat_sym.h
+++ b/drivers/crypto/qat/qat_sym.h
@@ -6,7 +6,6 @@
 #define _QAT_SYM_H_
 
 #include 
-
 #include 
 
 #include "qat_common.h"
@@ -153,4 +152,5 @@ qat_sym_process_response(void **op, uint8_t *resp)
}
*op = (void *)rx_op;
 }
+
 #endif /* _QAT_SYM_H_ */
diff --git a/drivers/crypto/qat/qat_sym_pmd.h b/drivers/crypto/qat/qat_sym_pmd.h
index 80a1987..1e2344c 100644
--- a/drivers/crypto/qat/qat_sym_pmd.h
+++ b/drivers/crypto/qat/qat_sym_pmd.h
@@ -10,7 +10,6 @@
 #include "qat_sym_capabilities.h"
 #include "qat_device.h"
 
-
 /**< Intel(R) QAT Symmetric Crypto PMD device name */
 #define CRYPTODEV_NAME_QAT_SYM_PMD crypto_qat
 #define QAT_SYM_PMD_MAX_NB_SESSIONS2048
@@ -31,10 +30,10 @@ struct qat_sym_dev_private {
/* QAT device symmetric crypto capabilities */
 };
 
-
 int
 qat_sym_dev_create(struct qat_pci_device *qat_pci_dev);
 
 int
 qat_sym_dev_destroy(struct qat_pci_device *qat_pci_dev);
+
 #endif /* _QAT_SYM_PMD_H_ */
-- 
2.7.4



[dpdk-dev] [PATCH v3 1/3] crypto/qat: add weak functions

2018-06-26 Thread Tomasz Jozwiak
This patch adds the following weak functions to facilitate conditional
compilation of the code for the sym, asym and comp services:
  -  qat_sym_dev_create
  -  qat_asym_dev_create
  -  qat_comp_dev_create
  -  qat_sym_dev_destroy
  -  qat_asym_dev_destroy
  -  qat_comp_dev_destroy
and removes unused files with empty definitions of the above functions.

Signed-off-by: Tomasz Jozwiak 
Acked-by: Fiona Trahe 
---
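
For reference, the weak-function pattern named in the commit message can be
sketched as below. This is a minimal illustration, not the code from this
patch: the toy_* names are hypothetical, and __attribute__((weak)) is the
GCC/Clang spelling, assumed here for the example.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pci_device {
	int id;  /* stand-in for struct qat_pci_device */
};

/* Weak defaults: used when no service-specific object file is linked in.
 * A strong definition with the same name in another object file replaces
 * the weak one at link time. */
__attribute__((weak)) int
toy_asym_dev_create(struct toy_pci_device *dev)
{
	(void)dev;
	return 0;
}

__attribute__((weak)) int
toy_asym_dev_destroy(struct toy_pci_device *dev)
{
	(void)dev;
	return 0;
}

/* Common probe code can call the service hooks unconditionally. */
static int toy_probe(struct toy_pci_device *dev)
{
	assert(dev != NULL);
	return toy_asym_dev_create(dev);
}
```

The common code needs no #ifdef around the calls; whether a service is built
is decided purely by which source files the build system adds.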
 drivers/crypto/qat/Makefile   |  2 --
 drivers/crypto/qat/meson.build|  4 +---
 drivers/crypto/qat/qat_asym_pmd.c | 17 -
 drivers/crypto/qat/qat_asym_pmd.h | 15 ---
 drivers/crypto/qat/qat_comp_pmd.c | 18 --
 drivers/crypto/qat/qat_comp_pmd.h | 29 -
 drivers/crypto/qat/qat_device.c   | 39 +--
 drivers/crypto/qat/qat_device.h   | 20 
 8 files changed, 58 insertions(+), 86 deletions(-)
 delete mode 100644 drivers/crypto/qat/qat_asym_pmd.c
 delete mode 100644 drivers/crypto/qat/qat_asym_pmd.h
 delete mode 100644 drivers/crypto/qat/qat_comp_pmd.c
 delete mode 100644 drivers/crypto/qat/qat_comp_pmd.h

diff --git a/drivers/crypto/qat/Makefile b/drivers/crypto/qat/Makefile
index ef4a567..64f39fd 100644
--- a/drivers/crypto/qat/Makefile
+++ b/drivers/crypto/qat/Makefile
@@ -28,8 +28,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_sym_session.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_logs.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_sym_pmd.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_asym_pmd.c
-SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_comp_pmd.c
 
 # export include files
 SYMLINK-y-include +=
diff --git a/drivers/crypto/qat/meson.build b/drivers/crypto/qat/meson.build
index bcab16e..6d01dac 100644
--- a/drivers/crypto/qat/meson.build
+++ b/drivers/crypto/qat/meson.build
@@ -9,9 +9,7 @@ sources = files('qat_common.c',
'qat_qp.c',
'qat_device.c',
'qat_logs.c',
-   'qat_sym_pmd.c', 'qat_sym.c', 'qat_sym_session.c',
-   'qat_asym_pmd.c',
-   'qat_comp_pmd.c')
+   'qat_sym_pmd.c', 'qat_sym.c', 'qat_sym_session.c')
 includes += include_directories('qat_adf')
 deps += ['bus_pci']
 ext_deps += dep
diff --git a/drivers/crypto/qat/qat_asym_pmd.c b/drivers/crypto/qat/qat_asym_pmd.c
deleted file mode 100644
index 8d36300..0000000
--- a/drivers/crypto/qat/qat_asym_pmd.c
+++ /dev/null
@@ -1,17 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#include "qat_asym_pmd.h"
-
-int
-qat_asym_dev_create(struct qat_pci_device *qat_pci_dev __rte_unused)
-{
-   return 0;
-}
-
-int
-qat_asym_dev_destroy(struct qat_pci_device *qat_pci_dev __rte_unused)
-{
-   return 0;
-}
diff --git a/drivers/crypto/qat/qat_asym_pmd.h b/drivers/crypto/qat/qat_asym_pmd.h
deleted file mode 100644
index 0465e03..0000000
--- a/drivers/crypto/qat/qat_asym_pmd.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#ifndef _QAT_ASYM_PMD_H_
-#define _QAT_ASYM_PMD_H_
-
-#include "qat_device.h"
-
-int
-qat_asym_dev_create(struct qat_pci_device *qat_pci_dev);
-
-int
-qat_asym_dev_destroy(struct qat_pci_device *qat_pci_dev);
-#endif /* _QAT_ASYM_PMD_H_ */
diff --git a/drivers/crypto/qat/qat_comp_pmd.c b/drivers/crypto/qat/qat_comp_pmd.c
deleted file mode 100644
index 547b3db..0000000
--- a/drivers/crypto/qat/qat_comp_pmd.c
+++ /dev/null
@@ -1,18 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#include "qat_comp_pmd.h"
-
-
-int
-qat_comp_dev_create(struct qat_pci_device *qat_pci_dev __rte_unused)
-{
-   return 0;
-}
-
-int
-qat_comp_dev_destroy(struct qat_pci_device *qat_pci_dev __rte_unused)
-{
-   return 0;
-}
diff --git a/drivers/crypto/qat/qat_comp_pmd.h b/drivers/crypto/qat/qat_comp_pmd.h
deleted file mode 100644
index cc31246..0000000
--- a/drivers/crypto/qat/qat_comp_pmd.h
+++ /dev/null
@@ -1,29 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2018 Intel Corporation
- */
-
-#ifndef _QAT_COMP_PMD_H_
-#define _QAT_COMP_PMD_H_
-
-#include "qat_device.h"
-
-
-/**< Intel(R) QAT Compression PMD device name */
-#define COMPRESSDEV_NAME_QAT_PMD   comp_qat
-
-
-/** private data structure for a QAT compression device.
- * This QAT device is a device offering only a compression service,
- * there can be one of these on each qat_pci_device (VF).
- */
-struct qat_comp_dev_private {
-   struct qat_pci_device *qat_dev;
-   /**< The qat pci device hosting the service */
-};
-
-int
-qat_comp_dev_create(struct qat_pci_device *qat_pci_dev);
-
-int
-qat_comp_dev_destroy(struct qat_pci_device *qat_pci_dev);
-#endif /* _QAT_COMP_PMD_H_ */
diff --git a/drivers/crypto/qat/qat_device.c b/drivers/crypto/qat/qat_device.c
index 4b97c84..64f236e 100644
--- a/drivers/crypto/qat/qat_device.c
+++ b/drivers/crypto/qat/qat_device
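
For readers unfamiliar with the mechanism, here is a minimal sketch of what a
weak default could look like. It is not taken from the truncated qat_device.c
hunk: it assumes GCC or Clang, spells the attribute out instead of using a
DPDK macro, reduces struct qat_pci_device to a stand-in, and simply mirrors
the "return 0" stubs from the deleted files above.

```c
/* Stand-in for the real structure; only here so the sketch compiles. */
struct qat_pci_device {
	int dev_id;
};

/*
 * Weak default: the linker uses it only when no other object file
 * provides a strong definition of the same symbol. A service that is
 * compiled in would supply its own qat_asym_dev_create() and silently
 * override this stub.
 */
__attribute__((weak)) int
qat_asym_dev_create(struct qat_pci_device *qat_pci_dev)
{
	(void)qat_pci_dev;	/* nothing to create for a disabled service */
	return 0;
}

__attribute__((weak)) int
qat_asym_dev_destroy(struct qat_pci_device *qat_pci_dev)
{
	(void)qat_pci_dev;	/* nothing to tear down either */
	return 0;
}
```

With this layout the common code can call the create/destroy pair
unconditionally, and whether a service is built in is decided at link time.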

[dpdk-dev] [PATCH v3 3/3] crypto/qat: move common qat files to common dir

2018-06-26 Thread Tomasz Jozwiak
  -  moved common QAT files to the common/qat directory
  -  changed common/qat/Makefile, common/qat/meson.build,
     drivers/Makefile and crypto/Makefile
     so that the build uses the new file locations
  -  added a README file to crypto/qat to clarify where
     the build is made from
  -  updated the MAINTAINERS file

Signed-off-by: Tomasz Jozwiak 
Acked-by: Fiona Trahe 
---
 MAINTAINERS|  1 +
 drivers/Makefile   |  2 ++
 drivers/common/meson.build |  2 +-
 drivers/{crypto => common}/qat/Makefile| 20 +++-
 drivers/common/qat/meson.build | 14 +++
 .../qat/qat_adf/adf_transport_access_macros.h  |  0
 .../{crypto => common}/qat/qat_adf/icp_qat_fw.h|  0
 .../{crypto => common}/qat/qat_adf/icp_qat_fw_la.h |  0
 .../{crypto => common}/qat/qat_adf/icp_qat_hw.h|  0
 drivers/{crypto => common}/qat/qat_common.c|  0
 drivers/{crypto => common}/qat/qat_common.h|  0
 drivers/{crypto => common}/qat/qat_device.c|  0
 drivers/{crypto => common}/qat/qat_device.h|  0
 drivers/{crypto => common}/qat/qat_logs.c  |  0
 drivers/{crypto => common}/qat/qat_logs.h  |  0
 drivers/{crypto => common}/qat/qat_qp.c|  0
 drivers/{crypto => common}/qat/qat_qp.h|  0
 drivers/crypto/Makefile|  1 -
 drivers/crypto/qat/README  |  8 +++
 drivers/crypto/qat/meson.build | 28 +-
 drivers/crypto/qat/qat_sym.h   |  8 +++
 drivers/crypto/qat/qat_sym_pmd.h   |  3 +++
 22 files changed, 68 insertions(+), 19 deletions(-)
 rename drivers/{crypto => common}/qat/Makefile (60%)
 create mode 100644 drivers/common/qat/meson.build
 rename drivers/{crypto => common}/qat/qat_adf/adf_transport_access_macros.h (100%)
 rename drivers/{crypto => common}/qat/qat_adf/icp_qat_fw.h (100%)
 rename drivers/{crypto => common}/qat/qat_adf/icp_qat_fw_la.h (100%)
 rename drivers/{crypto => common}/qat/qat_adf/icp_qat_hw.h (100%)
 rename drivers/{crypto => common}/qat/qat_common.c (100%)
 rename drivers/{crypto => common}/qat/qat_common.h (100%)
 rename drivers/{crypto => common}/qat/qat_device.c (100%)
 rename drivers/{crypto => common}/qat/qat_device.h (100%)
 rename drivers/{crypto => common}/qat/qat_logs.c (100%)
 rename drivers/{crypto => common}/qat/qat_logs.h (100%)
 rename drivers/{crypto => common}/qat/qat_qp.c (100%)
 rename drivers/{crypto => common}/qat/qat_qp.h (100%)
 create mode 100644 drivers/crypto/qat/README

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bc928f..bc16078 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -775,6 +775,7 @@ M: John Griffin 
 M: Fiona Trahe 
 M: Deepak Kumar Jain 
 F: drivers/crypto/qat/
+F: drivers/common/qat/
 F: doc/guides/cryptodevs/qat.rst
 F: doc/guides/cryptodevs/features/qat.ini
 
diff --git a/drivers/Makefile b/drivers/Makefile
index c88638c..7566076 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -13,6 +13,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_BBDEV) += baseband
 DEPDIRS-baseband := common bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += crypto
 DEPDIRS-crypto := common bus mempool
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += common/qat
+DEPDIRS-common/qat := bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_COMPRESSDEV) += compress
 DEPDIRS-compress := bus mempool
 DIRS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += event
diff --git a/drivers/common/meson.build b/drivers/common/meson.build
index 5f6341b..d7b7d8c 100644
--- a/drivers/common/meson.build
+++ b/drivers/common/meson.build
@@ -2,6 +2,6 @@
 # Copyright(c) 2018 Cavium, Inc
 
 std_deps = ['eal']
-drivers = ['octeontx']
+drivers = ['octeontx', 'qat']
 config_flag_fmt = 'RTE_LIBRTE_@0@_COMMON'
 driver_name_fmt = 'rte_common_@0@'
diff --git a/drivers/crypto/qat/Makefile b/drivers/common/qat/Makefile
similarity index 60%
rename from drivers/crypto/qat/Makefile
rename to drivers/common/qat/Makefile
index a939eca..069ac8c 100644
--- a/drivers/crypto/qat/Makefile
+++ b/drivers/common/qat/Makefile
@@ -13,8 +13,13 @@ LIBABIVER := 1
 CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -O3
 
+# build directories
+QAT_CRYPTO_DIR := $(RTE_SDK)/drivers/crypto/qat
+
 # external library include paths
 CFLAGS += -I$(SRCDIR)/qat_adf
+CFLAGS += -I$(SRCDIR)
+CFLAGS += -I$(QAT_CRYPTO_DIR)
 
 # library common source files
 SRCS-y += qat_device.c
@@ -23,11 +28,14 @@ SRCS-y += qat_logs.c
 SRCS-y += qat_qp.c
 
 # library symetric crypto source files
-LDLIBS += -lrte_cryptodev
-LDLIBS += -lcrypto
-SRCS-y += qat_sym.c
-SRCS-y += qat_sym_session.c
-SRCS-y += qat_sym_pmd.c
+ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
+   LDLIBS += -lrte_cryptodev
+   LDLIBS += -lcrypto
+   CFLAGS += -DBUILD_QAT_SYM
+   SRCS-y += $(QAT_CRYPTO_DIR)/qat_sym.c
+   SRCS-y += $(QAT_CRYPTO_DIR)/qat_sym_session.c
+   SRCS-y += $(QAT_CRYPTO_DIR)/qat_sym_pmd.c
+endif
 
 LDLIBS += -lrte_eal 

Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal

2018-06-26 Thread Tiwei Bie
On Mon, Jun 25, 2018 at 08:17:08PM +0800, Stojaczyk, DariuszX wrote:
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Tiwei Bie
> > Sent: Monday, June 25, 2018 1:02 PM
> > 
> > 
> > Hi Dariusz,
> > 
> 
> Hi Tiwei,
> 
> > Thank you for putting efforts in making the DPDK
> > vhost more generic!
> > 
> > From my understanding, your proposal is that:
> > 
> > 1) Introduce rte_vhost2 to provide the APIs which
> >allow users to implement vhost backends like
> >SCSI, net, crypto, ..
> > 
> 
> That's right.
> 
> > 2) Refactor the existing rte_vhost to use rte_vhost2.
> >The rte_vhost will still provide below existing
> >sets of APIs:
> > 1. The APIs which allow users to implement
> >external vhost backends (these APIs were
> >designed for SPDK previously)
> > 2. The APIs provided by the net backend
> > 3. The APIs provided by the crypto backend
> >And above APIs in rte_vhost won't be changed.
> 
> That's correct. Rte_vhost would register its own rte_vhost2_tgt_ops 
> underneath and will call existing vhost_device_ops for e.g. starting the 
> device once all queues are started.

Currently I have the following concerns and questions:

- The rte_vhost problem is still there. Even though
  rte_vhost2 is introduced, the net and crypto backends
  in rte_vhost won't benefit from the new callbacks.

  The existing rte_vhost in DPDK not only provides the
  APIs for DPDK applications to implement external
  backends, but also provides high-performance net and
  crypto backend implementations (maybe more in the
  future). So it's important that, besides the DPDK
  applications which implement their own external
  backends, the DPDK applications which use the builtin
  backends also benefit from the new callbacks.

  So we should have a clear plan for how the legacy
  callbacks in rte_vhost will be dealt with in the
  next step.

  Besides, the new library's name is a bit misleading.
  It makes the existing rte_vhost library sound
  obsolete, but it isn't: it will still provide the
  net and crypto backends. So if we want to introduce
  this new library, we should give it a better name.

- It's possible to solve the rte_vhost problem you hit
  by refactoring the existing vhost library directly,
  instead of implementing a new vhost library from
  scratch and leaving the old one's problem as is.

  This would solve the problem you hit and also solve
  it for rte_vhost. Why not go this way? Something
  like:

  Below is the existing set of callbacks in rte_vhost.h:

  /**
   * Device and vring operations.
   */
  struct vhost_device_ops {
  ..
  };

  It's a legacy implementation, and doesn't really
  follow the DPDK API design (e.g. no rte_ prefix).
  We can design and implement new message handling
  and a new set of callbacks for rte_vhost to solve
  the problem you hit without changing the old ones.
  Something like:

  struct rte_vhost_device_ops {
          ..
  };

  int
  vhost_user_msg_handler(struct vhost_dev *vdev, struct vhost_user_msg *msg)
  {
          ..

          if (!vdev->is_using_new_device_ops) {
                  // Call the existing message handler
                  return vhost_user_msg_handler_legacy(vdev, msg);
          }

          // Implement the new logic here
          ..
  }

  A vhost application is allowed to register only struct
  rte_vhost_device_ops or struct vhost_device_ops (which
  should be deprecated in the future). The two ops cannot
  be registered at the same time.

  The existing applications could use the old ops. And
  if an application registers struct rte_vhost_device_ops,
  the new callbacks and message handler will be used.
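
  To make the registration rule concrete, here is a rough standalone
  sketch. Everything in it is an assumption for illustration only: the
  structures are simplified stand-ins (not the real rte_vhost
  definitions), the two register functions and the HANDLED_BY_* return
  codes are made up, and the new message path is left empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; none of these are the real rte_vhost definitions. */
struct vhost_user_msg { int request; };
struct vhost_device_ops { int placeholder; };		/* legacy callbacks */
struct rte_vhost_device_ops { int placeholder; };	/* hypothetical new callbacks */

/* Made-up return codes so the caller can see which path ran. */
enum handled_by { HANDLED_BY_LEGACY = 1, HANDLED_BY_NEW = 2 };

struct vhost_dev {
	bool is_using_new_device_ops;
	const struct vhost_device_ops *legacy_ops;
	const struct rte_vhost_device_ops *new_ops;
};

/* Register the legacy ops; refused if the new ops are already in place. */
int
vhost_register_legacy_ops(struct vhost_dev *vdev,
			  const struct vhost_device_ops *ops)
{
	if (ops == NULL || vdev->new_ops != NULL)
		return -1;
	vdev->legacy_ops = ops;
	vdev->is_using_new_device_ops = false;
	return 0;
}

/* Register the new ops; refused if the legacy ops are already in place. */
int
vhost_register_new_ops(struct vhost_dev *vdev,
		       const struct rte_vhost_device_ops *ops)
{
	if (ops == NULL || vdev->legacy_ops != NULL)
		return -1;
	vdev->new_ops = ops;
	vdev->is_using_new_device_ops = true;
	return 0;
}

/* Stand-in for the existing handler, which stays untouched. */
static int
vhost_user_msg_handler_legacy(struct vhost_dev *vdev,
			      struct vhost_user_msg *msg)
{
	(void)vdev;
	(void)msg;
	return HANDLED_BY_LEGACY;
}

int
vhost_user_msg_handler(struct vhost_dev *vdev, struct vhost_user_msg *msg)
{
	if (!vdev->is_using_new_device_ops)
		return vhost_user_msg_handler_legacy(vdev, msg);

	/* The new message handling and callbacks would go here. */
	(void)msg;
	return HANDLED_BY_NEW;
}
```

  The point of the flag is that existing applications keep the old
  behaviour bit for bit, while the second registration attempt on the
  same device fails instead of silently mixing the two sets of ops.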

Best regards,
Tiwei Bie


> Regards,
> D.
> 
> > 
> > Is my above understanding correct? Thanks!
> > 
> > Best regards,
> > Tiwei Bie
> > 


Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal

2018-06-26 Thread Thomas Monjalon
26/06/2018 10:22, Tiwei Bie:
> On Mon, Jun 25, 2018 at 08:17:08PM +0800, Stojaczyk, DariuszX wrote:
> > From: Tiwei Bie
> > > 
> > > Hi Dariusz,
> > 
> > Hi Tiwei,
> > 
> > > Thank you for putting efforts in making the DPDK
> > > vhost more generic!
> > > 
> > > From my understanding, your proposal is that:
> > > 
> > > 1) Introduce rte_vhost2 to provide the APIs which
> > >allow users to implement vhost backends like
> > >SCSI, net, crypto, ..
> > 
> > That's right.
> > 
> > > 2) Refactor the existing rte_vhost to use rte_vhost2.
> > >The rte_vhost will still provide below existing
> > >sets of APIs:
> > > 1. The APIs which allow users to implement
> > >external vhost backends (these APIs were
> > >designed for SPDK previously)
> > > 2. The APIs provided by the net backend
> > > 3. The APIs provided by the crypto backend
> > >And above APIs in rte_vhost won't be changed.
> > 
> > That's correct. Rte_vhost would register its own rte_vhost2_tgt_ops 
> > underneath and will call existing vhost_device_ops for e.g. starting the 
> > device once all queues are started.
> 
> Currently I have below concerns and questions:
> 
> - The rte_vhost's problem is still there. Even though
>   rte_vhost2 is introduced, the net and crypto backends
>   in rte_vhost won't benefit from the new callbacks.
> 
>   The existing rte_vhost in DPDK not only provides the
>   APIs for DPDK applications to implement the external
>   backends. But also provides high performance net and
>   crypto backends implementation (maybe more in the
>   future). So it's important that besides the DPDK
>   applications which implement their external backends,
>   the DPDK applications which use the builtin backends
>   will also benefit from the new callbacks.
> 
>   So we should have a clear plan on how will the legacy
>   callbacks in rte_vhost be dealt with in the next step.
> 
>   Besides, the new library's name is a bit misleading.
>   It makes the existing rte_vhost library sound like an
>   obsolete library. But actually the existing rte_vhost
>   isn't an obsolete library. It will still provide the
>   net and crypto backends. So if we want to introduce
>   this new library, we should give it a better name.
> 
> - It's possible to solve rte_vhost's problem you met
>   by refactoring the existing vhost library directly
>   instead of re-implementing a new vhost library from
>   scratch and keeping the old one's problem as is.

+1

>   In this way, it will solve the problem you met and
>   also solve the problem for rte_vhost. Why not go
>   this way? Something like:
> 
>   Below is the existing callbacks set in rte_vhost.h:
> 
>   /**
>* Device and vring operations.
>*/
>   struct vhost_device_ops {
>   ..
>   };
> 
>   It's a legacy implementation, and doesn't really
>   follow the DPDK API design (e.g. no rte_ prefix).
>   We can design and implement a new message handling
>   and a new set of callbacks for rte_vhost to solve
>   the problem you met without changing the old one.
>   Something like:
> 
>   struct rte_vhost_device_ops {
>   ..
>   }
> 
>   int
>   vhost_user_msg_handler(struct vhost_dev *vdev, struct vhost_user_msg *msg)
>   {
>   ..
> 
>   if (!vdev->is_using_new_device_ops) {
>   // Call the existing message handler
>   return vhost_user_msg_handler_legacy(vdev, msg);
>   }
> 
>   // Implement the new logic here
>   ..
>   }
> 
>   A vhost application is allowed to register only struct
>   rte_vhost_device_ops or struct vhost_device_ops (which
>   should be deprecated in the future). The two ops cannot
>   be registered at the same time.
> 
>   The existing applications could use the old ops. And
>   if an application registers struct rte_vhost_device_ops,
>   the new callbacks and message handler will be used.






Re: [dpdk-dev] [RFC] net/ixgbe: fix Tx descriptor status api

2018-06-26 Thread Olivier Matz
Hi Wei,

On Tue, Jun 26, 2018 at 01:38:22AM +, Zhao1, Wei wrote:
> Hi,  Olivier Matz
> 
>  Will you commit fix patch for i40e and ixgbe and em?

If you think the patches are relevant, yes :)

Here is a pre-version (last 5 patches):
http://git.droids-corp.org/?p=dpdk.git;a=shortlog;h=refs/heads/tx-desc

It still needs checkpatch fixes, a few more tests, and a
rebase on next-net.

> And the code " dd = (desc / txq->tx_rs_thresh + 1) * txq->tx_rs_thresh - 1;"
> Is only proper for tx function ixgbe_xmit_pkts_simple  and 
> ixgbe_xmit_pkts_vec ().
> But not proper for ixgbe_xmit_pkts (), the RS bit set rule is different from 
> all these two. 

Can you please give more details?

Note this code, maybe you are talking about this?

+   /* In full featured mode, RS bit is only set in the last descriptor */
+   /* of a multisegments packet */
+   if (!((txq->offloads == 0) &&
+ (txq->tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST)))
+   dd = txq->sw_ring[dd].last_id;

Maybe there is something better to test?

Just to make sure we are on the same page, here is some more information.

===

- sw advances the tail pointer
- hw advances the head pointer
- the software populates the ring with full buffers to be sent by
  the hw
- head points to the in-progress descriptor.
- sw writes new descriptors at tail
- head == tail means that the transmit queue is empty
- when the hw has processed a descriptor, it sets the DD bit if
  the descriptor has the RS (report status) bit.
- the driver never reads the head (it needs a PCI transaction), instead
  it monitors the DD bit of a descriptor that has the RS bit

txq->tx_tail: sw value for tail register
txq->tx_free_thresh: free buffers if count(free descs) < this value
txq->tx_rs_thresh: RS bit is set every rs_thresh descriptors
txq->tx_next_dd: next desc to scan for DD bit
txq->tx_next_rs: next desc to set RS bit
txq->last_desc_cleaned: last descriptor that has been cleaned
txq->nb_tx_free: number of free descriptors

Example:

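|----------------------------------------------------------------|
|               D       R       R       R                        |
|         ...........xxxxxxxxxxxxxxxxxxxxxxxxx                   |
|                    <- descs not sent yet  ->                   |
|         ...........xxxxxxxxxxxxxxxxxxxxxxxxx                   |
|----------------------------------------------------------------|
         ^last_desc_cleaned=8                   ^next_rs=47
                ^next_dd=15                   ^sw_tail=45
                     ^hw_head=20

                     <------  nb_used  ------>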

The hardware is currently processing descriptor 20.
'R' means the descriptor has the RS bit
'D' means the descriptor has the DD + RS bits
'x' are packets in txq (not sent)
'.' are packets already sent but not freed by sw

In this example, we have rs_thresh=8. On the next call to
ixgbe_tx_free_bufs(), some buffers will be freed.

===

Let's call ixgbe_dev_tx_descriptor_status(10):


- original version:

desc = 45 + 10 = 55
desc = ((55 + 8 - 1) / 8) * 8 = (62 / 8) * 8 = 56

wrong because it goes in the wrong direction, and because
56 does not have the RS bit

- after your patch:

desc = 45 + 10 = 55
desc = (((55 / 8) + 1) * 8) - 1 = (7 * 8) - 1 = 55

wrong because it goes in the wrong direction

- after my patch

desc = 45 - 10 - 1 = 34
desc = (((34 / 8) + 1) * 8) - 1 = (5 * 8) - 1 = 39

looks correct
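
The three computations above can be replayed with a small standalone
sketch. The function names are made up for illustration, ring wrap-around
is deliberately ignored, and only the arithmetic comes from this mail:

```c
#include <assert.h>
#include <stdint.h>

/* Original code: add the offset to the tail, round up to a multiple of rs_thresh. */
uint16_t
dd_index_original(uint16_t tail, uint16_t offset, uint16_t rs_thresh)
{
	uint16_t desc = (uint16_t)(tail + offset);

	return (uint16_t)(((desc + rs_thresh - 1) / rs_thresh) * rs_thresh);
}

/* Second variant: same direction, rounded to the last descriptor of the RS group. */
uint16_t
dd_index_forward(uint16_t tail, uint16_t offset, uint16_t rs_thresh)
{
	uint16_t desc = (uint16_t)(tail + offset);

	return (uint16_t)((desc / rs_thresh + 1) * rs_thresh - 1);
}

/* Third variant: walk backwards from the tail, then round the same way. */
uint16_t
dd_index_backward(uint16_t tail, uint16_t offset, uint16_t rs_thresh)
{
	uint16_t desc = (uint16_t)(tail - offset - 1);

	return (uint16_t)((desc / rs_thresh + 1) * rs_thresh - 1);
}

/* Replays the worked example: tail=45, offset=10, rs_thresh=8. */
void
check_worked_example(void)
{
	assert(dd_index_original(45, 10, 8) == 56);
	assert(dd_index_forward(45, 10, 8) == 55);
	assert(dd_index_backward(45, 10, 8) == 39);
}
```

Only the third result (39) lands on a descriptor that carries the RS bit
and lies between hw_head and sw_tail in the example ring.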



Regards,
Olivier


Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal

2018-06-26 Thread Stojaczyk, DariuszX


> -Original Message-
> From: Bie, Tiwei
> Sent: Tuesday, June 26, 2018 10:22 AM
> To: Stojaczyk, DariuszX 
> Cc: Dariusz Stojaczyk ; dev@dpdk.org; Maxime
> Coquelin ; Tetsuya Mukawa
> ; Stefan Hajnoczi ; Thomas
> Monjalon ; y...@fridaylinux.org; Harris, James R
> ; Kulasek, TomaszX ;
> Wodkowski, PawelX 
> Subject: Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal
> 
> On Mon, Jun 25, 2018 at 08:17:08PM +0800, Stojaczyk, DariuszX wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Tiwei Bie
> > > Sent: Monday, June 25, 2018 1:02 PM
> > > 
> > >
> > > Hi Dariusz,
> > >
> >
> > Hi Tiwei,
> >
> > > Thank you for putting efforts in making the DPDK
> > > vhost more generic!
> > >
> > > From my understanding, your proposal is that:
> > >
> > > 1) Introduce rte_vhost2 to provide the APIs which
> > >allow users to implement vhost backends like
> > >SCSI, net, crypto, ..
> > >
> >
> > That's right.
> >
> > > 2) Refactor the existing rte_vhost to use rte_vhost2.
> > >The rte_vhost will still provide below existing
> > >sets of APIs:
> > > 1. The APIs which allow users to implement
> > >external vhost backends (these APIs were
> > >designed for SPDK previously)
> > > 2. The APIs provided by the net backend
> > > 3. The APIs provided by the crypto backend
> > >And above APIs in rte_vhost won't be changed.
> >
> > That's correct. Rte_vhost would register its own rte_vhost2_tgt_ops
> underneath and will call existing vhost_device_ops for e.g. starting the 
> device
> once all queues are started.
> 
> Currently I have below concerns and questions:
> 
> - The rte_vhost's problem is still there. Even though
>   rte_vhost2 is introduced, the net and crypto backends
>   in rte_vhost won't benefit from the new callbacks.
> 
>   The existing rte_vhost in DPDK not only provides the
>   APIs for DPDK applications to implement the external
>   backends. But also provides high performance net and
>   crypto backends implementation (maybe more in the
>   future). So it's important that besides the DPDK
>   applications which implement their external backends,
>   the DPDK applications which use the builtin backends
>   will also benefit from the new callbacks.
> 
>   So we should have a clear plan on how will the legacy
>   callbacks in rte_vhost be dealt with in the next step.
> 
>   Besides, the new library's name is a bit misleading.
>   It makes the existing rte_vhost library sound like an
>   obsolete library. But actually the existing rte_vhost
>   isn't an obsolete library. It will still provide the
>   net and crypto backends. So if we want to introduce
>   this new library, we should give it a better name.
> 
> - It's possible to solve rte_vhost's problem you met
>   by refactoring the existing vhost library directly
>   instead of re-implementing a new vhost library from
>   scratch and keeping the old one's problem as is.
> 
>   In this way, it will solve the problem you met and
>   also solve the problem for rte_vhost. Why not go
>   this way? Something like:
> 
>   Below is the existing callbacks set in rte_vhost.h:
> 
>   /**
>* Device and vring operations.
>*/
>   struct vhost_device_ops {
>   ..
>   };
> 
>   It's a legacy implementation, and doesn't really
>   follow the DPDK API design (e.g. no rte_ prefix).
>   We can design and implement a new message handling
>   and a new set of callbacks for rte_vhost to solve
>   the problem you met without changing the old one.
>   Something like:
> 
>   struct rte_vhost_device_ops {
>   ..
>   }
> 
>   int
>   vhost_user_msg_handler(struct vhost_dev *vdev, struct vhost_user_msg
> *msg)
>   {
>   ..
> 
>   if (!vdev->is_using_new_device_ops) {
>   // Call the existing message handler
>   return vhost_user_msg_handler_legacy(vdev, msg);
>   }
> 
>   // Implement the new logic here
>   ..
>   }
> 
>   A vhost application is allowed to register only struct
>   rte_vhost_device_ops or struct vhost_device_ops (which
>   should be deprecated in the future). The two ops cannot
>   be registered at the same time.
> 
>   The existing applications could use the old ops. And
>   if an application registers struct rte_vhost_device_ops,
>   the new callbacks and message handler will be used.

Please note that some features, like vIOMMU, are not even part of the public
rte_vhost API. Only vhost-net benefits from vIOMMU right now. Separating
vhost-net from a generic vhost library (rte_vhost2) would avoid making such
design mistakes in the future. What's the point of having a single rte_vhost
library if some vhost-user features are only implemented for vhost-net?

> 
> Best regards,
> Tiwei Bie
> 
> 
> > Regards,
> > D.
> >
> > >
> > > Is my above understanding correct? Thanks!
> > >
> > > Best regards,
> > > Tiwei Bie
> > >


[dpdk-dev] [RFC v2] ethdev: add flow metadata

2018-06-26 Thread Xueming Li
Currently, rte_flow patterns only match packet header fields.
This patch adds additional data (metadata) to match against.

For example, in the egress direction, to perform an action depending
on the VM id, the application needs to configure an rte_flow rule with
the new metadata pattern:
pattern meta data is {vm} / end action encap …
The PMD then sends the VM id to the NIC as metadata associated with
the mbuf, the egress flow on the NIC matches the metadata like any
other regular packet header, and the appropriate encapsulation is done
according to the VM id metadata.

Metadata could be used on ingress as well, to save useful info before
a flow modification (not defined yet) or decapsulation action. The PMD
is responsible for saving the metadata into an mbuf field. The
application must get the metadata from the mbuf.

Cc: Thomas Monjalon 
Cc: Olivier Matz 
Cc: Shahaf Shuler 
Signed-off-by: Xueming Li 
---
 doc/guides/prog_guide/rte_flow.rst |  7 +++
 lib/librte_ethdev/rte_flow.c   |  1 +
 lib/librte_ethdev/rte_flow.h   | 28 
 3 files changed, 36 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b305a72a5..7989e5856 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1191,6 +1191,13 @@ Normally preceded by any of:
 - `Item: ICMP6_ND_NS`_
 - `Item: ICMP6_ND_OPT`_
 
+Item: ``META``
+^^^^^^^^^^^^^^
+
+Matches a metadata variable.
+
+- ``data``: 64 bit value.
+
 Actions
 ~~~~~~~
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index b2afba089..54a07dd4a 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -66,6 +66,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 sizeof(struct rte_flow_item_icmp6_nd_opt_sla_eth)),
MK_FLOW_ITEM(ICMP6_ND_OPT_TLA_ETH,
 sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
+   MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f8ba71cdb..61b3dbe7b 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -413,6 +413,18 @@ enum rte_flow_item_type {
 * See struct rte_flow_item_mark.
 */
RTE_FLOW_ITEM_TYPE_MARK,
+
+   /**
+* Matches a metadata variable.
+*
+* Possible sources of metadata:
+* - mbuf udata64 field on egress
+* - egress metadata loopback to ingress
+* - data copied from packet header
+*
+* See struct rte_flow_item_meta.
+*/
+   RTE_FLOW_ITEM_TYPE_META,
 };
 
 /**
@@ -849,6 +861,22 @@ static const struct rte_flow_item_gre rte_flow_item_gre_mask = {
 #endif
 
 /**
+ * RTE_FLOW_ITEM_TYPE_META.
+ *
+ * Matches a meta data header.
+ */
+struct rte_flow_item_meta {
+   uint64_t data;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_META. */
+#ifndef __cplusplus
+static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
+   .data = RTE_BE64(UINT64_MAX),
+};
+#endif
+
+/**
  * RTE_FLOW_ITEM_TYPE_FUZZY
  *
  * Fuzzy pattern match, expect faster than default.
-- 
2.13.3
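
As a rough illustration of how the META item from the patch above could be
evaluated, here is a standalone sketch. It does not use the real rte_flow.h:
the types are local stand-ins modelled on the patch, meta_item_matches() is a
hypothetical helper (a real PMD would program the NIC instead), byte order is
ignored, and treating a NULL spec as "match anything" is an assumption based
on the usual rte_flow convention.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins modelled on the patch; not the real rte_flow.h definitions. */
enum flow_item_type { FLOW_ITEM_TYPE_END, FLOW_ITEM_TYPE_META };

struct flow_item_meta {
	uint64_t data;
};

struct flow_item {
	enum flow_item_type type;
	const void *spec;	/* value to match, NULL matches anything */
	const void *mask;	/* significant bits, NULL means default mask */
};

/* Default mask as in the patch: every metadata bit is significant. */
static const struct flow_item_meta flow_item_meta_mask = {
	.data = UINT64_MAX,
};

/* Hypothetical matcher: compare the mbuf metadata with spec under mask. */
bool
meta_item_matches(const struct flow_item *item, uint64_t mbuf_meta)
{
	const struct flow_item_meta *spec = item->spec;
	const struct flow_item_meta *mask =
		item->mask != NULL ? item->mask : &flow_item_meta_mask;

	if (item->type != FLOW_ITEM_TYPE_META)
		return false;
	if (spec == NULL)
		return true;
	return (mbuf_meta & mask->data) == (spec->data & mask->data);
}
```

In the VM id example from the commit message, spec->data would hold the VM id
and the default mask would require an exact 64-bit match.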



Re: [dpdk-dev] [PATCH v2 22/22] app/testpmd: rework softnic forward mode

2018-06-26 Thread Iremonger, Bernard
Hi Jasvinder

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jasvinder Singh
> Sent: Friday, June 15, 2018 5:52 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian ; Pattan, Reshma
> 
> Subject: [dpdk-dev] [PATCH v2 22/22] app/testpmd: rework softnic forward
> mode
> 
> Modied the testpmd softnic forwarding mode as per the changes in softnic PMD.
> 
> To run testpmd application with softnic fwd mode, following command is used;
> 
> $ ./testpmd -c 0xc -n 4 --vdev 'net_softnic0,firware=script.cli'
>   -- -i --forward-mode=softnic
> 
> Signed-off-by: Jasvinder Singh 
> Signed-off-by: Reshma Pattan 
> ---
>  app/test-pmd/Makefile   |   4 +-
>  app/test-pmd/cmdline.c  |  53 -
>  app/test-pmd/config.c   |  55 +
>  app/test-pmd/{tm.c => softnicfwd.c} | 418 
> 
>  app/test-pmd/testpmd.c  |  27 ++-
>  app/test-pmd/testpmd.h  |  44 +---
>  6 files changed, 256 insertions(+), 345 deletions(-)  rename 
> app/test-pmd/{tm.c
> => softnicfwd.c} (61%)
> 


This patch fails to compile when applied to the current DPDK 18.08 master.

/root/dpdk_sforge_2/app/test-pmd/cmdline.c: In function 'prompt':
/root/dpdk_sforge_2/app/test-pmd/cmdline.c:17583:3: error: implicit declaration of function 'rte_pmd_softnic_manage' [-Werror=implicit-function-declaration]
   rte_pmd_softnic_manage(softnic_portid);
   ^
/root/dpdk_sforge_2/app/test-pmd/cmdline.c:17583:3: error: nested extern declaration of 'rte_pmd_softnic_manage' [-Werror=nested-externs]


It is also giving the following checkpatch errors and warnings:

WARNING: 'firware' may be misspelled - perhaps 'firmware'?
#24: 
$ ./testpmd -c 0xc -n 4 --vdev 'net_softnic0,firware=script.cli'

WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#33: 
 app/test-pmd/{tm.c => softnicfwd.c} | 418 

WARNING: Missing a blank line after declarations
#107: FILE: app/test-pmd/cmdline.c:17554:
+   uint8_t softnic_enable = 0;
+   if (strcmp(cur_fwd_eng->fwd_mode_name, "softnic") == 0) {

WARNING: line over 80 characters
#110: FILE: app/test-pmd/cmdline.c:17557:
+   if (strcmp(port->dev_info.driver_name, "net_softnic") == 0) {

WARNING: line over 80 characters
#163: FILE: app/test-pmd/config.c:2346:
+   if (strcmp(port->dev_info.driver_name, "net_softnic") == 0) {

ERROR: spaces required around that '=' (ctx:WxV)
#165: FILE: app/test-pmd/config.c:2348:
+   softnic_enable =1;
   ^

WARNING: line over 80 characters
#171: FILE: app/test-pmd/config.c:2354:
+   printf("Softnicfwd mode configuration not complete(%s)!\n", __func__);

ERROR: space required before the open parenthesis '('
#208: FILE: app/test-pmd/config.c:2392:
+   if(strcmp(cur_fwd_eng->fwd_mode_name, "softnic") == 0) {

ERROR: space required before the open parenthesis '('
#437: FILE: app/test-pmd/softnicfwd.c:150:
+   for(;;) {

WARNING: void function return statements are not generally useful
#445: FILE: app/test-pmd/softnicfwd.c:158:
+   return;
+}

ERROR: open brace '{' following function definitions go on the next line
#449: FILE: app/test-pmd/softnicfwd.c:162:
+static int
+softnic_begin(void *arg __rte_unused) {

ERROR: space required before the open parenthesis '('
#456: FILE: app/test-pmd/softnicfwd.c:169:
+   } while(!softnic_fwd_lcore->stopped);

WARNING: void function return statements are not generally useful
#713: FILE: app/test-pmd/softnicfwd.c:683:
+   return;
+}

WARNING: adding a line without newline at end of file
#722: FILE: app/test-pmd/softnicfwd.c:690:
+};

WARNING: line over 80 characters
#749: FILE: app/test-pmd/testpmd.c:823:
+   if (strcmp(port->dev_info.driver_name, "net_softnic") == 0)

total: 5 errors, 10 warnings, 768 lines checked

Regards,

Bernard



Re: [dpdk-dev] [PATCH v2 22/22] app/testpmd: rework softnic forward mode

2018-06-26 Thread Singh, Jasvinder
Hi Bernard,



> This patch fails to compile when applied the current dpdk 18_08 master.
> 
> /root/dpdk_sforge_2/app/test-pmd/cmdline.c: In function 'prompt':
> /root/dpdk_sforge_2/app/test-pmd/cmdline.c:17583:3: error: implicit
> declaration of function 'rte_pmd_softnic_manage' [-Werror=implicit-function-
> declaration]
>rte_pmd_softnic_manage(softnic_portid);
>^
> /root/dpdk_sforge_2/app/test-pmd/cmdline.c:17583:3: error: nested extern
> declaration of 'rte_pmd_softnic_manage' [-Werror=nested-externs]
> 
> 
> It is also giving the following checkpatch errors and warnings:
> 
> WARNING: 'firware' may be misspelled - perhaps 'firmware'?
> #24:
> $ ./testpmd -c 0xc -n 4 --vdev 'net_softnic0,firware=script.cli'
> 
> WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
> #33:
>  app/test-pmd/{tm.c => softnicfwd.c} | 418 
> 
> 
> WARNING: Missing a blank line after declarations
> #107: FILE: app/test-pmd/cmdline.c:17554:
> +   uint8_t softnic_enable = 0;
> +   if (strcmp(cur_fwd_eng->fwd_mode_name, "softnic") == 0) {
> 
> WARNING: line over 80 characters
> #110: FILE: app/test-pmd/cmdline.c:17557:
> +   if (strcmp(port->dev_info.driver_name,
> + "net_softnic") == 0) {
> 
> WARNING: line over 80 characters
> #163: FILE: app/test-pmd/config.c:2346:
> +   if (strcmp(port->dev_info.driver_name,
> + "net_softnic") == 0) {
> 
> ERROR: spaces required around that '=' (ctx:WxV)
> #165: FILE: app/test-pmd/config.c:2348:
> +   softnic_enable =1;
>^
> 
> WARNING: line over 80 characters
> #171: FILE: app/test-pmd/config.c:2354:
> +   printf("Softnicfwd mode configuration not
> + complete(%s)!\n", __func__);
> 
> ERROR: space required before the open parenthesis '('
> #208: FILE: app/test-pmd/config.c:2392:
> +   if(strcmp(cur_fwd_eng->fwd_mode_name, "softnic") == 0) {
> 
> ERROR: space required before the open parenthesis '('
> #437: FILE: app/test-pmd/softnicfwd.c:150:
> +   for(;;) {
> 
> WARNING: void function return statements are not generally useful
> #445: FILE: app/test-pmd/softnicfwd.c:158:
> +   return;
> +}
> 
> ERROR: open brace '{' following function definitions go on the next line
> #449: FILE: app/test-pmd/softnicfwd.c:162:
> +static int
> +softnic_begin(void *arg __rte_unused) {
> 
> ERROR: space required before the open parenthesis '('
> #456: FILE: app/test-pmd/softnicfwd.c:169:
> +   } while(!softnic_fwd_lcore->stopped);
> 
> WARNING: void function return statements are not generally useful
> #713: FILE: app/test-pmd/softnicfwd.c:683:
> +   return;
> +}
> 
> WARNING: adding a line without newline at end of file
> #722: FILE: app/test-pmd/softnicfwd.c:690:
> +};
> 
> WARNING: line over 80 characters
> #749: FILE: app/test-pmd/testpmd.c:823:
> +   if (strcmp(port->dev_info.driver_name,
> + "net_softnic") == 0)
> 
> total: 5 errors, 10 warnings, 768 lines checked
> 
> Regards,
> 
> Bernard

I am about to send v3, which will address the above issues. Thank you.

Jasvinder 


Re: [dpdk-dev] [PATCH] net/virtio-user: add unsupported features mask

2018-06-26 Thread Liu, Yong


> -Original Message-
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Tuesday, June 26, 2018 4:08 PM
> To: Liu, Yong ; Bie, Tiwei 
> Cc: Wang, Zhihong ; dev@dpdk.org
> Subject: Re: [PATCH] net/virtio-user: add unsupported features mask
> 
> 
> 
> On 06/25/2018 03:10 PM, Marvin Liu wrote:
> > This patch introduces unsupported features mask for virtio-user device.
> > For virtio-user server mode, when reconnecting virtio-user will
> > retrieve vhost devcie features as base and then unmask unsupported
> s/devcie/device/
> > features.
> 
> I am not sure to understand why you are doing it like this.
> 
> Shouldn't you just:
>   1. Don't advertise features you don't want to support
>   2. In server mode, save the negotiated features, and re-use it when
>  reconnect happens?
> 
Maxime,
I think our vhost reconnect design follows the qemu vhost-user server mode.
Virtio-user will try to support whichever vhost device is connected, so the
device_features of virtio-user are simply retrieved from the vhost device.
In server mode, we record the previous feature bits and use them for later
negotiation. But the virtio-user device_features may have been changed by
vdev parameters. This mask guarantees that device_features is correct.
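
A minimal sketch of the masking idea described above, not the actual
virtio-user code. The feature bits, UNSUPPORTED_FEATURES_MASK and
effective_features() are hypothetical names chosen for illustration only:

```c
#include <stdint.h>

/* Hypothetical feature bit positions, for illustration only. */
#define FEATURE_A (1ULL << 0)
#define FEATURE_B (1ULL << 1)
#define FEATURE_C (1ULL << 2)

/* Assumed set of features the frontend cannot handle. */
#define UNSUPPORTED_FEATURES_MASK (FEATURE_C)

/*
 * Start from what the vhost backend offers, clear the bits the
 * frontend does not support, then clear anything the user disabled
 * through vdev parameters.
 */
static uint64_t
effective_features(uint64_t vhost_features, uint64_t disabled_by_vdev_args)
{
	uint64_t features = vhost_features;

	features &= ~UNSUPPORTED_FEATURES_MASK;
	features &= ~disabled_by_vdev_args;
	return features;
}
```

Applying the same function after every reconnect would give the same
result as long as the backend offers the same bits, which is the property
the mask is meant to provide.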

Thanks,
Marvin

> Also, I find "unmask" a bit misleading, why not something like "unsupp"
> or "unsupported"?
> 
> Thanks,
> Maxime


Re: [dpdk-dev] [PATCH v4 05/24] eal: support mp task be invoked in a separate task

2018-06-26 Thread Burakov, Anatoly

On 26-Jun-18 8:08 AM, Qi Zhang wrote:

Sync IPC can't be invoked from within the mp handler itself, as that would
cause a deadlock. This patch introduces a new API, rte_eal_mp_task_add, that
lets an mp handler be delegated to a separate task.

Signed-off-by: Qi Zhang 
---


I would really like to find another solution to this problem. Creating a 
new thread per hotplug request seems like an overkill - even more so 
than having two threads. Creating a new thread potentially while the 
application is working may have other implications (e.g. there's a 
non-zero amount of time between thread created and thread affinitized, 
which may disrupt hotpaths).


It seems to me that the better solution would've been to leave the IPC 
thread in place. There are two IPC threads in the first place because 
there was a circular dependency between rte_malloc and alarm API. My 
patch fixes that - so how about we remove *one* IPC thread, but leave 
the other one in place?


Thomas, any thoughts? (quick description - hotplug needs IPC, and 
hotplug may need to allocate memory, which also needs IPC, which will 
cause a deadlock if IPC is one thread)
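
The deadlock described above can be illustrated with a small
single-threaded model. This is only a sketch under stated assumptions:
struct ipc_model and every model_* / handler_* name below are hypothetical
and are not DPDK APIs. Running the deferred tasks after the handler
returns stands in for running them in a different thread or context:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*task_fn)(void *arg);

/* Hypothetical model of a single IPC thread. */
struct ipc_model {
	bool in_handler;                /* true while a handler is running */
	task_fn deferred[MAX_DEFERRED]; /* work postponed past the handler */
	void *deferred_arg[MAX_DEFERRED];
	size_t n_deferred;
	int sync_ok;                    /* sync requests that could complete */
	int sync_would_deadlock;        /* sync requests that would hang */
};

/*
 * A sync request needs the IPC thread to receive the reply. If the
 * caller is the IPC thread itself, nobody is left to receive it, so
 * the model refuses the request instead of blocking forever.
 */
static int
model_sync_request(struct ipc_model *m)
{
	if (m->in_handler) {
		m->sync_would_deadlock++;
		return -1;
	}
	m->sync_ok++;
	return 0;
}

/* Queue work to be run once the current handler has returned. */
static int
model_defer(struct ipc_model *m, task_fn fn, void *arg)
{
	if (m->n_deferred == MAX_DEFERRED)
		return -1;
	m->deferred[m->n_deferred] = fn;
	m->deferred_arg[m->n_deferred] = arg;
	m->n_deferred++;
	return 0;
}

/* Run one handler, then run the deferred tasks outside of it. */
static void
model_dispatch(struct ipc_model *m, task_fn handler, void *arg)
{
	size_t i;

	m->in_handler = true;
	handler(arg);
	m->in_handler = false;

	for (i = 0; i < m->n_deferred; i++)
		m->deferred[i](m->deferred_arg[i]);
	m->n_deferred = 0;
}

/* Hotplug-like work: it may allocate memory and so may need sync IPC. */
static void
attach_device(void *arg)
{
	model_sync_request(arg);
}

/* Doing the work inline hits the limitation. */
static void
handler_inline(void *arg)
{
	attach_device(arg);
}

/* Delegating the work avoids it. */
static void
handler_deferred(void *arg)
{
	model_defer(arg, attach_device, arg);
}
```

In this model, handler_inline() triggers the refused request, while
handler_deferred() lets the same work complete once the handler has
returned. Which real context should run the deferred work (a kept IPC
thread, the interrupt thread, or a new thread) is exactly the open
question in this thread.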


--
Thanks,
Anatoly


[dpdk-dev] [PATCH v2] kni: fix build with gcc 8.1

2018-06-26 Thread Ferruh Yigit
Error observed when CONFIG_RTE_KNI_KMOD_ETHTOOL config option is
enabled.

build error:
In function ‘strncpy’,
inlined from ‘igb_get_drvinfo’ at
.../dpdk/build/build/kernel/linux/kni/igb_ethtool.c:814:2:
.../include/linux/string.h:246:9: error: ‘__builtin_strncpy’ output
may be truncated copying 31 bytes from a string of length 42
[-Werror=stringop-truncation]
  return __builtin_strncpy(p, q, size);
   ^

Fixed by using strlcpy instead of strncpy.

adapter->fw_version size kept same because of
c3698192940c ("kni: fix build with gcc 7.1")

The strncpy usage on the next line is also replaced with strlcpy while at it.

Fixes: c3698192940c ("kni: fix build with gcc 7.1")
Cc: sta...@dpdk.org

Signed-off-by: Ferruh Yigit 
---
v2:
* used strlcpy instead of strncpy
* Updated strncpy usage in next line to strlcpy too
* Added fixes line
---
 kernel/linux/kni/ethtool/igb/igb_ethtool.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/linux/kni/ethtool/igb/igb_ethtool.c
b/kernel/linux/kni/ethtool/igb/igb_ethtool.c
index 064528bcf..002f75c48 100644
--- a/kernel/linux/kni/ethtool/igb/igb_ethtool.c
+++ b/kernel/linux/kni/ethtool/igb/igb_ethtool.c
@@ -811,9 +811,10 @@ static void igb_get_drvinfo(struct net_device *netdev,
    strncpy(drvinfo->driver,  igb_driver_name, sizeof(drvinfo->driver) - 1);
    strncpy(drvinfo->version, igb_driver_version, sizeof(drvinfo->version) - 1);
-   strncpy(drvinfo->fw_version, adapter->fw_version,
-   sizeof(drvinfo->fw_version) - 1);
-   strncpy(drvinfo->bus_info, pci_name(adapter->pdev), sizeof(drvinfo->bus_info) -1);
+   strlcpy(drvinfo->fw_version, adapter->fw_version,
+   sizeof(drvinfo->fw_version));
+   strlcpy(drvinfo->bus_info, pci_name(adapter->pdev),
+   sizeof(drvinfo->bus_info));
    drvinfo->n_stats = IGB_STATS_LEN;
    drvinfo->testinfo_len = IGB_TEST_LEN;
    drvinfo->regdump_len = igb_get_regs_len(netdev);
-- 
2.17.1
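
For readers unfamiliar with the difference between the two calls, here is
a minimal userspace sketch of strlcpy-style semantics. sketch_strlcpy() is
a hypothetical stand-in written for illustration, not the kernel
implementation: it always NUL-terminates the destination when size is
non-zero and returns the length of the source, so truncation can be
detected by the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative stand-in for strlcpy(). It is named differently to
 * avoid clashing with a libc that already provides strlcpy().
 */
static size_t
sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		/* Copy at most size - 1 bytes and always terminate. */
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	/* A return value >= size means the copy was truncated. */
	return len;
}
```

By contrast, strncpy() leaves the destination unterminated when the source
is at least as long as the given size, which is why the old code passed
sizeof(...) - 1.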



Re: [dpdk-dev] [PATCH] net/nfp: use generic PCI config access functions

2018-06-26 Thread Ferruh Yigit
On 6/18/2018 9:06 PM, Alejandro Lucero wrote:
> This patch avoids direct access to device config sysfs file using
> rte_pci_read_config instead.
> 
> Apart from replicating code, it turns out this direct access does
> not always work if non-root users execute DPDK apps. In those cases
> it is mandatory to go through VFIO specific function for reading pci
> config space.
> 
> Signed-off-by: Alejandro Lucero 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [DPDK] examples/ipsec-secgw: fix use of unsupported RSS offloads

2018-06-26 Thread Ferruh Yigit
On 6/22/2018 2:27 PM, Remy Horton wrote:
> Since commit aa1a6d87f15d ("ethdev: force RSS offload rules again")
> a check that requested RSS offloads are supported by a PMD is
> enforced, whereas in the past asking for unsupported offloads would
> not result in an error. This patch changes the IPSec gateway sample
> so that it only requests modes that are supported rather than
> failing to start up.
> 
> Fixes: d299106e8e31 ("examples/ipsec-secgw: add IPsec sample application")
> 
> Signed-off-by: Remy Horton 

Hi Remy,

Does the following patch cover this one?
https://patches.dpdk.org/patch/41313/

> ---
>  examples/ipsec-secgw/ipsec-secgw.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
> b/examples/ipsec-secgw/ipsec-secgw.c
> index a5da8b2..d247d5f 100644
> --- a/examples/ipsec-secgw/ipsec-secgw.c
> +++ b/examples/ipsec-secgw/ipsec-secgw.c
> @@ -1566,6 +1566,11 @@ port_init(uint16_t portid)
>   if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE)
>   local_port_conf.txmode.offloads |=
>   DEV_TX_OFFLOAD_MBUF_FAST_FREE;
> +
> + /* Only request RSS offloads the NIC supports. */
> + local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
> + dev_info.flow_type_rss_offloads;
> +
>   ret = rte_eth_dev_configure(portid, nb_rx_queue, nb_tx_queue,
>   &local_port_conf);
>   if (ret < 0)
> 



[dpdk-dev] Reviewathon

2018-06-26 Thread Ferruh Yigit
On 6/22/2018 1:13 PM, dev-boun...@dpdk.org wrote:
> DPDK Release Status Meeting 21/07/2018
> ==
> 
> Minutes from the weekly DPDK Release Status Meeting.

<...>

> Reviewathon
> ---
> 
> * We plan a Community Reviewathon, next Tuesday 26th June.
> * Community will co-ordinate on IRC.
> * Look for announcement of dev mailing list.

As mentioned above this is planned for Tuesday 26th June.

The target is to consume the patch backlog [1] as a community effort.


One thing we can think of is for everybody attending the review event to join
the dpdk irc channel [2], which makes it easier to communicate and reduces
communication-related delay.

More comments are welcome for a more efficient review event.


[1]
https://patches.dpdk.org/project/dpdk/list/


[2]
IRC freenode #dpdk-board
web client: https://webchat.freenode.net/


Re: [dpdk-dev] [PATCH] maintainers: change maintainership

2018-06-26 Thread Ferruh Yigit
On 6/22/2018 10:13 AM, Helin Zhang wrote:
> Xiaoyun Li has agreed to take over the maintainership of example
> application tep_termination, as Jijiang Liu is no longer working
> on that.
> 
> Signed-off-by: Helin Zhang 

Acked-by: Ferruh Yigit 

Thanks Xiaoyun for volunteering.


Re: [dpdk-dev] [RFC 9/9] usertools/lib: add GRUB utility library for hugepage config

2018-06-26 Thread Burakov, Anatoly

On 26-Jun-18 2:09 AM, Kevin Wilson wrote:

Hi, Anatoly,

Thanks for these patches, good work.
Regarding "update-grub": IIRC, this is Ubuntu specific command (and
also used in Debian/Debian based flavors).
In Fedora (RedHat based) recent distros, you use grub2-mkconfig
instead (and there is no "update-grub", IIRC).
If this is true, I would consider adding a comment in the commit log
saying something like "it is for Ubuntu/Debian-based distros".
So maybe in the future someone will add the python code which detects
the OS (using lsb-release, etc) and, in case it is Fedora/RedHat
distro, the grub2-mkconfig util
will be invoked instead.
Regards,
KW



Hi Kevin,

It wasn't intended to be Ubuntu-specific - that's just what I developed 
this patchset on :) Of course, if we get to a v1 stage, this will be 
properly implemented to work on all distributions supported by DPDK.


Thanks for your feedback!

--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH v2 3/3] net/pcap: support pcap files and ifaces mix

2018-06-26 Thread Ferruh Yigit
On 6/22/2018 8:15 AM, Ido Goshen wrote:
> 
> 
>> -Original Message-
>> From: Ferruh Yigit 
>> Sent: Thursday, June 21, 2018 3:51 PM
>> To: Ido Goshen 
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v2 3/3] net/pcap: support pcap files and ifaces mix
>>
>> On 6/21/2018 1:24 PM, ido goshen wrote:
>>> Suggested-by: Ferruh Yigit 
>>>
>>> Signed-off-by: ido goshen 
>>
>> <...>
>>
>>> +static uint16_t
>>> +eth_pcap_tx_mux(void *queue, struct rte_mbuf **bufs, uint16_t
>>> +nb_pkts) {
>>> +   struct pcap_tx_queue *tx_queue = queue;
>>> +   if (tx_queue->dumper)
>>> +   return eth_pcap_tx_dumper(queue, bufs, nb_pkts);
>>> +   else
>>> +   return eth_pcap_tx(queue, bufs, nb_pkts); }
>>> +
>>>  /*
>>>   * pcap_open_live wrapper function
>>>   */
>>> @@ -773,6 +783,31 @@ struct pmd_devargs {
>>> return open_iface(key, value, extra_args);  }
>>>
>>> +static int
>>> +open_pcap_rx_mux(const char *key, const char *value, void
>>> +*extra_args) {
>>> +   struct pmd_devargs *pcaps = extra_args;
>>
>> Do we need this assignment? Why not pass extra_args directly?
> 
> [idog] Correct, it can be passed directly
> other option is to leave the assignment here and pass strong type to the 
> internal open_rx_pcap/iface 
> instead of passing it as void*
> Any preference?
> 
>>
>>> +
>>> +   if (strcmp(key, ETH_PCAP_RX_PCAP_ARG) == 0)
>>> +   return open_rx_pcap(key, value, pcaps);
>>> +   if (strcmp(key, ETH_PCAP_RX_IFACE_ARG) == 0)
>>> +   return open_rx_iface(key, value, pcaps);
>>> +   return 0;
>>> +}
>>> +
>>> +static int
>>> +open_pcap_tx_mux(const char *key, const char *value, void
>>> +*extra_args) {
>>> +   struct pmd_devargs *dumpers = extra_args;
>>
>> Do we need this assignment? Why not pass extra_args directly?
>>
>>> +
>>> +   if (strcmp(key, ETH_PCAP_TX_PCAP_ARG) == 0)
>>> +   return open_tx_pcap(key, value, dumpers);
>>> +   if (strcmp(key, ETH_PCAP_TX_IFACE_ARG) == 0)
>>> +   return open_tx_iface(key, value, dumpers);
>>> +   return 0;
>>> +}
>>> +
>>> +
>>>  static struct rte_vdev_driver pmd_pcap_drv;
>>>
>>>  static int
>>> @@ -873,8 +908,7 @@ struct pmd_devargs {  eth_from_pcaps(struct
>>> rte_vdev_device *vdev,
>>> struct pmd_devargs *rx_queues, const unsigned int
>> nb_rx_queues,
>>> struct pmd_devargs *tx_queues, const unsigned int
>> nb_tx_queues,
>>> -   struct rte_kvargs *kvlist, int single_iface,
>>> -   unsigned int using_dumpers)
>>> +   struct rte_kvargs *kvlist, int single_iface)
>>>  {
>>> struct pmd_internals *internals = NULL;
>>> struct rte_eth_dev *eth_dev = NULL;
>>> @@ -891,10 +925,7 @@ struct pmd_devargs {
>>>
>>> eth_dev->rx_pkt_burst = eth_pcap_rx;
>>>
>>> -   if (using_dumpers)
>>> -   eth_dev->tx_pkt_burst = eth_pcap_tx_dumper;
>>> -   else
>>> -   eth_dev->tx_pkt_burst = eth_pcap_tx;
>>> +   eth_dev->tx_pkt_burst = eth_pcap_tx_mux;
>>
>> We shouldn't introduce an extra check in data path. Instead of checking "if
>> (tx_queue->dumper)" for _each_ packet, we should check it here once and
>> assign proper burst function.
> 
> [idog] I don't see how it can be avoided 
> rte_eth_dev has only single tx_pkt_burst
> but now we suggest to support 2 different queue types in a single device
> each type requires different end functionality pcap_dump or pcap_sendpkt
> btw - it's only once per burst 

Right, we can't avoid it.

This change removes a limitation in the PMD, but with a side effect; I missed
the side-effect part.
I am for rejecting the patch until this feature is explicitly requested for a
practical use case, to be sure we are not introducing the side effect for a
feature that is not really needed.

Thanks for your effort.
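
The trade-off discussed above can be sketched as follows. This is a
simplified model, not the pcap PMD code: struct sketch_tx_queue and the
sketch_* functions are hypothetical and only count packets. Since the
device exposes a single burst function while its queues may be of two
kinds, the queue type has to be checked inside that function, once per
burst call rather than once per packet:

```c
#include <stdint.h>

/* Hypothetical per-queue state. */
struct sketch_tx_queue {
	int has_dumper;   /* non-zero: this queue writes to a pcap file */
	unsigned dumped;  /* packets written to the file */
	unsigned sent;    /* packets sent on the interface */
};

/* Stand-in for the "dump to pcap file" burst function. */
static uint16_t
sketch_tx_dumper(struct sketch_tx_queue *q, uint16_t nb_pkts)
{
	q->dumped += nb_pkts;
	return nb_pkts;
}

/* Stand-in for the "send on interface" burst function. */
static uint16_t
sketch_tx_iface(struct sketch_tx_queue *q, uint16_t nb_pkts)
{
	q->sent += nb_pkts;
	return nb_pkts;
}

/*
 * Single burst entry point shared by all queues of the device.
 * The branch below is the data path side effect: it is taken once
 * per burst, for every burst, even when all queues are of one kind.
 */
static uint16_t
sketch_tx_mux(struct sketch_tx_queue *q, uint16_t nb_pkts)
{
	if (q->has_dumper)
		return sketch_tx_dumper(q, nb_pkts);
	return sketch_tx_iface(q, nb_pkts);
}
```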




Re: [dpdk-dev] [DPDK] examples/ipsec-secgw: fix use of unsupported RSS offloads

2018-06-26 Thread Remy Horton



On 26/06/2018 10:03, Ferruh Yigit wrote:
[..]

Hi Remy,

Is following covering this patch:
https://patches.dpdk.org/patch/41313/


Patch was sent out with wrong subject, so consider it Nack'd.


[dpdk-dev] [PATCH 1/2] eal: remove deprecated function returning mbuf pool ops name

2018-06-26 Thread Olivier Matz
rte_eal_mbuf_default_mempool_ops() is replaced by
rte_mbuf_best_mempool_ops().

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/deprecation.rst|  9 -
 lib/librte_eal/bsdapp/eal/eal.c | 10 --
 lib/librte_eal/common/include/rte_eal.h | 11 ---
 lib/librte_eal/linuxapp/eal/eal.c   | 10 --
 lib/librte_eal/rte_eal_version.map  |  2 --
 5 files changed, 42 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 1ce692eac..5bf680515 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -37,15 +37,6 @@ Deprecation Notices
   - ``eal_parse_pci_DomBDF`` replaced by ``rte_pci_addr_parse``
   - ``rte_eal_compare_pci_addr`` replaced by ``rte_pci_addr_cmp``
 
-* eal: a new set of mbuf mempool ops name APIs for user, platform and best
-  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
-  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
-  ``rte_mbuf_best_mempool_ops``.
-  The following function is deprecated since 18.05, and will be removed
-  in 18.08:
-
-  - ``rte_eal_mbuf_default_mempool_ops``
-
 * mbuf: The opaque ``mbuf->hash.sched`` field will be updated to support 
generic
   definition in line with the ethdev TM and MTR APIs. Currently, this field
   is defined in librte_sched in a non-generic way. The new generic format
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index dc279542d..f7cced725 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -153,16 +153,6 @@ rte_eal_mbuf_user_pool_ops(void)
return internal_config.user_mbuf_pool_ops_name;
 }
 
-/* Return mbuf pool ops name */
-const char *
-rte_eal_mbuf_default_mempool_ops(void)
-{
-   if (internal_config.user_mbuf_pool_ops_name == NULL)
-   return RTE_MBUF_DEFAULT_MEMPOOL_OPS;
-
-   return internal_config.user_mbuf_pool_ops_name;
-}
-
 /* Return a pointer to the configuration structure */
 struct rte_config *
 rte_eal_get_configuration(void)
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index 8de5d69e8..0c9c3f13b 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -501,17 +501,6 @@ enum rte_iova_mode rte_eal_iova_mode(void);
 const char * __rte_experimental
 rte_eal_mbuf_user_pool_ops(void);
 
-/**
- * @deprecated
- * Get default pool ops name for mbuf
- *
- * @return
- *   returns default pool ops name.
- */
-__rte_deprecated
-const char *
-rte_eal_mbuf_default_mempool_ops(void);
-
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..cf2a8082b 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -161,16 +161,6 @@ rte_eal_mbuf_user_pool_ops(void)
return internal_config.user_mbuf_pool_ops_name;
 }
 
-/* Return mbuf pool ops name */
-const char *
-rte_eal_mbuf_default_mempool_ops(void)
-{
-   if (internal_config.user_mbuf_pool_ops_name == NULL)
-   return RTE_MBUF_DEFAULT_MEMPOOL_OPS;
-
-   return internal_config.user_mbuf_pool_ops_name;
-}
-
 /* Return a pointer to the configuration structure */
 struct rte_config *
 rte_eal_get_configuration(void)
diff --git a/lib/librte_eal/rte_eal_version.map 
b/lib/librte_eal/rte_eal_version.map
index f7dd0e7bc..3d4a9d3bb 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -181,7 +181,6 @@ DPDK_17.11 {
rte_bus_get_iommu_class;
rte_eal_has_pci;
rte_eal_iova_mode;
-   rte_eal_mbuf_default_mempool_ops;
rte_eal_using_phys_addrs;
rte_eal_vfio_intr_mode;
rte_lcore_has_role;
@@ -259,7 +258,6 @@ EXPERIMENTAL {
rte_eal_cleanup;
rte_eal_hotplug_add;
rte_eal_hotplug_remove;
-   rte_eal_mbuf_user_pool_ops;
rte_fbarray_attach;
rte_fbarray_destroy;
rte_fbarray_detach;
-- 
2.11.0



[dpdk-dev] [PATCH 2/2] eal: remove experimental tag from user mbuf pool ops func

2018-06-26 Thread Olivier Matz
Remove experimental tag from rte_eal_mbuf_user_pool_ops().

Signed-off-by: Olivier Matz 
---
 lib/librte_eal/bsdapp/eal/eal.c | 2 +-
 lib/librte_eal/common/include/rte_eal.h | 5 +
 lib/librte_eal/linuxapp/eal/eal.c   | 2 +-
 lib/librte_eal/rte_eal_version.map  | 6 ++
 lib/librte_mbuf/Makefile| 1 -
 lib/librte_mbuf/meson.build | 1 -
 6 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index f7cced725..98c689b16 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -147,7 +147,7 @@ eal_get_runtime_dir(void)
 }
 
 /* Return user provided mbuf pool ops name */
-const char * __rte_experimental
+const char *
 rte_eal_mbuf_user_pool_ops(void)
 {
return internal_config.user_mbuf_pool_ops_name;
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index 0c9c3f13b..e114dcbdc 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -490,15 +490,12 @@ static inline int rte_gettid(void)
 enum rte_iova_mode rte_eal_iova_mode(void);
 
 /**
- * @warning
- * @b EXPERIMENTAL: this API may change without prior notice
- *
  * Get user provided pool ops name for mbuf
  *
  * @return
  *   returns user provided pool ops name.
  */
-const char * __rte_experimental
+const char *
 rte_eal_mbuf_user_pool_ops(void);
 
 #ifdef __cplusplus
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index cf2a8082b..71ec2be9f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -155,7 +155,7 @@ eal_get_runtime_dir(void)
 }
 
 /* Return user provided mbuf pool ops name */
-const char * __rte_experimental
+const char *
 rte_eal_mbuf_user_pool_ops(void)
 {
return internal_config.user_mbuf_pool_ops_name;
diff --git a/lib/librte_eal/rte_eal_version.map 
b/lib/librte_eal/rte_eal_version.map
index 3d4a9d3bb..c151c8454 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -240,6 +240,12 @@ DPDK_18.05 {
 
 } DPDK_18.02;
 
+DPDK_18.08 {
+   global:
+
+   rte_eal_mbuf_user_pool_ops;
+} DPDK_18.05;
+
 EXPERIMENTAL {
global:
 
diff --git a/lib/librte_mbuf/Makefile b/lib/librte_mbuf/Makefile
index 8749a00fe..e2b98a254 100644
--- a/lib/librte_mbuf/Makefile
+++ b/lib/librte_mbuf/Makefile
@@ -6,7 +6,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 # library name
 LIB = librte_mbuf.a
 
-CFLAGS += -DALLOW_EXPERIMENTAL_API
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
 LDLIBS += -lrte_eal -lrte_mempool
 
diff --git a/lib/librte_mbuf/meson.build b/lib/librte_mbuf/meson.build
index 869c17c1c..45ffb0db5 100644
--- a/lib/librte_mbuf/meson.build
+++ b/lib/librte_mbuf/meson.build
@@ -2,7 +2,6 @@
 # Copyright(c) 2017 Intel Corporation
 
 version = 3
-allow_experimental_apis = true
 sources = files('rte_mbuf.c', 'rte_mbuf_ptype.c', 'rte_mbuf_pool_ops.c')
 headers = files('rte_mbuf.h', 'rte_mbuf_ptype.h', 'rte_mbuf_pool_ops.h')
 deps += ['mempool']
-- 
2.11.0



Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal

2018-06-26 Thread Tiwei Bie
On Tue, Jun 26, 2018 at 04:47:33PM +0800, Stojaczyk, DariuszX wrote:
> > -Original Message-
> > From: Bie, Tiwei
> > Sent: Tuesday, June 26, 2018 10:22 AM
> > To: Stojaczyk, DariuszX 
> > Cc: Dariusz Stojaczyk ; dev@dpdk.org; Maxime
> > Coquelin ; Tetsuya Mukawa
> > ; Stefan Hajnoczi ; Thomas
> > Monjalon ; y...@fridaylinux.org; Harris, James R
> > ; Kulasek, TomaszX ;
> > Wodkowski, PawelX 
> > Subject: Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal
> > 
> > On Mon, Jun 25, 2018 at 08:17:08PM +0800, Stojaczyk, DariuszX wrote:
> > > > -Original Message-
> > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Tiwei Bie
> > > > Sent: Monday, June 25, 2018 1:02 PM
> > > > 
> > > >
> > > > Hi Dariusz,
> > > >
> > >
> > > Hi Tiwei,
> > >
> > > > Thank you for putting efforts in making the DPDK
> > > > vhost more generic!
> > > >
> > > > From my understanding, your proposal is that:
> > > >
> > > > 1) Introduce rte_vhost2 to provide the APIs which
> > > >allow users to implement vhost backends like
> > > >SCSI, net, crypto, ..
> > > >
> > >
> > > That's right.
> > >
> > > > 2) Refactor the existing rte_vhost to use rte_vhost2.
> > > >The rte_vhost will still provide below existing
> > > >sets of APIs:
> > > > 1. The APIs which allow users to implement
> > > >external vhost backends (these APIs were
> > > >designed for SPDK previously)
> > > > 2. The APIs provided by the net backend
> > > > 3. The APIs provided by the crypto backend
> > > >And above APIs in rte_vhost won't be changed.
> > >
> > > That's correct. Rte_vhost would register its own rte_vhost2_tgt_ops
> > underneath and will call existing vhost_device_ops for e.g. starting the 
> > device
> > once all queues are started.
> > 
> > Currently I have below concerns and questions:
> > 
> > - The rte_vhost's problem is still there. Even though
> >   rte_vhost2 is introduced, the net and crypto backends
> >   in rte_vhost won't benefit from the new callbacks.
> > 
> >   The existing rte_vhost in DPDK not only provides the
> >   APIs for DPDK applications to implement the external
> >   backends. But also provides high performance net and
> >   crypto backends implementation (maybe more in the
> >   future). So it's important that besides the DPDK
> >   applications which implement their external backends,
> >   the DPDK applications which use the builtin backends
> >   will also benefit from the new callbacks.
> > 
> >   So we should have a clear plan on how will the legacy
> >   callbacks in rte_vhost be dealt with in the next step.
> > 
> >   Besides, the new library's name is a bit misleading.
> >   It makes the existing rte_vhost library sound like an
> >   obsolete library. But actually the existing rte_vhost
> >   isn't an obsolete library. It will still provide the
> >   net and crypto backends. So if we want to introduce
> >   this new library, we should give it a better name.
> > 
> > - It's possible to solve rte_vhost's problem you met
> >   by refactoring the existing vhost library directly
> >   instead of re-implementing a new vhost library from
> >   scratch and keeping the old one's problem as is.
> > 
> >   In this way, it will solve the problem you met and
> >   also solve the problem for rte_vhost. Why not go
> >   this way? Something like:
> > 
> >   Below is the existing callbacks set in rte_vhost.h:
> > 
> >   /**
> >* Device and vring operations.
> >*/
> >   struct vhost_device_ops {
> >   ..
> >   };
> > 
> >   It's a legacy implementation, and doesn't really
> >   follow the DPDK API design (e.g. no rte_ prefix).
> >   We can design and implement a new message handling
> >   and a new set of callbacks for rte_vhost to solve
> >   the problem you met without changing the old one.
> >   Something like:
> > 
> >   struct rte_vhost_device_ops {
> >   ..
> >   }
> > 
> >   int
> >   vhost_user_msg_handler(struct vhost_dev *vdev, struct vhost_user_msg
> > *msg)
> >   {
> >   ..
> > 
> >   if (!vdev->is_using_new_device_ops) {
> >   // Call the existing message handler
> >   return vhost_user_msg_handler_legacy(vdev, msg);
> >   }
> > 
> >   // Implement the new logic here
> >   ..
> >   }
> > 
> >   A vhost application is allowed to register only struct
> >   rte_vhost_device_ops or struct vhost_device_ops (which
> >   should be deprecated in the future). The two ops cannot
> >   be registered at the same time.
> > 
> >   The existing applications could use the old ops. And
> >   if an application registers struct rte_vhost_device_ops,
> >   the new callbacks and message handler will be used.
> 
> Please notice that some features like vIOMMU are not even a part of the 
> public rte_vhost API. Only vhost-net benefits from vIOMMU right now. 
> Separating vhost-net from a generic vhost library (rte_vhost2) would avoid 
> making such design mistakes i

Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt handler

2018-06-26 Thread Ferruh Yigit
On 6/24/2018 11:56 AM, Zhang, Qi Z wrote:
> Hi Stephen:
> 
>> -Original Message-
>> From: Stephen Hemminger [mailto:step...@networkplumber.org]
>> Sent: Friday, June 22, 2018 11:44 PM
>> To: Zhang, Qi Z 
>> Cc: Xing, Beilei ; Wu, Jingjing 
>> ;
>> Yu, De ; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt handler
>>
>> On Fri, 22 Jun 2018 08:44:14 +0800
>> Qi Zhang  wrote:
>>
>>> For i40evf, internal rx interrupt and adminq interrupt share the same
>>> source, that cause a lot cpu cycles be wasted on interrupt handler on
>>> rx path. This is complained by customers which require low latency
>>> (when set I40E_ITR_INTERVAL to small value), but have to be sufferred
>>> by tremendous interrupts handling that eat significant CPU resources.
>>>
>>> The patch disable pci interrupt and remove the interrupt handler,
>>> replace it with a low frequency (50ms) interrupt polling daemon which
>>> is implemented by registering a alarm callback periodly, this save CPU
>>> time significently: On a typical x86 server with 2.1GHz CPU, with low
>>> latency configure (32us) we saw CPU usage from top commmand reduced
>>> from 20% to 0% on management core in testpmd).
>>>
>>> Also with the new method we can remove compile option:
>>> I40E_ITR_INTERVAL which is used to balance between low latency and low
>> CPU usage previously.
>>> Now we don't need it since we can reach both at same time.
>>>
>>> Suggested-by: Jingjing Wu 
>>> Signed-off-by: Qi Zhang 
>>> ---
>>>
>>> v2:
>>> - update doc
>>>
>>>  config/common_base|  2 --
>>>  doc/guides/nics/i40e.rst  |  5 -
>>>  drivers/net/i40e/i40e_ethdev.c|  3 +--
>>>  drivers/net/i40e/i40e_ethdev.h| 22 +++---
>>>  drivers/net/i40e/i40e_ethdev_vf.c | 36
>>> ++--
>>>  5 files changed, 26 insertions(+), 42 deletions(-)
>>>
>>> diff --git a/config/common_base b/config/common_base index
>>> 6b0d1cbbb..9e21c6865 100644
>>> --- a/config/common_base
>>> +++ b/config/common_base
>>> @@ -264,8 +264,6 @@ CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y
>>>  CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
>>>  CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF=64
>>>  CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
>>> -# interval up to 8160 us, aligned to 2 (or default value)
>>> -CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1
>>>
>>>  #
>>>  # Compile burst-oriented FM10K PMD
>>> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index
>>> 18549bf5a..3fc4ceac7 100644
>>> --- a/doc/guides/nics/i40e.rst
>>> +++ b/doc/guides/nics/i40e.rst
>>> @@ -96,11 +96,6 @@ Please note that enabling debugging options may
>> affect system performance.
>>>
>>>Number of queues reserved for each VMDQ Pool.
>>>
>>> -- ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` (default ``-1``)
>>> -
>>> -  Interrupt Throttling interval.
>>> -
>>> -
>>>  Runtime Config Options
>>>  ~~
>>>
>>> diff --git a/drivers/net/i40e/i40e_ethdev.c
>>> b/drivers/net/i40e/i40e_ethdev.c index 13c5d3296..c8f9566e0 100644
>>> --- a/drivers/net/i40e/i40e_ethdev.c
>>> +++ b/drivers/net/i40e/i40e_ethdev.c
>>> @@ -1829,8 +1829,7 @@ __vsi_queues_bind_intr(struct i40e_vsi *vsi,
>> uint16_t msix_vect,
>>> /* Write first RX queue to Link list register as the head element */
>>> if (vsi->type != I40E_VSI_SRIOV) {
>>> uint16_t interval =
>>> -   i40e_calc_itr_interval(RTE_LIBRTE_I40E_ITR_INTERVAL, 1,
>>> -  pf->support_multi_driver);
>>> +   i40e_calc_itr_interval(1, pf->support_multi_driver);
>>>
>>> if (msix_vect == I40E_MISC_VEC_ID) {
>>> I40E_WRITE_REG(hw, I40E_PFINT_LNKLST0, diff --git
>>> a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
>>> index 11c4c76bd..53dac 100644
>>> --- a/drivers/net/i40e/i40e_ethdev.h
>>> +++ b/drivers/net/i40e/i40e_ethdev.h
>>> @@ -178,7 +178,7 @@ enum i40e_flxpld_layer_idx {
>>>  #define I40E_ITR_INDEX_NONE 3
>>>  #define I40E_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
>>>  #define I40E_QUEUE_ITR_INTERVAL_MAX 8160 /* 8160 us */
>>> -#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 8160 /* 8160 us */
>>> +#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
>>>  /* Special FW support this floating VEB feature */  #define
>>> FLOATING_VEB_SUPPORTED_FW_MAJ 5  #define
>> FLOATING_VEB_SUPPORTED_FW_MIN
>>> 0 @@ -1328,17 +1328,17 @@ i40e_align_floor(int n)  }
>>>
>>>  static inline uint16_t
>>> -i40e_calc_itr_interval(int16_t interval, bool is_pf, bool
>>> is_multi_drv)
>>> +i40e_calc_itr_interval(bool is_pf, bool is_multi_drv)
>>>  {
>>> -   if (interval < 0 || interval > I40E_QUEUE_ITR_INTERVAL_MAX) {
>>> -   if (is_multi_drv) {
>>> -   interval = I40E_QUEUE_ITR_INTERVAL_MAX;
>>> -   } else {
>>> -   if (is_pf)
>>> -   interval = I40E_QUEUE_ITR_INTERVAL_DEFAULT;
>>> -   else
>>> -

Re: [dpdk-dev] [PATCH] net/thunderx: fix build with gcc optimization on

2018-06-26 Thread Ferruh Yigit
On 6/24/2018 1:17 PM, Jerin Jacob wrote:
> -Original Message-
>> Date: Thu, 21 Jun 2018 19:14:50 +0100
>> From: Ferruh Yigit 
>> To: Jerin Jacob , Maciej Czekaj
>>  
>> CC: dev@dpdk.org, Ferruh Yigit , sta...@dpdk.org
>> Subject: [PATCH] net/thunderx: fix build with gcc optimization on
>> X-Mailer: git-send-email 2.17.1
>>
>>
>> build error gcc version 6.3.1 20161221 (Red Hat 6.3.1-1),
>> with EXTRA_CFLAGS="-O3":
>>
>> .../drivers/net/thunderx/nicvf_ethdev.c:907:9:
>>error: ‘txq’ may be used uninitialized in this function
>>[-Werror=maybe-uninitialized]
>>   if (txq->pool_free == nicvf_single_pool_free_xmited_buffers)
>>   ~~~^~~
>> .../drivers/net/thunderx/nicvf_ethdev.c:886:20:
>>note: ‘txq’ was declared here
>>   struct nicvf_txq *txq;
>> ^~~
>>
>> Same error on function 'nicvf_eth_dev_init' and 'nicvf_dev_start', it
>> seems 'nicvf_set_tx_function' inlined when optimization enabled.
>>
>> Initialize the txq and add NULL check before using it to fix.
>>
>> Fixes: 7413feee662d ("net/thunderx: add device start/stop and close")
>> Cc: sta...@dpdk.org
>>
>> Reported-by: Richard Walsh 
>> Signed-off-by: Ferruh Yigit 
> 
> Acked-by: Jerin Jacob 
> 
>> ---
>>
>> Btw, no compiler optimization enabled, only nicvf_rxtx.c has -Ofast,
>> is this intentional?
> 
> Yes. At least in our setup, -Ofast turns out to be super set of -O3.

That matches what gcc documents about -Ofast, but again it applies only to the
single nicvf_rxtx.c file. The problem is seen in the -O3 case, with another file.



Re: [dpdk-dev] [PATCH v3 2/6] lib/cryptodev: add asym op support in cryptodev

2018-06-26 Thread De Lara Guarch, Pablo
Hi Shally,

> -Original Message-
> From: Shally Verma [mailto:shally.ve...@caviumnetworks.com]
> Sent: Wednesday, May 16, 2018 7:05 AM
> To: De Lara Guarch, Pablo 
> Cc: Trahe, Fiona ; akhil.go...@nxp.com;
> dev@dpdk.org; pathr...@caviumnetworks.com; Sunila Sahu
> ; Ashish Gupta
> 
> Subject: [PATCH v3 2/6] lib/cryptodev: add asym op support in cryptodev
> 
> Extend DPDK librte_cryptodev to:
> - define asym op type in rte_crypto_op_type and associated
>   op pool create/alloc APIs
> - define asym session and associated session APIs
> 
> If PMD shows in its feature flag that it supports both sym and asym then
> it must support those on all its qps.
> 
> Changes from v2:
> - added rte_cryptodev_asym_session_set/get_private_data for app to setup
> private data in a session as per latest dpdk-next-crypto spec
> - rename rte_cryptodev_get_asym_session_private_size to be consistent with
> other API names
> - correct rte_cryptodev_asym_session_create to pass void** to
> rte_mempool_get() and add for private_data_size flag
> 
> Changes from v1
> - resolve new line error in librte_cryptodev/rte_cryptodev_version.map
> 
> Signed-off-by: Shally Verma 
> Signed-off-by: Sunila Sahu 
> Signed-off-by: Ashish Gupta 

...

> +int __rte_experimental
> +rte_cryptodev_asym_session_init(uint8_t dev_id,
> + struct rte_cryptodev_asym_session *sess,
> + struct rte_crypto_asym_xform *xforms,
> + struct rte_mempool *mp)
> +{
> + struct rte_cryptodev *dev;
> + uint8_t index;
> + int ret;
> +
> + dev = rte_cryptodev_pmd_get_dev(dev_id);
> +
> + if (sess == NULL || xforms == NULL || dev == NULL)
> + return -EINVAL;
> +
> + index = dev->driver_id;
> +

Check if asym_session_configure is implemented in the device, like this:

RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->asym_session_configure, -ENOTSUP);

This way, there won't be a segmentation fault when using a device that
does not support asymmetric operations.
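
To illustrate the idea, here is a minimal standalone sketch of the guard.
The demo_* types and the DEMO_FUNC_PTR_OR_ERR_RET macro are hypothetical
stand-ins written for this example, not the real DPDK definitions; the
only point is that the callback is tested for NULL before it is called.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device ops table; not the real DPDK types. */
struct demo_ops {
	int (*asym_session_configure)(void *sess);
};

struct demo_dev {
	struct demo_ops *dev_ops;
};

/* Return retval from the enclosing function if the op is not implemented. */
#define DEMO_FUNC_PTR_OR_ERR_RET(func, retval) do { \
	if ((func) == NULL) \
		return retval; \
} while (0)

static int
demo_session_init(struct demo_dev *dev, void *sess)
{
	if (dev == NULL || sess == NULL)
		return -EINVAL;

	/* Guard first, so a missing callback yields -ENOTSUP, not a crash. */
	DEMO_FUNC_PTR_OR_ERR_RET(dev->dev_ops->asym_session_configure, -ENOTSUP);

	return dev->dev_ops->asym_session_configure(sess);
}
```

With a device whose ops table leaves the callback unset, the function
returns -ENOTSUP instead of jumping through a NULL pointer.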

> + if (sess->sess_private_data[index] == NULL) {
> + ret = dev->dev_ops->asym_session_configure(dev,
> + xforms,
> + sess, mp);
> + if (ret < 0) {
> + CDEV_LOG_ERR(
> + "dev_id %d failed to configure session details",
> + dev_id);
> + return ret;

...

> +int __rte_experimental
> +rte_cryptodev_asym_session_clear(uint8_t dev_id,
> + struct rte_cryptodev_asym_session *sess) {
> + struct rte_cryptodev *dev;
> +
> + dev = rte_cryptodev_pmd_get_dev(dev_id);
> +
> + if (dev == NULL || sess == NULL)
> + return -EINVAL;
> +

Same as above, add the following.

RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->asym_session_clear, -ENOTSUP);

> + dev->dev_ops->asym_session_clear(dev, sess);
> +
> + return 0;
> +}

I will send a patch doing the same for symmetric.

Pablo


Re: [dpdk-dev] [PATCH v3 5/6] crypto/openssl: add asym crypto support

2018-06-26 Thread De Lara Guarch, Pablo



> -Original Message-
> From: Shally Verma [mailto:shally.ve...@caviumnetworks.com]
> Sent: Wednesday, May 16, 2018 7:05 AM
> To: De Lara Guarch, Pablo 
> Cc: Trahe, Fiona ; akhil.go...@nxp.com;
> dev@dpdk.org; pathr...@caviumnetworks.com; Sunila Sahu
> ; Ashish Gupta
> 
> Subject: [PATCH v3 5/6] crypto/openssl: add asym crypto support
> 
> Add asymmetric crypto operation support in openssl PMD.
> Current list of supported asym xforms:
> * RSA
> * DSA
> * Diffie-Hellman
> * Modular Operations
> 
> changes from v2:
> - Update the pmd capability as per new capability structure
> 
> changes from v1:
> - resolve new line error in doc/guides/cryptodevs/openssl.rst
> 
> Signed-off-by: Shally Verma 
> Signed-off-by: Sunila Sahu 
> Signed-off-by: Ashish Gupta 
> ---
>  doc/guides/cryptodevs/features/openssl.ini   |  11 +
>  doc/guides/cryptodevs/openssl.rst|   1 +
>  drivers/crypto/openssl/rte_openssl_pmd.c | 377 -
>  drivers/crypto/openssl/rte_openssl_pmd_ops.c | 395
> ++-
>  drivers/crypto/openssl/rte_openssl_pmd_private.h |  29 ++
>  5 files changed, 801 insertions(+), 12 deletions(-)

...

> @@ -1606,7 +1957,12 @@ openssl_pmd_enqueue_burst(void *queue_pair,
> struct rte_crypto_op **ops,
>   if (unlikely(sess == NULL))
>   goto enqueue_err;
> 
> - retval = process_op(qp, ops[i], sess);
> + if (ops[i]->type == RTE_CRYPTO_OP_TYPE_SYMMETRIC)
> + retval = process_op(qp, ops[i],
> + (struct openssl_session *) sess);

Could you rename process_op to process_sym_op?

Also, I think we need this check for the other PMDs.
I will send a patch to check if op type is equal to symmetric.

Pablo



[dpdk-dev] [PATCH v3 2/9] examples/vm_power: add core list parameter

2018-06-26 Thread David Hunt
Add the '-l' command line parameter (also --core-list),
so the user can now pass --core-list=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).

This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.

Also, as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out of band monitoring on them.

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/Makefile|  2 +-
 examples/vm_power_manager/main.c  | 34 -
 examples/vm_power_manager/parse.c | 93 +++
 examples/vm_power_manager/parse.h | 20 +
 examples/vm_power_manager/power_manager.c | 31 
 examples/vm_power_manager/power_manager.h | 20 +
 6 files changed, 197 insertions(+), 3 deletions(-)
 create mode 100644 examples/vm_power_manager/parse.c
 create mode 100644 examples/vm_power_manager/parse.h

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index ef2a9f959..0c925967c 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -19,7 +19,7 @@ APP = vm_power_mgr
 
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
-SRCS-y += channel_monitor.c
+SRCS-y += channel_monitor.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 043b374bc..cc2a1289c 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "parse.h"
 #include 
 #include 
 #include 
@@ -135,18 +136,22 @@ parse_portmask(const char *portmask)
 static int
 parse_args(int argc, char **argv)
 {
-   int opt, ret;
+   int opt, ret, cnt, i;
char **argvopt;
+   uint16_t *oob_enable;
int option_index;
char *prgname = argv[0];
+   struct core_info *ci;
static struct option lgopts[] = {
{ "mac-updating", no_argument, 0, 1},
{ "no-mac-updating", no_argument, 0, 0},
+   { "core-list", optional_argument, 0, 'l'},
{NULL, 0, 0, 0}
};
argvopt = argv;
+   ci = get_core_info();
 
-   while ((opt = getopt_long(argc, argvopt, "p:q:T:",
+   while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
  lgopts, &option_index)) != EOF) {
 
switch (opt) {
@@ -158,6 +163,27 @@ parse_args(int argc, char **argv)
return -1;
}
break;
+   case 'l':
+   oob_enable = malloc(ci->core_count * sizeof(uint16_t));
+   if (oob_enable == NULL) {
+   printf("Error - Unable to allocate memory\n");
+   return -1;
+   }
+   cnt = parse_set(optarg, oob_enable, ci->core_count);
+   if (cnt < 0) {
+   printf("Invalid core-list - [%s]\n",
+   optarg);
+   break;
+   }
+   for (i = 0; i < ci->core_count; i++) {
+   if (oob_enable[i]) {
+   printf("***Using core %d\n", i);
+   ci->cd[i].oob_enabled = 1;
+   ci->cd[i].global_enabled_cpus = 1;
+   }
+   }
+   free(oob_enable);
+   break;
/* long options */
case 0:
break;
@@ -263,6 +289,10 @@ main(int argc, char **argv)
uint16_t portid;
 
 
+   ret = core_info_init();
+   if (ret < 0)
+   rte_panic("Cannot allocate core info\n");
+
ret = rte_eal_init(argc, argv);
if (ret < 0)
rte_panic("Cannot init EAL\n");
diff --git a/examples/vm_power_manager/parse.c b/examples/vm_power_manager/parse.c
new file mode 100644
index 0..9de15c4a7
--- /dev/null
+++ b/examples/vm_power_manager/parse.c
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation.
+ * Copyright(c) 2014 6WIND S.A.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "parse

[dpdk-dev] [0/9] examples/vm_power: 100% Busy Polling

2018-06-26 Thread David Hunt
This patch set adds the capability to do out-of-band power
monitoring on a system. It uses a thread to monitor the branch
counters in the targeted cores, and calculates the branch ratio
of the running code.

If the branch ratio is low (<0.01), then
the code is most likely running in a tight poll loop and doing
nothing, i.e. receiving no packets. In this case we scale down
the frequency of that core.

If the branch ratio is higher (>0.01), then it is likely that
the code is receiving and processing packets. In this case, we
scale up the frequency of that core.
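
As a rough illustration of the decision described above, here is a small
sketch. The names demo_branch_decision and demo_scale are made up for this
example, and treating a zero hit count as idle is an assumption. The ratio
is computed as a percentage (misses * 100 / hits), mirroring the
apply_policy() hunk in patch 9/9.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch only. */
enum demo_scale { DEMO_SCALE_DOWN, DEMO_SCALE_UP };

/*
 * Decide whether to scale a core down or up from the change in its
 * branch hit and branch miss counters since the previous poll.
 */
static enum demo_scale
demo_branch_decision(uint64_t hits_diff, uint64_t miss_diff, float threshold)
{
	float ratio;

	/* Assumption: no branches retired at all is treated as idle. */
	if (hits_diff == 0)
		return DEMO_SCALE_DOWN;

	ratio = (float)miss_diff * 100.0f / (float)hits_diff;

	/* A low ratio suggests a tight, empty poll loop; a higher one, real work. */
	return ratio < threshold ? DEMO_SCALE_DOWN : DEMO_SCALE_UP;
}
```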

The cpu counters are read via /dev/cpu/x/msr, so requires the
msr kernel module to be loaded. Because this method is used,
the patch set is implemented with one file for x86 systems, and
another for non-x86 systems, with conditional compilation in
the Makefile. The non-x86 functions are stubs, and do not
currently implement any functionality.

The vm_power_manager app has been modified to take a new parameter
   --core-list or -l
which takes a list of cores in a comma-separated list format,
e.g. 1,3,5-7,9, which resolves to a core list of 1,3,5,6,7,9.
These cores will then be enabled for oob monitoring. When the
OOB monitoring thread starts, it reads the branch hits/miss
counters of each monitored core, and scales up/down accordingly.
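
For readers unfamiliar with the list format, the expansion can be sketched
as below. This is not the parse_set() code from parse.c; demo_parse_core_list
is a hypothetical helper written only to show how "1,3,5-7,9" maps onto a
per-core flag array.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a list such as "1,3,5-7,9" into flags[], setting flags[i] = 1
 * for every listed core. Returns the number of cores set, or -1 if the
 * string is malformed or names a core >= max.
 */
static int
demo_parse_core_list(const char *s, uint16_t *flags, unsigned int max)
{
	int count = 0;

	memset(flags, 0, max * sizeof(*flags));
	if (s == NULL || *s == '\0')
		return -1;

	while (*s != '\0') {
		char *end;
		unsigned long lo, hi, i;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;

		/* Optional "-N" suffix turns a single core into a range. */
		if (*end == '-') {
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (hi < lo || hi >= max)
			return -1;

		for (i = lo; i <= hi; i++) {
			if (!flags[i])
				count++;
			flags[i] = 1;
		}

		if (*end == ',') {
			end++;
			if (*end == '\0')
				return -1; /* trailing comma */
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}
	return count;
}
```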

The guest_cli app has also been modified to allow sending of a
policy of type BRANCH_RATIO where all of the cores included in
the policy will be monitored by the vm_power_manager oob thread.

v2 changes:
   * Add the guest_cli patch into this patch set, including the
 ability to set the policy to BRANCH_RATIO.
 http://patches.dpdk.org/patch/40742/
   * When vm_power_manager receives a policy with type BRANCH_RATIO,
 add the relevant cores to the monitoring thread.

v3 changes:
   * Added a command line parameter to allow changing of the
 default branch ratio threshold. Can now use -b 0.3 or
 --branch-ratio=0.3 to set the ratio for scaling up/down.

[1/9] examples/vm_power: add check for port count
[2/9] examples/vm_power: add core list parameter
[3/9] examples/vm_power: add oob monitoring functions
[4/9] examples/vm_power: allow greater than 64 cores
[5/9] examples/vm_power: add thread for oob core monitor
[6/9] examples/vm_power: add port-list to command line
[7/9] examples/vm_power: add branch ratio policy type
[8/9] examples/vm_power: add cli args to guest app
[9/9] examples/vm_power: make branch ratio configurable



[dpdk-dev] [PATCH v3 1/9] examples/vm_power: add check for port count

2018-06-26 Thread David Hunt
If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/main.c | 81 +---
 1 file changed, 43 insertions(+), 38 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index c9805a461..043b374bc 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -280,51 +280,56 @@ main(int argc, char **argv)
 
nb_ports = rte_eth_dev_count_avail();
 
-   mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
-   MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+   if (nb_ports > 0) {
+   mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
+   NUM_MBUFS * nb_ports, MBUF_CACHE_SIZE, 0,
+   RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
 
-   if (mbuf_pool == NULL)
-   rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+   if (mbuf_pool == NULL)
+   rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
 
-   /* Initialize ports. */
-   RTE_ETH_FOREACH_DEV(portid) {
-   struct ether_addr eth;
-   int w, j;
-   int ret;
+   /* Initialize ports. */
+   RTE_ETH_FOREACH_DEV(portid) {
+   struct ether_addr eth;
+   int w, j;
+   int ret;
 
-   if ((enabled_port_mask & (1 << portid)) == 0)
-   continue;
+   if ((enabled_port_mask & (1 << portid)) == 0)
+   continue;
 
-   eth.addr_bytes[0] = 0xe0;
-   eth.addr_bytes[1] = 0xe0;
-   eth.addr_bytes[2] = 0xe0;
-   eth.addr_bytes[3] = 0xe0;
-   eth.addr_bytes[4] = portid + 0xf0;
+   eth.addr_bytes[0] = 0xe0;
+   eth.addr_bytes[1] = 0xe0;
+   eth.addr_bytes[2] = 0xe0;
+   eth.addr_bytes[3] = 0xe0;
+   eth.addr_bytes[4] = portid + 0xf0;
 
-   if (port_init(portid, mbuf_pool) != 0)
-   rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+   if (port_init(portid, mbuf_pool) != 0)
+   rte_exit(EXIT_FAILURE,
+   "Cannot init port %"PRIu8 "\n",
portid);
 
-   for (w = 0; w < MAX_VFS; w++) {
-   eth.addr_bytes[5] = w + 0xf0;
-
-   ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
-   w, &eth);
-   if (ret == -ENOTSUP)
-   ret = rte_pmd_i40e_set_vf_mac_addr(portid,
-   w, &eth);
-   if (ret == -ENOTSUP)
-   ret = rte_pmd_bnxt_set_vf_mac_addr(portid,
-   w, &eth);
-
-   switch (ret) {
-   case 0:
-   printf("Port %d VF %d MAC: ",
-   portid, w);
-   for (j = 0; j < 6; j++) {
-   printf("%02x", eth.addr_bytes[j]);
-   if (j < 5)
-   printf(":");
+   for (w = 0; w < MAX_VFS; w++) {
+   eth.addr_bytes[5] = w + 0xf0;
+
+   ret = rte_pmd_ixgbe_set_vf_mac_addr(portid,
+   w, &eth);
+   if (ret == -ENOTSUP)
+   ret = rte_pmd_i40e_set_vf_mac_addr(
+   portid, w, &eth);
+   if (ret == -ENOTSUP)
+   ret = rte_pmd_bnxt_set_vf_mac_addr(
+   portid, w, &eth);
+
+   switch (ret) {
+   case 0:
+   printf("Port %d VF %d MAC: ",
+   portid, w);
+   for (j = 0; j < 5; j++) {
+   printf("%02x:",
+   eth.addr_bytes[j]);
+   }
+   printf("%02x\n", eth.addr_bytes[5]);
+   break;
}
printf("\n");
  

[dpdk-dev] [PATCH v3 5/9] examples/vm_power: add thread for oob core monitor

2018-06-26 Thread David Hunt
Change the app to now require three cores, as the third core
will be used to run the oob monitoring thread.

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/main.c | 37 +---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index cc2a1289c..4c6b5a990 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -29,6 +29,7 @@
 #include "channel_monitor.h"
 #include "power_manager.h"
 #include "vm_power_cli.h"
+#include "oob_monitor.h"
 #include "parse.h"
 #include 
 #include 
@@ -269,6 +270,17 @@ run_monitor(__attribute__((unused)) void *arg)
return 0;
 }
 
+static int
+run_core_monitor(__attribute__((unused)) void *arg)
+{
+   if (branch_monitor_init() < 0) {
+   printf("Unable to initialize core monitor\n");
+   return -1;
+   }
+   run_branch_monitor();
+   return 0;
+}
+
 static void
 sig_handler(int signo)
 {
@@ -287,12 +299,15 @@ main(int argc, char **argv)
unsigned int nb_ports;
struct rte_mempool *mbuf_pool;
uint16_t portid;
+   struct core_info *ci;
 
 
ret = core_info_init();
if (ret < 0)
rte_panic("Cannot allocate core info\n");
 
+   ci = get_core_info();
+
ret = rte_eal_init(argc, argv);
if (ret < 0)
rte_panic("Cannot init EAL\n");
@@ -367,16 +382,23 @@ main(int argc, char **argv)
}
}
 
+   check_all_ports_link_status(enabled_port_mask);
+
lcore_id = rte_get_next_lcore(-1, 1, 0);
if (lcore_id == RTE_MAX_LCORE) {
-   RTE_LOG(ERR, EAL, "A minimum of two cores are required to run "
+   RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
"application\n");
return 0;
}
-
-   check_all_ports_link_status(enabled_port_mask);
+   printf("Running channel monitor on lcore id %d\n", lcore_id);
rte_eal_remote_launch(run_monitor, NULL, lcore_id);
 
+   lcore_id = rte_get_next_lcore(lcore_id, 1, 0);
+   if (lcore_id == RTE_MAX_LCORE) {
+   RTE_LOG(ERR, EAL, "A minimum of three cores are required to run "
+   "application\n");
+   return 0;
+   }
if (power_manager_init() < 0) {
printf("Unable to initialize power manager\n");
return -1;
@@ -385,8 +407,17 @@ main(int argc, char **argv)
printf("Unable to initialize channel manager\n");
return -1;
}
+
+   printf("Running core monitor on lcore id %d\n", lcore_id);
+   rte_eal_remote_launch(run_core_monitor, NULL, lcore_id);
+
run_cli(NULL);
 
+   branch_monitor_exit();
+
rte_eal_mp_wait_lcore();
+
+   free(ci->cd);
+
return 0;
 }
-- 
2.17.1



[dpdk-dev] [PATCH v3 3/9] examples/vm_power: add oob monitoring functions

2018-06-26 Thread David Hunt
This patch introduces the out-of-band (oob) core monitoring
functions.

The functions are similar to the channel manager functions.
There are functions to add and remove cores from the
list of cores being monitored. There are also functions to initialise
the monitor setup, run the monitor thread, and exit the monitor.

The monitor thread runs in its own lcore, and is separate
functionality to the channel monitor which is epoll based.
This thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 uS).

The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.
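
As a hedged illustration of the access method, a counter read looks roughly
like the sketch below. demo_read_msr is a hypothetical helper, not a
function from this patch; the offset passed to pread() is the MSR address,
and 0xc1 is the IA32_PERFCTR0 value that the patch defines locally. Reading
the device normally needs root privileges and the msr module loaded.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define DEMO_IA32_PERFCTR0 0xc1

/*
 * Read one 64-bit MSR on the given CPU through /dev/cpu/<cpu>/msr.
 * Returns 0 on success, -1 if the device cannot be opened or read.
 */
static int
demo_read_msr(unsigned int cpu, off_t msr, uint64_t *val)
{
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/dev/cpu/%u/msr", cpu);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The file offset selects which MSR is read. */
	n = pread(fd, val, sizeof(*val), msr);
	close(fd);

	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

Opening the file on every read keeps the sketch short; a polling loop would
more likely keep one descriptor open per monitored core.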

The makefile has a switch for x86 and non-x86 platforms,
and compiles stub functions for non-x86 platforms.

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/Makefile  |   5 +
 examples/vm_power_manager/oob_monitor.h |  68 +
 examples/vm_power_manager/oob_monitor_nop.c |  38 +++
 examples/vm_power_manager/oob_monitor_x86.c | 282 
 4 files changed, 393 insertions(+)
 create mode 100644 examples/vm_power_manager/oob_monitor.h
 create mode 100644 examples/vm_power_manager/oob_monitor_nop.c
 create mode 100644 examples/vm_power_manager/oob_monitor_x86.c

diff --git a/examples/vm_power_manager/Makefile b/examples/vm_power_manager/Makefile
index 0c925967c..13a5205ba 100644
--- a/examples/vm_power_manager/Makefile
+++ b/examples/vm_power_manager/Makefile
@@ -20,6 +20,11 @@ APP = vm_power_mgr
 # all source are stored in SRCS-y
 SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
 SRCS-y += channel_monitor.c parse.c
+ifeq ($(CONFIG_RTE_ARCH_X86_64),y)
+SRCS-y += oob_monitor_x86.c
+else
+SRCS-y += oob_monitor_nop.c
+endif
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/oob_monitor.h b/examples/vm_power_manager/oob_monitor.h
new file mode 100644
index 0..b96e08df7
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef OOB_MONITOR_H_
+#define OOB_MONITOR_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Setup the Branch Monitor resources required to initialize epoll.
+ * Must be called first before calling other functions.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int branch_monitor_init(void);
+
+/**
+ * Run the OOB branch monitor, loops forever on epoll_wait.
+ *
+ *
+ * @return
+ *  None
+ */
+void run_branch_monitor(void);
+
+/**
+ * Exit the OOB Branch Monitor.
+ *
+ * @return
+ *  None
+ */
+void branch_monitor_exit(void);
+
+/**
+ * Add a core to the list of cores to monitor.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int add_core_to_monitor(int core);
+
+/**
+ * Remove a previously added core from core list.
+ *
+ * @param core
+ *  Core Number
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int remove_core_from_monitor(int core);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* OOB_MONITOR_H_ */
diff --git a/examples/vm_power_manager/oob_monitor_nop.c b/examples/vm_power_manager/oob_monitor_nop.c
new file mode 100644
index 0..7e7b8bc14
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_nop.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include "oob_monitor.h"
+
+void branch_monitor_exit(void)
+{
+}
+
+__attribute__((unused)) static float
+apply_policy(__attribute__((unused)) int core)
+{
+   return 0.0;
+}
+
+int
+add_core_to_monitor(__attribute__((unused)) int core)
+{
+   return 0;
+}
+
+int
+remove_core_from_monitor(__attribute__((unused)) int core)
+{
+   return 0;
+}
+
+int
+branch_monitor_init(void)
+{
+   return 0;
+}
+
+void
+run_branch_monitor(void)
+{
+}
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
new file mode 100644
index 0..485ec5e3f
--- /dev/null
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -0,0 +1,282 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include "oob_monitor.h"
+#include "power_manager.h"
+#include "channel_manager.h"
+
+#include 
+#define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
+
+#define MAX_EVENTS 256
+
+static volatile unsigned run_loop = 1;
+static uint64_t g_branches, g_branch_misses;
+static int 

[dpdk-dev] [PATCH v3 4/9] examples/vm_power: allow greater than 64 cores

2018-06-26 Thread David Hunt
To facilitate more info per core, change the global_cpu_mask
from a uint64_t to an array. This also removes the limit of
64 cores, allocating the array at run-time based on the number of
cores found in the system.
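
The difference between the two representations can be sketched as follows.
The demo_* names are hypothetical and only loosely modelled on the
core_info/core_details structs in this series; the point is that a
run-time-sized array has no 64-entry ceiling the way a uint64_t mask does.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-core record; the real struct carries more fields. */
struct demo_core_details {
	uint16_t enabled;
};

struct demo_core_info {
	uint16_t core_count;
	struct demo_core_details *cd;
};

/* Allocate one zeroed record per core. Returns 0 on success, -1 on failure. */
static int
demo_core_info_alloc(struct demo_core_info *ci, uint16_t core_count)
{
	ci->cd = calloc(core_count, sizeof(*ci->cd));
	if (ci->cd == NULL)
		return -1;
	ci->core_count = core_count;
	return 0;
}

/* Replaces the old (mask & (1ULL << core)) test; bounds-checked. */
static int
demo_core_enabled(const struct demo_core_info *ci, unsigned int core)
{
	if (core >= ci->core_count)
		return 0;
	return ci->cd[core].enabled != 0;
}
```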

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/power_manager.c | 115 +++---
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index a7849e48a..4bdde23da 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -19,14 +19,14 @@
 #include 
 #include 
 
+#include "channel_manager.h"
 #include "power_manager.h"
-
-#define RTE_LOGTYPE_POWER_MANAGER RTE_LOGTYPE_USER1
+#include "oob_monitor.h"
 
 #define POWER_SCALE_CORE(DIRECTION, core_num , ret) do { \
-   if (core_num >= POWER_MGR_MAX_CPUS) \
+   if (core_num >= ci.core_count) \
return -1; \
-   if (!(global_enabled_cpus & (1ULL << core_num))) \
+   if (!(ci.cd[core_num].global_enabled_cpus)) \
return -1; \
rte_spinlock_lock(&global_core_freq_info[core_num].power_sl); \
ret = rte_power_freq_##DIRECTION(core_num); \
@@ -37,7 +37,7 @@
int i; \
for (i = 0; core_mask; core_mask &= ~(1 << i++)) { \
if ((core_mask >> i) & 1) { \
-   if (!(global_enabled_cpus & (1ULL << i))) \
+   if (!(ci.cd[i].global_enabled_cpus)) \
continue; \
rte_spinlock_lock(&global_core_freq_info[i].power_sl); \
if (rte_power_freq_##DIRECTION(i) != 1) \
@@ -56,28 +56,9 @@ struct freq_info {
 static struct freq_info global_core_freq_info[POWER_MGR_MAX_CPUS];
 
 struct core_info ci;
-static uint64_t global_enabled_cpus;
 
 #define SYSFS_CPU_PATH "/sys/devices/system/cpu/cpu%u/topology/core_id"
 
-static unsigned
-set_host_cpus_mask(void)
-{
-   char path[PATH_MAX];
-   unsigned i;
-   unsigned num_cpus = 0;
-
-   for (i = 0; i < POWER_MGR_MAX_CPUS; i++) {
-   snprintf(path, sizeof(path), SYSFS_CPU_PATH, i);
-   if (access(path, F_OK) == 0) {
-   global_enabled_cpus |= 1ULL << i;
-   num_cpus++;
-   } else
-   return num_cpus;
-   }
-   return num_cpus;
-}
-
 struct core_info *
 get_core_info(void)
 {
@@ -110,38 +91,45 @@ core_info_init(void)
 int
 power_manager_init(void)
 {
-   unsigned int i, num_cpus, num_freqs;
-   uint64_t cpu_mask;
+   unsigned int i, num_cpus = 0, num_freqs = 0;
int ret = 0;
+   struct core_info *ci;
+
+   rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
 
-   num_cpus = set_host_cpus_mask();
-   if (num_cpus == 0) {
-   RTE_LOG(ERR, POWER_MANAGER, "Unable to detected host CPUs, please "
-   "ensure that sufficient privileges exist to inspect sysfs\n");
+   ci = get_core_info();
+   if (!ci) {
+   RTE_LOG(ERR, POWER_MANAGER,
+   "Failed to get core info!\n");
return -1;
}
-   rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
-   cpu_mask = global_enabled_cpus;
-   for (i = 0; cpu_mask; cpu_mask &= ~(1 << i++)) {
-   if (rte_power_init(i) < 0)
-   RTE_LOG(ERR, POWER_MANAGER,
-   "Unable to initialize power manager "
-   "for core %u\n", i);
-   num_freqs = rte_power_freqs(i, global_core_freq_info[i].freqs,
+
+   for (i = 0; i < ci->core_count; i++) {
+   if (ci->cd[i].global_enabled_cpus) {
+   if (rte_power_init(i) < 0)
+   RTE_LOG(ERR, POWER_MANAGER,
+   "Unable to initialize power manager "
+   "for core %u\n", i);
+   num_cpus++;
+   num_freqs = rte_power_freqs(i,
+   global_core_freq_info[i].freqs,
RTE_MAX_LCORE_FREQS);
-   if (num_freqs == 0) {
-   RTE_LOG(ERR, POWER_MANAGER,
-   "Unable to get frequency list for core %u\n",
-   i);
-   global_enabled_cpus &= ~(1 << i);
-   num_cpus--;
-   ret = -1;
+   if (num_freqs == 0) {
+   RTE_LOG(ERR, POWER_MANAGER,
+   "Unable to get frequency list for core %u\n",
+   i);
+   ci->cd[i].oob_enabled = 0;
+   ret = -1;
+   }
+  

[dpdk-dev] [PATCH v3 6/9] examples/vm_power: add port-list to command line

2018-06-26 Thread David Hunt
Add the long form of -p, which is --port-list.

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4c6b5a990..4088861f1 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -147,6 +147,7 @@ parse_args(int argc, char **argv)
{ "mac-updating", no_argument, 0, 1},
{ "no-mac-updating", no_argument, 0, 0},
{ "core-list", optional_argument, 0, 'l'},
+   { "port-list", optional_argument, 0, 'p'},
{NULL, 0, 0, 0}
};
argvopt = argv;
-- 
2.17.1



[dpdk-dev] [PATCH v3 9/9] examples/vm_power: make branch ratio configurable

2018-06-26 Thread David Hunt
For different workloads and poll loops, the threshold
may be different for when you want to scale up and down.

This patch allows changing of the default branch ratio
by using the -b command line argument (or --branch-ratio=)

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/main.c| 16 +++-
 examples/vm_power_manager/oob_monitor_x86.c |  3 +--
 examples/vm_power_manager/power_manager.c   |  1 +
 examples/vm_power_manager/power_manager.h   |  3 +++
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 4088861f1..784d928bd 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -143,17 +143,19 @@ parse_args(int argc, char **argv)
int option_index;
char *prgname = argv[0];
struct core_info *ci;
+   float branch_ratio;
static struct option lgopts[] = {
{ "mac-updating", no_argument, 0, 1},
{ "no-mac-updating", no_argument, 0, 0},
{ "core-list", optional_argument, 0, 'l'},
{ "port-list", optional_argument, 0, 'p'},
+   { "branch-ratio", optional_argument, 0, 'b'},
{NULL, 0, 0, 0}
};
argvopt = argv;
ci = get_core_info();
 
-   while ((opt = getopt_long(argc, argvopt, "l:p:q:T:",
+   while ((opt = getopt_long(argc, argvopt, "l:p:q:T:b:",
  lgopts, &option_index)) != EOF) {
 
switch (opt) {
@@ -186,6 +188,18 @@ parse_args(int argc, char **argv)
}
free(oob_enable);
break;
+   case 'b':
+   branch_ratio = 0.0;
+   if (strlen(optarg))
+   branch_ratio = atof(optarg);
+   if (branch_ratio <= 0.0) {
+   printf("invalid branch ratio specified\n");
+   return -1;
+   }
+   ci->branch_ratio_threshold = branch_ratio;
+   printf("***Setting branch ratio to %f\n",
+   branch_ratio);
+   break;
/* long options */
case 0:
break;
diff --git a/examples/vm_power_manager/oob_monitor_x86.c b/examples/vm_power_manager/oob_monitor_x86.c
index 485ec5e3f..ea327b819 100644
--- a/examples/vm_power_manager/oob_monitor_x86.c
+++ b/examples/vm_power_manager/oob_monitor_x86.c
@@ -45,7 +45,6 @@ void branch_monitor_exit(void)
 /* Number of microseconds between each poll */
 #define INTERVAL 100
 #define PRINT_LOOP_COUNT (100/INTERVAL)
-#define RATIO_THRESHOLD 0.03
 #define IA32_PERFEVTSEL0 0x186
 #define IA32_PERFEVTSEL1 0x187
 #define IA32_PERFCTR0 0xc1
@@ -112,7 +111,7 @@ apply_policy(int core)
 
ratio = (float)miss_diff * (float)100 / (float)hits_diff;
 
-   if (ratio < RATIO_THRESHOLD)
+   if (ratio < ci->branch_ratio_threshold)
power_manager_scale_core_min(core);
else
power_manager_scale_core_max(core);
diff --git a/examples/vm_power_manager/power_manager.c b/examples/vm_power_manager/power_manager.c
index 4bdde23da..b7769c3c3 100644
--- a/examples/vm_power_manager/power_manager.c
+++ b/examples/vm_power_manager/power_manager.c
@@ -74,6 +74,7 @@ core_info_init(void)
ci = get_core_info();
 
ci->core_count = get_nprocs_conf();
+   ci->branch_ratio_threshold = BRANCH_RATIO_THRESHOLD;
ci->cd = malloc(ci->core_count * sizeof(struct core_details));
if (!ci->cd) {
RTE_LOG(ERR, POWER_MANAGER, "Failed to allocate memory for core info.");
diff --git a/examples/vm_power_manager/power_manager.h b/examples/vm_power_manager/power_manager.h
index 45385de37..605b3c8f6 100644
--- a/examples/vm_power_manager/power_manager.h
+++ b/examples/vm_power_manager/power_manager.h
@@ -19,8 +19,11 @@ struct core_details {
 struct core_info {
uint16_t core_count;
struct core_details *cd;
+   float branch_ratio_threshold;
 };
 
+#define BRANCH_RATIO_THRESHOLD 0.1
+
 struct core_info *
 get_core_info(void);
 
-- 
2.17.1



[dpdk-dev] [PATCH v3 7/9] examples/vm_power: add branch ratio policy type

2018-06-26 Thread David Hunt
Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/channel_monitor.c | 23 +++--
 lib/librte_power/channel_commands.h |  3 ++-
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/examples/vm_power_manager/channel_monitor.c b/examples/vm_power_manager/channel_monitor.c
index 73bddd993..7fa47ba97 100644
--- a/examples/vm_power_manager/channel_monitor.c
+++ b/examples/vm_power_manager/channel_monitor.c
@@ -27,6 +27,7 @@
 #include "channel_commands.h"
 #include "channel_manager.h"
 #include "power_manager.h"
+#include "oob_monitor.h"
 
 #define RTE_LOGTYPE_CHANNEL_MONITOR RTE_LOGTYPE_USER1
 
@@ -92,6 +93,10 @@ get_pcpu_to_control(struct policy *pol)
struct vm_info info;
int pcpu, count;
uint64_t mask_u64b;
+   struct core_info *ci;
+   int ret;
+
+   ci = get_core_info();
 
RTE_LOG(INFO, CHANNEL_MONITOR, "Looking for pcpu for %s\n",
pol->pkt.vm_name);
@@ -100,8 +105,22 @@ get_pcpu_to_control(struct policy *pol)
for (count = 0; count < pol->pkt.num_vcpu; count++) {
mask_u64b = info.pcpu_mask[pol->pkt.vcpu_to_control[count]];
for (pcpu = 0; mask_u64b; mask_u64b &= ~(1ULL << pcpu++)) {
-   if ((mask_u64b >> pcpu) & 1)
-   pol->core_share[count].pcpu = pcpu;
+   if ((mask_u64b >> pcpu) & 1) {
+   if (pol->pkt.policy_to_use == BRANCH_RATIO) {
+   ci->cd[pcpu].oob_enabled = 1;
+   ret = add_core_to_monitor(pcpu);
+   if (ret == 0)
+   printf("Monitoring pcpu %d via Branch Ratio\n",
+   pcpu);
+   else
+   printf("Failed to start OOB Monitoring pcpu %d\n",
+   pcpu);
+
+   } else {
+   pol->core_share[count].pcpu = pcpu;
+   printf("Monitoring pcpu %d\n", pcpu);
+   }
+   }
}
}
 }
diff --git a/lib/librte_power/channel_commands.h 
b/lib/librte_power/channel_commands.h
index 5e8b4ab5d..ee638eefa 100644
--- a/lib/librte_power/channel_commands.h
+++ b/lib/librte_power/channel_commands.h
@@ -48,7 +48,8 @@ enum workload {HIGH, MEDIUM, LOW};
 enum policy_to_use {
TRAFFIC,
TIME,
-   WORKLOAD
+   WORKLOAD,
+   BRANCH_RATIO
 };
 
 struct traffic {
-- 
2.17.1



Re: [dpdk-dev] [PATCH v4 05/24] eal: support mp task be invoked in a separate task

2018-06-26 Thread Thomas Monjalon
26/06/2018 11:02, Burakov, Anatoly:
> On 26-Jun-18 8:08 AM, Qi Zhang wrote:
> > We know the limitation that sync IPC can't be invoked in mp handler
> > itself which will cause deadlock, the patch introduce new API
> > rte_eal_mp_task_add to support mp handler be delegated in a separate
> > task.
> > 
> > Signed-off-by: Qi Zhang 
> > ---
> 
> I would really like to find another solution to this problem. Creating a 
> new thread per hotplug request seems like an overkill - even more so 
> than having two threads. Creating a new thread potentially while the 
> application is working may have other implications (e.g. there's a 
> non-zero amount of time between thread created and thread affinitized, 
> which may disrupt hotpaths).
> 
> It seems to me that the better solution would've been to leave the IPC 
> thread in place. There are two IPC threads in the first place because 
> there was a circular dependency between rte_malloc and alarm API. My 
> patch fixes that - so how about we remove *one* IPC thread, but leave 
> the other one in place?
> 
> Thomas, any thoughts? (quick description - hotplug needs IPC, and 
> hotplug may need to allocate memory, which also needs IPC, which will 
> cause a deadlock if IPC is one thread)

We can keep one IPC thread until we find a better solution.






[dpdk-dev] [PATCH v3 8/9] examples/vm_power: add cli args to guest app

2018-06-26 Thread David Hunt
Add new command line arguments to the guest app to make
testing and validation of the policy usage easier.
These arguments are mainly around setting up the power
management policy that is sent from the guest vm to
the vm_power_manager in the host.

New command line parameters:
-n or --vm-name
   sets the name of the vm to be used by the host OS.
-b or --busy-hours
   sets the list of hours that are predicted to be busy
-q or --quiet-hours
   sets the list of hours that are predicted to be quiet
-l or --vcpu-list
   sets the list of vcpus to monitor
-p or --port-list
   sets the list of ports to monitor when using a
   workload policy.
-o or --policy
   sets the default policy type
  TIME
  WORKLOAD
  TRAFFIC
  BRANCH_RATIO

The format of the hours or list parameters is a comma-separated
list of integers, which can take the form of
   a. x    e.g. --vcpu-list=1
   b. x,y  e.g. --quiet-hours=3,4
   c. x-y  e.g. --busy-hours=9-12
   d. combination of above (e.g. --busy-hours=4,5-7,9)
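
As an illustration of the list format above, here is a minimal sketch in C
of a parser that accepts single values, comma-separated values, ranges and
mixes of them. It is not the parse.c added by this patch; the name
parse_list_sketch() and its return convention are assumptions made for the
example. It marks each member in a flag array, which matches how
parse_args() in this patch walks the hours[] array.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the patch's parse_set()): parse a list such as
 * "1", "3,4", "9-12" or "4,5-7,9" and set flags[v] = 1 for every member v.
 * Returns the number of distinct members, or -1 on malformed input or on a
 * value outside 0..num-1.
 */
int
parse_list_sketch(const char *input, unsigned short flags[], unsigned int num)
{
	const char *p = input;
	char *end;
	int count = 0;

	assert(flags != NULL);
	if (input == NULL || num == 0)
		return -1;
	memset(flags, 0, num * sizeof(flags[0]));

	for (;;) {
		unsigned long lo, hi, v;

		/* every item must start with a digit (rejects "", ",", "-3") */
		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		lo = strtoul(p, &end, 10);
		if (errno != 0)
			return -1;
		hi = lo;

		/* optional "-y" turns the item into the inclusive range x-y */
		if (*end == '-') {
			p = end + 1;
			if (!isdigit((unsigned char)*p))
				return -1;
			errno = 0;
			hi = strtoul(p, &end, 10);
			if (errno != 0)
				return -1;
		}
		if (lo > hi || hi >= num)
			return -1;

		for (v = lo; v <= hi; v++) {
			if (!flags[v]) {
				flags[v] = 1;
				count++;
			}
		}

		if (*end == '\0')
			return count;
		if (*end != ',')
			return -1;
		p = end + 1;
	}
}
```

Called as parse_list_sketch("4,5-7,9", hours, 24), the sketch marks hours
4, 5, 6, 7 and 9 and returns 5. A reversed range such as "5-3" or a value
of 24 or more is rejected with -1.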

Signed-off-by: David Hunt 
Acked-by: Radu Nicolau 
---
 examples/vm_power_manager/guest_cli/Makefile  |   2 +-
 examples/vm_power_manager/guest_cli/main.c| 151 +-
 examples/vm_power_manager/guest_cli/parse.c   |  93 +++
 examples/vm_power_manager/guest_cli/parse.h   |  19 +++
 .../guest_cli/vm_power_cli_guest.c| 113 +++--
 .../guest_cli/vm_power_cli_guest.h|   6 +
 6 files changed, 330 insertions(+), 54 deletions(-)
 create mode 100644 examples/vm_power_manager/guest_cli/parse.c
 create mode 100644 examples/vm_power_manager/guest_cli/parse.h

diff --git a/examples/vm_power_manager/guest_cli/Makefile 
b/examples/vm_power_manager/guest_cli/Makefile
index d710e22d9..8b1db861e 100644
--- a/examples/vm_power_manager/guest_cli/Makefile
+++ b/examples/vm_power_manager/guest_cli/Makefile
@@ -14,7 +14,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 APP = guest_vm_power_mgr
 
 # all source are stored in SRCS-y
-SRCS-y := main.c vm_power_cli_guest.c
+SRCS-y := main.c vm_power_cli_guest.c parse.c
 
 CFLAGS += -O3 -I$(RTE_SDK)/lib/librte_power/
 CFLAGS += $(WERROR_FLAGS)
diff --git a/examples/vm_power_manager/guest_cli/main.c 
b/examples/vm_power_manager/guest_cli/main.c
index b17936d6b..36365b124 100644
--- a/examples/vm_power_manager/guest_cli/main.c
+++ b/examples/vm_power_manager/guest_cli/main.c
@@ -2,23 +2,20 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
-/*
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
-#include 
-*/
 #include 
+#include 
+#include 
 
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "vm_power_cli_guest.h"
+#include "parse.h"
 
 static void
 sig_handler(int signo)
@@ -32,6 +29,136 @@ sig_handler(int signo)
 
 }
 
+#define MAX_HOURS 24
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+   int opt, ret;
+   char **argvopt;
+   int option_index;
+   char *prgname = argv[0];
+   const struct option lgopts[] = {
+   { "vm-name", required_argument, 0, 'n'},
+   { "busy-hours", required_argument, 0, 'b'},
+   { "quiet-hours", required_argument, 0, 'q'},
+   { "port-list", required_argument, 0, 'p'},
+   { "vcpu-list", required_argument, 0, 'l'},
+   { "policy", required_argument, 0, 'o'},
+   {NULL, 0, 0, 0}
+   };
+   struct channel_packet *policy;
+   unsigned short int hours[MAX_HOURS];
+   unsigned short int cores[MAX_VCPU_PER_VM];
+   unsigned short int ports[MAX_VCPU_PER_VM];
+   int i, cnt, idx;
+
+   policy = get_policy();
+   set_policy_defaults(policy);
+
+   argvopt = argv;
+
+   while ((opt = getopt_long(argc, argvopt, "n:b:q:p:",
+ lgopts, &option_index)) != EOF) {
+
+   switch (opt) {
+   /* portmask */
+   case 'n':
+   strcpy(policy->vm_name, optarg);
+   printf("Setting VM Name to [%s]\n", policy->vm_name);
+   break;
+   case 'b':
+   case 'q':
+   //printf("***Processing set using [%s]\n", optarg);
+   cnt = parse_set(optarg, hours, MAX_HOURS);
+   if (cnt < 0) {
+   printf("Invalid value passed to quiet/busy hours - [%s]\n",
+   optarg);
+   break;
+   }
+   idx = 0;
+   for (i = 0; i < MAX_HOURS; i++) {
+   if (hours[i]) {
+   if (opt == 'b') {
+   printf("***Busy Hour %d\n", i);
+  

Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal

2018-06-26 Thread Maxime Coquelin




On 06/26/2018 11:14 AM, Tiwei Bie wrote:

On Tue, Jun 26, 2018 at 04:47:33PM +0800, Stojaczyk, DariuszX wrote:

-Original Message-
From: Bie, Tiwei
Sent: Tuesday, June 26, 2018 10:22 AM
To: Stojaczyk, DariuszX 
Cc: Dariusz Stojaczyk ; dev@dpdk.org; Maxime
Coquelin ; Tetsuya Mukawa
; Stefan Hajnoczi ; Thomas
Monjalon ; y...@fridaylinux.org; Harris, James R
; Kulasek, TomaszX ;
Wodkowski, PawelX 
Subject: Re: [dpdk-dev] [RFC v3 0/7] vhost2: new librte_vhost2 proposal

On Mon, Jun 25, 2018 at 08:17:08PM +0800, Stojaczyk, DariuszX wrote:

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Tiwei Bie
Sent: Monday, June 25, 2018 1:02 PM


Hi Dariusz,



Hi Tiwei,


Thank you for putting efforts in making the DPDK
vhost more generic!

From my understanding, your proposal is that:

1) Introduce rte_vhost2 to provide the APIs which
allow users to implement vhost backends like
SCSI, net, crypto, ..



That's right.


2) Refactor the existing rte_vhost to use rte_vhost2.
The rte_vhost will still provide below existing
sets of APIs:
 1. The APIs which allow users to implement
external vhost backends (these APIs were
designed for SPDK previously)
 2. The APIs provided by the net backend
 3. The APIs provided by the crypto backend
And above APIs in rte_vhost won't be changed.


That's correct. Rte_vhost would register its own rte_vhost2_tgt_ops

underneath and will call existing vhost_device_ops for e.g. starting the device
once all queues are started.

Currently I have below concerns and questions:

- The rte_vhost's problem is still there. Even though
   rte_vhost2 is introduced, the net and crypto backends
   in rte_vhost won't benefit from the new callbacks.

   The existing rte_vhost in DPDK not only provides the
   APIs for DPDK applications to implement the external
   backends. But also provides high performance net and
   crypto backends implementation (maybe more in the
   future). So it's important that besides the DPDK
   applications which implement their external backends,
   the DPDK applications which use the builtin backends
   will also benefit from the new callbacks.

   So we should have a clear plan on how will the legacy
   callbacks in rte_vhost be dealt with in the next step.

   Besides, the new library's name is a bit misleading.
   It makes the existing rte_vhost library sound like an
   obsolete library. But actually the existing rte_vhost
   isn't an obsolete library. It will still provide the
   net and crypto backends. So if we want to introduce
   this new library, we should give it a better name.

- It's possible to solve rte_vhost's problem you met
   by refactoring the existing vhost library directly
   instead of re-implementing a new vhost library from
   scratch and keeping the old one's problem as is.

   In this way, it will solve the problem you met and
   also solve the problem for rte_vhost. Why not go
   this way? Something like:

   Below is the existing callbacks set in rte_vhost.h:

   /**
* Device and vring operations.
*/
   struct vhost_device_ops {
   ..
   };

   It's a legacy implementation, and doesn't really
   follow the DPDK API design (e.g. no rte_ prefix).
   We can design and implement a new message handling
   and a new set of callbacks for rte_vhost to solve
   the problem you met without changing the old one.
   Something like:

   struct rte_vhost_device_ops {
   ..
   }

   int
   vhost_user_msg_handler(struct vhost_dev *vdev, struct vhost_user_msg
*msg)
   {
   ..

   if (!vdev->is_using_new_device_ops) {
   // Call the existing message handler
   return vhost_user_msg_handler_legacy(vdev, msg);
   }

   // Implement the new logic here
   ..
   }

   A vhost application is allowed to register only struct
   rte_vhost_device_ops or struct vhost_device_ops (which
   should be deprecated in the future). The two ops cannot
   be registered at the same time.

   The existing applications could use the old ops. And
   if an application registers struct rte_vhost_device_ops,
   the new callbacks and message handler will be used.


Please notice that some features like vIOMMU are not even a part of the public 
rte_vhost API. Only vhost-net benefits from vIOMMU right now. Separating 
vhost-net from a generic vhost library (rte_vhost2) would avoid making such 
design mistakes in future. What's the point of having a single rte_vhost 
library, if some vhost-user features are only implemented for vhost-net.


These APIs can be safely added at any time.
And we can also ask developers to add public
APIs if it matters when adding new features
in the future. I don't think it's a big
problem.


+1, I don't think it is a problem.
It is better to have it internal only at the beginning than having to
break the API.

Thanks,
Maxime

Best regards,
Tiwei Bie





Best regards,
Tiwei

Re: [dpdk-dev] [PATCH v2 1/3] power: add get capabilities API

2018-06-26 Thread Hunt, David



On 11/6/2018 11:03 AM, Radu Nicolau wrote:

New API added, rte_power_get_capabilities(), that allows the
application to query the power and performance capabilities
of the CPU cores.

Signed-off-by: Radu Nicolau 
---
v2: fixed coding style errors, split test into separate patch

  lib/librte_power/power_acpi_cpufreq.c  | 21 +
  lib/librte_power/power_acpi_cpufreq.h  | 17 +
  lib/librte_power/power_kvm_vm.c|  8 
  lib/librte_power/power_kvm_vm.h| 17 +
  lib/librte_power/rte_power.c   |  3 +++
  lib/librte_power/rte_power.h   | 33 +
  lib/librte_power/rte_power_version.map |  7 +++
  7 files changed, 106 insertions(+)

diff --git a/lib/librte_power/power_acpi_cpufreq.c 
b/lib/librte_power/power_acpi_cpufreq.c
index bce933e..cd5978d 100644
--- a/lib/librte_power/power_acpi_cpufreq.c
+++ b/lib/librte_power/power_acpi_cpufreq.c
@@ -623,3 +623,24 @@ power_acpi_disable_turbo(unsigned int lcore_id)
  
  	return 0;

  }
+
+int power_acpi_get_capabilities(unsigned int lcore_id,
+   struct rte_power_core_capabilities *caps)
+{
+   struct rte_power_info *pi;
+
+   if (lcore_id >= RTE_MAX_LCORE) {
+   RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+   return -1;
+   }
+   if (caps == NULL) {
+   RTE_LOG(ERR, POWER, "Invalid argument\n");
+   return -1;
+   }
+
+   pi = &lcore_power_info[lcore_id];
+   caps->capabilities = 0;
+   caps->turbo = !!(pi->turbo_available);
+
+   return 0;
+}
diff --git a/lib/librte_power/power_acpi_cpufreq.h 
b/lib/librte_power/power_acpi_cpufreq.h
index edeeb27..77701a9 100644
--- a/lib/librte_power/power_acpi_cpufreq.h
+++ b/lib/librte_power/power_acpi_cpufreq.h
@@ -14,6 +14,7 @@
  #include 
  #include 
  #include 
+#include "rte_power.h"
  
  #ifdef __cplusplus

  extern "C" {
@@ -196,6 +197,22 @@ int power_acpi_enable_turbo(unsigned int lcore_id);
   */
  int power_acpi_disable_turbo(unsigned int lcore_id);
  
+/**

+ * Returns power capabilities for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param caps
+ *  pointer to rte_power_core_capabilities object.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_acpi_get_capabilities(unsigned int lcore_id,
+   struct rte_power_core_capabilities *caps);
+
+
  #ifdef __cplusplus
  }
  #endif
diff --git a/lib/librte_power/power_kvm_vm.c b/lib/librte_power/power_kvm_vm.c
index 38e9066..20659b7 100644
--- a/lib/librte_power/power_kvm_vm.c
+++ b/lib/librte_power/power_kvm_vm.c
@@ -124,3 +124,11 @@ power_kvm_vm_disable_turbo(unsigned int lcore_id)
  {
return send_msg(lcore_id, CPU_POWER_DISABLE_TURBO);
  }
+
+struct rte_power_core_capabilities;
+int power_kvm_vm_get_capabilities(__rte_unused unsigned int lcore_id,
+   __rte_unused struct rte_power_core_capabilities *caps)
+{
+   RTE_LOG(ERR, POWER, "rte_power_get_capabilities is not implemented for Virtual Machine Power Management\n");
+   return -ENOTSUP;
+}
diff --git a/lib/librte_power/power_kvm_vm.h b/lib/librte_power/power_kvm_vm.h
index 446d699..94d4aa1 100644
--- a/lib/librte_power/power_kvm_vm.h
+++ b/lib/librte_power/power_kvm_vm.h
@@ -14,6 +14,7 @@
  #include 
  #include 
  #include 
+#include "rte_power.h"
  
  #ifdef __cplusplus

  extern "C" {
@@ -177,6 +178,22 @@ int power_kvm_vm_enable_turbo(unsigned int lcore_id);
   *  - Negative on error.
   */
  int power_kvm_vm_disable_turbo(unsigned int lcore_id);
+
+/**
+ * Returns power capabilities for a specific lcore.
+ *
+ * @param lcore_id
+ *  lcore id.
+ * @param caps
+ *  pointer to rte_power_core_capabilities object.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+int power_kvm_vm_get_capabilities(unsigned int lcore_id,
+   struct rte_power_core_capabilities *caps);
+
  #ifdef __cplusplus
  }
  #endif
diff --git a/lib/librte_power/rte_power.c b/lib/librte_power/rte_power.c
index 6c8fb40..208b791 100644
--- a/lib/librte_power/rte_power.c
+++ b/lib/librte_power/rte_power.c
@@ -24,6 +24,7 @@ rte_power_freq_change_t rte_power_freq_min = NULL;
  rte_power_freq_change_t rte_power_turbo_status;
  rte_power_freq_change_t rte_power_freq_enable_turbo;
  rte_power_freq_change_t rte_power_freq_disable_turbo;
+rte_power_get_capabilities_t rte_power_get_capabilities;
  
  int

  rte_power_set_env(enum power_management_env env)
@@ -42,6 +43,7 @@ rte_power_set_env(enum power_management_env env)
rte_power_turbo_status = power_acpi_turbo_status;
rte_power_freq_enable_turbo = power_acpi_enable_turbo;
rte_power_freq_disable_turbo = power_acpi_disable_turbo;
+   rte_power_get_capabilities = power_acpi_get_capabilities;
} else if (env == PM_ENV_KVM_VM) {
rte_power_freqs = power_kvm_vm_freqs;
rte_power_get_freq = power_kvm_v

Re: [dpdk-dev] [PATCH v2 2/3] test/power: add unit test for get capabilities API

2018-06-26 Thread Hunt, David




On 11/6/2018 11:03 AM, Radu Nicolau wrote:

Signed-off-by: Radu Nicolau 
---
  test/test/test_power_acpi_cpufreq.c | 42 +
  1 file changed, 42 insertions(+)

diff --git a/test/test/test_power_acpi_cpufreq.c 
b/test/test/test_power_acpi_cpufreq.c
index 8da2dcc..67d5ee0 100644
--- a/test/test/test_power_acpi_cpufreq.c
+++ b/test/test/test_power_acpi_cpufreq.c
@@ -18,6 +18,12 @@ test_power_acpi_cpufreq(void)
printf("Power management library not supported, skipping test\n");
return TEST_SKIPPED;
  }
+static int
+test_power_acpi_caps(void)
+{
+   printf("Power management library not supported, skipping test\n");
+   return TEST_SKIPPED;
+}
  
  #else

  #include 
@@ -517,6 +523,42 @@ test_power_acpi_cpufreq(void)
rte_power_unset_env();
return -1;
  }
+
+static int
+test_power_acpi_caps(void)
+{
+   struct rte_power_core_capabilities caps;
+   int ret;
+
+   ret = rte_power_set_env(PM_ENV_ACPI_CPUFREQ);
+   if (ret) {
+   printf("Error setting ACPI environment\n");
+   return -1;
+   }
+
+   ret = rte_power_init(TEST_POWER_LCORE_ID);
+   if (ret < 0) {
+   printf("Cannot initialise power management for lcore %u, this "
+   "may occur if environment is not configured "
+   "correctly(APCI cpufreq) or operating in another valid "
+   "Power management environment\n", TEST_POWER_LCORE_ID);
+   rte_power_unset_env();
+   return -1;
+   }
+
+   ret = rte_power_get_capabilities(TEST_POWER_LCORE_ID, &caps);
+   if (ret) {
+   printf("ACPI: Error getting capabilities\n");
+   return -1;
+   }
+
+   printf("ACPI: Capabilities %lx\n", caps.capabilities);
+
+   rte_power_unset_env();
+   return 0;
+}
+
  #endif
  
  REGISTER_TEST_COMMAND(power_acpi_cpufreq_autotest, test_power_acpi_cpufreq);

+REGISTER_TEST_COMMAND(power_acpi_caps_autotest, test_power_acpi_caps);


Acked-by: David Hunt



Re: [dpdk-dev] [PATCH v2 3/3] examples/l3fw-power: add high/regular performance cores option

2018-06-26 Thread Hunt, David




On 11/6/2018 11:03 AM, Radu Nicolau wrote:

Added high/regular performance core pinning configuration options
that can be used in place of the existing 'config' option.

'--high-perf-cores CORELIST' option allows the user to specify a
high performance cores list; if this option is not used and the
'perf-config' option is used, the application will query the
system using the rte_power library in order to get a list of
available high performance cores. The cores that are considered
high performance are the cores that have turbo enabled.

'--perf-config (port,queue,hi_perf,lcore_index)'
option is similar to the existing config option, the cores are specified
as indices for bins containing high or regular performance cores.

Example:

l3fwd-power -l 6,7 -- -p 0xff \
--high-perf-cores 6 --perf-config="(0,0,0,0),(1,0,1,0)"

cores 6 and 7 are used, core 6 is specified as a high performance core.
port 0 queue 0 will use a regular performance core, index 0 (core 7)
port 1 queue 0 will use a high performance core, index 0 (core 6)
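
The index-to-core mapping in the example above can be sketched as follows.
This is an illustrative sketch, not the code from perf_core.c;
resolve_perf_lcore() and its parameters are hypothetical names, and it
assumes that each bin lists cores in the order they appear in the enabled
core list.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: resolve a (hi_perf, lcore_index) pair from a
 * --perf-config tuple to an lcore ID.
 *
 * enabled[]  cores given to the application (e.g. from -l 6,7)
 * hi_perf[]  cores flagged as high performance (e.g. --high-perf-cores 6)
 *
 * Cores found in hi_perf[] form the high performance bin, all other
 * enabled cores form the regular bin. Returns the core at position
 * lcore_index of the requested bin, or -1 if the bin is too small.
 */
int
resolve_perf_lcore(const unsigned int *enabled, size_t nb_enabled,
		   const unsigned int *hi_perf, size_t nb_hi_perf,
		   int want_hi_perf, size_t lcore_index)
{
	size_t i, j, seen = 0;

	assert(enabled != NULL);

	for (i = 0; i < nb_enabled; i++) {
		int is_hi_perf = 0;

		/* is this enabled core part of the high performance list? */
		for (j = 0; j < nb_hi_perf; j++) {
			if (hi_perf[j] == enabled[i]) {
				is_hi_perf = 1;
				break;
			}
		}

		/* skip cores that belong to the other bin */
		if (is_hi_perf != !!want_hi_perf)
			continue;

		if (seen++ == lcore_index)
			return (int)enabled[i];
	}

	return -1;
}
```

With enabled = {6, 7} and hi_perf = {6}, the tuple (0,0,0,0) resolves to
core 7 and (1,0,1,0) resolves to core 6, matching the example above.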

Signed-off-by: Radu Nicolau 
---
  examples/l3fwd-power/Makefile|   4 +-
  examples/l3fwd-power/main.c  |  74 ++---
  examples/l3fwd-power/main.h  |  20 
  examples/l3fwd-power/meson.build |   4 +-
  examples/l3fwd-power/perf_core.c | 230 +++
  examples/l3fwd-power/perf_core.h |  12 ++
  6 files changed, 323 insertions(+), 21 deletions(-)
  create mode 100644 examples/l3fwd-power/main.h
  create mode 100644 examples/l3fwd-power/perf_core.c
  create mode 100644 examples/l3fwd-power/perf_core.h

diff --git a/examples/l3fwd-power/Makefile b/examples/l3fwd-power/Makefile
index 390b7d6..1a46033 100644
--- a/examples/l3fwd-power/Makefile
+++ b/examples/l3fwd-power/Makefile
@@ -1,11 +1,11 @@
  # SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2010-2018 Intel Corporation
  
  # binary name

  APP = l3fwd-power
  
  # all source are stored in SRCS-y

-SRCS-y := main.c
+SRCS-y := main.c perf_core.c
  
  # Build using pkg-config variables if possible

  $(shell pkg-config --exists libdpdk)
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 596d645..2268d54 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1,5 +1,5 @@
  /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2016 Intel Corporation
+ * Copyright(c) 2010-2018 Intel Corporation
   */
  
  #include 

@@ -44,6 +44,9 @@
  #include 
  #include 
  
+#include "perf_core.h"

+#include "main.h"
+
  #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1
  
  #define MAX_PKT_BURST 32

@@ -155,14 +158,7 @@ struct lcore_rx_queue {
  #define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
  
  
-#define MAX_LCORE_PARAMS 1024

-struct lcore_params {
-   uint16_t port_id;
-   uint8_t queue_id;
-   uint8_t lcore_id;
-} __rte_cache_aligned;
-
-static struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
+struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
  static struct lcore_params lcore_params_array_default[] = {
{0, 0, 2},
{0, 1, 2},
@@ -175,8 +171,8 @@ static struct lcore_params lcore_params_array_default[] = {
{3, 1, 3},
  };
  
-static struct lcore_params * lcore_params = lcore_params_array_default;

-static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
+struct lcore_params *lcore_params = lcore_params_array_default;
+uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
sizeof(lcore_params_array_default[0]);
  
  static struct rte_eth_conf port_conf = {

@@ -1121,10 +1117,15 @@ print_usage(const char *prgname)
  {
printf ("%s [EAL options] -- -p PORTMASK -P"
"  [--config (port,queue,lcore)[,(port,queue,lcore]]"
+   "  [--high-perf-cores CORELIST"
+   "  [--perf-config (port,queue,hi_perf,lcore_index)[,(port,queue,hi_perf,lcore_index]]"
"  [--enable-jumbo [--max-pkt-len PKTLEN]]\n"
"  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
"  -P : enable promiscuous mode\n"
"  --config (port,queue,lcore): rx queues configuration\n"
+   "  --high-perf-cores CORELIST: list of high performance cores\n"
+   "  --perf-config: similar as config, cores specified as indices"
+   " for bins containing high or regular performance cores\n"
"  --no-numa: optional, disable numa awareness\n"
"  --enable-jumbo: enable jumbo frame"
" which max packet len is PKTLEN in decimal (64-9600)\n"
@@ -1234,6 +1235,8 @@ parse_args(int argc, char **argv)
char *prgname = argv[0];
static struct option lgopts[] = {
{"config", 1, 0, 0},
+   {"perf-config", 1, 0, 0},
+   {"high-perf-cores", 1, 0, 0},
{"no-numa", 0, 0, 0},
{"enable-jumbo", 

Re: [dpdk-dev] [PATCH v4 05/24] eal: support mp task be invoked in a separate task

2018-06-26 Thread Zhang, Qi Z



> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Tuesday, June 26, 2018 5:24 PM
> To: Burakov, Anatoly 
> Cc: Zhang, Qi Z ; Ananyev, Konstantin
> ; dev@dpdk.org; Richardson, Bruce
> ; Yigit, Ferruh ; Shelton,
> Benjamin H ; Vangati, Narender
> 
> Subject: Re: [PATCH v4 05/24] eal: support mp task be invoked in a separate
> task
> 
> 26/06/2018 11:02, Burakov, Anatoly:
> > On 26-Jun-18 8:08 AM, Qi Zhang wrote:
> > > We know the limitation that sync IPC can't be invoked in mp handler
> > > itself which will cause deadlock, the patch introduce new API
> > > rte_eal_mp_task_add to support mp handler be delegated in a separate
> > > task.
> > >
> > > Signed-off-by: Qi Zhang 
> > > ---
> >
> > I would really like to find another solution to this problem. Creating
> > a new thread per hotplug request seems like an overkill - even more so
> > than having two threads. Creating a new thread potentially while the
> > application is working may have other implications (e.g. there's a
> > non-zero amount of time between thread created and thread affinitized,
> > which may disrupt hotpaths).
> >
> > It seems to me that the better solution would've been to leave the IPC
> > thread in place. There are two IPC threads in the first place because
> > there was a circular dependency between rte_malloc and alarm API. My
> > patch fixes that - so how about we remove *one* IPC thread, but leave
> > the other one in place?
> >
> > Thomas, any thoughts? (quick description - hotplug needs IPC, and
> > hotplug may need to allocate memory, which also needs IPC, which will
> > cause a deadlock if IPC is one thread)
> 
> We can keep one IPC thread until we find a better solution.
> 
> 
OK, then I will delegate the task to the interrupt thread and remove the
temporary thread solution.

Thanks
Qi

> 



Re: [dpdk-dev] [PATCH v2 2/3] app/testpmd: enable UDP GSO in csum engine

2018-06-26 Thread Iremonger, Bernard
> -Original Message-
> From: Hu, Jiayu
> Sent: Sunday, June 17, 2018 4:13 AM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin ; Zhang, Yuwei1
> ; Iremonger, Bernard
> ; Hu, Jiayu 
> Subject: [PATCH v2 2/3] app/testpmd: enable UDP GSO in csum engine
> 
> This patch enables GSO for UDP/IPv4 packets. Oversized UDP/IPv4 packets
> transmitted over a GSO-enabled port will undergo segmentation.
> 
> Signed-off-by: Jiayu Hu 
> ---
Acked-by: Bernard Iremonger 




Re: [dpdk-dev] [PATCH 1/2] eal: remove deprecated function returning mbuf pool ops name

2018-06-26 Thread Olivier Matz
On Tue, Jun 26, 2018 at 11:12:35AM +0200, Olivier Matz wrote:
> rte_eal_mbuf_default_mempool_ops() is replaced by
> rte_mbuf_best_mempool_ops().
> 
> Signed-off-by: Olivier Matz 

Self nack, rebase issue between these 2 patches.
Thanks Thomas for spotting it.

Will send a v2.


Re: [dpdk-dev] [dpdk-stable] [PATCH v2] kni: fix build with gcc 8.1

2018-06-26 Thread Thomas Monjalon
26/06/2018 11:02, Ferruh Yigit:
> --- a/kernel/linux/kni/ethtool/igb/igb_ethtool.c
> +++ b/kernel/linux/kni/ethtool/igb/igb_ethtool.c
> @@ -811,9 +811,10 @@ static void igb_get_drvinfo(struct net_device *netdev,
>   strncpy(drvinfo->driver,  igb_driver_name, sizeof(drvinfo->driver) - 1);
>   strncpy(drvinfo->version, igb_driver_version, sizeof(drvinfo->version) 
> - 1);
>  -strncpy(drvinfo->fw_version, adapter->fw_version,
> - sizeof(drvinfo->fw_version) - 1);

There is a leading space before the minus on the first line.




[dpdk-dev] [PATCH v2 1/2] eal: remove deprecated function returning mbuf pool ops name

2018-06-26 Thread Olivier Matz
rte_eal_mbuf_default_mempool_ops() is replaced by
rte_mbuf_best_mempool_ops().

Signed-off-by: Olivier Matz 
---

v2:
* remove rte_eal_mbuf_user_pool_ops from .map in next patch instead of this

 doc/guides/rel_notes/deprecation.rst|  9 -
 lib/librte_eal/bsdapp/eal/eal.c | 10 --
 lib/librte_eal/common/include/rte_eal.h | 11 ---
 lib/librte_eal/linuxapp/eal/eal.c   | 10 --
 lib/librte_eal/rte_eal_version.map  |  1 -
 5 files changed, 41 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 1ce692eac..5bf680515 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -37,15 +37,6 @@ Deprecation Notices
   - ``eal_parse_pci_DomBDF`` replaced by ``rte_pci_addr_parse``
   - ``rte_eal_compare_pci_addr`` replaced by ``rte_pci_addr_cmp``
 
-* eal: a new set of mbuf mempool ops name APIs for user, platform and best
-  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
-  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
-  ``rte_mbuf_best_mempool_ops``.
-  The following function is deprecated since 18.05, and will be removed
-  in 18.08:
-
-  - ``rte_eal_mbuf_default_mempool_ops``
-
 * mbuf: The opaque ``mbuf->hash.sched`` field will be updated to support generic
   definition in line with the ethdev TM and MTR APIs. Currently, this field
   is defined in librte_sched in a non-generic way. The new generic format
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index dc279542d..f7cced725 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -153,16 +153,6 @@ rte_eal_mbuf_user_pool_ops(void)
return internal_config.user_mbuf_pool_ops_name;
 }
 
-/* Return mbuf pool ops name */
-const char *
-rte_eal_mbuf_default_mempool_ops(void)
-{
-   if (internal_config.user_mbuf_pool_ops_name == NULL)
-   return RTE_MBUF_DEFAULT_MEMPOOL_OPS;
-
-   return internal_config.user_mbuf_pool_ops_name;
-}
-
 /* Return a pointer to the configuration structure */
 struct rte_config *
 rte_eal_get_configuration(void)
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index 8de5d69e8..0c9c3f13b 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -501,17 +501,6 @@ enum rte_iova_mode rte_eal_iova_mode(void);
 const char * __rte_experimental
 rte_eal_mbuf_user_pool_ops(void);
 
-/**
- * @deprecated
- * Get default pool ops name for mbuf
- *
- * @return
- *   returns default pool ops name.
- */
-__rte_deprecated
-const char *
-rte_eal_mbuf_default_mempool_ops(void);
-
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..cf2a8082b 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -161,16 +161,6 @@ rte_eal_mbuf_user_pool_ops(void)
return internal_config.user_mbuf_pool_ops_name;
 }
 
-/* Return mbuf pool ops name */
-const char *
-rte_eal_mbuf_default_mempool_ops(void)
-{
-   if (internal_config.user_mbuf_pool_ops_name == NULL)
-   return RTE_MBUF_DEFAULT_MEMPOOL_OPS;
-
-   return internal_config.user_mbuf_pool_ops_name;
-}
-
 /* Return a pointer to the configuration structure */
 struct rte_config *
 rte_eal_get_configuration(void)
diff --git a/lib/librte_eal/rte_eal_version.map 
b/lib/librte_eal/rte_eal_version.map
index f7dd0e7bc..d0564d11f 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -181,7 +181,6 @@ DPDK_17.11 {
rte_bus_get_iommu_class;
rte_eal_has_pci;
rte_eal_iova_mode;
-   rte_eal_mbuf_default_mempool_ops;
rte_eal_using_phys_addrs;
rte_eal_vfio_intr_mode;
rte_lcore_has_role;
-- 
2.11.0



Re: [dpdk-dev] [PATCH] net/thunderx: fix build with gcc optimization on

2018-06-26 Thread Jerin Jacob
-Original Message-
> Date: Tue, 26 Jun 2018 10:17:14 +0100
> From: Ferruh Yigit 
> To: Jerin Jacob 
> CC: Maciej Czekaj , dev@dpdk.org,
>  sta...@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] net/thunderx: fix build with gcc
>  optimization on
> User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
>  Thunderbird/52.8.0
> 
> 
> On 6/24/2018 1:17 PM, Jerin Jacob wrote:
> > -Original Message-
> >> Date: Thu, 21 Jun 2018 19:14:50 +0100
> >> From: Ferruh Yigit 
> >> To: Jerin Jacob , Maciej Czekaj
> >>  
> >> CC: dev@dpdk.org, Ferruh Yigit , sta...@dpdk.org
> >> Subject: [PATCH] net/thunderx: fix build with gcc optimization on
> >> X-Mailer: git-send-email 2.17.1
> >>
> >>
> >> build error gcc version 6.3.1 20161221 (Red Hat 6.3.1-1),
> >> with EXTRA_CFLAGS="-O3":
> >>
> >> .../drivers/net/thunderx/nicvf_ethdev.c:907:9:
> >>error: ‘txq’ may be used uninitialized in this function
> >>[-Werror=maybe-uninitialized]
> >>   if (txq->pool_free == nicvf_single_pool_free_xmited_buffers)
> >>   ~~~^~~
> >> .../drivers/net/thunderx/nicvf_ethdev.c:886:20:
> >>note: ‘txq’ was declared here
> >>   struct nicvf_txq *txq;
> >> ^~~
> >>
> >> Same error on function 'nicvf_eth_dev_init' and 'nicvf_dev_start', it
> >> seems 'nicvf_set_tx_function' inlined when optimization enabled.
> >>
> >> Initialize the txq and add NULL check before using it to fix.
> >>
> >> Fixes: 7413feee662d ("net/thunderx: add device start/stop and close")
> >> Cc: sta...@dpdk.org
> >>
> >> Reported-by: Richard Walsh 
> >> Signed-off-by: Ferruh Yigit 
> >
> > Acked-by: Jerin Jacob 
> >
> >> ---
> >>
> >> Btw, no compiler optimization enabled, only nicvf_rxtx.c has -Ofast,
> >> is this intentional?
> >
> > Yes. At least in our setup, -Ofast turns out to be super set of -O3.
> 
> That is what gcc documents about -Ofast, but again it is only for single
> nicvf_rxtx.c file. The problem seen with -O3 case with other file.

OK. For the other files, we intentionally did not use -O3, as they are in the slow path.

> 
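
For readers following the -Wmaybe-uninitialized discussion above, here is a minimal sketch of the "initialize the pointer and add a NULL check" pattern the commit message describes. It is an illustration only: struct demo_txq and demo_first_txq_single_pool() are hypothetical stand-ins, not the real nicvf structures or functions, and the loop shape is an assumption about why the compiler could not prove the pointer was set.

```c
#include <stddef.h>

/* Hypothetical stand-in; not the real nicvf_txq structure. */
struct demo_txq {
	int single_pool;	/* 1 if the queue frees into a single mempool */
};

/*
 * Return 1 if the first configured queue uses a single pool,
 * 0 if it does not, and -1 if no queue is configured at all.
 */
static int
demo_first_txq_single_pool(struct demo_txq **queues, size_t nb_queues)
{
	struct demo_txq *txq = NULL;	/* initialized, so never read uninitialized */
	size_t i;

	/* Pick the first queue that has been set up. */
	for (i = 0; i < nb_queues; i++) {
		if (queues[i] != NULL) {
			txq = queues[i];
			break;
		}
	}

	/* NULL check before the dereference: covers the "no queue" case. */
	if (txq == NULL)
		return -1;

	return txq->single_pool ? 1 : 0;
}
```

Without the initializer, a loop that runs zero times leaves the pointer unset, which is the path gcc complains about once the function is inlined at -O3.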


[dpdk-dev] [PATCH v2 2/2] eal: remove experimental tag from user mbuf pool ops func

2018-06-26 Thread Olivier Matz
Remove experimental tag from rte_eal_mbuf_user_pool_ops().

Signed-off-by: Olivier Matz 
---

v2:
* remove rte_eal_mbuf_user_pool_ops from .map in this patch instead of previous one

 lib/librte_eal/bsdapp/eal/eal.c | 2 +-
 lib/librte_eal/common/include/rte_eal.h | 5 +
 lib/librte_eal/linuxapp/eal/eal.c   | 2 +-
 lib/librte_eal/rte_eal_version.map  | 7 ++-
 lib/librte_mbuf/Makefile| 1 -
 lib/librte_mbuf/meson.build | 1 -
 6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index f7cced725..98c689b16 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -147,7 +147,7 @@ eal_get_runtime_dir(void)
 }
 
 /* Return user provided mbuf pool ops name */
-const char * __rte_experimental
+const char *
 rte_eal_mbuf_user_pool_ops(void)
 {
return internal_config.user_mbuf_pool_ops_name;
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 0c9c3f13b..e114dcbdc 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -490,15 +490,12 @@ static inline int rte_gettid(void)
 enum rte_iova_mode rte_eal_iova_mode(void);
 
 /**
- * @warning
- * @b EXPERIMENTAL: this API may change without prior notice
- *
  * Get user provided pool ops name for mbuf
  *
  * @return
  *   returns user provided pool ops name.
  */
-const char * __rte_experimental
+const char *
 rte_eal_mbuf_user_pool_ops(void);
 
 #ifdef __cplusplus
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index cf2a8082b..71ec2be9f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -155,7 +155,7 @@ eal_get_runtime_dir(void)
 }
 
 /* Return user provided mbuf pool ops name */
-const char * __rte_experimental
+const char *
 rte_eal_mbuf_user_pool_ops(void)
 {
return internal_config.user_mbuf_pool_ops_name;
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d0564d11f..c151c8454 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -240,6 +240,12 @@ DPDK_18.05 {
 
 } DPDK_18.02;
 
+DPDK_18.08 {
+   global:
+
+   rte_eal_mbuf_user_pool_ops;
+} DPDK_18.05;
+
 EXPERIMENTAL {
global:
 
@@ -258,7 +264,6 @@ EXPERIMENTAL {
rte_eal_cleanup;
rte_eal_hotplug_add;
rte_eal_hotplug_remove;
-   rte_eal_mbuf_user_pool_ops;
rte_fbarray_attach;
rte_fbarray_destroy;
rte_fbarray_detach;
diff --git a/lib/librte_mbuf/Makefile b/lib/librte_mbuf/Makefile
index 8749a00fe..e2b98a254 100644
--- a/lib/librte_mbuf/Makefile
+++ b/lib/librte_mbuf/Makefile
@@ -6,7 +6,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 # library name
 LIB = librte_mbuf.a
 
-CFLAGS += -DALLOW_EXPERIMENTAL_API
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
 LDLIBS += -lrte_eal -lrte_mempool
 
diff --git a/lib/librte_mbuf/meson.build b/lib/librte_mbuf/meson.build
index 869c17c1c..45ffb0db5 100644
--- a/lib/librte_mbuf/meson.build
+++ b/lib/librte_mbuf/meson.build
@@ -2,7 +2,6 @@
 # Copyright(c) 2017 Intel Corporation
 
 version = 3
-allow_experimental_apis = true
 sources = files('rte_mbuf.c', 'rte_mbuf_ptype.c', 'rte_mbuf_pool_ops.c')
 headers = files('rte_mbuf.h', 'rte_mbuf_ptype.h', 'rte_mbuf_pool_ops.h')
 deps += ['mempool']
-- 
2.11.0



Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt handler

2018-06-26 Thread Zhang, Qi Z


> -Original Message-
> From: Yigit, Ferruh
> Sent: Tuesday, June 26, 2018 5:15 PM
> To: Zhang, Qi Z ; Stephen Hemminger
> 
> Cc: Xing, Beilei ; Wu, Jingjing 
> ;
> Yu, De ; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt handler
> 
> On 6/24/2018 11:56 AM, Zhang, Qi Z wrote:
> > Hi Stephen:
> >
> >> -Original Message-
> >> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> >> Sent: Friday, June 22, 2018 11:44 PM
> >> To: Zhang, Qi Z 
> >> Cc: Xing, Beilei ; Wu, Jingjing
> >> ; Yu, De ; dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt
> >> handler
> >>
> >> On Fri, 22 Jun 2018 08:44:14 +0800
> >> Qi Zhang  wrote:
> >>
> >>> For i40evf, internal rx interrupt and adminq interrupt share the
> >>> same source, that cause a lot cpu cycles be wasted on interrupt
> >>> handler on rx path. This is complained by customers which require
> >>> low latency (when set I40E_ITR_INTERVAL to small value), but have to
> >>> be sufferred by tremendous interrupts handling that eat significant CPU
> resources.
> >>>
> >>> The patch disable pci interrupt and remove the interrupt handler,
> >>> replace it with a low frequency (50ms) interrupt polling daemon
> >>> which is implemented by registering a alarm callback periodly, this
> >>> save CPU time significently: On a typical x86 server with 2.1GHz
> >>> CPU, with low latency configure (32us) we saw CPU usage from top
> >>> commmand reduced from 20% to 0% on management core in testpmd).
> >>>
> >>> Also with the new method we can remove compile option:
> >>> I40E_ITR_INTERVAL which is used to balance between low latency and
> >>> low
> >> CPU usage previously.
> >>> Now we don't need it since we can reach both at same time.
> >>>
> >>> Suggested-by: Jingjing Wu 
> >>> Signed-off-by: Qi Zhang 
> >>> ---
> >>>
> >>> v2:
> >>> - update doc
> >>>
> >>>  config/common_base|  2 --
> >>>  doc/guides/nics/i40e.rst  |  5 -
> >>>  drivers/net/i40e/i40e_ethdev.c|  3 +--
> >>>  drivers/net/i40e/i40e_ethdev.h| 22 +++---
> >>>  drivers/net/i40e/i40e_ethdev_vf.c | 36
> >>> ++--
> >>>  5 files changed, 26 insertions(+), 42 deletions(-)
> >>>
> >>> diff --git a/config/common_base b/config/common_base index
> >>> 6b0d1cbbb..9e21c6865 100644
> >>> --- a/config/common_base
> >>> +++ b/config/common_base
> >>> @@ -264,8 +264,6 @@ CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y
> >>>  CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
> >>>  CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF=64
> >>>  CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
> >>> -# interval up to 8160 us, aligned to 2 (or default value)
> >>> -CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1
> >>>
> >>>  #
> >>>  # Compile burst-oriented FM10K PMD
> >>> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
> >>> index
> >>> 18549bf5a..3fc4ceac7 100644
> >>> --- a/doc/guides/nics/i40e.rst
> >>> +++ b/doc/guides/nics/i40e.rst
> >>> @@ -96,11 +96,6 @@ Please note that enabling debugging options may
> >> affect system performance.
> >>>
> >>>Number of queues reserved for each VMDQ Pool.
> >>>
> >>> -- ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` (default ``-1``)
> >>> -
> >>> -  Interrupt Throttling interval.
> >>> -
> >>> -
> >>>  Runtime Config Options
> >>>  ~~
> >>>
> >>> diff --git a/drivers/net/i40e/i40e_ethdev.c
> >>> b/drivers/net/i40e/i40e_ethdev.c index 13c5d3296..c8f9566e0 100644
> >>> --- a/drivers/net/i40e/i40e_ethdev.c
> >>> +++ b/drivers/net/i40e/i40e_ethdev.c
> >>> @@ -1829,8 +1829,7 @@ __vsi_queues_bind_intr(struct i40e_vsi *vsi,
> >> uint16_t msix_vect,
> >>>   /* Write first RX queue to Link list register as the head element */
> >>>   if (vsi->type != I40E_VSI_SRIOV) {
> >>>   uint16_t interval =
> >>> - i40e_calc_itr_interval(RTE_LIBRTE_I40E_ITR_INTERVAL, 1,
> >>> -pf->support_multi_driver);
> >>> + i40e_calc_itr_interval(1, pf->support_multi_driver);
> >>>
> >>>   if (msix_vect == I40E_MISC_VEC_ID) {
> >>>   I40E_WRITE_REG(hw, I40E_PFINT_LNKLST0, diff --git
> >>> a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
> >>> index 11c4c76bd..53dac 100644
> >>> --- a/drivers/net/i40e/i40e_ethdev.h
> >>> +++ b/drivers/net/i40e/i40e_ethdev.h
> >>> @@ -178,7 +178,7 @@ enum i40e_flxpld_layer_idx {
> >>>  #define I40E_ITR_INDEX_NONE 3
> >>>  #define I40E_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
> >>>  #define I40E_QUEUE_ITR_INTERVAL_MAX 8160 /* 8160 us */
> >>> -#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 8160 /* 8160 us */
> >>> +#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
> >>>  /* Special FW support this floating VEB feature */  #define
> >>> FLOATING_VEB_SUPPORTED_FW_MAJ 5  #define
> >> FLOATING_VEB_SUPPORTED_FW_MIN
> >>> 0 @@ -1328,17 +1328,17 @@ i40e_align_floor(int n)  }
> >>>
> >>>  static inline uint16_t

Re: [dpdk-dev] [PATCH v2 4/4] app/testpmd: show example to handle hot unplug

2018-06-26 Thread Iremonger, Bernard
> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jeff Guo
> Sent: Friday, June 22, 2018 12:51 PM
> To: step...@networkplumber.org; Richardson, Bruce
> ; Yigit, Ferruh ; Ananyev,
> Konstantin ; gaetan.ri...@6wind.com; Wu,
> Jingjing ; tho...@monjalon.net;
> mo...@mellanox.com; ma...@mellanox.com; Van Haaren, Harry
> ; Zhang, Qi Z ; He,
> Shaopeng 
> Cc: jblu...@infradead.org; shreyansh.j...@nxp.com; dev@dpdk.org; Guo, Jia
> ; Zhang, Helin 
> Subject: [dpdk-dev] [PATCH v2 4/4] app/testpmd: show example to handle hot
> unplug
> 
> Use testpmd for example, to show how an application smoothly handle failure
> when device being hot unplug. If app have enabled the device event monitor and
> register the hot plug event’s callback before running, once app detect the
> removal event, the callback would be called. It will first stop the packet
> forwarding, then stop the port, close the port, and finally detach the port to
> remove the device out from the device lists.
> 
> Signed-off-by: Jeff Guo 
> ---

Acked-by: Bernard Iremonger 



Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt handler

2018-06-26 Thread Ferruh Yigit
On 6/26/2018 11:04 AM, Zhang, Qi Z wrote:
> 
> 
>> -Original Message-
>> From: Yigit, Ferruh
>> Sent: Tuesday, June 26, 2018 5:15 PM
>> To: Zhang, Qi Z ; Stephen Hemminger
>> 
>> Cc: Xing, Beilei ; Wu, Jingjing 
>> ;
>> Yu, De ; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt handler
>>
>> On 6/24/2018 11:56 AM, Zhang, Qi Z wrote:
>>> Hi Stephen:
>>>
>>>> -Original Message-
>>>> From: Stephen Hemminger [mailto:step...@networkplumber.org]
>>>> Sent: Friday, June 22, 2018 11:44 PM
>>>> To: Zhang, Qi Z 
>>>> Cc: Xing, Beilei ; Wu, Jingjing
>>>> ; Yu, De ; dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: remove VF interrupt
>>>> handler
>>>>
>>>> On Fri, 22 Jun 2018 08:44:14 +0800
>>>> Qi Zhang  wrote:
>>>>
>>>>> For i40evf, internal rx interrupt and adminq interrupt share the
>>>>> same source, that cause a lot cpu cycles be wasted on interrupt
>>>>> handler on rx path. This is complained by customers which require
>>>>> low latency (when set I40E_ITR_INTERVAL to small value), but have to
>>>>> be sufferred by tremendous interrupts handling that eat significant CPU
>> resources.
>>>>>
>>>>> The patch disable pci interrupt and remove the interrupt handler,
>>>>> replace it with a low frequency (50ms) interrupt polling daemon
>>>>> which is implemented by registering a alarm callback periodly, this
>>>>> save CPU time significently: On a typical x86 server with 2.1GHz
>>>>> CPU, with low latency configure (32us) we saw CPU usage from top
>>>>> commmand reduced from 20% to 0% on management core in testpmd).
>>>>>
>>>>> Also with the new method we can remove compile option:
>>>>> I40E_ITR_INTERVAL which is used to balance between low latency and
>>>>> low
>>>> CPU usage previously.
>>>>> Now we don't need it since we can reach both at same time.
>>>>>
>>>>> Suggested-by: Jingjing Wu 
>>>>> Signed-off-by: Qi Zhang 
>>>>> ---
>>>>>
>>>>> v2:
>>>>> - update doc
>>>>>
>>>>>  config/common_base|  2 --
>>>>>  doc/guides/nics/i40e.rst  |  5 -
>>>>>  drivers/net/i40e/i40e_ethdev.c|  3 +--
>>>>>  drivers/net/i40e/i40e_ethdev.h| 22 +++---
>>>>>  drivers/net/i40e/i40e_ethdev_vf.c | 36
>>>>> ++--
>>>>>  5 files changed, 26 insertions(+), 42 deletions(-)
>>>>>
>>>>> diff --git a/config/common_base b/config/common_base index
>>>>> 6b0d1cbbb..9e21c6865 100644
>>>>> --- a/config/common_base
>>>>> +++ b/config/common_base
>>>>> @@ -264,8 +264,6 @@ CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y
>>>>>  CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
>>>>>  CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF=64
>>>>>  CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
>>>>> -# interval up to 8160 us, aligned to 2 (or default value)
>>>>> -CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1
>>>>>
>>>>>  #
>>>>>  # Compile burst-oriented FM10K PMD
>>>>> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
>>>>> index
>>>>> 18549bf5a..3fc4ceac7 100644
>>>>> --- a/doc/guides/nics/i40e.rst
>>>>> +++ b/doc/guides/nics/i40e.rst
>>>>> @@ -96,11 +96,6 @@ Please note that enabling debugging options may
>>>> affect system performance.
>>>>>
>>>>>Number of queues reserved for each VMDQ Pool.
>>>>>
>>>>> -- ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` (default ``-1``)
>>>>> -
>>>>> -  Interrupt Throttling interval.
>>>>> -
>>>>> -
>>>>>  Runtime Config Options
>>>>>  ~~
>>>>>
>>>>> diff --git a/drivers/net/i40e/i40e_ethdev.c
>>>>> b/drivers/net/i40e/i40e_ethdev.c index 13c5d3296..c8f9566e0 100644
>>>>> --- a/drivers/net/i40e/i40e_ethdev.c
>>>>> +++ b/drivers/net/i40e/i40e_ethdev.c
>>>>> @@ -1829,8 +1829,7 @@ __vsi_queues_bind_intr(struct i40e_vsi *vsi,
>>>> uint16_t msix_vect,
>>>>>   /* Write first RX queue to Link list register as the head element */
>>>>>   if (vsi->type != I40E_VSI_SRIOV) {
>>>>>   uint16_t interval =
>>>>> - i40e_calc_itr_interval(RTE_LIBRTE_I40E_ITR_INTERVAL, 1,
>>>>> -pf->support_multi_driver);
>>>>> + i40e_calc_itr_interval(1, pf->support_multi_driver);
>>>>>
>>>>>   if (msix_vect == I40E_MISC_VEC_ID) {
>>>>>   I40E_WRITE_REG(hw, I40E_PFINT_LNKLST0, diff --git
>>>>> a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
>>>>> index 11c4c76bd..53dac 100644
>>>>> --- a/drivers/net/i40e/i40e_ethdev.h
>>>>> +++ b/drivers/net/i40e/i40e_ethdev.h
>>>>> @@ -178,7 +178,7 @@ enum i40e_flxpld_layer_idx {
>>>>>  #define I40E_ITR_INDEX_NONE 3
>>>>>  #define I40E_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
>>>>>  #define I40E_QUEUE_ITR_INTERVAL_MAX 8160 /* 8160 us */
>>>>> -#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 8160 /* 8160 us */
>>>>> +#define I40E_VF_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
>>>>>  /* Special FW support this floating VEB feature */  #define
>>>>> FLOATING_VEB_SUPPORTED_FW_MAJ 5  #define
>>>> FLOATING_VEB_SUPPORTED_FW_MIN
>>>>> 0 @@ -1328,17 +1328,17 @@ i40e_

Re: [dpdk-dev] [PATCH 1/2] cryptodev: add min headroom and tailroom requirement

2018-06-26 Thread Doherty, Declan

On 19/06/2018 7:26 AM, Anoob Joseph wrote:

Enabling crypto devs to specify the minimum headroom and tailroom it
expects in the mbuf. For net PMDs, standard headroom has to be honoured
by applications, which is not strictly followed for crypto devs. This


How is this done for net PMDs? I don't see anything explicit in the 
ethdev API for specifying headroom requirements.



prevents crypto devs from using free space in mbuf (available as
head/tailroom) for internal requirements in crypto operations. Addition
of head/tailroom requirement will help PMDs to communicate such
requirements to the application.

The availability and use of head/tailroom is an optimization if the
hardware supports use of head/tailroom for crypto-op info. For devices
that do not support using the head/tailroom, they can continue to operate
without any performance-drop.

Are there any variations in the headroom/tailroom requirements on a 
per-algorithm basis, or are they purely per device?



Signed-off-by: Anoob Joseph 
---
  doc/guides/rel_notes/deprecation.rst | 4 
  lib/librte_cryptodev/rte_cryptodev.h | 6 ++
  2 files changed, 10 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 1ce692e..a547289 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -122,3 +122,7 @@ Deprecation Notices
- Function ``rte_cryptodev_get_private_session_size()`` will be deprecated
  in 18.05, and it gets replaced with ``rte_cryptodev_sym_get_private_session_size()``.
  It will be removed in 18.08.
+  - New field, ``min_headroom_req``, added in ``rte_cryptodev_info`` structure. It will be
+added in 18.11.
+  - New field, ``min_tailroom_req``, added in ``rte_cryptodev_info`` structure. It will be
+added in 18.11.
diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h
index 92ce6d4..fa944b8 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -382,6 +382,12 @@ struct rte_cryptodev_info {
unsigned max_nb_queue_pairs;
/**< Maximum number of queues pairs supported by device. */
  
+   uint32_t min_headroom_req;
+   /**< Minimum mbuf headroom required by device */
+
+   uint32_t min_tailroom_req;
+   /**< Minimum mbuf tailroom required by device */
+
struct {
unsigned max_nb_sessions;
/**< Maximum number of sessions supported by device. */
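
To make the proposal above concrete, here is a rough sketch of how an application might check a buffer against the two proposed fields before enqueueing a crypto op. Only the field names min_headroom_req and min_tailroom_req come from the patch; struct demo_cryptodev_info, struct demo_mbuf and the helper functions are hypothetical simplifications, not DPDK types. For a real mbuf, rte_pktmbuf_headroom() and rte_pktmbuf_tailroom() report the corresponding quantities.

```c
#include <stdint.h>

/* Hypothetical subset of the device info; field names follow the patch. */
struct demo_cryptodev_info {
	uint32_t min_headroom_req;	/* minimum mbuf headroom required by device */
	uint32_t min_tailroom_req;	/* minimum mbuf tailroom required by device */
};

/* Hypothetical, simplified single-segment buffer descriptor. */
struct demo_mbuf {
	uint16_t buf_len;	/* total size of the data buffer */
	uint16_t data_off;	/* offset of the first data byte in the buffer */
	uint16_t data_len;	/* number of data bytes */
};

/* Free space before the data. */
static uint16_t
demo_headroom(const struct demo_mbuf *m)
{
	return m->data_off;
}

/* Free space after the data. */
static uint16_t
demo_tailroom(const struct demo_mbuf *m)
{
	return (uint16_t)(m->buf_len - m->data_off - m->data_len);
}

/* Return 1 if the buffer satisfies both device requirements, 0 otherwise. */
static int
demo_mbuf_meets_req(const struct demo_cryptodev_info *info,
		    const struct demo_mbuf *m)
{
	return demo_headroom(m) >= info->min_headroom_req &&
	       demo_tailroom(m) >= info->min_tailroom_req;
}
```

Whether a failed check should be an error or just a slower path is exactly the open question in this thread; the sketch takes no position on that.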





Re: [dpdk-dev] [PATCH] net/mlx5: separate generic tunnel TSO from the standard one

2018-06-26 Thread Shahaf Shuler
Monday, June 25, 2018 2:33 PM, Nélio Laranjeiro:
> Subject: Re: [PATCH] net/mlx5: separate generic tunnel TSO from the
> standard one
> 
> On Mon, Jun 25, 2018 at 11:23:22AM +, Shahaf Shuler wrote:
> > Monday, June 25, 2018 9:41 AM , Nélio Laranjeiro:
> > > Subject: Re: [PATCH] net/mlx5: separate generic tunnel TSO from the
> > > standard one
> > >
> > > On Sun, Jun 24, 2018 at 09:22:26AM +0300, Shahaf Shuler wrote:
> > > > The generic tunnel TSO was depended in the regular one
> > > > capabilities to be enabled.
> > > >
> > > > Cc: sta...@dpdk.org
> > > >
> > > > Signed-off-by: Shahaf Shuler 
> > > > Acked-by: Yongseok Koh 
> > > > ---
> > > >  drivers/net/mlx5/mlx5_txq.c | 13 +
> > > >  1 file changed, 9 insertions(+), 4 deletions(-)
> > > >
> >
> > [...]
> >
> > > > -   txq_ctrl->txq.tunnel_en = config->tunnel_en;
> > > > +   txq_ctrl->txq.tunnel_en = config->tunnel_en | config->swp;
> > > > txq_ctrl->txq.swp_en = ((DEV_TX_OFFLOAD_IP_TNL_TSO |
> > > >  DEV_TX_OFFLOAD_UDP_TNL_TSO |
> > > >  DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM) &
> > > > --
> > > > 2.12.0
> > > >
> > >
> > > Is not it a fix?
> >
> > Well, more like optimization. To be less strict on when to enable the
> > generic tunnel TSO.
> > I can rephrase the title if you insist.
> 
> I was asking due to the CC'ed stable, which is generally used when the it is a
> fix.  I don't know how the stable maintainers trigger such patch, that why I
> am asking.
> 
> I am not insisting in any thing here.
> 
> By the way:
> Acked-by: Nelio Laranjeiro 

Applied to next-net-mlx, thanks. 

> 
> --
> Nélio Laranjeiro
> 6WIND


Re: [dpdk-dev] [RFC 0/3] ethdev: add IP address and TCP/UDP port rewrite actions to flow API

2018-06-26 Thread Thomas Monjalon
Hi,

22/06/2018 11:56, Rahul Lakkireddy:
> This series of patches add support for actions:
> - OF_SET_NW_IPV4_SRC - set a new IPv4 source address.
> - OF_SET_NW_IPV4_DST - set a new IPv4 destination address.
> - OF_SET_NW_IPV6_SRC - set a new IPv6 source address.
> - OF_SET_NW_IPV6_DST - set a new IPv6 destination address.
> - OF_SET_TP_SRC - set a new TCP/UDP source port number.
> - OF_SET_TP_DST - set a new TCP/UDP destination port number.

Given the date of submission, I guess you do not expect it for 18.08.
Next time, better to make it clear by adding it in the Subject:
[RFC 18.11 0/3]

Thanks for proposing in advance.
I hope we will have some good reviews in advance too.

Adding some active maintainers as Cc.




Re: [dpdk-dev] [PATCH v4 10/24] net/ixgbe: enable port detach on secondary process

2018-06-26 Thread Remy Horton



On 26/06/2018 08:08, Qi Zhang wrote:
[..]

 static int eth_ixgbevf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev;
+
+   ethdev = rte_eth_dev_allocated(pci_dev->device.name);
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_probe(pci_dev,
sizeof(struct ixgbe_adapter), eth_ixgbevf_dev_init);
 }



Is calling rte_eth_dev_release_port_private() from the probe function 
intentional? To me it looks like the code has been pasted into the wrong 
place.




Re: [dpdk-dev] [PATCH v4 09/24] net/i40e: enable port detach on secondary process

2018-06-26 Thread Remy Horton



On 26/06/2018 08:08, Qi Zhang wrote:
[..]

 static int eth_i40evf_pci_remove(struct rte_pci_device *pci_dev)
 {
+   struct rte_eth_dev *ethdev;
+   ethdev = rte_eth_dev_allocated(pci_dev->device.name);
+
+   if (!ethdev)
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
return rte_eth_dev_pci_generic_remove(pci_dev, i40evf_dev_uninit);
 }


This identical code appears in multiple drivers. Is there anything 
stopping it being folded into rte_eth_dev_pci_generic_remove()?
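
A rough sketch of what folding the check into a shared helper could look like is below. This is not the DPDK API: every demo_* name is a hypothetical stand-in, the process type is passed in as a parameter instead of being queried, and the device lookup by name is replaced by a pointer that may be NULL. It only shows the control flow that each driver currently repeats.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the process type and the ethdev object. */
enum demo_proc_type { DEMO_PROC_PRIMARY, DEMO_PROC_SECONDARY };

struct demo_eth_dev {
	int released_local;	/* set when only the local port data is released */
	int uninit_called;	/* set when the driver uninit callback has run */
};

typedef int (*demo_uninit_t)(struct demo_eth_dev *dev);

/* Stand-in for releasing the port in the local process only. */
static int
demo_release_port_private(struct demo_eth_dev *dev)
{
	dev->released_local = 1;
	return 0;
}

/* Example driver-specific uninit callback. */
static int
demo_dev_uninit(struct demo_eth_dev *dev)
{
	dev->uninit_called = 1;
	return 0;
}

/*
 * Shared remove helper: the secondary-process branch lives here once,
 * so individual drivers only pass their uninit callback.
 */
static int
demo_generic_remove(struct demo_eth_dev *dev, enum demo_proc_type proc,
		    demo_uninit_t uninit)
{
	if (dev == NULL)
		return -ENODEV;

	if (proc != DEMO_PROC_PRIMARY)
		return demo_release_port_private(dev);

	return uninit(dev);
}
```

With something of this shape, a driver's remove callback would shrink back to a single call, as it was before the patch.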


[dpdk-dev] [PATCH v3] kni: fix build with gcc 8.1

2018-06-26 Thread Ferruh Yigit
Error observed when CONFIG_RTE_KNI_KMOD_ETHTOOL config option is
enabled.

build error:
In function ‘strncpy’,
inlined from ‘igb_get_drvinfo’ at
.../dpdk/build/build/kernel/linux/kni/igb_ethtool.c:814:2:
.../include/linux/string.h:246:9: error: ‘__builtin_strncpy’ output
may be truncated copying 31 bytes from a string of length 42
[-Werror=stringop-truncation]
  return __builtin_strncpy(p, q, size);
   ^

Fixed by using strlcpy instead of strncpy.

The adapter->fw_version size is kept the same because of
c3698192940c ("kni: fix build with gcc 7.1")

The strncpy usage on the next line is also replaced with strlcpy while around.

Fixes: c3698192940c ("kni: fix build with gcc 7.1")
Cc: sta...@dpdk.org

Signed-off-by: Ferruh Yigit 
---
v2:
* used strlcpy instead of strncpy
* Updated strncpy usage in next line to strlcpy too
* Added fixes line

v3:
* fix patch syntax corrupted during send
---
 kernel/linux/kni/ethtool/igb/igb_ethtool.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/linux/kni/ethtool/igb/igb_ethtool.c b/kernel/linux/kni/ethtool/igb/igb_ethtool.c
index 064528bcf..002f75c48 100644
--- a/kernel/linux/kni/ethtool/igb/igb_ethtool.c
+++ b/kernel/linux/kni/ethtool/igb/igb_ethtool.c
@@ -811,9 +811,10 @@ static void igb_get_drvinfo(struct net_device *netdev,
strncpy(drvinfo->driver,  igb_driver_name, sizeof(drvinfo->driver) - 1);
strncpy(drvinfo->version, igb_driver_version, sizeof(drvinfo->version) - 1);
 
-   strncpy(drvinfo->fw_version, adapter->fw_version,
-   sizeof(drvinfo->fw_version) - 1);
-   strncpy(drvinfo->bus_info, pci_name(adapter->pdev), sizeof(drvinfo->bus_info) -1);
+   strlcpy(drvinfo->fw_version, adapter->fw_version,
+   sizeof(drvinfo->fw_version));
+   strlcpy(drvinfo->bus_info, pci_name(adapter->pdev),
+   sizeof(drvinfo->bus_info));
drvinfo->n_stats = IGB_STATS_LEN;
drvinfo->testinfo_len = IGB_TEST_LEN;
drvinfo->regdump_len = igb_get_regs_len(netdev);
-- 
2.17.1
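
For readers unfamiliar with the difference behind this fix, here is a small user-space sketch of strlcpy-style semantics. demo_strlcpy() is a hypothetical stand-in written for illustration, not the kernel implementation. The point it shows: the copy is truncated to fit, the destination is always NUL-terminated when the size is non-zero, and the return value is the length of the source, so the caller can detect truncation. strncpy, by contrast, writes no terminator when the source is at least as long as the count, which is why the old code passed sizeof() - 1.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most size bytes including the terminator.
 * Returns strlen(src); a return value >= size means the copy was truncated.
 */
static size_t
demo_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		/* Leave room for the terminating NUL. */
		size_t n = (len < size - 1) ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}

	return len;
}
```

Because the terminator is guaranteed, the call site can pass the full sizeof() of the destination, as the patch does.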



Re: [dpdk-dev] [PATCH V3] net/thunderx: add support for hardware first skip feature

2018-06-26 Thread Ferruh Yigit
On 6/18/2018 7:40 AM, Jerin Jacob wrote:
> -Original Message-
>> Date: Mon, 18 Jun 2018 11:06:24 +0530
>> From: Rakesh Kudurumalla 
>> To: dev@dpdk.org
>> Cc: ferruh.yi...@intel.com, jerin.ja...@caviumnetworks.com, rkudurumalla
>>  
>> Subject: [PATCH V3] net/thunderx: add support for hardware first skip
>>  feature
>> X-Mailer: git-send-email 2.7.4
>>
>> From: rkudurumalla 
>>
>> This feature is used to create a hole between HEADROOM
>> and actual data.Size of hole is specified in bytes as
>> module param to pmd
>>
>> Signed-off-by: Rakesh Kudurumalla 
> 
> Acked-by: Jerin Jacob 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH v4 2/2] app/testpmd: add NVGRE encap/decap support

2018-06-26 Thread Ori Kam
Acked-by: Ori Kam 

> -Original Message-
> From: Nelio Laranjeiro [mailto:nelio.laranje...@6wind.com]
> Sent: Thursday, June 21, 2018 10:14 AM
> To: dev@dpdk.org; Adrien Mazarguil ;
> Wenzhuo Lu ; Jingjing Wu
> ; Bernard Iremonger
> ; Mohammad Abdul Awal
> ; Ori Kam ;
> Stephen Hemminger 
> Subject: [PATCH v4 2/2] app/testpmd: add NVGRE encap/decap support
> 
> Due to the complex NVGRE_ENCAP flow action and based on the fact
> testpmd
> does not allocate memory, this patch adds a new command in testpmd to
> initialise a global structure containing the necessary information to
> make the outer layer of the packet.  This same global structure will
> then be used by the flow command line in testpmd when the action
> nvgre_encap will be parsed, at this point, the conversion into such
> action becomes trivial.
> 
> This global structure is only used for the encap action.
> 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  app/test-pmd/cmdline.c  | 118 ++
>  app/test-pmd/cmdline_flow.c | 129 
>  app/test-pmd/testpmd.c  |  15 +++
>  app/test-pmd/testpmd.h  |  15 +++
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  37 ++
>  5 files changed, 314 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 048fff2bd..ad7f9eda5 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -789,6 +789,12 @@ static void cmd_help_long_parsed(void
> *parsed_result,
>   " vlan-tci eth-src eth-dst\n"
>   "   Configure the VXLAN encapsulation for
> flows.\n\n"
> 
> + "nvgre ipv4|ipv6 tni ip-src ip-dst eth-src eth-dst\n"
> + "   Configure the NVGRE encapsulation for
> flows.\n\n"
> +
> + "nvgre-with-vlan ipv4|ipv6 tni ip-src ip-dst vlan-tci
> eth-src eth-dst\n"
> + "   Configure the NVGRE encapsulation for
> flows.\n\n"
> +
>   , list_pkt_forwarding_modes()
>   );
>   }
> @@ -14970,6 +14976,116 @@ cmdline_parse_inst_t
> cmd_set_vxlan_with_vlan = {
>   },
>  };
> 
> +/** Set NVGRE encapsulation details */
> +struct cmd_set_nvgre_result {
> + cmdline_fixed_string_t set;
> + cmdline_fixed_string_t nvgre;
> + cmdline_fixed_string_t ip_version;
> + uint32_t tni;
> + cmdline_ipaddr_t ip_src;
> + cmdline_ipaddr_t ip_dst;
> + uint16_t tci;
> + struct ether_addr eth_src;
> + struct ether_addr eth_dst;
> +};
> +
> +cmdline_parse_token_string_t cmd_set_nvgre_set =
> + TOKEN_STRING_INITIALIZER(struct cmd_set_nvgre_result, set,
> "set");
> +cmdline_parse_token_string_t cmd_set_nvgre_nvgre =
> + TOKEN_STRING_INITIALIZER(struct cmd_set_nvgre_result, nvgre,
> "nvgre");
> +cmdline_parse_token_string_t cmd_set_nvgre_nvgre_with_vlan =
> + TOKEN_STRING_INITIALIZER(struct cmd_set_nvgre_result, nvgre,
> "nvgre-with-vlan");
> +cmdline_parse_token_string_t cmd_set_nvgre_ip_version =
> + TOKEN_STRING_INITIALIZER(struct cmd_set_nvgre_result,
> ip_version,
> +  "ipv4#ipv6");
> +cmdline_parse_token_num_t cmd_set_nvgre_tni =
> + TOKEN_NUM_INITIALIZER(struct cmd_set_nvgre_result, tni,
> UINT32);
> +cmdline_parse_token_num_t cmd_set_nvgre_ip_src =
> + TOKEN_IPADDR_INITIALIZER(struct cmd_set_nvgre_result, ip_src);
> +cmdline_parse_token_ipaddr_t cmd_set_nvgre_ip_dst =
> + TOKEN_IPADDR_INITIALIZER(struct cmd_set_nvgre_result, ip_dst);
> +cmdline_parse_token_num_t cmd_set_nvgre_vlan =
> + TOKEN_NUM_INITIALIZER(struct cmd_set_nvgre_result, tci,
> UINT16);
> +cmdline_parse_token_etheraddr_t cmd_set_nvgre_eth_src =
> + TOKEN_ETHERADDR_INITIALIZER(struct cmd_set_nvgre_result,
> eth_src);
> +cmdline_parse_token_etheraddr_t cmd_set_nvgre_eth_dst =
> + TOKEN_ETHERADDR_INITIALIZER(struct cmd_set_nvgre_result,
> eth_dst);
> +
> +static void cmd_set_nvgre_parsed(void *parsed_result,
> + __attribute__((unused)) struct cmdline *cl,
> + __attribute__((unused)) void *data)
> +{
> + struct cmd_set_nvgre_result *res = parsed_result;
> + union {
> + uint32_t nvgre_tni;
> + uint8_t tni[4];
> + } id = {
> + .nvgre_tni = rte_cpu_to_be_32(res->tni) &
> RTE_BE32(0x00ff),
> + };
> +
> + if (strcmp(res->nvgre, "nvgre") == 0)
> + nvgre_encap_conf.select_vlan = 0;
> + else if (strcmp(res->nvgre, "nvgre-with-vlan") == 0)
> + nvgre_encap_conf.select_vlan = 1;
> + if (strcmp(res->ip_version, "ipv4") == 0)
> + nvgre_encap_conf.select_ipv4 = 1;
> + else if (strcmp(res->ip_version, "ipv6") == 0)
> + nvgre_encap_conf.select_ipv4 = 0;
> + else
> + return;
> + rte_memcpy(nvgre_encap_conf.tni, &id.tni[1], 3);
> + if (nvgre_encap_conf.select_ipv4) {
> + IPV4_ADDR_TO_UINT(res->ip_src,
> nvgre_encap_conf.ipv4

Re: [dpdk-dev] [PATCH v4 01/24] eal: introduce one device scan

2018-06-26 Thread Remy Horton



On 26/06/2018 08:08, Qi Zhang wrote:
[..]

Signed-off-by: Qi Zhang 
---

 lib/librte_eal/common/eal_common_dev.c  | 17 +
 lib/librte_eal/common/include/rte_bus.h | 16 


Acked-by: Remy Horton 


Re: [dpdk-dev] [PATCH v4 03/24] ethdev: add function to release port in local process

2018-06-26 Thread Remy Horton



On 26/06/2018 08:08, Qi Zhang wrote:
[..]

Signed-off-by: Qi Zhang 
---

 lib/librte_ethdev/rte_ethdev.c| 24 +---
 lib/librte_ethdev/rte_ethdev_driver.h | 13 +
 2 files changed, 34 insertions(+), 3 deletions(-)


Acked-by: Remy Horton 

