date:20160128

[dpdk-dev] [PATCH 3/5] mempool: add custom external mempool handler example

2016-01-28 Thread Jerin Jacob

On Tue, Jan 26, 2016 at 05:25:53PM +, David Hunt wrote:
> adds a simple ring-based mempool handler using mallocs for each object

nit,

$ git am /export/dh/3
Applying: mempool: add custom external mempool handler example
/export/dpdk-master/.git/rebase-apply/patch:184: new blank line at EOF.
+
warning: 1 line adds whitespace errors.

> 
> Signed-off-by: David Hunt 
> ---
>  lib/librte_mempool/Makefile |   1 +
>  lib/librte_mempool/custom_mempool.c | 160 
> 
>  2 files changed, 161 insertions(+)
>  create mode 100644 lib/librte_mempool/custom_mempool.c
> 
> diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
> index d795b48..4f72546 100644
> --- a/lib/librte_mempool/Makefile
> +++ b/lib/librte_mempool/Makefile
> @@ -44,6 +44,7 @@ LIBABIVER := 1
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_stack.c
> +SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  custom_mempool.c
>  ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
>  endif
> diff --git a/lib/librte_mempool/custom_mempool.c b/lib/librte_mempool/custom_mempool.c
> new file mode 100644
> index 000..a9da8c5
> --- /dev/null
> +++ b/lib/librte_mempool/custom_mempool.c
> @@ -0,0 +1,160 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#include "rte_mempool_internal.h"
> +
> +/*
> + * Mempool
> + * ===
> + *
> + * Basic tests: done on one core with and without cache:
> + *
> + *- Get one object, put one object
> + *- Get two objects, put two objects
> + *- Get all objects, test that their content is not modified and
> + *  put them back in the pool.
> + */
> +
> +#define TIME_S 5
> +#define MEMPOOL_ELT_SIZE 2048
> +#define MAX_KEEP 128
> +#define MEMPOOL_SIZE 8192
> +
> +#if 0
> +/*
> + * For our example mempool handler, we use the following struct to
> + * pass info to our create callback so it can call rte_mempool_create
> + */
> +struct custom_mempool_alloc_params {
> + char ring_name[RTE_RING_NAMESIZE];
> + unsigned n_elt;
> + unsigned elt_size;
> +};
> +#endif
> +
> +/*
> + * Simple example of custom mempool structure. Holds pointers to all the
> + * elements which are simply malloc'd in this example.
> + */
> +struct custom_mempool {
> + struct rte_ring *r; /* Ring to manage elements */
> + void *elements[MEMPOOL_SIZE];   /* Element pointers */
> +};
> +
> +/*
> + * Loop through all the element pointers and allocate a chunk of memory, then
> + * insert that memory into the ring.
> + */
> +static void *
> +custom_mempool_alloc(struct rte_mempool *mp,
> + const char *name, unsigned n,
> + __attribute__((unused)) int socket_id,
> + __attribute__((unused)) unsigned flags)
> +
> +{
> + static struct custom_mempool *cm;
> + uint32_t *objnum;
> + unsigned int i;
> +
> + cm = malloc(sizeof(struct custom_mempool));
> +
> + /* Create the ring so we can enqueue/dequeue */
> + cm->r =

[dpdk-dev] [PATCH 4/4] szedata2: update documentation and release notes

2016-01-28 Thread Matej Vido

Add info about change from virtual PMD_VDEV to PMD_PDEV and new
functions to release notes.

Signed-off-by: Matej Vido 
---
 doc/guides/nics/szedata2.rst |   93 +
 doc/guides/rel_notes/release_2_3.rst |9 +++
 2 files changed, 68 insertions(+), 34 deletions(-)

diff --git a/doc/guides/nics/szedata2.rst b/doc/guides/nics/szedata2.rst
index ee3c3fe..ac512a0 100644
--- a/doc/guides/nics/szedata2.rst
+++ b/doc/guides/nics/szedata2.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright 2015 CESNET
+Copyright 2015 - 2016 CESNET
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
@@ -33,8 +33,8 @@ SZEDATA2 poll mode driver library

 The SZEDATA2 poll mode driver library implements support for cards from COMBO
 family (**COMBO-80G**, **COMBO-100G**).
-The SZEDATA2 PMD is virtual PMD which uses interface provided by libsze2
-library to communicate with COMBO cards over sze2 layer.
+The SZEDATA2 PMD uses interface provided by libsze2 library to communicate
+with COMBO cards over sze2 layer.

 More information about family of
 `COMBO cards `_
@@ -74,50 +74,41 @@ separately:
* szedata2_cv3

Kernel modules manage initialization of hardware, allocation and
-   sharing of resources for user space applications:
+   sharing of resources for user space applications.

 Information about getting the dependencies can be found `here
 `_.

+Configuration
+-

-Using the SZEDATA2 PMD
---
-
-SZEDATA2 PMD can be created by passing ``--vdev=`` option to EAL in the
-following format:
-
-.. code-block:: console
-
-   --vdev 'DEVICE,dev_path=PATH,rx_ifaces=RX_MASK,tx_ifaces=TX_MASK'
-
-``DEVICE`` and options ``dev_path``, ``rx_ifaces``, ``tx_ifaces`` are mandatory
-and must be separated by commas.
+These configuration options can be modified before compilation in the
+``.config`` file:

-*  ``DEVICE``: contains prefix ``eth_szedata2`` followed by numbers or letters,
-   must be unique for each virtual device
+*  ``CONFIG_RTE_LIBRTE_PMD_SZEDATA2`` default value: **n**

-*  ``dev_path``: Defines path to szedata2 device.
-   Value is valid path to szedata2 device. Example:
+   Value **y** enables compilation of szedata2 PMD.

-   .. code-block:: console
+*  ``CONFIG_RTE_LIBRTE_PMD_SZEDATA2_AS`` default value: **0**

-  dev_path=/dev/szedataII0
+   This option defines type of firmware address space.
+   Currently supported value is:

-*  ``rx_ifaces``: Defines which receive channels will be used.
-   For each channel is created one queue. Value is mask for selecting which
-   receive channels are required. Example:
+   * **0** for firmwares:

-   .. code-block:: console
+  * NIC_100G1_LR4
+  * HANIC_100G1_LR4
+  * HANIC_100G1_SR10

-  rx_ifaces=0x3

-*  ``tx_ifaces``: Defines which transmit channels will be used.
-   For each channel is created one queue. Value is mask for selecting which
-   transmit channels are required. Example:
+Using the SZEDATA2 PMD
+--

-   .. code-block:: console
+From DPDK version 2.3.0 the type of SZEDATA2 PMD is changed to PMD_PDEV.
+SZEDATA2 device is automatically recognized during EAL initialization.
+No special command line options are needed.

-  tx_ifaces=0x3
+Kernel modules have to be loaded before running the DPDK application.

 Example of usage
 
@@ -128,5 +119,39 @@ transmit channel:
 .. code-block:: console

$RTE_TARGET/app/testpmd -c 0xf -n 2 \
-   --vdev 'eth_szedata20,dev_path=/dev/szedataII0,rx_ifaces=0x3,tx_ifaces=0x3' \
-   -- --port-topology=chained --rxq=2 --txq=2 --nb-cores=2
+   -- --port-topology=chained --rxq=2 --txq=2 --nb-cores=2 -i -a
+
+Example output:
+
+.. code-block:: console
+
+   [...]
+   EAL: PCI device :06:00.0 on NUMA socket -1
+   EAL:   probe driver: 1b26:c1c1 rte_szedata2_pmd
+   PMD: Initializing szedata2 device (:06:00.0)
+   PMD: SZEDATA2 path: /dev/szedataII0
+   PMD: Available DMA channels RX: 8 TX: 8
+   PMD: resource0 phys_addr = 0xe800 len = 134217728 virt addr = 7f48f800
+   PMD: szedata2 device (:06:00.0) successfully initialized
+   Interactive-mode selected
+   Auto-start selected
+   Configuring Port 0 (socket 0)
+   Port 0: 00:11:17:00:00:00
+   Checking link statuses...
+   Port 0 Link Up - speed 1 Mbps - full-duplex
+   Done
+   Start automatic packet forwarding
+ io packet forwarding - CRC stripping disabled - packets/burst=32
+ nb forwarding cores=2 - nb forwarding ports=1
+ RX queues=2 - RX desc=128 - RX free threshold=0
+ RX threshold registers: pthresh=0 hthresh=0 wthresh=0
+ TX queues=2 - TX desc=512 - TX free threshold=0
+ TX threshold registers: pthresh=0 hthresh=0 wthresh=0
+ TX RS bit threshold=0 - TXQ flags=0x0
+   testpmd>
+
+.. note::
+
+   Link speed API currently supports

[dpdk-dev] [PATCH 3/4] szedata2: add functions for enabling/disabling promiscuous, allmulticast modes

2016-01-28 Thread Matej Vido

Signed-off-by: Matej Vido 
---
 drivers/net/szedata2/rte_eth_szedata2.c |   45 +++
 drivers/net/szedata2/rte_eth_szedata2.h |   39 ++
 2 files changed, 84 insertions(+), 0 deletions(-)

diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
index d8c260b..81c806e 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1281,6 +1281,42 @@ eth_mac_addr_set(struct rte_eth_dev *dev __rte_unused,
 {
 }

+static void
+eth_promiscuous_enable(struct rte_eth_dev *dev)
+{
+   volatile struct szedata2_cgmii_ibuf *ibuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_IBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_ibuf *);
+   cgmii_ibuf_mac_mode_write(ibuf, SZEDATA2_MAC_CHMODE_PROMISC);
+}
+
+static void
+eth_promiscuous_disable(struct rte_eth_dev *dev)
+{
+   volatile struct szedata2_cgmii_ibuf *ibuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_IBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_ibuf *);
+   cgmii_ibuf_mac_mode_write(ibuf, SZEDATA2_MAC_CHMODE_ONLY_VALID);
+}
+
+static void
+eth_allmulticast_enable(struct rte_eth_dev *dev)
+{
+   volatile struct szedata2_cgmii_ibuf *ibuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_IBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_ibuf *);
+   cgmii_ibuf_mac_mode_write(ibuf, SZEDATA2_MAC_CHMODE_ALL_MULTICAST);
+}
+
+static void
+eth_allmulticast_disable(struct rte_eth_dev *dev)
+{
+   volatile struct szedata2_cgmii_ibuf *ibuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_IBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_ibuf *);
+   cgmii_ibuf_mac_mode_write(ibuf, SZEDATA2_MAC_CHMODE_ONLY_VALID);
+}
+
 static struct eth_dev_ops ops = {
.dev_start  = eth_dev_start,
.dev_stop   = eth_dev_stop,
@@ -1289,6 +1325,10 @@ static struct eth_dev_ops ops = {
.dev_close  = eth_dev_close,
.dev_configure  = eth_dev_configure,
.dev_infos_get  = eth_dev_info,
+   .promiscuous_enable   = eth_promiscuous_enable,
+   .promiscuous_disable  = eth_promiscuous_disable,
+   .allmulticast_enable  = eth_allmulticast_enable,
+   .allmulticast_disable = eth_allmulticast_disable,
.rx_queue_start = eth_rx_queue_start,
.rx_queue_stop  = eth_rx_queue_stop,
.tx_queue_start = eth_tx_queue_start,
@@ -1471,8 +1511,10 @@ rte_szedata2_eth_dev_init(struct rte_eth_dev *dev)
(unsigned long long)pci_rsc->len,
(unsigned long long)pci_rsc->addr);

+   /* Get link state */
eth_link_update(dev, 0);

+   /* Allocate space for one mac address */
data->mac_addrs = rte_zmalloc(data->name, sizeof(struct ether_addr),
RTE_CACHE_LINE_SIZE);
if (data->mac_addrs == NULL) {
@@ -1484,6 +1526,9 @@ rte_szedata2_eth_dev_init(struct rte_eth_dev *dev)

ether_addr_copy(&eth_addr, data->mac_addrs);

+   /* At initial state COMBO card is in promiscuous mode so disable it */
+   eth_promiscuous_disable(dev);
+
RTE_LOG(INFO, PMD, "szedata2 device ("
PCI_PRI_FMT ") successfully initialized\n",
pci_addr->domain, pci_addr->bus, pci_addr->devid,
diff --git a/drivers/net/szedata2/rte_eth_szedata2.h b/drivers/net/szedata2/rte_eth_szedata2.h
index 39d1c48..522cf47 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.h
+++ b/drivers/net/szedata2/rte_eth_szedata2.h
@@ -213,6 +213,13 @@ enum szedata2_link_speed {
SZEDATA2_LINK_SPEED_100G,
 };

+enum szedata2_mac_check_mode {
+   SZEDATA2_MAC_CHMODE_PROMISC   = 0x0,
+   SZEDATA2_MAC_CHMODE_ONLY_VALID= 0x1,
+   SZEDATA2_MAC_CHMODE_ALL_BROADCAST = 0x2,
+   SZEDATA2_MAC_CHMODE_ALL_MULTICAST = 0x3,
+};
+
 /*
  * Structure describes CGMII IBUF address space
  */
@@ -299,6 +306,38 @@ cgmii_ibuf_is_link_up(volatile struct szedata2_cgmii_ibuf *ibuf)
 }

 /*
+ * @return
+ * MAC address check mode
+ */
+static inline enum szedata2_mac_check_mode
+cgmii_ibuf_mac_mode_read(volatile struct szedata2_cgmii_ibuf *ibuf)
+{
+   switch (rte_le_to_cpu_32(ibuf->mac_chmode) & 0x3) {
+   case 0x0:
+   return SZEDATA2_MAC_CHMODE_PROMISC;
+   case 0x1:
+   return SZEDATA2_MAC_CHMODE_ONLY_VALID;
+   case 0x2:
+   return SZEDATA2_MAC_CHMODE_ALL_BROADCAST;
+   case 0x3:
+   return SZEDATA2_MAC_CHMODE_ALL_MULTICAST;
+   default:
+   return SZEDATA2_MAC_CHMODE_PROMISC;
+   }
+}
+
+/*
+ * Writes "mode" in MAC address check mode register.
+ */
+static inline void

[dpdk-dev] [PATCH 2/4] szedata2: add functions for setting link up/down and updating link info

2016-01-28 Thread Matej Vido

Mmap PCI resource file and add inline functions for reading from and
writing to PCI resource address space.
Add description of IBUF and OBUF address space.
Add configuration option for setting which firmware type will be used.
Right address space values for IBUFs and OBUFs offsets are used
according to configuration option CONFIG_RTE_LIBRTE_PMD_SZEDATA2_AS.
Setting link up/down and getting info about link status is done through
mmapped PCI resource address space.

Signed-off-by: Matej Vido 
---
 config/common_linuxapp  |   11 +
 drivers/net/szedata2/rte_eth_szedata2.c |  152 +--
 drivers/net/szedata2/rte_eth_szedata2.h |  315 ++-
 3 files changed, 462 insertions(+), 16 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..43a795c 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -264,6 +264,17 @@ CONFIG_RTE_LIBRTE_NFP_DEBUG=n
 # Compile software PMD backed by SZEDATA2 device
 #
 CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
+##
+## Defines firmware type address space.
+## RTE_LIBRTE_PMD_SZEDATA2_AS can be:
+## 0 - for firmwares:
+## NIC_100G1_LR4
+## HANIC_100G1_LR4
+## HANIC_100G1_SR10
+##
+## Other values raise compile time error
+##
+CONFIG_RTE_LIBRTE_PMD_SZEDATA2_AS=0

 #
 # Compile burst-oriented VIRTIO PMD driver
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
index ef906f3..d8c260b 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright (c) 2015 CESNET
+ *   Copyright (c) 2015 - 2016 CESNET
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -37,6 +37,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 

 #include 

@@ -46,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "rte_eth_szedata2.h"

@@ -92,11 +96,6 @@ struct pmd_internals {
 static struct ether_addr eth_addr = {
.addr_bytes = { 0x00, 0x11, 0x17, 0x00, 0x00, 0x00 }
 };
-static struct rte_eth_link pmd_link = {
-   .link_speed = ETH_LINK_SPEED_10G,
-   .link_duplex = ETH_LINK_FULL_DUPLEX,
-   .link_status = 0
-};

 static uint16_t
 eth_szedata2_rx(void *queue,
@@ -987,7 +986,6 @@ eth_dev_start(struct rte_eth_dev *dev)
goto err_tx;
}

-   dev->data->dev_link.link_status = 1;
return 0;

 err_tx:
@@ -1011,8 +1009,6 @@ eth_dev_stop(struct rte_eth_dev *dev)

for (i = 0; i < nb_rx; i++)
eth_rx_queue_stop(dev, i);
-
-   dev->data->dev_link.link_status = 0;
 }

 static int
@@ -1141,9 +1137,76 @@ eth_dev_close(struct rte_eth_dev *dev)
 }

 static int
-eth_link_update(struct rte_eth_dev *dev __rte_unused,
+eth_link_update(struct rte_eth_dev *dev,
int wait_to_complete __rte_unused)
 {
+   struct rte_eth_link link;
+   struct rte_eth_link *link_ptr = &link;
+   struct rte_eth_link *dev_link = &dev->data->dev_link;
+   volatile struct szedata2_cgmii_ibuf *ibuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_IBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_ibuf *);
+
+   switch (cgmii_link_speed(ibuf)) {
+   case SZEDATA2_LINK_SPEED_10G:
+   link.link_speed = ETH_LINK_SPEED_10G;
+   break;
+   case SZEDATA2_LINK_SPEED_40G:
+   link.link_speed = ETH_LINK_SPEED_40G;
+   break;
+   case SZEDATA2_LINK_SPEED_100G:
+   /*
+* TODO
+* If link_speed value from rte_eth_link structure
+* will be changed to support 100Gbps speed change
+* this value to 100G.
+*/
+   link.link_speed = ETH_LINK_SPEED_10G;
+   break;
+   default:
+   link.link_speed = ETH_LINK_SPEED_10G;
+   break;
+   }
+
+   /* szedata2 uses only full duplex */
+   link.link_duplex = ETH_LINK_FULL_DUPLEX;
+
+   link.link_status = (cgmii_ibuf_is_enabled(ibuf) &&
+   cgmii_ibuf_is_link_up(ibuf)) ? 1 : 0;
+
+   rte_atomic64_cmpset((uint64_t *)dev_link, *(uint64_t *)dev_link,
+   *(uint64_t *)link_ptr);
+
+   return 0;
+}
+
+static int
+eth_dev_set_link_up(struct rte_eth_dev *dev)
+{
+   volatile struct szedata2_cgmii_ibuf *ibuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_IBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_ibuf *);
+   volatile struct szedata2_cgmii_obuf *obuf = SZEDATA2_PCI_RESOURCE_PTR(
+   dev, SZEDATA2_CGMII_OBUF_BASE_OFF,
+   volatile struct szedata2_cgmii_obuf *);
+
+   cgmii_ibuf_enable(ibuf);
+   cgmii_obuf_enable(obuf);
+   return 0;
+}
+
+static int

[dpdk-dev] [PATCH 1/4] szedata2: rewrite PMD from virtual PMD_VDEV type to PMD_PDEV type

2016-01-28 Thread Matej Vido

PMD was of type PMD_VDEV which means that PCI device is not recognised
automatically during EAL initialization, but it has to be created by
EAL option --vdev.
Now, PMD is of type PMD_PDEV which means that PCI device is probed
and recognised during EAL initialization automatically.
Path to szedata2 device file is matched with device and the count
of available RX and TX DMA channels is found out during device
initialization.
Initialization, starting and stopping of queues is changed to better
correspond with Ethernet device API model. Function callbacks
(rx|tx)_queue_(start|stop) are added. Unnecessary items are removed
from ethernet device private data structure.

Signed-off-by: Matej Vido 
---
 drivers/net/szedata2/rte_eth_szedata2.c |  900 ---
 drivers/net/szedata2/rte_eth_szedata2.h |8 +
 2 files changed, 366 insertions(+), 542 deletions(-)

diff --git a/drivers/net/szedata2/rte_eth_szedata2.c b/drivers/net/szedata2/rte_eth_szedata2.c
index 9f86c99..ef906f3 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -35,6 +35,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include 

@@ -47,10 +49,6 @@

 #include "rte_eth_szedata2.h"

-#define RTE_ETH_SZEDATA2_DEV_PATH_ARG "dev_path"
-#define RTE_ETH_SZEDATA2_RX_IFACES_ARG "rx_ifaces"
-#define RTE_ETH_SZEDATA2_TX_IFACES_ARG "tx_ifaces"
-
 #define RTE_ETH_SZEDATA2_MAX_RX_QUEUES 32
 #define RTE_ETH_SZEDATA2_MAX_TX_QUEUES 32
 #define RTE_ETH_SZEDATA2_TX_LOCK_SIZE (32 * 1024 * 1024)
@@ -60,6 +58,11 @@
  */
 #define RTE_SZE2_PACKET_HEADER_SIZE_ALIGNED 8

+#define RTE_SZEDATA2_DRIVER_NAME "rte_szedata2_pmd"
+#define RTE_SZEDATA2_PCI_DRIVER_NAME "rte_szedata2_pmd"
+
+#define SZEDATA2_DEV_PATH_FMT "/dev/szedataII%u"
+
 struct szedata2_rx_queue {
struct szedata *sze;
uint8_t rx_channel;
@@ -74,57 +77,27 @@ struct szedata2_tx_queue {
struct szedata *sze;
uint8_t tx_channel;
volatile uint64_t tx_pkts;
-   volatile uint64_t err_pkts;
volatile uint64_t tx_bytes;
-};
-
-struct rxtx_szedata2 {
-   uint32_t num_of_rx;
-   uint32_t num_of_tx;
-   uint32_t sze_rx_mask_req;
-   uint32_t sze_tx_mask_req;
-   char *sze_dev;
+   volatile uint64_t err_pkts;
 };

 struct pmd_internals {
struct szedata2_rx_queue rx_queue[RTE_ETH_SZEDATA2_MAX_RX_QUEUES];
struct szedata2_tx_queue tx_queue[RTE_ETH_SZEDATA2_MAX_TX_QUEUES];
-   unsigned nb_rx_queues;
-   unsigned nb_tx_queues;
-   uint32_t num_of_rx;
-   uint32_t num_of_tx;
-   uint32_t sze_rx_req;
-   uint32_t sze_tx_req;
-   int if_index;
-   char *sze_dev;
-};
-
-static const char *valid_arguments[] = {
-   RTE_ETH_SZEDATA2_DEV_PATH_ARG,
-   RTE_ETH_SZEDATA2_RX_IFACES_ARG,
-   RTE_ETH_SZEDATA2_TX_IFACES_ARG,
-   NULL
+   uint16_t max_rx_queues;
+   uint16_t max_tx_queues;
+   char sze_dev[PATH_MAX];
 };

 static struct ether_addr eth_addr = {
.addr_bytes = { 0x00, 0x11, 0x17, 0x00, 0x00, 0x00 }
 };
-static const char *drivername = "SZEdata2 PMD";
 static struct rte_eth_link pmd_link = {
.link_speed = ETH_LINK_SPEED_10G,
.link_duplex = ETH_LINK_FULL_DUPLEX,
.link_status = 0
 };

-
-static uint32_t
-count_ones(uint32_t num)
-{
-   num = num - ((num >> 1) & 0x55555555); /* reuse input as temporary */
-   num = (num & 0x33333333) + ((num >> 2) & 0x33333333);/* temp */
-   return (((num + (num >> 4)) & 0xF0F0F0F) * 0x1010101) >> 24; /* count */
-}
-
 static uint16_t
 eth_szedata2_rx(void *queue,
struct rte_mbuf **bufs,
@@ -905,288 +878,139 @@ next_packet:
 }

 static int
-init_rx_channels(struct rte_eth_dev *dev, int v)
+eth_rx_queue_start(struct rte_eth_dev *dev, uint16_t rxq_id)
 {
-   struct pmd_internals *internals = dev->data->dev_private;
+   struct szedata2_rx_queue *rxq = dev->data->rx_queues[rxq_id];
int ret;
-   uint32_t i;
-   uint32_t count = internals->num_of_rx;
-   uint32_t num_sub = 0;
-   uint32_t x;
-   uint32_t rx;
-   uint32_t tx;
-
-   rx = internals->sze_rx_req;
-   tx = 0;
-
-   for (i = 0; i < count; i++) {
-   /*
-* Open, subscribe rx,tx channels and start device
-*/
-   if (v)
-   RTE_LOG(INFO, PMD, "Opening SZE device %u. time\n", i);
-
-   internals->rx_queue[num_sub].sze =
-   szedata_open(internals->sze_dev);
-   if (internals->rx_queue[num_sub].sze == NULL)
-   return -1;
-
-   /* separate least significant non-zero bit */
-   x = rx & ((~rx) + 1);
-
-   if (v)
-   RTE_LOG(INFO, PMD, "Subscribing rx channel: 0x%x "
-   "tx channel: 0x%x\n", x, tx);
-
-   ret =

[dpdk-dev] [PATCH 1/5] mempool: add external mempool manager support

2016-01-28 Thread Jerin Jacob

On Tue, Jan 26, 2016 at 05:25:51PM +, David Hunt wrote:
> Adds the new rte_mempool_create_ext api and callback mechanism for
> external mempool handlers
> 
> Modifies the existing rte_mempool_create to set up the handler_idx to
> the relevant mempool handler based on the handler name:
>   ring_sp_sc
>   ring_mp_mc
>   ring_sp_mc
>   ring_mp_sc
> 
> Signed-off-by: David Hunt 
> ---
>  app/test/test_mempool_perf.c  |   1 -
>  lib/librte_mempool/Makefile   |   1 +
>  lib/librte_mempool/rte_mempool.c  | 210 +++
>  lib/librte_mempool/rte_mempool.h  | 207 +++
>  lib/librte_mempool/rte_mempool_default.c  | 229 ++
>  lib/librte_mempool/rte_mempool_internal.h |  74 ++
>  6 files changed, 634 insertions(+), 88 deletions(-)
>  create mode 100644 lib/librte_mempool/rte_mempool_default.c
>  create mode 100644 lib/librte_mempool/rte_mempool_internal.h
> 
> diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
> index cdc02a0..091c1df 100644
> --- a/app/test/test_mempool_perf.c
> +++ b/app/test/test_mempool_perf.c
> @@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
>  n_get_bulk);
>   if (unlikely(ret < 0)) {
>   rte_mempool_dump(stdout, mp);
> - rte_ring_dump(stdout, mp->ring);
>   /* in this case, objects are lost... */
>   return -1;
>   }
> diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
> index a6898ef..7c81ef6 100644
> --- a/lib/librte_mempool/Makefile
> +++ b/lib/librte_mempool/Makefile
> @@ -42,6 +42,7 @@ LIBABIVER := 1
>  
>  # all source are stored in SRCS-y
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
> +SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
>  ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
>  endif
> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> index aff5f6d..8c01838 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -59,10 +59,11 @@
>  #include 
>  
>  #include "rte_mempool.h"
> +#include "rte_mempool_internal.h"
>  
>  TAILQ_HEAD(rte_mempool_list, rte_tailq_entry);
>  
> -static struct rte_tailq_elem rte_mempool_tailq = {
> +struct rte_tailq_elem rte_mempool_tailq = {
>   .name = "RTE_MEMPOOL",
>  };
>  EAL_REGISTER_TAILQ(rte_mempool_tailq)
> @@ -149,7 +150,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
>   obj_init(mp, obj_init_arg, obj, obj_idx);
>  
>   /* enqueue in ring */
> - rte_ring_sp_enqueue(mp->ring, obj);
> + rte_mempool_ext_put_bulk(mp, &obj, 1);
>  }
>  
>  uint32_t
> @@ -375,48 +376,28 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
>   return usz;
>  }
>  
> -#ifndef RTE_LIBRTE_XEN_DOM0
> -/* stub if DOM0 support not configured */
> -struct rte_mempool *
> -rte_dom0_mempool_create(const char *name __rte_unused,
> - unsigned n __rte_unused,
> - unsigned elt_size __rte_unused,
> - unsigned cache_size __rte_unused,
> - unsigned private_data_size __rte_unused,
> - rte_mempool_ctor_t *mp_init __rte_unused,
> - void *mp_init_arg __rte_unused,
> - rte_mempool_obj_ctor_t *obj_init __rte_unused,
> - void *obj_init_arg __rte_unused,
> - int socket_id __rte_unused,
> - unsigned flags __rte_unused)
> -{
> - rte_errno = EINVAL;
> - return NULL;
> -}
> -#endif
> -
>  /* create the mempool */
>  struct rte_mempool *
>  rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
> -unsigned cache_size, unsigned private_data_size,
> -rte_mempool_ctor_t *mp_init, void *mp_init_arg,
> -rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
> -int socket_id, unsigned flags)
> + unsigned cache_size, unsigned private_data_size,
> + rte_mempool_ctor_t *mp_init, void *mp_init_arg,
> + rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
> + int socket_id, unsigned flags)
>  {
>   if (rte_xen_dom0_supported())
>   return rte_dom0_mempool_create(name, n, elt_size,
> -cache_size, private_data_size,
> -mp_init, mp_init_arg,
> -obj_init, obj_init_arg,
> -socket_id,

[dpdk-dev] [PATCH 0/5] add external mempool manager

2016-01-28 Thread Jerin Jacob

On Tue, Jan 26, 2016 at 05:25:50PM +, David Hunt wrote:
> Hi all on the list.
> 
> Here's a proposed patch for an external mempool manager
> 
> The External Mempool Manager is an extension to the mempool API that allows
> users to add and use an external mempool manager, which allows external memory
> subsystems such as external hardware memory management systems and software
> based memory allocators to be used with DPDK.

I like this approach. It will be useful for external hardware memory
pool managers.

BTW, do you see any performance impact from changing to a function
pointer based approach?
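
To make the question concrete, here is a minimal sketch of what an
index-based, function-pointer dispatch on the put/get path could look like.
It is not the code from the patch: every demo_* name is hypothetical, and a
toy LIFO stands in for the ring. Each put/get costs one table lookup plus one
indirect call on top of the handler body itself.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_CAP 8

/* Hypothetical pool object: a toy LIFO standing in for a ring. */
struct demo_pool {
	unsigned handler_idx;          /* index into demo_handlers[] */
	unsigned len;                  /* number of objects currently stored */
	void *objs[DEMO_POOL_CAP];
};

/* Hypothetical handler: only the two fast-path callbacks. */
struct demo_handler {
	const char *name;
	int (*put)(struct demo_pool *p, void *obj);
	int (*get)(struct demo_pool *p, void **obj);
};

static int
demo_lifo_put(struct demo_pool *p, void *obj)
{
	if (p->len == DEMO_POOL_CAP)
		return -1;             /* pool full */
	p->objs[p->len++] = obj;
	return 0;
}

static int
demo_lifo_get(struct demo_pool *p, void **obj)
{
	if (p->len == 0)
		return -1;             /* pool empty */
	*obj = p->objs[--p->len];
	return 0;
}

/* Table of registered handlers; pools refer to entries by index. */
static const struct demo_handler demo_handlers[] = {
	{ "demo_lifo", demo_lifo_put, demo_lifo_get },
};

/* Fast-path wrappers: one table load and one indirect call per operation. */
static inline int
demo_pool_put(struct demo_pool *p, void *obj)
{
	return demo_handlers[p->handler_idx].put(p, obj);
}

static inline int
demo_pool_get(struct demo_pool *p, void **obj)
{
	return demo_handlers[p->handler_idx].get(p, obj);
}

/* Small self-check showing the intended usage. */
static void
demo_dispatch_selfcheck(void)
{
	struct demo_pool p = { 0, 0, { NULL } };
	int x = 42;
	void *out = NULL;
	int rc;

	rc = demo_pool_get(&p, &out);
	assert(rc == -1);              /* empty pool */
	rc = demo_pool_put(&p, &x);
	assert(rc == 0);
	rc = demo_pool_get(&p, &out);
	assert(rc == 0 && out == &x);
	(void)rc;
}
```

Whether the extra indirection is measurable would have to come from a
benchmark; the sketch only shows where it sits.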

> 
> The existing API to the internal DPDK mempool manager will remain unchanged
> and will be backward compatible.
> 
> There are two aspects to external mempool manager.
>   1. Adding the code for your new mempool handler. This is achieved by
>  adding a new mempool handler source file into the librte_mempool
>  library, and using the REGISTER_MEMPOOL_HANDLER macro.
>   2. Using the new API to call rte_mempool_create_ext to create a new mempool
>  using the name parameter to identify which handler to use.
> 
> New API calls added
>  1. A new mempool 'create' function which accepts mempool handler name.
>  2. A new mempool 'rte_get_mempool_handler' function which accepts mempool
> handler name, and returns the index to the relevant set of callbacks for
> that mempool handler
> 
> Several external mempool managers may be used in the same application. A new
> mempool can then be created by using the new 'create' function, providing the
> mempool handler name to point the mempool to the relevant mempool manager
> callback structure.
> 
> The old 'create' function can still be called by legacy programs, and will
> internally work out the mempool handle based on the flags provided (single
> producer, single consumer, etc). By default handles are created internally to
> implement the built-in DPDK mempool manager and mempool types.
> 
> The external mempool manager needs to provide the following functions.
>  1. alloc - allocates the mempool memory, and adds each object onto a ring
>  2. put   - puts an object back into the mempool once an application has
> finished with it
>  3. get   - gets an object from the mempool for use by the application
>  4. get_count - gets the number of available objects in the mempool
>  5. free  - frees the mempool memory
> 
> Every time a get/put/get_count is called from the application/PMD, the
> callback for that mempool is called. These functions are in the fastpath,
> and any unoptimised handlers may limit performance.
> 
> The new APIs are as follows:
> 
> 1. rte_mempool_create_ext
> 
> struct rte_mempool *
> rte_mempool_create_ext(const char * name, unsigned n,
> unsigned cache_size, unsigned private_data_size,
> int socket_id, unsigned flags,
> const char * handler_name);
> 
> 2. rte_get_mempool_handler
> 
> int16_t
> rte_get_mempool_handler(const char *name);

Do we need the above public API? In any case we need an rte_mempool * pointer
to operate on mempools, and it holds the handler index anyway.

Maybe a functionally similar API with a different name/return type would be
better, so that an Ethernet driver which depends on a particular HW pool
manager can find out whether a given "name" is registered.
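
As a rough illustration of that suggestion, a name-based registry could offer
a boolean "is this handler registered?" query next to the index lookup. This
is only a sketch: the demo_* names, the fixed table size and the return
conventions are assumptions, not the API proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_HANDLERS 16

/* Hypothetical registry entry; a real one would also hold the callbacks. */
struct demo_handler_entry {
	const char *name;
};

static struct demo_handler_entry demo_table[DEMO_MAX_HANDLERS];
static int16_t demo_num_handlers;

/* Append a handler; returns its index, or -1 if the table is full. */
static int16_t
demo_register_handler(const char *name)
{
	if (demo_num_handlers >= DEMO_MAX_HANDLERS)
		return -1;
	demo_table[demo_num_handlers].name = name;
	return demo_num_handlers++;
}

/* Look up a handler by name; returns its index, or -1 if not found. */
static int16_t
demo_get_handler(const char *name)
{
	int16_t i;

	for (i = 0; i < demo_num_handlers; i++)
		if (strcmp(demo_table[i].name, name) == 0)
			return i;
	return -1;
}

/* The query a driver could use before relying on a specific HW pool. */
static bool
demo_handler_is_registered(const char *name)
{
	return demo_get_handler(name) >= 0;
}

/* Small self-check showing the intended usage. */
static void
demo_registry_selfcheck(void)
{
	int16_t first = demo_register_handler("ring_mp_mc");
	int16_t second = demo_register_handler("ring_sp_sc");

	assert(first == 0 && second == 1);
	assert(demo_get_handler("ring_sp_sc") == 1);
	assert(demo_handler_is_registered("ring_mp_mc"));
	assert(!demo_handler_is_registered("hw_pool"));
	(void)first;
	(void)second;
}
```

With something like this, the index-returning function could stay internal to
the library and only the boolean query would need to be public.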

> 
> Please see rte_mempool.h for further information on the parameters.
> 
> 
> The important thing to note is that the mempool handler is passed by name
> to rte_mempool_create_ext, and that in turn calls rte_get_mempool_handler to
> get the handler index, which is stored in the rte_mempool structure. This
> allows multiple processes to use the same mempool, as the function pointers
> are accessed via handler index.
> 
> The mempool handler structure contains callbacks to the implementation of
> the handler, and is set up for registration as follows:
> 
> static struct rte_mempool_handler handler_sp_mc = {
> .name = "ring_sp_mc",
> .alloc = rte_mempool_common_ring_alloc,
> .put = common_ring_sp_put,
> .get = common_ring_mc_get,
> .get_count = common_ring_get_count,
> .free = common_ring_free,
> };
> 
> And then the following macro will register the handler in the array of
> handlers
>
> REGISTER_MEMPOOL_HANDLER(handler_sp_mc);
> 
> For an example of a simple malloc based mempool manager, see
> lib/librte_mempool/custom_mempool.c
> 
> For an example of API usage, please see app/test/test_ext_mempool.c, which
> implements a rudimentary mempool manager using simple mallocs for each
> mempool object (custom_mempool.c).
> 
> 
> David Hunt (5):
>   mempool: add external mempool manager support
>   memool: add stack (lifo) based external mempool handler
>   mempool: add custom external mempool handler example
>   mempool: add autotest for external mempool custom example
>   mempool: allow rte_pktmbuf_pool_create switch between memool handlers
> 
>  app/test/Makefile |   1 +
>

[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-01-28 Thread Thomas Monjalon

2016-01-13 14:22, Panu Matilainen:
> On 01/13/2016 01:55 PM, Bruce Richardson wrote:
> > On Thu, Dec 31, 2015 at 09:12:14AM -0800, Stephen Hemminger wrote:
> >> On Tue, 29 Dec 2015 10:53:26 +0800
> >> Ziye Yang  wrote:
> >>
> >>> This patch is used to add the class_id support
> >>> for pci_probe since some devices need the class_info
> >>> (class_code, subclass_code, programming_interface)
> >>>
> >>> Signed-off-by: Ziye Yang 
> >>
> >> Since rte_pci is exposed to application this breaks the ABI.
> >
> > But applications are not going to be defining rte_pci_ids values 
> > internally, are
> > they? That is for drivers to use. Is this really an ABI breakage for 
> > applications that we
> > need to be concerned about?
> 
> There might not be applications using it but drivers are ABI consumers 
> too - think of 3rd party drivers and such.

Drivers are not ABI consumers in the sense that ABI means
Application Binary Interface.
We are talking about the driver interface here.
When establishing the ABI policy we were discussing applications only.

I agree we must allow 3rd party drivers, but there is no good reason
to try to upgrade DPDK without upgrading/porting the external drivers.
If someone does not want to release their driver but keeps upgrading DPDK,
it is acceptable IMHO to force an upgrade of their driver.
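
To make the layout concern concrete, here is a minimal sketch of why adding a
field changes what an already-compiled driver sees. The two structs are
simplified stand-ins, and placing class_id first is an assumption made for
the example, not a statement about the actual patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an ID table entry before the change. */
struct pci_id_old {
	uint16_t vendor_id;
	uint16_t device_id;
	uint16_t subsystem_vendor_id;
	uint16_t subsystem_device_id;
};

/* The same entry with a class_id added in front (assumed placement). */
struct pci_id_new {
	uint32_t class_id;
	uint16_t vendor_id;
	uint16_t device_id;
	uint16_t subsystem_vendor_id;
	uint16_t subsystem_device_id;
};

/* Offset of vendor_id as seen by code built against the old layout. */
static size_t
old_vendor_offset(void)
{
	return offsetof(struct pci_id_old, vendor_id);
}

/* Offset of vendor_id as seen by code built against the new layout. */
static size_t
new_vendor_offset(void)
{
	return offsetof(struct pci_id_new, vendor_id);
}
```

A driver binary built against the old header would walk an ID table with the
old entry size and read vendor_id at the old offset, so it has to be rebuilt
against the new header. That is the porting step discussed above.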

[dpdk-dev] [RFC] ABI breakage for rte_mempool to reduce footprint

2016-01-28 Thread Wiles, Keith

Hi Everyone,

Currently every mempool created has a footprint of 1.5 MB of memory just for
struct rte_mempool; this also applies to all pools created through the
rte_pktmbuf create calls. The issue is that local_cache adds about 1.5 MB of
memory, which is a huge amount IMHO for non-cached mempools. Without
local_cache the structure is about 192 bytes. You can set the config option
for the cache to 'n', but then no allocation will use the per-core cache. I
have some code, which I will send as a patch, that allocates local_cache only
when a mempool is created with a non-zero cache size set in the call.

This will break the ABI for struct rte_mempool, but it does remove some of the
ifdefs for RTE_MEMPOOL_CACHE_SIZE in the code. The performance appears to be
the same, but I will do some more testing before I submit the patch.

Please let me know whether this seems reasonable, or if you have other comments.

Regards,
Keith
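
A minimal sketch of the idea, assuming a simplified pool structure: the
per-core cache array becomes a pointer that is allocated only when the caller
asks for a non-zero cache size. The names (struct pool, pool_create,
MAX_LCORE, CACHE_MAX_OBJS) and the two constants are assumptions for
illustration. With these values the array is roughly 1.5 MB on a 64-bit
build. Plain calloc() stands in for whatever allocator a real patch would use.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_LCORE 128            /* assumed build-time core limit */
#define CACHE_MAX_OBJS (512 * 3) /* assumed per-core cache capacity */

/* Per-core cache of object pointers. */
struct pool_cache {
	uint32_t len;
	void *objs[CACHE_MAX_OBJS];
};

/* Pool header: the cache array is no longer embedded in the structure. */
struct pool {
	uint32_t cache_size;
	struct pool_cache *local_cache; /* NULL when caching is disabled */
};

/* Create a pool; the cache array is allocated only if cache_size != 0. */
static struct pool *
pool_create(uint32_t cache_size)
{
	struct pool *p = calloc(1, sizeof(*p));

	if (p == NULL)
		return NULL;
	p->cache_size = cache_size;
	if (cache_size != 0) {
		p->local_cache = calloc(MAX_LCORE, sizeof(*p->local_cache));
		if (p->local_cache == NULL) {
			free(p);
			return NULL;
		}
	}
	return p;
}

/* Release a pool and its cache array, if any. */
static void
pool_free(struct pool *p)
{
	if (p != NULL) {
		free(p->local_cache);
		free(p);
	}
}
```

The get/put paths would then test local_cache (or cache_size) at run time
instead of relying on a compile-time ifdef.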

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-28 Thread Tetsuya Mukawa

On 2016/01/28 18:48, Xie, Huawei wrote:
> On 1/28/2016 10:47 AM, Tetsuya Mukawa wrote:
>> On 2016/01/28 0:58, Xie, Huawei wrote:
>>> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>>> [snip]
 +
 +static int
 +qtest_raw_recv(int fd, char *buf, size_t count)
 +{
 +  size_t len = count;
 +  size_t total_len = 0;
 +  int ret = 0;
 +
 +  while (len > 0) {
 +  ret = read(fd, buf, len);
 +  if (ret == (int)len)
 +  break;
 +  if (*(buf + ret - 1) == '\n')
 +  break;
>>> The above two lines should be put after the below if block.
>> Yes, it should be so.
>>
 +  if (ret == -1) {
 +  if (errno == EINTR)
 +  continue;
 +  return ret;
 +  }
 +  total_len += ret;
 +  buf += ret;
 +  len -= ret;
 +  }
 +  return total_len + ret;
 +}
 +
>>> [snip]
>>>
 +
 +static void
 +qtest_handle_one_message(struct qtest_session *s, char *buf)
 +{
 +  int ret;
 +
 +  if (strncmp(buf, interrupt_message, strlen(interrupt_message)) == 0) {
 +  if (rte_atomic16_read(&s->enable_intr) == 0)
 +  return;
 +
 +  /* relay interrupt to pipe */
 +  ret = write(s->irqfds.writefd, "1", 1);
 +  if (ret < 0)
 +  rte_panic("cannot relay interrupt\n");
 +  } else {
 +  /* relay normal message to pipe */
 +  ret = qtest_raw_send(s->msgfds.writefd, buf, strlen(buf));
 +  if (ret < 0)
 +  rte_panic("cannot relay normal message\n");
 +  }
 +}
 +
 +static char *
 +qtest_get_next_message(char *p)
 +{
 +  p = strchr(p, '\n');
 +  if ((p == NULL) || (*(p + 1) == '\0'))
 +  return NULL;
 +  return p + 1;
 +}
 +
 +static void
 +qtest_close_one_socket(int *fd)
 +{
 +  if (*fd > 0) {
 +  close(*fd);
 +  *fd = -1;
 +  }
 +}
 +
 +static void
 +qtest_close_sockets(struct qtest_session *s)
 +{
 +  qtest_close_one_socket(&s->qtest_socket);
 +  qtest_close_one_socket(&s->msgfds.readfd);
 +  qtest_close_one_socket(&s->msgfds.writefd);
 +  qtest_close_one_socket(&s->irqfds.readfd);
 +  qtest_close_one_socket(&s->irqfds.writefd);
 +  qtest_close_one_socket(&s->ivshmem_socket);
 +}
 +
 +/*
 + * This thread relays QTest response using pipe.
 + * The function is needed because we need to separate IRQ message from 
 others.
 + */
 +static void *
 +qtest_event_handler(void *data) {
 +  struct qtest_session *s = (struct qtest_session *)data;
 +  char buf[1024];
 +  char *p;
 +  int ret;
 +
 +  for (;;) {
 +  memset(buf, 0, sizeof(buf));
 +  ret = qtest_raw_recv(s->qtest_socket, buf, sizeof(buf));
 +  if (ret < 0) {
 +  qtest_close_sockets(s);
 +  return NULL;
 +  }
 +
 +  /* may receive multiple messages at the same time */
>>> From the qtest_raw_recv implementation, if at some point one message is
>>> received by two qtest_raw_recv calls, then is that message discarded?
>>> We could save the last incomplete message in buffer, and combine the
>>> message received next time together.
>> I guess we don't lose replies from QEMU.
>> Please let me describe more.
>>
>> According to the qtest specification, after sending a message, we need
>> to receive a reply like below.
>> APP: ---command---> QEMU
>> APP: <---OK QEMU
>>
>> But to handle interrupt messages, we need to take care of the case below.
>> APP: ---command---> QEMU
>> APP: <---interrupt QEMU
>> APP: <---OK QEMU
>>
>> Also, we need to handle the case where multiple threads try to send a
>> qtest message.
>> Anyway, here is current implementation.
>>
>> So far, we have 3 types of sockets.
>> 1. socket for qtest messaging.
>> 2. socket for relaying normal message.
>> 3. socket for relaying interrupt message.
>>
>> About read direction:
>> The qtest socket is only read by "qtest_event_handler". The handler may
>> receive multiple messages at once.
> I think there are two assumptions: that all messages end with "\n",
> and that sizeof(buf) can hold the combined length of all the
> messages that QEMU could send at one time.
> Otherwise, in the last read call of qtest_raw_recv, you might receive
> only part of a message.

I've got your point. I will fix above case.

Thanks,
Tetsuya
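
One way to handle the partial-message case discussed above is to keep the
incomplete tail between reads and only hand complete, newline-terminated
messages to the handler. The following is a minimal sketch under that
assumption, not the fix that was eventually posted. struct line_framer,
framer_feed() and count_line() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define FRAME_BUF_SIZE 1024

/* Holds the bytes of a message that has not been terminated yet. */
struct line_framer {
	char buf[FRAME_BUF_SIZE];
	size_t len; /* bytes of the incomplete message carried over */
};

typedef void (*line_cb)(const char *line, size_t len, void *arg);

/*
 * Append a chunk (as returned by one read call) and invoke cb once for
 * each complete '\n'-terminated message, newline included in len.
 * Returns 0 on success, -1 if a message does not fit in the buffer.
 */
static int
framer_feed(struct line_framer *f, const char *chunk, size_t n,
	    line_cb cb, void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (f->len == sizeof(f->buf))
			return -1;
		f->buf[f->len++] = chunk[i];
		if (chunk[i] == '\n') {
			cb(f->buf, f->len, arg);
			f->len = 0;
		}
	}
	return 0;
}

/* Example callback: counts messages and remembers the last length. */
struct line_stats {
	int count;
	size_t last_len;
};

static void
count_line(const char *line, size_t len, void *arg)
{
	struct line_stats *st = arg;

	(void)line;
	st->count++;
	st->last_len = len;
}
```

With this shape, a message split across two reads is delivered once, after
the second read supplies its terminating newline, and a message larger than
the buffer is reported as an error instead of being silently truncated.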

[dpdk-dev] [PATCH v2 3/3] virtio: Add a new layer to abstract pci access method

2016-01-28 Thread Tetsuya Mukawa

This patch adds function pointers to abstract the pci access method.
This abstraction layer will be used when the virtio-net PMD supports
the container extension.

The below functions abstract how to access the pci configuration space.

struct virtio_pci_cfg_ops {
int   (*map)(...);
void  (*unmap)(...);
void *(*get_mapped_addr)(...);
int   (*read)(...);
};

The pci configuration space has information on how to access the virtio
device registers. Basically, there are 2 ways to access the
registers. One is using portio and the other is using mapped memory.
The below functions abstract this access method.

struct virtio_pci_dev_ops {
uint8_t  (*read8)(...);
uint16_t (*read16)(...);
uint32_t (*read32)(...);
void (*write8)(...);
void (*write16)(...);
void (*write32)(...);
};
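
As a rough illustration of how callers use such a table (a minimal sketch
with hypothetical names, reduced to 8-bit accessors and a memory-mapped
backend only):

```c
#include <stdint.h>

struct hw; /* hypothetical device handle */

/* Reduced ops table: one read and one write accessor. */
struct dev_ops {
	uint8_t (*read8)(struct hw *hw, uint8_t *addr);
	void (*write8)(struct hw *hw, uint8_t *addr, uint8_t val);
};

struct hw {
	const struct dev_ops *ops; /* selected once at init time */
	uint8_t *base;             /* start of the register window */
};

/* Memory-mapped backend: the address is dereferenced directly. */
static uint8_t
mmio_read8(struct hw *hw, uint8_t *addr)
{
	(void)hw;
	return *(volatile uint8_t *)addr;
}

static void
mmio_write8(struct hw *hw, uint8_t *addr, uint8_t val)
{
	(void)hw;
	*(volatile uint8_t *)addr = val;
}

static const struct dev_ops mmio_ops = {
	.read8 = mmio_read8,
	.write8 = mmio_write8,
};

/* Callers go through the table and never see which backend is active. */
static uint8_t
hw_read8(struct hw *hw, uint32_t off)
{
	return hw->ops->read8(hw, hw->base + off);
}

static void
hw_write8(struct hw *hw, uint32_t off, uint8_t val)
{
	hw->ops->write8(hw, hw->base + off, val);
}
```

A port-I/O or container backend would supply a different table with the same
signatures, leaving hw_read8()/hw_write8() unchanged.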

Signed-off-by: Tetsuya Mukawa 
---
 drivers/net/virtio/virtio_ethdev.c |   4 +-
 drivers/net/virtio/virtio_pci.c| 531 +
 drivers/net/virtio/virtio_pci.h|  24 +-
 3 files changed, 386 insertions(+), 173 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 37833a8..c477b05 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1037,7 +1037,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

pci_dev = eth_dev->pci_dev;

-   if (vtpci_init(pci_dev, hw) < 0)
+   if (vtpci_init(eth_dev, hw) < 0)
return -1;

/* Reset the device although not necessary at startup */
@@ -1177,7 +1177,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
rte_intr_callback_unregister(&pci_dev->intr_handle,
virtio_interrupt_handler,
eth_dev);
-   vtpci_uninit(pci_dev, hw);
+   vtpci_uninit(eth_dev, hw);

PMD_INIT_LOG(DEBUG, "dev_uninit completed");

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 3e6be8c..c6d72f9 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -49,24 +49,198 @@
 #define PCI_CAPABILITY_LIST0x34
 #define PCI_CAP_ID_VNDR0x09

+static uint8_t
+phys_legacy_read8(struct virtio_hw *hw, uint8_t *addr)
+{
+   return inb((unsigned short)(hw->io_base + (uint64_t)addr));
+}

-#define VIRTIO_PCI_REG_ADDR(hw, reg) \
-   (unsigned short)((hw)->io_base + (reg))
+static uint16_t
+phys_legacy_read16(struct virtio_hw *hw, uint16_t *addr)
+{
+   return inw((unsigned short)(hw->io_base + (uint64_t)addr));
+}

-#define VIRTIO_READ_REG_1(hw, reg) \
-   inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_1(hw, reg, value) \
-   outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+static uint32_t
+phys_legacy_read32(struct virtio_hw *hw, uint32_t *addr)
+{
+   return inl((unsigned short)(hw->io_base + (uint64_t)addr));
+}

-#define VIRTIO_READ_REG_2(hw, reg) \
-   inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_2(hw, reg, value) \
-   outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+static void
+phys_legacy_write8(struct virtio_hw *hw, uint8_t *addr, uint8_t val)
+{
+   return outb_p((unsigned char)val,
+   (unsigned short)(hw->io_base + (uint64_t)addr));
+}

-#define VIRTIO_READ_REG_4(hw, reg) \
-   inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_4(hw, reg, value) \
-   outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+static void
+phys_legacy_write16(struct virtio_hw *hw, uint16_t *addr, uint16_t val)
+{
+   return outw_p((unsigned short)val,
+   (unsigned short)(hw->io_base + (uint64_t)addr));
+}
+
+static void
+phys_legacy_write32(struct virtio_hw *hw, uint32_t *addr, uint32_t val)
+{
+   return outl_p((unsigned int)val,
+   (unsigned short)(hw->io_base + (uint64_t)addr));
+}
+
+static const struct virtio_pci_dev_ops phys_legacy_dev_ops = {
+   .read8  = phys_legacy_read8,
+   .read16 = phys_legacy_read16,
+   .read32 = phys_legacy_read32,
+   .write8 = phys_legacy_write8,
+   .write16= phys_legacy_write16,
+   .write32= phys_legacy_write32,
+};
+
+static uint8_t
+phys_modern_read8(struct virtio_hw *hw __rte_unused, uint8_t *addr)
+{
+   return *(volatile uint8_t *)addr;
+}
+
+static uint16_t
+phys_modern_read16(struct virtio_hw *hw __rte_unused, uint16_t *addr)
+{
+   return *(volatile uint16_t *)addr;
+}
+
+static uint32_t
+phys_modern_read32(struct virtio_hw *hw __rte_unused, uint32_t *addr)
+{
+   return *(volatile uint32_t *)addr;
+}
+
+static void
+phys_modern_write8(struct virtio_hw *hw __rte_unused,
+   uint8_t *addr, uint8_t val)
+{
+   *(volatile uint8_t *)addr = val;
+}
+
+static

[dpdk-dev] [PATCH v2 2/3] virtio: move rte_eal_pci_unmap_device() to virtio_pci.c

2016-01-28 Thread Tetsuya Mukawa

To abstract pci access method, the patch moves below function
to "virtio_pci.c".
 - rte_eal_pci_unmap_device()

Signed-off-by: Tetsuya Mukawa 
---
 drivers/net/virtio/virtio_ethdev.c |  2 +-
 drivers/net/virtio/virtio_pci.c| 11 +++
 drivers/net/virtio/virtio_pci.h|  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index deb0382..37833a8 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1177,7 +1177,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
rte_intr_callback_unregister(&pci_dev->intr_handle,
virtio_interrupt_handler,
eth_dev);
-   rte_eal_pci_unmap_device(pci_dev);
+   vtpci_uninit(pci_dev, hw);

PMD_INIT_LOG(DEBUG, "dev_uninit completed");

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 1fca39f..3e6be8c 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -892,3 +892,14 @@ vtpci_init(struct rte_pci_device *dev, struct virtio_hw 
*hw)

return 0;
 }
+
+void
+vtpci_uninit(struct rte_pci_device *dev, struct virtio_hw *hw)
+{
+   hw->dev  = NULL;
+   hw->vtpci_ops = NULL;
+   hw->use_msix = 0;
+   hw->io_base  = 0;
+   hw->modern   = 0;
+   rte_eal_pci_unmap_device(dev);
+}
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 0544a07..17c7972 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -328,6 +328,7 @@ vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
  * Function declaration from virtio_pci.c
  */
 int vtpci_init(struct rte_pci_device *, struct virtio_hw *);
+void vtpci_uninit(struct rte_pci_device *dev, struct virtio_hw *);
 void vtpci_reset(struct virtio_hw *);

 void vtpci_reinit_complete(struct virtio_hw *);
-- 
2.1.4

[dpdk-dev] [PATCH v2 0/3] virtio: Add a new layer to abstract pci access method

2016-01-28 Thread Tetsuya Mukawa

The patches abstract the pci access method of the virtio-net PMD.
The patches should be applied on top of Yuanhan's patch series below.
 - [PATCH v6 0/9] virtio 1.0 enabling for virtio pmd driver.

PATCH v2 changes
 - Rebase on Yuanhan's v6 patches.
 - split virtio_pci_access_ops in 2 different structures.
 - some refactoring.


Tetsuya Mukawa (3):
  virtio: Change the parameter order of io_write8/16/32()
  virtio: move rte_eal_pci_unmap_device() to virtio_pci.c
  virtio: Add a new layer to abstract pci access method

 drivers/net/virtio/virtio_ethdev.c |   4 +-
 drivers/net/virtio/virtio_pci.c| 554 +
 drivers/net/virtio/virtio_pci.h|  23 +-
 3 files changed, 403 insertions(+), 178 deletions(-)

-- 
2.1.4

[dpdk-dev] [PATCH v3 4/4] app/test-pmd: test tunnel filter for IP in GRE

2016-01-28 Thread Xutao Sun

This patch adds some options to the tunnel_filter command to test IP in GRE packet
classification on i40e.

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c | 36 
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 67df259..4dedf28 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -301,12 +301,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

-   "tunnel_filter add (port_id) (outer_mac) (inner_mac) 
(ip_addr) "
-   "(inner_vlan) (vxlan|nvgre) (filter_type) (tenant_id) 
(queue_id)\n"
+   "tunnel_filter add (port_id) (outer_ip) (inner_ip) 
(outer_mac)"
+   "(inner_mac) (ip_addr) (inner_vlan) 
(vxlan|nvgre|ipingre) (filter_type)"
+   "(tenant_id) (queue_id)\n"
"   add a tunnel filter of a port.\n\n"

-   "tunnel_filter rm (port_id) (outer_mac) (inner_mac) 
(ip_addr) "
-   "(inner_vlan) (vxlan|nvgre) (filter_type) (tenant_id) 
(queue_id)\n"
+   "tunnel_filter rm (port_id) (outer_ip) (inner_ip) 
(outer_mac)"
+   "(inner_mac) (ip_addr) (inner_vlan) 
(vxlan|nvgre|ipingre) (filter_type)"
+   "(tenant_id) (queue_id)\n"
"   remove a tunnel filter of a port.\n\n"

"rx_vxlan_port add (udp_port) (port_id)\n"
@@ -6640,6 +6642,8 @@ cmd_tunnel_filter_parsed(void *parsed_result,
struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
int ret = 0;

+   memset(&tunnel_filter_conf, 0, sizeof(tunnel_filter_conf));
+
rte_memcpy(&tunnel_filter_conf.outer_mac, &res->outer_mac,
ETHER_ADDR_LEN);
rte_memcpy(&tunnel_filter_conf.inner_mac, &res->inner_mac,
@@ -6648,12 +6652,14 @@ cmd_tunnel_filter_parsed(void *parsed_result,

if (res->ip_value.family == AF_INET) {
tunnel_filter_conf.ip_addr.ipv4_addr =
-   res->ip_value.addr.ipv4.s_addr;
+   rte_be_to_cpu_32(res->ip_value.addr.ipv4.s_addr);
tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV4;
} else {
-   memcpy(&(tunnel_filter_conf.ip_addr.ipv6_addr),
-   &(res->ip_value.addr.ipv6),
-   sizeof(struct in6_addr));
+   int i;
+   for (i = 0; i < 4; i++) {
+   tunnel_filter_conf.ip_addr.ipv6_addr[i] =
+   rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
+   }
tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV6;
}

@@ -6669,6 +6675,10 @@ cmd_tunnel_filter_parsed(void *parsed_result,
else if (!strcmp(res->filter_type, "omac-imac-tenid"))
tunnel_filter_conf.filter_type =
RTE_TUNNEL_FILTER_OMAC_TENID_IMAC;
+   else if (!strcmp(res->filter_type, "oip"))
+   tunnel_filter_conf.filter_type = ETH_TUNNEL_FILTER_OIP;
+   else if (!strcmp(res->filter_type, "iip"))
+   tunnel_filter_conf.filter_type = ETH_TUNNEL_FILTER_IIP;
else {
printf("The filter type is not supported");
return;
@@ -6678,6 +6688,8 @@ cmd_tunnel_filter_parsed(void *parsed_result,
tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_VXLAN;
else if (!strcmp(res->tunnel_type, "nvgre"))
tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_NVGRE;
+   else if (!strcmp(res->tunnel_type, "ipingre"))
+   tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_IP_IN_GRE;
else {
printf("The tunnel type %s not supported.\n", res->tunnel_type);
return;
@@ -6723,11 +6735,11 @@ cmdline_parse_token_ipaddr_t cmd_tunnel_filter_ip_value 
=
ip_value);
 cmdline_parse_token_string_t cmd_tunnel_filter_tunnel_type =
TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
-   tunnel_type, "vxlan#nvgre");
+   tunnel_type, "vxlan#nvgre#ipingre");

 cmdline_parse_token_string_t cmd_tunnel_filter_filter_type =
TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
-   filter_type, "imac-ivlan#imac-ivlan-tenid#imac-tenid#"
+   filter_type, "oip#iip#imac-ivlan#imac-ivlan-tenid#imac-tenid#"
"imac#omac-imac-tenid");
 cmdline_parse_token_num_t cmd_tunnel_filter_tenant_id =
TOKEN_NUM_INITIALIZER(struct cmd_tunnel_filter_result,
@@ -6741,8 +6753,8 @@ cmdline_parse_inst_t cmd_tunnel_filter = {
.data = (void *)0,
.help_str = "add/rm tunnel filter of a port: "
"tunnel_filter add port_id outer_mac inner_mac ip "
-   "inner_vlan

[dpdk-dev] [PATCH v3 3/4] driver/i40e: implement tunnel filter for IP in GRE

2016-01-28 Thread Xutao Sun

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 drivers/net/i40e/i40e_ethdev.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 1dd1077..5c0eff9 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -5797,6 +5797,12 @@ i40e_dev_get_filter_type(uint16_t filter_type, uint16_t 
*flag)
case ETH_TUNNEL_FILTER_IMAC:
*flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC;
break;
+   case ETH_TUNNEL_FILTER_OIP:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_OIP;
+   break;
+   case ETH_TUNNEL_FILTER_IIP:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IIP;
+   break;
default:
PMD_DRV_LOG(ERR, "invalid tunnel filter type");
return -EINVAL;
@@ -5811,7 +5817,7 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
uint8_t add)
 {
uint16_t ip_type;
-   uint8_t tun_type = 0;
+   uint8_t i, tun_type = 0;
int val, ret = 0;
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
struct i40e_vsi *vsi = pf->main_vsi;
@@ -5833,16 +5839,22 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
(void)rte_memcpy(&pfilter->inner_mac, &tunnel_filter->inner_mac,
ETHER_ADDR_LEN);

-   pfilter->inner_vlan = tunnel_filter->inner_vlan;
+   pfilter->inner_vlan = rte_cpu_to_le_16(tunnel_filter->inner_vlan);
if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV4;
+   tunnel_filter->ip_addr.ipv4_addr =
+   rte_cpu_to_le_32(tunnel_filter->ip_addr.ipv4_addr);
(void)rte_memcpy(&pfilter->ipaddr.v4.data,
-   &tunnel_filter->ip_addr,
+   &tunnel_filter->ip_addr.ipv4_addr,
sizeof(pfilter->ipaddr.v4.data));
} else {
ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV6;
+   for (i = 0; i < 4; i++) {
+   tunnel_filter->ip_addr.ipv6_addr[i] =
+   rte_cpu_to_le_32(tunnel_filter->ip_addr.ipv6_addr[i]);
+   }
(void)rte_memcpy(&pfilter->ipaddr.v6.data,
-   &tunnel_filter->ip_addr,
+   &tunnel_filter->ip_addr.ipv6_addr,
sizeof(pfilter->ipaddr.v6.data));
}

@@ -5854,6 +5866,9 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
case RTE_TUNNEL_TYPE_NVGRE:
tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_NVGRE_OMAC;
break;
+   case RTE_TUNNEL_TYPE_IP_IN_GRE:
+   tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_IP;
+   break;
default:
/* Other tunnel types is not supported. */
PMD_DRV_LOG(ERR, "tunnel type is not supported.");
@@ -5868,10 +5883,11 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
return -EINVAL;
}

-   pfilter->flags |= I40E_AQC_ADD_CLOUD_FLAGS_TO_QUEUE | ip_type |
-   (tun_type << I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT);
-   pfilter->tenant_id = tunnel_filter->tenant_id;
-   pfilter->queue_number = tunnel_filter->queue_id;
+   pfilter->flags |= rte_cpu_to_le_16(
+   I40E_AQC_ADD_CLOUD_FLAGS_TO_QUEUE
+   | ip_type | (tun_type << I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT));
+   pfilter->tenant_id = rte_cpu_to_le_32(tunnel_filter->tenant_id);
+   pfilter->queue_number = rte_cpu_to_le_16(tunnel_filter->queue_id);

if (add)
ret = i40e_aq_add_cloud_filters(hw, vsi->seid, cld_filter, 1);
-- 
1.9.3
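
The byte-order handling in this series has two steps: the address arrives in
network (big-endian) order, is converted to CPU order by the application, and
is converted to little-endian by the driver before it is handed to the
hardware. A portable sketch of both steps follows. be32_to_host() and
host_to_le32() are stand-in helpers written for this example, not the DPDK
rte_be_to_cpu_32()/rte_cpu_to_le_32() implementations.

```c
#include <stdint.h>
#include <string.h>

/* Interpret a 32-bit value whose in-memory bytes are big-endian. */
static uint32_t
be32_to_host(uint32_t be)
{
	uint8_t b[4];

	memcpy(b, &be, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Produce a 32-bit value whose in-memory bytes are little-endian. */
static uint32_t
host_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24)
	};
	uint32_t le;

	memcpy(&le, b, sizeof(le));
	return le;
}
```

On a little-endian host the second step leaves the value unchanged and on a
big-endian host it swaps the bytes, so the layout seen by the device is the
same either way.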

[dpdk-dev] [PATCH v3 2/4] lib/ether: add IP in GRE type

2016-01-28 Thread Xutao Sun

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 lib/librte_ether/rte_eth_ctrl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 30cbde7..0e948a1 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -244,6 +244,7 @@ enum rte_eth_tunnel_type {
RTE_TUNNEL_TYPE_GENEVE,
RTE_TUNNEL_TYPE_TEREDO,
RTE_TUNNEL_TYPE_NVGRE,
+   RTE_TUNNEL_TYPE_IP_IN_GRE,
RTE_TUNNEL_TYPE_MAX,
 };

-- 
1.9.3

[dpdk-dev] [PATCH v3 1/4] lib/ether: optimize the'rte_eth_tunnel_filter_conf' structure

2016-01-28 Thread Xutao Sun

Change the fields of outer_mac and inner_mac from pointer to struct in order to 
keep the code's readability.

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c   |  6 --
 doc/guides/rel_notes/deprecation.rst |  5 -
 doc/guides/rel_notes/release_2_3.rst |  2 ++
 drivers/net/i40e/i40e_ethdev.c   | 12 ++--
 lib/librte_ether/rte_eth_ctrl.h  |  4 ++--
 5 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6d28c1b..67df259 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -6640,8 +6640,10 @@ cmd_tunnel_filter_parsed(void *parsed_result,
struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
int ret = 0;

-   tunnel_filter_conf.outer_mac = &res->outer_mac;
-   tunnel_filter_conf.inner_mac = &res->inner_mac;
+   rte_memcpy(&tunnel_filter_conf.outer_mac, &res->outer_mac,
+   ETHER_ADDR_LEN);
+   rte_memcpy(&tunnel_filter_conf.inner_mac, &res->inner_mac,
+   ETHER_ADDR_LEN);
tunnel_filter_conf.inner_vlan = res->inner_vlan;

if (res->ip_value.family == AF_INET) {
diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index e94d4a2..a895364 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -32,11 +32,6 @@ Deprecation Notices
   RTE_ETH_FLOW_MAX. The release 2.2 does not contain these ABI changes,
   but release 2.3 will.

-* ABI changes are planned for rte_eth_tunnel_filter_conf. Change the fields
-  of outer_mac and inner_mac from pointer to struct in order to keep the
-  code's readability. The release 2.2 does not contain these ABI changes, but
-  release 2.3 will, and no backwards compatibility is planned.
-
 * The scheduler statistics structure will change to allow keeping track of
   RED actions.

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 99de186..ee7fd48 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -39,6 +39,8 @@ API Changes
 ABI Changes
 ---

+* The fields of outer_mac and inner_mac were changed from pointer
+  to struct in order to keep the code's readability.

 Shared Library Versions
 ---
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..1dd1077 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -5828,10 +5828,10 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
}
pfilter = cld_filter;

-   (void)rte_memcpy(&pfilter->outer_mac, tunnel_filter->outer_mac,
-   sizeof(struct ether_addr));
-   (void)rte_memcpy(&pfilter->inner_mac, tunnel_filter->inner_mac,
-   sizeof(struct ether_addr));
+   (void)rte_memcpy(&pfilter->outer_mac, &tunnel_filter->outer_mac,
+   ETHER_ADDR_LEN);
+   (void)rte_memcpy(&pfilter->inner_mac, &tunnel_filter->inner_mac,
+   ETHER_ADDR_LEN);

pfilter->inner_vlan = tunnel_filter->inner_vlan;
if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
@@ -6131,13 +6131,13 @@ i40e_tunnel_filter_param_check(struct i40e_pf *pf,
}

if ((filter->filter_type & ETH_TUNNEL_FILTER_OMAC) &&
-   (is_zero_ether_addr(filter->outer_mac))) {
+   (is_zero_ether_addr(&filter->outer_mac))) {
PMD_DRV_LOG(ERR, "Cannot add NULL outer MAC address");
return -EINVAL;
}

if ((filter->filter_type & ETH_TUNNEL_FILTER_IMAC) &&
-   (is_zero_ether_addr(filter->inner_mac))) {
+   (is_zero_ether_addr(&filter->inner_mac))) {
PMD_DRV_LOG(ERR, "Cannot add NULL inner MAC address");
return -EINVAL;
}
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index ce224ad..30cbde7 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -280,8 +280,8 @@ enum rte_tunnel_iptype {
  * Tunneling Packet filter configuration.
  */
 struct rte_eth_tunnel_filter_conf {
-   struct ether_addr *outer_mac;  /**< Outer MAC address filter. */
-   struct ether_addr *inner_mac;  /**< Inner MAC address filter. */
+   struct ether_addr outer_mac;  /**< Outer MAC address filter. */
+   struct ether_addr inner_mac;  /**< Inner MAC address filter. */
uint16_t inner_vlan;   /**< Inner VLAN filter. */
enum rte_tunnel_iptype ip_type; /**< IP address type. */
union {
-- 
1.9.3

[dpdk-dev] [PATCH v3 0/4] Add tunnel filter support for IP in GRE on i40e

2016-01-28 Thread Xutao Sun

This patch set adds tunnel filter support for IP in GRE on i40e.

v2 changes:
  Fix the byte order problem.

v3 changes:
  Remove the deprecation notice and update the release notes.

Xutao Sun (4):
  lib/ether: optimize the'rte_eth_tunnel_filter_conf' structure
  lib/ether: add IP in GRE type
  driver/i40e: implement tunnel filter for IP in GRE
  app/test-pmd: test tunnel filter for IP in GRE

 app/test-pmd/cmdline.c   | 42 ++
 doc/guides/rel_notes/deprecation.rst |  5 
 doc/guides/rel_notes/release_2_3.rst |  2 ++
 drivers/net/i40e/i40e_ethdev.c   | 44 
 lib/librte_ether/rte_eth_ctrl.h  |  5 ++--
 5 files changed, 63 insertions(+), 35 deletions(-)

-- 
1.9.3

[dpdk-dev] [PATCH] pcap: fix captured frame length

2016-01-28 Thread Nicolas Pernas Maradei

Hi Dror,

Good catch. What you are saying makes sense and it is also explained in 
pcap's documentation. Was your setup unusual though?
This might sound like a silly question but I don't remember seeing that 
issue and I should have since your fix is correct.

Nico.
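
For readers unfamiliar with the two length fields, here is a minimal sketch
of the distinction the fix relies on. struct cap_hdr mirrors only the two
length fields of a capture record header and copy_captured() is a
hypothetical helper. Neither is libpcap or driver code.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the two length fields of a capture record header. */
struct cap_hdr {
	uint32_t caplen; /* bytes actually stored in the capture */
	uint32_t len;    /* bytes the packet had on the wire */
};

/*
 * Copy a captured packet into dst. Only caplen bytes exist in data, so
 * caplen (not len) bounds the copy. Returns the number of bytes copied,
 * or -1 if the captured data does not fit in dst.
 */
static int
copy_captured(uint8_t *dst, size_t dst_size,
	      const struct cap_hdr *hdr, const uint8_t *data)
{
	if (hdr->caplen > dst_size)
		return -1;
	memcpy(dst, data, hdr->caplen);
	return (int)hdr->caplen;
}
```

When a capture was taken with a snapshot length smaller than the packet,
caplen is smaller than len, and copying len bytes would read past the stored
data.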

On 28/01/16 11:09, Dror Birkman wrote:
> The actual captured length is header.caplen, whereas header.len is
> the original length on the wire.
>
> Signed-off-by: Dror Birkman 
> ---
>
>
> Without this fix, if the captured length is smaller than the original
> length on the wire, mbuf will contain incorrect data.
>
>
>   drivers/net/pcap/rte_eth_pcap.c | 12 ++--
>   1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
> index f9230eb..1d121f8 100644
> --- a/drivers/net/pcap/rte_eth_pcap.c
> +++ b/drivers/net/pcap/rte_eth_pcap.c
> @@ -220,25 +220,25 @@ eth_pcap_rx(void *queue,
>   buf_size = 
> (uint16_t)(rte_pktmbuf_data_room_size(pcap_q->mb_pool) -
>   RTE_PKTMBUF_HEADROOM);
>   
> - if (header.len <= buf_size) {
> + if (header.caplen <= buf_size) {
>   /* pcap packet will fit in the mbuf, go ahead and copy 
> */
>   rte_memcpy(rte_pktmbuf_mtod(mbuf, void *), packet,
> - header.len);
> - mbuf->data_len = (uint16_t)header.len;
> + header.caplen);
> + mbuf->data_len = (uint16_t)header.caplen;
>   } else {
>   /* Try read jumbo frame into multi mbufs. */
>   if (unlikely(eth_pcap_rx_jumbo(pcap_q->mb_pool,
>  mbuf,
>  packet,
> -header.len) == -1))
> +header.caplen) == -1))
>   break;
>   }
>   
> - mbuf->pkt_len = (uint16_t)header.len;
> + mbuf->pkt_len = (uint16_t)header.caplen;
>   mbuf->port = pcap_q->in_port;
>   bufs[num_rx] = mbuf;
>   num_rx++;
> - rx_bytes += header.len;
> + rx_bytes += header.caplen;
>   }
>   pcap_q->rx_pkts += num_rx;
>   pcap_q->rx_bytes += rx_bytes;

[dpdk-dev] [PATCH v2] lib: remove "extern" keyword for functions from header files

2016-01-28 Thread Thomas Monjalon

2016-01-28 14:31, Ferruh Yigit:
> Remove "extern" keywords in header files, the ones for function
> prototypes
> 
> v2:
> * fix indentation
> 
> Signed-off-by: Ferruh Yigit 

Applied, thanks

[dpdk-dev] [PATCH v6] vfio: Support for no-IOMMU mode

2016-01-28 Thread Thomas Monjalon

2016-01-28 11:57, Anatoly Burakov:
> This commit is adding a generic mechanism to support multiple IOMMU
> types. For now, it's only type 1 (x86 IOMMU) and no-IOMMU (a special
> VFIO mode that doesn't use IOMMU at all), but it's easily extended
> by adding necessary definitions to eal_vfio.h, and DMA mapping
> functions to eal_pci_vfio.c.
> 
> Since type 1 IOMMU module is no longer necessary to have VFIO,
> we fix the module check to check for vfio-pci instead. It's not
> ideal and triggers VFIO checks more often (and thus produces more
> error output, which was the reason behind the module check in the
> first place), so we compensate for that by providing more verbose
> logging, indicating whether VFIO initialization has succeeded or
> failed.
> 
> Signed-off-by: Anatoly Burakov 
> Signed-off-by: Santosh Shukla 
> Tested-by: Santosh Shukla 

Applied, thanks

[dpdk-dev] [PATCH] fm10k: handle err flags in vector RX func

2016-01-28 Thread Chen Jing D(Mark)

From: "Chen Jing D(Mark)" 

Use SSE instructions to parse the error flags in the HW Rx descriptor,
then set the corresponding bits in the mbuf.

Signed-off-by: Chen Jing D(Mark) 
---
 doc/guides/rel_notes/release_2_3.rst |2 +
 drivers/net/fm10k/fm10k_rxtx_vec.c   |   42 +-
 2 files changed, 43 insertions(+), 1 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 99de186..19e8aa2 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -3,7 +3,9 @@ DPDK Release 2.3

 New Features
 
+* **Handle error flags in fm10k vector RX func**

+  * Parse err flags in Rx desc and set error bits in mbuf with vector 
instructions.

 Resolved Issues
 ---
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2a57eef..0c48a48 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -61,11 +61,17 @@ fm10k_reset_tx_queue(struct fm10k_tx_queue *txq);
 #define L3TYPE_SHIFT (4)
 /* L4 type shift */
 #define L4TYPE_SHIFT (7)
+/* HBO flag shift */
+#define HBOFLAG_SHIFT (10)
+/* RXE flag shift */
+#define RXEFLAG_SHIFT (13)
+/* IPE/L4E flag shift */
+#define L3L4EFLAG_SHIFT (14)

 static inline void
 fm10k_desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 {
-   __m128i ptype0, ptype1, vtag0, vtag1;
+   __m128i ptype0, ptype1, vtag0, vtag1, eflag0, eflag1, cksumflag;
union {
uint16_t e[4];
uint64_t dword;
@@ -81,12 +87,29 @@ fm10k_desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
0x0000, 0x0000, 0x0000, 0x0000,
0x000F, 0x000F, 0x000F, 0x000F);

+   /* mask for HBO and RXE flags */
+   const __m128i rxe_msk = _mm_set_epi16(
+   0x0000, 0x0000, 0x0000, 0x0000,
+   0x0001, 0x0001, 0x0001, 0x0001);
+
+   const __m128i l3l4cksum_flag = _mm_set_epi8(0, 0, 0, 0,
+   0, 0, 0, 0,
+   0, 0, 0, 0,
+   PKT_RX_IP_CKSUM_BAD | PKT_RX_L4_CKSUM_BAD,
+   PKT_RX_IP_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, 0);
+
+   const __m128i rxe_flag = _mm_set_epi8(0, 0, 0, 0,
+   0, 0, 0, 0,
+   0, 0, 0, 0,
+   0, 0, PKT_RX_RECIP_ERR, 0);
+
/* map rss type to rss hash flag */
const __m128i rss_flags = _mm_set_epi8(0, 0, 0, 0,
0, 0, 0, PKT_RX_RSS_HASH,
PKT_RX_RSS_HASH, 0, PKT_RX_RSS_HASH, 0,
PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, 0);

+   /* Calculate RSS_hash and Vlan fields */
ptype0 = _mm_unpacklo_epi16(descs[0], descs[1]);
ptype1 = _mm_unpacklo_epi16(descs[2], descs[3]);
vtag0 = _mm_unpackhi_epi16(descs[0], descs[1]);
@@ -97,10 +120,27 @@ fm10k_desc_to_olflags_v(__m128i descs[4], struct rte_mbuf 
**rx_pkts)
ptype0 = _mm_shuffle_epi8(rss_flags, ptype0);

vtag1 = _mm_unpacklo_epi32(vtag0, vtag1);
+   eflag0 = vtag1;
+   cksumflag = vtag1;
vtag1 = _mm_srli_epi16(vtag1, VP_SHIFT);
vtag1 = _mm_and_si128(vtag1, pkttype_msk);

vtag1 = _mm_or_si128(ptype0, vtag1);
+
+   /* Process err flags, simply set RECIP_ERR bit if HBO/IXE is set */
+   eflag1 = _mm_srli_epi16(eflag0, RXEFLAG_SHIFT);
+   eflag0 = _mm_srli_epi16(eflag0, HBOFLAG_SHIFT);
+   eflag0 = _mm_or_si128(eflag0, eflag1);
+   eflag0 = _mm_and_si128(eflag0, rxe_msk);
+   eflag0 = _mm_shuffle_epi8(rxe_flag, eflag0);
+
+   vtag1 = _mm_or_si128(eflag0, vtag1);
+
+   /* Process L4/L3 checksum error flags */
+   cksumflag = _mm_srli_epi16(cksumflag, L3L4EFLAG_SHIFT);
+   cksumflag = _mm_shuffle_epi8(l3l4cksum_flag, cksumflag);
+   vtag1 = _mm_or_si128(cksumflag, vtag1);
+
vol.dword = _mm_cvtsi128_si64(vtag1);

rx_pkts[0]->ol_flags = vol.e[0];
-- 
1.7.7.6
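
For readers less used to the shuffle-as-lookup-table idiom, here is a scalar sketch of what the error flag handling appears to compute for one 16-bit status word. It assumes the intent stated in the patch comments (RECIP_ERR when HBO or RXE is set, checksum flags selected by the top two bits). The shift values are copied from the patch; the SK_* flag values and all sk_* names are placeholders made up for this sketch, not the real PKT_RX_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Shift values copied from the patch. */
#define HBOFLAG_SHIFT   10
#define RXEFLAG_SHIFT   13
#define L3L4EFLAG_SHIFT 14

/* Placeholder flag bits for this sketch only; the real PKT_RX_* values
 * live in rte_mbuf.h and may differ. */
#define SK_RX_L4_CKSUM_BAD 0x08
#define SK_RX_IP_CKSUM_BAD 0x10
#define SK_RX_RECIP_ERR    0x20

/* Scalar equivalent of the per-lane work done by the vector code. */
static uint16_t
sk_err_to_olflags(uint16_t status)
{
	/* Tables mirror the low bytes of l3l4cksum_flag and rxe_flag. */
	static const uint8_t cksum_tbl[4] = {
		0, SK_RX_L4_CKSUM_BAD, SK_RX_IP_CKSUM_BAD,
		SK_RX_IP_CKSUM_BAD | SK_RX_L4_CKSUM_BAD
	};
	static const uint8_t rxe_tbl[2] = { 0, SK_RX_RECIP_ERR };

	/* Bit 0 is set if either HBO (bit 10) or RXE (bit 13) is set. */
	uint16_t rxe_idx = ((status >> HBOFLAG_SHIFT) |
			    (status >> RXEFLAG_SHIFT)) & 0x1;
	/* Bits 15:14 select the checksum error combination. */
	uint16_t cksum_idx = status >> L3L4EFLAG_SHIFT;

	return rxe_tbl[rxe_idx] | cksum_tbl[cksum_idx];
}

/* Small self-check showing the mapping for single-bit inputs. */
static void
sk_self_check(void)
{
	assert(sk_err_to_olflags(0) == 0);
	assert(sk_err_to_olflags(1u << HBOFLAG_SHIFT) == SK_RX_RECIP_ERR);
	assert(sk_err_to_olflags(1u << RXEFLAG_SHIFT) == SK_RX_RECIP_ERR);
	assert(sk_err_to_olflags(1u << 14) == SK_RX_L4_CKSUM_BAD);
	assert(sk_err_to_olflags(1u << 15) == SK_RX_IP_CKSUM_BAD);
}
```

The vector version does the same table lookups with _mm_shuffle_epi8, using the shifted status bits as byte indexes into a constant register, so four descriptors are handled at once.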

[dpdk-dev] [PATCH 3/3] app/test: add Snow3G UEA2 tests

2016-01-28 Thread Deepak Kumar JAIN

Added encryption and decryption tests with input test vectors
from Snow3G UEA2 specifications.

Signed-off-by: Deepak Kumar JAIN 
---
 app/test/test_cryptodev.c | 318 -
 app/test/test_cryptodev.h |   2 +-
 app/test/test_cryptodev_snow3g_test_vectors.h | 323 ++
 3 files changed, 641 insertions(+), 2 deletions(-)
 create mode 100644 app/test/test_cryptodev_snow3g_test_vectors.h

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index fd5b7ec..0809b0f 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
  *   modification, are permitted provided that the following conditions
@@ -43,6 +43,7 @@

 #include "test.h"
 #include "test_cryptodev.h"
+#include "test_cryptodev_snow3g_test_vectors.h"

 static enum rte_cryptodev_type gbl_cryptodev_type;

@@ -188,6 +189,23 @@ testsuite_setup(void)
}
}

+   /* Create 2 Snow3G devices if required */
+   if (gbl_cryptodev_type == RTE_CRYPTODEV_SNOW3G_PMD) {
+   nb_devs = rte_cryptodev_count_devtype(RTE_CRYPTODEV_SNOW3G_PMD);
+   if (nb_devs < 2) {
+   for (i = nb_devs; i < 2; i++) {
+   int dev_id =
+   
rte_eal_vdev_init(CRYPTODEV_NAME_SNOW3G_PMD,
+ NULL);
+
+   TEST_ASSERT(dev_id >= 0,
+   "Failed to create instance %u of"
+   " pmd : %s",
+   i, CRYPTODEV_NAME_SNOW3G_PMD);
+   }
+   }
+   }
+
nb_devs = rte_cryptodev_count();
if (nb_devs < 1) {
RTE_LOG(ERR, USER1, "No crypto devices found?");
@@ -1681,7 +1699,283 @@ test_AES_CBC_HMAC_AES_XCBC_decrypt_digest_verify(void)
return TEST_SUCCESS;
 }

+/* * Snow3G Tests * */
+static int
+create_snow3g_cipher_session(uint8_t dev_id,
+   enum rte_crypto_cipher_operation op,
+   const uint8_t *key, const uint8_t key_len)
+{
+   uint8_t cipher_key[key_len];
+
+   struct crypto_unittest_params *ut_params = &unittest_params;
+
+   memcpy(cipher_key, key, key_len);
+
+   /* Setup Cipher Parameters */
+   ut_params->cipher_xform.type = RTE_CRYPTO_XFORM_CIPHER;
+   ut_params->cipher_xform.next = NULL;
+
+   ut_params->cipher_xform.cipher.algo = RTE_CRYPTO_CIPHER_SNOW3G_UEA2;
+   ut_params->cipher_xform.cipher.op = op;
+   ut_params->cipher_xform.cipher.key.data = cipher_key;
+   ut_params->cipher_xform.cipher.key.length = key_len;
+
+#ifdef RTE_APP_TEST_DEBUG
+   rte_hexdump(stdout, "key:", key, key_len);
+#endif
+   /* Create Crypto session */
+   ut_params->sess = rte_cryptodev_session_create(dev_id,
+   &ut_params->
+   cipher_xform);
+
+   TEST_ASSERT_NOT_NULL(ut_params->sess, "Session creation failed");
+
+   return 0;
+}
+
+static int
+create_snow3g_cipher_operation(const uint8_t *iv, const unsigned iv_len,
+   const unsigned data_len)
+{
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   struct crypto_unittest_params *ut_params = &unittest_params;
+
+   unsigned iv_pad_len = 0;
+
+   /* Generate Crypto op data structure */
+   ut_params->ol = rte_pktmbuf_offload_alloc(ts_params->mbuf_ol_pool,
+   RTE_PKTMBUF_OL_CRYPTO);
+   TEST_ASSERT_NOT_NULL(ut_params->ol,
+"Failed to allocate pktmbuf offload");
+
+   ut_params->op = &ut_params->ol->op.crypto;
+
+   /* iv */
+   iv_pad_len = RTE_ALIGN_CEIL(iv_len, 16);
+
+   ut_params->op->iv.data =
+   (uint8_t *) rte_pktmbuf_prepend(ut_params->ibuf, iv_pad_len);
+   TEST_ASSERT_NOT_NULL(ut_params->op->iv.data, "no room to prepend iv");
+
+   memset(ut_params->op->iv.data, 0, iv_pad_len);
+   ut_params->op->iv.phys_addr = rte_pktmbuf_mtophys(ut_params->ibuf);
+   ut_params->op->iv.length = iv_pad_len;
+
+   rte_memcpy(ut_params->op->iv.data, iv, iv_len);
+
+   rte_hexdump(stdout, "iv:", ut_params->op->iv.data, iv_pad_len);
+   ut_params->op->data.to_cipher.length = data_len;
+   ut_params->op->data.to_cipher.offset = iv_pad_len;
+   return 0;
+}
+
+static int test_snow3g_encryption(const struct snow3g_test_data *tdata)
+{
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   struct crypto_unittest_params *ut_params = &unittest_params;
+
+   int retval;
+
+

[dpdk-dev] [PATCH 2/3] qat: add Snow3G UEA2 support

2016-01-28 Thread Deepak Kumar JAIN

Added support for wireless Snow3G cipher only,
for the Intel Quick Assist device.

Signed-off-by: Deepak Kumar JAIN 
---
 doc/guides/cryptodevs/qat.rst|  5 +++--
 doc/guides/rel_notes/release_2_3.rst |  1 +
 drivers/crypto/qat/qat_adf/qat_algs.h|  1 +
 drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 12 
 drivers/crypto/qat/qat_crypto.c  |  8 
 5 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index 1901842..eda5de2 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright(c) 2015 Intel Corporation. All rights reserved.
+Copyright(c) 2015-2016 Intel Corporation. All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
@@ -47,6 +47,7 @@ Cipher algorithms:
 * ``RTE_CRYPTO_SYM_CIPHER_AES128_CBC``
 * ``RTE_CRYPTO_SYM_CIPHER_AES192_CBC``
 * ``RTE_CRYPTO_SYM_CIPHER_AES256_CBC``
+* ``RTE_CRYPTO_SYM_CIPHER_SNOW3G_UEA2``

 Hash algorithms:

@@ -61,7 +62,7 @@ Limitations

 * Chained mbufs are not supported.
 * Hash only is not supported.
-* Cipher only is not supported.
+* Cipher only is not supported, except for Snow3G UEA2.
 * Only in-place is currently supported (destination address is the same as source address).
 * Only supports the session-oriented API implementation (session-less APIs are not supported).
 * Not performance tuned.
diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 99de186..0e1f1ff 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -3,6 +3,7 @@ DPDK Release 2.3

 New Features
 
+* **Added support for the Snow3G UEA2 cipher operation on Intel Quick Assist devices.**


 Resolved Issues
diff --git a/drivers/crypto/qat/qat_adf/qat_algs.h 
b/drivers/crypto/qat/qat_adf/qat_algs.h
index d4aa087..54eeb23 100644
--- a/drivers/crypto/qat/qat_adf/qat_algs.h
+++ b/drivers/crypto/qat/qat_adf/qat_algs.h
@@ -127,5 +127,6 @@ void qat_alg_ablkcipher_init_dec(struct 
qat_alg_ablkcipher_cd *cd,
unsigned int keylen);

 int qat_alg_validate_aes_key(int key_len, enum icp_qat_hw_cipher_algo *alg);
+int qat_alg_validate_snow3g_key(int key_len, enum icp_qat_hw_cipher_algo *alg);

 #endif
diff --git a/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c 
b/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
index 88fd803..200371d 100644
--- a/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
+++ b/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
@@ -755,3 +755,15 @@ int qat_alg_validate_aes_key(int key_len, enum 
icp_qat_hw_cipher_algo *alg)
}
return 0;
 }
+
+int qat_alg_validate_snow3g_key(int key_len, enum icp_qat_hw_cipher_algo *alg)
+{
+   switch (key_len) {
+   case ICP_QAT_HW_SNOW_3G_UEA2_KEY_SZ:
+   *alg = ICP_QAT_HW_CIPHER_ALGO_SNOW_3G_UEA2;
+   break;
+   default:
+   return -EINVAL;
+   }
+   return 0;
+}
diff --git a/drivers/crypto/qat/qat_crypto.c b/drivers/crypto/qat/qat_crypto.c
index e524638..9ae6715 100644
--- a/drivers/crypto/qat/qat_crypto.c
+++ b/drivers/crypto/qat/qat_crypto.c
@@ -168,6 +168,14 @@ qat_crypto_sym_configure_session_cipher(struct 
rte_cryptodev *dev,
}
session->qat_mode = ICP_QAT_HW_CIPHER_CTR_MODE;
break;
+   case RTE_CRYPTO_CIPHER_SNOW3G_UEA2:
+   if (qat_alg_validate_snow3g_key(cipher_xform->key.length,
+   &session->qat_cipher_alg) != 0) {
+   PMD_DRV_LOG(ERR, "Invalid SNOW3G cipher key size");
+   goto error_out;
+   }
+   session->qat_mode = ICP_QAT_HW_CIPHER_ECB_MODE;
+   break;
case RTE_CRYPTO_CIPHER_NULL:
case RTE_CRYPTO_CIPHER_3DES_ECB:
case RTE_CRYPTO_CIPHER_3DES_CBC:
-- 
2.1.0

[dpdk-dev] [PATCH 1/3] crypto: add cipher/auth only support

2016-01-28 Thread Deepak Kumar JAIN

Refactored the existing functionality into
modular form to support the cipher/auth only
functionalities.

Signed-off-by: Deepak Kumar JAIN 
---
 drivers/crypto/qat/qat_adf/qat_algs.h|  20 ++-
 drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 206 ---
 drivers/crypto/qat/qat_crypto.c  | 136 +++
 drivers/crypto/qat/qat_crypto.h  |  12 +-
 4 files changed, 308 insertions(+), 66 deletions(-)

diff --git a/drivers/crypto/qat/qat_adf/qat_algs.h 
b/drivers/crypto/qat/qat_adf/qat_algs.h
index 76c08c0..d4aa087 100644
--- a/drivers/crypto/qat/qat_adf/qat_algs.h
+++ b/drivers/crypto/qat/qat_adf/qat_algs.h
@@ -3,7 +3,7 @@
  *  redistributing this file, you may do so under either license.
  *
  *  GPL LICENSE SUMMARY
- *  Copyright(c) 2015 Intel Corporation.
+ *  Copyright(c) 2015-2016 Intel Corporation.
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of version 2 of the GNU General Public License as
  *  published by the Free Software Foundation.
@@ -17,7 +17,7 @@
  *  qat-linux at intel.com
  *
  *  BSD LICENSE
- *  Copyright(c) 2015 Intel Corporation.
+ *  Copyright(c) 2015-2016 Intel Corporation.
  *  Redistribution and use in source and binary forms, with or without
  *  modification, are permitted provided that the following conditions
  *  are met:
@@ -104,11 +104,17 @@ struct qat_alg_ablkcipher_cd {

 int qat_get_inter_state_size(enum icp_qat_hw_auth_algo qat_hash_alg);

-int qat_alg_aead_session_create_content_desc(struct qat_session *cd,
-   uint8_t *enckey, uint32_t enckeylen,
-   uint8_t *authkey, uint32_t authkeylen,
-   uint32_t add_auth_data_length,
-   uint32_t digestsize);
+int qat_alg_aead_session_create_content_desc_cipher(struct qat_session *cd,
+   uint8_t *enckey,
+   uint32_t enckeylen);
+
+int qat_alg_aead_session_create_content_desc_auth(struct qat_session *cdesc,
+   uint8_t *cipherkey,
+   uint32_t cipherkeylen,
+   uint8_t *authkey,
+   uint32_t authkeylen,
+   uint32_t add_auth_data_length,
+   uint32_t digestsize);

 void qat_alg_init_common_hdr(struct icp_qat_fw_comn_req_hdr *header);

diff --git a/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c 
b/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
index ceaffb7..88fd803 100644
--- a/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
+++ b/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
@@ -3,7 +3,7 @@
  *  redistributing this file, you may do so under either license.
  *
  *  GPL LICENSE SUMMARY
- *  Copyright(c) 2015 Intel Corporation.
+ *  Copyright(c) 2015-2016 Intel Corporation.
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of version 2 of the GNU General Public License as
  *  published by the Free Software Foundation.
@@ -17,7 +17,7 @@
  *  qat-linux at intel.com
  *
  *  BSD LICENSE
- *  Copyright(c) 2015 Intel Corporation.
+ *  Copyright(c) 2015-2016 Intel Corporation.
  *  Redistribution and use in source and binary forms, with or without
  *  modification, are permitted provided that the following conditions
  *  are met:
@@ -359,15 +359,141 @@ void qat_alg_init_common_hdr(struct 
icp_qat_fw_comn_req_hdr *header)
   ICP_QAT_FW_LA_NO_UPDATE_STATE);
 }

-int qat_alg_aead_session_create_content_desc(struct qat_session *cdesc,
-   uint8_t *cipherkey, uint32_t cipherkeylen,
-   uint8_t *authkey, uint32_t authkeylen,
-   uint32_t add_auth_data_length,
-   uint32_t digestsize)
+int qat_alg_aead_session_create_content_desc_cipher(struct qat_session *cdesc,
+   uint8_t *cipherkey,
+   uint32_t cipherkeylen)
 {
-   struct qat_alg_cd *content_desc = &cdesc->cd;
-   struct icp_qat_hw_cipher_algo_blk *cipher = &content_desc->cipher;
-   struct icp_qat_hw_auth_algo_blk *hash = &content_desc->hash;
+   struct icp_qat_hw_cipher_algo_blk *cipher;
+   struct icp_qat_fw_la_bulk_req *req_tmpl = &cdesc->fw_req;
+   struct icp_qat_fw_comn_req_hdr_cd_pars *cd_pars = &req_tmpl->cd_pars;
+   struct icp_qat_fw_comn_req_hdr *header = &req_tmpl->comn_hdr;
+   void *ptr = &req_tmpl->cd_ctrl;
+   struct icp_qat_fw_cipher_cd_ctrl_hdr *cipher_cd_ctrl = ptr;
+   struct icp_qat_fw_auth_cd_ctrl_hdr *hash_cd_ctrl = ptr;
+   enum icp_qat_hw_cipher_convert key_convert;
+   uint16_t proto =

[dpdk-dev] [PATCH 0/3] Snow3G UEA2 support for Intel Quick Assist Devices

2016-01-28 Thread Deepak Kumar JAIN

This patchset contains support for snow3g UEA2 wireless algorithm
for Intel Quick Assist devices. (cipher-only)
QAT PMD previously supported only cipher/hash chaining for AES/SHA.
The code has been refactored to also support cipher-only
functionality for Snow3g algorithms.
Cipher/hash only functionality is only supported
for Snow3g and not for AES/SHA.

Deepak Kumar JAIN (3):
  crypto: add cipher/auth only support
  qat: add Snow3G UEA2 support
  app/test: add Snow3G UEA2 tests

 app/test/test_cryptodev.c| 318 +-
 app/test/test_cryptodev.h|   2 +-
 app/test/test_cryptodev_snow3g_test_vectors.h| 323 +++
 doc/guides/cryptodevs/qat.rst|   5 +-
 doc/guides/rel_notes/release_2_3.rst |   1 +
 drivers/crypto/qat/qat_adf/qat_algs.h|  21 +-
 drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 218 +--
 drivers/crypto/qat/qat_crypto.c  | 144 +++---
 drivers/crypto/qat/qat_crypto.h  |  12 +-
 9 files changed, 974 insertions(+), 70 deletions(-)
 create mode 100644 app/test/test_cryptodev_snow3g_test_vectors.h

-- 
2.1.0

[dpdk-dev] [PATCH] fm10k: optimize legacy TX func

2016-01-28 Thread Chen Jing D(Mark)

From: "Chen Jing D(Mark)" 

When the legacy TX func frees a batch of mbufs, it frees them one
by one. This change scans the free list and merges the requests
that belong to the same pool, then frees them in a single call,
which reduces the cycles spent on freeing mbufs.

Signed-off-by: Chen Jing D(Mark) 
---
 doc/guides/rel_notes/release_2_3.rst |2 +
 drivers/net/fm10k/fm10k_rxtx.c   |   59 -
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 99de186..20ce78d 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -3,7 +3,9 @@ DPDK Release 2.3

 New Features
 
+* **Optimize fm10k Tx func.**

+  * Free multiple mbufs at a time to reduce freeing mbuf cycles.

 Resolved Issues
 ---
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index e958865..f3de691 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -369,6 +369,51 @@ fm10k_recv_scattered_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
return nb_rcv;
 }

+/*
+ * Free multiple TX mbuf at a time if they are in the same pool
+ *
+ * @txep: software desc ring index that starts to free
+ * @num: number of descs to free
+ *
+ */
+static inline void tx_free_bulk_mbuf(struct rte_mbuf **txep, int num)
+{
+   struct rte_mbuf *m, *free[RTE_FM10K_TX_MAX_FREE_BUF_SZ];
+   int i;
+   int nb_free = 0;
+
+   if (unlikely(num == 0))
+   return;
+
+   m = __rte_pktmbuf_prefree_seg(txep[0]);
+   if (likely(m != NULL)) {
+   free[0] = m;
+   nb_free = 1;
+   for (i = 1; i < num; i++) {
+   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   if (likely(m != NULL)) {
+   if (likely(m->pool == free[0]->pool))
+   free[nb_free++] = m;
+   else {
+   rte_mempool_put_bulk(free[0]->pool,
+   (void *)free, nb_free);
+   free[0] = m;
+   nb_free = 1;
+   }
+   }
+   txep[i] = NULL;
+   }
+   rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
+   } else {
+   for (i = 1; i < num; i++) {
+   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   if (m != NULL)
+   rte_mempool_put(m->pool, m);
+   txep[i] = NULL;
+   }
+   }
+}
+
 static inline void tx_free_descriptors(struct fm10k_tx_queue *q)
 {
uint16_t next_rs, count = 0;
@@ -385,11 +430,7 @@ static inline void tx_free_descriptors(struct 
fm10k_tx_queue *q)
 * including nb_desc */
if (q->last_free > next_rs) {
count = q->nb_desc - q->last_free;
-   while (q->last_free < q->nb_desc) {
-   rte_pktmbuf_free_seg(q->sw_ring[q->last_free]);
-   q->sw_ring[q->last_free] = NULL;
-   ++q->last_free;
-   }
+   tx_free_bulk_mbuf(&q->sw_ring[q->last_free], count);
q->last_free = 0;
}

@@ -397,10 +438,10 @@ static inline void tx_free_descriptors(struct 
fm10k_tx_queue *q)
q->nb_free += count + (next_rs + 1 - q->last_free);

/* free buffers from last_free, up to and including next_rs */
-   while (q->last_free <= next_rs) {
-   rte_pktmbuf_free_seg(q->sw_ring[q->last_free]);
-   q->sw_ring[q->last_free] = NULL;
-   ++q->last_free;
+   if (q->last_free <= next_rs) {
+   count = next_rs - q->last_free + 1;
+   tx_free_bulk_mbuf(&q->sw_ring[q->last_free], count);
+   q->last_free += count;
}

if (q->last_free == q->nb_desc)
-- 
1.7.7.6
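
As an illustration of the batching idea described in the commit message, here is a self-contained sketch that uses toy pool and buffer types instead of rte_mempool and rte_mbuf. Every sk_* name and SK_MAX_FREE are hypothetical. Unlike the patch as posted, the sketch clears every slot including the first and flushes when the batch array is full; those are choices made for the sketch, not statements about the driver.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_FREE 64	/* hypothetical batch limit */

struct sk_pool { int bulk_calls; int objs_returned; };
struct sk_buf  { struct sk_pool *pool; };

/* Stand-in for a bulk "put" into a pool; it only counts what it is given. */
static void
sk_pool_put_bulk(struct sk_pool *p, struct sk_buf **objs, int n)
{
	(void)objs;
	p->bulk_calls++;
	p->objs_returned += n;
}

/* Return bufs[0..num) to their pools, batching runs that share a pool. */
static void
sk_free_bulk(struct sk_buf **bufs, int num)
{
	struct sk_buf *batch[SK_MAX_FREE];
	int nb = 0;
	int i;

	assert(num >= 0);

	for (i = 0; i < num; i++) {
		struct sk_buf *b = bufs[i];

		bufs[i] = NULL;
		if (b == NULL)
			continue;
		/* Flush when the pool changes or the batch is full. */
		if (nb > 0 &&
		    (b->pool != batch[0]->pool || nb == SK_MAX_FREE)) {
			sk_pool_put_bulk(batch[0]->pool, batch, nb);
			nb = 0;
		}
		batch[nb++] = b;
	}
	/* Flush whatever is left. */
	if (nb > 0)
		sk_pool_put_bulk(batch[0]->pool, batch, nb);
}
```

When all buffers in a run come from one pool, which is the common case for a TX queue, the whole run goes back with a single bulk call instead of one call per buffer.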

[dpdk-dev] [PATCH v5 3/4] ethdev: redesign link speed config API

2016-01-28 Thread Harish Patil



From: Marc Sune <marcde...@gmail.com>
Date: Sunday, October 4, 2015 at 2:12 PM
To: "dev at dpdk.org" <dev at dpdk.org>
Subject: [dpdk-dev] [PATCH v5 3/4] ethdev: redesign link speed config API

This patch redesigns the API used to configure the link speed(s)
of an Ethernet port. Specifically:

- it allows defining a set of advertised speeds for
  auto-negotiation.
- it allows disabling link auto-negotiation (single fixed speed).
- default: auto-negotiate all supported speeds.

Other changes:

* Added utility macros ETH_SPEED_NUM_XXX with the numeric
  values of all supported link speeds, in Mbps.
* Converted link_speed to uint32_t to accommodate 100G speeds
  (bug).
* Added autoneg flag in struct rte_eth_link to indicate whether
  the link speed was a result of auto-negotiation or was fixed
  by configuration.
* Added utility function to convert numeric speeds to bitmap
  fields.
* Adapted testpmd to the new link API.
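
To make the "numeric speed to bitmap field" conversion concrete, here is a minimal sketch. The bit positions and all sk_*/SK_* names are hypothetical and chosen only for illustration; the real ETH_LINK_SPEED_* flag values are the ones defined by this patch in rte_ethdev.h and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for this sketch only. */
#define SK_SPEED_10M  (1u << 0)
#define SK_SPEED_100M (1u << 1)
#define SK_SPEED_1G   (1u << 2)
#define SK_SPEED_10G  (1u << 3)
#define SK_SPEED_40G  (1u << 4)

/* Map a numeric speed in Mbps to its flag; returns -1 if unknown. */
static int
sk_speed_num_to_flag(uint32_t mbps, uint32_t *flag)
{
	assert(flag != NULL);

	switch (mbps) {
	case 10:    *flag = SK_SPEED_10M;  return 0;
	case 100:   *flag = SK_SPEED_100M; return 0;
	case 1000:  *flag = SK_SPEED_1G;   return 0;
	case 10000: *flag = SK_SPEED_10G;  return 0;
	case 40000: *flag = SK_SPEED_40G;  return 0;
	default:    return -1;
	}
}

/* OR together the flags for a list of numeric speeds (advertised set). */
static int
sk_speeds_to_bitmap(const uint32_t *mbps, size_t n, uint32_t *bitmap)
{
	uint32_t acc = 0;
	size_t i;

	assert(bitmap != NULL);

	for (i = 0; i < n; i++) {
		uint32_t f;

		if (sk_speed_num_to_flag(mbps[i], &f) != 0)
			return -1;
		acc |= f;
	}
	*bitmap = acc;
	return 0;
}
```

A bitmap with a single bit set corresponds to a fixed speed, while several bits describe the set of speeds to advertise during auto-negotiation.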

Signed-off-by: Marc Sune <marcdevel at gmail.com>
---
 app/test-pmd/cmdline.c | 124 +++--
 app/test/virtual_pmd.c |   4 +-
 drivers/net/af_packet/rte_eth_af_packet.c  |   5 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c  |  14 ++--
 drivers/net/cxgbe/base/t4_hw.c |   8 +-
 drivers/net/e1000/base/e1000_80003es2lan.c |   6 +-
 drivers/net/e1000/base/e1000_82541.c   |   8 +-
 drivers/net/e1000/base/e1000_82543.c   |   4 +-
 drivers/net/e1000/base/e1000_82575.c   |  11 +--
 drivers/net/e1000/base/e1000_api.c |   2 +-
 drivers/net/e1000/base/e1000_api.h |   2 +-
 drivers/net/e1000/base/e1000_defines.h |   4 +-
 drivers/net/e1000/base/e1000_hw.h  |   2 +-
 drivers/net/e1000/base/e1000_ich8lan.c |   4 +-
 drivers/net/e1000/base/e1000_mac.c |   9 ++-
 drivers/net/e1000/base/e1000_mac.h |   6 +-
 drivers/net/e1000/base/e1000_vf.c  |   4 +-
 drivers/net/e1000/base/e1000_vf.h  |   2 +-
 drivers/net/e1000/em_ethdev.c  | 108 -
 drivers/net/e1000/igb_ethdev.c | 103 
 drivers/net/fm10k/fm10k_ethdev.c   |   8 +-
 drivers/net/i40e/i40e_ethdev.c |  70 
 drivers/net/i40e/i40e_ethdev_vf.c  |  11 +--
 drivers/net/ixgbe/ixgbe_ethdev.c   |  72 -
 drivers/net/mlx4/mlx4.c|   2 +
 drivers/net/mpipe/mpipe_tilegx.c   |   6 +-
 drivers/net/null/rte_eth_null.c|   5 +-
 drivers/net/pcap/rte_eth_pcap.c|   9 ++-
 drivers/net/ring/rte_eth_ring.c|   5 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   |   5 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c  |   5 +-
 examples/ip_pipeline/config_parse.c|   3 +-
 lib/librte_ether/rte_ethdev.c  |  49 
 lib/librte_ether/rte_ethdev.h  | 113 --
 34 files changed, 437 insertions(+), 356 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0f8f48f..c62f5be 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -897,14 +897,65 @@ struct cmd_config_speed_all {
 cmdline_fixed_string_t value2;
 };

+static int
+parse_and_check_speed_duplex(char *value1, char *value2, uint32_t *link_speed)
+{
+
+int duplex;
+
+if (!strcmp(value2, "half")) {
+duplex = 0;
+} else if (!strcmp(value2, "full")) {
+duplex = 1;
+} else if (!strcmp(value2, "auto")) {
+duplex = 1;
+} else {
+printf("Unknown parameter\n");
+return -1;
+}
+
+if (!strcmp(value1, "10")) {
+*link_speed = (duplex) ? ETH_LINK_SPEED_10M :
+ETH_LINK_SPEED_10M_HD;
+} else if (!strcmp(value1, "100")) {
+*link_speed = (duplex) ? ETH_LINK_SPEED_100M :
+ETH_LINK_SPEED_100M_HD;
+} else if (!strcmp(value1, "1000")) {
+if (!duplex)
+goto invalid_speed_param;
+*link_speed = ETH_LINK_SPEED_1G;
+} else if (!strcmp(value1, "10000")) {
+if (!duplex)
+goto invalid_speed_param;
+*link_speed = ETH_LINK_SPEED_10G;
+} else if (!strcmp(value1, "40000")) {
+if (!duplex)
+goto invalid_speed_param;
+*link_speed = ETH_LINK_SPEED_40G;
+} else if (!strcmp(value1, "auto")) {
+if (!duplex)
+goto invalid_speed_param;
+*link_speed = ETH_LINK_SPEED_AUTONEG;
+} else {
+printf("Unknown parameter\n");
+return -1;
+}
+
+return 0;
+
+invalid_speed_param:
+
+printf("Invalid speed parameter\n");
+return -1;
+}
+
 static void
 cmd_config_speed_all_parsed(void *parsed_result,
 __attribute__((unused)) struct cmdline *cl,
 __attribute__((unused)) void *data)
 {
 struct cmd_config_speed_all *res = parsed_result;
-uint16_t link_speed =

[dpdk-dev] [PATCH] doc: add doc for i40e pmd driver introduction

2016-01-28 Thread Jingjing Wu

A new doc "i40e.rst" is added to introduce i40e pmd driver.

Signed-off-by: Jingjing Wu 
---
 doc/guides/nics/i40e.rst  | 351 ++
 doc/guides/nics/index.rst |   1 +
 2 files changed, 352 insertions(+)
 create mode 100644 doc/guides/nics/i40e.rst

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
new file mode 100644
index 0000000..de44486
--- /dev/null
+++ b/doc/guides/nics/i40e.rst
@@ -0,0 +1,351 @@
+..  BSD LICENSE
+Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+I40E Poll Mode Driver
+=====================
+
+The I40E PMD (**librte_pmd_i40e**) provides poll mode driver support
+for **Intel X710/XL710/X722** 10/40 Gbps family of adapters.
+
+
+More information can be found at `Intel Official Website 
`_.
+
+Features
+--------
+
+- Multiple queues for TX and RX
+- Receive Side Scaling (RSS)
+- MAC/VLAN filtering
+- Packet type information
+- Flow director
+- Cloud filter
+- Checksum offload
+- VLAN/QinQ stripping and inserting
+- TSO offload
+- Promiscuous mode
+- Multicast mode
+- Port hardware statistics
+- Jumbo frames
+- Link state information
+- Link flow control
+- Mirror on port, VLAN and VSI
+- Interrupt mode RX
+- Scatter and gather for TX and RX
+- Vector Poll mode driver
+- DCB
+- VMDQ
+- SR-IOV VF
+- Hot plug
+- IEEE1588/802.1AS timestamping
+
+Prerequisites
+-------------
+
+- Identifying Your Adapter
+  `Intel Support `_ to identify your
+  adapter, and get latest NVM/FW images.
+
+- Follow the **DPDK Getting Started Guide** to set up the basic DPDK environment.
+
+- To get better performance on Intel platforms, please follow **How to get best performance with NICs on Intel platforms** to set up the environment.
+
+Pre-Installation Configuration
+------------------------------
+
+Config File Options
+~~~~~~~~~~~~~~~~~~~
+
+The following options can be modified in the ``config`` file. Please note that
+enabling debugging options may affect system performance.
+
+- ``CONFIG_RTE_LIBRTE_I40E_PMD`` (default **y**)
+
+  Toggle compilation of librte_pmd_i40e driver.
+
+- ``CONFIG_RTE_LIBRTE_I40E_DEBUG_*`` (default **n**)
+
+  Toggle display of generic debugging messages.
+
+- ``CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC`` (default **y**)
+
+  Toggle bulk allocation for RX.
+
+- ``CONFIG_RTE_LIBRTE_I40E_INC_VECTOR`` (default **n**)
+
+  Toggle use of the Vector PMD instead of the normal RX/TX path. To enable vPMD
+  for RX, bulk allocation for RX must be allowed.
+
+- ``CONFIG_RTE_LIBRTE_I40E_RX_OLFLAGS_ENABLE`` (default **y**)
+
+  Toggle RX olflags. This is only meaningful when the Vector PMD is used.
+
+- ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` (default **n**)
+
+  Toggle use of the 16-byte RX descriptor. By default the RX descriptor is 32 bytes.
+
+- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF`` (default **64**)
+
+  Number of queues reserved for PF.
+
+- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF`` (default **4**)
+
+  Number of queues reserved for each SR-IOV VF.
+
+- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM`` (default **4**)
+
+  Number of queues reserved for each VMDQ Pool.
+
+- ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` (default **-1**)
+
+  Interrupt Throttling interval.
+
+Driver Compilation
+~~~~~~~~~~~~~~~~~~
+
+I40E PMD for Linux, please see :ref:`Linux guide `.
+I40E PMD for FreeBSD, please see :ref:`FreeBSD guide `.
+

[dpdk-dev] [PATCH 0/4] DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-01-28 Thread Thomas Monjalon

Woh a new driver!
Welcome :)

2016-01-28 16:20, Jan Medala:
> This is a PMD for the Amazon ethernet ENA family.

Where can we find some documentation about this family?
Please add some explanations about its design and usage with DPDK;
they would fit well in the directory doc/guides/nics/.

> The driver operates variety of ENA adapters through feature negotiation with 
> the adapter and upgradable commands set.
> ENA driver handles PCI Physical and Virtual ENA functions.

Definitely interested to know more.

>  lib/librte_eal/linuxapp/ena_uio/ena_uio_driver.c   |  276 +++

Sorry the kernel module party is over.
One day, igb_uio will be removed.
I suggest to make a first version without interrupt support
and work with Linux community to fix your issues.

[dpdk-dev] [PATCH 3/3] app/testpmd: set default MAC addresses for each VF

2016-01-28 Thread Helin Zhang

It generates MAC addresses during host port initialization, which
will be set as default MAC addresses for corresponding VFs.

Signed-off-by: Helin Zhang 
---
 app/test-pmd/testpmd.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 1319917..f7aac81 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1762,6 +1762,23 @@ rxtx_port_config(struct rte_port *port)
port->tx_conf.txq_flags = txq_flags;
 }

+static void
+generate_default_vf_mac_addr(portid_t pid, struct rte_vf_conf *vf_conf,
+uint8_t vf_num)
+{
+   uint8_t i;
+   uint8_t addr[ETHER_ADDR_LEN] = {0x68, 0x05, 0xca, 0x25, 0x00, 0x00};
+
+   if (vf_num >= ETH_VF_NUM_MAX)
+   return;
+
+   addr[4] = (uint8_t)pid;
+   for (i = 0; i < vf_num; i++) {
+   addr[5] = i;
+   memcpy(vf_conf[i].mac_addr.addr_bytes, addr, sizeof(addr));
+   }
+}
+
 void
 init_port_config(void)
 {
@@ -1772,6 +1789,8 @@ init_port_config(void)
port = &ports[pid];
port->dev_conf.rxmode = rx_mode;
port->dev_conf.fdir_conf = fdir_conf;
+   generate_default_vf_mac_addr(pid, port->dev_conf.vf_conf, 32);
+
if (nb_rxq > 1) {
port->dev_conf.rx_adv_conf.rss_conf.rss_key = NULL;
port->dev_conf.rx_adv_conf.rss_conf.rss_hf = rss_hf;
-- 
2.5.0

[dpdk-dev] [PATCH 2/3] i40evf: use ether interface for validating MAC address

2016-01-28 Thread Helin Zhang

It uses ether interface of 'is_valid_assigned_ether_addr' for
validating MAC address. In the meanwhile, more annotations are
added for obtaining/generating VF MAC address.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/i40e_ethdev_vf.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index 14d2a50..2a54596 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1180,6 +1180,7 @@ i40evf_init_vf(struct rte_eth_dev *dev)
int i, err, bufsz;
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+   struct ether_addr *p_mac_addr;

vf->adapter = I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
vf->dev_data = dev->data;
@@ -1249,13 +1250,12 @@ i40evf_init_vf(struct rte_eth_dev *dev)
vf->vsi.nb_qps = vf->vsi_res->num_queue_pairs;
vf->vsi.adapter = I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);

-   /* check mac addr, if it's not valid, genrate one */
-   if (I40E_SUCCESS != i40e_validate_mac_addr(\
-   vf->vsi_res->default_mac_addr))
-   eth_random_addr(vf->vsi_res->default_mac_addr);
-
-   ether_addr_copy((struct ether_addr *)vf->vsi_res->default_mac_addr,
-   (struct ether_addr *)hw->mac.addr);
+   /* Store the MAC address configured by host, or generate random one */
+   p_mac_addr = (struct ether_addr *)(vf->vsi_res->default_mac_addr);
+   if (is_valid_assigned_ether_addr(p_mac_addr)) /* Configured by host */
+   ether_addr_copy(p_mac_addr, (struct ether_addr *)hw->mac.addr);
+   else
+   eth_random_addr(hw->mac.addr); /* Generate a random one */

return 0;

-- 
2.5.0
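
For readers following the validation logic in this patch, here is a
standalone sketch of the "keep the host's address if it is usable,
otherwise generate one" pattern. It does not use the DPDK headers:
struct mac_addr, mac_is_valid_assigned(), mac_random() and mac_select()
are hypothetical names made up for illustration, and the sketch assumes
that a valid assigned address means "unicast and not all zeroes".

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for an Ethernet address structure. */
struct mac_addr {
	uint8_t bytes[MAC_LEN];
};

/* Assumption: an assigned address is usable if unicast and non-zero. */
static int mac_is_valid_assigned(const struct mac_addr *a)
{
	static const uint8_t zero[MAC_LEN];

	if (a->bytes[0] & 0x01) /* group (multicast/broadcast) bit is set */
		return 0;
	return memcmp(a->bytes, zero, MAC_LEN) != 0;
}

/* Fill in a random unicast, locally administered address. */
static void mac_random(struct mac_addr *a)
{
	for (int i = 0; i < MAC_LEN; i++)
		a->bytes[i] = (uint8_t)rand();
	a->bytes[0] &= (uint8_t)~0x01; /* clear the group bit */
	a->bytes[0] |= 0x02;           /* set the locally administered bit */
}

/* Keep the host's address when it is valid, otherwise generate one. */
static void mac_select(const struct mac_addr *from_host, struct mac_addr *out)
{
	if (mac_is_valid_assigned(from_host))
		*out = *from_host;
	else
		mac_random(out);
}
```

Setting the locally administered bit guarantees the generated address
is never all zeroes, so it always passes the same validity check.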

[dpdk-dev] [PATCH 0/3] support setting i40e VF MAC address from DPDK host side

2016-01-28 Thread Helin Zhang

This series adds support for pre-setting i40e VF MAC addresses from
the DPDK PF host side during host port initialization, by introducing
a new port configuration element. VF MAC addresses can then be pre-set
before any VF is launched, so a VF no longer gets a random MAC address
each time it is launched.
This should not break the ABI, as the ABI changes in
'struct rte_eth_conf' were already announced in R2.2.

Helin Zhang (3):
  i40e: add setting VF MAC address in DPDK PF host
  i40evf: use ether interface for validating MAC address
  app/testpmd: set default MAC addresses for each VF

 app/test-pmd/testpmd.c   | 19 +++
 doc/guides/rel_notes/release_2_3.rst |  9 +
 drivers/net/i40e/i40e_ethdev.c   | 21 +
 drivers/net/i40e/i40e_ethdev.h   |  1 +
 drivers/net/i40e/i40e_ethdev_vf.c| 14 +++---
 drivers/net/i40e/i40e_pf.c   |  2 ++
 lib/librte_ether/rte_ethdev.h| 10 ++
 7 files changed, 69 insertions(+), 7 deletions(-)

-- 
2.5.0

[dpdk-dev] [PATCH 4/4] DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-01-28 Thread Jan Medala

This is a PMD for the Amazon ethernet ENA family.
The driver operates a variety of ENA adapters through feature
negotiation with the adapter and an upgradable command set.
The ENA driver handles PCI Physical and Virtual ENA functions.

Signed-off-by: Evgeny Schemeilin 
Signed-off-by: Jan Medala 
Signed-off-by: Jakub Palider 
---
 config/common_linuxapp |7 +
 drivers/net/Makefile   |1 +
 drivers/net/ena/Makefile   |   62 +++
 drivers/net/ena/ena_ethdev.c   | 1051 
 drivers/net/ena/ena_ethdev.h   |  143 ++
 drivers/net/ena/ena_logs.h |   76 +++
 drivers/net/ena/ena_platform.h |   58 +++
 mk/rte.app.mk  |1 +
 8 files changed, 1399 insertions(+)
 create mode 100755 drivers/net/ena/Makefile
 create mode 100644 drivers/net/ena/ena_ethdev.c
 create mode 100755 drivers/net/ena/ena_ethdev.h
 create mode 100644 drivers/net/ena/ena_logs.h
 create mode 100644 drivers/net/ena/ena_platform.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 1777c4e..261a54f 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -251,6 +251,13 @@ CONFIG_RTE_LIBRTE_CXGBE_DEBUG_RX=n
 #
 # Compile burst-oriented Amazon ENA PMD driver
 #
+CONFIG_RTE_LIBRTE_ENA_PMD=y
+CONFIG_RTE_LIBRTE_ENA_DEBUG_INIT=y
+CONFIG_RTE_LIBRTE_ENA_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_ENA_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_ENA_DEBUG_TX_FREE=n
+CONFIG_RTE_LIBRTE_ENA_DEBUG_DRIVER=n
+CONFIG_RTE_LIBRTE_ENA_COM_DEBUG=n
 CONFIG_RTE_EAL_ENA_UIO=y

 #
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 6e4497e..8f2649f 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -36,6 +36,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
 DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
 DIRS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += e1000
+DIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += ena
 DIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic
 DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k
 DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e
diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
new file mode 100755
index 000..960e4cd
--- /dev/null
+++ b/drivers/net/ena/Makefile
@@ -0,0 +1,62 @@
+#
+# BSD LICENSE
+#
+# Copyright (c) 2015-2016 Amazon.com, Inc. or its affiliates.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of copyright holder nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ena.a
+CFLAGS += $(WERROR_FLAGS) -O2
+INCLUDES :=-I$(SRCDIR) -I$(SRCDIR)/base/ena_defs -I$(SRCDIR)/base
+
+VPATH += $(SRCDIR)/base
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += ena_ethdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += ena_com.c
+SRCS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += ena_eth_com.c
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_eal lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_mempool lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_net lib/librte_malloc
+
+ifeq ($(CONFIG_RTE_EXEC_ENV),"cvos")
+CFLAGS += -Wno-old-style-definition
+endif
+
+CFLAGS += $(INCLUDES)
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
new file mode 100644
index 000..c71ba9d
--- /dev/null
+++ b/drivers/net/ena/ena_ethdev.c
@@ -0,0 +1,1051 @@
+/*-
+* BSD LICENSE
+*
+* Copyright (c) 2015-2016 Amazon.com, Inc. or its affiliates.
+* All rights reserved.
+*
+* Redistribution and use in source and binary forms, with or without
+* modification, are

[dpdk-dev] [PATCH 3/4] Amazon ENA communication layer for DPDK platform

2016-01-28 Thread Jan Medala

Implementation of platform specific code for ENA communication layer.

Signed-off-by: Evgeny Schemeilin 
Signed-off-by: Jan Medala 
Signed-off-by: Jakub Palider 
---
 drivers/net/ena/base/ena_plat_dpdk.h | 209 +++
 1 file changed, 209 insertions(+)
 create mode 100644 drivers/net/ena/base/ena_plat_dpdk.h

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
new file mode 100644
index 000..3059343
--- /dev/null
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -0,0 +1,209 @@
+/*-
+* BSD LICENSE
+*
+* Copyright (c) 2015-2016 Amazon.com, Inc. or its affiliates.
+* All rights reserved.
+*
+* Redistribution and use in source and binary forms, with or without
+* modification, are permitted provided that the following conditions
+* are met:
+*
+* * Redistributions of source code must retain the above copyright
+* notice, this list of conditions and the following disclaimer.
+* * Redistributions in binary form must reproduce the above copyright
+* notice, this list of conditions and the following disclaimer in
+* the documentation and/or other materials provided with the
+* distribution.
+* * Neither the name of copyright holder nor the names of its
+* contributors may be used to endorse or promote products derived
+* from this software without specific prior written permission.
+*
+* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef DPDK_ENA_COM_ENA_PLAT_DPDK_H_
+#define DPDK_ENA_COM_ENA_PLAT_DPDK_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+typedef _Bool bool;
+typedef rte_atomic32_t ena_atomic32_t;
+
+typedef uint64_t u64;
+typedef uint32_t u32;
+typedef uint16_t u16;
+typedef uint8_t u8;
+
+typedef uint64_t dma_addr_t;
+typedef void *ena_mem_handle_t;
+
+#define ENA_COM_OK 0
+#define ENA_COM_NO_MEM -ENOMEM
+#define ENA_COM_INVAL -EINVAL
+#define ENA_COM_NO_SPACE -ENOSPC
+#define ENA_COM_NO_DEVICE -ENODEV
+#define ENA_COM_PERMISSION -EPERM
+#define ENA_COM_TIMER_EXPIRED -ETIME
+
+#define cacheline_aligned __rte_cache_aligned
+#define true   ((bool)1)
+#define false  ((bool)0)
+
+#define ENA_ABORT() abort()
+
+#define ENA_MSLEEP(x) rte_delay_ms(x)
+#define ENA_UDELAY(x) rte_delay_us(x)
+
+#define memcpy_toio memcpy
+#define wmb __sync_synchronize
+#define mb __sync_synchronize
+
+#define US_PER_S 1000000
+#define ENA_GET_SYSTEM_USECS() \
+   (rte_get_timer_cycles() * US_PER_S / rte_get_timer_hz())
+
+#define ENA_ASSERT(cond, format, arg...)   \
+   do {\
+   if (unlikely(!(cond))) {\
+   printf("Assertion failed on %s:%s:%d:" format,  \
+   __FILE__, __func__, __LINE__, ##arg);   \
+   rte_exit(EXIT_FAILURE, "ASSERTION FAILED\n");   \
+   }   \
+   } while (0)
+
+
+#define max_t(type, x, y) ({   \
+   type __max1 = (x);  \
+   type __max2 = (y);  \
+   __max1 > __max2 ? __max1 : __max2; })
+
+#define ENA_MAX32(x,y) max_t(u32, (x), (y))
+#define ENA_MAX16(x,y) max_t(u16, (x), (y))
+#define ENA_MAX8(x, y) max_t(u8, (x), (y))
+
+#define U64_C(x) x ## ULL
+#define BIT(nr) (1UL << (nr))
+#define BITS_PER_LONG  (__SIZEOF_LONG__ * 8)
+#define GENMASK(h, l)  (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
+#define GENMASK_ULL(h, l) (((U64_C(1) << ((h) - (l) + 1)) - 1) << (l))
+
+#ifdef RTE_LIBRTE_ENA_COM_DEBUG
+#define ena_trc_dbg(format, arg...)\
+   RTE_LOG(DEBUG, PMD, "[ENA_COM: %s] " format, __func__, ##arg)
+#define ena_trc_info(format, arg...)   \
+   RTE_LOG(INFO, PMD, "[ENA_COM: %s] " format, __func__, ##arg)
+#define ena_trc_warn(format, arg...)   \
+   RTE_LOG(ERR, PMD, "[ENA_COM: %s] " format, __func__, ##arg)
+#define ena_trc_err(format, arg...)\
+   RTE_LOG(ERR, PMD, "[ENA_COM: %s] " format, __func__, ##arg)
+#else
+#define ena_trc_dbg(format, arg...)
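
As a quick illustration of the GENMASK helpers defined in this header,
the following standalone sketch reproduces them outside DPDK and adds a
small field extractor. field_get() is a hypothetical helper, not part
of the patch, and __SIZEOF_LONG__ is assumed to be predefined by the
compiler (GCC and Clang provide it).

```c
#include <stdint.h>

#define BITS_PER_LONG (__SIZEOF_LONG__ * 8)

/* Mask with bits h..l (inclusive) set, in an unsigned long. */
#define GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/* Same idea for 64-bit values, regardless of the width of long. */
#define GENMASK_ULL(h, l) (((1ULL << ((h) - (l) + 1)) - 1) << (l))

/* Hypothetical helper: extract bits [h:l] of a 32-bit register value. */
static inline uint32_t field_get(uint32_t reg, unsigned int h, unsigned int l)
{
	return (uint32_t)((reg & GENMASK(h, l)) >> l);
}
```

For example, GENMASK(7, 4) evaluates to 0xF0 and field_get(0xABCD1234,
15, 8) yields 0x12. Note that this form of GENMASK_ULL shifts 1ULL by
(h - l + 1), so asking for the full range (63, 0) would shift by 64
bits, which is undefined in C; callers of such a macro would need to
avoid that case.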

[dpdk-dev] [PATCH 2/4] Amazon ENA communication layer

2016-01-28 Thread Jan Medala

Low level common abstraction for ENA device communication.

Signed-off-by: Netanel Belgazal 
Signed-off-by: Jan Medala 
Signed-off-by: Jakub Palider 
---
 drivers/net/ena/base/ena_com.c | 2401 
 drivers/net/ena/base/ena_com.h |  765 +++
 drivers/net/ena/base/ena_defs/ena_admin_defs.h | 1660 ++
 .../net/ena/base/ena_defs/ena_admin_defs_custom.h  |   40 +
 drivers/net/ena/base/ena_defs/ena_common_defs.h|   54 +
 drivers/net/ena/base/ena_defs/ena_efa_admin_defs.h |  685 ++
 drivers/net/ena/base/ena_defs/ena_efa_io_defs.h|  543 +
 drivers/net/ena/base/ena_defs/ena_eth_io_defs.h| 1095 +
 drivers/net/ena/base/ena_defs/ena_gen_info.h   |   35 +
 drivers/net/ena/base/ena_defs/ena_includes.h   |   39 +
 drivers/net/ena/base/ena_defs/ena_regs_defs.h  |  326 +++
 drivers/net/ena/base/ena_eth_com.c |  496 
 drivers/net/ena/base/ena_eth_com.h |  130 ++
 drivers/net/ena/base/ena_plat.h|   51 +
 14 files changed, 8320 insertions(+)
 create mode 100644 drivers/net/ena/base/ena_com.c
 create mode 100644 drivers/net/ena/base/ena_com.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_admin_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_admin_defs_custom.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_common_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_efa_admin_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_efa_io_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_eth_io_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_gen_info.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_includes.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_regs_defs.h
 create mode 100644 drivers/net/ena/base/ena_eth_com.c
 create mode 100644 drivers/net/ena/base/ena_eth_com.h
 create mode 100644 drivers/net/ena/base/ena_plat.h

diff --git a/drivers/net/ena/base/ena_com.c b/drivers/net/ena/base/ena_com.c
new file mode 100644
index 000..f7f539d
--- /dev/null
+++ b/drivers/net/ena/base/ena_com.c
@@ -0,0 +1,2401 @@
+/*-
+* BSD LICENSE
+*
+* Copyright (c) 2015-2016 Amazon.com, Inc. or its affiliates.
+* All rights reserved.
+*
+* Redistribution and use in source and binary forms, with or without
+* modification, are permitted provided that the following conditions
+* are met:
+*
+* * Redistributions of source code must retain the above copyright
+* notice, this list of conditions and the following disclaimer.
+* * Redistributions in binary form must reproduce the above copyright
+* notice, this list of conditions and the following disclaimer in
+* the documentation and/or other materials provided with the
+* distribution.
+* * Neither the name of copyright holder nor the names of its
+* contributors may be used to endorse or promote products derived
+* from this software without specific prior written permission.
+*
+* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#include "ena_com.h"
+
+/*/
+/*/
+
+/* Timeout in micro-sec */
+#define ADMIN_CMD_TIMEOUT_US (100)
+
+#define ENA_ASYNC_QUEUE_DEPTH 4
+#define ENA_ADMIN_QUEUE_DEPTH 32
+
+#define ENA_HISTOGRAM_ACTIVE_MASK_OFFSET 0xF08
+#define ENA_EXTENDED_STAT_GET_FUNCT(_funct_queue) (_funct_queue & 0xFFFF)
+#define ENA_EXTENDED_STAT_GET_QUEUE(_funct_queue) (_funct_queue >> 16)
+
+#define MIN_ENA_VER (((ENA_COMMON_SPEC_VERSION_MAJOR) << \
+   ENA_REGS_VERSION_MAJOR_VERSION_SHIFT) \
+   | (ENA_COMMON_SPEC_VERSION_MINOR))
+
+#define ENA_CTRL_MAJOR 0
+#define ENA_CTRL_MINOR 0
+#define ENA_CTRL_SUB_MINOR 1
+
+#define MIN_ENA_CTRL_VER \
+   (((ENA_CTRL_MAJOR) << \
+   (ENA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_SHIFT)) | \
+   ((ENA_CTRL_MINOR) << \
+   (ENA_REGS_CONTROLLER_VERSION_MINOR_VERSION_SHIFT)) | \
+   (ENA_CTRL_SUB_MINOR))
+
+#define ENA_DMA_ADDR_TO_UINT32_LOW(x)  ((u32)((u64)(x)))
+#define ENA_DMA_ADDR_TO_UINT32_HIGH(x) ((u32)(((u64)(x)) >> 32))
+
+static int ena_alloc_cnt = 0;
+
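
The ENA_DMA_ADDR_TO_UINT32_LOW/HIGH macros above split a 64-bit DMA
address into two 32-bit words. A minimal standalone sketch of that
split and the matching join is shown below; struct addr_pair,
dma_addr_split() and dma_addr_join() are hypothetical names used only
for illustration and do not appear in the patch.

```c
#include <stdint.h>

typedef uint64_t dma_addr_t;

#define DMA_ADDR_LOW32(x)  ((uint32_t)((uint64_t)(x)))
#define DMA_ADDR_HIGH32(x) ((uint32_t)(((uint64_t)(x)) >> 32))

/* Hypothetical descriptor fragment holding the address as two words. */
struct addr_pair {
	uint32_t low;
	uint32_t high;
};

/* Split a 64-bit address into its low and high 32-bit halves. */
static inline struct addr_pair dma_addr_split(dma_addr_t addr)
{
	struct addr_pair p = {
		.low = DMA_ADDR_LOW32(addr),
		.high = DMA_ADDR_HIGH32(addr),
	};
	return p;
}

/* Rebuild the 64-bit address; widen before shifting to avoid overflow. */
static inline dma_addr_t dma_addr_join(struct addr_pair p)
{
	return ((dma_addr_t)p.high << 32) | p.low;
}
```

The cast to a 64-bit type before the shift matters in both directions:
shifting a 32-bit value by 32 would be undefined.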

[dpdk-dev] [PATCH 1/4] Amazon ENA UIO driver

2016-01-28 Thread Jan Medala

The Amazon ENA device doesn't implement the legacy interrupt that is
required by the default UIO driver. This driver introduces all the
memory mappings needed to use an ENA device.

Signed-off-by: Evgeny Schemeilin 
Signed-off-by: Jan Medala 
Signed-off-by: Jakub Palider 
---
 config/common_linuxapp   |   5 +
 lib/librte_eal/common/include/rte_pci.h  |   1 +
 lib/librte_eal/common/include/rte_pci_dev_ids.h  |  16 ++
 lib/librte_eal/linuxapp/Makefile |   3 +
 lib/librte_eal/linuxapp/eal/eal_pci.c|   4 +
 lib/librte_eal/linuxapp/ena_uio/Makefile |  55 +
 lib/librte_eal/linuxapp/ena_uio/ena_uio_driver.c | 276 +++
 7 files changed, 360 insertions(+)
 create mode 100644 lib/librte_eal/linuxapp/ena_uio/Makefile
 create mode 100644 lib/librte_eal/linuxapp/ena_uio/ena_uio_driver.c

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..1777c4e 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_CXGBE_DEBUG_TX=n
 CONFIG_RTE_LIBRTE_CXGBE_DEBUG_RX=n

 #
+# Compile burst-oriented Amazon ENA PMD driver
+#
+CONFIG_RTE_EAL_ENA_UIO=y
+
+#
 # Compile burst-oriented Cisco ENIC PMD driver
 #
 CONFIG_RTE_LIBRTE_ENIC_PMD=y
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 334c12e..201a5a7 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -149,6 +149,7 @@ enum rte_kernel_driver {
RTE_KDRV_VFIO,
RTE_KDRV_UIO_GENERIC,
RTE_KDRV_NIC_UIO,
+   RTE_KDRV_ENA_UIO,
RTE_KDRV_NONE,
 };

diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h b/lib/librte_eal/common/include/rte_pci_dev_ids.h
index d088191..0600e03 100644
--- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
+++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
@@ -152,6 +152,15 @@
 #define RTE_PCI_DEV_ID_DECL_BNX2XVF(vend, dev)
 #endif

+#ifndef RTE_PCI_DEV_ID_DECL_ENA
+#define RTE_PCI_DEV_ID_DECL_ENA(vend, dev)
+#endif
+
+#ifndef PCI_VENDOR_ID_AMAZON
+/** Vendor ID used by Amazon devices */
+#define PCI_VENDOR_ID_AMAZON 0x1D0F
+#endif
+
 #ifndef PCI_VENDOR_ID_INTEL
 /** Vendor ID used by Intel devices */
 #define PCI_VENDOR_ID_INTEL 0x8086
@@ -598,6 +607,12 @@ RTE_PCI_DEV_ID_DECL_VMXNET3(PCI_VENDOR_ID_VMWARE, VMWARE_DEV_ID_VMXNET3)

 RTE_PCI_DEV_ID_DECL_FM10KVF(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_VF)

+/** Amazon devices **/
+
+#define PCI_DEVICE_ID_ENA_VF   0xEC20
+
+RTE_PCI_DEV_ID_DECL_ENA(PCI_VENDOR_ID_AMAZON, PCI_DEVICE_ID_ENA_VF)
+
 /** Cisco VIC devices **/

 #define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043  /* ethernet vnic */
@@ -656,6 +671,7 @@ RTE_PCI_DEV_ID_DECL_BNX2X(PCI_VENDOR_ID_BROADCOM, BNX2X_DEV_ID_57840_MF)
  */
 #undef RTE_PCI_DEV_ID_DECL_BNX2X
 #undef RTE_PCI_DEV_ID_DECL_BNX2XVF
+#undef RTE_PCI_DEV_ID_DECL_ENA
 #undef RTE_PCI_DEV_ID_DECL_EM
 #undef RTE_PCI_DEV_ID_DECL_IGB
 #undef RTE_PCI_DEV_ID_DECL_IGBVF
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..b293b1c 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -31,6 +31,9 @@

 include $(RTE_SDK)/mk/rte.vars.mk

+ifeq ($(CONFIG_RTE_EAL_ENA_UIO),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += ena_uio
+endif
 ifeq ($(CONFIG_RTE_EAL_IGB_UIO),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += igb_uio
 endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index bc5b5be..b05a564 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -137,6 +137,7 @@ pci_map_device(struct rte_pci_device *dev)
 #endif
break;
case RTE_KDRV_IGB_UIO:
+   case RTE_KDRV_ENA_UIO:
case RTE_KDRV_UIO_GENERIC:
/* map resources for devices that use uio */
ret = pci_uio_map_resource(dev);
@@ -161,6 +162,7 @@ pci_unmap_device(struct rte_pci_device *dev)
RTE_LOG(ERR, EAL, "Hotplug doesn't support vfio yet\n");
break;
case RTE_KDRV_IGB_UIO:
+   case RTE_KDRV_ENA_UIO:
case RTE_KDRV_UIO_GENERIC:
/* unmap resources for devices that use uio */
pci_uio_unmap_resource(dev);
@@ -355,6 +357,8 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
if (!ret) {
if (!strcmp(driver, "vfio-pci"))
dev->kdrv = RTE_KDRV_VFIO;
+   else if (!strcmp(driver, "ena_uio"))
+   dev->kdrv = RTE_KDRV_ENA_UIO;
else if (!strcmp(driver, "igb_uio"))
dev->kdrv = RTE_KDRV_IGB_UIO;
else if (!strcmp(driver, "uio_pci_generic"))
diff --git a/lib/librte_eal/linuxapp/ena_uio/Makefile b/lib/librte_eal/linuxapp/ena_uio/Makefile
new file mode 100644
index
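
The pci_scan_one() hunk above extends an if/else chain that maps the
bound kernel driver name to a driver kind. As a sketch of the same
mapping written as a lookup table: the driver name strings are taken
from the patch, while enum kdrv, kdrv_table and kdrv_from_name() are
hypothetical names that do not exist in DPDK.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the kernel driver kinds used in the patch. */
enum kdrv {
	KDRV_UNKNOWN = 0,
	KDRV_VFIO,
	KDRV_ENA_UIO,
	KDRV_IGB_UIO,
	KDRV_UIO_GENERIC,
};

static const struct {
	const char *name;
	enum kdrv kdrv;
} kdrv_table[] = {
	{ "vfio-pci",        KDRV_VFIO },
	{ "ena_uio",         KDRV_ENA_UIO },
	{ "igb_uio",         KDRV_IGB_UIO },
	{ "uio_pci_generic", KDRV_UIO_GENERIC },
};

/* Return the driver kind for a sysfs driver name, or KDRV_UNKNOWN. */
static enum kdrv kdrv_from_name(const char *driver)
{
	for (size_t i = 0; i < sizeof(kdrv_table) / sizeof(kdrv_table[0]); i++)
		if (strcmp(driver, kdrv_table[i].name) == 0)
			return kdrv_table[i].kdrv;
	return KDRV_UNKNOWN;
}
```

With a table like this, supporting another UIO module would only need
one more row rather than another else-if branch.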

[dpdk-dev] [PATCH 0/4] DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-01-28 Thread Jan Medala

This is a PMD for the Amazon ethernet ENA family.
The driver operates a variety of ENA adapters through feature negotiation
with the adapter and an upgradable command set.
The ENA driver handles PCI Physical and Virtual ENA functions.

Jan Medala (4):
  Amazon ENA UIO driver
  Amazon ENA communication layer
  Amazon ENA communication layer for DPDK platform
  DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

 config/common_linuxapp |   12 +
 drivers/net/Makefile   |1 +
 drivers/net/ena/Makefile   |   62 +
 drivers/net/ena/base/ena_com.c | 2401 
 drivers/net/ena/base/ena_com.h |  765 +++
 drivers/net/ena/base/ena_defs/ena_admin_defs.h | 1660 ++
 .../net/ena/base/ena_defs/ena_admin_defs_custom.h  |   40 +
 drivers/net/ena/base/ena_defs/ena_common_defs.h|   54 +
 drivers/net/ena/base/ena_defs/ena_efa_admin_defs.h |  685 ++
 drivers/net/ena/base/ena_defs/ena_efa_io_defs.h|  543 +
 drivers/net/ena/base/ena_defs/ena_eth_io_defs.h| 1095 +
 drivers/net/ena/base/ena_defs/ena_gen_info.h   |   35 +
 drivers/net/ena/base/ena_defs/ena_includes.h   |   39 +
 drivers/net/ena/base/ena_defs/ena_regs_defs.h  |  326 +++
 drivers/net/ena/base/ena_eth_com.c |  496 
 drivers/net/ena/base/ena_eth_com.h |  130 ++
 drivers/net/ena/base/ena_plat.h|   51 +
 drivers/net/ena/base/ena_plat_dpdk.h   |  209 ++
 drivers/net/ena/ena_ethdev.c   | 1051 +
 drivers/net/ena/ena_ethdev.h   |  143 ++
 drivers/net/ena/ena_logs.h |   76 +
 drivers/net/ena/ena_platform.h |   58 +
 lib/librte_eal/common/include/rte_pci.h|1 +
 lib/librte_eal/common/include/rte_pci_dev_ids.h|   16 +
 lib/librte_eal/linuxapp/Makefile   |3 +
 lib/librte_eal/linuxapp/eal/eal_pci.c  |4 +
 lib/librte_eal/linuxapp/ena_uio/Makefile   |   55 +
 lib/librte_eal/linuxapp/ena_uio/ena_uio_driver.c   |  276 +++
 mk/rte.app.mk  |1 +
 29 files changed, 10288 insertions(+)
 create mode 100755 drivers/net/ena/Makefile
 create mode 100644 drivers/net/ena/base/ena_com.c
 create mode 100644 drivers/net/ena/base/ena_com.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_admin_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_admin_defs_custom.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_common_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_efa_admin_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_efa_io_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_eth_io_defs.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_gen_info.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_includes.h
 create mode 100644 drivers/net/ena/base/ena_defs/ena_regs_defs.h
 create mode 100644 drivers/net/ena/base/ena_eth_com.c
 create mode 100644 drivers/net/ena/base/ena_eth_com.h
 create mode 100644 drivers/net/ena/base/ena_plat.h
 create mode 100644 drivers/net/ena/base/ena_plat_dpdk.h
 create mode 100644 drivers/net/ena/ena_ethdev.c
 create mode 100755 drivers/net/ena/ena_ethdev.h
 create mode 100644 drivers/net/ena/ena_logs.h
 create mode 100644 drivers/net/ena/ena_platform.h
 create mode 100644 lib/librte_eal/linuxapp/ena_uio/Makefile
 create mode 100644 lib/librte_eal/linuxapp/ena_uio/ena_uio_driver.c

-- 
1.9.1

[dpdk-dev] [PATCH] examples/ip_pipeline: config parser clean-up

2016-01-28 Thread Fan Zhang

This patch updates the pipeline configuration file parser, cleans up nested
if/else conditions, and adds clearer error messages.

Signed-off-by: Fan Zhang 
---
 examples/ip_pipeline/config_parse.c | 798 
 examples/ip_pipeline/pipeline_be.h  |  48 +++
 2 files changed, 494 insertions(+), 352 deletions(-)

diff --git a/examples/ip_pipeline/config_parse.c b/examples/ip_pipeline/config_parse.c
index 1bedbe4..6575e31 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -291,34 +291,7 @@ parser_read_arg_bool(const char *p)
return result;
 }

-#define PARSE_ERROR(exp, section, entry)   \
-APP_CHECK(exp, "Parse error in section \"%s\": entry \"%s\"\n", section, entry)
-
-#define PARSE_ERROR_MALLOC(exp) \
-APP_CHECK(exp, "Parse error: no free memory\n")
-
-#define PARSE_ERROR_SECTION(exp, section)  \
-APP_CHECK(exp, "Parse error in section \"%s\"", section)
-
-#define PARSE_ERROR_SECTION_NO_ENTRIES(exp, section)   \
-APP_CHECK(exp, "Parse error in section \"%s\": no entries\n", section)
-
-#define PARSE_WARNING_IGNORED(exp, section, entry) \
-do \
-if (!(exp))\
-   fprintf(stderr, "Parse warning in section \"%s\": " \
-   "entry \"%s\" is ignored\n", section, entry);   \
-while (0)
-
-#define PARSE_ERROR_INVALID(exp, section, entry)   \
-APP_CHECK(exp, "Parse error in section \"%s\": unrecognized entry \"%s\"\n",\
-   section, entry)
-
-#define PARSE_ERROR_DUPLICATE(exp, section, entry) \
-APP_CHECK(exp, "Parse error in section \"%s\": duplicate entry \"%s\"\n",\
-   section, entry)
-
-static int
+int
 parser_read_uint64(uint64_t *value, const char *p)
 {
char *next;
@@ -358,7 +331,7 @@ parser_read_uint64(uint64_t *value, const char *p)
return 0;
 }

-static int
+int
 parser_read_uint32(uint32_t *value, const char *p)
 {
uint64_t val = 0;
@@ -935,6 +908,7 @@ parse_pipeline_pktq_in(struct app_params *app,

while (*next != '\0') {
enum app_pktq_in_type type;
+   int name_validated = 0;
int id;

end = strchr(next, ' ');
@@ -955,24 +929,41 @@ parse_pipeline_pktq_in(struct app_params *app,
if (validate_name(name, "RXQ", 2) == 0) {
type = APP_PKTQ_IN_HWQ;
id = APP_PARAM_ADD(app->hwq_in_params, name);
-   } else if (validate_name(name, "SWQ", 1) == 0) {
+   if (id < 0)
+   return id;
+   name_validated = 1;
+   }
+
+   if (validate_name(name, "SWQ", 1) == 0) {
type = APP_PKTQ_IN_SWQ;
id = APP_PARAM_ADD(app->swq_params, name);
-   } else if (validate_name(name, "TM", 1) == 0) {
+   if (id < 0)
+   return id;
+   name_validated = 1;
+   }
+
+   if (validate_name(name, "TM", 1) == 0) {
type = APP_PKTQ_IN_TM;
id = APP_PARAM_ADD(app->tm_params, name);
-   } else if (validate_name(name, "SOURCE", 1) == 0) {
+   if (id < 0)
+   return id;
+   name_validated = 1;
+   }
+
+   if (validate_name(name, "SOURCE", 1) == 0) {
type = APP_PKTQ_IN_SOURCE;
id = APP_PARAM_ADD(app->source_params, name);
+   if (id < 0)
+   return id;
+   name_validated = 1;
+   }
+
+   if (name_validated == 1) {
+   p->pktq_in[p->n_pktq_in].type = type;
+   p->pktq_in[p->n_pktq_in].id = (uint32_t) id;
+   p->n_pktq_in++;
} else
return -EINVAL;
-
-   if (id < 0)
-   return id;
-
-   p->pktq_in[p->n_pktq_in].type = type;
-   p->pktq_in[p->n_pktq_in].id = (uint32_t) id;
-   p->n_pktq_in++;
}

return 0;
@@ -990,6 +981,7 @@ parse_pipeline_pktq_out(struct app_params *app,

while (*next != '\0') {
enum app_pktq_out_type type;
+   int name_validated = 0;
int id;

end = strchr(next, ' ');
@@ -1010,24 +1002,41 @@ parse_pipeline_pktq_out(struct app_params *app,
if (validate_name(name, "TXQ", 2) == 0) {
type = APP_PKTQ_OUT_HWQ;
id =
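
The hunks above classify a queue name by trying one prefix after
another. Purely as an illustration of that idea, the classification can
also be expressed as a table walk. This is a simplified sketch: it only
checks for a known prefix followed by a digit, which is an assumption
and not the real validate_name() logic, and pktq_in_kinds and
pktq_in_classify() are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the input queue kinds used by the parser. */
enum pktq_in_type {
	PKTQ_IN_HWQ,
	PKTQ_IN_SWQ,
	PKTQ_IN_TM,
	PKTQ_IN_SOURCE,
};

static const struct {
	const char *prefix;
	enum pktq_in_type type;
} pktq_in_kinds[] = {
	{ "RXQ",    PKTQ_IN_HWQ },
	{ "SWQ",    PKTQ_IN_SWQ },
	{ "TM",     PKTQ_IN_TM },
	{ "SOURCE", PKTQ_IN_SOURCE },
};

/*
 * Return 0 and set *type when name is a known prefix followed by a
 * digit (for example "SWQ3"); return -1 for anything else.
 */
static int pktq_in_classify(const char *name, enum pktq_in_type *type)
{
	size_t n_kinds = sizeof(pktq_in_kinds) / sizeof(pktq_in_kinds[0]);

	for (size_t i = 0; i < n_kinds; i++) {
		size_t n = strlen(pktq_in_kinds[i].prefix);

		if (strncmp(name, pktq_in_kinds[i].prefix, n) == 0 &&
		    isdigit((unsigned char)name[n])) {
			*type = pktq_in_kinds[i].type;
			return 0;
		}
	}
	return -1;
}
```

A caller would return -EINVAL when the classifier fails, which plays
the role of the name_validated flag in the patch.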

[dpdk-dev] [PATCH] examples/ip_pipeline: add link identification feature

2016-01-28 Thread Fan Zhang

This patch adds a link identification feature to the packet framework. To
identify a link, the user can either use the existing port-mask option or
specify the PCI device in each LINK section of the configuration file.

Signed-off-by: Fan Zhang 
---
 examples/ip_pipeline/app.h  |   1 +
 examples/ip_pipeline/config_parse.c | 138 +---
 2 files changed, 131 insertions(+), 8 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 6510d6d..43bee8a 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -73,6 +73,7 @@ struct app_link_params {
uint32_t ip; /* 0 = Invalid */
uint32_t depth; /* Valid only when IP is valid */
uint64_t mac_addr; /* Read from HW */
+   struct rte_pci_addr *pci_bdf; /* Hardware PCI address */

struct rte_eth_conf conf;
uint8_t promisc;
diff --git a/examples/ip_pipeline/config_parse.c b/examples/ip_pipeline/config_parse.c
index 1bedbe4..961e753 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -41,10 +41,14 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
 #include 
+#include 
+#include 
+#include 

 #include "app.h"

@@ -2532,12 +2536,113 @@ filenamedup(const char *filename, const char *suffix)
return s;
 }

+#define IGB_UIO_DEVICES "/sys/bus/pci/drivers/igb_uio/"
+/*
+ * split up a pci address into its constituent parts.
+ */
+static int
+parse_pci_addr_format(const char *buf, int bufsize, uint16_t *domain,
+   uint8_t *bus, uint8_t *devid, uint8_t *function)
+{
+   /* first split on ':' */
+   union splitaddr {
+   struct {
+   char *domain;
+   char *bus;
+   char *devid;
+   char *function;
+   };
+   char *str[PCI_FMT_NVAL];
+   } splitaddr;
+   char *buf_copy = strndup(buf, bufsize);
+
+   if (buf_copy == NULL)
+   return -1;
+
+   if (rte_strsplit(buf_copy, bufsize, splitaddr.str, PCI_FMT_NVAL, ':')
+   != PCI_FMT_NVAL - 1)
+   goto error;
+   /* final split is on '.' between devid and function */
+   splitaddr.function = strchr(splitaddr.devid, '.');
+   if (splitaddr.function == NULL)
+   goto error;
+   *splitaddr.function++ = '\0';
+
+   /* now convert to int values */
+   errno = 0;
+   *domain = (uint16_t)strtoul(splitaddr.domain, NULL, 16);
+   *bus = (uint8_t)strtoul(splitaddr.bus, NULL, 16);
+   *devid = (uint8_t)strtoul(splitaddr.devid, NULL, 16);
+   *function = (uint8_t)strtoul(splitaddr.function, NULL, 10);
+   if (errno != 0)
+   goto error;
+
+   free(buf_copy); /* free the copy made with strdup */
+   return 0;
+error:
+   free(buf_copy);
+   return -1;
+}
+
+static int
+parse_pci_dev_str(struct app_params *app, const char *devArgStr)
+{
+   struct dirent *e;
+   DIR *dir;
+   char dev_name[PATH_MAX];
+   uint16_t domain;
+   uint8_t bus, devid, function;
+   uint8_t port_id = 0;
+   struct rte_pci_addr *pci_addr;
+   uint8_t found_match = 0;
+
+   dir = opendir(IGB_UIO_DEVICES);
+   if (dir == NULL)
+   return -1;
+
+   while ((e = readdir(dir)) != NULL) {
+   if (e->d_name[0] == '.')
+   continue;
+
+   if (parse_pci_addr_format(e->d_name, sizeof(e->d_name), &domain,
+   &bus, &devid, &function) != 0)
+   continue;
+
+   snprintf(dev_name, sizeof(dev_name), PCI_PRI_FMT,
+   domain, bus, devid, function);
+
+   if (strncmp(devArgStr, dev_name, sizeof(dev_name)) == 0) {
+   found_match = 1;
+   pci_addr = malloc(sizeof(struct rte_pci_addr));
+   PARSE_ERROR_MALLOC(pci_addr != NULL);
+   pci_addr->domain = domain;
+   pci_addr->bus = bus;
+   pci_addr->devid = devid;
+   pci_addr->function = function;
+
+   app->link_params[port_id].pci_bdf = pci_addr;
+   app->port_mask |= 1 << port_id;
+
+   break;
+   }
+   /* Assuming all devices will be taken account in EAL */
+   port_id++;
+   }
+
+   closedir(dir);
+
+   if (found_match == 0)
+   return -1;
+
+   return 0;
+}
+
 int
 app_config_args(struct app_params *app, int argc, char **argv)
 {
const char *optname;
int opt, option_index;
-   int f_present, s_present, p_present, l_present;
+   int f_present, s_present, p_present, l_present, w_present;
int preproc_present, preproc_params_present;
int scaned = 0;

@@ -2554,10 +2659,11 @@ app_config_args(struct app_params *app, int
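
For comparison with parse_pci_addr_format() in this patch, which
splits the string by hand, here is a compact standalone sketch that
parses a "domain:bus:devid.function" address with sscanf(). struct
pci_bdf and pci_bdf_parse() are hypothetical names, and the range
checks assume the usual PCI limits of 32 devices per bus and 8
functions per device.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for a parsed PCI address. */
struct pci_bdf {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Parse "DDDD:BB:DD.F" (hex fields); return 0 on success, -1 on error. */
static int pci_bdf_parse(const char *s, struct pci_bdf *out)
{
	unsigned int domain, bus, devid, function;
	char trailing;

	/* The final %c only matches if junk follows the function digit. */
	if (sscanf(s, "%4x:%2x:%2x.%1x%c",
		   &domain, &bus, &devid, &function, &trailing) != 4)
		return -1;
	if (devid > 0x1f || function > 7)
		return -1;

	out->domain = (uint16_t)domain;
	out->bus = (uint8_t)bus;
	out->devid = (uint8_t)devid;
	out->function = (uint8_t)function;
	return 0;
}
```

Parsing "0000:02:00.1" this way gives domain 0, bus 2, devid 0 and
function 1, while a string without the ".function" part is rejected.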

[dpdk-dev] [PATCH v6 9/9] virtio: move VIRTIO_READ/WRITE_REG_X into virtio_pci.c

2016-01-28 Thread Yuanhan Liu

virtio_pci.c is the only file that references these macros; move them there.

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
 drivers/net/virtio/virtio_pci.c | 19 +++
 drivers/net/virtio/virtio_pci.h | 18 --
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 2d3143b..e16104e 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -49,6 +49,25 @@
 #define PCI_CAPABILITY_LIST 0x34
 #define PCI_CAP_ID_VNDR 0x09

+
+#define VIRTIO_PCI_REG_ADDR(hw, reg) \
+   (unsigned short)((hw)->io_base + (reg))
+
+#define VIRTIO_READ_REG_1(hw, reg) \
+   inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
+#define VIRTIO_WRITE_REG_1(hw, reg, value) \
+   outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+
+#define VIRTIO_READ_REG_2(hw, reg) \
+   inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
+#define VIRTIO_WRITE_REG_2(hw, reg, value) \
+   outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+
+#define VIRTIO_READ_REG_4(hw, reg) \
+   inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
+#define VIRTIO_WRITE_REG_4(hw, reg, value) \
+   outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+
 static void
 legacy_read_dev_config(struct virtio_hw *hw, size_t offset,
   void *dst, int length)
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index fcac660..0544a07 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -318,24 +318,6 @@ outl_p(unsigned int data, unsigned int port)
 }
 #endif

-#define VIRTIO_PCI_REG_ADDR(hw, reg) \
-   (unsigned short)((hw)->io_base + (reg))
-
-#define VIRTIO_READ_REG_1(hw, reg) \
-   inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_1(hw, reg, value) \
-   outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
-
-#define VIRTIO_READ_REG_2(hw, reg) \
-   inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_2(hw, reg, value) \
-   outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
-
-#define VIRTIO_READ_REG_4(hw, reg) \
-   inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_4(hw, reg, value) \
-   outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
-
 static inline int
 vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
 {
-- 
1.9.0

[dpdk-dev] [PATCH v6 8/9] virtio: add 1.0 support

2016-01-28 Thread Yuanhan Liu

Modern (v1.0) virtio pci devices define several pci capabilities.
Each cap has a configuration structure corresponding to it, and the
cap.bar and cap.offset fields tell us where to find it.

First, we map the pci resources with rte_eal_pci_map_device().
We can then easily locate a cfg structure by:

cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;

Therefore, the entry point for enabling modern (v1.0) pci device support
is to iterate over the pci capability list and locate the configs
we care about, which are:

- common cfg

  For generic virtio and virtqueue configuration, such as setting/getting
  features, enabling a specific queue, and so on.

- notify cfg

  Combining with `queue_notify_off' from common cfg, we could use it to
  notify a specific virt queue.

- device cfg

  Where virtio_net_config structure is located.

- isr cfg

  Where to read isr (interrupt status).

If any of the above caps is not found, we fall back to the legacy virtio
handling.
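
A minimal sketch of the lookup and fallback described above, not the
patch's actual code. The names bar_region, pci_cap, get_cfg_addr() and
all_caps_found() are hypothetical stand-ins; only the bar/offset lookup
and the "offset + length" overflow check (see the v5 notes) come from
this message.

```c
#include <stddef.h>
#include <stdint.h>

#define NB_BARS 6

/* Hypothetical stand-in for one mapped PCI BAR. */
struct bar_region {
    uint8_t *addr;   /* base address of the mapping, NULL if unmapped */
    uint64_t len;    /* length of the mapping in bytes */
};

/* Hypothetical stand-in for the relevant fields of a virtio PCI cap. */
struct pci_cap {
    uint8_t bar;     /* which BAR holds the cfg structure */
    uint32_t offset; /* offset of the structure within that BAR */
    uint32_t length; /* length of the structure */
};

/*
 * Return the address of the cfg structure described by cap, or NULL when
 * the BAR index is out of range, the BAR is not mapped, or
 * offset + length runs past the end of the BAR.
 */
static void *
get_cfg_addr(const struct bar_region bars[NB_BARS], const struct pci_cap *cap)
{
    uint64_t end;

    if (cap->bar >= NB_BARS || bars[cap->bar].addr == NULL)
        return NULL;

    /* The sum of two 32-bit values cannot wrap in 64 bits. */
    end = (uint64_t)cap->offset + cap->length;
    if (end > bars[cap->bar].len)
        return NULL;

    return bars[cap->bar].addr + cap->offset;
}

/* Modern mode is usable only if all four cfg structures were located. */
static int
all_caps_found(const void *common, const void *notify,
               const void *device, const void *isr)
{
    return common != NULL && notify != NULL &&
           device != NULL && isr != NULL;
}
```

A caller would run get_cfg_addr() once per capability and keep the
legacy ops whenever all_caps_found() returns 0.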

On success, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing one (or a few) specific
configuration fields in the above 4 cfg structures. That is basically
how this patch works.

Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.

Last, set the VIRTIO_F_VERSION_1 feature flag.

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---

v6: - unfold DEF_IO_READ/WRITE macros

v5: - rename MODERN_READ/WRITE_DEF macro name to IO_READ/WRITE_DEF
    - check offset + length overflow
---
 doc/guides/rel_notes/release_2_3.rst |   3 +
 drivers/net/virtio/virtio_ethdev.c   |  25 ++-
 drivers/net/virtio/virtio_ethdev.h   |   3 +-
 drivers/net/virtio/virtio_pci.c  | 355 ++-
 drivers/net/virtio/virtio_pci.h  |  67 +++
 drivers/net/virtio/virtqueue.h   |   2 +
 6 files changed, 450 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..c390d97 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,9 @@ DPDK Release 2.3
 New Features
 

+* **Virtio 1.0 support.**
+
+  Enabled virtio 1.0 support for virtio pmd driver.

 Resolved Issues
 ---
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 94e0c4a..deb0382 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -927,7 +927,7 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
return virtio_send_command(hw->cvq, , , 1);
 }

-static void
+static int
 virtio_negotiate_features(struct virtio_hw *hw)
 {
uint64_t host_features;
@@ -949,6 +949,22 @@ virtio_negotiate_features(struct virtio_hw *hw)
hw->guest_features = vtpci_negotiate_features(hw, host_features);
PMD_INIT_LOG(DEBUG, "features after negotiate = %"PRIx64,
hw->guest_features);
+
+   if (hw->modern) {
+   if (!vtpci_with_feature(hw, VIRTIO_F_VERSION_1)) {
+   PMD_INIT_LOG(ERR,
+   "VIRTIO_F_VERSION_1 features is not enabled.");
+   return -1;
+   }
+   vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_FEATURES_OK);
+   if (!(vtpci_get_status(hw) & VIRTIO_CONFIG_STATUS_FEATURES_OK)) {
+   PMD_INIT_LOG(ERR,
+   "failed to set FEATURES_OK status!");
+   return -1;
+   }
+   }
+
+   return 0;
 }

 /*
@@ -1032,7 +1048,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
-   virtio_negotiate_features(hw);
+   if (virtio_negotiate_features(hw) < 0)
+   return -1;

/* If host does not support status then disable LSC */
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
@@ -1043,7 +1060,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
rx_func_get(eth_dev);

/* Setting up rx_header size for the device */
-   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF))
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF) ||
+   vtpci_with_feature(hw, VIRTIO_F_VERSION_1))
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
else
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
@@ -1159,6 +1177,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
rte_intr_callback_unregister(_dev->intr_handle,
virtio_interrupt_handler,
eth_dev);
+   rte_eal_pci_unmap_device(pci_dev);

[dpdk-dev] [PATCH v6 6/9] virtio: retrieve hdr_size from hw->vtnet_hdr_size

2016-01-28 Thread Yuanhan Liu

The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
no longer hardcode hdr_size to "sizeof(struct virtio_net_hdr)"
in virtio_recv_pkts(); otherwise, there would be a mismatch of
hdr size between rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading to packet corruption.

Instead, we should retrieve it from hw->vtnet_hdr_size; we will
set it properly in eth_virtio_dev_init() in later patches.
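
To make the mismatch concrete, here is a small sketch under stated
assumptions. The two header layouts follow the virtio spec; hw_sketch,
set_hdr_size() and payload_start() are hypothetical helpers for
illustration and are not part of this series.

```c
#include <stdint.h>

/* Legacy header layout: 10 bytes. */
struct virtio_net_hdr {
    uint8_t flags;
    uint8_t gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Mergeable header: the legacy header plus num_buffers, 12 bytes. */
struct virtio_net_hdr_mrg_rxbuf {
    struct virtio_net_hdr hdr;
    uint16_t num_buffers;
};

/* Hypothetical, trimmed-down stand-in for struct virtio_hw. */
struct hw_sketch {
    uint16_t vtnet_hdr_size;
};

/* Pick the header size once, at init time. */
static void
set_hdr_size(struct hw_sketch *hw, int mergeable_or_v1)
{
    if (mergeable_or_v1)
        hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
    else
        hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
}

/* The rx path skips whatever header size was negotiated. */
static const uint8_t *
payload_start(const struct hw_sketch *hw, const uint8_t *buf)
{
    return buf + hw->vtnet_hdr_size;
}
```

With a hardcoded 10-byte skip against a 12-byte header, the first two
payload bytes seen by the receiver would actually be num_buffers.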

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
 drivers/net/virtio/virtio_rxtx.c|  6 --
 drivers/net/virtio/virtio_rxtx_simple.c | 12 ++--
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index b7267c0..41a1366 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -560,7 +560,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
struct rte_mbuf *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
int error;
uint32_t i, nb_enqueued;
-   const uint32_t hdr_size = sizeof(struct virtio_net_hdr);
+   uint32_t hdr_size;

nb_used = VIRTQUEUE_NUSED(rxvq);

@@ -580,6 +580,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
hw = rxvq->hw;
nb_rx = 0;
nb_enqueued = 0;
+   hdr_size = hw->vtnet_hdr_size;

for (i = 0; i < num ; i++) {
rxm = rcv_pkts[i];
@@ -664,7 +665,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint32_t seg_num;
uint16_t extra_idx;
uint32_t seg_res;
-   const uint32_t hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   uint32_t hdr_size;

nb_used = VIRTQUEUE_NUSED(rxvq);

@@ -682,6 +683,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_num = 0;
extra_idx = 0;
seg_res = 0;
+   hdr_size = hw->vtnet_hdr_size;

while (i < nb_used) {
struct virtio_net_hdr_mrg_rxbuf *header;
diff --git a/drivers/net/virtio/virtio_rxtx_simple.c b/drivers/net/virtio/virtio_rxtx_simple.c
index ff3c11a..3e66e8b 100644
--- a/drivers/net/virtio/virtio_rxtx_simple.c
+++ b/drivers/net/virtio/virtio_rxtx_simple.c
@@ -81,9 +81,9 @@ virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq,

start_dp = vq->vq_ring.desc;
start_dp[desc_idx].addr = (uint64_t)((uintptr_t)cookie->buf_physaddr +
-   RTE_PKTMBUF_HEADROOM - sizeof(struct virtio_net_hdr));
+   RTE_PKTMBUF_HEADROOM - vq->hw->vtnet_hdr_size);
start_dp[desc_idx].len = cookie->buf_len -
-   RTE_PKTMBUF_HEADROOM + sizeof(struct virtio_net_hdr);
+   RTE_PKTMBUF_HEADROOM + vq->hw->vtnet_hdr_size;

vq->vq_free_cnt--;
vq->vq_avail_idx++;
@@ -120,9 +120,9 @@ virtio_rxq_rearm_vec(struct virtqueue *rxvq)

start_dp[i].addr =
(uint64_t)((uintptr_t)sw_ring[i]->buf_physaddr +
-   RTE_PKTMBUF_HEADROOM - sizeof(struct virtio_net_hdr));
+   RTE_PKTMBUF_HEADROOM - rxvq->hw->vtnet_hdr_size);
start_dp[i].len = sw_ring[i]->buf_len -
-   RTE_PKTMBUF_HEADROOM + sizeof(struct virtio_net_hdr);
+   RTE_PKTMBUF_HEADROOM + rxvq->hw->vtnet_hdr_size;
}

rxvq->vq_avail_idx += RTE_VIRTIO_VPMD_RX_REARM_THRESH;
@@ -175,8 +175,8 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
len_adjust = _mm_set_epi16(
0, 0,
0,
-   (uint16_t) -sizeof(struct virtio_net_hdr),
-   0, (uint16_t) -sizeof(struct virtio_net_hdr),
+   (uint16_t) -rxvq->hw->vtnet_hdr_size,
+   0, (uint16_t) -rxvq->hw->vtnet_hdr_size,
0, 0);

if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP))
-- 
1.9.0

[dpdk-dev] [PATCH v6 5/9] viritio: switch to 64 bit features

2016-01-28 Thread Yuanhan Liu

Switch to 64 bit features, which virtio 1.0 supports.

Legacy virtio only supports 32 bit features, so it logs an error
and bails out when asked to set any feature above bit 31.
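
A short sketch of why the wider type matters. VIRTIO_F_VERSION_1 is
feature bit 32 in the virtio 1.0 spec, so it cannot be represented in a
32 bit word. The helper names with_feature() and legacy_features_ok()
are hypothetical, and the latter returns a status code only for
illustration; it is not the function changed by this patch.

```c
#include <stdint.h>

#define VIRTIO_F_VERSION_1 32 /* feature bit number from virtio 1.0 */

/* Test a feature bit in a 64 bit feature word. */
static int
with_feature(uint64_t features, unsigned int bit)
{
    /* 1ULL keeps the shift 64 bits wide; 1u << 32 would be undefined. */
    return (features & (1ULL << bit)) != 0;
}

/* Legacy devices only have a 32 bit feature register. */
static int
legacy_features_ok(uint64_t features)
{
    return (features >> 32) == 0 ? 0 : -1;
}
```

A legacy device would therefore reject any feature set that includes
VIRTIO_F_VERSION_1, while accepting every combination of bits 0 to 31.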

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
 drivers/net/virtio/virtio_ethdev.c |  8 
 drivers/net/virtio/virtio_pci.c| 15 ++-
 drivers/net/virtio/virtio_pci.h| 12 ++--
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index b57224d..94e0c4a 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -930,16 +930,16 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 static void
 virtio_negotiate_features(struct virtio_hw *hw)
 {
-   uint32_t host_features;
+   uint64_t host_features;

/* Prepare guest_features: feature that driver wants to support */
hw->guest_features = VIRTIO_PMD_GUEST_FEATURES;
-   PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %x",
+   PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %"PRIx64,
hw->guest_features);

/* Read device(host) feature bits */
host_features = hw->vtpci_ops->get_features(hw);
-   PMD_INIT_LOG(DEBUG, "host_features before negotiate = %x",
+   PMD_INIT_LOG(DEBUG, "host_features before negotiate = %"PRIx64,
host_features);

/*
@@ -947,7 +947,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
 * guest feature bits.
 */
hw->guest_features = vtpci_negotiate_features(hw, host_features);
-   PMD_INIT_LOG(DEBUG, "features after negotiate = %x",
+   PMD_INIT_LOG(DEBUG, "features after negotiate = %"PRIx64,
hw->guest_features);
 }

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 16485fa..5e1c55f 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -87,15 +87,20 @@ legacy_write_dev_config(struct virtio_hw *hw, size_t offset,
}
 }

-static uint32_t
+static uint64_t
 legacy_get_features(struct virtio_hw *hw)
 {
return VIRTIO_READ_REG_4(hw, VIRTIO_PCI_HOST_FEATURES);
 }

 static void
-legacy_set_features(struct virtio_hw *hw, uint32_t features)
+legacy_set_features(struct virtio_hw *hw, uint64_t features)
 {
+   if ((features >> 32) != 0) {
+   PMD_DRV_LOG(ERR,
+   "only 32 bit features are allowed for legacy virtio!");
+   return;
+   }
VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_GUEST_FEATURES, features);
 }

@@ -451,10 +456,10 @@ vtpci_write_dev_config(struct virtio_hw *hw, size_t offset,
hw->vtpci_ops->write_dev_cfg(hw, offset, src, length);
 }

-uint32_t
-vtpci_negotiate_features(struct virtio_hw *hw, uint32_t host_features)
+uint64_t
+vtpci_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
 {
-   uint32_t features;
+   uint64_t features;

/*
 * Limit negotiated features to what the driver, virtqueue, and
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index e8e7509..d7bc6bb 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -175,8 +175,8 @@ struct virtio_pci_ops {
uint8_t (*get_status)(struct virtio_hw *hw);
void (*set_status)(struct virtio_hw *hw, uint8_t status);

-   uint32_t (*get_features)(struct virtio_hw *hw);
-   void (*set_features)(struct virtio_hw *hw, uint32_t features);
+   uint64_t (*get_features)(struct virtio_hw *hw);
+   void (*set_features)(struct virtio_hw *hw, uint64_t features);

uint8_t (*get_isr)(struct virtio_hw *hw);

@@ -191,7 +191,7 @@ struct virtio_pci_ops {
 struct virtio_hw {
struct virtqueue *cvq;
uint32_t io_base;
-   uint32_t guest_features;
+   uint64_t guest_features;
uint32_t max_tx_queues;
uint32_t max_rx_queues;
uint16_t vtnet_hdr_size;
@@ -271,9 +271,9 @@ outl_p(unsigned int data, unsigned int port)
outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))

 static inline int
-vtpci_with_feature(struct virtio_hw *hw, uint32_t bit)
+vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
 {
-   return (hw->guest_features & (1u << bit)) != 0;
+   return (hw->guest_features & (1ULL << bit)) != 0;
 }

 /*
@@ -286,7 +286,7 @@ void vtpci_reinit_complete(struct virtio_hw *);

 void vtpci_set_status(struct virtio_hw *, uint8_t);

-uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);
+uint64_t vtpci_negotiate_features(struct virtio_hw *, uint64_t);

 void vtpci_write_dev_config(struct virtio_hw *, size_t, const void *, int);

-- 
1.9.0

[dpdk-dev] [PATCH v6 4/9] virtio: move left pci stuff to virtio_pci.c

2016-01-28 Thread Yuanhan Liu

virtio_pci.c is a more proper place for pci stuff; virtio_ethdev is not.

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
 drivers/net/virtio/virtio_ethdev.c | 265 +---
 drivers/net/virtio/virtio_pci.c| 270 -
 2 files changed, 270 insertions(+), 265 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 6c1d3a0..b57224d 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -36,10 +36,6 @@
 #include 
 #include 
 #include 
-#ifdef RTE_EXEC_ENV_LINUXAPP
-#include 
-#include 
-#endif

 #include 
 #include 
@@ -955,260 +951,6 @@ virtio_negotiate_features(struct virtio_hw *hw)
hw->guest_features);
 }

-#ifdef RTE_EXEC_ENV_LINUXAPP
-static int
-parse_sysfs_value(const char *filename, unsigned long *val)
-{
-   FILE *f;
-   char buf[BUFSIZ];
-   char *end = NULL;
-
-   f = fopen(filename, "r");
-   if (f == NULL) {
-   PMD_INIT_LOG(ERR, "%s(): cannot open sysfs value %s",
-__func__, filename);
-   return -1;
-   }
-
-   if (fgets(buf, sizeof(buf), f) == NULL) {
-   PMD_INIT_LOG(ERR, "%s(): cannot read sysfs value %s",
-__func__, filename);
-   fclose(f);
-   return -1;
-   }
-   *val = strtoul(buf, , 0);
-   if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
-   PMD_INIT_LOG(ERR, "%s(): cannot parse sysfs value %s",
-__func__, filename);
-   fclose(f);
-   return -1;
-   }
-   fclose(f);
-   return 0;
-}
-
-static int get_uio_dev(struct rte_pci_addr *loc, char *buf, unsigned int buflen,
-   unsigned int *uio_num)
-{
-   struct dirent *e;
-   DIR *dir;
-   char dirname[PATH_MAX];
-
-   /* depending on kernel version, uio can be located in uio/uioX
-* or uio:uioX */
-   snprintf(dirname, sizeof(dirname),
-SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/uio",
-loc->domain, loc->bus, loc->devid, loc->function);
-   dir = opendir(dirname);
-   if (dir == NULL) {
-   /* retry with the parent directory */
-   snprintf(dirname, sizeof(dirname),
-SYSFS_PCI_DEVICES "/" PCI_PRI_FMT,
-loc->domain, loc->bus, loc->devid, loc->function);
-   dir = opendir(dirname);
-
-   if (dir == NULL) {
-   PMD_INIT_LOG(ERR, "Cannot opendir %s", dirname);
-   return -1;
-   }
-   }
-
-   /* take the first file starting with "uio" */
-   while ((e = readdir(dir)) != NULL) {
-   /* format could be uio%d ...*/
-   int shortprefix_len = sizeof("uio") - 1;
-   /* ... or uio:uio%d */
-   int longprefix_len = sizeof("uio:uio") - 1;
-   char *endptr;
-
-   if (strncmp(e->d_name, "uio", 3) != 0)
-   continue;
-
-   /* first try uio%d */
-   errno = 0;
-   *uio_num = strtoull(e->d_name + shortprefix_len, , 10);
-   if (errno == 0 && endptr != (e->d_name + shortprefix_len)) {
-   snprintf(buf, buflen, "%s/uio%u", dirname, *uio_num);
-   break;
-   }
-
-   /* then try uio:uio%d */
-   errno = 0;
-   *uio_num = strtoull(e->d_name + longprefix_len, , 10);
-   if (errno == 0 && endptr != (e->d_name + longprefix_len)) {
-   snprintf(buf, buflen, "%s/uio:uio%u", dirname,
-*uio_num);
-   break;
-   }
-   }
-   closedir(dir);
-
-   /* No uio resource found */
-   if (e == NULL) {
-   PMD_INIT_LOG(ERR, "Could not find uio resource");
-   return -1;
-   }
-
-   return 0;
-}
-
-static int
-virtio_has_msix(const struct rte_pci_addr *loc)
-{
-   DIR *d;
-   char dirname[PATH_MAX];
-
-   snprintf(dirname, sizeof(dirname),
-SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/msi_irqs",
-loc->domain, loc->bus, loc->devid, loc->function);
-
-   d = opendir(dirname);
-   if (d)
-   closedir(d);
-
-   return (d != NULL);
-}
-
-/* Extract I/O port numbers from sysfs */
-static int virtio_resource_init_by_uio(struct rte_pci_device *pci_dev)
-{
-   char dirname[PATH_MAX];
-   char filename[PATH_MAX];
-   unsigned long start, size;
-   unsigned int uio_num;
-
-   if (get_uio_dev(_dev->addr, dirname, sizeof(dirname), _num) < 0)
-   return -1;
-
-   /* get portio size */
-

[dpdk-dev] [PATCH v6 3/9] virtio: introduce struct virtio_pci_ops

2016-01-28 Thread Yuanhan Liu

Introduce struct virtio_pci_ops, to let legacy virtio (v0.95) and
modern virtio (1.0) have different implementations of a
specific pci action, such as reading the host status.

With that, this patch reimplements all exported pci functions
like this:

vtpci_foo_bar(struct virtio_hw *hw)
{
    hw->vtpci_ops->foo_bar(hw);
}

So we only need to pay attention to those pci related functions
when adding virtio 1.0 support.

This patch introduces a new vtpci function, vtpci_init(), to do
proper virtio pci settings. It's pretty simple so far: it just sets
hw->vtpci_ops to legacy_ops as we don't support 1.0 yet.
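
The dispatch pattern can be sketched as below. This is an illustration
under assumptions, not the code from the patch: the *_sketch names and
the status_reg field (which stands in for a real device register) are
hypothetical.

```c
#include <stdint.h>

struct hw_sketch;

/* Table of per-transport operations, one entry per pci action. */
struct pci_ops_sketch {
    uint8_t (*get_status)(struct hw_sketch *hw);
    void (*set_status)(struct hw_sketch *hw, uint8_t status);
};

struct hw_sketch {
    const struct pci_ops_sketch *ops;
    uint8_t status_reg; /* stands in for the device status register */
};

static uint8_t
legacy_get_status(struct hw_sketch *hw)
{
    return hw->status_reg;
}

static void
legacy_set_status(struct hw_sketch *hw, uint8_t status)
{
    hw->status_reg = status;
}

static const struct pci_ops_sketch legacy_ops = {
    .get_status = legacy_get_status,
    .set_status = legacy_set_status,
};

/* Init picks the table; only legacy exists in this sketch. */
static void
sketch_init(struct hw_sketch *hw)
{
    hw->ops = &legacy_ops;
    hw->status_reg = 0;
}

/* Exported wrappers never touch registers directly. */
static uint8_t
sketch_get_status(struct hw_sketch *hw)
{
    return hw->ops->get_status(hw);
}

static void
sketch_set_status(struct hw_sketch *hw, uint8_t status)
{
    hw->ops->set_status(hw, status);
}
```

Adding a second transport then means adding one more ops table and a
branch in the init function; callers of the wrappers stay unchanged.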

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
v5: - define "src" arg of vtpci_write_dev_config()
---
 drivers/net/virtio/virtio_ethdev.c |  22 ++---
 drivers/net/virtio/virtio_pci.c| 164 ++---
 drivers/net/virtio/virtio_pci.h|  29 ++-
 drivers/net/virtio/virtqueue.h |   2 +-
 4 files changed, 170 insertions(+), 47 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index d928339..6c1d3a0 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -272,9 +272,7 @@ virtio_dev_queue_release(struct virtqueue *vq) {

if (vq) {
hw = vq->hw;
-   /* Select and deactivate the queue */
-   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL, vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_QUEUE_PFN, 0);
+   hw->vtpci_ops->del_queue(hw, vq);

rte_free(vq->sw_ring);
rte_free(vq);
@@ -295,15 +293,13 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = NULL;

-   /* Write the virtqueue index to the Queue Select Field */
-   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL, vtpci_queue_idx);
-   PMD_INIT_LOG(DEBUG, "selecting queue: %u", vtpci_queue_idx);
+   PMD_INIT_LOG(DEBUG, "setting up queue: %u", vtpci_queue_idx);

/*
 * Read the virtqueue size from the Queue Size field
 * Always power of 2 and if 0 virtqueue does not exist
 */
-   vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
+   vq_size = hw->vtpci_ops->get_queue_num(hw, vtpci_queue_idx);
PMD_INIT_LOG(DEBUG, "vq_size: %u nb_desc:%u", vq_size, nb_desc);
if (vq_size == 0) {
PMD_INIT_LOG(ERR, "%s: virtqueue does not exist", __func__);
@@ -436,12 +432,8 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
memset(vq->virtio_net_hdr_mz->addr, 0, PAGE_SIZE);
}

-   /*
-* Set guest physical address of the virtqueue
-* in VIRTIO_PCI_QUEUE_PFN config register of device
-*/
-   VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_QUEUE_PFN,
-   mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
+   hw->vtpci_ops->setup_queue(hw, vq);
+
*pvq = vq;
return 0;
 }
@@ -950,7 +942,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
hw->guest_features);

/* Read device(host) feature bits */
-   host_features = VIRTIO_READ_REG_4(hw, VIRTIO_PCI_HOST_FEATURES);
+   host_features = hw->vtpci_ops->get_features(hw);
PMD_INIT_LOG(DEBUG, "host_features before negotiate = %x",
host_features);

@@ -1287,6 +1279,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

pci_dev = eth_dev->pci_dev;

+   vtpci_init(pci_dev, hw);
+
if (virtio_resource_init(pci_dev) < 0)
return -1;

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index b34b59e..8d001e8 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -34,12 +34,11 @@

 #include "virtio_pci.h"
 #include "virtio_logs.h"
+#include "virtqueue.h"

-static uint8_t vtpci_get_status(struct virtio_hw *);
-
-void
-vtpci_read_dev_config(struct virtio_hw *hw, size_t offset,
-   void *dst, int length)
+static void
+legacy_read_dev_config(struct virtio_hw *hw, size_t offset,
+  void *dst, int length)
 {
uint64_t off;
uint8_t *d;
@@ -60,22 +59,22 @@ vtpci_read_dev_config(struct virtio_hw *hw, size_t offset,
}
 }

-void
-vtpci_write_dev_config(struct virtio_hw *hw, size_t offset,
-   void *src, int length)
+static void
+legacy_write_dev_config(struct virtio_hw *hw, size_t offset,
+   const void *src, int length)
 {
uint64_t off;
-   uint8_t *s;
+   const uint8_t *s;
int size;

off = VIRTIO_PCI_CONFIG(hw) + offset;
for (s = src; length > 0; s += size, off += size, length -= size) {
if (length >= 4) {
size = 4;
-

[dpdk-dev] [PATCH v6 2/9] virtio: define offset as size_t type

2016-01-28 Thread Yuanhan Liu

The offset arg of vtpci_read/write_dev_config() is derived from
offsetof(), which yields a size_t rather than a uint64_t. So, define it
as size_t type.
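
A small sketch of the calling pattern this refers to. The config layout
net_config_sketch and the helper read_dev_config_sketch() are
hypothetical; the point is only that offsetof() produces a size_t, so
the parameter type matches without any conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened device config layout for illustration. */
struct net_config_sketch {
    uint8_t mac[6];
    uint16_t status;
    uint16_t max_virtqueue_pairs;
};

/* Copy length bytes starting at offset out of a config blob. */
static void
read_dev_config_sketch(const uint8_t *cfg, size_t offset,
                       void *dst, int length)
{
    uint8_t *d = dst;
    int i;

    for (i = 0; i < length; i++)
        d[i] = cfg[offset + i];
}
```

A typical call passes offsetof(struct net_config_sketch, status) as the
offset and sizeof(uint16_t) as the length.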

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
 drivers/net/virtio/virtio_pci.c | 4 ++--
 drivers/net/virtio/virtio_pci.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 2245bec..b34b59e 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -38,7 +38,7 @@
 static uint8_t vtpci_get_status(struct virtio_hw *);

 void
-vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
+vtpci_read_dev_config(struct virtio_hw *hw, size_t offset,
void *dst, int length)
 {
uint64_t off;
@@ -61,7 +61,7 @@ vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
 }

 void
-vtpci_write_dev_config(struct virtio_hw *hw, uint64_t offset,
+vtpci_write_dev_config(struct virtio_hw *hw, size_t offset,
void *src, int length)
 {
uint64_t off;
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 47f722a..fe89c21 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -261,9 +261,9 @@ void vtpci_set_status(struct virtio_hw *, uint8_t);

 uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);

-void vtpci_write_dev_config(struct virtio_hw *, uint64_t, void *, int);
+void vtpci_write_dev_config(struct virtio_hw *, size_t, void *, int);

-void vtpci_read_dev_config(struct virtio_hw *, uint64_t, void *, int);
+void vtpci_read_dev_config(struct virtio_hw *, size_t, void *, int);

 uint8_t vtpci_isr(struct virtio_hw *);

-- 
1.9.0

[dpdk-dev] [PATCH v6 1/9] virtio: don't set vring address again at queue startup

2016-01-28 Thread Yuanhan Liu

We have already set it up in virtio_dev_queue_setup(), and a vq
restart will not reset the settings.

Signed-off-by: Yuanhan Liu 
Tested-by: Qian Xu 
Reviewed-by: Tetsuya Mukawa 
Tested-by: Tetsuya Mukawa 
Acked-by: Huawei Xie 
---
 drivers/net/virtio/virtio_rxtx.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 74b39ef..b7267c0 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -339,11 +339,6 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
vq_update_avail_idx(vq);

PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs);
-
-   VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
-   vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
-   vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
} else if (queue_type == VTNET_TQ) {
if (use_simple_rxtx) {
int mid_idx  = vq->vq_nentries >> 1;
@@ -362,16 +357,6 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
for (i = mid_idx; i < vq->vq_nentries; i++)
vq->vq_ring.avail->ring[i] = i;
}
-
-   VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
-   vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
-   vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
-   } else {
-   VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
-   vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
-   vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
}
 }

-- 
1.9.0

[dpdk-dev] [PATCH v6 0/9] virtio 1.0 enabling for virtio pmd driver

2016-01-28 Thread Yuanhan Liu

v6: unfold IO_READ/WRITE_DEF macro

v5: minor fixes:

- fix wrong type of arg "offset" of read/write_dev_config(): patch 2
  is newly added for that.

- check "offset + length" overflow

Almost all of the differences that come with virtio 1.0 are PCI layout
changes: the major configuration structures are stored in bar space, and
their locations are stored in the corresponding pci cap structures.
Reading/parsing them is one of the major tasks of patch 8.

To make virtio v1.0 and v0.95 handling co-exist well, this patch set
introduces a virtio_pci_ops structure, adding another layer so that
we can keep those vtpci_foo_bar "APIs". With that, we can add virtio
1.0 support with minimal changes.


Rough test guide


First, you need a QEMU that supports virtio 1.0 (say, v2.5); then add
the option "disable-modern=false" to the qemu virtio-net-pci device to
enable virtio 1.0 (which is disabled by default).

If you see something like the following from 'lspci -v', virtio 1.0 is
indeed enabled:

00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
Subsystem: Red Hat, Inc Device 0001
Physical Slot: 4
Flags: bus master, fast devsel, latency 0, IRQ 11
I/O ports at c040 [size=64]
Memory at febf1000 (32-bit, non-prefetchable) [size=4K]
Memory at fe00 (64-bit, prefetchable) [size=8M]
Expansion ROM at feb8 [disabled] [size=256K]
Capabilities: [98] MSI-X: Enable+ Count=6 Masked-
==> Capabilities: [84] Vendor Specific Information: Len=14 
==> Capabilities: [70] Vendor Specific Information: Len=14 
==> Capabilities: [60] Vendor Specific Information: Len=10 
==> Capabilities: [50] Vendor Specific Information: Len=10 
==> Capabilities: [40] Vendor Specific Information: Len=10 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

After that, there is nothing special compared to the old virtio
0.95 pmd driver.


---
Yuanhan Liu (9):
  virtio: don't set vring address again at queue startup
  virtio: define offset as size_t type
  virtio: introduce struct virtio_pci_ops
  virtio: move left pci stuff to virtio_pci.c
  viritio: switch to 64 bit features
  virtio: retrieve hdr_size from hw->vtnet_hdr_size
  eal: pci: export pci_[un]map_device
  virtio: add 1.0 support
  virtio: move VIRTIO_READ/WRITE_REG_X into virtio_pci.c

 doc/guides/rel_notes/release_2_3.rst|   3 +
 drivers/net/virtio/virtio_ethdev.c  | 302 +
 drivers/net/virtio/virtio_ethdev.h  |   3 +-
 drivers/net/virtio/virtio_pci.c | 813 +++-
 drivers/net/virtio/virtio_pci.h | 124 +++-
 drivers/net/virtio/virtio_rxtx.c|  21 +-
 drivers/net/virtio/virtio_rxtx_simple.c |  12 +-
 drivers/net/virtio/virtqueue.h  |   4 +-
 lib/librte_eal/bsdapp/eal/eal_pci.c |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   7 +
 lib/librte_eal/common/eal_common_pci.c  |   4 +-
 lib/librte_eal/common/eal_private.h |  18 -
 lib/librte_eal/common/include/rte_pci.h |  27 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   4 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   7 +
 15 files changed, 971 insertions(+), 382 deletions(-)

-- 
1.9.0

[dpdk-dev] [PATCH v6] vfio: Support for no-IOMMU mode

2016-01-28 Thread Thomas Monjalon

2016-01-28 14:16, Burakov, Anatoly:
> Hi Thomas,
> 
> > 2016-01-28 11:57, Anatoly Burakov:
> > > +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 5, 0)
> > 
> > Why not #ifndef VFIO_NOIOMMU_IOMMU?
> > It would avoid some backport issue.
> 
> I don't see how it could. Versions post-4.5 will have VFIO_NOIOMMU_IOMMU, so 
> no issue there. Pre-4.5 versions, whether they do or do not have 
> VFIO_NOIOMMU_IOMMU defined, will have RTE_VFIO_NOIOMMU defined as 8 
> regardless.

Are we sure it will ever be backported as 8?
Anyway I think it's better to avoid version number checks.
What happens if the feature is reverted from 4.5 as it was from 4.4?
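
For illustration, a sketch of the feature-macro check being suggested
here, as opposed to the LINUX_VERSION_CODE test. It assumes the value 8
mentioned earlier in this thread; the <linux/vfio.h> include is left
out on purpose so the sketch builds anywhere, which means it exercises
the fallback branch. noiommu_type() is a hypothetical helper.

```c
/*
 * In real code <linux/vfio.h> would be included first. If the kernel
 * header provides VFIO_NOIOMMU_IOMMU we use it; otherwise we fall back
 * to the value discussed in this thread.
 */
#ifdef VFIO_NOIOMMU_IOMMU
#define RTE_VFIO_NOIOMMU VFIO_NOIOMMU_IOMMU
#else
#define RTE_VFIO_NOIOMMU 8
#endif

/* Hypothetical accessor, only here to make the macro observable. */
static int
noiommu_type(void)
{
    return RTE_VFIO_NOIOMMU;
}
```

The check keys off the header content rather than the kernel version,
so a backport or a revert of the feature changes the result without
touching this code.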

[dpdk-dev] [PATCH] lib/librte_eal: Fix compile issue with gcc 5.3.1

2016-01-28 Thread Michael Qiu

On Fedora 22 with GCC version 5.3.1, compilation
fails with an error:

include/rte_memcpy.h:309:7: error: "RTE_MACHINE_CPUFLAG_AVX2"
is not defined [-Werror=undef]
#elif RTE_MACHINE_CPUFLAG_AVX2
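
A reduced sketch of the pattern, using the macro names from the test
file touched by this patch. With -Wundef, "#elif NAME" warns when NAME
is not defined because the preprocessor has to evaluate it as 0;
"#elif defined NAME" only tests for existence. alignment_unit() is a
hypothetical helper added to make the result observable.

```c
/* Neither CPU flag macro is defined when this sketch is built alone. */
#ifdef RTE_MACHINE_CPUFLAG_AVX512F
#define ALIGNMENT_UNIT 64
#elif defined RTE_MACHINE_CPUFLAG_AVX2 /* no -Wundef warning here */
#define ALIGNMENT_UNIT 32
#else
#define ALIGNMENT_UNIT 16
#endif

static unsigned int
alignment_unit(void)
{
    return ALIGNMENT_UNIT;
}
```

Built without either macro defined, the sketch takes the final #else
branch, and it compiles cleanly with -Wundef -Werror.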

Fixes: 9484092baad3 ("eal/x86: optimize memcpy for AVX512 platforms")

Signed-off-by: Michael Qiu 
---
 app/test/test_memcpy_perf.c | 2 +-
 lib/librte_eal/common/include/arch/x86/rte_memcpy.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/app/test/test_memcpy_perf.c b/app/test/test_memcpy_perf.c
index 73babec..f150d8d 100644
--- a/app/test/test_memcpy_perf.c
+++ b/app/test/test_memcpy_perf.c
@@ -81,7 +81,7 @@ static size_t buf_sizes[TEST_VALUE_RANGE];
 /* Data is aligned on this many bytes (power of 2) */
 #ifdef RTE_MACHINE_CPUFLAG_AVX512F
 #define ALIGNMENT_UNIT  64
-#elif RTE_MACHINE_CPUFLAG_AVX2
+#elif defined RTE_MACHINE_CPUFLAG_AVX2
 #define ALIGNMENT_UNIT  32
 #else /* RTE_MACHINE_CPUFLAG */
 #define ALIGNMENT_UNIT  16
diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
index d965957..8e2c53c 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
@@ -306,7 +306,7 @@ COPY_BLOCK_128_BACK63:
goto COPY_BLOCK_128_BACK63;
 }

-#elif RTE_MACHINE_CPUFLAG_AVX2
+#elif defined RTE_MACHINE_CPUFLAG_AVX2

 /**
  * AVX2 implementation below
-- 
1.9.3

[dpdk-dev] [PATCH v2 4/4] app/test-pmd: test tunnel filter for IP in GRE

2016-01-28 Thread Xutao Sun

This patch adds some options to the tunnel_filter command to test IP in GRE
packet classification on i40e.

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c | 36 
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6084449..ad09a4a 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -301,12 +301,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

-   "tunnel_filter add (port_id) (outer_mac) (inner_mac) 
(ip_addr) "
-   "(inner_vlan) (vxlan|nvgre) (filter_type) (tenant_id) 
(queue_id)\n"
+   "tunnel_filter add (port_id) (outer_ip) (inner_ip) 
(outer_mac)"
+   "(inner_mac) (ip_addr) (inner_vlan) 
(vxlan|nvgre|ipingre) (filter_type)"
+   "(tenant_id) (queue_id)\n"
"   add a tunnel filter of a port.\n\n"

-   "tunnel_filter rm (port_id) (outer_mac) (inner_mac) 
(ip_addr) "
-   "(inner_vlan) (vxlan|nvgre) (filter_type) (tenant_id) 
(queue_id)\n"
+   "tunnel_filter rm (port_id) (outer_ip) (inner_ip) 
(outer_mac)"
+   "(inner_mac) (ip_addr) (inner_vlan) 
(vxlan|nvgre|ipingre) (filter_type)"
+   "(tenant_id) (queue_id)\n"
"   remove a tunnel filter of a port.\n\n"

"rx_vxlan_port add (udp_port) (port_id)\n"
@@ -6640,6 +6642,8 @@ cmd_tunnel_filter_parsed(void *parsed_result,
struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
int ret = 0;

+   memset(&tunnel_filter_conf, 0, sizeof(tunnel_filter_conf));
+
(void)rte_memcpy(&tunnel_filter_conf.outer_mac, &res->outer_mac,
ETHER_ADDR_LEN);
(void)rte_memcpy(&tunnel_filter_conf.inner_mac, &res->inner_mac,
@@ -6648,12 +6652,14 @@ cmd_tunnel_filter_parsed(void *parsed_result,

if (res->ip_value.family == AF_INET) {
tunnel_filter_conf.ip_addr.ipv4_addr =
-   res->ip_value.addr.ipv4.s_addr;
+   rte_be_to_cpu_32(res->ip_value.addr.ipv4.s_addr);
tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV4;
} else {
-   memcpy(&(tunnel_filter_conf.ip_addr.ipv6_addr),
-   &(res->ip_value.addr.ipv6),
-   sizeof(struct in6_addr));
+   int i;
+   for (i = 0; i < 4; i++) {
+   tunnel_filter_conf.ip_addr.ipv6_addr[i] =
+   rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
+   }
tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV6;
}

@@ -6669,6 +6675,10 @@ cmd_tunnel_filter_parsed(void *parsed_result,
else if (!strcmp(res->filter_type, "omac-imac-tenid"))
tunnel_filter_conf.filter_type =
RTE_TUNNEL_FILTER_OMAC_TENID_IMAC;
+   else if (!strcmp(res->filter_type, "oip"))
+   tunnel_filter_conf.filter_type = ETH_TUNNEL_FILTER_OIP;
+   else if (!strcmp(res->filter_type, "iip"))
+   tunnel_filter_conf.filter_type = ETH_TUNNEL_FILTER_IIP;
else {
printf("The filter type is not supported");
return;
@@ -6678,6 +6688,8 @@ cmd_tunnel_filter_parsed(void *parsed_result,
tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_VXLAN;
else if (!strcmp(res->tunnel_type, "nvgre"))
tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_NVGRE;
+   else if (!strcmp(res->tunnel_type, "ipingre"))
+   tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_IP_IN_GRE;
else {
printf("The tunnel type %s not supported.\n", res->tunnel_type);
return;
@@ -6723,11 +6735,11 @@ cmdline_parse_token_ipaddr_t cmd_tunnel_filter_ip_value 
=
ip_value);
 cmdline_parse_token_string_t cmd_tunnel_filter_tunnel_type =
TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
-   tunnel_type, "vxlan#nvgre");
+   tunnel_type, "vxlan#nvgre#ipingre");

 cmdline_parse_token_string_t cmd_tunnel_filter_filter_type =
TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
-   filter_type, "imac-ivlan#imac-ivlan-tenid#imac-tenid#"
+   filter_type, "oip#iip#imac-ivlan#imac-ivlan-tenid#imac-tenid#"
"imac#omac-imac-tenid");
 cmdline_parse_token_num_t cmd_tunnel_filter_tenant_id =
TOKEN_NUM_INITIALIZER(struct cmd_tunnel_filter_result,
@@ -6741,8 +6753,8 @@ cmdline_parse_inst_t cmd_tunnel_filter = {
.data = (void *)0,
.help_str = "add/rm tunnel filter of a port: "
"tunnel_filter add port_id outer_mac inner_mac ip "
-

[dpdk-dev] [PATCH v2 3/4] driver/i40e: implement tunnel filter for IP in GRE

2016-01-28 Thread Xutao Sun

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 drivers/net/i40e/i40e_ethdev.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 1dd1077..5c0eff9 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -5797,6 +5797,12 @@ i40e_dev_get_filter_type(uint16_t filter_type, uint16_t 
*flag)
case ETH_TUNNEL_FILTER_IMAC:
*flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC;
break;
+   case ETH_TUNNEL_FILTER_OIP:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_OIP;
+   break;
+   case ETH_TUNNEL_FILTER_IIP:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IIP;
+   break;
default:
PMD_DRV_LOG(ERR, "invalid tunnel filter type");
return -EINVAL;
@@ -5811,7 +5817,7 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
uint8_t add)
 {
uint16_t ip_type;
-   uint8_t tun_type = 0;
+   uint8_t i, tun_type = 0;
int val, ret = 0;
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
struct i40e_vsi *vsi = pf->main_vsi;
@@ -5833,16 +5839,22 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
(void)rte_memcpy(&pfilter->inner_mac, &tunnel_filter->inner_mac,
ETHER_ADDR_LEN);

-   pfilter->inner_vlan = tunnel_filter->inner_vlan;
+   pfilter->inner_vlan = rte_cpu_to_le_16(tunnel_filter->inner_vlan);
if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV4;
+   tunnel_filter->ip_addr.ipv4_addr =
+   rte_cpu_to_le_32(tunnel_filter->ip_addr.ipv4_addr);
(void)rte_memcpy(&pfilter->ipaddr.v4.data,
-   &tunnel_filter->ip_addr,
+   &tunnel_filter->ip_addr.ipv4_addr,
sizeof(pfilter->ipaddr.v4.data));
} else {
ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV6;
+   for (i = 0; i < 4; i++) {
+   tunnel_filter->ip_addr.ipv6_addr[i] =
+   rte_cpu_to_le_32(tunnel_filter->ip_addr.ipv6_addr[i]);
+   }
(void)rte_memcpy(&pfilter->ipaddr.v6.data,
-   &tunnel_filter->ip_addr,
+   &tunnel_filter->ip_addr.ipv6_addr,
sizeof(pfilter->ipaddr.v6.data));
}

@@ -5854,6 +5866,9 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
case RTE_TUNNEL_TYPE_NVGRE:
tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_NVGRE_OMAC;
break;
+   case RTE_TUNNEL_TYPE_IP_IN_GRE:
+   tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_IP;
+   break;
default:
/* Other tunnel types is not supported. */
PMD_DRV_LOG(ERR, "tunnel type is not supported.");
@@ -5868,10 +5883,11 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
return -EINVAL;
}

-   pfilter->flags |= I40E_AQC_ADD_CLOUD_FLAGS_TO_QUEUE | ip_type |
-   (tun_type << I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT);
-   pfilter->tenant_id = tunnel_filter->tenant_id;
-   pfilter->queue_number = tunnel_filter->queue_id;
+   pfilter->flags |= rte_cpu_to_le_16(
+   I40E_AQC_ADD_CLOUD_FLAGS_TO_QUEUE
+   | ip_type | (tun_type << I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT));
+   pfilter->tenant_id = rte_cpu_to_le_32(tunnel_filter->tenant_id);
+   pfilter->queue_number = rte_cpu_to_le_16(tunnel_filter->queue_id);

if (add)
ret = i40e_aq_add_cloud_filters(hw, vsi->seid, cld_filter, 1);
-- 
1.9.3
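
The byte-order handling in this series (network order to host order in test-pmd, then host order to little-endian for the admin queue fields) can be sketched in portable C. The patch itself uses rte_be_to_cpu_32() and rte_cpu_to_le_32(); the demo_* helpers below are hand-written stand-ins, assumed here only so the example builds without DPDK headers.

```c
#include <stdint.h>
#include <string.h>

/* Interpret a 32-bit value whose bytes are stored in big-endian order. */
static uint32_t demo_be32_to_cpu(uint32_t be)
{
    uint8_t b[4];

    memcpy(b, &be, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Store a host-order value so that its bytes are in little-endian order. */
static uint32_t demo_cpu_to_le32(uint32_t host)
{
    uint8_t b[4];
    uint32_t le;

    b[0] = (uint8_t)(host & 0xff);
    b[1] = (uint8_t)((host >> 8) & 0xff);
    b[2] = (uint8_t)((host >> 16) & 0xff);
    b[3] = (uint8_t)((host >> 24) & 0xff);
    memcpy(&le, b, sizeof(le));
    return le;
}

/* Convert the four network-order words of an IPv6 address to host order. */
static void demo_ipv6_be_to_cpu(const uint8_t addr[16], uint32_t out[4])
{
    int i;

    for (i = 0; i < 4; i++) {
        uint32_t word;

        memcpy(&word, addr + 4 * i, sizeof(word));
        out[i] = demo_be32_to_cpu(word);
    }
}
```

Working on bytes rather than on the host representation keeps the helpers correct on both little-endian and big-endian machines.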

[dpdk-dev] [PATCH v2 2/4] lib/ether: add IP in GRE type

2016-01-28 Thread Xutao Sun

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 lib/librte_ether/rte_eth_ctrl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 30cbde7..0e948a1 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -244,6 +244,7 @@ enum rte_eth_tunnel_type {
RTE_TUNNEL_TYPE_GENEVE,
RTE_TUNNEL_TYPE_TEREDO,
RTE_TUNNEL_TYPE_NVGRE,
+   RTE_TUNNEL_TYPE_IP_IN_GRE,
RTE_TUNNEL_TYPE_MAX,
 };

-- 
1.9.3

[dpdk-dev] [PATCH v2 1/4] lib/ether: optimize the 'rte_eth_tunnel_filter_conf' structure

2016-01-28 Thread Xutao Sun

Change the outer_mac and inner_mac fields from pointers to structs in order to 
keep the code readable.

Signed-off-by: Xutao Sun 
Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c  |  6 --
 drivers/net/i40e/i40e_ethdev.c  | 12 ++--
 lib/librte_ether/rte_eth_ctrl.h |  4 ++--
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6d28c1b..6084449 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -6640,8 +6640,10 @@ cmd_tunnel_filter_parsed(void *parsed_result,
struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
int ret = 0;

-   tunnel_filter_conf.outer_mac = &res->outer_mac;
-   tunnel_filter_conf.inner_mac = &res->inner_mac;
+   (void)rte_memcpy(&tunnel_filter_conf.outer_mac, &res->outer_mac,
+   ETHER_ADDR_LEN);
+   (void)rte_memcpy(&tunnel_filter_conf.inner_mac, &res->inner_mac,
+   ETHER_ADDR_LEN);
tunnel_filter_conf.inner_vlan = res->inner_vlan;

if (res->ip_value.family == AF_INET) {
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..1dd1077 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -5828,10 +5828,10 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
}
pfilter = cld_filter;

-   (void)rte_memcpy(&pfilter->outer_mac, tunnel_filter->outer_mac,
-   sizeof(struct ether_addr));
-   (void)rte_memcpy(&pfilter->inner_mac, tunnel_filter->inner_mac,
-   sizeof(struct ether_addr));
+   (void)rte_memcpy(&pfilter->outer_mac, &tunnel_filter->outer_mac,
+   ETHER_ADDR_LEN);
+   (void)rte_memcpy(&pfilter->inner_mac, &tunnel_filter->inner_mac,
+   ETHER_ADDR_LEN);

pfilter->inner_vlan = tunnel_filter->inner_vlan;
if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
@@ -6131,13 +6131,13 @@ i40e_tunnel_filter_param_check(struct i40e_pf *pf,
}

if ((filter->filter_type & ETH_TUNNEL_FILTER_OMAC) &&
-   (is_zero_ether_addr(filter->outer_mac))) {
+   (is_zero_ether_addr(&filter->outer_mac))) {
PMD_DRV_LOG(ERR, "Cannot add NULL outer MAC address");
return -EINVAL;
}

if ((filter->filter_type & ETH_TUNNEL_FILTER_IMAC) &&
-   (is_zero_ether_addr(filter->inner_mac))) {
+   (is_zero_ether_addr(&filter->inner_mac))) {
PMD_DRV_LOG(ERR, "Cannot add NULL inner MAC address");
return -EINVAL;
}
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index ce224ad..30cbde7 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -280,8 +280,8 @@ enum rte_tunnel_iptype {
  * Tunneling Packet filter configuration.
  */
 struct rte_eth_tunnel_filter_conf {
-   struct ether_addr *outer_mac;  /**< Outer MAC address filter. */
-   struct ether_addr *inner_mac;  /**< Inner MAC address filter. */
+   struct ether_addr outer_mac;  /**< Outer MAC address filter. */
+   struct ether_addr inner_mac;  /**< Inner MAC address filter. */
uint16_t inner_vlan;   /**< Inner VLAN filter. */
enum rte_tunnel_iptype ip_type; /**< IP address type. */
union {
-- 
1.9.3
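
The effect of embedding the MAC addresses instead of pointing at them can be illustrated with a small stand-alone sketch. All demo_* types and functions are hypothetical simplifications and not the DPDK definitions; the point is only that the configuration owns its own copy, so it can be zeroed and filled without depending on the lifetime of the parsed command result.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_ETHER_ADDR_LEN 6

struct demo_ether_addr {
    uint8_t addr_bytes[DEMO_ETHER_ADDR_LEN];
};

/* The addresses are embedded, so the structure owns its own copies. */
struct demo_tunnel_filter_conf {
    struct demo_ether_addr outer_mac;
    struct demo_ether_addr inner_mac;
};

/* Return 1 if every byte of the address is zero, 0 otherwise. */
static int demo_is_zero_ether_addr(const struct demo_ether_addr *ea)
{
    int i;

    for (i = 0; i < DEMO_ETHER_ADDR_LEN; i++)
        if (ea->addr_bytes[i] != 0)
            return 0;
    return 1;
}

/* Copy a parsed MAC address into the embedded outer_mac field. */
static void demo_set_outer_mac(struct demo_tunnel_filter_conf *conf,
                               const struct demo_ether_addr *mac)
{
    memcpy(&conf->outer_mac, mac, DEMO_ETHER_ADDR_LEN);
}
```

Callers now pass the address of the embedded field, which is why the zero-address checks in the driver gain an address-of operator.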

[dpdk-dev] [PATCH v2 0/4] Add tunnel filter support for IP in GRE on i40e

2016-01-28 Thread Xutao Sun

This patch set adds tunnel filter support for IP in GRE on i40e.

v2 changes:
  Fix the byte order problem.

Xutao Sun (4):
  change the 'rte_eth_tunnel_filter_conf' structure
  add IP in GRE type in the enum 'rte_eth_tunnel_type'
  implement cloud filter for ip in GRE on i40e
  test tunnel filter for IP in GRE

 app/test-pmd/cmdline.c  | 42 ++-
 drivers/net/i40e/i40e_ethdev.c  | 44 -
 lib/librte_ether/rte_eth_ctrl.h |  5 +++--
 3 files changed, 61 insertions(+), 30 deletions(-)

-- 
1.9.3

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-28 Thread Tetsuya Mukawa

On 2016/01/28 15:15, Xie, Huawei wrote:
> On 1/28/2016 10:48 AM, Tetsuya Mukawa wrote:
>> I measured it, and seems it takes 0.35 seconds in my environment.
>> This will be done only once when the port is initialized. Probably it's
>> not so heady.
> There are 256 x 32 loop of pci scan. That is too long if we dynamically
> start/tear down the container, otherwise it is ok. Some people are
> struggling reducing the VM booting time from seconds to milliseconds to
> compete with container technology. Let us consider if we could optimize
> this.
> For example, QEMU supports specifying bus/dev for a device in its
> commandline, so could we assign fixed bus for virtio-net and ivshm
> device? And for piix3, is it on bus 0/1?
>

OK, I understand the necessity. Let's consider it.
So far, users don't need to specify a PCI address on the QEMU command
line or in the DPDK vdev option.
But let's change this; then we can remove this loop.

Specifying the PCI address in the vdev option probably won't be mandatory.
If it is not specified, a default value will be used.
I will fix this as described above in the next release.

Tetsuya
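
For scale, the figures quoted in this thread imply the following per-probe cost. This is only arithmetic on the numbers given above (a 256 x 32 scan taking about 0.35 seconds); the function names are hypothetical.

```c
/* Number of bus/device slots visited by a 256 x 32 scan. */
static unsigned int demo_pci_scan_slots(void)
{
    return 256u * 32u;
}

/* Average time per slot in microseconds for a given total in seconds. */
static double demo_usec_per_slot(double total_seconds)
{
    return total_seconds * 1e6 / demo_pci_scan_slots();
}
```

With a total of 0.35 seconds spread over 8192 slots, this works out to roughly 43 microseconds per slot, which is the cost a fixed bus/device assignment would avoid.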

[dpdk-dev] [PATCH v6] vfio: Support for no-IOMMU mode

2016-01-28 Thread Burakov, Anatoly

> > Hi Thomas,
> >
> > > 2016-01-28 11:57, Anatoly Burakov:
> > > > +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 5, 0)
> > >
> > > Why not #ifndef VFIO_NOIOMMU_IOMMU?
> > > It would avoid some backport issue.
> >
> > I don't see how it could. Versions post-4.5 will have
> VFIO_NOIOMMU_IOMMU, so no issue there. Pre-4.5 versions, whether
> they do or do not have VFIO_NOIOMMU_IOMMU defined, will have
> RTE_VFIO_NOIOMMU defined as 8 regardless.
> 
> Are we sure it will ever be backported as 8?
> Anyway I think it's better to avoid version number checks.

Is there a precedent of kernel API definitions ever changing in backports? 
Presumably whoever backports the changes is interested in making them as 
compatible as possible, so I believe it's a safe bet to make. I have no strong 
opinion for or against this way of doing things, but if we're taking issue with 
kernel version checks, we probably should also adapt all the other stuff in the 
eal_vfio.h that does things in the exact same manner.

> What happens if the feature is reverted from 4.5 as it was from 4.4?

Well then we have to wait until NOIOMMU makes it into official kernel before 
applying this patch. There's nothing we can do about that. If the patch gets 
reverted, then defining NOIOMMU as 8 will be wrong regardless of whether 
there's a kernel version check.

Thanks,
Anatoly
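
For reference, a sketch of what the version check under discussion compares. It assumes the classic encoding of the kernel's KERNEL_VERSION macro (major, minor, and patch level packed into one integer); DEMO_KERNEL_VERSION is a local stand-in so the example builds without kernel headers.

```c
/* Local stand-in mirroring the classic KERNEL_VERSION(a, b, c) encoding. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Return 1 if the given version code predates 4.5.0, 0 otherwise. */
static int demo_needs_noiommu_fallback(int version_code)
{
    return version_code < DEMO_KERNEL_VERSION(4, 5, 0);
}
```

A check of this kind only looks at the version number, so it cannot tell whether a distribution kernel has backported or reverted the feature.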

[dpdk-dev] [PATCH v6] vfio: Support for no-IOMMU mode

2016-01-28 Thread Thomas Monjalon

2016-01-28 11:57, Anatoly Burakov:
> +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 5, 0)

Why not #ifndef VFIO_NOIOMMU_IOMMU?
It would avoid some backport issue.

> +#define RTE_VFIO_NOIOMMU 8
> +#else
> +#define RTE_VFIO_NOIOMMU VFIO_NOIOMMU_IOMMU
> +#endif
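
The #ifndef alternative suggested above could look roughly like this. It is a sketch, not the submitted patch: DEMO_VFIO_NOIOMMU is a hypothetical name, and no kernel header is included here, so the fallback branch is the one that gets compiled.

```c
/*
 * Feature test instead of a version test: use the kernel's value when the
 * header provides it, otherwise fall back to 8 (the value used in the
 * patch under review).
 */
#ifdef VFIO_NOIOMMU_IOMMU
#define DEMO_VFIO_NOIOMMU VFIO_NOIOMMU_IOMMU
#else
#define DEMO_VFIO_NOIOMMU 8
#endif

/* Return the IOMMU type constant selected at compile time. */
static int demo_vfio_noiommu_type(void)
{
    return DEMO_VFIO_NOIOMMU;
}
```

The trade-off is the one debated in this thread: the feature test follows whatever the installed headers define, while the version test hard-codes which kernels are assumed to lack the definition.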

[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules

2016-01-28 Thread Thomas Monjalon

2016-01-28 12:17, Kamil Rytarowski:
> 
> W dniu 26.01.2016 o 16:23, Thomas Monjalon pisze:
> > 2016-01-20 10:48, krytarowski at caviumnetworks.com:
> >> --- a/tools/dpdk_nic_bind.py
> >> +++ b/tools/dpdk_nic_bind.py
> >> -for line in loaded_mods:
> >> +try:
> >> +# Get list of syfs modules, some of them might be builtin and 
> >> merge with mods
> > Please could you explain this comment?
> > Is it remaining from previous versions of the patch?
> 
> Yes. It might be changed to:
> # Get list of sysfs modules (both built-in and dynamically loaded)

OK

> > [...]
> >> +# special case for vfio_pci (module is named vfio-pci,
> >> +# but its .ko is named vfio_pci)
> > Isn't it common to have dash replaced by underscore for kernel modules?
> >
> 
> I retained the logic for special case of vfio-pci. At the moment 
> (according to my knowledge) there are no other DPDK modules with this 
> name replacement.
> 
> I checked few example Linux modules and if a module is named with dash, 
> it's being replaced to underscore. The modprobe(8) tool can accept both 
> names as interchangeable (with dash and underscore).
> 
> Would you like to make it a general rule and replace all dashes with 
> underscores?

I don't know. Do what you think is best.
Thanks
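
If dashes were replaced with underscores as a general rule, the normalization could be sketched as follows. The discussion above concerns the Python script; this C version only illustrates the rule, and demo_normalize_module_name is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Copy a module name into dst, replacing every '-' with '_'.
 * Returns 0 on success, -1 if dst is too small or an argument is NULL.
 */
static int demo_normalize_module_name(const char *name, char *dst, size_t size)
{
    size_t i;

    if (name == NULL || dst == NULL || size == 0)
        return -1;

    for (i = 0; name[i] != '\0'; i++) {
        /* Leave room for the terminating NUL byte. */
        if (i + 1 >= size)
            return -1;
        dst[i] = (name[i] == '-') ? '_' : name[i];
    }
    dst[i] = '\0';
    return 0;
}
```

Normalizing both the requested driver name and the names found under sysfs before comparing them would make the vfio-pci special case unnecessary.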

[dpdk-dev] [PATCH] lib: remove "extern" keyword for functions from header files

2016-01-28 Thread Thomas Monjalon

2016-01-28 10:11, Ferruh Yigit:
> On Wed, Jan 27, 2016 at 07:05:52PM +0100, Thomas Monjalon wrote:
> > 2016-01-25 10:01, Ferruh Yigit:
> > > Remove "extern" keywords in header files, the ones for function
> > > prototypes
> > 
> > I've seen a lot of other extern keywords. Why not removing all?
> > 
> Remaining one are Linux drivers in KNI, they are kind of internal headers, I 
> doubt on touching them.
> Should I remove them all?
> Also there are more usage in "drivers" folder, I am not sure touching them 
> too, what do you comment?
> 
> > > -extern int rte_eth_dev_configure(uint8_t port_id,
> > > -  uint16_t nb_rx_queue,
> > > -  uint16_t nb_tx_queue,
> > > -  const struct rte_eth_conf *eth_conf);
> > > +int rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_queue,
> > > +   uint16_t nb_tx_queue,
> > > +   const struct rte_eth_conf *eth_conf);
> > 
> > The indent is weird.
> > Why not follow the guideline with 2 tabs?
> > http://dpdk.org/doc/guides-2.2/contributing/coding_style.html#c-indentation
> 
> Intentionally kept them as original, to scope the patch just to remove a 
> keyword.
> Do you want me fix the syntax wherever I touch for this patch?

Syntax? Do you mean to fix the indent?
Yes I think it is a good practice to fix the indent when modifying some code.
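
As background for why the keyword can be dropped: in C, a function declaration at file scope without a storage-class specifier behaves as if it were declared extern, so both forms below declare a function with external linkage. The names are hypothetical.

```c
/* With the keyword: external linkage stated explicitly. */
extern int demo_add(int a, int b);

/* Without the keyword: a function declaration has external linkage anyway. */
int demo_sub(int a, int b);

int demo_add(int a, int b)
{
    return a + b;
}

int demo_sub(int a, int b)
{
    return a - b;
}
```

This applies to functions only. For variables such as rte_eth_devices[], extern is what distinguishes a declaration from a definition, so it has to stay.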

[dpdk-dev] [PATCH] config: add default linux configuration

2016-01-28 Thread Bernard Iremonger

add config/defconfig_x86_64-default-linuxapp-gcc file.

Signed-off-by: Bernard Iremonger 
---
 config/defconfig_x86_64-default-linuxapp-gcc | 42 
 1 file changed, 42 insertions(+)
 create mode 100644 config/defconfig_x86_64-default-linuxapp-gcc

diff --git a/config/defconfig_x86_64-default-linuxapp-gcc 
b/config/defconfig_x86_64-default-linuxapp-gcc
new file mode 100644
index 000..d1b9dbf
--- /dev/null
+++ b/config/defconfig_x86_64-default-linuxapp-gcc
@@ -0,0 +1,42 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="default"
+
+CONFIG_RTE_ARCH="x86_64"
+CONFIG_RTE_ARCH_X86_64=y
+CONFIG_RTE_ARCH_64=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
-- 
2.6.3

[dpdk-dev] [PATCH v2] lib: remove "extern" keyword for functions from header files

2016-01-28 Thread Ferruh Yigit

Remove "extern" keywords in header files, the ones for function
prototypes

v2:
* fix indentation

Signed-off-by: Ferruh Yigit 
---
 lib/librte_eal/common/include/rte_memory.h |   2 +-
 lib/librte_ether/rte_ethdev.h  | 133 ++---
 lib/librte_kni/rte_kni.h   |  30 +++
 3 files changed, 78 insertions(+), 87 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h 
b/lib/librte_eal/common/include/rte_memory.h
index 9c9e40f..587a25d 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -184,7 +184,7 @@ unsigned rte_memory_get_nrank(void);
 #ifdef RTE_LIBRTE_XEN_DOM0

 /**< Internal use only - should DOM0 memory mapping be used */
-extern int rte_xen_dom0_supported(void);
+int rte_xen_dom0_supported(void);

 /**< Internal use only - phys to virt mapping for xen */
 phys_addr_t rte_xen_mem_phy2mch(uint32_t, const phys_addr_t);
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index bada8ad..8710dd7 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1637,7 +1637,7 @@ extern struct rte_eth_dev rte_eth_devices[];
  * @return
  *   - The total number of usable Ethernet devices.
  */
-extern uint8_t rte_eth_dev_count(void);
+uint8_t rte_eth_dev_count(void);

 /**
  * @internal
@@ -1648,7 +1648,7 @@ extern uint8_t rte_eth_dev_count(void);
  * @return
  *   - The pointer to the ethdev slot, on success. NULL on error
  */
-extern struct rte_eth_dev *rte_eth_dev_allocated(const char *name);
+struct rte_eth_dev *rte_eth_dev_allocated(const char *name);

 /**
  * @internal
@@ -1784,7 +1784,7 @@ struct eth_driver {
  *   The pointer to the *eth_driver* structure associated with
  *   the Ethernet driver.
  */
-extern void rte_eth_driver_register(struct eth_driver *eth_drv);
+void rte_eth_driver_register(struct eth_driver *eth_drv);

 /**
  * Configure an Ethernet device.
@@ -1815,10 +1815,8 @@ extern void rte_eth_driver_register(struct eth_driver 
*eth_drv);
  *   - 0: Success, device configured.
  *   - <0: Error code returned by the driver configuration function.
  */
-extern int rte_eth_dev_configure(uint8_t port_id,
-uint16_t nb_rx_queue,
-uint16_t nb_tx_queue,
-const struct rte_eth_conf *eth_conf);
+int rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_queue,
+   uint16_t nb_tx_queue, const struct rte_eth_conf *eth_conf);

 /**
  * Allocate and set up a receive queue for an Ethernet device.
@@ -1859,10 +1857,10 @@ extern int rte_eth_dev_configure(uint8_t port_id,
  *  allocate network memory buffers from the memory pool when
  *  initializing receive descriptors.
  */
-extern int rte_eth_rx_queue_setup(uint8_t port_id, uint16_t rx_queue_id,
- uint16_t nb_rx_desc, unsigned int socket_id,
- const struct rte_eth_rxconf *rx_conf,
- struct rte_mempool *mb_pool);
+int rte_eth_rx_queue_setup(uint8_t port_id, uint16_t rx_queue_id,
+   uint16_t nb_rx_desc, unsigned int socket_id,
+   const struct rte_eth_rxconf *rx_conf,
+   struct rte_mempool *mb_pool);

 /**
  * Allocate and set up a transmit queue for an Ethernet device.
@@ -1907,9 +1905,9 @@ extern int rte_eth_rx_queue_setup(uint8_t port_id, 
uint16_t rx_queue_id,
  *   - 0: Success, the transmit queue is correctly set up.
  *   - -ENOMEM: Unable to allocate the transmit ring descriptors.
  */
-extern int rte_eth_tx_queue_setup(uint8_t port_id, uint16_t tx_queue_id,
- uint16_t nb_tx_desc, unsigned int socket_id,
- const struct rte_eth_txconf *tx_conf);
+int rte_eth_tx_queue_setup(uint8_t port_id, uint16_t tx_queue_id,
+   uint16_t nb_tx_desc, unsigned int socket_id,
+   const struct rte_eth_txconf *tx_conf);

 /*
  * Return the NUMA socket to which an Ethernet device is connected
@@ -1921,7 +1919,7 @@ extern int rte_eth_tx_queue_setup(uint8_t port_id, 
uint16_t tx_queue_id,
  *   a default of zero if the socket could not be determined.
  *   -1 is returned is the port_id value is out of range.
  */
-extern int rte_eth_dev_socket_id(uint8_t port_id);
+int rte_eth_dev_socket_id(uint8_t port_id);

 /*
  * Check if port_id of device is attached
@@ -1932,7 +1930,7 @@ extern int rte_eth_dev_socket_id(uint8_t port_id);
  *   - 0 if port is out of range or not attached
  *   - 1 if device is attached
  */
-extern int rte_eth_dev_is_valid_port(uint8_t port_id);
+int rte_eth_dev_is_valid_port(uint8_t port_id);

 /*
  * Allocate mbuf from mempool, setup the DMA physical address
@@ -1950,7 +1948,7 @@ extern int rte_eth_dev_is_valid_port(uint8_t port_id);
  *   - -EINVAL: The port_id or the queue_id out of range.
  *   - -ENOTSUP: The function not supported in PMD driver.
  */

[dpdk-dev] [PATCH v5] vfio: Support for no-IOMMU mode

2016-01-28 Thread Thomas Monjalon

2016-01-28 10:03, Burakov, Anatoly:
> > 2016-01-27 16:50, Anatoly Burakov:
> > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
> > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
> > > +/* older kernels may not have no-IOMMU mode */
> > > +#ifndef VFIO_NOIOMMU_IOMMU
> > > +#define VFIO_NOIOMMU_IOMMU 8
> > > +#endif
> > 
> > Shouldn't it be defined privately in .c file?
> 
> We already have other VFIO-related definitions in that file, specifically the 
> PCI defines that aren't present in earlier kernels. This definition is 
> similar in nature - it will be present in kernels starting from 4.5 (when 
> NOIOMMU was introduced), but earlier kernels will need this defined. I didn't 
> want to go similar route with redefining everything VFIO-related, but maybe 
> it makes sense in this case for consistency's sake? E.g.
> 
> #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU [we're already in an ifdef linux >= 
> 3.6, so define type1 unconditionally]
> #if linux < 4.5
> #define RTE_VFIO_NOIOMMU 8
> #else
> #define RTE_VFIO_NOIOMMU VFIO_NOIOMMU_IOMMU
> #endif
> 
> Or something like that?

OK you can keep it as is or define a RTE constant. Up to you.

[dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface library

2016-01-28 Thread Yigit, Ferruh

On Thu, Jan 28, 2016 at 01:57:04PM +, Ananyev, Konstantin wrote:
> Hi Ferruh,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ferruh Yigit
> > Sent: Thursday, January 28, 2016 1:15 PM
> > To: Horton, Remy
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface 
> > library
> > 
> > On Thu, Jan 28, 2016 at 11:14:47AM +, Remy Horton wrote:
> > > On 27/01/2016 16:24, Ferruh Yigit wrote:
> > >
> > > > +   default:
> > > > +   ret = -95 /* EOPNOTSUPP */;
> > > > +   break;
> > >
> > > Is this intentional? -EOPNOTSUPP is -122 (-95 is -ENOTSOCK)..
> > >
> > Return value is not significant, callee just checks for negative value,
> > I can remove comment to prevent confusion.
> 
> Please use values defined in errno.h, there are plenty of them,
> no need to invent your own error codes.
OK

> Also pls don't forget to address all comments I gave you offline.
Yes, I also remembered your comment when I saw this :) It seems this one was missed.
I will address it in the next revision.

Thanks,
ferruh
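
Using the symbolic constant from errno.h, as requested above, might look like the sketch below. The command identifiers and the function name are hypothetical and are not taken from the rte_ctrl_if patch.

```c
#include <errno.h>

/* Hypothetical command identifiers, for illustration only. */
enum demo_cmd {
    DEMO_CMD_GET = 1,
    DEMO_CMD_SET = 2
};

/* Return 0 for known commands, a negative errno value otherwise. */
static int demo_process_cmd(int cmd)
{
    int ret;

    switch (cmd) {
    case DEMO_CMD_GET:
    case DEMO_CMD_SET:
        ret = 0;
        break;
    default:
        ret = -EOPNOTSUPP;  /* symbolic name instead of a magic number */
        break;
    }
    return ret;
}
```

The numeric value behind EOPNOTSUPP differs between platforms, which is another reason to prefer the name over a literal.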

[dpdk-dev] [PATCH v6] vfio: Support for no-IOMMU mode

2016-01-28 Thread Burakov, Anatoly

Hi Thomas,

> 2016-01-28 11:57, Anatoly Burakov:
> > +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 5, 0)
> 
> Why not #ifndef VFIO_NOIOMMU_IOMMU?
> It would avoid some backport issue.

I don't see how it could. Versions post-4.5 will have VFIO_NOIOMMU_IOMMU, so no 
issue there. Pre-4.5 versions, whether they do or do not have 
VFIO_NOIOMMU_IOMMU defined, will have RTE_VFIO_NOIOMMU defined as 8 regardless.

Thanks,
Anatoly

[dpdk-dev] [PATCH] eal: fix compile error in eal_timer.c caused by hpet

2016-01-28 Thread 卢毅

Fix the compile error that occurs when CONFIG_RTE_LIBEAL_USE_HPET is enabled.

Error messages:
/root/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_timer.c: In function 
'rte_eal_hpet_init':
/root/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_timer.c:222:2: error: implicit 
declaration of function 'rte_thread_setname' 
[-Werror=implicit-function-declaration]
  ret = rte_thread_setname(msb_inc_thread_id, thread_name);
  ^
/root/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_timer.c:222:2: error: nested 
extern declaration of 'rte_thread_setname' [-Werror=nested-externs]
cc1: all warnings being treated as errors

Fixes: badb3688ffa8 ("eal/linux: fix build with glibc < 2.12")

Signed-off-by: Yi Lu 
Acked-by: David Marchand 
---
 lib/librte_eal/linuxapp/eal/eal_timer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_timer.c 
b/lib/librte_eal/linuxapp/eal/eal_timer.c
index 9ceff33..bcadf09 100644
--- a/lib/librte_eal/linuxapp/eal/eal_timer.c
+++ b/lib/librte_eal/linuxapp/eal/eal_timer.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "eal_private.h"
-- 
1.8.3.1

[dpdk-dev] [PATCH v7 2/2] eal/linux: Add support for handling built-in kernel modules

2016-01-28 Thread krytarow...@caviumnetworks.com

From: Kamil Rytarowski 

Currently rte_eal_check_module() detects Linux kernel modules by reading
/proc/modules. Built-in ones aren't listed there and are therefore not
found by this function.

Add support for checking built-in modules by parsing the sysfs files.

This commit obsoletes the /proc/modules parsing approach.

Signed-off-by: Kamil Rytarowski 
Acked-by: David Marchand 
Acked-by: Yuanhan Liu 
---
 lib/librte_eal/linuxapp/eal/eal.c | 34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 635ec36..21a4a32 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -901,27 +901,33 @@ int rte_eal_has_hugepages(void)
 int
 rte_eal_check_module(const char *module_name)
 {
-   char mod_name[30]; /* Any module names can be longer than 30 bytes? */
-   int ret = 0;
+   char sysfs_mod_name[PATH_MAX];
+   struct stat st;
int n;

if (NULL == module_name)
return -1;

-   FILE *fd = fopen("/proc/modules", "r");
-   if (NULL == fd) {
-   RTE_LOG(ERR, EAL, "Open /proc/modules failed!"
-   " error %i (%s)\n", errno, strerror(errno));
+   /* Check if there is sysfs mounted */
+   if (stat("/sys/module", &st) != 0) {
+   RTE_LOG(DEBUG, EAL, "sysfs is not mounted! error %i (%s)\n",
+   errno, strerror(errno));
return -1;
}
-   while (!feof(fd)) {
-   n = fscanf(fd, "%29s %*[^\n]", mod_name);
-   if ((n == 1) && !strcmp(mod_name, module_name)) {
-   ret = 1;
-   break;
-   }
+
+   /* A module might be built-in, therefore try sysfs */
+   n = snprintf(sysfs_mod_name, PATH_MAX, "/sys/module/%s", module_name);
+   if (n < 0 || n > PATH_MAX) {
+   RTE_LOG(DEBUG, EAL, "Could not format module path\n");
+   return -1;
}
-   fclose(fd);

-   return ret;
+   if (stat(sysfs_mod_name, &st) != 0) {
+   RTE_LOG(DEBUG, EAL, "Module %s not found! error %i (%s)\n",
+   sysfs_mod_name, errno, strerror(errno));
+   return 0;
+   }
+
+   /* Module has been found */
+   return 1;
 }
-- 
1.9.1
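
The sysfs lookup in this patch can be tried outside DPDK with a stand-alone sketch like the one below. The demo_* names and the fixed buffer size are assumptions made for illustration. Unlike the `n > PATH_MAX` test in the patch, the sketch also treats a return value equal to the buffer size as truncation, since snprintf() reports the length it would have written excluding the terminating NUL.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Build "/sys/module/<name>" in buf; return 0 on success, -1 on error. */
static int demo_module_path(char *buf, size_t size, const char *module_name)
{
    int n;

    if (buf == NULL || module_name == NULL || size == 0)
        return -1;

    n = snprintf(buf, size, "/sys/module/%s", module_name);
    /* n >= size means the output was truncated. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}

/* Return 1 if the module directory exists, 0 if not, -1 on error. */
static int demo_check_module(const char *module_name)
{
    char path[4096];
    struct stat st;

    if (demo_module_path(path, sizeof(path), module_name) != 0)
        return -1;
    if (stat("/sys/module", &st) != 0)
        return -1;  /* sysfs is probably not mounted */
    return stat(path, &st) == 0 ? 1 : 0;
}
```

Because built-in and loadable modules both appear as directories under /sys/module, a single stat() call covers the case that the /proc/modules scan missed.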

[dpdk-dev] [PATCH v7 1/2] tools: Add support for handling built-in kernel modules

2016-01-28 Thread krytarow...@caviumnetworks.com

From: Kamil Rytarowski 

Currently dpdk_nic_bind.py detects Linux kernel modules by reading
/proc/modules. Built-in ones aren't listed there and are therefore not
found by the script.

Add support for checking built-in modules by parsing the sysfs files.

This commit obsoletes the /proc/modules parsing approach.

Signed-off-by: Kamil Rytarowski 
---
 tools/dpdk_nic_bind.py | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
index f02454e..85cb6f1 100755
--- a/tools/dpdk_nic_bind.py
+++ b/tools/dpdk_nic_bind.py
@@ -156,22 +156,32 @@ def check_modules():
     '''Checks that igb_uio is loaded'''
     global dpdk_drivers

-    fd = file("/proc/modules")
-    loaded_mods = fd.readlines()
-    fd.close()
-
     # list of supported modules
     mods =  [{"Name" : driver, "Found" : False} for driver in dpdk_drivers]

     # first check if module is loaded
-    for line in loaded_mods:
+    try:
+        # Get list of sysfs modules (both built-in and dynamically loaded)
+        sysfs_path = '/sys/module/'
+
+        # Get the list of directories in sysfs_path
+        sysfs_mods = [os.path.join(sysfs_path, o) for o
+                      in os.listdir(sysfs_path)
+                      if os.path.isdir(os.path.join(sysfs_path, o))]
+
+        # Extract the last element of '/sys/module/abc' in the array
+        sysfs_mods = [a.split('/')[-1] for a in sysfs_mods]
+
+        # special case for vfio_pci (module is named vfio-pci,
+        # but its .ko is named vfio_pci)
+        sysfs_mods = map(lambda a:
+                         a if a != 'vfio_pci' else 'vfio-pci', sysfs_mods)
+
         for mod in mods:
-            if line.startswith(mod["Name"]):
-                mod["Found"] = True
-            # special case for vfio_pci (module is named vfio-pci,
-            # but its .ko is named vfio_pci)
-            elif line.replace("_", "-").startswith(mod["Name"]):
+            if mod["Name"] in sysfs_mods:
                 mod["Found"] = True
+    except:
+        pass

     # check if we have at least one loaded module
     if True not in [mod["Found"] for mod in mods] and b_flag is not None:
-- 
1.9.1
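
For readers following the approach, the sysfs scan described in the commit message can be sketched as a standalone function. This is a minimal Python 3 sketch under stated assumptions, not the code from the patch: the name `list_sysfs_modules` and its `sysfs_path` parameter are hypothetical, and the vfio_pci to vfio-pci renaming simply mirrors the special case in the diff above.

```python
import os


def list_sysfs_modules(sysfs_path="/sys/module"):
    """Return the set of module names that have a directory under sysfs_path.

    Hypothetical helper: the patch relies on modules (built-in or loaded)
    showing up as directories in this location.
    """
    try:
        entries = os.listdir(sysfs_path)
    except OSError:
        # sysfs not mounted or not readable: report no modules
        return set()

    names = {e for e in entries if os.path.isdir(os.path.join(sysfs_path, e))}

    # Same special case as the patch: sysfs shows vfio_pci, while the driver
    # is referred to as vfio-pci.
    if "vfio_pci" in names:
        names.remove("vfio_pci")
        names.add("vfio-pci")
    return names
```

Returning a set keeps the later `mod["Name"] in ...` style membership check cheap, and catching only `OSError` avoids hiding unrelated errors the way a bare `except:` would.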

[dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface library

2016-01-28 Thread Ananyev, Konstantin

Hi Ferruh,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ferruh Yigit
> Sent: Thursday, January 28, 2016 1:15 PM
> To: Horton, Remy
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface library
> 
> On Thu, Jan 28, 2016 at 11:14:47AM +0000, Remy Horton wrote:
> > On 27/01/2016 16:24, Ferruh Yigit wrote:
> >
> > > + default:
> > > + ret = -95 /* EOPNOTSUPP */;
> > > + break;
> >
> > Is this intentional? -EOPNOTSUPP is -122 (-95 is -ENOTSOCK)..
> >
> Return value is not significant, callee just checks for negative value,
> I can remove comment to prevent confusion.

Please use values defined in errno.h, there are plenty of them,
no need to invent your own error codes.
Also pls don't forget to address all comments I gave you offline.
Thanks
Konstantin

> 
> Thanks,
> ferruh
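
As an illustration of the point about errno.h, the sketch below returns a symbolic error constant instead of a literal such as -95. It is written in Python 3 only to keep it short and runnable; `handle_request` and its command numbering are hypothetical and not part of rte_ctrl_if. The number behind EOPNOTSUPP is not the same on every platform, which is one more reason to avoid the literal.

```python
import errno
import os


def handle_request(cmd_id):
    """Hypothetical dispatcher: only command 1 is supported."""
    if cmd_id == 1:
        return 0
    # Symbolic constant instead of a hard-coded number
    return -errno.EOPNOTSUPP


ret = handle_request(42)
if ret < 0:
    # errno.errorcode maps the number back to a symbolic name, and
    # os.strerror gives the human-readable message for logging
    message = "%s: %s" % (errno.errorcode[-ret], os.strerror(-ret))
```

A caller that only tests `ret < 0` keeps working unchanged, while anyone debugging gets a value that can be decoded.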

[dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface library

2016-01-28 Thread Ferruh Yigit

On Thu, Jan 28, 2016 at 07:24:51AM -0600, Jay Rolette wrote:
> On Thu, Jan 28, 2016 at 7:15 AM, Ferruh Yigit 
> wrote:
> 
> > On Thu, Jan 28, 2016 at 11:14:47AM +0000, Remy Horton wrote:
> > > On 27/01/2016 16:24, Ferruh Yigit wrote:
> > >
> > > > +   default:
> > > > +   ret = -95 /* EOPNOTSUPP */;
> > > > +   break;
> > >
> > > Is this intentional? -EOPNOTSUPP is -122 (-95 is -ENOTSOCK)..
> > >
> > Return value is not significant, callee just checks for negative value,
> > I can remove comment to prevent confusion.
> >
> 
> No, please fix the return value. Return values are significant when you are
> trying to debug or understand the intent of the code.
> 
There is nothing to fix in the return value here. I am simply planning to do
something like:
#define NOT_SUPPORTED -2
ret = NOT_SUPPORTED;

The exact value, whether -95, -110 or -2, makes no difference.

Thanks,
ferruh

[dpdk-dev] [PATCH 1/4] lib/librte_port: add PCAP file support to source port

2016-01-28 Thread Panu Matilainen

On 01/27/2016 07:39 PM, Fan Zhang wrote:
> Originally, source ports in librte_port is an input port used as packet
> generator. Similar to Linux kernel /dev/zero character device, it
> generates null packets. This patch adds optional PCAP file support to
> source port: instead of sending NULL packets, the source port generates
> packets copied from a PCAP file. To increase the performance, the packets
> in the file are loaded to memory initially, and copied to mbufs in circular
> manner. Users can enable or disable this feature by setting
> CONFIG_RTE_PORT_PCAP compiler option "y" or "n".
>
> Signed-off-by: Fan Zhang 
> Acked-by: Cristian Dumitrescu 
> ---
>   config/common_bsdapp   |   1 +
>   config/common_linuxapp |   1 +
>   lib/librte_port/Makefile   |   4 +
>   lib/librte_port/rte_port_source_sink.c | 190 
> +
>   lib/librte_port/rte_port_source_sink.h |   7 ++
>   mk/rte.app.mk  |   1 +
>   6 files changed, 204 insertions(+)
>
[...]
> +#ifdef RTE_PORT_PCAP
> +
> +/**
> + * Load PCAP file, allocate and copy packets in the file to memory
> + *
> + * @param p
> + *   Parameters for source port
> + * @param port
> + *   Handle to source port
> + * @param socket_id
> + *   Socket id where the memory is created
> + * @return
> + *   0 on SUCCESS
> + *   error code otherwise
> + */
> +static int
> +pcap_source_load(struct rte_port_source_params *p,
> + struct rte_port_source *port,
> + int socket_id)
> +{
[...]
> +#else
> +static int
> +pcap_source_load(__rte_unused struct rte_port_source_params *p,
> + struct rte_port_source *port,
> + __rte_unused int socket_id)
> +{
> + port->pkt_buff = NULL;
> + port->pkt_len = NULL;
> + port->pkts = NULL;
> + port->pkt_index = 0;
> +
> + return 0;
> +}
> +#endif

Same as in patch 3/4, shouldn't this return -ENOTSUP when pcap support 
is not built in, instead of success?

[...]

> diff --git a/lib/librte_port/rte_port_source_sink.h 
> b/lib/librte_port/rte_port_source_sink.h
> index 0f9be79..6f39bec 100644
> --- a/lib/librte_port/rte_port_source_sink.h
> +++ b/lib/librte_port/rte_port_source_sink.h
> @@ -53,6 +53,13 @@ extern "C" {
>   struct rte_port_source_params {
>   /** Pre-initialized buffer pool */
>   struct rte_mempool *mempool;
> + /** The full path of the pcap file to read packets from */
> + char *file_name;
> + /** The number of bytes to be read from each packet in the
> +  *  pcap file. If this value is 0, the whole packet is read;
> +  *  if it is bigger than packet size, the generated packets
> +  *  will contain the whole packet */
> + uint32_t n_bytes_per_pkt;
>   };

This is a likely ABI break. It "only" appends to the struct, which might
in some cases be okay, but only when there's no sensible use for the
struct within arrays or embedded in other structs. The ip_pipeline example,
for instance, embeds struct rte_port_source_params within another struct, which
could be taken as an indication that other applications might be
doing this as well.

An ABI break for librte_port has not been announced for 2.3 so you'd 
need to announce the intent to do so in 2.4 now, and then either wait 
till post 2.3 or wrap this in CONFIG_RTE_NEXT_ABI.

- Panu -
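
To make the embedding concern concrete, here is a small Python 3 sketch using ctypes to model the two layouts. The class names, the `burst_size` member and the wrapper struct are hypothetical stand-ins (the mempool pointer is modelled as a plain `void *`); only the three member names of the params struct come from the quoted header.

```python
import ctypes


class SourceParamsOld(ctypes.Structure):
    # Layout before the patch: a single pointer (stand-in for the mempool)
    _fields_ = [("mempool", ctypes.c_void_p)]


class SourceParamsNew(ctypes.Structure):
    # Layout after the patch: two members appended
    _fields_ = [
        ("mempool", ctypes.c_void_p),
        ("file_name", ctypes.c_char_p),
        ("n_bytes_per_pkt", ctypes.c_uint32),
    ]


def make_wrapper(params_type):
    """Build a hypothetical application struct that embeds the params."""
    class AppPortConfig(ctypes.Structure):
        _fields_ = [("params", params_type), ("burst_size", ctypes.c_uint32)]
    return AppPortConfig


old_app = make_wrapper(SourceParamsOld)
new_app = make_wrapper(SourceParamsNew)

# In an application built against the old header, the bytes right after
# the params belong to burst_size. A library built from the new header
# would read file_name from that same offset.
overlap = SourceParamsNew.file_name.offset == old_app.burst_size.offset
grew = ctypes.sizeof(SourceParamsNew) > ctypes.sizeof(SourceParamsOld)
```

Both `overlap` and `grew` come out true in this model, which is the situation that an ABI announcement or CONFIG_RTE_NEXT_ABI is meant to guard against.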

[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules

2016-01-28 Thread Kamil Rytarowski

On 28.01.2016 at 12:22, Panu Matilainen wrote:
> On 01/28/2016 01:17 PM, Kamil Rytarowski wrote:
>> I retained the logic for special case of vfio-pci. At the moment
>> (according to my knowledge) there are no other DPDK modules with this
>> name replacement.
>>
>> I checked few example Linux modules and if a module is named with dash,
>> it's being replaced to underscore. The modprobe(8) tool can accept both
>> names as interchangeable (with dash and underscore).
>>
>> Would you like to make it a general rule and replace all dashes with
>> underscores?
>
> It would be nice to behave the same as modprobe wrt dash and 
> underscore, yes.
>
> - Panu -
>

My patch is intended to support built-in modules; the rest isn't that
trivial without changing the behavior.

I prototyped it and it added unnecessary complexity, while we just
want to handle vfio_pci -> vfio-pci.

I'm going to submit a new version with an improved comment in the code.
Please continue possible improvements in separate threads.
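
For reference, the general rule discussed in this thread (treating dashes and underscores as interchangeable, as modprobe does) could look roughly like the following. This is a hypothetical Python 3 sketch and not part of the submitted patch; `normalize_module_name` and `module_found` are names made up for illustration.

```python
def normalize_module_name(name):
    """Map a module name to a canonical form that uses underscores only."""
    return name.replace("-", "_")


def module_found(wanted, present):
    """Return True if `wanted` matches a name in `present`, ignoring - vs _."""
    canonical = {normalize_module_name(p) for p in present}
    return normalize_module_name(wanted) in canonical
```

With this rule the vfio_pci special case disappears, at the cost of changing how every driver name is compared, which is the behavior change the reply above wants to keep out of this patch.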

[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-01-28 Thread Ferruh Yigit

On Thu, Jan 28, 2016 at 09:49:49AM +0000, Remy Horton wrote:
> Comments inline
>
> ..Remy
>
>
> On 27/01/2016 16:24, Ferruh Yigit wrote:
> > This kernel module is based on KNI module, but this one is stripped
> > version of it and only for control messages, no data transfer
> > functionality provided.
> >
> > This Linux kernel module helps userspace application create virtual
> > interfaces and when a control command issued into that virtual
> > interface, module pushes the command to the userspace and gets the
> > response back for the caller application.
> >
> > Signed-off-by: Ferruh Yigit 
> > ---
>
>
> > +   net_dev = alloc_netdev(sizeof(struct kcp_dev), name,
> > +#ifdef NET_NAME_UNKNOWN
> > +   NET_NAME_UNKNOWN,
> > +#endif
> > +   kcp_net_init);
>
> Something doesn't feel quite right here. In cases where NET_NAME_UNKNOWN is 
> undefined, is the signature for alloc_netdev different?
>
Yes, this is because of an API change between kernel versions:
when NET_NAME_* was introduced, alloc_netdev() was also updated to take it.

>
> > +MODULE_LICENSE("Dual BSD/GPL");
> > +MODULE_AUTHOR("Intel Corporation");
> > +MODULE_DESCRIPTION("Kernel Module for managing kcp devices");
>
> I'm not up to speed on this area, but some of the file headers only mention 
> GPL/LGPL. This correct?
>
This is because a header file (rte_kcp_common.h) shared by this kernel module
and the user-space application is dual licensed (BSD + GPL).
I mimicked this from the existing KNI.
>
> > +   nlmsg_unicast(nl_sock, skb, pid);
> > +   KCP_DBG("Sent cmd:%d port:%d\n", cmd_id, port_id);
> > +
> > +   /*nlmsg_free(skb);*/
> > +
> > +   return 0;
> > +}
>
> Oops.. :)
> Possible memory leak, or is *skb statically allocated?
>
No leak, and not statically allocated; it is taken care of by nlmsg_unicast().
But the commented-out code needs to be removed.

Thanks,
ferruh

[dpdk-dev] [PATCH] fm10k: enable PCIe port level Loopback Suppression

2016-01-28 Thread Shaopeng He

A PCIe port may represent within it multiple logical ports
(for example when SR-IOV is enabled, or when a VMDQ type logical
port scheme is employed assigning ports to sets of queues).
For this reason each RX queue in each PCIe port is given a source
GLORT that is used for loopback suppression.
This patch assigns a SGLORT for each RX queue, and enables PCIe
port level Loopback Suppression.

Signed-off-by: Shaopeng He 
---
 drivers/net/fm10k/fm10k_ethdev.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index f6eb05d..60f821a 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -690,12 +690,15 @@ static int
 fm10k_dev_rx_init(struct rte_eth_dev *dev)
 {
struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct fm10k_macvlan_filter_info *macvlan;
int i, ret;
struct fm10k_rx_queue *rxq;
uint64_t base_addr;
uint32_t size;
uint32_t rxdctl = FM10K_RXDCTL_WRITE_BACK_MIN_DELAY;
+   uint32_t logic_port = hw->mac.dglort_map;
uint16_t buf_size;
+   uint16_t queue_stride = 0;

/* Disable RXINT to avoid possible interrupt */
for (i = 0; i < hw->mac.max_queues; i++)
@@ -735,7 +738,8 @@ fm10k_dev_rx_init(struct rte_eth_dev *dev)
buf_size -= FM10K_RX_DATABUF_ALIGN;

FM10K_WRITE_REG(hw, FM10K_SRRCTL(i),
-   buf_size >> FM10K_SRRCTL_BSIZEPKT_SHIFT);
+   (buf_size >> FM10K_SRRCTL_BSIZEPKT_SHIFT) |
+   FM10K_SRRCTL_LOOPBACK_SUPPRESS);

/* It adds dual VLAN length for supporting dual VLAN */
if ((dev->data->dev_conf.rxmode.max_rx_pkt_len +
@@ -762,6 +766,18 @@ fm10k_dev_rx_init(struct rte_eth_dev *dev)
/* Decide the best RX function */
fm10k_set_rx_function(dev);

+   /* update RX_SGLORT for loopback suppress*/
+   if (hw->mac.type != fm10k_mac_pf)
+   return 0;
+   macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data->dev_private);
+   if (macvlan->nb_queue_pools)
+   queue_stride = dev->data->nb_rx_queues / macvlan->nb_queue_pools;
+   for (i = 0; i < dev->data->nb_rx_queues; ++i) {
+   if (i && queue_stride && !(i % queue_stride))
+   logic_port++;
+   FM10K_WRITE_REG(hw, FM10K_RX_SGLORT(i), logic_port);
+   }
+
return 0;
 }

-- 
1.9.3
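
The queue-to-SGLORT assignment in the loop above can be checked with a small model. This is a Python 3 sketch of the same arithmetic, assuming nothing about the hardware: `rx_sglort_per_queue` is a hypothetical name, and the base value passed in stands in for `hw->mac.dglort_map`.

```python
def rx_sglort_per_queue(base_glort, nb_rx_queues, nb_queue_pools):
    """Return the SGLORT value that would be written for each RX queue.

    Mirrors the loop in the patch: queues are split evenly across pools
    and the logical port is incremented at every pool boundary.
    """
    # Integer division, as in the C code; 0 pools means no stride at all
    stride = nb_rx_queues // nb_queue_pools if nb_queue_pools else 0
    glorts = []
    port = base_glort
    for i in range(nb_rx_queues):
        if i and stride and i % stride == 0:
            port += 1
        glorts.append(port)
    return glorts
```

For example, 8 queues in 4 pools give a stride of 2, so each consecutive pair of queues shares one logical port; with no pools configured every queue keeps the base value.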

[dpdk-dev] [PATCH] fm10k: fix switch manager high CPU usage

2016-01-28 Thread Shaopeng He

The fm10k switch core uses source MAC + VID + SGLORT to do a
lookup in the MAC table. If there is no match, an exception interrupt
is sent to the switch manager, causing high CPU
usage.
This patch fixes the issue by assigning a default SGLORT
to each TX queue. This default value works for non-VMDq mode
and the current VMDq example. For advanced VMDq usage, e.g.
a different source MAC address for each TX queue, the FTAG
forwarding function could be used to change this default
SGLORT value.

Signed-off-by: Shaopeng He 
---
 drivers/net/fm10k/fm10k_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index e4aed94..f6eb05d 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -675,6 +675,9 @@ fm10k_dev_tx_init(struct rte_eth_dev *dev)
FM10K_WRITE_REG(hw, FM10K_TDBAH(i),
base_addr >> (CHAR_BIT * sizeof(uint32_t)));
FM10K_WRITE_REG(hw, FM10K_TDLEN(i), size);
+
+   /* assign default SGLORT for each TX queue */
+   FM10K_WRITE_REG(hw, FM10K_TX_SGLORT(i), hw->mac.dglort_map);
}

/* set up vector or scalar TX function as appropriate */
-- 
1.9.3

[dpdk-dev] [PATCH 3/4] lib/librte_port: add packet dumping to PCAP file support in sink port

2016-01-28 Thread Panu Matilainen

On 01/27/2016 07:39 PM, Fan Zhang wrote:
> Originally, sink ports in librte_port releases received mbufs back to
> mempool. This patch adds optional packet dumping to PCAP feature in sink
> port: the packets will be dumped to user defined PCAP file for storage or
> debugging. The user may also choose the sink port's activity: either it
> continuously dump the packets to the file, or stops at certain dumping
>
> This feature shares same CONFIG_RTE_PORT_PCAP compiler option as source
> port PCAP file support feature. Users can enable or disable this feature
> by setting CONFIG_RTE_PORT_PCAP compiler option "y" or "n".
>
> Signed-off-by: Fan Zhang 
> Acked-by: Cristian Dumitrescu 
> ---
>   lib/librte_port/rte_port_source_sink.c | 268 
> +++--
>   lib/librte_port/rte_port_source_sink.h |  11 +-
>   2 files changed, 263 insertions(+), 16 deletions(-)
>
[...]
> +#ifdef RTE_PORT_PCAP
> +
> +/**
> + * Open PCAP file for dumping packets to the file later
> + *
> + * @param port
> + *   Handle to sink port
> + * @param p
> + *   Sink port parameter
> + * @return
> + *   0 on SUCCESS
> + *   error code otherwise
> + */
[...]
> +
> +#else
> +
> +static int
> +pcap_sink_open(struct rte_port_sink *port,
> + __rte_unused struct rte_port_sink_params *p)
> +{
> + port->dumper = NULL;
> + port->max_pkts = 0;
> + port->pkt_index = 0;
> + port->dump_finish = 0;
> +
> + return 0;
> +}

Shouldn't this just return -ENOTSUP instead of success when the pcap 
feature is not built in?

> +
> +static void
> +pcap_sink_dump_pkt(__rte_unused struct rte_port_sink *port,
> + __rte_unused struct rte_mbuf *mbuf) {}
> +
> +static void
> +pcap_sink_flush_pkt(__rte_unused void *dumper) {}
> +
> +static void
> +pcap_sink_close(__rte_unused void *dumper) {}
> +
> +#endif
> +
>   static void *
>   rte_port_sink_create(__rte_unused void *params, int socket_id)
>   {
>   struct rte_port_sink *port;
> + struct rte_port_sink_params *p = params;
> + int status;
>
>   /* Memory allocation */
>   port = rte_zmalloc_socket("PORT", sizeof(*port),
> @@ -360,6 +532,19 @@ rte_port_sink_create(__rte_unused void *params, int 
> socket_id)
>   return NULL;
>   }
>
> + /* Try to open PCAP file for dumping, if possible */
> + status = pcap_sink_open(port, p);
> + if (status < 0) {
> + RTE_LOG(ERR, PORT, "%s: Failed to enable PCAP support "
> + "support\n", __func__);
> + rte_free(port);
> + port = NULL;
> + } else {
> + if (port->dumper != NULL)
> + RTE_LOG(INFO, PORT, "Ready to dump packets to file "
> + "%s\n", p->file_name);
> + }
> +
>   return port;
>   }
>
> @@ -369,6 +554,8 @@ rte_port_sink_tx(void *port, struct rte_mbuf *pkt)
>   __rte_unused struct rte_port_sink *p = (struct rte_port_sink *) port;
>
>   RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, 1);
> + if (p->dumper != NULL)
> + pcap_sink_dump_pkt(p, pkt);
>   rte_pktmbuf_free(pkt);
>   RTE_PORT_SINK_STATS_PKTS_DROP_ADD(p, 1);
>
> @@ -387,21 +574,44 @@ rte_port_sink_tx_bulk(void *port, struct rte_mbuf 
> **pkts,
>
>   RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, n_pkts);
>   RTE_PORT_SINK_STATS_PKTS_DROP_ADD(p, n_pkts);
> - for (i = 0; i < n_pkts; i++) {
> - struct rte_mbuf *pkt = pkts[i];
> -
> - rte_pktmbuf_free(pkt);
> + if (p->dumper) {
> + for (i = 0; i < n_pkts; i++) {
> + struct rte_mbuf *pkt = pkts[i];
> +
> + pcap_sink_dump_pkt(p, pkt);
> + rte_pktmbuf_free(pkt);
> + }
> + } else {
> + for (i = 0; i < n_pkts; i++) {
> + struct rte_mbuf *pkt = pkts[i];
> +
> + rte_pktmbuf_free(pkt);
> + }
>   }
>   } else {
> - for ( ; pkts_mask; ) {
> - uint32_t pkt_index = __builtin_ctzll(pkts_mask);
> - uint64_t pkt_mask = 1LLU << pkt_index;
> - struct rte_mbuf *pkt = pkts[pkt_index];
> -
> - RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, 1);
> - RTE_PORT_SINK_STATS_PKTS_DROP_ADD(p, 1);
> - rte_pktmbuf_free(pkt);
> - pkts_mask &= ~pkt_mask;
> + if (p->dumper) {
> + for ( ; pkts_mask; ) {
> + uint32_t pkt_index = __builtin_ctzll(pkts_mask);
> + uint64_t pkt_mask = 1LLU << pkt_index;
> + struct rte_mbuf *pkt = pkts[pkt_index];
> +
> + RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, 1);
> +

[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules

2016-01-28 Thread Panu Matilainen

On 01/28/2016 01:17 PM, Kamil Rytarowski wrote:
>
>
> W dniu 26.01.2016 o 16:23, Thomas Monjalon pisze:
>> 2016-01-20 10:48, krytarowski at caviumnetworks.com:
>>> --- a/tools/dpdk_nic_bind.py
>>> +++ b/tools/dpdk_nic_bind.py
>>> -for line in loaded_mods:
>>> +try:
>>> +# Get list of syfs modules, some of them might be builtin
>>> and merge with mods
>> Please could you explain this comment?
>> Is it remaining from previous versions of the patch?
>
> Yes. It might be changed to:
> # Get list of sysfs modules (both built-in and dynamically loaded)
>
>> [...]
>>> +# special case for vfio_pci (module is named vfio-pci,
>>> +# but its .ko is named vfio_pci)
>> Isn't it common to have dash replaced by underscore for kernel modules?
>>
>
> I retained the logic for special case of vfio-pci. At the moment
> (according to my knowledge) there are no other DPDK modules with this
> name replacement.
>
> I checked few example Linux modules and if a module is named with dash,
> it's being replaced to underscore. The modprobe(8) tool can accept both
> names as interchangeable (with dash and underscore).
>
> Would you like to make it a general rule and replace all dashes with
> underscores?

It would be nice to behave the same as modprobe wrt dash and underscore, 
yes.

- Panu -

> Thank you

[dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface library

2016-01-28 Thread Ferruh Yigit

On Thu, Jan 28, 2016 at 11:14:47AM +0000, Remy Horton wrote:
> On 27/01/2016 16:24, Ferruh Yigit wrote:
>
> > +   default:
> > +   ret = -95 /* EOPNOTSUPP */;
> > +   break;
>
> Is this intentional? -EOPNOTSUPP is -122 (-95 is -ENOTSOCK)..
>
The return value is not significant; the caller just checks for a negative value.
I can remove the comment to prevent confusion.

Thanks,
ferruh

[dpdk-dev] [PATCH] pcap: fix captured frame length

2016-01-28 Thread Dror Birkman

The actual captured length is header.caplen, whereas header.len is
the original length on the wire.

Signed-off-by: Dror Birkman 
---


Without this fix, if the captured length is smaller than the original
length on the wire, mbuf will contain incorrect data.
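
The difference between the two fields can be shown on a single record of the classic pcap file format. The following is a minimal Python 3 sketch, not DPDK code: `read_record` is a hypothetical helper, a little-endian file is assumed, and the record is built by hand rather than read through libpcap.

```python
import struct

# Per-packet record header of the classic pcap file format (little-endian
# variant assumed): ts_sec, ts_usec, incl_len (caplen), orig_len (len).
PCAP_REC_HDR = struct.Struct("<IIII")


def read_record(buf, offset=0):
    """Return (payload, caplen, wire_len) for the record at offset."""
    _sec, _usec, caplen, wire_len = PCAP_REC_HDR.unpack_from(buf, offset)
    start = offset + PCAP_REC_HDR.size
    # Only caplen bytes are actually stored in the capture
    payload = buf[start:start + caplen]
    return payload, caplen, wire_len


# A 1500-byte frame captured with a 64-byte snap length
record = PCAP_REC_HDR.pack(0, 0, 64, 1500) + bytes(64)
payload, caplen, wire_len = read_record(record)
```

Only `caplen` bytes exist after the header, so a copy sized by the wire length would run past the stored data; that is the case the fix addresses.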


 drivers/net/pcap/rte_eth_pcap.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index f9230eb..1d121f8 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -220,25 +220,25 @@ eth_pcap_rx(void *queue,
buf_size = (uint16_t)(rte_pktmbuf_data_room_size(pcap_q->mb_pool) -
RTE_PKTMBUF_HEADROOM);

-   if (header.len <= buf_size) {
+   if (header.caplen <= buf_size) {
/* pcap packet will fit in the mbuf, go ahead and copy */
rte_memcpy(rte_pktmbuf_mtod(mbuf, void *), packet,
-   header.len);
-   mbuf->data_len = (uint16_t)header.len;
+   header.caplen);
+   mbuf->data_len = (uint16_t)header.caplen;
} else {
/* Try read jumbo frame into multi mbufs. */
if (unlikely(eth_pcap_rx_jumbo(pcap_q->mb_pool,
   mbuf,
   packet,
-  header.len) == -1))
+  header.caplen) == -1))
break;
}

-   mbuf->pkt_len = (uint16_t)header.len;
+   mbuf->pkt_len = (uint16_t)header.caplen;
mbuf->port = pcap_q->in_port;
bufs[num_rx] = mbuf;
num_rx++;
-   rx_bytes += header.len;
+   rx_bytes += header.caplen;
}
pcap_q->rx_pkts += num_rx;
pcap_q->rx_bytes += rx_bytes;
-- 
2.6.3

[dpdk-dev] [PATCH v2] eal: add architecture specific rte_cpuflags.c files

2016-01-28 Thread Ferruh Yigit

Move cpu_feature_table array from arch specific rte_cpuflags.h files to
new arch specific rte_cpuflags.c files.

The main motivation is to get rid of static variable declarations in
header files. cpu_feature_table has many copies in the final binary, and
even exists in some object files that do not use this variable at all.

This can also serve as an example for creating architecture specific source
files and moving functions that are not performance sensitive from
architecture header files to source files.

v2:
* rebased for DPDK2.3 (16.04)
* added arm arch
* renamed cpu_feature_table[] to rte_cpu_feature_table[]

Signed-off-by: Ferruh Yigit 
---
 lib/librte_eal/bsdapp/eal/Makefile |   6 +
 lib/librte_eal/bsdapp/eal/rte_eal_version.map  |   7 ++
 lib/librte_eal/common/arch/arm/rte_cpuflags.c  |  79 +
 lib/librte_eal/common/arch/ppc_64/rte_cpuflags.c   |  70 +++
 lib/librte_eal/common/arch/tile/rte_cpuflags.c |  36 ++
 lib/librte_eal/common/arch/x86/rte_cpuflags.c  | 129 +
 lib/librte_eal/common/eal_common_cpuflags.c|   2 +-
 .../common/include/arch/arm/rte_cpuflags_32.h  |  35 +-
 .../common/include/arch/arm/rte_cpuflags_64.h  |  16 +--
 .../common/include/arch/ppc_64/rte_cpuflags.h  |  41 +--
 .../common/include/arch/tile/rte_cpuflags.h|   3 -
 .../common/include/arch/x86/rte_cpuflags.h |  99 +---
 lib/librte_eal/linuxapp/eal/Makefile   |   6 +
 lib/librte_eal/linuxapp/eal/rte_eal_version.map|   6 +
 14 files changed, 352 insertions(+), 183 deletions(-)
 create mode 100644 lib/librte_eal/common/arch/arm/rte_cpuflags.c
 create mode 100644 lib/librte_eal/common/arch/ppc_64/rte_cpuflags.c
 create mode 100644 lib/librte_eal/common/arch/tile/rte_cpuflags.c
 create mode 100644 lib/librte_eal/common/arch/x86/rte_cpuflags.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 65b293f..d7ca60b 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk

 LIB = librte_eal.a

+ARCH_DIR ?= $(RTE_ARCH)
 VPATH += $(RTE_SDK)/lib/librte_eal/common
+VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR)

 CFLAGS += -I$(SRCDIR)/include
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
@@ -82,6 +84,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += malloc_heap.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += rte_keepalive.c

+# from arch dir
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += rte_cpuflags.c
+
 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
@@ -100,5 +105,6 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP)-include/exec-env := \
$(addprefix include/exec-env/,$(INC))

 DEPDIRS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += lib/librte_eal/common
+DEPDIRS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += lib/librte_eal/common/arch/$(ARCH_DIR)

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 9d7adf1..6fa9c67 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -135,3 +135,10 @@ DPDK_2.2 {
rte_xen_dom0_supported;

 } DPDK_2.1;
+
+DPDK_2.3 {
+   global:
+
+   rte_cpu_feature_table;
+
+} DPDK_2.2;
diff --git a/lib/librte_eal/common/arch/arm/rte_cpuflags.c b/lib/librte_eal/common/arch/arm/rte_cpuflags.c
new file mode 100644
index 000..4348574
--- /dev/null
+++ b/lib/librte_eal/common/arch/arm/rte_cpuflags.c
@@ -0,0 +1,79 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2015.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF

[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules

2016-01-28 Thread Kamil Rytarowski



On 26.01.2016 at 16:23, Thomas Monjalon wrote:
> 2016-01-20 10:48, krytarowski at caviumnetworks.com:
>> --- a/tools/dpdk_nic_bind.py
>> +++ b/tools/dpdk_nic_bind.py
>> -for line in loaded_mods:
>> +try:
>> +# Get list of syfs modules, some of them might be builtin and merge 
>> with mods
> Please could you explain this comment?
> Is it remaining from previous versions of the patch?

Yes. It might be changed to:
# Get list of sysfs modules (both built-in and dynamically loaded)

> [...]
>> +# special case for vfio_pci (module is named vfio-pci,
>> +# but its .ko is named vfio_pci)
> Isn't it common to have dash replaced by underscore for kernel modules?
>

I retained the logic for the special case of vfio-pci. At the moment
(to my knowledge) there are no other DPDK modules with this
name replacement.

I checked a few example Linux modules, and if a module is named with a dash,
the dash is replaced with an underscore. The modprobe(8) tool accepts both
names interchangeably (with dash and with underscore).

Would you like to make it a general rule and replace all dashes with
underscores?

Thank you

[dpdk-dev] [PATCH v5 1/2] tools: Add support for handling built-in kernel modules

2016-01-28 Thread Kamil Rytarowski



On 26.01.2016 at 16:12, Thomas Monjalon wrote:
> 2016-01-19 17:35, Kamil Rytarowski:
>> On 18.01.2016 at 15:32, Thomas Monjalon wrote:
>>> Hi Kamil,
>>>
>>> 2015-12-09 14:19, Kamil Rytarowski:
>>>> Currently dpdk_nic_bind.py detects Linux kernel modules via reading
>>>> /proc/modules. Built-in ones aren't listed there and therefore they are not
>>>> being found by the script.
>>>>
>>>> Add support for checking built-in modules with parsing the sysfs files.
>>>>
>>>> This commit obsoletes the /proc/modules parsing approach.
>>>>
>>>> Signed-off-by: Kamil Rytarowski 
>>> I have a doubt about this tag:
>>>> Signed-off-by: David Marchand 
>>> What do you mean here?
>> Excuse me, it should be:  Acked-by: David Marchand
>> 
>>
>> http://dpdk.org/ml/archives/dev/2015-December/029720.html
> The ack was only for the patch 2/2


I see. I will correct it.

[dpdk-dev] [PATCH v6] vfio: Support for no-IOMMU mode

2016-01-28 Thread Anatoly Burakov

This commit is adding a generic mechanism to support multiple IOMMU
types. For now, it's only type 1 (x86 IOMMU) and no-IOMMU (a special
VFIO mode that doesn't use IOMMU at all), but it's easily extended
by adding necessary definitions to eal_vfio.h, and DMA mapping
functions to eal_pci_vfio.c.

Since type 1 IOMMU module is no longer necessary to have VFIO,
we fix the module check to check for vfio-pci instead. It's not
ideal and triggers VFIO checks more often (and thus produces more
error output, which was the reason behind the module check in the
first place), so we compensate for that by providing more verbose
logging, indicating whether VFIO initialization has succeeded or
failed.

Signed-off-by: Anatoly Burakov 
Signed-off-by: Santosh Shukla 
Tested-by: Santosh Shukla 
---
v6 changes:
  Fixed functions not declared as static
  Fixed definitions to be more consistent with others

v5 changes:
  Renamed functions

v4 changes:
  Fixed the commit message and added a missing sign-off

v3 changes:
  Merging DMA mapping functions back into eal_pci_vfio.c
  Fixing and adding comments

v2 changes:
  Compile fix (hat-tip to Santosh Shukla)
  Tested-by is provisional, since only superficial testing was done
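
The table-driven selection described in the commit message can be modelled in a few lines. This is a Python 3 sketch under stated assumptions rather than the EAL code: `IommuType`, `pick_iommu_type` and `setup_dma` are hypothetical names, and the `set_iommu` callable stands in for the VFIO_SET_IOMMU ioctl on the container fd.

```python
from collections import namedtuple

# Hypothetical stand-in for one entry of the iommu_types[] table
IommuType = namedtuple("IommuType", ["type_id", "name", "dma_map"])


def pick_iommu_type(types, set_iommu):
    """Return the first type the container accepts, or None.

    `set_iommu(type_id)` models the ioctl and returns 0 on success.
    """
    for t in types:
        if set_iommu(t.type_id) == 0:
            return t
    return None


def setup_dma(types, set_iommu, container_fd):
    """Pick a type, then run its DMA mapping callback (0 ok, -1 error)."""
    chosen = pick_iommu_type(types, set_iommu)
    if chosen is None:
        return -1
    return chosen.dma_map(container_fd)
```

Supporting another IOMMU type then amounts to appending one more entry with its own mapping callback, which matches the extension path the commit message describes.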

 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 205 +
 lib/librte_eal/linuxapp/eal/eal_vfio.h |   8 ++
 2 files changed, 160 insertions(+), 53 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
index 74f91ba..a6c7e16 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
@@ -72,11 +72,74 @@ EAL_REGISTER_TAILQ(rte_vfio_tailq)
 #define VFIO_DIR "/dev/vfio"
 #define VFIO_CONTAINER_PATH "/dev/vfio/vfio"
 #define VFIO_GROUP_FMT "/dev/vfio/%u"
+#define VFIO_NOIOMMU_GROUP_FMT "/dev/vfio/noiommu-%u"
 #define VFIO_GET_REGION_ADDR(x) ((uint64_t) x << 40ULL)

 /* per-process VFIO config */
 static struct vfio_config vfio_cfg;

+/* DMA mapping function prototype.
+ * Takes VFIO container fd as a parameter.
+ * Returns 0 on success, -1 on error.
+ * */
+typedef int (*vfio_dma_func_t)(int);
+
+struct vfio_iommu_type {
+   int type_id;
+   const char *name;
+   vfio_dma_func_t dma_map_func;
+};
+
+static int vfio_type1_dma_map(int);
+static int vfio_noiommu_dma_map(int);
+
+/* IOMMU types we support */
+static const struct vfio_iommu_type iommu_types[] = {
+   /* x86 IOMMU, otherwise known as type 1 */
+   { RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+   /* IOMMU-less mode */
+   { RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+};
+
+int
+vfio_type1_dma_map(int vfio_container_fd)
+{
+   const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+   int i, ret;
+
+   /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
+   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+   struct vfio_iommu_type1_dma_map dma_map;
+
+   if (ms[i].addr == NULL)
+   break;
+
+   memset(&dma_map, 0, sizeof(dma_map));
+   dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+   dma_map.vaddr = ms[i].addr_64;
+   dma_map.size = ms[i].len;
+   dma_map.iova = ms[i].phys_addr;
+   dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
+   "error %i (%s)\n", errno, strerror(errno));
+   return -1;
+   }
+   }
+
+   return 0;
+}
+
+int
+vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
+{
+   /* No-IOMMU mode does not need DMA mapping */
+   return 0;
+}
+
 int
 pci_vfio_read_config(const struct rte_intr_handle *intr_handle,
void *buf, size_t len, off_t offs)
@@ -208,42 +271,58 @@ pci_vfio_set_bus_master(int dev_fd)
return 0;
 }

-/* set up DMA mappings */
-static int
-pci_vfio_setup_dma_maps(int vfio_container_fd)
-{
-   const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-   int i, ret;
-
-   ret = ioctl(vfio_container_fd, VFIO_SET_IOMMU,
-   VFIO_TYPE1_IOMMU);
-   if (ret) {
-   RTE_LOG(ERR, EAL, "  cannot set IOMMU type, "
-   "error %i (%s)\n", errno, strerror(errno));
-   return -1;
+/* pick IOMMU type. returns a pointer to vfio_iommu_type or NULL for error */
+static const struct vfio_iommu_type *
+pci_vfio_set_iommu_type(int vfio_container_fd) {
+   unsigned idx;
+   for (idx = 0; idx < RTE_DIM(iommu_types); idx++) {
+   const struct vfio_iommu_type *t = &iommu_types[idx];
+
+   int ret = ioctl(vfio_container_fd, VFIO_SET_IOMMU,
+   t->type_id);
+   if (!ret) {
+

[dpdk-dev] [PATCH v3 4/4] virtio: check if kernel driver is manipulating the virtio device

2016-01-28 Thread Panu Matilainen

On 01/27/2016 05:21 PM, Huawei Xie wrote:
> v3 changes:
>   change log message to tell user that the virtio device is skipped
> due to it is managed by kernel driver, instead of asking user to
> unbind it from kernel driver.
>
> v2 changes:
>   change LOG level from ERR to INFO
>
> virtio PMD could use IO port to configure the virtio device without
> using uio driver(vfio-noniommu mode should work as well).
>
> There are two issues with previous implementation:
> 1) virtio PMD will take over each virtio device blindly even if some
> are not intended for DPDK.
> 2) driver conflict between virtio PMD and virtio-net kernel driver.
>
> This patch checks if there is any kernel driver manipulating the virtio
> device before virtio PMD uses IO port to configure the device.
>
> Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource")
>
> Signed-off-by: Huawei Xie 
> ---
>   drivers/net/virtio/virtio_ethdev.c | 5 +
>   1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> b/drivers/net/virtio/virtio_ethdev.c
> index e815acd..ea1874a 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1138,6 +1138,11 @@ static int virtio_resource_init_by_ioports(struct 
> rte_pci_device *pci_dev)
>   int found = 0;
>   size_t linesz;
>
> + if (pci_dev->kdrv != RTE_KDRV_NONE) {
> + PMD_INIT_LOG(INFO, "skip kernel managed virtio device.");
> + return -1;
> + }
> +
>   snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
>pci_dev->addr.domain,
>pci_dev->addr.bus,
>

"Manage" is a good term for this, much better than "manipulate" used in 
the subject of this patch and patch 2/4.

"Check if kernel is manipulating foo" sounds like something that is 
happening right now, as in "wait until kernel has stopped fiddling with 
it and then do our own stuff while it's quiet", whereas "managed" makes it
clear it's about the overall state instead.

Not asking you to submit v4 just because of that, but if the need arises 
for other reasons it'd be nice to fix it as well, otherwise perhaps 
Thomas can adjust it while committing?

- Panu -

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-28 Thread Tetsuya Mukawa

On 2016/01/28 1:45, Xie, Huawei wrote:
> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>> +qtest_find_pci_device(struct qtest_session *s, uint16_t bus, uint8_t device)
>> +{
>> +struct qtest_pci_device *dev;
>> +uint32_t val;
>> +
>> +val = qtest_pci_inl(s, bus, device, 0, 0);
>> +TAILQ_FOREACH(dev, &s->head, next) {
>> +if (val == ((uint32_t)dev->device_id << 16 | dev->vendor_id)) {
>> +dev->bus_addr = bus;
>> +dev->device_addr = device;
>> +return;
>> +}
>> +
>> +}
>> +}
>> +
>> +static int
>> +qtest_init_pci_devices(struct qtest_session *s)
>> +{
>> +struct qtest_pci_device *dev;
>> +uint16_t bus;
>> +uint8_t device;
>> +int ret;
>> +
>> +/* Find devices */
>> +bus = 0;
>> +do {
>> +device = 0;
>> +do {
>> +qtest_find_pci_device(s, bus, device);
>> +} while (device++ != NB_DEVICE - 1);
>> +} while (bus++ != NB_BUS - 1);
> Seems this scan of all the pci devices is very time consuming operation,
> and each scan involves socket communication.
> Do you measure how long it takes to do the pci devices initialization?

I measured it, and it seems to take 0.35 seconds in my environment.
This is done only once, when the port is initialized, so it is probably
not so heavy.

Tetsuya

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-28 Thread Tetsuya Mukawa

On 2016/01/28 0:58, Xie, Huawei wrote:
> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
> [snip]
>> +
>> +static int
>> +qtest_raw_recv(int fd, char *buf, size_t count)
>> +{
>> +size_t len = count;
>> +size_t total_len = 0;
>> +int ret = 0;
>> +
>> +while (len > 0) {
>> +ret = read(fd, buf, len);
>> +if (ret == (int)len)
>> +break;
>> +if (*(buf + ret - 1) == '\n')
>> +break;
> The above two lines should be put after the below if block.

Yes, it should be so.
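
A minimal sketch of the reordered loop, not the patch itself: the -1 check
runs before buf is indexed with the return value. The name raw_recv_sketch
and the handling of a zero-length read (peer closed the socket) are
assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical reordered variant of qtest_raw_recv(): the error check
 * comes first, so buf[total_len - 1] is never evaluated after a failed
 * read(). Returns the number of bytes received, or -1 on error.
 */
static int
raw_recv_sketch(int fd, char *buf, size_t count)
{
	size_t total_len = 0;

	while (total_len < count) {
		ssize_t ret = read(fd, buf + total_len, count - total_len);

		if (ret == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (ret == 0)
			break;		/* assumption: treat EOF as end of data */

		total_len += (size_t)ret;
		if (buf[total_len - 1] == '\n')
			break;		/* a message terminator was received */
	}
	return (int)total_len;
}
```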

>
>> +if (ret == -1) {
>> +if (errno == EINTR)
>> +continue;
>> +return ret;
>> +}
>> +total_len += ret;
>> +buf += ret;
>> +len -= ret;
>> +}
>> +return total_len + ret;
>> +}
>> +
> [snip]
>
>> +
>> +static void
>> +qtest_handle_one_message(struct qtest_session *s, char *buf)
>> +{
>> +int ret;
>> +
>> +if (strncmp(buf, interrupt_message, strlen(interrupt_message)) == 0) {
>> +if (rte_atomic16_read(&s->enable_intr) == 0)
>> +return;
>> +
>> +/* relay interrupt to pipe */
>> +ret = write(s->irqfds.writefd, "1", 1);
>> +if (ret < 0)
>> +rte_panic("cannot relay interrupt\n");
>> +} else {
>> +/* relay normal message to pipe */
>> +ret = qtest_raw_send(s->msgfds.writefd, buf, strlen(buf));
>> +if (ret < 0)
>> +rte_panic("cannot relay normal message\n");
>> +}
>> +}
>> +
>> +static char *
>> +qtest_get_next_message(char *p)
>> +{
>> +p = strchr(p, '\n');
>> +if ((p == NULL) || (*(p + 1) == '\0'))
>> +return NULL;
>> +return p + 1;
>> +}
>> +
>> +static void
>> +qtest_close_one_socket(int *fd)
>> +{
>> +if (*fd > 0) {
>> +close(*fd);
>> +*fd = -1;
>> +}
>> +}
>> +
>> +static void
>> +qtest_close_sockets(struct qtest_session *s)
>> +{
>> +qtest_close_one_socket(&s->qtest_socket);
>> +qtest_close_one_socket(&s->msgfds.readfd);
>> +qtest_close_one_socket(&s->msgfds.writefd);
>> +qtest_close_one_socket(&s->irqfds.readfd);
>> +qtest_close_one_socket(&s->irqfds.writefd);
>> +qtest_close_one_socket(&s->ivshmem_socket);
>> +}
>> +
>> +/*
>> + * This thread relays QTest response using pipe.
>> + * The function is needed because we need to separate IRQ message from 
>> others.
>> + */
>> +static void *
>> +qtest_event_handler(void *data) {
>> +struct qtest_session *s = (struct qtest_session *)data;
>> +char buf[1024];
>> +char *p;
>> +int ret;
>> +
>> +for (;;) {
>> +memset(buf, 0, sizeof(buf));
>> +ret = qtest_raw_recv(s->qtest_socket, buf, sizeof(buf));
>> +if (ret < 0) {
>> +qtest_close_sockets(s);
>> +return NULL;
>> +}
>> +
>> +/* may receive multiple messages at the same time */
> From the qtest_raw_recv implementation, if at some point one message is
> received by two qtest_raw_recv calls, then is that message discarded?
> We could save the last incomplete message in buffer, and combine the
> message received next time together.

I guess we don't lose replies from QEMU.
Please let me describe more.

According to the qtest specification, after sending a message, we need
to receive a reply like below.
APP: ---command---> QEMU
APP: <---OK QEMU

But to handle interrupt messages, we also need to take care of the case below.
APP: ---command---> QEMU
APP: <---interrupt QEMU
APP: <---OK QEMU

Also, we need to handle the case where multiple threads try to send a
qtest message.
Anyway, here is the current implementation.

So far, we have 3 types of sockets.
1. socket for qtest messaging.
2. socket for relaying normal message.
3. socket for relaying interrupt message.

About read direction:
The qtest socket is only read by "qtest_event_handler". The handler may
receive multiple messages at once.
In that case, the handler splits the messages and sends each one to the
normal message socket or the interrupt message socket.

About write direction:
The qtest socket is written by the functions below.
 - qtest_raw_in/out
 - qtest_raw_read/write
But every function that uses them must hold a mutex before
sending messages.
So messages never overlap, and only one thread at a time reads
the socket for relaying normal messages.
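
The suggestion to keep the last incomplete message and combine it with the
next read could be sketched as below. This is only an illustration under
assumptions: split_messages, line_cb and count_irq_cb are hypothetical
names, and the "IRQ " prefix is an assumed stand-in for interrupt_message.

```c
#include <stddef.h>
#include <string.h>

typedef void (*line_cb)(const char *msg, size_t len, void *arg);

/*
 * Deliver every complete '\n'-terminated message found in buf[0..*used)
 * to cb, then move any trailing partial message to the front of buf so
 * the next read can append to it. Returns the number of messages delivered.
 */
static int
split_messages(char *buf, size_t *used, line_cb cb, void *arg)
{
	size_t start = 0;
	int delivered = 0;
	char *nl;

	while (start < *used &&
	       (nl = memchr(buf + start, '\n', *used - start)) != NULL) {
		size_t len = (size_t)(nl - (buf + start)) + 1;

		cb(buf + start, len, arg);
		start += len;
		delivered++;
	}

	/* keep the incomplete tail for the next call */
	memmove(buf, buf + start, *used - start);
	*used -= start;
	return delivered;
}

/* Example callback: counts messages that start with the assumed prefix. */
static void
count_irq_cb(const char *msg, size_t len, void *arg)
{
	static const char prefix[] = "IRQ ";	/* assumption, see above */

	if (len >= sizeof(prefix) - 1 &&
	    memcmp(msg, prefix, sizeof(prefix) - 1) == 0)
		(*(int *)arg)++;
}
```

The caller would read new data at buf + used on the next iteration, so a
reply split across two reads is reassembled instead of being dropped.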

Tetsuya

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-28 Thread Tetsuya Mukawa

On 2016/01/27 19:03, Xie, Huawei wrote:
> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>> +/* Set BAR region */
>> +for (i = 0; i < NB_BAR; i++) {
>> +switch (dev->bar[i].type) {
>> +case QTEST_PCI_BAR_IO:
>> +case QTEST_PCI_BAR_MEMORY_UNDER_1MB:
>> +case QTEST_PCI_BAR_MEMORY_32:
>> +qtest_pci_outl(s, bus, device, 0, dev->bar[i].addr,
>> +dev->bar[i].region_start);
>> +PMD_DRV_LOG(INFO, "Set BAR of %s device: 0x%lx - 
>> 0x%lx\n",
>> +dev->name, dev->bar[i].region_start,
>> +dev->bar[i].region_start + 
>> dev->bar[i].region_size);
>> +break;
>> +case QTEST_PCI_BAR_MEMORY_64:
>> +qtest_pci_outq(s, bus, device, 0, dev->bar[i].addr,
>> +dev->bar[i].region_start);
>> +PMD_DRV_LOG(INFO, "Set BAR of %s device: 0x%lx - 
>> 0x%lx\n",
>> +dev->name, dev->bar[i].region_start,
>> +dev->bar[i].region_start + 
>> dev->bar[i].region_size);
>> +break;
> Hasn't the bar resource already been allocated? Is it the app's
> responsibility to allocate the bar resource in qtest mode? The app
> couldn't have that knowledge.

Yes. In qtest mode, the app should register the above values.
(Without that, the default values are 0.)
Usually this is done by the BIOS or UEFI, but in qtest mode these
are not invoked.
So we need to define the above values, and also need to enable the PCI devices.

In this release, I just register hard-coded values, except for one of the
ivshmem BARs.
In the next release, I will describe the memory map in a comment.

Tetsuya

[dpdk-dev] [PATCH v2] fix checkpatch errors

2016-01-28 Thread Panu Matilainen

On 01/28/2016 10:38 AM, Xie, Huawei wrote:
> On 1/28/2016 4:06 PM, Thomas Monjalon wrote:
>> 2016-01-28 03:09, Xie, Huawei:
>>> On 1/28/2016 2:17 AM, Thomas Monjalon wrote:
>>>> 2016-01-27 01:26, Huawei Xie:
>>>>> v2 changes:
>>>>>   add missed commit message in v1
>>>>>
>>>>> fix the error reported by checkpatch:
>>>>>   "ERROR: return is not a function, parentheses are not required"
>>>>>
>>>>> also removed other extra parentheses like:
>>>>>   "return val == 0"
>>>>>   "return (rte_mempool_lookup(...))"
>>>> How these examples are differents from above checkpatch error?
>>> Don't get it.
>> Me too ;)
>> I don't understand which paren you removed in "return val == 0"
>> and why you say "also removed other...", meaning it is different
>> from the checkpatch error.
>
> Got you. I thought your example means DPDK examples.
> I mean i also removed paren in "return (val == 0)". But checkpatch
> doesn't report "return (logical expression)" as error. I think it is
> also not necessary, so removed some of them. That is why i listed them
> separately.
>

So perhaps there's a reason checkpatch doesn't report it as an error?
At least I find the parentheses to increase readability in case of 
logical expressions, for example

return val == 0;

return (val == 0);

The parentheses kinda force you to notice there's something special 
going on and it's not val that's returned. This "note there's something 
special here" of course only works if parentheses are not sprinkled 
around everywhere.

- Panu -

[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

2016-01-28 Thread Tetsuya Mukawa

On 2016/01/27 18:39, Xie, Huawei wrote:
> On 1/26/2016 10:58 AM, Tetsuya Mukawa wrote:
>> On 2016/01/25 19:15, Xie, Huawei wrote:
>>
>> BTW, my container implementation needed a QEMU patch in the case of
>> vhost-user.
>> But the patch has been merged in upstream QEMU, so we don't have this
>> limitation any more.
> Great, better put the QEMU dependency information in the commit message

Thanks for all your comments and careful review.

So far, I am not sure what the next QEMU version will be.
But I will add it after QEMU releases a new one.

Tetsuya

[dpdk-dev] Fw: dpdk-armv7 - Build # 264 - Failure!

2016-01-28 Thread Jan Viktorin

Hello, it seems the build is broken for armv7 in the master branch. I am sorry,
I am away from my office and cannot do any deeper analysis at the moment.

Jan Viktorin
RehiveTech
Sent from a mobile device
Original message

dpdk-armv7 - Build # 264 - Failure

See the attached log file.

[dpdk-dev] [PATCH 2/3] rte_ctrl_if: add control interface library

2016-01-28 Thread Remy Horton

On 27/01/2016 16:24, Ferruh Yigit wrote:

 > +default:
 > +ret = -95 /* EOPNOTSUPP */;
 > +break;

Is this intentional? -EOPNOTSUPP is -122 (-95 is -ENOTSOCK)..
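
For illustration, a sketch that uses the symbolic constant from <errno.h>
instead of a literal. Errno numbers can differ between architectures and
operating systems, so a hard-coded value is fragile. handle_cmd_sketch and
its command ids are hypothetical and not taken from the patch.

```c
#include <errno.h>

/* Hypothetical command dispatcher: only command 0 is "supported". */
static int
handle_cmd_sketch(int cmd)
{
	int ret;

	switch (cmd) {
	case 0:
		ret = 0;
		break;
	default:
		/* symbolic name, correct on every platform */
		ret = -EOPNOTSUPP;
		break;
	}
	return ret;
}
```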

[dpdk-dev] [PATCH] lib: remove "extern" keyword for functions from header files

2016-01-28 Thread Ferruh Yigit

On Wed, Jan 27, 2016 at 07:05:52PM +0100, Thomas Monjalon wrote:
> 2016-01-25 10:01, Ferruh Yigit:
> > Remove "extern" keywords in header files, the ones for function
> > prototypes
> 
> I've seen a lot of other extern keywords. Why not removing all?
> 
The remaining ones are in the Linux drivers in KNI. They are kind of internal
headers, so I am hesitant to touch them.
Should I remove them all?
There are also more uses in the "drivers" folder, and I am not sure about
touching those either. What do you think?

> > -extern int rte_eth_dev_configure(uint8_t port_id,
> > -uint16_t nb_rx_queue,
> > -uint16_t nb_tx_queue,
> > -const struct rte_eth_conf *eth_conf);
> > +int rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_queue,
> > + uint16_t nb_tx_queue,
> > + const struct rte_eth_conf *eth_conf);
> 
> The indent is weird.
> Why not follow the guideline with 2 tabs?
> http://dpdk.org/doc/guides-2.2/contributing/coding_style.html#c-indentation

I intentionally kept them as in the original, to limit the scope of the patch
to removing a keyword.
Do you want me to fix the indentation wherever I touch in this patch?

Thanks,
ferruh

[dpdk-dev] [PATCH v5] vfio: Support for no-IOMMU mode

2016-01-28 Thread Burakov, Anatoly

> 2016-01-27 16:50, Anatoly Burakov:
> > --- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> > +int vfio_type1_dma_map(int);
> > +int vfio_noiommu_dma_map(int);
> 
> WARNING:AVOID_EXTERNS: externs should be avoided in .c files
> I agree with checkpatch, they should be static ;)
> 

Will fix that.

> > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
> > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
> > +/* older kernels may not have no-IOMMU mode */
> > +#ifndef VFIO_NOIOMMU_IOMMU
> > +#define VFIO_NOIOMMU_IOMMU 8
> > +#endif
> 
> Shouldn't it be defined privately in .c file?

We already have other VFIO-related definitions in that file, specifically the 
PCI defines that aren't present in earlier kernels. This definition is similar 
in nature - it will be present in kernels starting from 4.5 (when NOIOMMU was 
introduced), but earlier kernels will need this defined. I didn't want to go 
similar route with redefining everything VFIO-related, but maybe it makes sense 
in this case for consistency's sake? E.g.

#define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU [we're already in an ifdef linux >= 
3.6, so define type1 unconditionally]
#if linux < 4.5
#define RTE_VFIO_NOIOMMU 8
#else
#define RTE_VFIO_NOIOMMU VFIO_NOIOMMU_IOMMU
#endif

Or something like that?
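
Spelled out as compilable C, the idea could look roughly like this. It is a
sketch under assumptions: it compares LINUX_VERSION_CODE from
<linux/version.h>, which reflects the installed kernel headers rather than
the running kernel, and vfio_noiommu_type_sketch() is a hypothetical helper
added only so the resulting value can be checked.

```c
#include <linux/version.h>
#include <linux/vfio.h>

/* type1 is available in every kernel that has VFIO at all */
#define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU

/* headers older than 4.5 do not define the no-IOMMU type id */
#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 5, 0)
#define RTE_VFIO_NOIOMMU 8
#else
#define RTE_VFIO_NOIOMMU VFIO_NOIOMMU_IOMMU
#endif

/* Hypothetical helper, for illustration only. */
static int
vfio_noiommu_type_sketch(void)
{
	return RTE_VFIO_NOIOMMU;
}
```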

Thanks,
Anatoly

[dpdk-dev] Errors Rx count increasing while pktgen doing nothing on Intel 82598EB 10G

2016-01-28 Thread Moon-Sang Lee

Helin, I implemented my own sample application that is a kind of carrier
grade NAT server.
It works fine on 1G NIC (i.e. Intel Corporation 82576 Gigabit Network
Connection (rev 01))
But, it does not receive packets on 10G NIC (i.e. Intel Corporation 82598EB
10-Gigabit AF Network Connection (rev 01)) as described in the previous
email.
According to my log messages, it seems that control register for RX DMA is
not enabled.

Here is some information about my environment.

1. HW & OS
[mslee@centos7 ~]$ uname -a
Linux centos7 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[mslee@centos7 ~]$ more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU   E5520  @ 2.27GHz
stepping : 5
microcode : 0x19
cpu MHz : 2262.000
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 0
cpu cores : 4
apicid : 16
initial apicid : 16
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp
lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca
sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4521.93
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
...



2. port configure parameter for rte_eth_dev_configure():
ret = rte_eth_dev_configure(port, NB_RXQ, NB_TXQ, &port_conf);
where NB_RXQ=1, NB_TXQ=2, and
struct rte_eth_conf port_conf = {
.rxmode = {
//.mq_mode = ETH_MQ_RX_RSS,
.mq_mode = ETH_MQ_RX_NONE,  // for 10G NIC
.max_rx_pkt_len = ETHER_MAX_LEN,
.split_hdr_size = 0,
.header_split = 0,  // Header Split disabled
.hw_ip_checksum = 0,// IP checksum offload disabled
.hw_vlan_filter = 0,// VLAN filtering disabled
.jumbo_frame = 0,   // Jumbo Frame Support disabled
.hw_strip_crc = 0,  // CRC not stripped by hardware
},
.rx_adv_conf = {
.rss_conf = {
.rss_key = NULL,
.rss_hf = ETH_RSS_IP,
},
},
.txmode = {
.mq_mode = ETH_MQ_TX_NONE,
},
};



3. rx queue setup parameter
ret = rte_eth_rx_queue_setup(port, RXQ_ID, NB_RXD, socket_id,  NULL,
pktmbuf_pool[socket_id])
where RXQ_ID = 0, NB_RXD = 128



4. config parameters in config/common_linuxapp
#
# Compile burst-oriented IXGBE PMD driver
#
CONFIG_RTE_LIBRTE_IXGBE_PMD=y
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_INIT=n
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_RX=n
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_TX=n
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_TX_FREE=n
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_DRIVER=n
CONFIG_RTE_LIBRTE_IXGBE_PF_DISABLE_STRIP_CRC=n
CONFIG_RTE_IXGBE_INC_VECTOR=y
CONFIG_RTE_IXGBE_RX_OLFLAGS_ENABLE=y



5. where log message is printed

dpdk-2.2.0/drivers/net/ixgbe/ixgbe_rxtx.c:

/* Allocate buffers for descriptor rings */
if (ixgbe_alloc_rx_queue_mbufs(rxq) != 0) {
PMD_INIT_LOG(ERR, "Could not alloc mbuf for queue:%d",
 rx_queue_id);
return -1;
}
rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(rxq->reg_idx));
rxdctl |= IXGBE_RXDCTL_ENABLE;
IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(rxq->reg_idx), rxdctl);

/* Wait until RX Enable ready */
poll_ms = RTE_IXGBE_REGISTER_POLL_WAIT_10_MS;
do {
rte_delay_ms(1);
rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(rxq->reg_idx));
} while (--poll_ms && !(rxdctl & IXGBE_RXDCTL_ENABLE));
if (!poll_ms)
PMD_INIT_LOG(ERR, "*Could not enable Rx Queue %d*",
 rx_queue_id);


I'm going to update the firmware of my NIC, but I'm not sure it will help.
I would appreciate any comments.



On Wed, Jan 27, 2016 at 4:23 PM, Zhang, Helin  wrote:

> Moon-Sang
>
> Were you using pktgen or else application?
> Could you help to share with me the detailed steps of your reproducing
> that issue?
> We will find time on that soon later. Thanks!
>
> Regards,
> Helin
>
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Laurent GUERBY
> Sent: Wednesday, January 27, 2016 3:16 PM
> To: Moon-Sang Lee 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Errors Rx count increasing while pktgen doing
> nothing on Intel 82598EB 10G
>
> On Wed, 2016-01-27 at 15:50 +0900, Moon-Sang Lee wrote:
> >
> >
> > Laurent, have you resolved this problem?
> > I'm using the same NIC as yours (i.e. Intel 82598EB 10G NIC) and faced
> > the same problem as you.
> > Here is parts of my log and it says that PMD cannot enable RX queue
> > for my NIC.
> > I'm using DPDK 2.2.0 and used 'null' for the 4th parameter in calling
> > rte_eth_rx_queue_setup().
> > (i.e. 'null' parameter provides the default rx_conf value.)
>
> Hi,
>
> I had to reuse my DPDK machines for another task, I will

[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module

2016-01-28 Thread Remy Horton

Comments inline

..Remy


On 27/01/2016 16:24, Ferruh Yigit wrote:
 > This kernel module is based on KNI module, but this one is stripped
 > version of it and only for control messages, no data transfer
 > functionality provided.
 >
 > This Linux kernel module helps userspace application create virtual
 > interfaces and when a control command issued into that virtual
 > interface, module pushes the command to the userspace and gets the
 > response back for the caller application.
 >
 > Signed-off-by: Ferruh Yigit 
 > ---


 > +net_dev = alloc_netdev(sizeof(struct kcp_dev), name,
 > +#ifdef NET_NAME_UNKNOWN
 > +NET_NAME_UNKNOWN,
 > +#endif
 > +kcp_net_init);

Something doesn't feel quite right here. In cases where NET_NAME_UNKNOWN 
is undefined, is the signature for alloc_netdev different?


 > +MODULE_LICENSE("Dual BSD/GPL");
 > +MODULE_AUTHOR("Intel Corporation");
 > +MODULE_DESCRIPTION("Kernel Module for managing kcp devices");

I'm not up to speed on this area, but some of the file headers only 
mention GPL/LGPL. This correct?


 > +nlmsg_unicast(nl_sock, skb, pid);
 > +KCP_DBG("Sent cmd:%d port:%d\n", cmd_id, port_id);
 > +
 > +/*nlmsg_free(skb);*/
 > +
 > +return 0;
 > +}

Oops.. :)
Possible memory leak, or is *skb statically allocated?

[dpdk-dev] [PATCH V1 1/1] jobstats: added function abort for job

2016-01-28 Thread Panu Matilainen

On 01/27/2016 05:57 PM, Jastrzebski, MichalX K wrote:
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen
>> Sent: Wednesday, January 27, 2016 2:38 PM
>> To: Kerlin, MarcinX ; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH V1 1/1] jobstats: added function abort for job
>>
>> On 01/26/2016 06:15 PM, Marcin Kerlin wrote:
>>> This patch adds new function rte_jobstats_abort. It marks *job* as finished
>>> and time of this work will be add to management time instead of execution
>> time.
>>> This function should be used instead of rte_jobstats_finish if condition
>> occure,
>>> condition is defined by the application for example when receiving n>0
>> packets.
>>>
>>> Signed-off-by: Marcin Kerlin 
>>> ---
>>>lib/librte_jobstats/rte_jobstats.c   | 22 ++
>>>lib/librte_jobstats/rte_jobstats.h   | 17 +
>>>lib/librte_jobstats/rte_jobstats_version.map |  7 +++
>>>3 files changed, 46 insertions(+)
>>>
>> [...]
>>> diff --git a/lib/librte_jobstats/rte_jobstats.h
>> b/lib/librte_jobstats/rte_jobstats.h
>>> index de6a89a..9995319 100644
>>> --- a/lib/librte_jobstats/rte_jobstats.h
>>> +++ b/lib/librte_jobstats/rte_jobstats.h
>>> @@ -90,6 +90,9 @@ struct rte_jobstats {
>>> uint64_t exec_cnt;
>>> /**< Execute count. */
>>>
>>> +   uint64_t last_job_time;
>>> +   /**< Last job time */
>>> +
>>> char name[RTE_JOBSTATS_NAMESIZE];
>>> /**< Name of this job */
>>>
>>
>> AFAICS this is an ABI break and as such, needs to be preannounced, see
>> http://dpdk.org/doc/guides/contributing/versioning.html
>> For 2.3 it'd need to be a CONFIG_RTE_NEXT_ABI feature.
>>
>>  - Panu -
>
> Hi Panu,
> Thanks for Your notice. This last_job_time field is actually not necessary 
> here
> and will be removed from this structure.

Right, makes sense. You can always add it later on when there's a more 
pressing need to break the ABI.
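
To illustrate why inserting a member in the middle of a public struct
breaks the ABI, here is a small sketch. The two structs are simplified,
hypothetical stand-ins for struct rte_jobstats before and after the patch,
and NAMESIZE_SKETCH stands in for RTE_JOBSTATS_NAMESIZE.

```c
#include <stddef.h>
#include <stdint.h>

#define NAMESIZE_SKETCH 32	/* hypothetical value */

/* layout an already-built application was compiled against */
struct jobstats_old_sketch {
	uint64_t exec_cnt;
	char name[NAMESIZE_SKETCH];
};

/* layout after inserting last_job_time before name */
struct jobstats_new_sketch {
	uint64_t exec_cnt;
	uint64_t last_job_time;
	char name[NAMESIZE_SKETCH];
};

/* Offset at which old binaries expect to find name. */
static size_t
name_offset_old(void)
{
	return offsetof(struct jobstats_old_sketch, name);
}

/* Offset at which the rebuilt library actually stores name. */
static size_t
name_offset_new(void)
{
	return offsetof(struct jobstats_new_sketch, name);
}
```

An application built against the old header would read name at the old
offset, which in the new layout holds last_job_time, and the struct size
it allocates would be too small.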

- Panu -

[dpdk-dev] Future Direction for rte_eth_stats_get()

2016-01-28 Thread Van Haaren, Harry

> From: David Harton
> 
> enum rte_eth_stat_e {
> /* accurate desc #1 */
> RTE_ETH_STAT_1,
> /* accurate desc #2 */
> RTE_ETH_STAT_2,
> ...
> }
> struct rte_eth_id_stat {
> rte_eth_stat_e id;
> uint64_t value;
> }
> 
> int rte_eth_id_stats_num(uint8_t port_id, uint32_t *num_stats);
> /* returns < 0 on error or the number of stats that could have been read 
> (i.e. if userd
> */
> int rte_eth_id_stats_get(uint8_t port_id, uint32_t num_stats, rte_eth_id_stat 
> *id_stats);
> const char* rte_eth_id_stat_str(rte_eth_stat_e id);
> 
> This allows a driver to return whatever stats that it supports in a 
> consistent manner and
> also in a performance friendly way.  In fact, the driver side would be 
> identical to what
> they have today but instead of having arrays with "string stat name" they 
> will have the
> rte_eth_stat_e.


Thanks for the code and explanation.


> > RE: Thomas asking about performance numbers:
> > I can scrape together some raw tsc data on Monday and post to list, and we
> > can discuss it more then.
> 
> I can do the same if desired.  But, just to make sure we are discussing the 
> same issue:
> 
> 1) call rte_eth_xstats_get()
> This will result in many string copies and depending on the driver *many* 
> copies I don't
> want or care about.
> 2) "tokenize"/parse/hash the string returned to identify what the stat 
> actually is
> I'm guessing you are stating that this step could be mitigated at startup.  
> But, again, I
> don't think the API provides a guarantee which usually leads to bugs over 
> time.
> 3) Copy the value of the stat into the driver agnostic container the 
> application uses
> 4) Repeat steps 1-3 for every interface being serviced every 5 or 10 secs
> 
> Contrast that to my suggestion which has no string copies and a compile time 
> mapping
> between "stat_id" and "app stat" can be created for step 2.  I think the 
> performance
> differences are obvious even without generating cycle times.


Indeed using integers will reduce overhead compared to
strings, and the helper function to convert the integer
to string provides the same possibilities as the current
API (in a different way).
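
A rough sketch of what the integer-id approach could look like on the
application side. Everything here is hypothetical and only modelled on the
proposal quoted above: the enum values, the struct and function names, and
the stat names are assumptions, not an existing ethdev API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stat ids; a real list would come from the ethdev API. */
enum eth_stat_id_sketch {
	STAT_RX_PACKETS,
	STAT_TX_PACKETS,
	STAT_RX_ERRORS,
	STAT_ID_MAX
};

struct eth_id_stat_sketch {
	enum eth_stat_id_sketch id;
	uint64_t value;
};

/* Id-to-name helper, only needed when a stat is displayed. */
static const char *
eth_id_stat_str_sketch(enum eth_stat_id_sketch id)
{
	static const char *const names[STAT_ID_MAX] = {
		[STAT_RX_PACKETS] = "rx_packets",
		[STAT_TX_PACKETS] = "tx_packets",
		[STAT_RX_ERRORS] = "rx_errors",
	};

	if ((unsigned int)id >= STAT_ID_MAX)
		return NULL;
	return names[id];
}

/* Driver-agnostic container used by the application. */
struct app_stats_sketch {
	uint64_t rx_packets;
	uint64_t tx_packets;
	uint64_t rx_errors;
};

/* Copy values by id: a switch on an integer, no string parsing. */
static void
app_stats_fill_sketch(struct app_stats_sketch *out,
		const struct eth_id_stat_sketch *in, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		switch (in[i].id) {
		case STAT_RX_PACKETS:
			out->rx_packets = in[i].value;
			break;
		case STAT_TX_PACKETS:
			out->tx_packets = in[i].value;
			break;
		case STAT_RX_ERRORS:
			out->rx_errors = in[i].value;
			break;
		default:
			break;	/* stat the application does not track */
		}
	}
}
```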

I haven't collected performance data yet, apologies for
the delay. Perhaps continuing this conversation after the
V1 patch deadline at the end of the week is a good idea?
I'll have more time to dedicate to thinking about this.


-Harry

[dpdk-dev] [PATCH v2 1/4] lib/ether: optimize the 'rte_eth_tunnel_filter_conf' structure

2016-01-28 Thread Thomas Monjalon

2016-01-28 15:30, Xutao Sun:
> Change the fields of outer_mac and inner_mac from pointer to struct in order 
> to keep the code's readability.

[...]
> - tunnel_filter_conf.outer_mac = >outer_mac;
> - tunnel_filter_conf.inner_mac = >inner_mac;
> + (void)rte_memcpy(&tunnel_filter_conf.outer_mac, >outer_mac,
> + ETHER_ADDR_LEN);
> + (void)rte_memcpy(&tunnel_filter_conf.inner_mac, >inner_mac,
> + ETHER_ADDR_LEN);

The (void) casting is useless here.

> --- a/lib/librte_ether/rte_eth_ctrl.h
> +++ b/lib/librte_ether/rte_eth_ctrl.h
> @@ -280,8 +280,8 @@ enum rte_tunnel_iptype {
>   * Tunneling Packet filter configuration.
>   */
>  struct rte_eth_tunnel_filter_conf {
> - struct ether_addr *outer_mac;  /**< Outer MAC address filter. */
> - struct ether_addr *inner_mac;  /**< Inner MAC address filter. */
> + struct ether_addr outer_mac;  /**< Outer MAC address filter. */
> + struct ether_addr inner_mac;  /**< Inner MAC address filter. */

It is an API change.
Please remove the deprecation notice and update the release notes
in this patch (atomically).
