[dpdk-dev] [PATCH v5 5/5] app/pdump: fix type casting of ring size

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, June 24, 2016 5:36 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma 
> Subject: [dpdk-dev] [PATCH v5 5/5] app/pdump: fix type casting of ring
> size
> 
> ring_size value is wrongly cast to uint16_t.
> It should be cast to uint32_t, as the maximum ring size is 28 bits long.
> The wrong cast wraps around ring size values bigger than 65535.
> 
> Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: John McNamara 



[dpdk-dev] [PATCH v5 4/5] app/pdump: fix string overflow

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, June 24, 2016 5:36 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma 
> Subject: [dpdk-dev] [PATCH v5 4/5] app/pdump: fix string overflow
> 
> Replaced strncpy with snprintf to copy the strings safely.
> 
> Coverity issue 127351: string overflow
> 
> Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: John McNamara 



[dpdk-dev] [PATCH v5 3/5] pdump: fix string overflow

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, June 24, 2016 5:36 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma 
> Subject: [dpdk-dev] [PATCH v5 3/5] pdump: fix string overflow
> 
> Replaced strncpy with snprintf to copy the strings safely.
> 
> Coverity issue 127350: string overflow
> 
> Fixes: 278f945402c5 ("pdump: add new library for packet capture")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: John McNamara 



[dpdk-dev] [PATCH v5 2/5] pdump: check getenv return value

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, June 24, 2016 5:36 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma 
> Subject: [dpdk-dev] [PATCH v5 2/5] pdump: check getenv return value
> 
> Inside pdump_get_socket_path(), getenv() can return a NULL pointer if the
> match for SOCKET_PATH_HOME is not found in the environment. A NULL check is
> added so that -1 is returned immediately. Since pdump_get_socket_path() can
> now return -1, every caller checks the return value and logs an error
> message on failure.
> 
> Coverity issue 127344:  return value check Coverity issue 127347:  null
> pointer dereference
> 
> Fixes: 278f945402c5 ("pdump: add new library for packet capture")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: John McNamara 


[dpdk-dev] [PATCH v5 1/5] pdump: fix default socket path

2016-06-24 Thread Mcnamara, John


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, June 24, 2016 5:36 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma 
> Subject: [dpdk-dev] [PATCH v5 1/5] pdump: fix default socket path
> 
> SOCKET_PATH_HOME specifies the environment variable "HOME", so the macro
> should not contain "/pdump_sockets".
> Removed "/pdump_sockets" from SOCKET_PATH_HOME and SOCKET_PATH_VAR_RUN.
> The new changes create pdump sockets under /var/run/.dpdk/pdump_sockets
> for root users and under HOME/.dpdk/pdump_sockets for non-root users.
> Changes are done in pdump_get_socket_path() to accommodate the new socket
> path layout.
> 
> Fixes: 278f945402c5 ("pdump: add new library for packet capture")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: John McNamara 


[dpdk-dev] [PATCH 2/4] enic: set the max allowed MTU for the NIC

2016-06-24 Thread John Daley (johndale)
Hi Bruce,

> > * What was the MTU set to by default before this patch is applied? Was
> > it just set to 1518 or something else?
> > * What happens, if anything, if buffers bigger than the MTU size are sent
> down?
> This is obviously referring to buffers bigger than MTU on TX. There is also
> the question of what happens if buffer sizes smaller than MTU are provided
> on RX.

I think I answered all your questions in the revised commit messages of the v2
patchset (and then some) except this last one. Enic doesn't do any checking on
Rx that buffers are larger than the MTU, since it would affect performance.
However, if a packet is bigger than a buffer and Rx scatter is disabled, the
packet will be dropped and 'imissed' incremented.

Thanks,
Johnd
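
For reference, a minimal standalone sketch (not from the patch) of how an
application can watch the 'imissed' counter mentioned above, using the
generic ethdev statistics API:

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Print the count of packets the NIC dropped for lack of a usable Rx
 * buffer (e.g. packet larger than the buffer with Rx scatter disabled). */
static void
print_rx_misses(uint8_t port_id)
{
    struct rte_eth_stats stats;

    rte_eth_stats_get(port_id, &stats);
    printf("port %u: imissed=%" PRIu64 "\n",
           (unsigned)port_id, stats.imissed);
}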



[dpdk-dev] [PATCH v2] mempool: replace c memcpy code semantics with optimized rte_memcpy

2016-06-24 Thread Olivier Matz


On 06/17/2016 12:40 PM, Olivier Matz wrote:
> Hi Jerin,
> 
> On 06/03/2016 09:02 AM, Jerin Jacob wrote:
>> On Thu, Jun 02, 2016 at 11:16:16PM +0200, Olivier MATZ wrote:
>> Hi Olivier,
>>
>>> This is probably more a measure of the pure CPU cost of the mempool
>>> function, without considering the memory cache aspect. So, of course,
>>> a real use-case test should be done to confirm or not that it increases
>>> the performance. I'll manage to do a test and let you know the result.
>>
>> OK
>>
>> IMO, using rte_memcpy in put makes sense (this patch) as there is no
>> behavior change. However, if rte_memcpy in get, with its behavioral changes,
>> makes sense on some platform, then we can enable it on a conditional basis
>> (I am OK with that).
>>
>>>
>>> By the way, not all drivers are allocating or freeing the mbufs by
>>> bulk, so this modification would only affect these ones. What driver
>>> are you using for your test?
>>
>> I have tested with ThunderX nicvf pmd(uses the bulk mode).
>> Recently sent out driver in ml for review
> 
> Just to let you know that I have not forgotten this. I still need to
> find some time to do a performance test.


Quoting from the other thread [1] too to save this in patchwork:
[1] http://dpdk.org/ml/archives/dev/2016-June/042701.html


> On 06/24/2016 05:56 PM, Hunt, David wrote:
>> Hi Jerin,
>>
>> I just ran a couple of tests on this patch on the latest master head on
>> a couple of machines. An older quad socket E5-4650 and a quad socket
>> E5-2699 v3
>>
>> E5-4650:
>> I'm seeing a gain of 2% for un-cached tests and a gain of 9% on the
>> cached tests.
>>
>> E5-2699 v3:
>> I'm seeing a loss of 0.1% for un-cached tests and a gain of 11% on the
>> cached tests.
>>
>> This is purely the autotest comparison, I don't have traffic generator
>> results. But based on the above, I don't think there are any performance
>> issues with the patch.
>>
> 
> Thanks for doing the test on your side. I think it's probably enough
> to integrate Jerin's patch.
> 
> About using a rte_memcpy() in the mempool_get(), I don't think I'll have
> the time to do a more exhaustive test before the 16.07, so I'll come
> back with it later.
> 
> I'm sending an ack on the v2 thread.


Acked-by: Olivier Matz 


[dpdk-dev] [PATCH] mbuf: replace c memcpy code semantics with optimized rte_memcpy

2016-06-24 Thread Olivier Matz
Hi Dave,

On 06/24/2016 05:56 PM, Hunt, David wrote:
> Hi Jerin,
> 
> I just ran a couple of tests on this patch on the latest master head on
> a couple of machines. An older quad socket E5-4650 and a quad socket
> E5-2699 v3
> 
> E5-4650:
> I'm seeing a gain of 2% for un-cached tests and a gain of 9% on the
> cached tests.
> 
> E5-2699 v3:
> I'm seeing a loss of 0.1% for un-cached tests and a gain of 11% on the
> cached tests.
> 
> This is purely the autotest comparison, I don't have traffic generator
> results. But based on the above, I don't think there are any performance
> issues with the patch.
> 

Thanks for doing the test on your side. I think it's probably enough
to integrate Jerin's patch.

About using a rte_memcpy() in the mempool_get(), I don't think I'll have
the time to do a more exhaustive test before the 16.07, so I'll come
back with it later.

I'm sending an ack on the v2 thread.


[dpdk-dev] librte_meter compilation fails on IBM Power8

2016-06-24 Thread Chao Zhu
I can reproduce this problem with "export EXTRA_CFLAGS="-O0 -g"" on Power8,
but I'm not sure why it happens. The "-O3 -g" option works properly. I'll
investigate more.

-Original Message-
From: Dumitrescu, Cristian [mailto:cristian.dumitre...@intel.com] 
Sent: 2016-06-24 1:26
To: Nélio Laranjeiro ; Chao Zhu 
Cc: dev at dpdk.org
Subject: RE: librte_meter compilation fails on IBM Power8



> -Original Message-
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro at 6wind.com]
> Sent: Wednesday, June 22, 2016 1:31 PM
> To: Dumitrescu, Cristian ; Chao Zhu 
> 
> Cc: dev at dpdk.org
> Subject: librte_meter compilation fails on IBM Power8
> 
> Hi Cristian, Chao,
> 
> I have encountered a compilation failure on IBM Power8 when compiling 
> master branch with EXTRA_CFLAGS='-O0 -g':
> 
>   /root/nl/dpdk.org/build/lib/librte_meter.a(rte_meter.o): In function
> `rte_meter_get_tb_params':
>   /root/nl/dpdk.org/lib/librte_meter/rte_meter.c:57: undefined 
> reference to `ceil'
> 
> Seems related to commit 43f4364d.
> 
> I don't have the time to search more deeply, I hope it can help.
> 
> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND

I am not sure what the problem might be for IBM Power8.

ceil() is a function defined in the math library; we include the math.h header
file in rte_meter.c and we also link the library properly in the Makefile by
using LDLIBS += -lm, therefore I do not see any issue in the library code.

Thanks,
Cristian




[dpdk-dev] [PATCH] cryptodev: uninline parameter parsing

2016-06-24 Thread Thomas Monjalon
There is no need to have this parsing inlined in the header.
It brings kvargs dependency to every crypto drivers.
The functions are moved into rte_cryptodev.c.

Signed-off-by: Thomas Monjalon 
---
 lib/librte_cryptodev/rte_cryptodev.c | 91 ++
 lib/librte_cryptodev/rte_cryptodev.h | 95 ++--
 2 files changed, 95 insertions(+), 91 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index 960e2d5..20e5beb 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -102,6 +102,97 @@ struct rte_cryptodev_callback {
uint32_t active;/**< Callback is executing */
 };

+#define RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG   ("max_nb_queue_pairs")
+#define RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG ("max_nb_sessions")
+#define RTE_CRYPTODEV_VDEV_SOCKET_ID   ("socket_id")
+
+static const char *cryptodev_vdev_valid_params[] = {
+   RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG,
+   RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG,
+   RTE_CRYPTODEV_VDEV_SOCKET_ID
+};
+
+static uint8_t
+number_of_sockets(void)
+{
+   int sockets = 0;
+   int i;
+   const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+
+   for (i = 0; ((i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL)); i++) {
+   if (sockets < ms[i].socket_id)
+   sockets = ms[i].socket_id;
+   }
+
+   /* Number of sockets = maximum socket_id + 1 */
+   return ++sockets;
+}
+
+/** Parse integer from integer argument */
+static int
+parse_integer_arg(const char *key __rte_unused,
+   const char *value, void *extra_args)
+{
+   int *i = (int *) extra_args;
+
+   *i = atoi(value);
+   if (*i < 0) {
+   CDEV_LOG_ERR("Argument has to be positive.");
+   return -1;
+   }
+
+   return 0;
+}
+
+int
+rte_cryptodev_parse_vdev_init_params(struct rte_crypto_vdev_init_params 
*params,
+   const char *input_args)
+{
+   struct rte_kvargs *kvlist;
+   int ret;
+
+   if (params == NULL)
+   return -EINVAL;
+
+   if (input_args) {
+   kvlist = rte_kvargs_parse(input_args,
+   cryptodev_vdev_valid_params);
+   if (kvlist == NULL)
+   return -1;
+
+   ret = rte_kvargs_process(kvlist,
+   RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG,
+   &parse_integer_arg,
+   &params->max_nb_queue_pairs);
+   if (ret < 0)
+   goto free_kvlist;
+
+   ret = rte_kvargs_process(kvlist,
+   RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG,
+   &parse_integer_arg,
+   &params->max_nb_sessions);
+   if (ret < 0)
+   goto free_kvlist;
+
+   ret = rte_kvargs_process(kvlist, RTE_CRYPTODEV_VDEV_SOCKET_ID,
+   &parse_integer_arg,
+   &params->socket_id);
+   if (ret < 0)
+   goto free_kvlist;
+
+   if (params->socket_id >= number_of_sockets()) {
+   CDEV_LOG_ERR("Invalid socket id specified to create "
+   "the virtual crypto device on");
+   goto free_kvlist;
+   }
+   }
+
+   return 0;
+
+free_kvlist:
+   rte_kvargs_free(kvlist);
+   return ret;
+}

 const char *
 rte_cryptodev_get_feature_name(uint64_t flag)
diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index 27cf8ef..7768f0a 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -300,48 +300,6 @@ struct rte_crypto_vdev_init_params {
uint8_t socket_id;
 };

-#define RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG   ("max_nb_queue_pairs")
-#define RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG ("max_nb_sessions")
-#define RTE_CRYPTODEV_VDEV_SOCKET_ID   ("socket_id")
-
-static const char *cryptodev_vdev_valid_params[] = {
-   RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG,
-   RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG,
-   RTE_CRYPTODEV_VDEV_SOCKET_ID
-};
-
-static inline uint8_t
-number_of_sockets(void)
-{
-   int sockets = 0;
-   int i;
-   const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-
-   for (i = 0; ((i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL)); i++) {
-   if (sockets < ms[i].socket_id)
-   sockets = ms[i].socket_id;
-   }
-
-   /* Number of sockets = maximum socket_id + 1 */
-   return ++sockets;
-}
-
-/** Parse integer from integer argument */
-static inline int
-__rte_cryptodev_parse_integer_arg(const char *key 
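
A minimal usage sketch of the now-public parser, based only on the signature
and the fields referenced in the diff above (the default values below are
illustrative, not taken from the patch):

#include <rte_cryptodev.h>

/* Parse "key=value" device arguments into the init params structure;
 * fields keep their defaults when the corresponding key is absent. */
static int
parse_crypto_vdev_args(const char *args)
{
    struct rte_crypto_vdev_init_params init_params = {
        .max_nb_queue_pairs = 8,    /* illustrative default */
        .max_nb_sessions = 2048,    /* illustrative default */
        .socket_id = 0,
    };

    return rte_cryptodev_parse_vdev_init_params(&init_params, args);
}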

[dpdk-dev] [PATCH] pci:don't insert an unbound device to pci_device_list in pci_scan_one

2016-06-24 Thread Rugang Chen
If a device isn't bound to any uio driver (vfio-pci, igb_uio, uio_pci_generic)
and is expected to be owned by a kernel space driver, it is currently still
inserted into pci_device_list.

This may cause an application based on DPDK to fetch the device by accident,
after which the device is handled by DPDK.

To be safe, skip adding it to pci_device_list: if it's unbound, DPDK won't
want to use it.

Signed-off-by: Rugang Chen 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index f9c3efd..f63febc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -388,6 +388,12 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
} else
dev->kdrv = RTE_KDRV_NONE;

+   /* Ignore device that isn't bound with any uio driver, then application 
won't
+* fetch it from pci_device_list by accident and then dpdk handles it. 
Kernel
+* space driver maybe wants to own it.
+*/
+   if (dev->kdrv == RTE_KDRV_NONE)
+   return 0;
/* device is valid, add in list (sorted) */
if (TAILQ_EMPTY(&pci_device_list)) {
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-- 
2.1.4



[dpdk-dev] [PATCH] vmxnet3: remove 0x prefix for %p format

2016-06-24 Thread Bruce Richardson
On Thu, Jun 23, 2016 at 01:45:43PM -0700, Yong Wang wrote:
> > On Jun 23, 2016, at 3:52 AM, Ferruh Yigit  wrote:
> > 
> > To prevent double 0x in logs

The commit message would be better as the title and vice versa, I think.

> > 
> > Signed-off-by: Ferruh Yigit 
> > ?
> 
> Acked-by: Yong Wang 
> 

Applied to dpdk-next-net/rel_16_07 with updated title.

/Bruce


[dpdk-dev] [PATCH] nfp: modifying guide about using uio modules

2016-06-24 Thread Bruce Richardson
On Fri, Jun 24, 2016 at 02:37:41PM +0100, Mcnamara, John wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alejandro Lucero
> > Sent: Tuesday, April 26, 2016 12:37 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH] nfp: modifying guide about using uio modules
> > 
> >  - Removing dependency on nfp_uio kernel module. The igb_uio
> >kernel modules can be used instead.
> > 
> > Fixes: 80bc1752f16e ("nfp: add guide")
> > 
> > Signed-off-by: Alejandro Lucero 
> 
> Acked-by: John McNamara 
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH v5 5/5] app/pdump: fix type casting of ring size

2016-06-24 Thread Reshma Pattan
ring_size value is wrongly cast to uint16_t.
It should be cast to uint32_t, as the maximum
ring size is 28 bits long. The wrong cast wraps
around ring size values bigger than 65535.

Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")

Signed-off-by: Reshma Pattan 
---
 app/pdump/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index fe4d38a..2087c15 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -362,7 +362,7 @@ parse_pdump(const char *optarg)
&parse_uint_value, &v);
if (ret < 0)
goto free_kvlist;
-   pt->ring_size = (uint16_t) v.val;
+   pt->ring_size = (uint32_t) v.val;
} else
pt->ring_size = RING_SIZE;

-- 
2.5.0
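
For illustration, a standalone sketch (not part of the patch) of the
wraparound the commit message describes:

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    uint32_t requested = 262144;               /* 2^18, a valid 28-bit ring size */
    uint16_t old_cast = (uint16_t)requested;   /* old cast: truncates to 0 */
    uint32_t new_cast = (uint32_t)requested;   /* fixed cast: value preserved */

    printf("requested=%u old=%u new=%u\n", requested, old_cast, new_cast);
    return 0;
}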



[dpdk-dev] [PATCH v5 4/5] app/pdump: fix string overflow

2016-06-24 Thread Reshma Pattan
Replaced strncpy with snprintf to copy
the strings safely.

Coverity issue 127351: string overflow

Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")

Signed-off-by: Reshma Pattan 
---
 app/pdump/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index f8923b9..fe4d38a 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -217,12 +217,12 @@ parse_rxtxdev(const char *key, const char *value, void 
*extra_args)
struct pdump_tuples *pt = extra_args;

if (!strcmp(key, PDUMP_RX_DEV_ARG)) {
-   strncpy(pt->rx_dev, value, strlen(value));
+   snprintf(pt->rx_dev, sizeof(pt->rx_dev), "%s", value);
/* identify the tx stream type for pcap vdev */
if (if_nametoindex(pt->rx_dev))
pt->rx_vdev_stream_type = IFACE;
} else if (!strcmp(key, PDUMP_TX_DEV_ARG)) {
-   strncpy(pt->tx_dev, value, strlen(value));
+   snprintf(pt->tx_dev, sizeof(pt->tx_dev), "%s", value);
/* identify the tx stream type for pcap vdev */
if (if_nametoindex(pt->tx_dev))
pt->tx_vdev_stream_type = IFACE;
-- 
2.5.0
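
For context, a standalone sketch (not part of the patch) of why the replaced
call was unsafe: strncpy(dst, src, strlen(src)) bounds the copy by the source
length and does not NUL-terminate, while snprintf bounds it by the destination
size and always terminates. The argument string below is hypothetical.

#include <stdio.h>
#include <string.h>

int
main(void)
{
    char dev[16];
    const char *value = "net_pcap_rx_0,iface=eth0";    /* hypothetical value */

    /* Unsafe pattern removed by the patch; with a long 'value' this
     * would write past 'dev' and leave it unterminated:
     *     strncpy(dev, value, strlen(value));
     */

    /* Safe replacement: bounded by sizeof(dev), always NUL-terminated,
     * silently truncated if 'value' is too long. */
    snprintf(dev, sizeof(dev), "%s", value);
    printf("%s\n", dev);
    return 0;
}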



[dpdk-dev] [PATCH v5 3/5] pdump: fix string overflow

2016-06-24 Thread Reshma Pattan
Replaced strncpy with snprintf to copy
the strings safely.

Coverity issue 127350: string overflow

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan 
---
 lib/librte_pdump/rte_pdump.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index e3b03a6..ee566cb 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -810,13 +810,15 @@ pdump_prepare_client_request(char *device, uint16_t queue,
req.flags = flags;
req.op =  operation;
if ((operation & ENABLE) != 0) {
-   strncpy(req.data.en_v1.device, device, strlen(device));
+   snprintf(req.data.en_v1.device, sizeof(req.data.en_v1.device),
+   "%s", device);
req.data.en_v1.queue = queue;
req.data.en_v1.ring = ring;
req.data.en_v1.mp = mp;
req.data.en_v1.filter = filter;
} else {
-   strncpy(req.data.dis_v1.device, device, strlen(device));
+   snprintf(req.data.dis_v1.device, sizeof(req.data.dis_v1.device),
+   "%s", device);
req.data.dis_v1.queue = queue;
req.data.dis_v1.ring = NULL;
req.data.dis_v1.mp = NULL;
-- 
2.5.0



[dpdk-dev] [PATCH v5 2/5] pdump: check getenv return value

2016-06-24 Thread Reshma Pattan
Inside pdump_get_socket_path(), getenv() can return
a NULL pointer if the match for SOCKET_PATH_HOME is
not found in the environment. A NULL check is added
so that -1 is returned immediately. Since
pdump_get_socket_path() can now return -1, every
caller checks the return value and logs an error
message on failure.

Coverity issue 127344:  return value check
Coverity issue 127347:  null pointer dereference

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan 
---
 lib/librte_pdump/rte_pdump.c | 43 ++-
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index 70efd96..e3b03a6 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -443,7 +443,7 @@ set_pdump_rxtx_cbs(struct pdump_request *p)
 }

 /* get socket path (/var/run if root, $HOME otherwise) */
-static void
+static int
 pdump_get_socket_path(char *buffer, int bufsz, enum rte_pdump_socktype type)
 {
char dpdk_dir[PATH_MAX] = {0};
@@ -457,6 +457,13 @@ pdump_get_socket_path(char *buffer, int bufsz, enum 
rte_pdump_socktype type)
else {
if (getuid() != 0) {
dir_home = getenv(SOCKET_PATH_HOME);
+   if (!dir_home) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get environment variable"
+   " value for %s, %s:%d\n",
+   SOCKET_PATH_HOME, __func__, __LINE__);
+   return -1;
+   }
snprintf(dpdk_dir, sizeof(dpdk_dir), "%s%s",
dir_home, DPDK_DIR);
} else
@@ -474,6 +481,8 @@ pdump_get_socket_path(char *buffer, int bufsz, enum 
rte_pdump_socktype type)
else
snprintf(buffer, bufsz, CLIENT_SOCKET, dir, getpid(),
rte_sys_gettid());
+
+   return 0;
 }

 static int
@@ -483,8 +492,14 @@ pdump_create_server_socket(void)
struct sockaddr_un addr;
socklen_t addr_len;

-   pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+   ret = pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
RTE_PDUMP_SOCKET_SERVER);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get server socket path: %s:%d\n",
+   __func__, __LINE__);
+   return -1;
+   }
addr.sun_family = AF_UNIX;

/* remove if file already exists */
@@ -615,8 +630,14 @@ rte_pdump_uninit(void)

struct sockaddr_un addr;

-   pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+   ret = pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
RTE_PDUMP_SOCKET_SERVER);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get server socket path: %s:%d\n",
+   __func__, __LINE__);
+   return -1;
+   }
ret = unlink(addr.sun_path);
if (ret != 0) {
RTE_LOG(ERR, PDUMP,
@@ -650,8 +671,14 @@ pdump_create_client_socket(struct pdump_request *p)
return ret;
}

-   pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+   ret = pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
RTE_PDUMP_SOCKET_CLIENT);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get client socket path: %s:%d\n",
+   __func__, __LINE__);
+   return -1;
+   }
addr.sun_family = AF_UNIX;
addr_len = sizeof(struct sockaddr_un);

@@ -667,9 +694,15 @@ pdump_create_client_socket(struct pdump_request *p)

serv_len = sizeof(struct sockaddr_un);
memset(&serv_addr, 0, sizeof(serv_addr));
-   pdump_get_socket_path(serv_addr.sun_path,
+   ret = pdump_get_socket_path(serv_addr.sun_path,
sizeof(serv_addr.sun_path),
RTE_PDUMP_SOCKET_SERVER);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get server socket path: %s:%d\n",
+   __func__, __LINE__);
+   break;
+   }
serv_addr.sun_family = AF_UNIX;

n =  sendto(socket_fd, p, sizeof(struct pdump_request), 0,
-- 
2.5.0



[dpdk-dev] [PATCH v5 1/5] pdump: fix default socket path

2016-06-24 Thread Reshma Pattan
SOCKET_PATH_HOME specifies the environment variable "HOME",
so the macro should not contain "/pdump_sockets".
Removed "/pdump_sockets" from SOCKET_PATH_HOME and
SOCKET_PATH_VAR_RUN. The new changes create pdump sockets under
/var/run/.dpdk/pdump_sockets for root users and
under HOME/.dpdk/pdump_sockets for non-root users.
Changes are done in pdump_get_socket_path() to accommodate
the new socket path layout.

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan 
---
 lib/librte_pdump/rte_pdump.c | 29 -
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index c921f51..70efd96 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -50,8 +50,10 @@

 #include "rte_pdump.h"

-#define SOCKET_PATH_VAR_RUN "/var/run/pdump_sockets"
-#define SOCKET_PATH_HOME "HOME/pdump_sockets"
+#define SOCKET_PATH_VAR_RUN "/var/run"
+#define SOCKET_PATH_HOME "HOME"
+#define DPDK_DIR "/.dpdk"
+#define SOCKET_DIR   "/pdump_sockets"
 #define SERVER_SOCKET "%s/pdump_server_socket"
 #define CLIENT_SOCKET "%s/pdump_client_socket_%d_%u"
 #define DEVICE_ID_SIZE 64
@@ -444,17 +446,26 @@ set_pdump_rxtx_cbs(struct pdump_request *p)
 static void
 pdump_get_socket_path(char *buffer, int bufsz, enum rte_pdump_socktype type)
 {
-   const char *dir = NULL;
+   char dpdk_dir[PATH_MAX] = {0};
+   char dir[PATH_MAX] = {0};
+   char *dir_home = NULL;

if (type == RTE_PDUMP_SOCKET_SERVER && server_socket_dir[0] != 0)
-   dir = server_socket_dir;
+   snprintf(dir, sizeof(dir), "%s", server_socket_dir);
else if (type == RTE_PDUMP_SOCKET_CLIENT && client_socket_dir[0] != 0)
-   dir = client_socket_dir;
+   snprintf(dir, sizeof(dir), "%s", client_socket_dir);
else {
-   if (getuid() != 0)
-   dir = getenv(SOCKET_PATH_HOME);
-   else
-   dir = SOCKET_PATH_VAR_RUN;
+   if (getuid() != 0) {
+   dir_home = getenv(SOCKET_PATH_HOME);
+   snprintf(dpdk_dir, sizeof(dpdk_dir), "%s%s",
+   dir_home, DPDK_DIR);
+   } else
+   snprintf(dpdk_dir, sizeof(dpdk_dir), "%s%s",
+   SOCKET_PATH_VAR_RUN, DPDK_DIR);
+
+   mkdir(dpdk_dir, 700);
+   snprintf(dir, sizeof(dir), "%s%s",
+   dpdk_dir, SOCKET_DIR);
}

mkdir(dir, 700);
-- 
2.5.0



[dpdk-dev] [PATCH v5 0/5] fix issues in packet capture framework

2016-06-24 Thread Reshma Pattan
This patchset includes the following fixes:
1) Fix the default socket path in the pdump library.
2) Fix Coverity issues in the pdump library.
3) Fix Coverity issues in the pdump tool.
4) Fix the wrong typecast of the ring size in the pdump tool.

v5:
Changes are done to the default socket paths: the default socket path
is now /var/run/.dpdk/pdump_sockets for root users and
HOME/.dpdk/pdump_sockets for non-root users.

v4:
added new patch for fixing wrong typecast of ring size
in pdump tool.

v3:
added new patch for fixing default socket paths "HOME" and "/var/run".
reworked coverity fixes on top of the above change.

v2:
fixed code review comment to use snprintf instead of strncpy.


Reshma Pattan (5):
  pdump: fix default socket path
  pdump: check getenv return value
  pdump: fix string overflow
  app/pdump: fix string overflow
  app/pdump: fix type casting of ring size

 app/pdump/main.c |  6 ++--
 lib/librte_pdump/rte_pdump.c | 78 +++-
 2 files changed, 65 insertions(+), 19 deletions(-)

-- 
2.5.0



[dpdk-dev] [PATCH v4] ixgbe: configure VLAN TPID

2016-06-24 Thread Bruce Richardson
On Thu, Jun 23, 2016 at 11:11:58PM +0800, Beilei Xing wrote:
> Previously, a single VLAN header was treated as the inner VLAN,
> but generally a single VLAN header should be treated as the outer
> VLAN header.
> The patch fixes the ether type of a single VLAN type, and
> enables configuring the inner and outer TPID for double VLAN.
> 
> Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type")
> 
> Signed-off-by: Beilei Xing 

Applied to dpdk-next-net/rel_16_07

/Bruce
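
For context, a rough sketch of the ethdev call this fix relates to; the enum
values and TPIDs below are illustrative assumptions, not taken from the patch:

#include <rte_ethdev.h>

/* Configure the TPIDs used for double VLAN (QinQ) on a port:
 * 0x88A8 for the outer tag and 0x8100 for the inner tag. */
static void
set_vlan_tpids(uint8_t port_id)
{
    rte_eth_dev_set_vlan_ether_type(port_id, ETH_VLAN_TYPE_OUTER, 0x88A8);
    rte_eth_dev_set_vlan_ether_type(port_id, ETH_VLAN_TYPE_INNER, 0x8100);
}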



[dpdk-dev] [PATCH v2] mk: fix parallel build of test resources

2016-06-24 Thread Thomas Monjalon
[...]
> 
> OK alright :)
> 
> Acked-by: Olivier Matz 

Applied


[dpdk-dev] [PATCH] mbuf: replace c memcpy code semantics with optimized rte_memcpy

2016-06-24 Thread Hunt, David
Hi Jerin,

I just ran a couple of tests on this patch on the latest master head on 
a couple of machines. An older quad socket E5-4650 and a quad socket 
E5-2699 v3

E5-4650:
I'm seeing a gain of 2% for un-cached tests and a gain of 9% on the 
cached tests.

E5-2699 v3:
I'm seeing a loss of 0.1% for un-cached tests and a gain of 11% on the 
cached tests.

This is purely the autotest comparison, I don't have traffic generator 
results. But based on the above, I don't think there are any performance 
issues with the patch.

Regards,
Dave.




On 24/5/2016 4:17 PM, Jerin Jacob wrote:
> On Tue, May 24, 2016 at 04:59:47PM +0200, Olivier Matz wrote:
>> Hi Jerin,
>>
>>
>> On 05/24/2016 04:50 PM, Jerin Jacob wrote:
>>> Signed-off-by: Jerin Jacob 
>>> ---
>>>   lib/librte_mempool/rte_mempool.h | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/librte_mempool/rte_mempool.h 
>>> b/lib/librte_mempool/rte_mempool.h
>>> index ed2c110..ebe399a 100644
>>> --- a/lib/librte_mempool/rte_mempool.h
>>> +++ b/lib/librte_mempool/rte_mempool.h
>>> @@ -74,6 +74,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>   
>>>   #ifdef __cplusplus
>>>   extern "C" {
>>> @@ -917,7 +918,6 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const 
>>> *obj_table,
>>> unsigned n, __rte_unused int is_mp)
>>>   {
>>> struct rte_mempool_cache *cache;
>>> -   uint32_t index;
>>> void **cache_objs;
>>> unsigned lcore_id = rte_lcore_id();
>>> uint32_t cache_size = mp->cache_size;
>>> @@ -946,8 +946,7 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const 
>>> *obj_table,
>>>  */
>>>   
>>> /* Add elements back into the cache */
>>> -   for (index = 0; index < n; ++index, obj_table++)
>>> -   cache_objs[index] = *obj_table;
>>> +   rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
>>>   
>>> cache->len += n;
>>>   
>>>
>> The commit title should be "mempool" instead of "mbuf".
> I will fix it.
>
>> Are you seeing some performance improvement by using rte_memcpy()?
> Yes, in some cases. In the default case, it was replaced with memcpy by the
> compiler itself (gcc 5.3). But when I tried the external mempool manager
> patch, performance dropped by almost 800Kpps. Debugging further, it turns out
> that an unrelated change in the external mempool manager was knocking out the
> memcpy. An explicit rte_memcpy brought back 500Kpps. The remaining 300Kpps
> drop is still unknown (in my test setup, packets are in the local cache, so
> it must be something to do with the __mempool_put_bulk text alignment change
> or similar).
>
> Anyone else observed performance drop with external poolmanager?
>
> Jerin
>
>> Regards
>> Olivier
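
The change being benchmarked above is small: the per-element copy of object
pointers into the per-lcore cache becomes one bulk copy. A standalone sketch
of the two forms (plain memcpy is used here so it builds outside DPDK; the
patch itself uses rte_memcpy):

#include <string.h>

/* Before: copy each object pointer into the cache in a loop. */
static void
cache_put_loop(void **cache_objs, void * const *obj_table, unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; i++)
        cache_objs[i] = obj_table[i];
}

/* After: one bulk copy of the whole pointer array. */
static void
cache_put_bulk(void **cache_objs, void * const *obj_table, unsigned int n)
{
    memcpy(cache_objs, obj_table, sizeof(void *) * n);
}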



[dpdk-dev] [PATCH v4 1/5] pdump: fix default socket path

2016-06-24 Thread Thomas Monjalon
2016-06-24 14:54, Reshma Pattan:
> +#define SOCKET_DIR   "/pdump_sockets"

I think the default socket directory should contain dpdk as prefix.
Like dpdk-pdump-sockets (I think dash is preferred for filenames).
I wonder whether it should be a hidden directory:
~/.dpdk-pdump-sockets
And after all, why not simply
~/.dpdk/
It would allow other DPDK applications to put some files.


[dpdk-dev] [PATCH v4 1/5] pdump: fix default socket path

2016-06-24 Thread Pattan, Reshma


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, June 24, 2016 3:55 PM
> To: Pattan, Reshma 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 1/5] pdump: fix default socket path
> 
> 2016-06-24 14:54, Reshma Pattan:
> > +#define SOCKET_DIR   "/pdump_sockets"
> 
> I think the default socket directory should contain dpdk as prefix.
> Like dpdk-pdump-sockets (I think dash is preferred for filenames).
> I wonder whether it should be a hidden directory:
>   ~/.dpdk-pdump-sockets
> And after all, why not simply
>   ~/.dpdk/

Patch v5 is sent with the change to use /.dpdk/pdump_sockets.

> It would allow other DPDK applications to put some files.


[dpdk-dev] [PATCH v12 2/2] i40e: add floating VEB support in i40e

2016-06-24 Thread Zhe Tao
This patch adds support for floating VEB in i40e.
All the VF VSIs can decide whether to connect to the legacy VEB/VEPA or
to the floating VEB. When connecting to the floating VEB, a new floating
VEB is created. All the VFs must connect to either the floating VEB or
the legacy VEB; they cannot connect to both of them. The PF and VMDQ/FD
VSIs still connect to the old legacy VEB/VEPA.

The VEB/VEPA concepts are not specific to FVL; they are defined in
the 802.1Qbg spec.

The floating VEB feature is only available with firmware versions
newer than 5.0 (FW major version number > 5).

To enable this feature, the user should pass a devargs parameter to the
EAL like "-w 84:00.0,enable_floating_veb=1", and the application uses
this method to tell PMD to connect all the VFs created by
this PF device to the floating VEB.

You can also specify which VFs connect to this floating VEB using
"floating_veb_list".
For example, "-w 84:00.0,enable_floating_veb=1,floating_veb_list=1/3-4" means
VF1, VF3 and VF4 connect to the floating VEB, while the other VFs connect to
the legacy VEB. The "/" is used as the delimiter of the floating VEB list.

Signed-off-by: Zhe Tao 
---
 doc/guides/nics/i40e.rst   |  27 
 doc/guides/rel_notes/release_16_07.rst |   4 ++
 drivers/net/i40e/i40e_ethdev.c | 112 ++---
 drivers/net/i40e/i40e_ethdev.h |   2 +
 drivers/net/i40e/i40e_pf.c |  12 +++-
 5 files changed, 134 insertions(+), 23 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 934eb02..1ce60ab 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -366,3 +366,30 @@ Delete all flow director rules on a port:

testpmd> flush_flow_director 0

+Floating VEB
+~~~~~~~~~~~~
+FVL can support floating VEB feature.
+
+The floating VEB means the VEB doesn't has some uplink connection to the 
outside
+world, so all the switching belong to this VEB cannot go to outside, the
+security can be assured. Because the floating VEB doesn't need to connect to
+the MAC port, so when the physical port link is down, all the switching within
+this VEB still works fine, but for legacy VEB when the physical port is down
+the VEB cannot forward packets anymore.
+
+With this feature, VFs can communicate with each other, but cannot access
+outside network. When PF is down, and VFs can still forward pkts between each
+other.
+
+To enable this feature, the user should pass a devargs parameter to the EAL
+like "-w 84:00.0,enable_floating_veb=1", and then the PMD will use the floating
+VEB feature for all the VFs created by this PF device.
+Also you can specify which VF need to connect to this floating veb using
+"floating_veb_list".
+Like "-w 84:00.0,enable_floating_veb=1,floating_veb_list=1/3-4", means VF1, 
VF3,
+VF4 connect to the floating VEB, other VFs connect to the legacy VEB.The "/"
+is used for delimiter of the floating VEB list.
+
+The current implementation only support one floating VEB and one legacy
+VEB. VF can connect to floating VEB or legacy VEB according to the
+configuration.
diff --git a/doc/guides/rel_notes/release_16_07.rst 
b/doc/guides/rel_notes/release_16_07.rst
index 30e78d4..1752c40 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -47,6 +47,10 @@ New Features
   * Dropped specific Xen Dom0 code.
   * Dropped specific anonymous mempool code in testpmd.

+* **Added floating VEB support for i40e PF driver.**
+
+  More details please see floating VEB part in the document
+  doc/guides/nics/i40e.rst.

 Resolved Issues
 ---
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index b304fc3..4673619 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -3860,21 +3860,27 @@ i40e_veb_release(struct i40e_veb *veb)
struct i40e_vsi *vsi;
struct i40e_hw *hw;

-   if (veb == NULL || veb->associate_vsi == NULL)
+   if (veb == NULL)
return -EINVAL;

if (!TAILQ_EMPTY(&veb->head)) {
PMD_DRV_LOG(ERR, "VEB still has VSI attached, can't remove");
return -EACCES;
}
+   /* associate_vsi field is NULL for floating VEB */
+   if (veb->associate_vsi != NULL) {
+   vsi = veb->associate_vsi;
+   hw = I40E_VSI_TO_HW(vsi);

-   vsi = veb->associate_vsi;
-   hw = I40E_VSI_TO_HW(vsi);
+   vsi->uplink_seid = veb->uplink_seid;
+   vsi->veb = NULL;
+   } else {
+   veb->associate_pf->main_vsi->floating_veb = NULL;
+   hw = I40E_VSI_TO_HW(veb->associate_pf->main_vsi);
+   }

-   vsi->uplink_seid = veb->uplink_seid;
i40e_aq_delete_element(hw, veb->seid, NULL);
rte_free(veb);
-   vsi->veb = NULL;
return I40E_SUCCESS;
 }

@@ -3886,9 +3892,9 @@ i40e_veb_setup(struct i40e_pf *pf, struct i40e_vsi *vsi)
int ret;
struct i40e_hw *hw;

-   if (NULL 

[dpdk-dev] [PATCH v12 1/2] i40e: support floating VEB config

2016-06-24 Thread Zhe Tao
Add new floating-VEB-related argument options to the devargs.
Using this parameter, applications can decide whether to use the legacy
VEB/VEPA or the floating VEB.
To enable this feature, the user should pass a devargs parameter to the
EAL like "-w 84:00.0,enable_floating_veb=1", and the application will
tell the PMD whether to use the floating VEB feature or not.
Once the floating VEB feature is enabled, all the VFs created by
this PF device are connected to the floating VEB.

The user can also specify which VFs connect to this floating VEB using
"floating_veb_list".
For example, "-w 84:00.0,enable_floating_veb=1,floating_veb_list=1/3-4" means
VF1, VF3 and VF4 connect to the floating VEB, while the other VFs connect to
the legacy VEB. The "/" is used as the delimiter of the floating VEB list.

The VEB/VEPA concepts are not specific to FVL; they are defined in
the 802.1Qbg spec.

A floating VEB, however, has two major differences:
1. It doesn't have an uplink connection, which means
the traffic cannot go to the outside world.
2. It doesn't need to connect to the physical port, which means
that when the physical link is down the floating VEB still works
fine.

Signed-off-by: Zhe Tao 
---
 drivers/net/i40e/i40e_ethdev.c | 142 +
 drivers/net/i40e/i40e_ethdev.h |  10 +++
 2 files changed, 152 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 24777d5..b304fc3 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -63,6 +63,9 @@
 #include "i40e_pf.h"
 #include "i40e_regs.h"

+#define ETH_I40E_FLOATING_VEB_ARG  "enable_floating_veb"
+#define ETH_I40E_FLOATING_VEB_LIST_ARG "floating_veb_list"
+
 #define I40E_CLEAR_PXE_WAIT_MS 200

 /* Maximun number of capability elements */
@@ -750,6 +753,136 @@ i40e_add_tx_flow_control_drop_filter(struct i40e_pf *pf)
  " frames from VSIs.");
 }

+static int vf_floating_veb_handler(__rte_unused const char *key,
+  const char *floating_veb_value,
+  void *opaque)
+{
+   int idx = 0;
+   unsigned count = 0;
+   char *end = NULL;
+   int min, max;
+   bool *vf_floating_veb = opaque;
+
+   while (isblank(*floating_veb_value))
+   floating_veb_value++;
+
+   /* Reset floating VEB configuration for VFs */
+   for (idx = 0; idx < I40E_MAX_VF; idx++)
+   vf_floating_veb[idx] = false;
+
+   min = I40E_MAX_VF;
+   do {
+   while (isblank(*floating_veb_value))
+   floating_veb_value++;
+   if (*floating_veb_value == '\0')
+   return -1;
+   errno = 0;
+   idx = strtoul(floating_veb_value, &end, 10);
+   if (errno || end == NULL)
+   return -1;
+   while (isblank(*end))
+   end++;
+   if (*end == '-') {
+   min = idx;
+   } else if ((*end == '/') || (*end == '\0')) {
+   max = idx;
+   if (min == I40E_MAX_VF)
+   min = idx;
+   if (max >= I40E_MAX_VF)
+   max = I40E_MAX_VF - 1;
+   for (idx = min; idx <= max; idx++) {
+   vf_floating_veb[idx] = true;
+   count++;
+   }
+   min = I40E_MAX_VF;
+   } else {
+   return -1;
+   }
+   floating_veb_value = end + 1;
+   } while (*end != '\0');
+
+   if (count == 0)
+   return -1;
+
+   return 0;
+}
+
+static void config_vf_floating_veb(struct rte_devargs *devargs,
+  uint16_t floating,
+  bool *vf_floating_veb)
+{
+   struct rte_kvargs *kvlist;
+   int i;
+   const char *floating_veb_list = ETH_I40E_FLOATING_VEB_LIST_ARG;
+
+   if (floating == false)
+   return;
+   for (i = 0; i < I40E_MAX_VF; i++)
+   vf_floating_veb[i] = true;
+
+   if (devargs == NULL)
+   return;
+
+   kvlist = rte_kvargs_parse(devargs->args, NULL);
+   if (kvlist == NULL)
+   return;
+
+   if (!rte_kvargs_count(kvlist, floating_veb_list)) {
+   rte_kvargs_free(kvlist);
+   return;
+   }
+
+   if (rte_kvargs_process(kvlist, floating_veb_list,
+  vf_floating_veb_handler,
+  vf_floating_veb) < 0) {
+   rte_kvargs_free(kvlist);
+   return;
+   }
+   rte_kvargs_free(kvlist);
+
+   return;
+}
+
+static int i40e_check_floating_handler(__rte_unused const char *key,
+  const char *value,
+  __rte_unused 

[dpdk-dev] [PATCH v12 0/2] i40e: add floating VEB support for i40e

2016-06-24 Thread Zhe Tao
This patch set adds support for floating VEB in i40e.
All the VF VSIs can decide whether to connect to the legacy VEB/VEPA or
to the floating VEB. When connecting to the floating VEB, a new floating
VEB is created. All the VFs must connect to either the floating VEB or
the legacy VEB; they cannot connect to both of them. The PF and VMDQ/FD
VSIs connect to the old legacy VEB/VEPA.

The VEB/VEPA concepts are not specific to FVL; they are defined in the
802.1Qbg spec.

The floating VEB only takes effect with firmware versions newer
than 5.0.

Zhe Tao (2):
  i40e: support floating VEB config
  i40e: add floating VEB support in i40e

 doc/guides/nics/i40e.rst   |  27 
 doc/guides/rel_notes/release_16_07.rst |   4 +
 drivers/net/i40e/i40e_ethdev.c | 254 ++---
 drivers/net/i40e/i40e_ethdev.h |  12 ++
 drivers/net/i40e/i40e_pf.c |  12 +-
 5 files changed, 286 insertions(+), 23 deletions(-)

V2: Added the release notes and changed commit log. 
V3: Changed the VSI release operation. 
V4: Added the FW version check otherwise it will cause the segment fault.
V5: Edited the code for new share code APIs
V6: Changed the floating VEB configuration method 
V7: Added global reset for i40e 
V8: removed global reset and added floating VEB extension support 
V9: Added floating VEB related explanation into commit log 
V10: Changed third patch commit log 
V11: Fixed the issues reported by check-git-log.sh 
V12: Changed the floating VEB VF bitmask to VF list 

-- 
2.1.4



[dpdk-dev] [PATCH v2] mk: fix parallel build of test resources

2016-06-24 Thread Olivier Matz


On 06/24/2016 04:19 PM, Thomas Monjalon wrote:
> 2016-06-24 16:06, Olivier Matz:
>> Hi Thomas,
>>
>> On 06/24/2016 01:22 PM, Thomas Monjalon wrote:
>>> --- a/app/test/Makefile
>>> +++ b/app/test/Makefile
>>> @@ -43,14 +43,14 @@ define linked_resource
>>>  SRCS-y += $(1).res.o
>>>  $(1).res.o: $(2)
>>> @  echo '  MKRES $$@'
>>> -   $Q ln -fs $$< resource.tmp
>>> +   $Q [ "$$(>
>> Maybe the symbolic link file in the build directory could be prefixed
>> with "lnk_"... (see following below)
> 
> yes...
>  
>>> $Q $(OBJCOPY) -I binary -B $(RTE_OBJCOPY_ARCH) -O $(RTE_OBJCOPY_TARGET) 
>>> \
>>> --rename-section \
>>> .data=.rodata,alloc,load,data,contents,readonly  \
>>> -   --redefine-sym _binary_resource_tmp_start=beg_$(1)   \
>>> -   --redefine-sym _binary_resource_tmp_end=end_$(1) \
>>> -   --redefine-sym _binary_resource_tmp_size=siz_$(1)\
>>> -   resource.tmp $$@ && rm -f resource.tmp
>>> +   --redefine-sym _binary_$$(subst .,_,$$(>> +   --redefine-sym _binary_$$(subst .,_,$$(>> +   --redefine-sym _binary_$$(subst .,_,$$(>> +   $$(>>  endef
>>>  
>>>  ifeq ($(CONFIG_RTE_APP_TEST_RESOURCE_TAR),y)
>>> @@ -76,7 +76,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) := commands.c
>>>  SRCS-y += test.c
>>>  SRCS-y += resource.c
>>>  SRCS-y += test_resource.c
>>> -$(eval $(call linked_resource,test_resource_c,resource.c))
>>> +test_resource.res: test_resource.c
>>> +   @ cp $< $@
>>> +$(eval $(call linked_resource,test_resource_c,test_resource.res))
>>>  $(eval $(call linked_tar_resource,test_resource_tar,test_resource.c))
>>>  SRCS-$(CONFIG_RTE_APP_TEST_RESOURCE_TAR) += test_pci.c
>>>  $(eval $(call linked_tar_resource,test_pci_sysfs,test_pci_sysfs))
>>
>> ...this would avoid to rename the resource file and make the patch
>> easier to understand.
> 
> ... but it would be harder to understand how are named the resources in
> the build directory ;)
> Ideally we should not use a source file (referenced in SRCS-y) as a
> test resource.
> 

OK alright :)

Acked-by: Olivier Matz 


[dpdk-dev] [PATCH v2] mk: fix parallel build of test resources

2016-06-24 Thread Thomas Monjalon
2016-06-24 16:06, Olivier Matz:
> Hi Thomas,
> 
> On 06/24/2016 01:22 PM, Thomas Monjalon wrote:
> > --- a/app/test/Makefile
> > +++ b/app/test/Makefile
> > @@ -43,14 +43,14 @@ define linked_resource
> >  SRCS-y += $(1).res.o
> >  $(1).res.o: $(2)
> > @  echo '  MKRES $$@'
> > -   $Q ln -fs $$< resource.tmp
> > +   $Q [ "$$( 
> Maybe the symbolic link file in the build directory could be prefixed
> with "lnk_"... (see following below)

yes...

> > $Q $(OBJCOPY) -I binary -B $(RTE_OBJCOPY_ARCH) -O $(RTE_OBJCOPY_TARGET) 
> > \
> > --rename-section \
> > .data=.rodata,alloc,load,data,contents,readonly  \
> > -   --redefine-sym _binary_resource_tmp_start=beg_$(1)   \
> > -   --redefine-sym _binary_resource_tmp_end=end_$(1) \
> > -   --redefine-sym _binary_resource_tmp_size=siz_$(1)\
> > -   resource.tmp $$@ && rm -f resource.tmp
> > +   --redefine-sym _binary_$$(subst .,_,$$( > +   --redefine-sym _binary_$$(subst .,_,$$( > +   --redefine-sym _binary_$$(subst .,_,$$( > +   $$( >  endef
> >  
> >  ifeq ($(CONFIG_RTE_APP_TEST_RESOURCE_TAR),y)
> > @@ -76,7 +76,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) := commands.c
> >  SRCS-y += test.c
> >  SRCS-y += resource.c
> >  SRCS-y += test_resource.c
> > -$(eval $(call linked_resource,test_resource_c,resource.c))
> > +test_resource.res: test_resource.c
> > +   @ cp $< $@
> > +$(eval $(call linked_resource,test_resource_c,test_resource.res))
> >  $(eval $(call linked_tar_resource,test_resource_tar,test_resource.c))
> >  SRCS-$(CONFIG_RTE_APP_TEST_RESOURCE_TAR) += test_pci.c
> >  $(eval $(call linked_tar_resource,test_pci_sysfs,test_pci_sysfs))
> 
> ...this would avoid to rename the resource file and make the patch
> easier to understand.

... but it would be harder to understand how are named the resources in
the build directory ;)
Ideally we should not use a source file (referenced in SRCS-y) as a
test resource.


[dpdk-dev] [PATCH v3] rte_hash: add scalable multi-writer insertion w/ Intel TSX

2016-06-24 Thread Thomas Monjalon
> > This patch introduces scalable multi-writer Cuckoo Hash insertion
> > based on a split Cuckoo Search and Move operation using Intel
> > TSX. It can do scalable hash insertion with 22 cores with little
> > performance loss and a negligible TSX abort rate.
> > 
> > * Added an extra rte_hash flag definition to switch default single writer
> >   Cuckoo Hash behavior to multiwriter.
> > - If HTM is available, it would use hardware feature for concurrency.
> > - If HTM is not available, it would fall back to spinlock.
> > 
> > * Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related
> >   cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to
> >   select x86 file or other platform-specific implementations. While HTM 
> > check
> >   is still done at runtime (same idea with
> >   RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
> > 
> > * Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow
> >   rte_cuckoo_hash_x86.h or future platform dependent functions to include.
> > 
> > * Following new functions are created for consistent names when new
> > platform
> >   TM support are added.
> > - rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket
> > movement.
> > - rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement.
> > 
> > * One extra multi-writer test case is added.
> > 
> > Signed-off-by: Shen Wei 
> > Signed-off-by: Sameh Gobriel 
> 
> Acked-by: Pablo de Lara 

Applied, thanks
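
For context, a sketch of how an application opts into multi-writer insertion;
the extra flag is the one added by this series, while the table parameters
below are illustrative only:

#include <rte_hash.h>
#include <rte_hash_crc.h>

static struct rte_hash *
create_multiwriter_hash(void)
{
    struct rte_hash_parameters params = {
        .name = "mw_hash",          /* illustrative */
        .entries = 1 << 20,
        .key_len = 16,
        .hash_func = rte_hash_crc,
        .hash_func_init_val = 0,
        .socket_id = 0,
        /* Allow concurrent adds from multiple writer threads;
         * Intel TSX is used when available, a spinlock otherwise. */
        .extra_flag = RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD,
    };

    return rte_hash_create(&params);
}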


[dpdk-dev] [PATCH v2] mk: fix parallel build of test resources

2016-06-24 Thread Olivier Matz
Hi Thomas,

On 06/24/2016 01:22 PM, Thomas Monjalon wrote:
> --- a/app/test/Makefile
> +++ b/app/test/Makefile
> @@ -43,14 +43,14 @@ define linked_resource
>  SRCS-y += $(1).res.o
>  $(1).res.o: $(2)
>   @  echo '  MKRES $$@'
> - $Q ln -fs $$< resource.tmp
> + $Q [ "$$(   $Q $(OBJCOPY) -I binary -B $(RTE_OBJCOPY_ARCH) -O $(RTE_OBJCOPY_TARGET) 
> \
>   --rename-section \
>   .data=.rodata,alloc,load,data,contents,readonly  \
> - --redefine-sym _binary_resource_tmp_start=beg_$(1)   \
> - --redefine-sym _binary_resource_tmp_end=end_$(1) \
> - --redefine-sym _binary_resource_tmp_size=siz_$(1)\
> - resource.tmp $$@ && rm -f resource.tmp
> + --redefine-sym _binary_$$(subst .,_,$$( + --redefine-sym _binary_$$(subst .,_,$$( + --redefine-sym _binary_$$(subst .,_,$$( + $$(  endef
>  
>  ifeq ($(CONFIG_RTE_APP_TEST_RESOURCE_TAR),y)
> @@ -76,7 +76,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) := commands.c
>  SRCS-y += test.c
>  SRCS-y += resource.c
>  SRCS-y += test_resource.c
> -$(eval $(call linked_resource,test_resource_c,resource.c))
> +test_resource.res: test_resource.c
> + @ cp $< $@
> +$(eval $(call linked_resource,test_resource_c,test_resource.res))
>  $(eval $(call linked_tar_resource,test_resource_tar,test_resource.c))
>  SRCS-$(CONFIG_RTE_APP_TEST_RESOURCE_TAR) += test_pci.c
>  $(eval $(call linked_tar_resource,test_pci_sysfs,test_pci_sysfs))

...this would avoid to rename the resource file and make the patch
easier to understand.

Olivier


[dpdk-dev] [PATCH v2] doc: virtio pmd versions

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: Wang, Zhihong
> Sent: Wednesday, June 15, 2016 12:53 AM
> To: dev at dpdk.org
> Cc: Richardson, Bruce ; Mcnamara, John
> ; Wang, Zhihong 
> Subject: [PATCH v2] doc: virtio pmd versions
> 
> This patch explains all the versions of the current virtio PMD implementation,
> what the differences are, and how to choose the right version.
> 
> Signed-off-by: Zhihong Wang 


Acked-by: John McNamara 



[dpdk-dev] [PATCH v2 12/16] vfio: fix typo in doc for vfio_setup_device

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Viktorin
> Sent: Monday, June 13, 2016 2:02 PM
> To: dev at dpdk.org
> Cc: Jan Viktorin ; Burakov, Anatoly
> ; David Marchand ;
> Wiles, Keith ; Santosh Shukla  mvista.com>;
> Stephen Hemminger ; Shreyansh Jain
> 
> Subject: [dpdk-dev] [PATCH v2 12/16] vfio: fix typo in doc for
> vfio_setup_device
> 
> Signed-off-by: Jan Viktorin 
> Suggested-by: Anatoly Burakov 

Acked-by: John McNamara 



[dpdk-dev] [PATCH] doc: update vhost guide

2016-06-24 Thread Yuanhan Liu
This mainly updates the vhost-user part: we now support client mode.
It also refines some wording, and adds a bit more explanation.

And it makes an emphatic statement that you are advised to use vhost-user
instead of vhost-cuse, because we have enhanced vhost-user a lot since
v2.2. (Actually, I doubt there are any people still using vhost-cuse.)

Signed-off-by: Yuanhan Liu 
---
 doc/guides/prog_guide/vhost_lib.rst | 161 ++--
 1 file changed, 116 insertions(+), 45 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst 
b/doc/guides/prog_guide/vhost_lib.rst
index 48e1fff..ad9ecf8 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
@@ -31,48 +31,100 @@
 Vhost Library
 =

-The vhost library implements a user space vhost driver. It supports both 
vhost-cuse
-(cuse: user space character device) and vhost-user(user space socket server).
-It also creates, manages and destroys vhost devices for corresponding virtio
-devices in the guest. Vhost supported vSwitch could register callbacks to this
-library, which will be called when a vhost device is activated or deactivated
-by guest virtual machine.
+The vhost library implements a user space virtio net server, allowing
+us to manipulate the virtio ring directly. In another word, it allows
+us to fetch/put packets from/to the VM virtio net device. To achieve
+that, we should be able to
+
+* access the guest memory
+
+  For QEMU, this is done by using **-object memory-backend-file,share=on,...**
+  option. Which means QEMU will create a file to serve as the guest RAM.
+  The **share=on** option allows another process to map that file, which
+  means it can access the guest RAM.
+
+* know all the necessary info about the vring, such as where is avail
+  ring stored.
+
+  Vhost defines some messages to tell the backend all the info it needs
+  to know to manipulate the vring.
+
+Currently, there are two ways to pass those messages. That results to we
+have two implementations: vhost-cuse (character devices in user space) and
+vhost-user. Vhost-cuse creates a user space char dev and hook a function
+ioctl, so that all ioctl commands (that represent those messages) sent
+from the frontend (QEMU) will be captured and handled. While vhost-user
+creates a Unix domain socket file, through which those messages are passed.
+
+Note that since DPDK v2.2, we have spent a lot of efforts on enhancing
+vhost-user, such as multiple queue, live migration, reconnect, etc. Thus,
+**you are encouraged to use vhost-user instead of vhost-cuse**.

 Vhost API Overview
 --

-*   Vhost driver registration
+* ``rte_vhost_driver_register(path, flags)``
+
+  It registers a vhost driver into the system. For vhost-cuse, a /dev/``path``
+  character device file will be created. For vhost-user server mode, a Unix
+  domain socket file ``path`` will be created.
+
+  We currently support two flags (both are valid for vhost-user only):
+
+  - ``RTE_VHOST_USER_CLIENT``
+
+DPDK vhost-user will acts as the client when this flag is given. See more
+detailed info below.
+
+  - ``RTE_VHOST_USER_NO_RECONNECT``
+
+When DPDK-vhost user acts as the client, it will keep trying to reconnect
+to the server (QEMU) until it succeeds. This become handy in two cases:
+
+* when QEMU is not started yet.
+* when QEMU restarts (say guest OS reboots).
+
+It's enabled by default. However, you can turn it off by setting this flag.
+

-  rte_vhost_driver_register registers the vhost driver into the system.
-  For vhost-cuse, character device file will be created under the /dev 
directory.
-  Character device name is specified as the parameter.
-  For vhost-user, a Unix domain socket server will be created with the 
parameter as
-  the local socket path.
+* ``rte_vhost_driver_session_start()``

-*   Vhost session start
+  It starts the vhost session loop, to handle those vhost messages. It's an
+  infinite loop, therefore, you should call it in a dedicate thread.

-  rte_vhost_driver_session_start starts the vhost session loop.
-  Vhost session is an infinite blocking loop.
-  Put the session in a dedicate DPDK thread.
+* ``rte_vhost_driver_callback_register(virtio_net_device_ops)``

-*   Callback register
+  It registers a set of callbacks, to let DPDK applications to take proper
+  actions when some events happen. Currently, we have:

-  Vhost supported vSwitch could call rte_vhost_driver_callback_register to
-  register two callbacks, new_destory and destroy_device.
-  When virtio device is activated or deactivated by guest virtual machine,
-  the callback will be called, then vSwitch could put the 
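
A minimal sketch of the registration flow the updated guide describes, using
the calls named in the doc text above (client mode; callback registration is
omitted since the callback signatures depend on the DPDK version, and the
header name and error handling are assumptions):

#include <rte_virtio_net.h>

/* Register a vhost-user port in client mode (QEMU 2.7+ acts as the
 * socket server) and run the message loop. */
static void
start_vhost_user_client(const char *sock_path)
{
    /* Reconnect is on by default; also pass RTE_VHOST_USER_NO_RECONNECT
     * to disable it. */
    if (rte_vhost_driver_register(sock_path, RTE_VHOST_USER_CLIENT) != 0)
        return;

    /* Infinite loop handling vhost-user messages; run it from a
     * dedicated thread in a real application. */
    rte_vhost_driver_session_start();
}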

[dpdk-dev] [RFC] librte_vhost: Add unix domain socket fd registration

2016-06-24 Thread Yuanhan Liu
On Fri, Jun 24, 2016 at 07:43:29AM +, Loftus, Ciara wrote:
> > 
> > On Tue, Jun 21, 2016 at 09:15:03AM -0400, Aaron Conole wrote:
> > > Yuanhan Liu  writes:
> > >
> > > > On Fri, Jun 17, 2016 at 11:32:36AM -0400, Aaron Conole wrote:
> > > >> Prior to this commit, the only way to add a vhost-user socket to the
> > > >> system is by relying on librte_vhost to open the unix domain socket and
> > > >> add it to the unix socket list.  This is problematic for applications
> > > >> which would like to set the permissions,
> > > >
> > > > So, you want to address the issue raised by following patch?
> > > >
> > > > http://dpdk.org/dev/patchwork/patch/1/
> > >
> > > That patch does try to address the issue, however - it has some
> > > problems.  The biggest is a TOCTTOU issue when using chown.  The way to
> > > solve that issue properly is different depending on which operating
> > > system is being used (for instance, FreeBSD doesn't honor
> > > fchown(),fchmod() on file descriptors).  My solution is basically to
> > > punt that responsibility to the controlling application.
> > >
> > > > I would still like to stick to my proposal, that is to introduce a
> > > > new API to do the permission change at anytime, if we end up with
> > > > wanting to introduce a new API.
> > >
> > > I've spent a lot of time looking at the TOCTTOU problem, and I think
> > > that is a really hard problem to solve portably.  Might be good to just
> > > start with the flexible mechanism here that lets the application
> > > developer satisfy their own needs.
> > >
> > > >> or applications which are not
> > > >> directly allowed to open sockets due to policy restrictions.
> > > >
> > > > Could you name a specific example?
> > >
> > > SELinux policy might require one application to open the socket, and
> > > pass it back via a dbus mechanism.  I can't actually think of a concrete
> > > implemented case, so it may not be valid.
> > >
> > > > BTW, JFYI, since 16.07, DPDK supports client mode. It's QEMU (acting
> > > > as the server) will create the socket file. I guess that would diminish
> > > > (or even avoid?) the permission pain that DPDK acting as server brings.
> > > > I doubt the API to do the permission change is really needed then.
> > >
> > > I wouldn't say it 'solves' the issue so much as hopes no one uses server
> > > mode in DPDK.  I agree, for OvS, it could.
> > 
> > Actually, I think I would (personally) suggest people to switch to DPDK
> > vhost-user client mode, for two good reasons:
> > 
> > - it should solve the socket permission issue raised by you and Christian.
> > 
> > - it has the "reconnect" feature since 16.07. Which means guest network
> >   will still work from a DPDK vhost-user restart/crash. DPDK vhost-user
> >   as server simply doesn't support that.
> > 
> > And FYI, Loftus is doing the DPDK for OVS integration. Not quite sure
> > whether she put the client mode as the default mode though.
> 
> Hi Yuanhan,

Hi Ciara,

Thanks for the note.

> I intend to keep the DPDK server-mode as the default. My reasoning is that not
> all users will have access to QEMU v2.7.0 initially. We will keep operating 
> as before
> but have an option to switch to DPDK client mode,

And yes, good point.

> and then perhaps look at
> switching the default in a later release.

Also okay to me.

--yliu
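
As a rough sketch of the application-owned-socket approach discussed in this thread: the application creates and binds the Unix socket itself so it can apply whatever ownership/permission policy it needs, then hands the descriptor to librte_vhost. The registration call at the end is the hypothetical API proposed by this RFC, not an existing librte_vhost function:

#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>

/* Hypothetical RFC API: register an already-created listen fd with vhost. */
extern int rte_vhost_driver_register_fd(int fd);

static int register_owned_socket(const char *path)
{
        struct sockaddr_un un;
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        if (fd < 0)
                return -1;
        memset(&un, 0, sizeof(un));
        un.sun_family = AF_UNIX;
        strncpy(un.sun_path, path, sizeof(un.sun_path) - 1);
        unlink(path);
        if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
            chmod(path, 0660) < 0 ||    /* permissions applied by the application */
            listen(fd, 1) < 0) {
                close(fd);
                return -1;
        }
        return rte_vhost_driver_register_fd(fd);        /* hypothetical */
}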


[dpdk-dev] [PATCH] examples/l3fwd: update documentation

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: Xing, Beilei
> Sent: Thursday, June 23, 2016 10:05 AM
> To: Mcnamara, John 
> Cc: dev at dpdk.org; Xing, Beilei 
> Subject: [PATCH] examples/l3fwd: update documentation
> 
> Update l3fwd documentation with -E, -L and --eth-dest options.
> 

Hi,

Thanks for the doc fixes.


The usage example just before this should also be updated to add these
(and a few other) missing options. Something like:

./l3fwd [EAL options] -- -p PORTMASK
 [-P]
 [-E]
 [-L]
 --config(port,queue,lcore)[,(port,queue,lcore)]
 [--eth-dest=X,MM:MM:MM:MM:MM:MM]
 [--enable-jumbo [--max-pkt-len PKTLEN]]
 [--no-numa]
 [--hash-entry-num]
 [--ipv6]
 [--parse-ptype]



> +*   -E: enable exact match

The options would look better in fixed width quotes:

* ``-E:`` Enable exact match.


> +*   --parse-ptype: optional, set it if use software way to analyze packet
> type. Without this option, HW will check packet type.

Maybe better as:

* ``--parse-ptype:`` Optional, set to use software to analyze packet type. 
Without this option, hardware will check the packet type.

Note, the l3fwd main.c usage should also be updated to add these options
and fix missing (and incorrect) options. I'll send you a patch for that
and you can integrate it with your changes.

John



[dpdk-dev] [PATCH] enic: fix issues when using Rx scatter with multiple RQs

2016-06-24 Thread Nelson Escobar
The Rx scatter patch failed to make a few changes and resulted
in problems when using multiple RQs since the wrong RQ or CQ
was being used.
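
For reference, a sketch of the index relationship the fix restores (an illustration assuming SOP and data RQs occupy adjacent even/odd slots so that both members of a pair post to the same CQ; not driver code):

/* With two user Rx queues: user queue 0 -> SOP RQ 0 + data RQ 1,
 * user queue 1 -> SOP RQ 2 + data RQ 3, one CQ per pair. */
static unsigned int sop_rq(unsigned int user_q)  { return user_q * 2; }
static unsigned int data_rq(unsigned int user_q) { return user_q * 2 + 1; }
static unsigned int cq_of_rq(unsigned int rq)    { return rq / 2; }
/* cq_of_rq(sop_rq(1)) == cq_of_rq(data_rq(1)) == 1 */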

Fixes: 14a261bf0520 ("enic: add scattered Rx support")

Signed-off-by: Nelson Escobar 
Reviewed-by: John Daley 
---
 drivers/net/enic/enic.h  |  2 +-
 drivers/net/enic/enic_main.c | 10 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index ed5f18d..15b1d45 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -174,7 +174,7 @@ static inline unsigned int enic_vnic_rq_count(struct enic 
*enic)

 static inline unsigned int enic_cq_rq(__rte_unused struct enic *enic, unsigned 
int rq)
 {
-   return rq;
+   return rq / 2;
 }

 static inline unsigned int enic_cq_wq(struct enic *enic, unsigned int wq)
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 15389e5..68dbe40 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -238,19 +238,20 @@ void enic_init_vnic_resources(struct enic *enic)
struct vnic_rq *data_rq;

for (index = 0; index < enic->rq_count; index++) {
+   cq_idx = enic_cq_rq(enic, enic_sop_rq(index));
+
vnic_rq_init(&enic->rq[enic_sop_rq(index)],
-   enic_cq_rq(enic, index),
+   cq_idx,
error_interrupt_enable,
error_interrupt_offset);

data_rq = &enic->rq[enic_data_rq(index)];
if (data_rq->in_use)
vnic_rq_init(data_rq,
-enic_cq_rq(enic, index),
+cq_idx,
 error_interrupt_enable,
 error_interrupt_offset);

-   cq_idx = enic_cq_rq(enic, index);
vnic_cq_init(&enic->cq[cq_idx],
0 /* flow_control_enable */,
1 /* color_enable */,
@@ -899,7 +900,8 @@ static int enic_set_rsscpu(struct enic *enic, u8 
rss_hash_bits)
return -ENOMEM;

for (i = 0; i < (1 << rss_hash_bits); i++)
-   (*rss_cpu_buf_va).cpu[i/4].b[i%4] = i % enic->rq_count;
+   (*rss_cpu_buf_va).cpu[i / 4].b[i % 4] =
+   enic_sop_rq(i % enic->rq_count);

err = enic_set_rss_cpu(enic,
rss_cpu_buf_pa,
-- 
2.7.0



[dpdk-dev] [PATCH v2 2/2] enic: add an update MTU function for non-Rx scatter mode

2016-06-24 Thread John Daley
Provide an update MTU callback. The function returns -ENOTSUP
if Rx scatter is enabled. Updating the MTU to be greater than
the value configured via the Cisco CIMC/UCSM management interface
is allowed provided it is still less than the maximum egress packet
size allowed by the NIC minus the size of the L2 header.
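
From the application side (a usage sketch, not part of this patch), the new callback is reached through the standard ethdev call and the error cases described above can be handled explicitly:

#include <errno.h>
#include <stdio.h>
#include <rte_ethdev.h>

/* Illustrative only: try to raise the MTU on a port. */
static int raise_mtu(uint8_t port_id, uint16_t new_mtu)
{
        int ret = rte_eth_dev_set_mtu(port_id, new_mtu);

        if (ret == -ENOTSUP)
                printf("port %u: MTU update not supported (Rx scatter enabled?)\n",
                       (unsigned)port_id);
        else if (ret == -EINVAL)
                printf("port %u: MTU %u outside the supported range\n",
                       (unsigned)port_id, (unsigned)new_mtu);
        return ret;
}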

Signed-off-by: John Daley 
---
v2: Squished 3/4 and 4/4 into 1 patch. Slight change of wording
and fixed typo in commit message.

 doc/guides/nics/overview.rst   |  2 +-
 drivers/net/enic/enic.h|  1 +
 drivers/net/enic/enic_ethdev.c | 10 +-
 drivers/net/enic/enic_main.c   | 44 ++
 4 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index f94f6a2..872392b 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -92,7 +92,7 @@ Most of these differences are summarized below.
Queue status event  
 Y
Rx interrupt Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Queue start/stop   Y   Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y  
 Y Y   Y Y
-   MTU update Y Y Y   Y   Y Y Y Y Y Y  
   Y Y Y
+   MTU update Y Y Y Y Y   Y Y Y Y Y Y  
   Y Y Y
Jumbo frameY Y Y Y Y Y Y Y Y   Y Y Y Y Y Y Y Y Y Y  
 Y Y Y Y
Scattered Rx   Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y  
 Y Y   Y
LROY Y Y Y
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index b557e12..9f5740d 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -263,4 +263,5 @@ uint16_t enic_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
uint16_t nb_pkts);
 uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
   uint16_t nb_pkts);
+int enic_set_mtu(struct enic *enic, uint16_t new_mtu);
 #endif /* _ENIC_H_ */
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 6fa54b2..a7ce064 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -528,6 +528,14 @@ static void enicpmd_remove_mac_addr(struct rte_eth_dev 
*eth_dev, __rte_unused ui
enic_del_mac_address(enic);
 }

+static int enicpmd_mtu_set(struct rte_eth_dev *eth_dev, uint16_t mtu)
+{
+   struct enic *enic = pmd_priv(eth_dev);
+
+   ENICPMD_FUNC_TRACE();
+   return enic_set_mtu(enic, mtu);
+}
+
 static const struct eth_dev_ops enicpmd_eth_dev_ops = {
.dev_configure= enicpmd_dev_configure,
.dev_start= enicpmd_dev_start,
@@ -545,7 +553,7 @@ static const struct eth_dev_ops enicpmd_eth_dev_ops = {
.queue_stats_mapping_set = NULL,
.dev_infos_get= enicpmd_dev_info_get,
.dev_supported_ptypes_get = enicpmd_dev_supported_ptypes_get,
-   .mtu_set  = NULL,
+   .mtu_set  = enicpmd_mtu_set,
.vlan_filter_set  = enicpmd_vlan_filter_set,
.vlan_tpid_set= NULL,
.vlan_offload_set = enicpmd_vlan_offload_set,
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 91883f8..1d16f0e 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -1005,6 +1005,50 @@ int enic_set_vnic_res(struct enic *enic)
return rc;
 }

+/* The Cisco NIC can send and receive packets up to a max packet size
+ * determined by the NIC type and firmware. There is also an MTU
+ * configured into the NIC via the CIMC/UCSM management interface
+ * which can be overridden by this function (up to the max packet size).
+ * Depending on the network setup, doing so may cause packet drops
+ * and unexpected behavior.
+ */
+int enic_set_mtu(struct enic *enic, uint16_t new_mtu)
+{
+   uint16_t old_mtu;   /* previous setting */
+   uint16_t config_mtu;/* Value configured into NIC via CIMC/UCSM */
+   struct rte_eth_dev *eth_dev = enic->rte_dev;
+
+   old_mtu = eth_dev->data->mtu;
+   config_mtu = enic->config.mtu;
+
+   /* only works with Rx scatter disabled */
+   if (enic->rte_dev->data->dev_conf.rxmode.enable_scatter)
+   return -ENOTSUP;
+
+   if (new_mtu > enic->max_mtu) {
+   dev_err(enic,
+   "MTU not updated: requested (%u) greater than max 
(%u)\n",
+   new_mtu, enic->max_mtu);
+   return -EINVAL;
+   }
+   if (new_mtu < ENIC_MIN_MTU) {
+   dev_info(enic,
+   "MTU not updated: requested (%u) less than min (%u)\n",
+   new_mtu, ENIC_MIN_MTU);
+   return -EINVAL;
+   }
+   if (new_mtu > config_mtu)
+   dev_warning(enic,
+   "MTU 

[dpdk-dev] [PATCH v2 1/2] enic: determine max egress packet size and max MTU

2016-06-24 Thread John Daley
Pull in common VNIC code which enables querying for max egress
packet size with newer firmware via a device command. If the
field is non-zero, it is the max egress packet size. If it is
0, the default value (9022) can safely be assumed. The value
for 1300 series VICs using firmware versions >= 3.1.2 for blade
series and >= 2.0.13 for rack series servers is 9208.

Tx buffers can be emitted only if they are less than the max egress
packet size regardless of the MTU setting (the MTU is advisory).
The max egress packet size can be used to determine the upper limit
of the MTU since the enic can also receive packets of size greater
than max egress packet size. A max_mtu variable is added with
a value of max egress packet size minus L2 header size.

The default MTU is set via the CIMC/UCSM management interface and
currently allows values up to 9000. If the value is changed, the
host must be rebooted. To avoid the reboot and allow MTU values
up to the max capability of the NIC, MTU update capability will
be added with a max value capped by max_mtu.
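
A short worked example of the limit described above (illustration only; the 14 + 4 byte L2 overhead and the 9022 default are taken from the commit message):

#include <stdint.h>

#define L2_OVERHEAD (14 + 4)    /* Ethernet header + VLAN tag */

/* Simplified sketch: derive the MTU ceiling from the max egress packet
 * size reported by firmware (0 means the field is absent). */
static uint16_t max_mtu_example(uint32_t max_pkt_size)
{
        if (max_pkt_size == 0)
                max_pkt_size = 9022;    /* default for legacy firmware/VICs */
        return max_pkt_size - L2_OVERHEAD;
        /* 9208 (new firmware) -> MTU ceiling 9190
         * 9022 (default)      -> MTU ceiling 9004 */
}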

Signed-off-by: John Daley 
---
v2: Squished patch 1/4 and 2/4 into one. Tried to do a little
better explanation of the intent of the patch in the commit
message.

 drivers/net/enic/base/vnic_enet.h | 17 -
 drivers/net/enic/enic.h   |  1 +
 drivers/net/enic/enic_ethdev.c|  3 ++-
 drivers/net/enic/enic_res.c   | 25 +
 drivers/net/enic/enic_res.h   |  4 +++-
 5 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/drivers/net/enic/base/vnic_enet.h 
b/drivers/net/enic/base/vnic_enet.h
index cc34998..5062247 100644
--- a/drivers/net/enic/base/vnic_enet.h
+++ b/drivers/net/enic/base/vnic_enet.h
@@ -35,6 +35,10 @@
 #ifndef _VNIC_ENIC_H_
 #define _VNIC_ENIC_H_

+/* Hardware intr coalesce timer is in units of 1.5us */
+#define INTR_COALESCE_USEC_TO_HW(usec) ((usec) * 2 / 3)
+#define INTR_COALESCE_HW_TO_USEC(usec) ((usec) * 3 / 2)
+
 /* Device-specific region: enet configuration */
 struct vnic_enet_config {
u32 flags;
@@ -50,6 +54,12 @@ struct vnic_enet_config {
u16 vf_rq_count;
u16 num_arfs;
u64 mem_paddr;
+   u16 rdma_qp_id;
+   u16 rdma_qp_count;
+   u16 rdma_resgrp;
+   u32 rdma_mr_id;
+   u32 rdma_mr_count;
+   u32 max_pkt_size;
 };

 #define VENETF_TSO 0x1 /* TSO enabled */
@@ -64,9 +74,14 @@ struct vnic_enet_config {
 #define VENETF_RSSHASH_IPV6_EX 0x200   /* Hash on IPv6 extended fields */
 #define VENETF_RSSHASH_TCPIPV6_EX 0x400/* Hash on TCP + IPv6 ext. 
fields */
 #define VENETF_LOOP0x800   /* Loopback enabled */
-#define VENETF_VMQ 0x4000  /* using VMQ flag for VMware NETQ */
+#define VENETF_FAILOVER0x1000  /* Fabric failover enabled */
+#define VENETF_USPACE_NIC   0x2000 /* vHPC enabled */
+#define VENETF_VMQ  0x4000 /* VMQ enabled */
+#define VENETF_ARFS0x8000  /* ARFS enabled */
 #define VENETF_VXLAN0x1 /* VxLAN offload */
 #define VENETF_NVGRE0x2 /* NVGRE offload */
+#define VENETF_GRPINTR  0x4 /* group interrupt */
+
 #define VENET_INTR_TYPE_MIN0   /* Timer specs min interrupt spacing */
 #define VENET_INTR_TYPE_IDLE   1   /* Timer specs idle time before irq */

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index df302ff..b557e12 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -121,6 +121,7 @@ struct enic {
u8 ig_vlan_strip_en;
int link_status;
u8 hw_ip_checksum;
+   u16 max_mtu;

unsigned int flags;
unsigned int priv_flags;
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 83048d8..6fa54b2 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -439,7 +439,8 @@ static void enicpmd_dev_info_get(struct rte_eth_dev 
*eth_dev,
device_info->max_rx_queues = enic->rq_count;
device_info->max_tx_queues = enic->wq_count;
device_info->min_rx_bufsize = ENIC_MIN_MTU;
-   device_info->max_rx_pktlen = enic->config.mtu;
+   device_info->max_rx_pktlen = enic->rte_dev->data->mtu
+  + ETHER_HDR_LEN + 4;
device_info->max_mac_addrs = 1;
device_info->rx_offload_capa =
DEV_RX_OFFLOAD_VLAN_STRIP |
diff --git a/drivers/net/enic/enic_res.c b/drivers/net/enic/enic_res.c
index 42edd84..b271d34 100644
--- a/drivers/net/enic/enic_res.c
+++ b/drivers/net/enic/enic_res.c
@@ -83,6 +83,20 @@ int enic_get_vnic_config(struct enic *enic)
GET_CONFIG(intr_timer_usec);
GET_CONFIG(loop_tag);
GET_CONFIG(num_arfs);
+   GET_CONFIG(max_pkt_size);
+
+   /* max packet size is only defined in newer VIC firmware
+* and will be 0 for legacy firmware and VICs
+*/
+   if (c->max_pkt_size > ENIC_DEFAULT_MAX_PKT_SIZE)
+   enic->max_mtu = c->max_pkt_size - (ETHER_HDR_LEN + 4);
+   

[dpdk-dev] [PATCH v7 25/25] mlx5: resurrect Rx scatter support

2016-06-24 Thread Nelio Laranjeiro
This commit brings back Rx scatter and related support by the MTU update
function. The maximum number of segments per packet is not a fixed value
anymore (previously MLX5_PMD_SGE_WR_N, set to 4 by default) as it caused
performance issues when fewer segments were actually needed as well as
limitations on the maximum packet size that could be received with the
default mbuf size (supporting at most 8576 bytes).

These limitations are now lifted as the number of SGEs is derived from the
MTU (which implies MRU) at queue initialization and during MTU update.
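
A worked sketch of the derivation mentioned above (illustrative values; mb_len is the mbuf data room including headroom, and the rounding mirrors the size-to-segment logic applied at queue setup):

#include <stdint.h>

/* Smallest exponent l such that (1 << l) >= v (a log2above equivalent). */
static unsigned int log2_above(unsigned int v)
{
        unsigned int l;

        for (l = 0; (1u << l) < v; ++l)
                ;
        return l;
}

/* Example: MTU 9000 with 2048-byte mbufs and 128-byte headroom.
 * Frame = 14 + 9000 + 4 = 9018 bytes; adding the headroom gives
 * 9146 bytes -> 5 buffers -> rounded up to 2^3 = 8 SGEs per packet. */
static unsigned int sges_for_mtu(uint16_t mtu, uint16_t mb_len, uint16_t headroom)
{
        unsigned int size = headroom + 14 + mtu + 4;
        unsigned int segs = size / mb_len + !!(size % mb_len);

        return 1u << log2_above(segs);
}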

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_ethdev.c |  90 ++
 drivers/net/mlx5/mlx5_rxq.c|  77 ++-
 drivers/net/mlx5/mlx5_rxtx.c   | 139 -
 drivers/net/mlx5/mlx5_rxtx.h   |   1 +
 4 files changed, 225 insertions(+), 82 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 698a50e..72f0826 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -725,6 +725,9 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
unsigned int i;
uint16_t (*rx_func)(void *, struct rte_mbuf **, uint16_t) =
mlx5_rx_burst;
+   unsigned int max_frame_len;
+   int rehash;
+   int restart = priv->started;

if (mlx5_is_secondary())
return -E_RTE_SECONDARY;
@@ -738,7 +741,6 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
goto out;
} else
DEBUG("adapter port %u MTU set to %u", priv->port, mtu);
-   priv->mtu = mtu;
/* Temporarily replace RX handler with a fake one, assuming it has not
 * been copied elsewhere. */
dev->rx_pkt_burst = removed_rx_burst;
@@ -746,28 +748,94 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
 * removed_rx_burst() instead. */
rte_wmb();
usleep(1000);
+   /* MTU does not include header and CRC. */
+   max_frame_len = ETHER_HDR_LEN + mtu + ETHER_CRC_LEN;
+   /* Check if at least one queue is going to need a SGE update. */
+   for (i = 0; i != priv->rxqs_n; ++i) {
+   struct rxq *rxq = (*priv->rxqs)[i];
+   unsigned int mb_len;
+   unsigned int size = RTE_PKTMBUF_HEADROOM + max_frame_len;
+   unsigned int sges_n;
+
+   if (rxq == NULL)
+   continue;
+   mb_len = rte_pktmbuf_data_room_size(rxq->mp);
+   assert(mb_len >= RTE_PKTMBUF_HEADROOM);
+   /*
+* Determine the number of SGEs needed for a full packet
+* and round it to the next power of two.
+*/
+   sges_n = log2above((size / mb_len) + !!(size % mb_len));
+   if (sges_n != rxq->sges_n)
+   break;
+   }
+   /*
+* If all queues have the right number of SGEs, a simple rehash
+* of their buffers is enough, otherwise SGE information can only
+* be updated in a queue by recreating it. All resources that depend
+* on queues (flows, indirection tables) must be recreated as well in
+* that case.
+*/
+   rehash = (i == priv->rxqs_n);
+   if (!rehash) {
+   /* Clean up everything as with mlx5_dev_stop(). */
+   priv_special_flow_disable_all(priv);
+   priv_mac_addrs_disable(priv);
+   priv_destroy_hash_rxqs(priv);
+   priv_fdir_disable(priv);
+   priv_dev_interrupt_handler_uninstall(priv, dev);
+   }
+recover:
/* Reconfigure each RX queue. */
for (i = 0; (i != priv->rxqs_n); ++i) {
struct rxq *rxq = (*priv->rxqs)[i];
-   unsigned int mb_len;
-   unsigned int max_frame_len;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of(rxq, struct rxq_ctrl, rxq);
int sp;
+   unsigned int mb_len;
+   unsigned int tmp;

if (rxq == NULL)
continue;
-   /* Calculate new maximum frame length according to MTU and
-* toggle scattered support (sp) if necessary. */
-   max_frame_len = (priv->mtu + ETHER_HDR_LEN +
-(ETHER_MAX_VLAN_FRAME_LEN - ETHER_MAX_LEN));
mb_len = rte_pktmbuf_data_room_size(rxq->mp);
assert(mb_len >= RTE_PKTMBUF_HEADROOM);
+   /* Toggle scattered support (sp) if necessary. */
sp = (max_frame_len > (mb_len - RTE_PKTMBUF_HEADROOM));
-   if (sp) {
-   ERROR("%p: RX scatter is not supported", (void *)dev);
-   ret = ENOTSUP;
-   goto out;
+   /* Provide new values to 

[dpdk-dev] [PATCH v7 24/25] mlx5: make Rx queue reinitialization safer

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

The primary purpose of rxq_rehash() function is to stop and restart
reception on a queue after re-posting buffers. This may fail if the array
that temporarily stores existing buffers for reuse cannot be allocated.

Update rxq_rehash() to work on the target queue directly (not through a
template copy) and avoid this allocation.

rxq_alloc_elts() is modified accordingly to take buffers from an existing
queue directly and update their refcount.

Unlike rxq_rehash(), rxq_setup() must work on a temporary structure but
should not allocate new mbufs from the pool while reinitializing an
existing queue. This is achieved by using the refcount-aware
rxq_alloc_elts() before overwriting queue data.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_rxq.c | 83 ++---
 1 file changed, 41 insertions(+), 42 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index fbf14fa..b2ddd0d 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -642,7 +642,7 @@ priv_rehash_flows(struct priv *priv)
  */
 static int
 rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
-  struct rte_mbuf **pool)
+  struct rte_mbuf *(*pool)[])
 {
unsigned int i;
int ret = 0;
@@ -654,9 +654,10 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int 
elts_n,
&(*rxq_ctrl->rxq.wqes)[i];

if (pool != NULL) {
-   buf = *(pool++);
+   buf = (*pool)[i];
assert(buf != NULL);
rte_pktmbuf_reset(buf);
+   rte_pktmbuf_refcnt_update(buf, 1);
} else
buf = rte_pktmbuf_alloc(rxq_ctrl->rxq.mp);
if (buf == NULL) {
@@ -781,7 +782,7 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 }

 /**
- * Reconfigure a RX queue with new parameters.
+ * Reconfigure RX queue buffers.
  *
  * rxq_rehash() does not allocate mbufs, which, if not done from the right
  * thread (such as a control thread), may corrupt the pool.
@@ -798,67 +799,48 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 int
 rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl *rxq_ctrl)
 {
-   struct rxq_ctrl tmpl = *rxq_ctrl;
-   unsigned int mbuf_n;
-   unsigned int desc_n;
-   struct rte_mbuf **pool;
-   unsigned int i, k;
+   unsigned int elts_n = rxq_ctrl->rxq.elts_n;
+   unsigned int i;
struct ibv_exp_wq_attr mod;
int err;

DEBUG("%p: rehashing queue %p", (void *)dev, (void *)rxq_ctrl);
-   /* Number of descriptors and mbufs currently allocated. */
-   desc_n = tmpl.rxq.elts_n;
-   mbuf_n = desc_n;
/* From now on, any failure will render the queue unusable.
 * Reinitialize WQ. */
mod = (struct ibv_exp_wq_attr){
.attr_mask = IBV_EXP_WQ_ATTR_STATE,
.wq_state = IBV_EXP_WQS_RESET,
};
-   err = ibv_exp_modify_wq(tmpl.wq, &mod);
+   err = ibv_exp_modify_wq(rxq_ctrl->wq, &mod);
if (err) {
ERROR("%p: cannot reset WQ: %s", (void *)dev, strerror(err));
assert(err > 0);
return err;
}
-   /* Allocate pool. */
-   pool = rte_malloc(__func__, (mbuf_n * sizeof(*pool)), 0);
-   if (pool == NULL) {
-   ERROR("%p: cannot allocate memory", (void *)dev);
-   return ENOBUFS;
-   }
/* Snatch mbufs from original queue. */
-   k = 0;
-   for (i = 0; (i != desc_n); ++i)
-   pool[k++] = (*rxq_ctrl->rxq.elts)[i];
-   assert(k == mbuf_n);
-   rte_free(pool);
+   claim_zero(rxq_alloc_elts(rxq_ctrl, elts_n, rxq_ctrl->rxq.elts));
+   for (i = 0; i != elts_n; ++i) {
+   struct rte_mbuf *buf = (*rxq_ctrl->rxq.elts)[i];
+
+   assert(rte_mbuf_refcnt_read(buf) == 2);
+   rte_pktmbuf_free_seg(buf);
+   }
/* Change queue state to ready. */
mod = (struct ibv_exp_wq_attr){
.attr_mask = IBV_EXP_WQ_ATTR_STATE,
.wq_state = IBV_EXP_WQS_RDY,
};
-   err = ibv_exp_modify_wq(tmpl.wq, &mod);
+   err = ibv_exp_modify_wq(rxq_ctrl->wq, &mod);
if (err) {
ERROR("%p: WQ state to IBV_EXP_WQS_RDY failed: %s",
  (void *)dev, strerror(err));
goto error;
}
-   /* Post SGEs. */
-   err = rxq_alloc_elts(&tmpl, desc_n, pool);
-   if (err) {
-   ERROR("%p: cannot reallocate WRs, aborting", (void *)dev);
-   rte_free(pool);
-   assert(err > 0);
-   return err;
-   }
/* Update doorbell counter. */
-   rxq_ctrl->rxq.rq_ci = desc_n;
+   rxq_ctrl->rxq.rq_ci = elts_n;
rte_wmb();
*rxq_ctrl->rxq.rq_db = htonl(rxq_ctrl->rxq.rq_ci);
 

[dpdk-dev] [PATCH v7 22/25] mlx5: work around spurious compilation errors

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

Since commit "mlx5: resurrect Tx gather support", older GCC versions (such
as 4.8.5) may complain about the following:

 mlx5_rxtx.c: In function `mlx5_tx_burst':
 mlx5_rxtx.c:705:25: error: `wqe' may be used uninitialized in this
 function [-Werror=maybe-uninitialized]

 mlx5_rxtx.c: In function `mlx5_tx_burst_inline':
 mlx5_rxtx.c:864:25: error: `wqe' may be used uninitialized in this
 function [-Werror=maybe-uninitialized]

In both cases, this code cannot be reached when wqe is not initialized.

Considering older GCC versions are still widely used, work around this
issue by initializing wqe preemptively, even if it should not be necessary.

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index c72e7ce..8b67949 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -593,7 +593,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
unsigned int j = 0;
unsigned int max;
unsigned int comp;
-   volatile union mlx5_wqe *wqe;
+   volatile union mlx5_wqe *wqe = NULL;

if (unlikely(!pkts_n))
return 0;
@@ -741,7 +741,7 @@ mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf 
**pkts, uint16_t pkts_n)
unsigned int j = 0;
unsigned int max;
unsigned int comp;
-   volatile union mlx5_wqe *wqe;
+   volatile union mlx5_wqe *wqe = NULL;
unsigned int max_inline = txq->max_inline;

if (unlikely(!pkts_n))
-- 
2.1.4



[dpdk-dev] [PATCH v7 21/25] mlx5: resurrect Tx gather support

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

Compared to its previous incarnation, the software limit on the number of
mbuf segments is no more (previously MLX5_PMD_SGE_WR_N, set to 4 by
default) hence no need for linearization code and related buffers that
permanently consumed a non negligible amount of memory to handle oversized
mbufs.

The resulting code is both lighter and faster.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_rxtx.c | 235 +--
 drivers/net/mlx5/mlx5_txq.c  |   8 +-
 2 files changed, 188 insertions(+), 55 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fadc182..c72e7ce 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -303,6 +303,7 @@ mlx5_wqe_write(struct txq *txq, volatile union mlx5_wqe 
*wqe,
 {
wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
+   wqe->wqe.ctrl.data[2] = 0;
wqe->wqe.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -348,6 +349,7 @@ mlx5_wqe_write_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,

wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
+   wqe->wqe.ctrl.data[2] = 0;
wqe->wqe.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -425,6 +427,7 @@ mlx5_wqe_write_inline(struct txq *txq, volatile union 
mlx5_wqe *wqe,
assert(size < 64);
wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
+   wqe->inl.ctrl.data[2] = 0;
wqe->inl.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -498,6 +501,7 @@ mlx5_wqe_write_inline_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,
assert(size < 64);
wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
+   wqe->inl.ctrl.data[2] = 0;
wqe->inl.ctrl.data[3] = 0;
wqe->inl.eseg.rsvd0 = 0;
wqe->inl.eseg.rsvd1 = 0;
@@ -586,6 +590,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
uint16_t elts_head = txq->elts_head;
const unsigned int elts_n = txq->elts_n;
unsigned int i = 0;
+   unsigned int j = 0;
unsigned int max;
unsigned int comp;
volatile union mlx5_wqe *wqe;
@@ -602,23 +607,27 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
if (max > elts_n)
max -= elts_n;
do {
-   struct rte_mbuf *buf;
+   struct rte_mbuf *buf = *(pkts++);
unsigned int elts_head_next;
uintptr_t addr;
uint32_t length;
uint32_t lkey;
+   unsigned int segs_n = buf->nb_segs;
+   volatile struct mlx5_wqe_data_seg *dseg;
+   unsigned int ds = sizeof(*wqe) / 16;

/*
 * Make sure there is enough room to store this packet and
 * that one ring entry remains unused.
 */
-   if (max < 1 + 1)
+   assert(segs_n);
+   if (max < segs_n + 1)
break;
-   --max;
+   max -= segs_n;
--pkts_n;
-   buf = *(pkts++);
elts_head_next = (elts_head + 1) & (elts_n - 1);
wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
+   dseg = &wqe->wqe.dseg;
rte_prefetch0(wqe);
if (pkts_n)
rte_prefetch0(*pkts);
@@ -638,7 +647,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
buf->vlan_tci);
else
mlx5_wqe_write(txq, wqe, addr, length, lkey);
-   wqe->wqe.ctrl.data[2] = 0;
/* Should we enable HW CKSUM offload */
if (buf->ol_flags &
(PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -648,6 +656,37 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
} else {
wqe->wqe.eseg.cs_flags = 0;
}
+   while (--segs_n) {
+   /*
+* Spill on next WQE when the current one does not have
+* enough room left. Size of WQE must a be a multiple
+* of data segment size.
+*/
+   assert(!(sizeof(*wqe) % sizeof(*dseg)));
+   if (!(ds % (sizeof(*wqe) / 16)))
+  

[dpdk-dev] [PATCH v7 20/25] mlx5: check remaining space while processing Tx burst

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

The space necessary to store segmented packets cannot be known in advance
and must be verified for each of them.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_rxtx.c | 144 +++
 1 file changed, 78 insertions(+), 66 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index ed2b5fe..fadc182 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -585,50 +585,51 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
struct txq *txq = (struct txq *)dpdk_txq;
uint16_t elts_head = txq->elts_head;
const unsigned int elts_n = txq->elts_n;
-   unsigned int i;
+   unsigned int i = 0;
unsigned int max;
unsigned int comp;
volatile union mlx5_wqe *wqe;
-   struct rte_mbuf *buf;

if (unlikely(!pkts_n))
return 0;
-   buf = pkts[0];
/* Prefetch first packet cacheline. */
tx_prefetch_cqe(txq, txq->cq_ci);
tx_prefetch_cqe(txq, txq->cq_ci + 1);
-   rte_prefetch0(buf);
+   rte_prefetch0(*pkts);
/* Start processing. */
txq_complete(txq);
max = (elts_n - (elts_head - txq->elts_tail));
if (max > elts_n)
max -= elts_n;
-   assert(max >= 1);
-   assert(max <= elts_n);
-   /* Always leave one free entry in the ring. */
-   --max;
-   if (max == 0)
-   return 0;
-   if (max > pkts_n)
-   max = pkts_n;
-   for (i = 0; (i != max); ++i) {
-   unsigned int elts_head_next = (elts_head + 1) & (elts_n - 1);
+   do {
+   struct rte_mbuf *buf;
+   unsigned int elts_head_next;
uintptr_t addr;
uint32_t length;
uint32_t lkey;

+   /*
+* Make sure there is enough room to store this packet and
+* that one ring entry remains unused.
+*/
+   if (max < 1 + 1)
+   break;
+   --max;
+   --pkts_n;
+   buf = *(pkts++);
+   elts_head_next = (elts_head + 1) & (elts_n - 1);
wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
rte_prefetch0(wqe);
-   if (i + 1 < max)
-   rte_prefetch0(pkts[i + 1]);
+   if (pkts_n)
+   rte_prefetch0(*pkts);
/* Retrieve buffer information. */
addr = rte_pktmbuf_mtod(buf, uintptr_t);
length = DATA_LEN(buf);
/* Update element. */
(*txq->elts)[elts_head] = buf;
/* Prefetch next buffer data. */
-   if (i + 1 < max)
-   rte_prefetch0(rte_pktmbuf_mtod(pkts[i + 1],
+   if (pkts_n)
+   rte_prefetch0(rte_pktmbuf_mtod(*pkts,
   volatile void *));
/* Retrieve Memory Region key for this memory pool. */
lkey = txq_mp2mr(txq, txq_mb2mp(buf));
@@ -652,8 +653,8 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
txq->stats.obytes += length;
 #endif
elts_head = elts_head_next;
-   buf = pkts[i + 1];
-   }
+   ++i;
+   } while (pkts_n);
/* Take a shortcut if nothing must be sent. */
if (unlikely(i == 0))
return 0;
@@ -697,44 +698,45 @@ mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf 
**pkts, uint16_t pkts_n)
struct txq *txq = (struct txq *)dpdk_txq;
uint16_t elts_head = txq->elts_head;
const unsigned int elts_n = txq->elts_n;
-   unsigned int i;
+   unsigned int i = 0;
unsigned int max;
unsigned int comp;
volatile union mlx5_wqe *wqe;
-   struct rte_mbuf *buf;
unsigned int max_inline = txq->max_inline;

if (unlikely(!pkts_n))
return 0;
-   buf = pkts[0];
/* Prefetch first packet cacheline. */
tx_prefetch_cqe(txq, txq->cq_ci);
tx_prefetch_cqe(txq, txq->cq_ci + 1);
-   rte_prefetch0(buf);
+   rte_prefetch0(*pkts);
/* Start processing. */
txq_complete(txq);
max = (elts_n - (elts_head - txq->elts_tail));
if (max > elts_n)
max -= elts_n;
-   assert(max >= 1);
-   assert(max <= elts_n);
-   /* Always leave one free entry in the ring. */
-   --max;
-   if (max == 0)
-   return 0;
-   if (max > pkts_n)
-   max = pkts_n;
-   for (i = 0; (i != max); ++i) {
-   unsigned int elts_head_next = (elts_head + 1) & (elts_n - 1);
+   do {
+   struct rte_mbuf *buf;
+   unsigned int 

[dpdk-dev] [PATCH v7 19/25] mlx5: add debugging information about Tx queues capabilities

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_txq.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 4f17fb0..bae9f3d 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -343,6 +343,11 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl 
*txq_ctrl,
  (void *)dev, strerror(ret));
goto error;
}
+   DEBUG("TX queue capabilities: max_send_wr=%u, max_send_sge=%u,"
+ " max_inline_data=%u",
+ attr.init.cap.max_send_wr,
+ attr.init.cap.max_send_sge,
+ attr.init.cap.max_inline_data);
attr.mod = (struct ibv_exp_qp_attr){
/* Move the QP to this state. */
.qp_state = IBV_QPS_INIT,
-- 
2.1.4



[dpdk-dev] [PATCH v7 18/25] mlx5: add support for multi-packet send

2016-06-24 Thread Nelio Laranjeiro
This feature enables the TX burst function to emit up to 5 packets using
only two WQEs on devices that support it. Saves PCI bandwidth and improves
performance.
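
For completeness, a parameter like this is passed as an mlx5 device argument. Assuming the usual EAL whitelist syntax (the PCI address below is a placeholder), disabling the feature for one port would look like:

    testpmd -c 0xff -n 4 -w 0000:05:00.1,txq_mpw_en=0 -- -i

The same form applies to the other run-time parameters introduced in this series (txq_inline, txqs_min_inline, rxq_cqe_comp_en).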

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Olga Shern 
---
 doc/guides/nics/mlx5.rst   |  10 +
 drivers/net/mlx5/mlx5.c|  14 +-
 drivers/net/mlx5/mlx5_ethdev.c |  15 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 407 +
 drivers/net/mlx5/mlx5_rxtx.h   |   2 +
 drivers/net/mlx5/mlx5_txq.c|   2 +-
 6 files changed, 446 insertions(+), 4 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 9ada221..063c4a5 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -171,6 +171,16 @@ Run-time configuration

   This option should be used in combination with ``txq_inline`` above.

+- ``txq_mpw_en`` parameter [int]
+
+  A nonzero value enables multi-packet send. This feature allows the TX
+  burst function to pack up to five packets in two descriptors in order to
+  save PCI bandwidth and improve performance at the cost of a slightly
+  higher CPU usage.
+
+  It is currently only supported on the ConnectX-4 Lx family of adapters.
+  Enabled by default.
+
 Prerequisites
 -

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 73069d2..5aa4adc 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -81,6 +81,9 @@
  */
 #define MLX5_TXQS_MIN_INLINE "txqs_min_inline"

+/* Device parameter to enable multi-packet send WQEs. */
+#define MLX5_TXQ_MPW_EN "txq_mpw_en"
+
 /**
  * Retrieve integer value from environment variable.
  *
@@ -282,6 +285,8 @@ mlx5_args_check(const char *key, const char *val, void 
*opaque)
priv->txq_inline = tmp;
} else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0) {
priv->txqs_inline = tmp;
+   } else if (strcmp(MLX5_TXQ_MPW_EN, key) == 0) {
+   priv->mps = !!tmp;
} else {
WARN("%s: unknown parameter", key);
return -EINVAL;
@@ -307,6 +312,7 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
MLX5_RXQ_CQE_COMP_EN,
MLX5_TXQ_INLINE,
MLX5_TXQS_MIN_INLINE,
+   MLX5_TXQ_MPW_EN,
NULL,
};
struct rte_kvargs *kvlist;
@@ -503,6 +509,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
priv->port = port;
priv->pd = pd;
priv->mtu = ETHER_MTU;
+   priv->mps = mps; /* Enable MPW by default if supported. */
priv->cqe_comp = 1; /* Enable compression by default. */
err = mlx5_args(priv, pci_dev->devargs);
if (err) {
@@ -551,7 +558,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)

priv_get_num_vfs(priv, &num_vfs);
priv->sriov = (num_vfs || sriov);
-   priv->mps = mps;
+   if (priv->mps && !mps) {
+   ERROR("multi-packet send not supported on this device"
+ " (" MLX5_TXQ_MPW_EN ")");
+   err = ENOTSUP;
+   goto port_error;
+   }
/* Allocate and register default RSS hash keys. */
priv->rss_conf = rte_calloc(__func__, hash_rxq_init_n,
sizeof((*priv->rss_conf)[0]), 0);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index aeea4ff..698a50e 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -584,7 +584,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *info)
  DEV_RX_OFFLOAD_UDP_CKSUM |
  DEV_RX_OFFLOAD_TCP_CKSUM) :
 0);
-   info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+   if (!priv->mps)
+   info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
if (priv->hw_csum)
info->tx_offload_capa |=
(DEV_TX_OFFLOAD_IPV4_CKSUM |
@@ -1318,7 +1319,17 @@ void
 priv_select_tx_function(struct priv *priv)
 {
priv->dev->tx_pkt_burst = mlx5_tx_burst;
-   if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
+   /* Display warning for unsupported configurations. */
+   if (priv->sriov && priv->mps)
+   WARN("multi-packet send WQE cannot be used on a SR-IOV setup");
+   /* Select appropriate TX function. */
+   if ((priv->sriov == 0) && priv->mps && priv->txq_inline) {
+   priv->dev->tx_pkt_burst = mlx5_tx_burst_mpw_inline;
+   DEBUG("selected MPW inline TX function");
+   } else if ((priv->sriov == 0) && priv->mps) {
+   priv->dev->tx_pkt_burst = mlx5_tx_burst_mpw;
+   DEBUG("selected MPW TX function");
+   } else if 

[dpdk-dev] [PATCH v7 17/25] mlx5: add support for inline send

2016-06-24 Thread Nelio Laranjeiro
From: Yaacov Hazan 

Implement send inline feature which copies packet data directly into WQEs
for improved latency. The maximum packet size and the minimum number of Tx
queues to qualify for inline send are user-configurable.

This feature is effective when HW causes a performance bottleneck.

Signed-off-by: Yaacov Hazan 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 doc/guides/nics/mlx5.rst   |  17 +++
 drivers/net/mlx5/mlx5.c|  15 +++
 drivers/net/mlx5/mlx5.h|   2 +
 drivers/net/mlx5/mlx5_ethdev.c |   5 +
 drivers/net/mlx5/mlx5_rxtx.c   | 273 +
 drivers/net/mlx5/mlx5_rxtx.h   |   2 +
 drivers/net/mlx5/mlx5_txq.c|   4 +
 7 files changed, 318 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 756153b..9ada221 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -154,6 +154,23 @@ Run-time configuration
   allows to save PCI bandwidth and improve performance at the cost of a
   slightly higher CPU usage.  Enabled by default.

+- ``txq_inline`` parameter [int]
+
+  Amount of data to be inlined during TX operations. Improves latency.
+  Can improve PPS performance when PCI back pressure is detected and may be
+  useful for scenarios involving heavy traffic on many queues.
+
+  It is not enabled by default (set to 0) since the additional software
+  logic necessary to handle this mode can lower performance when back
+  pressure is not expected.
+
+- ``txqs_min_inline`` parameter [int]
+
+  Enable inline send only when the number of TX queues is greater or equal
+  to this value.
+
+  This option should be used in combination with ``txq_inline`` above.
+
 Prerequisites
 -

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 630e5e4..73069d2 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -72,6 +72,15 @@
 /* Device parameter to enable RX completion queue compression. */
 #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"

+/* Device parameter to configure inline send. */
+#define MLX5_TXQ_INLINE "txq_inline"
+
+/*
+ * Device parameter to configure the number of TX queues threshold for
+ * enabling inline send.
+ */
+#define MLX5_TXQS_MIN_INLINE "txqs_min_inline"
+
 /**
  * Retrieve integer value from environment variable.
  *
@@ -269,6 +278,10 @@ mlx5_args_check(const char *key, const char *val, void 
*opaque)
}
if (strcmp(MLX5_RXQ_CQE_COMP_EN, key) == 0) {
priv->cqe_comp = !!tmp;
+   } else if (strcmp(MLX5_TXQ_INLINE, key) == 0) {
+   priv->txq_inline = tmp;
+   } else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0) {
+   priv->txqs_inline = tmp;
} else {
WARN("%s: unknown parameter", key);
return -EINVAL;
@@ -292,6 +305,8 @@ mlx5_args(struct priv *priv, struct rte_devargs *devargs)
 {
const char **params = (const char *[]){
MLX5_RXQ_CQE_COMP_EN,
+   MLX5_TXQ_INLINE,
+   MLX5_TXQS_MIN_INLINE,
NULL,
};
struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8f5a6df..3a86609 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -113,6 +113,8 @@ struct priv {
unsigned int mps:1; /* Whether multi-packet send is supported. */
unsigned int cqe_comp:1; /* Whether CQE compression is enabled. */
unsigned int pending_alarm:1; /* An alarm is pending. */
+   unsigned int txq_inline; /* Maximum packet size for inlining. */
+   unsigned int txqs_inline; /* Queue number threshold for inlining. */
/* RX/TX queues. */
unsigned int rxqs_n; /* RX queues array size. */
unsigned int txqs_n; /* TX queues array size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 47e64b2..aeea4ff 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1318,6 +1318,11 @@ void
 priv_select_tx_function(struct priv *priv)
 {
priv->dev->tx_pkt_burst = mlx5_tx_burst;
+   if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
+   priv->dev->tx_pkt_burst = mlx5_tx_burst_inline;
+   DEBUG("selected inline TX function (%u >= %u queues)",
+ priv->txqs_n, priv->txqs_inline);
+   }
 }

 /**
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 9d992c3..daa22d9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -376,6 +376,139 @@ mlx5_wqe_write_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,
 }

 /**
+ * Write a inline WQE.
+ *
+ * @param txq
+ *   Pointer to TX queue structure.
+ * @param wqe
+ *   Pointer to the WQE to fill.
+ * @param addr
+ *   Buffer data address.
+ * @param length
+ *   Packet length.
+ * @param lkey
+ *   Memory region lkey.
+ */
+static inline void

[dpdk-dev] [PATCH v7 16/25] mlx5: replace countdown with threshold for Tx completions

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

Replacing the variable countdown (which depends on the number of
descriptors) with a fixed relative threshold known at compile time improves
performance by reducing the TX queue structure footprint and the amount of
code to manage completions during a burst.

Completions are now requested at most once per burst after threshold is
reached.
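
A simplified, self-contained sketch of the scheme (illustration only, not the patch; field names are made up): the burst path accumulates posted descriptors and flags a completion-generating WQE only once the threshold is crossed:

#include <stdint.h>

#define TX_COMP_THRESH 32       /* same role as MLX5_TX_COMP_THRESH */

struct tx_state {
        uint16_t since_last_comp;       /* descriptors posted since the last request */
};

/* Returns nonzero when the WQE closing this burst should be asked to
 * generate a completion (CQE). */
static int need_completion(struct tx_state *s, uint16_t posted)
{
        unsigned int comp = s->since_last_comp + posted;

        if (comp >= TX_COMP_THRESH) {
                s->since_last_comp = 0;
                return 1;
        }
        s->since_last_comp = comp;
        return 0;
}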

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_defs.h |  7 +--
 drivers/net/mlx5/mlx5_rxtx.c | 44 +---
 drivers/net/mlx5/mlx5_rxtx.h |  5 ++---
 drivers/net/mlx5/mlx5_txq.c  | 21 -
 4 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 8d2ec7a..cc2a6f3 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -48,8 +48,11 @@
 /* Maximum number of special flows. */
 #define MLX5_MAX_SPECIAL_FLOWS 4

-/* Request send completion once in every 64 sends, might be less. */
-#define MLX5_PMD_TX_PER_COMP_REQ 64
+/*
+ * Request TX completion every time descriptors reach this threshold since
+ * the previous request. Must be a power of two for performance reasons.
+ */
+#define MLX5_TX_COMP_THRESH 32

 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 43236f5..9d992c3 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -156,9 +156,6 @@ check_cqe64(volatile struct mlx5_cqe64 *cqe,
  * Manage TX completions.
  *
  * When sending a burst, mlx5_tx_burst() posts several WRs.
- * To improve performance, a completion event is only required once every
- * MLX5_PMD_TX_PER_COMP_REQ sends. Doing so discards completion information
- * for other WRs, but this information would not be used anyway.
  *
  * @param txq
  *   Pointer to TX queue structure.
@@ -172,14 +169,16 @@ txq_complete(struct txq *txq)
uint16_t elts_free = txq->elts_tail;
uint16_t elts_tail;
uint16_t cq_ci = txq->cq_ci;
-   unsigned int wqe_ci = (unsigned int)-1;
+   volatile struct mlx5_cqe64 *cqe = NULL;
+   volatile union mlx5_wqe *wqe;

do {
-   unsigned int idx = cq_ci & cqe_cnt;
-   volatile struct mlx5_cqe64 *cqe = &(*txq->cqes)[idx].cqe64;
+   volatile struct mlx5_cqe64 *tmp;

-   if (check_cqe64(cqe, cqe_n, cq_ci) == 1)
+   tmp = &(*txq->cqes)[cq_ci & cqe_cnt].cqe64;
+   if (check_cqe64(tmp, cqe_n, cq_ci))
break;
+   cqe = tmp;
 #ifndef NDEBUG
if (MLX5_CQE_FORMAT(cqe->op_own) == MLX5_COMPRESSED) {
if (!check_cqe64_seen(cqe))
@@ -193,14 +192,15 @@ txq_complete(struct txq *txq)
return;
}
 #endif /* NDEBUG */
-   wqe_ci = ntohs(cqe->wqe_counter);
++cq_ci;
} while (1);
-   if (unlikely(wqe_ci == (unsigned int)-1))
+   if (unlikely(cqe == NULL))
return;
+   wqe = &(*txq->wqes)[htons(cqe->wqe_counter) & (txq->wqe_n - 1)];
+   elts_tail = wqe->wqe.ctrl.data[3];
+   assert(elts_tail < txq->wqe_n);
/* Free buffers. */
-   elts_tail = (wqe_ci + 1) & (elts_n - 1);
-   do {
+   while (elts_free != elts_tail) {
struct rte_mbuf *elt = (*txq->elts)[elts_free];
unsigned int elts_free_next =
(elts_free + 1) & (elts_n - 1);
@@ -216,7 +216,7 @@ txq_complete(struct txq *txq)
/* Only one segment needs to be freed. */
rte_pktmbuf_free_seg(elt);
elts_free = elts_free_next;
-   } while (elts_free != elts_tail);
+   }
txq->cq_ci = cq_ci;
txq->elts_tail = elts_tail;
/* Update the consumer index. */
@@ -437,6 +437,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
const unsigned int elts_n = txq->elts_n;
unsigned int i;
unsigned int max;
+   unsigned int comp;
volatile union mlx5_wqe *wqe;
struct rte_mbuf *buf;

@@ -486,13 +487,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
buf->vlan_tci);
else
mlx5_wqe_write(txq, wqe, addr, length, lkey);
-   /* Request completion if needed. */
-   if (unlikely(--txq->elts_comp == 0)) {
-   wqe->wqe.ctrl.data[2] = htonl(8);
-   txq->elts_comp = txq->elts_comp_cd_init;
-   } else {
-   wqe->wqe.ctrl.data[2] = 0;
-   }
+   wqe->wqe.ctrl.data[2] = 0;
/* Should we enable HW CKSUM offload */
if (buf->ol_flags &
  

[dpdk-dev] [PATCH v7 15/25] mlx5: handle Rx CQE compression

2016-06-24 Thread Nelio Laranjeiro
Mini (compressed) CQEs are returned by the NIC when PCI back pressure is
detected, in which case the first CQE64 contains common packet information
followed by a number of CQE8 providing the rest, followed by a matching
number of empty CQE64 entries to be used by software for decompression.

Before decompression:

  0   1  2   6 7 8
  +---+  +-+ +---+   +---+ +---+ +---+
  | CQE64 |  |  CQE64  | | CQE64 |   | CQE64 | | CQE64 | | CQE64 |
  |---|  |-| |---|   |---| |---| |---|
  | . |  | cqe8[0] | |   | . |   | |   | | . |
  | . |  | cqe8[1] | |   | . |   | |   | | . |
  | . |  | ... | |   | . |   | |   | | . |
  | . |  | cqe8[7] | |   |   |   | |   | | . |
  +---+  +-+ +---+   +---+ +---+ +---+

After decompression:

  0  1 ... 8
  +---+  +---+ +---+
  | CQE64 |  | CQE64 | | CQE64 |
  |---|  |---| |---|
  | . |  | . |  .  | . |
  | . |  | . |  .  | . |
  | . |  | . |  .  | . |
  | . |  | . | | . |
  +---+  +---+ +---+

This patch does not perform the entire decompression step as it would be
really expensive, instead the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.

Intermediate empty CQE64 entries are handed back to HW without further
processing.
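
A hypothetical sketch of the per-queue state such an internal context needs to carry between bursts (field names are illustrative, not the driver's):

#include <stdint.h>

/* While a compressed session is open, the title CQE64 has already supplied
 * the common packet information; the remaining packets are produced by
 * reading mini CQE8 entries from the following CQE64 slots. */
struct cqe_decomp_ctx {
        uint32_t left;  /* mini CQEs still to consume in this session */
        uint16_t ai;    /* next CQE8 index inside the current CQE64 (0..7) */
        uint16_t ca;    /* ring index of the CQE64 currently holding CQE8s */
        uint16_t na;    /* ring index of the next block of CQE8s */
};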

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Olga Shern 
Signed-off-by: Vasily Philipov 
---
 doc/guides/nics/mlx5.rst |   6 +
 drivers/net/mlx5/mlx5.c  |  24 +++-
 drivers/net/mlx5/mlx5.h  |   1 +
 drivers/net/mlx5/mlx5_rxq.c  |   9 +-
 drivers/net/mlx5/mlx5_rxtx.c | 265 +--
 drivers/net/mlx5/mlx5_rxtx.h |  11 ++
 drivers/net/mlx5/mlx5_txq.c  |   5 +
 7 files changed, 253 insertions(+), 68 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 3a07928..756153b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -148,6 +148,12 @@ Run-time configuration

 - **ethtool** operations on related kernel interfaces also affect the PMD.

+- ``rxq_cqe_comp_en`` parameter [int]
+
+  A nonzero value enables the compression of CQE on RX side. This feature
+  allows to save PCI bandwidth and improve performance at the cost of a
+  slightly higher CPU usage.  Enabled by default.
+
 Prerequisites
 -

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d08d4ac..630e5e4 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -69,6 +69,9 @@
 #include "mlx5_autoconf.h"
 #include "mlx5_defs.h"

+/* Device parameter to enable RX completion queue compression. */
+#define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
+
 /**
  * Retrieve integer value from environment variable.
  *
@@ -256,12 +259,21 @@ static int
 mlx5_args_check(const char *key, const char *val, void *opaque)
 {
struct priv *priv = opaque;
+   unsigned long tmp;

-   /* No parameters are expected at the moment. */
-   (void)priv;
-   (void)val;
-   WARN("%s: unknown parameter", key);
-   return -EINVAL;
+   errno = 0;
+   tmp = strtoul(val, NULL, 0);
+   if (errno) {
+   WARN("%s: \"%s\" is not a valid integer", key, val);
+   return errno;
+   }
+   if (strcmp(MLX5_RXQ_CQE_COMP_EN, key) == 0) {
+   priv->cqe_comp = !!tmp;
+   } else {
+   WARN("%s: unknown parameter", key);
+   return -EINVAL;
+   }
+   return 0;
 }

 /**
@@ -279,6 +291,7 @@ static int
 mlx5_args(struct priv *priv, struct rte_devargs *devargs)
 {
const char **params = (const char *[]){
+   MLX5_RXQ_CQE_COMP_EN,
NULL,
};
struct rte_kvargs *kvlist;
@@ -475,6 +488,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
priv->port = port;
priv->pd = pd;
priv->mtu = ETHER_MTU;
+   priv->cqe_comp = 1; /* Enable compression by default. */
err = mlx5_args(priv, pci_dev->devargs);
if (err) {
ERROR("failed to process device arguments: %s",
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3dca03d..8f5a6df 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -111,6 +111,7 @@ struct priv {
unsigned int hw_padding:1; /* End alignment padding is supported. */
unsigned int sriov:1; /* This is a VF or PF with VF devices. */
unsigned int mps:1; /* Whether multi-packet send is supported. */
+   unsigned int cqe_comp:1; /* Whether CQE compression is enabled. */
unsigned int pending_alarm:1; /* An alarm is pending. */

[dpdk-dev] [PATCH v7 14/25] mlx5: refactor Tx data path

2016-06-24 Thread Nelio Laranjeiro
Bypass Verbs to improve Tx performance.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Yaacov Hazan 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/Makefile  |   5 -
 drivers/net/mlx5/mlx5_ethdev.c |  10 +-
 drivers/net/mlx5/mlx5_mr.c |   4 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 361 ++---
 drivers/net/mlx5/mlx5_rxtx.h   |  52 +++---
 drivers/net/mlx5/mlx5_txq.c| 219 +
 6 files changed, 347 insertions(+), 304 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index dc99797..66687e8 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -106,11 +106,6 @@ mlx5_autoconf.h.new: FORCE
 mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
$Q $(RM) -f -- '$@'
$Q sh -- '$<' '$@' \
-   HAVE_VERBS_VLAN_INSERTION \
-   infiniband/verbs.h \
-   enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
HAVE_VERBS_IBV_EXP_CQ_COMPRESSED_CQE \
infiniband/verbs_exp.h \
enum IBV_EXP_CQ_COMPRESSED_CQE \
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 16b05d3..47e64b2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1242,11 +1242,11 @@ mlx5_secondary_data_setup(struct priv *priv)
txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
 primary_txq_ctrl->socket);
if (txq_ctrl != NULL) {
-   if (txq_setup(priv->dev,
- primary_txq_ctrl,
- primary_txq->elts_n,
- primary_txq_ctrl->socket,
- NULL) == 0) {
+   if (txq_ctrl_setup(priv->dev,
+  primary_txq_ctrl,
+  primary_txq->elts_n,
+  primary_txq_ctrl->socket,
+  NULL) == 0) {
txq_ctrl->txq.stats.idx =
primary_txq->stats.idx;
tx_queues[i] = &txq_ctrl->txq;
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 1d8bf72..67dfefa 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -190,7 +190,7 @@ txq_mp2mr_reg(struct txq *txq, struct rte_mempool *mp, 
unsigned int idx)
/* Add a new entry, register MR first. */
DEBUG("%p: discovered new memory pool \"%s\" (%p)",
  (void *)txq_ctrl, mp->name, (void *)mp);
-   mr = mlx5_mp2mr(txq_ctrl->txq.priv->pd, mp);
+   mr = mlx5_mp2mr(txq_ctrl->priv->pd, mp);
if (unlikely(mr == NULL)) {
DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
  (void *)txq_ctrl);
@@ -209,7 +209,7 @@ txq_mp2mr_reg(struct txq *txq, struct rte_mempool *mp, 
unsigned int idx)
/* Store the new entry. */
txq_ctrl->txq.mp2mr[idx].mp = mp;
txq_ctrl->txq.mp2mr[idx].mr = mr;
-   txq_ctrl->txq.mp2mr[idx].lkey = mr->lkey;
+   txq_ctrl->txq.mp2mr[idx].lkey = htonl(mr->lkey);
DEBUG("%p: new MR lkey for MP \"%s\" (%p): 0x%08" PRIu32,
  (void *)txq_ctrl, mp->name, (void *)mp,
  txq_ctrl->txq.mp2mr[idx].lkey);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f2d00bf..2372fce 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -119,68 +119,52 @@ get_cqe64(volatile struct mlx5_cqe cqes[],
  *
  * @param txq
  *   Pointer to TX queue structure.
- *
- * @return
- *   0 on success, -1 on failure.
  */
-static int
+static void
 txq_complete(struct txq *txq)
 {
-   unsigned int elts_comp = txq->elts_comp;
-   unsigned int elts_tail = txq->elts_tail;
-   unsigned int elts_free = txq->elts_tail;
const unsigned int elts_n = txq->elts_n;
-   int wcs_n;
-
-   if (unlikely(elts_comp == 0))
-   return 0;
-#ifdef DEBUG_SEND
-   DEBUG("%p: processing %u work requests completions",
- (void *)txq, elts_comp);
-#endif
-   wcs_n = txq->poll_cnt(txq->cq, elts_comp);
-   if (unlikely(wcs_n == 0))
-   return 0;
-   if (unlikely(wcs_n < 0)) {
-   DEBUG("%p: ibv_poll_cq() failed (wcs_n=%d)",
- (void *)txq, wcs_n);
-   return -1;
+   const unsigned int cqe_n = txq->cqe_n;
+   uint16_t elts_free = txq->elts_tail;
+   uint16_t elts_tail;
+   uint16_t cq_ci = txq->cq_ci;
+   unsigned int wqe_ci = (unsigned int)-1;
+   int ret = 0;
+
+   while (ret == 0) {
+   volatile struct mlx5_cqe64 *cqe;
+
+   cqe = get_cqe64(*txq->cqes, cqe_n, 

[dpdk-dev] [PATCH v7 13/25] mlx5: refactor Rx data path

2016-06-24 Thread Nelio Laranjeiro
Bypass Verbs to improve RX performance.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Yaacov Hazan 
Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_ethdev.c |   4 +-
 drivers/net/mlx5/mlx5_fdir.c   |   2 +-
 drivers/net/mlx5/mlx5_rxq.c| 303 -
 drivers/net/mlx5/mlx5_rxtx.c   | 289 ---
 drivers/net/mlx5/mlx5_rxtx.h   |  38 +++---
 drivers/net/mlx5/mlx5_vlan.c   |   3 +-
 6 files changed, 326 insertions(+), 313 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 759434e..16b05d3 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1263,7 +1263,9 @@ mlx5_secondary_data_setup(struct priv *priv)
}
/* RX queues. */
for (i = 0; i != nb_rx_queues; ++i) {
-   struct rxq *primary_rxq = (*sd->primary_priv->rxqs)[i];
+   struct rxq_ctrl *primary_rxq =
+   container_of((*sd->primary_priv->rxqs)[i],
+struct rxq_ctrl, rxq);

if (primary_rxq == NULL)
continue;
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 1850218..73eb00e 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -431,7 +431,7 @@ priv_get_fdir_queue(struct priv *priv, uint16_t idx)
ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
.pd = priv->pd,
.log_ind_tbl_size = 0,
-   .ind_tbl = &((*priv->rxqs)[idx]->wq),
+   .ind_tbl = &rxq_ctrl->wq,
.comp_mask = 0,
};

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index b474a18..b1d6cfe 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -43,6 +43,8 @@
 #pragma GCC diagnostic ignored "-pedantic"
 #endif
 #include 
+#include 
+#include 
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -373,8 +375,13 @@ priv_create_hash_rxqs(struct priv *priv)
DEBUG("indirection table extended to assume %u WQs",
  priv->reta_idx_n);
}
-   for (i = 0; (i != priv->reta_idx_n); ++i)
-   wqs[i] = (*priv->rxqs)[(*priv->reta_idx)[i]]->wq;
+   for (i = 0; (i != priv->reta_idx_n); ++i) {
+   struct rxq_ctrl *rxq_ctrl;
+
+   rxq_ctrl = container_of((*priv->rxqs)[(*priv->reta_idx)[i]],
+   struct rxq_ctrl, rxq);
+   wqs[i] = rxq_ctrl->wq;
+   }
/* Get number of hash RX queues to configure. */
for (i = 0, hash_rxqs_n = 0; (i != ind_tables_n); ++i)
hash_rxqs_n += ind_table_init[i].hash_types_n;
@@ -638,21 +645,13 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int 
elts_n,
   struct rte_mbuf **pool)
 {
unsigned int i;
-   struct rxq_elt (*elts)[elts_n] =
-   rte_calloc_socket("RXQ elements", 1, sizeof(*elts), 0,
- rxq_ctrl->socket);
int ret = 0;

-   if (elts == NULL) {
-   ERROR("%p: can't allocate packets array", (void *)rxq_ctrl);
-   ret = ENOMEM;
-   goto error;
-   }
/* For each WR (packet). */
for (i = 0; (i != elts_n); ++i) {
-   struct rxq_elt *elt = &(*elts)[i];
-   struct ibv_sge *sge = &(*elts)[i].sge;
struct rte_mbuf *buf;
+   volatile struct mlx5_wqe_data_seg *scat =
+   &(*rxq_ctrl->rxq.wqes)[i];

if (pool != NULL) {
buf = *(pool++);
@@ -666,40 +665,36 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int 
elts_n,
ret = ENOMEM;
goto error;
}
-   elt->buf = buf;
/* Headroom is reserved by rte_pktmbuf_alloc(). */
assert(DATA_OFF(buf) == RTE_PKTMBUF_HEADROOM);
/* Buffer is supposed to be empty. */
assert(rte_pktmbuf_data_len(buf) == 0);
assert(rte_pktmbuf_pkt_len(buf) == 0);
-   /* sge->addr must be able to store a pointer. */
-   assert(sizeof(sge->addr) >= sizeof(uintptr_t));
-   /* SGE keeps its headroom. */
-   sge->addr = (uintptr_t)
-   ((uint8_t *)buf->buf_addr + RTE_PKTMBUF_HEADROOM);
-   sge->length = (buf->buf_len - RTE_PKTMBUF_HEADROOM);
-   sge->lkey = rxq_ctrl->mr->lkey;
-   /* Redundant check for tailroom. */
-   assert(sge->length == rte_pktmbuf_tailroom(buf));
+   assert(!buf->next);
+   PORT(buf) = rxq_ctrl->rxq.port_id;
+   DATA_LEN(buf) = rte_pktmbuf_tailroom(buf);
+   PKT_LEN(buf) = DATA_LEN(buf);
+   

[dpdk-dev] [PATCH v7 12/25] mlx5: add Tx/Rx burst function selection wrapper

2016-06-24 Thread Nelio Laranjeiro
These wrappers are meant to prevent code duplication later.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.h|  2 ++
 drivers/net/mlx5/mlx5_ethdev.c | 34 --
 drivers/net/mlx5/mlx5_txq.c|  2 +-
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 935e1b0..3dca03d 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -196,6 +196,8 @@ void priv_dev_interrupt_handler_install(struct priv *, 
struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 struct priv *mlx5_secondary_data_setup(struct priv *priv);
+void priv_select_tx_function(struct priv *);
+void priv_select_rx_function(struct priv *);

 /* mlx5_mac.c */

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 4095a06..759434e 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1099,8 +1099,8 @@ priv_set_link(struct priv *priv, int up)
err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
if (err)
return err;
-   dev->rx_pkt_burst = mlx5_rx_burst;
-   dev->tx_pkt_burst = mlx5_tx_burst;
+   priv_select_tx_function(priv);
+   priv_select_rx_function(priv);
} else {
err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
if (err)
@@ -1290,13 +1290,11 @@ mlx5_secondary_data_setup(struct priv *priv)
rte_mb();
priv->dev->data = &sd->data;
rte_mb();
-   priv->dev->tx_pkt_burst = mlx5_tx_burst;
-   priv->dev->rx_pkt_burst = removed_rx_burst;
+   priv_select_tx_function(priv);
+   priv_select_rx_function(priv);
priv_unlock(priv);
 end:
/* More sanity checks. */
-   assert(priv->dev->tx_pkt_burst == mlx5_tx_burst);
-   assert(priv->dev->rx_pkt_burst == removed_rx_burst);
assert(priv->dev->data == &sd->data);
rte_spinlock_unlock(&sd->lock);
return priv;
@@ -1307,3 +1305,27 @@ error:
rte_spinlock_unlock(&sd->lock);
return NULL;
 }
+
+/**
+ * Configure the TX function to use.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ */
+void
+priv_select_tx_function(struct priv *priv)
+{
+   priv->dev->tx_pkt_burst = mlx5_tx_burst;
+}
+
+/**
+ * Configure the RX function to use.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ */
+void
+priv_select_rx_function(struct priv *priv)
+{
+   priv->dev->rx_pkt_burst = mlx5_rx_burst;
+}
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 51210f2..ec4488a 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -478,7 +478,7 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, 
uint16_t desc,
  (void *)dev, (void *)txq_ctrl);
(*priv->txqs)[idx] = &txq_ctrl->txq;
/* Update send callback. */
-   dev->tx_pkt_burst = mlx5_tx_burst;
+   priv_select_tx_function(priv);
}
priv_unlock(priv);
return -ret;
-- 
2.1.4



[dpdk-dev] [PATCH v7 11/25] mlx5: add support for configuration through kvargs

2016-06-24 Thread Nelio Laranjeiro
The intent is to replace the remaining compile-time options and environment
variables with a common means of runtime configuration. This commit only
adds the kvargs handling code; subsequent commits will update the rest.
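
For illustration only, below is a minimal, self-contained sketch of how a
future device parameter could flow through this mechanism. The "txq_inline"
key, the example_cfg structure and the standalone main() are invented for the
example and are not part of this patch; only the rte_kvargs calls mirror the
mlx5_args()/mlx5_args_check() flow added here.

/* Hypothetical usage sketch of the rte_kvargs flow added in this patch. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <rte_kvargs.h>

struct example_cfg {
        unsigned int txq_inline;
};

static int
example_check(const char *key, const char *val, void *opaque)
{
        struct example_cfg *cfg = opaque;

        if (strcmp(key, "txq_inline") == 0) {
                cfg->txq_inline = strtoul(val, NULL, 0);
                return 0;
        }
        /* Unknown keys are rejected, as in mlx5_args_check(). */
        return -EINVAL;
}

int
main(void)
{
        const char *params[] = { "txq_inline", NULL };
        struct example_cfg cfg = { .txq_inline = 0 };
        struct rte_kvargs *kvlist;
        int i;

        /* "txq_inline=128" is what would follow the PCI address in devargs. */
        kvlist = rte_kvargs_parse("txq_inline=128", params);
        if (kvlist == NULL)
                return 1;
        for (i = 0; params[i] != NULL; ++i)
                if (rte_kvargs_count(kvlist, params[i]))
                        rte_kvargs_process(kvlist, params[i],
                                           example_check, &cfg);
        rte_kvargs_free(kvlist);
        printf("txq_inline=%u\n", cfg.txq_inline);
        return 0;
}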

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.c | 73 +
 1 file changed, 73 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 3f45d84..d08d4ac 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 /* Verbs header. */
@@ -57,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -237,6 +239,71 @@ mlx5_dev_idx(struct rte_pci_addr *pci_addr)
return ret;
 }

+/**
+ * Verify and store value for device argument.
+ *
+ * @param[in] key
+ *   Key argument to verify.
+ * @param[in] val
+ *   Value associated with key.
+ * @param opaque
+ *   User data.
+ *
+ * @return
+ *   0 on success, negative errno value on failure.
+ */
+static int
+mlx5_args_check(const char *key, const char *val, void *opaque)
+{
+   struct priv *priv = opaque;
+
+   /* No parameters are expected at the moment. */
+   (void)priv;
+   (void)val;
+   WARN("%s: unknown parameter", key);
+   return -EINVAL;
+}
+
+/**
+ * Parse device parameters.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param devargs
+ *   Device arguments structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+mlx5_args(struct priv *priv, struct rte_devargs *devargs)
+{
+   const char **params = (const char *[]){
+   NULL,
+   };
+   struct rte_kvargs *kvlist;
+   int ret = 0;
+   int i;
+
+   if (devargs == NULL)
+   return 0;
+   /* Following UGLY cast is done to pass checkpatch. */
+   kvlist = rte_kvargs_parse(devargs->args, params);
+   if (kvlist == NULL)
+   return 0;
+   /* Process parameters. */
+   for (i = 0; (params[i] != NULL); ++i) {
+   if (rte_kvargs_count(kvlist, params[i])) {
+   ret = rte_kvargs_process(kvlist, params[i],
+mlx5_args_check, priv);
+   if (ret != 0)
+   return ret;
+   }
+   }
+   rte_kvargs_free(kvlist);
+   return 0;
+}
+
 static struct eth_driver mlx5_driver;

 /**
@@ -408,6 +475,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
priv->port = port;
priv->pd = pd;
priv->mtu = ETHER_MTU;
+   err = mlx5_args(priv, pci_dev->devargs);
+   if (err) {
+   ERROR("failed to process device arguments: %s",
+ strerror(err));
+   goto port_error;
+   }
if (ibv_exp_query_device(ctx, _device_attr)) {
ERROR("ibv_exp_query_device() failed");
goto port_error;
-- 
2.1.4



[dpdk-dev] [PATCH v7 10/25] mlx5: add definitions for data path without Verbs

2016-06-24 Thread Nelio Laranjeiro
These structures and macros extend those exposed by libmlx5 (in mlx5_hw.h)
to let the PMD manage work queue and completion queue elements directly.
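
As a small aside (not part of the patch), the op_own byte of a 64B CQE packs
the fields these macros extract. The sketch below re-derives only the opcode
and solicited-event bits because their masks are spelled out literally above;
the owner and format macros additionally rely on masks from libmlx5.

/* Self-contained sketch using the same bit layout as MLX5_CQE_OPCODE()
 * and MLX5_CQE_SE() above; EXAMPLE_* names avoid clashing with the PMD. */
#include <stdint.h>
#include <stdio.h>

#define EXAMPLE_CQE_OPCODE(op_own) (((op_own) & 0xf0) >> 4)
#define EXAMPLE_CQE_SE(op_own) (((op_own) >> 1) & 1)

int
main(void)
{
        uint8_t op_own = 0x23; /* made-up value, as if read from a CQE */

        printf("opcode=%u solicited=%u\n",
               EXAMPLE_CQE_OPCODE(op_own), EXAMPLE_CQE_SE(op_own));
        return 0;
}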

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_prm.h | 163 
 1 file changed, 163 insertions(+)
 create mode 100644 drivers/net/mlx5/mlx5_prm.h

diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
new file mode 100644
index 000..5db219b
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -0,0 +1,163 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_PMD_MLX5_PRM_H_
+#define RTE_PMD_MLX5_PRM_H_
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include 
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* Get CQE owner bit. */
+#define MLX5_CQE_OWNER(op_own) ((op_own) & MLX5_CQE_OWNER_MASK)
+
+/* Get CQE format. */
+#define MLX5_CQE_FORMAT(op_own) (((op_own) & MLX5E_CQE_FORMAT_MASK) >> 2)
+
+/* Get CQE opcode. */
+#define MLX5_CQE_OPCODE(op_own) (((op_own) & 0xf0) >> 4)
+
+/* Get CQE solicited event. */
+#define MLX5_CQE_SE(op_own) (((op_own) >> 1) & 1)
+
+/* Invalidate a CQE. */
+#define MLX5_CQE_INVALIDATE (MLX5_CQE_INVALID << 4)
+
+/* CQE value to inform that VLAN is stripped. */
+#define MLX5_CQE_VLAN_STRIPPED 0x1
+
+/* Maximum number of packets a multi-packet WQE can handle. */
+#define MLX5_MPW_DSEG_MAX 5
+
+/* Room for inline data in regular work queue element. */
+#define MLX5_WQE64_INL_DATA 12
+
+/* Room for inline data in multi-packet WQE. */
+#define MLX5_MWQE64_INL_DATA 28
+
+/* Subset of struct mlx5_wqe_eth_seg. */
+struct mlx5_wqe_eth_seg_small {
+   uint32_t rsvd0;
+   uint8_t cs_flags;
+   uint8_t rsvd1;
+   uint16_t mss;
+   uint32_t rsvd2;
+   uint16_t inline_hdr_sz;
+};
+
+/* Regular WQE. */
+struct mlx5_wqe_regular {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg eseg;
+   struct mlx5_wqe_data_seg dseg;
+} __rte_aligned(64);
+
+/* Inline WQE. */
+struct mlx5_wqe_inl {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg eseg;
+   uint32_t byte_cnt;
+   uint8_t data[MLX5_WQE64_INL_DATA];
+} __rte_aligned(64);
+
+/* Multi-packet WQE. */
+struct mlx5_wqe_mpw {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg_small eseg;
+   struct mlx5_wqe_data_seg dseg[2];
+} __rte_aligned(64);
+
+/* Multi-packet WQE with inline. */
+struct mlx5_wqe_mpw_inl {
+   union {
+   struct mlx5_wqe_ctrl_seg ctrl;
+   uint32_t data[4];
+   } ctrl;
+   struct mlx5_wqe_eth_seg_small eseg;
+   uint32_t byte_cnt;
+   uint8_t data[MLX5_MWQE64_INL_DATA];
+} __rte_aligned(64);
+
+/* Union of all WQE types. */
+union mlx5_wqe {
+   struct mlx5_wqe_regular wqe;
+   struct mlx5_wqe_inl inl;
+   struct mlx5_wqe_mpw mpw;
+   struct mlx5_wqe_mpw_inl mpw_inl;
+   uint8_t data[64];
+};
+
+/* MPW session status. */
+enum mlx5_mpw_state {

[dpdk-dev] [PATCH v7 09/25] mlx5: update prerequisites for upcoming enhancements

2016-06-24 Thread Nelio Laranjeiro
The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 doc/guides/nics/mlx5.rst   | 44 +++---
 drivers/net/mlx5/Makefile  | 39 -
 drivers/net/mlx5/mlx5.c| 23 --
 drivers/net/mlx5/mlx5.h|  5 +
 drivers/net/mlx5/mlx5_defs.h   |  9 -
 drivers/net/mlx5/mlx5_fdir.c   | 10 --
 drivers/net/mlx5/mlx5_rxmode.c |  8 
 drivers/net/mlx5/mlx5_rxq.c| 30 
 drivers/net/mlx5/mlx5_rxtx.c   |  4 
 drivers/net/mlx5/mlx5_rxtx.h   |  8 
 drivers/net/mlx5/mlx5_txq.c|  2 --
 drivers/net/mlx5/mlx5_vlan.c   |  3 ---
 12 files changed, 16 insertions(+), 169 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 77fa957..3a07928 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -125,16 +125,6 @@ These options can be modified in the ``.config`` file.
 Environment variables
 ~

-- ``MLX5_ENABLE_CQE_COMPRESSION``
-
-  A nonzero value lets ConnectX-4 return smaller completion entries to
-  improve performance when PCI backpressure is detected. It is most useful
-  for scenarios involving heavy traffic on many queues.
-
-  Since the additional software logic necessary to handle this mode can
-  lower performance when there is no backpressure, it is not enabled by
-  default.
-
 - ``MLX5_PMD_ENABLE_PADDING``

   Enables HW packet padding in PCI bus transactions.
@@ -211,40 +201,12 @@ DPDK and must be installed separately:

 Currently supported by DPDK:

-- Mellanox OFED **3.1-1.0.3**, **3.1-1.5.7.1** or **3.2-2.0.0.0** depending
-  on usage.
-
-The following features are supported with version **3.1-1.5.7.1** and
-above only:
-
-- IPv6, UPDv6, TCPv6 RSS.
-- RX checksum offloads.
-- IBM POWER8.
-
-The following features are supported with version **3.2-2.0.0.0** and
-above only:
-
-- Flow director.
-- RX VLAN stripping.
-- TX VLAN insertion.
-- RX CRC stripping configuration.
+- Mellanox OFED **3.3-1.0.0.0**.

 - Minimum firmware version:

-  With MLNX_OFED **3.1-1.0.3**:
-
-  - ConnectX-4: **12.12.1240**
-  - ConnectX-4 Lx: **14.12.1100**
-
-  With MLNX_OFED **3.1-1.5.7.1**:
-
-  - ConnectX-4: **12.13.0144**
-  - ConnectX-4 Lx: **14.13.0144**
-
-  With MLNX_OFED **3.2-2.0.0.0**:
-
-  - ConnectX-4: **12.14.2036**
-  - ConnectX-4 Lx: **14.14.2036**
+  - ConnectX-4: **12.16.1006**
+  - ConnectX-4 Lx: **14.16.1006**

 Getting Mellanox OFED
 ~
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 289c85e..dc99797 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -106,42 +106,19 @@ mlx5_autoconf.h.new: FORCE
 mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
$Q $(RM) -f -- '$@'
$Q sh -- '$<' '$@' \
-   HAVE_EXP_QUERY_DEVICE \
-   infiniband/verbs.h \
-   type 'struct ibv_exp_device_attr' $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_FLOW_SPEC_IPV6 \
-   infiniband/verbs.h \
-   type 'struct ibv_exp_flow_spec_ipv6' $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
-   infiniband/verbs.h \
-   enum IBV_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
-   infiniband/verbs.h \
-   enum IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_EXP_CQ_RX_TCP_PACKET \
+   HAVE_VERBS_VLAN_INSERTION \
infiniband/verbs.h \
-   enum IBV_EXP_CQ_RX_TCP_PACKET \
+   enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
$(AUTOCONF_OUTPUT)
$Q sh -- '$<' '$@' \
-   HAVE_VERBS_FCS \
-   infiniband/verbs.h \
-   enum IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS \
+   HAVE_VERBS_IBV_EXP_CQ_COMPRESSED_CQE \
+   infiniband/verbs_exp.h \
+   enum IBV_EXP_CQ_COMPRESSED_CQE \
$(AUTOCONF_OUTPUT)
$Q sh -- '$<' '$@' \
-   HAVE_VERBS_RX_END_PADDING \
-   infiniband/verbs.h \
-   enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
-   $(AUTOCONF_OUTPUT)
-   $Q sh -- '$<' '$@' \
-   HAVE_VERBS_VLAN_INSERTION \
-   infiniband/verbs.h \
-   enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
+   

[dpdk-dev] [PATCH v7 08/25] mlx5: split Rx queue structure

2016-06-24 Thread Nelio Laranjeiro
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure rxq_ctrl.
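
For readers unfamiliar with the pattern, here is a minimal sketch (with
made-up fields) of how embedding the hot rxq structure inside rxq_ctrl lets
control-path code recover the outer structure from the pointer stored in
(*priv->rxqs)[i] with container_of(), exactly as the hunks below do. DPDK
provides container_of() in rte_common.h; it is redefined here only to keep
the sketch standalone.

/* Control/data split sketch: rxq stays small for the data path, rxq_ctrl
 * carries everything else and embeds rxq so it can be recovered. */
#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

struct rxq {            /* data path: kept small and cache friendly */
        unsigned int elts_n;
};

struct rxq_ctrl {       /* control path */
        int socket;
        struct rxq rxq; /* must stay embedded for container_of() */
};

int
main(void)
{
        struct rxq_ctrl ctrl = { .socket = 0, .rxq = { .elts_n = 256 } };
        struct rxq *rxq = &ctrl.rxq; /* what (*priv->rxqs)[i] would store */
        struct rxq_ctrl *back = container_of(rxq, struct rxq_ctrl, rxq);

        printf("socket=%d elts_n=%u same=%d\n",
               back->socket, rxq->elts_n, back == &ctrl);
        return 0;
}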

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.c  |   6 +-
 drivers/net/mlx5/mlx5_fdir.c |   8 +-
 drivers/net/mlx5/mlx5_rxq.c  | 252 ++-
 drivers/net/mlx5/mlx5_rxtx.c |   1 -
 drivers/net/mlx5/mlx5_rxtx.h |  13 ++-
 5 files changed, 150 insertions(+), 130 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 3d30e00..27a7a30 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -122,12 +122,14 @@ mlx5_dev_close(struct rte_eth_dev *dev)
usleep(1000);
for (i = 0; (i != priv->rxqs_n); ++i) {
struct rxq *rxq = (*priv->rxqs)[i];
+   struct rxq_ctrl *rxq_ctrl;

if (rxq == NULL)
continue;
+   rxq_ctrl = container_of(rxq, struct rxq_ctrl, rxq);
(*priv->rxqs)[i] = NULL;
-   rxq_cleanup(rxq);
-   rte_free(rxq);
+   rxq_cleanup(rxq_ctrl);
+   rte_free(rxq_ctrl);
}
priv->rxqs_n = 0;
priv->rxqs = NULL;
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 63e43ad..e3b97ba 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -424,7 +424,9 @@ create_flow:
 static struct fdir_queue *
 priv_get_fdir_queue(struct priv *priv, uint16_t idx)
 {
-   struct fdir_queue *fdir_queue = &(*priv->rxqs)[idx]->fdir_queue;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of((*priv->rxqs)[idx], struct rxq_ctrl, rxq);
struct fdir_queue *fdir_queue = &rxq_ctrl->fdir_queue;
struct ibv_exp_rwq_ind_table *ind_table = NULL;
struct ibv_qp *qp = NULL;
struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
@@ -629,8 +631,10 @@ priv_fdir_disable(struct priv *priv)
/* Run on every RX queue to destroy related flow director QP and
 * indirection table. */
for (i = 0; (i != priv->rxqs_n); i++) {
-   fdir_queue = &(*priv->rxqs)[i]->fdir_queue;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of((*priv->rxqs)[i], struct rxq_ctrl, rxq);

+   fdir_queue = &rxq_ctrl->fdir_queue;
if (fdir_queue->qp != NULL) {
claim_zero(ibv_destroy_qp(fdir_queue->qp));
fdir_queue->qp = NULL;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 4000624..05b7c91 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -636,7 +636,7 @@ priv_rehash_flows(struct priv *priv)
 /**
  * Allocate RX queue elements.
  *
- * @param rxq
+ * @param rxq_ctrl
  *   Pointer to RX queue structure.
  * @param elts_n
  *   Number of elements to allocate.
@@ -648,16 +648,17 @@ priv_rehash_flows(struct priv *priv)
  *   0 on success, errno value on failure.
  */
 static int
-rxq_alloc_elts(struct rxq *rxq, unsigned int elts_n, struct rte_mbuf **pool)
+rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
+  struct rte_mbuf **pool)
 {
unsigned int i;
struct rxq_elt (*elts)[elts_n] =
rte_calloc_socket("RXQ elements", 1, sizeof(*elts), 0,
- rxq->socket);
+ rxq_ctrl->socket);
int ret = 0;

if (elts == NULL) {
-   ERROR("%p: can't allocate packets array", (void *)rxq);
+   ERROR("%p: can't allocate packets array", (void *)rxq_ctrl);
ret = ENOMEM;
goto error;
}
@@ -672,10 +673,10 @@ rxq_alloc_elts(struct rxq *rxq, unsigned int elts_n, 
struct rte_mbuf **pool)
assert(buf != NULL);
rte_pktmbuf_reset(buf);
} else
-   buf = rte_pktmbuf_alloc(rxq->mp);
+   buf = rte_pktmbuf_alloc(rxq_ctrl->rxq.mp);
if (buf == NULL) {
assert(pool == NULL);
-   ERROR("%p: empty mbuf pool", (void *)rxq);
+   ERROR("%p: empty mbuf pool", (void *)rxq_ctrl);
ret = ENOMEM;
goto error;
}
@@ -691,15 +692,15 @@ rxq_alloc_elts(struct rxq *rxq, unsigned int elts_n, 
struct rte_mbuf **pool)
sge->addr = (uintptr_t)
((uint8_t *)buf->buf_addr + RTE_PKTMBUF_HEADROOM);
sge->length = (buf->buf_len - RTE_PKTMBUF_HEADROOM);
-   sge->lkey = rxq->mr->lkey;
+   sge->lkey = rxq_ctrl->mr->lkey;
/* Redundant check for tailroom. */
assert(sge->length == rte_pktmbuf_tailroom(buf));

[dpdk-dev] [PATCH v7 07/25] mlx5: split Tx queue structure

2016-06-24 Thread Nelio Laranjeiro
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure txq_ctrl.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5.c|  21 +++--
 drivers/net/mlx5/mlx5_ethdev.c |  28 +++---
 drivers/net/mlx5/mlx5_mr.c |  39 
 drivers/net/mlx5/mlx5_rxtx.h   |   9 +-
 drivers/net/mlx5/mlx5_txq.c| 200 +
 5 files changed, 160 insertions(+), 137 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 350028b..3d30e00 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -98,7 +98,6 @@ static void
 mlx5_dev_close(struct rte_eth_dev *dev)
 {
struct priv *priv = mlx5_get_priv(dev);
-   void *tmp;
unsigned int i;

priv_lock(priv);
@@ -122,12 +121,13 @@ mlx5_dev_close(struct rte_eth_dev *dev)
/* XXX race condition if mlx5_rx_burst() is still running. */
usleep(1000);
for (i = 0; (i != priv->rxqs_n); ++i) {
-   tmp = (*priv->rxqs)[i];
-   if (tmp == NULL)
+   struct rxq *rxq = (*priv->rxqs)[i];
+
+   if (rxq == NULL)
continue;
(*priv->rxqs)[i] = NULL;
-   rxq_cleanup(tmp);
-   rte_free(tmp);
+   rxq_cleanup(rxq);
+   rte_free(rxq);
}
priv->rxqs_n = 0;
priv->rxqs = NULL;
@@ -136,12 +136,15 @@ mlx5_dev_close(struct rte_eth_dev *dev)
/* XXX race condition if mlx5_tx_burst() is still running. */
usleep(1000);
for (i = 0; (i != priv->txqs_n); ++i) {
-   tmp = (*priv->txqs)[i];
-   if (tmp == NULL)
+   struct txq *txq = (*priv->txqs)[i];
+   struct txq_ctrl *txq_ctrl;
+
+   if (txq == NULL)
continue;
+   txq_ctrl = container_of(txq, struct txq_ctrl, txq);
(*priv->txqs)[i] = NULL;
-   txq_cleanup(tmp);
-   rte_free(tmp);
+   txq_cleanup(txq_ctrl);
+   rte_free(txq_ctrl);
}
priv->txqs_n = 0;
priv->txqs = NULL;
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index ca57021..4095a06 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1232,28 +1232,32 @@ mlx5_secondary_data_setup(struct priv *priv)
/* TX queues. */
for (i = 0; i != nb_tx_queues; ++i) {
struct txq *primary_txq = (*sd->primary_priv->txqs)[i];
-   struct txq *txq;
+   struct txq_ctrl *primary_txq_ctrl;
+   struct txq_ctrl *txq_ctrl;

if (primary_txq == NULL)
continue;
-   txq = rte_calloc_socket("TXQ", 1, sizeof(*txq), 0,
-   primary_txq->socket);
-   if (txq != NULL) {
+   primary_txq_ctrl = container_of(primary_txq,
+   struct txq_ctrl, txq);
+   txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
+primary_txq_ctrl->socket);
+   if (txq_ctrl != NULL) {
if (txq_setup(priv->dev,
- txq,
+ primary_txq_ctrl,
  primary_txq->elts_n,
- primary_txq->socket,
+ primary_txq_ctrl->socket,
  NULL) == 0) {
-   txq->stats.idx = primary_txq->stats.idx;
-   tx_queues[i] = txq;
+   txq_ctrl->txq.stats.idx =
+   primary_txq->stats.idx;
+   tx_queues[i] = &txq_ctrl->txq;
continue;
}
-   rte_free(txq);
+   rte_free(txq_ctrl);
}
while (i) {
-   txq = tx_queues[--i];
-   txq_cleanup(txq);
-   rte_free(txq);
+   txq_ctrl = tx_queues[--i];
+   txq_cleanup(txq_ctrl);
+   rte_free(txq_ctrl);
}
goto error;
}
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index bb44041..1d8bf72 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -184,33 +184,36 @@ mlx5_mp2mr(struct 

[dpdk-dev] [PATCH v7 06/25] mlx5: remove inline Tx support

2016-06-24 Thread Nelio Laranjeiro
Inline TX will be fully managed by the PMD after Verbs is bypassed in the
data path. Remove the current code until then.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 config/common_base   |  1 -
 doc/guides/nics/mlx5.rst | 10 --
 drivers/net/mlx5/Makefile|  4 ---
 drivers/net/mlx5/mlx5_defs.h |  5 ---
 drivers/net/mlx5/mlx5_rxtx.c | 75 +++-
 drivers/net/mlx5/mlx5_rxtx.h |  9 --
 drivers/net/mlx5/mlx5_txq.c  | 19 ++-
 7 files changed, 27 insertions(+), 96 deletions(-)

diff --git a/config/common_base b/config/common_base
index 39e6333..5fbac47 100644
--- a/config/common_base
+++ b/config/common_base
@@ -207,7 +207,6 @@ CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
 #
 CONFIG_RTE_LIBRTE_MLX5_PMD=n
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
-CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8

 #
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 84c35a0..77fa957 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,16 +114,6 @@ These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.

-- ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**)
-
-  Amount of data to be inlined during TX operations. Improves latency.
-  Can improve PPS performance when PCI backpressure is detected and may be
-  useful for scenarios involving heavy traffic on many queues.
-
-  Since the additional software logic necessary to handle this mode can
-  lower performance when there is no backpressure, it is not enabled by
-  default.
-
 - ``CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE`` (default **8**)

   Maximum number of cached memory pools (MPs) per TX queue. Each MP from
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 656a6e1..289c85e 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -86,10 +86,6 @@ else
 CFLAGS += -DNDEBUG -UPEDANTIC
 endif

-ifdef CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE
-CFLAGS += -DMLX5_PMD_MAX_INLINE=$(CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE)
-endif
-
 ifdef CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE
 CFLAGS += -DMLX5_PMD_TX_MP_CACHE=$(CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE)
 endif
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index da1c90e..9a19835 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -54,11 +54,6 @@
 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256

-/* Maximum size for inline data. */
-#ifndef MLX5_PMD_MAX_INLINE
-#define MLX5_PMD_MAX_INLINE 0
-#endif
-
 /*
  * Maximum number of cached Memory Pools (MPs) per TX queue. Each RTE MP
  * from which buffers are to be transmitted will have to be mapped by this
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f67cbf4..4ba88ea 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -329,58 +329,33 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
rte_prefetch0((volatile void *)
  (uintptr_t)buf_next_addr);
}
-   /* Put packet into send queue. */
-#if MLX5_PMD_MAX_INLINE > 0
-   if (length <= txq->max_inline) {
-#ifdef HAVE_VERBS_VLAN_INSERTION
-   if (insert_vlan)
-   err = txq->send_pending_inline_vlan
-   (txq->qp,
-(void *)addr,
-length,
-send_flags,
-&buf->vlan_tci);
-   else
-#endif /* HAVE_VERBS_VLAN_INSERTION */
-   err = txq->send_pending_inline
-   (txq->qp,
-(void *)addr,
-length,
-send_flags);
-   } else
-#endif
-   {
-   /*
-* Retrieve Memory Region key for this
-* memory pool.
-*/
-   lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-   if (unlikely(lkey == (uint32_t)-1)) {
-   /* MR does not exist. */
-   DEBUG("%p: unable to get MP <-> MR"
- " association", (void *)txq);
-   /* Clean up TX element. */
-   elt->buf = NULL;
-   goto stop;
-   }
+   /* Retrieve Memory Region key for this memory pool. */
+   lkey = txq_mp2mr(txq, txq_mb2mp(buf));
+   if (unlikely(lkey == (uint32_t)-1)) {
+   /* MR does not 

[dpdk-dev] [PATCH v7 05/25] mlx5: remove configuration variable

2016-06-24 Thread Nelio Laranjeiro
There is no scatter/gather support anymore, CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N
has no purpose and can be removed.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 config/common_base   | 1 -
 doc/guides/nics/mlx5.rst | 7 ---
 drivers/net/mlx5/Makefile| 4 
 drivers/net/mlx5/mlx5_defs.h | 5 -
 drivers/net/mlx5/mlx5_rxq.c  | 4 
 drivers/net/mlx5/mlx5_txq.c  | 4 
 6 files changed, 25 deletions(-)

diff --git a/config/common_base b/config/common_base
index ead5984..39e6333 100644
--- a/config/common_base
+++ b/config/common_base
@@ -207,7 +207,6 @@ CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
 #
 CONFIG_RTE_LIBRTE_MLX5_PMD=n
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
-CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index d9196d1..84c35a0 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,13 +114,6 @@ These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.

-- ``CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N`` (default **4**)
-
-  Number of scatter/gather elements (SGEs) per work request (WR). Lowering
-  this number improves performance but also limits the ability to receive
-  scattered packets (packets that do not fit a single mbuf). The default
-  value is a safe tradeoff.
-
 - ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**)

   Amount of data to be inlined during TX operations. Improves latency.
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 999ada5..656a6e1 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -86,10 +86,6 @@ else
 CFLAGS += -DNDEBUG -UPEDANTIC
 endif

-ifdef CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N
-CFLAGS += -DMLX5_PMD_SGE_WR_N=$(CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N)
-endif
-
 ifdef CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE
 CFLAGS += -DMLX5_PMD_MAX_INLINE=$(CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE)
 endif
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 09207d9..da1c90e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -54,11 +54,6 @@
 /* RSS Indirection table size. */
 #define RSS_INDIRECTION_TABLE_SIZE 256

-/* Maximum number of Scatter/Gather Elements per Work Request. */
-#ifndef MLX5_PMD_SGE_WR_N
-#define MLX5_PMD_SGE_WR_N 4
-#endif
-
 /* Maximum size for inline data. */
 #ifndef MLX5_PMD_MAX_INLINE
 #define MLX5_PMD_MAX_INLINE 0
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 38ff9fd..4000624 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -976,10 +976,6 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, 
uint16_t desc,
ERROR("%p: invalid number of RX descriptors", (void *)dev);
return EINVAL;
}
-   if (MLX5_PMD_SGE_WR_N > 1) {
-   ERROR("%p: RX scatter is not supported", (void *)dev);
-   return ENOTSUP;
-   }
/* Toggle RX checksum offload if hardware supports it. */
if (priv->hw_csum)
tmpl.csum = !!dev->data->dev_conf.rxmode.hw_ip_checksum;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 5a248c9..59974c5 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -264,10 +264,6 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, 
uint16_t desc,
ERROR("%p: invalid number of TX descriptors", (void *)dev);
return EINVAL;
}
-   if (MLX5_PMD_SGE_WR_N > 1) {
-   ERROR("%p: TX gather is not supported", (void *)dev);
-   return EINVAL;
-   }
/* MRs will be registered in mp2mr[] later. */
attr.rd = (struct ibv_exp_res_domain_init_attr){
.comp_mask = (IBV_EXP_RES_DOMAIN_THREAD_MODEL |
-- 
2.1.4



[dpdk-dev] [PATCH v7 04/25] mlx5: remove Rx scatter support

2016-06-24 Thread Nelio Laranjeiro
This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. RX scatter cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_ethdev.c |  31 +---
 drivers/net/mlx5/mlx5_rxq.c| 314 ++---
 drivers/net/mlx5/mlx5_rxtx.c   | 211 +--
 drivers/net/mlx5/mlx5_rxtx.h   |  13 +-
 4 files changed, 53 insertions(+), 516 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 280a90a..ca57021 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -623,8 +623,7 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev)

};

-   if (dev->rx_pkt_burst == mlx5_rx_burst ||
-   dev->rx_pkt_burst == mlx5_rx_burst_sp)
+   if (dev->rx_pkt_burst == mlx5_rx_burst)
return ptypes;
return NULL;
 }
@@ -762,19 +761,11 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
mb_len = rte_pktmbuf_data_room_size(rxq->mp);
assert(mb_len >= RTE_PKTMBUF_HEADROOM);
sp = (max_frame_len > (mb_len - RTE_PKTMBUF_HEADROOM));
-   /* Provide new values to rxq_setup(). */
-   dev->data->dev_conf.rxmode.jumbo_frame = sp;
-   dev->data->dev_conf.rxmode.max_rx_pkt_len = max_frame_len;
-   ret = rxq_rehash(dev, rxq);
-   if (ret) {
-   /* Force SP RX if that queue requires it and abort. */
-   if (rxq->sp)
-   rx_func = mlx5_rx_burst_sp;
-   break;
+   if (sp) {
+   ERROR("%p: RX scatter is not supported", (void *)dev);
+   ret = ENOTSUP;
+   goto out;
}
-   /* Scattered burst function takes priority. */
-   if (rxq->sp)
-   rx_func = mlx5_rx_burst_sp;
}
/* Burst functions can now be called again. */
rte_wmb();
@@ -1103,22 +1094,12 @@ priv_set_link(struct priv *priv, int up)
 {
struct rte_eth_dev *dev = priv->dev;
int err;
-   unsigned int i;

if (up) {
err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
if (err)
return err;
-   for (i = 0; i < priv->rxqs_n; i++)
-   if ((*priv->rxqs)[i]->sp)
-   break;
-   /* Check if an sp queue exists.
-* Note: Some old frames might be received.
-*/
-   if (i == priv->rxqs_n)
-   dev->rx_pkt_burst = mlx5_rx_burst;
-   else
-   dev->rx_pkt_burst = mlx5_rx_burst_sp;
+   dev->rx_pkt_burst = mlx5_rx_burst;
dev->tx_pkt_burst = mlx5_tx_burst;
} else {
err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 0bcf55b..38ff9fd 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -634,145 +634,6 @@ priv_rehash_flows(struct priv *priv)
 }

 /**
- * Allocate RX queue elements with scattered packets support.
- *
- * @param rxq
- *   Pointer to RX queue structure.
- * @param elts_n
- *   Number of elements to allocate.
- * @param[in] pool
- *   If not NULL, fetch buffers from this array instead of allocating them
- *   with rte_pktmbuf_alloc().
- *
- * @return
- *   0 on success, errno value on failure.
- */
-static int
-rxq_alloc_elts_sp(struct rxq *rxq, unsigned int elts_n,
- struct rte_mbuf **pool)
-{
-   unsigned int i;
-   struct rxq_elt_sp (*elts)[elts_n] =
-   rte_calloc_socket("RXQ elements", 1, sizeof(*elts), 0,
- rxq->socket);
-   int ret = 0;
-
-   if (elts == NULL) {
-   ERROR("%p: can't allocate packets array", (void *)rxq);
-   ret = ENOMEM;
-   goto error;
-   }
-   /* For each WR (packet). */
-   for (i = 0; (i != elts_n); ++i) {
-   unsigned int j;
-   struct rxq_elt_sp *elt = &(*elts)[i];
-   struct ibv_sge (*sges)[RTE_DIM(elt->sges)] = &elt->sges;
-
-   /* These two arrays must have the same size. */
-   assert(RTE_DIM(elt->sges) == RTE_DIM(elt->bufs));
-   /* For each SGE (segment). */
-   for (j = 0; (j != RTE_DIM(elt->bufs)); ++j) {
-   struct ibv_sge *sge = &(*sges)[j];
-   struct rte_mbuf *buf;
-
-   if (pool != NULL) {
-   buf = *(pool++);
-   assert(buf != NULL);
-   rte_pktmbuf_reset(buf);
- 

[dpdk-dev] [PATCH v7 03/25] mlx5: remove Tx gather support

2016-06-24 Thread Nelio Laranjeiro
This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. TX gather cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_ethdev.c |   2 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 317 -
 drivers/net/mlx5/mlx5_rxtx.h   |  17 ---
 drivers/net/mlx5/mlx5_txq.c|  49 ++-
 4 files changed, 71 insertions(+), 314 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 0a881b6..280a90a 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1260,7 +1260,7 @@ mlx5_secondary_data_setup(struct priv *priv)
if (txq != NULL) {
if (txq_setup(priv->dev,
  txq,
- primary_txq->elts_n * MLX5_PMD_SGE_WR_N,
+ primary_txq->elts_n,
  primary_txq->socket,
  NULL) == 0) {
txq->stats.idx = primary_txq->stats.idx;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 616cf7a..c4487b9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -228,156 +228,6 @@ insert_vlan_sw(struct rte_mbuf *buf)
return 0;
 }

-#if MLX5_PMD_SGE_WR_N > 1
-
-/**
- * Copy scattered mbuf contents to a single linear buffer.
- *
- * @param[out] linear
- *   Linear output buffer.
- * @param[in] buf
- *   Scattered input buffer.
- *
- * @return
- *   Number of bytes copied to the output buffer or 0 if not large enough.
- */
-static unsigned int
-linearize_mbuf(linear_t *linear, struct rte_mbuf *buf)
-{
-   unsigned int size = 0;
-   unsigned int offset;
-
-   do {
-   unsigned int len = DATA_LEN(buf);
-
-   offset = size;
-   size += len;
-   if (unlikely(size > sizeof(*linear)))
-   return 0;
-   memcpy(&(*linear)[offset],
-  rte_pktmbuf_mtod(buf, uint8_t *),
-  len);
-   buf = NEXT(buf);
-   } while (buf != NULL);
-   return size;
-}
-
-/**
- * Handle scattered buffers for mlx5_tx_burst().
- *
- * @param txq
- *   TX queue structure.
- * @param segs
- *   Number of segments in buf.
- * @param elt
- *   TX queue element to fill.
- * @param[in] buf
- *   Buffer to process.
- * @param elts_head
- *   Index of the linear buffer to use if necessary (normally txq->elts_head).
- * @param[out] sges
- *   Array filled with SGEs on success.
- *
- * @return
- *   A structure containing the processed packet size in bytes and the
- *   number of SGEs. Both fields are set to (unsigned int)-1 in case of
- *   failure.
- */
-static struct tx_burst_sg_ret {
-   unsigned int length;
-   unsigned int num;
-}
-tx_burst_sg(struct txq *txq, unsigned int segs, struct txq_elt *elt,
-   struct rte_mbuf *buf, unsigned int elts_head,
-   struct ibv_sge (*sges)[MLX5_PMD_SGE_WR_N])
-{
-   unsigned int sent_size = 0;
-   unsigned int j;
-   int linearize = 0;
-
-   /* When there are too many segments, extra segments are
-* linearized in the last SGE. */
-   if (unlikely(segs > RTE_DIM(*sges))) {
-   segs = (RTE_DIM(*sges) - 1);
-   linearize = 1;
-   }
-   /* Update element. */
-   elt->buf = buf;
-   /* Register segments as SGEs. */
-   for (j = 0; (j != segs); ++j) {
-   struct ibv_sge *sge = &(*sges)[j];
-   uint32_t lkey;
-
-   /* Retrieve Memory Region key for this memory pool. */
-   lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-   if (unlikely(lkey == (uint32_t)-1)) {
-   /* MR does not exist. */
-   DEBUG("%p: unable to get MP <-> MR association",
- (void *)txq);
-   /* Clean up TX element. */
-   elt->buf = NULL;
-   goto stop;
-   }
-   /* Update SGE. */
-   sge->addr = rte_pktmbuf_mtod(buf, uintptr_t);
-   if (txq->priv->sriov)
-   rte_prefetch0((volatile void *)
- (uintptr_t)sge->addr);
-   sge->length = DATA_LEN(buf);
-   sge->lkey = lkey;
-   sent_size += sge->length;
-   buf = NEXT(buf);
-   }
-   /* If buf is not NULL here and is not going to be linearized,
-* nb_segs is not valid. */
-   assert(j == segs);
-   assert((buf == NULL) || (linearize));
-   /* Linearize extra segments. */
-   if (linearize) {
-   struct ibv_sge *sge = &(*sges)[segs];
-   linear_t *linear = 

[dpdk-dev] [PATCH v7 02/25] mlx5: split memory registration function

2016-06-24 Thread Nelio Laranjeiro
Except for the first time when memory registration occurs, the lkey is
always cached. Since memory registration is slow and performs system calls,
performance can be improved by moving that code to its own function outside
of the data path so only the lookup code is left in the original inlined
function.
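
The sketch below (sizes and the registration stub are made up) illustrates
the split described above: a small cached-lkey lookup that is cheap enough to
stay inlined in the burst function, falling back to an out-of-line
registration helper only on a miss.

/* Fast-path/slow-path split for memory-region lookups. */
#include <stdint.h>
#include <stdio.h>

#define MP2MR_CACHE 8

struct mr_cache {
        const void *mp[MP2MR_CACHE]; /* mempool pointers */
        uint32_t lkey[MP2MR_CACHE];
        unsigned int n;
};

/* Slow path: stands in for the expensive verbs registration. */
static uint32_t
register_mr(struct mr_cache *c, const void *mp)
{
        uint32_t lkey = 0x1000 + c->n; /* placeholder for ibv_reg_mr() result */

        if (c->n < MP2MR_CACHE) {
                c->mp[c->n] = mp;
                c->lkey[c->n] = lkey;
                c->n++;
        }
        return lkey;
}

/* Fast path: small enough to remain inlined in the TX burst function. */
static inline uint32_t
lookup_lkey(struct mr_cache *c, const void *mp)
{
        unsigned int i;

        for (i = 0; i < c->n; ++i)
                if (c->mp[i] == mp)
                        return c->lkey[i];
        return register_mr(c, mp); /* miss: take the slow path once */
}

int
main(void)
{
        struct mr_cache cache = { .n = 0 };
        int pool_a, pool_b;
        uint32_t k1 = lookup_lkey(&cache, &pool_a); /* miss: registers */
        uint32_t k2 = lookup_lkey(&cache, &pool_b); /* miss: registers */
        uint32_t k3 = lookup_lkey(&cache, &pool_a); /* hit: cached lkey */

        printf("a=%#x b=%#x a_again=%#x\n",
               (unsigned int)k1, (unsigned int)k2, (unsigned int)k3);
        return 0;
}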

Signed-off-by: Nelio Laranjeiro 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/Makefile|   1 +
 drivers/net/mlx5/mlx5_mr.c   | 280 +++
 drivers/net/mlx5/mlx5_rxtx.c | 209 ++--
 drivers/net/mlx5/mlx5_rxtx.h |   8 +-
 4 files changed, 298 insertions(+), 200 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mr.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 82558aa..999ada5 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -47,6 +47,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vlan.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_fdir.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c

 # Dependencies.
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += lib/librte_ether
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
new file mode 100644
index 000..bb44041
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -0,0 +1,280 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include 
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* DPDK headers don't like -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include 
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+#include "mlx5.h"
+#include "mlx5_rxtx.h"
+
+struct mlx5_check_mempool_data {
+   int ret;
+   char *start;
+   char *end;
+};
+
+/* Called by mlx5_check_mempool() when iterating the memory chunks. */
+static void
+mlx5_check_mempool_cb(struct rte_mempool *mp,
+ void *opaque, struct rte_mempool_memhdr *memhdr,
+ unsigned int mem_idx)
+{
+   struct mlx5_check_mempool_data *data = opaque;
+
+   (void)mp;
+   (void)mem_idx;
+
+   /* It already failed, skip the next chunks. */
+   if (data->ret != 0)
+   return;
+   /* It is the first chunk. */
+   if (data->start == NULL && data->end == NULL) {
+   data->start = memhdr->addr;
+   data->end = data->start + memhdr->len;
+   return;
+   }
+   if (data->end == memhdr->addr) {
+   data->end += memhdr->len;
+   return;
+   }
+   if (data->start == (char *)memhdr->addr + memhdr->len) {
+   data->start -= memhdr->len;
+   return;
+   }
+   /* Error, mempool is not virtually contiguous. */
+   data->ret = -1;
+}
+
+/**
+ * Check if a mempool can be used: it must be virtually contiguous.
+ *
+ * @param[in] mp
+ *   Pointer to memory pool.
+ * @param[out] start
+ *   Pointer to the start address of the mempool virtual memory area
+ * @param[out] end
+ *   Pointer to the end address of the mempool virtual memory area
+ *
+ * 

[dpdk-dev] [PATCH v7 01/25] drivers: fix PCI class id support

2016-06-24 Thread Nelio Laranjeiro
Fixes: 701c8d80c820 ("pci: support class id probing")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/crypto/qat/rte_qat_cryptodev.c |  5 +
 drivers/net/mlx4/mlx4.c| 18 ++
 drivers/net/mlx5/mlx5.c| 24 
 drivers/net/nfp/nfp_net.c  | 12 
 4 files changed, 19 insertions(+), 40 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..f46ec85 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -69,10 +69,7 @@ static struct rte_cryptodev_ops crypto_qat_ops = {

 static struct rte_pci_id pci_id_qat_map[] = {
{
-   .vendor_id = 0x8086,
-   .device_id = 0x0443,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(0x8086, 0x0443),
},
{.device_id = 0},
 };
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 9e94630..6228688 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -5807,22 +5807,16 @@ error:

 static const struct rte_pci_id mlx4_pci_id_map[] = {
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX3,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX3)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX3PRO,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX3PRO)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX3VF,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX3VF)
},
{
.vendor_id = 0
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 67a541c..350028b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -610,28 +610,20 @@ error:

 static const struct rte_pci_id mlx5_pci_id_map[] = {
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4VF,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4VF)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4LX,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4LX)
},
{
-   .vendor_id = PCI_VENDOR_ID_MELLANOX,
-   .device_id = PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
+  PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF)
},
{
.vendor_id = 0
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index ea5a2a3..dd0c559 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2446,16 +2446,12 @@ nfp_net_init(struct rte_eth_dev *eth_dev)

 static struct rte_pci_id pci_id_nfp_net_map[] = {
{
-   .vendor_id = PCI_VENDOR_ID_NETRONOME,
-   .device_id = PCI_DEVICE_ID_NFP6000_PF_NIC,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = PCI_ANY_ID,
+   RTE_PCI_DEVICE(PCI_VENDOR_ID_NETRONOME,
+  PCI_DEVICE_ID_NFP6000_PF_NIC)
},
{
-   .vendor_id = PCI_VENDOR_ID_NETRONOME,
-   .device_id = PCI_DEVICE_ID_NFP6000_VF_NIC,
-   .subsystem_vendor_id = PCI_ANY_ID,
-   .subsystem_device_id = 

[dpdk-dev] [PATCH v7 00/25] Refactor mlx5 to improve performance

2016-06-24 Thread Nelio Laranjeiro
Enhance mlx5 with a data path that bypasses Verbs.

The first half of this patchset removes support for functionality completely
rewritten in the second half (scatter/gather, inline send), while the data
path is refactored without Verbs.

The PMD remains usable during the transition.

This patchset must be applied after "Miscellaneous fixes for mlx4 and mlx5".

Changes in v7:
- Fixed a bug introduced by change in v5.

Changes in v6:
- None.

Changes in v5:
- Fixed checkpatches errors.

Changes in v4:
- Fixed errno return value of mlx5_args().
- Fixed long line above 80 characters.

Changes in v3:
- Rebased patchset on top of next-net/rel_16_07.

Changes in v2:
- Rebased patchset on top of dpdk/master.
- Fixed CQE size on Power8.
- Fixed mbuf assertion failure in debug mode.
- Fixed missing class_id field in rte_pci_id by using RTE_PCI_DEVICE.

Adrien Mazarguil (7):
  mlx5: replace countdown with threshold for Tx completions
  mlx5: add debugging information about Tx queues capabilities
  mlx5: check remaining space while processing Tx burst
  mlx5: resurrect Tx gather support
  mlx5: work around spurious compilation errors
  mlx5: remove redundant Rx queue initialization code
  mlx5: make Rx queue reinitialization safer

Nelio Laranjeiro (17):
  drivers: fix PCI class id support
  mlx5: split memory registration function
  mlx5: remove Tx gather support
  mlx5: remove Rx scatter support
  mlx5: remove configuration variable
  mlx5: remove inline Tx support
  mlx5: split Tx queue structure
  mlx5: split Rx queue structure
  mlx5: update prerequisites for upcoming enhancements
  mlx5: add definitions for data path without Verbs
  mlx5: add support for configuration through kvargs
  mlx5: add Tx/Rx burst function selection wrapper
  mlx5: refactor Rx data path
  mlx5: refactor Tx data path
  mlx5: handle Rx CQE compression
  mlx5: add support for multi-packet send
  mlx5: resurrect Rx scatter support

Yaacov Hazan (1):
  mlx5: add support for inline send

 config/common_base |2 -
 doc/guides/nics/mlx5.rst   |   94 +-
 drivers/crypto/qat/rte_qat_cryptodev.c |5 +-
 drivers/net/mlx4/mlx4.c|   18 +-
 drivers/net/mlx5/Makefile  |   49 +-
 drivers/net/mlx5/mlx5.c|  186 ++-
 drivers/net/mlx5/mlx5.h|   10 +
 drivers/net/mlx5/mlx5_defs.h   |   26 +-
 drivers/net/mlx5/mlx5_ethdev.c |  195 ++-
 drivers/net/mlx5/mlx5_fdir.c   |   20 +-
 drivers/net/mlx5/mlx5_mr.c |  283 
 drivers/net/mlx5/mlx5_prm.h|  163 +++
 drivers/net/mlx5/mlx5_rxmode.c |8 -
 drivers/net/mlx5/mlx5_rxq.c|  767 ---
 drivers/net/mlx5/mlx5_rxtx.c   | 2232 +++-
 drivers/net/mlx5/mlx5_rxtx.h   |  176 ++-
 drivers/net/mlx5/mlx5_txq.c|  370 +++---
 drivers/net/mlx5/mlx5_vlan.c   |6 +-
 drivers/net/nfp/nfp_net.c  |   12 +-
 19 files changed, 2671 insertions(+), 1951 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mr.c
 create mode 100644 drivers/net/mlx5/mlx5_prm.h

-- 
2.1.4



[dpdk-dev] [PATCH v16 0/3] mempool: add mempool handler feature

2016-06-24 Thread Jan Viktorin
On Fri, 24 Jun 2016 13:24:56 +0200
Thomas Monjalon  wrote:

> 2016-06-24 13:20, Jan Viktorin:
> > thanks for the patchset. I am sorry, I didn't have any time for DPDK this 
> > week
> > and didn't test it before applying. The current master produces the 
> > following
> > error in my regular builds:
> > 
> >   INSTALL-LIB librte_eal.a
> > == Build lib/librte_ring
> >   CC rte_ring.o
> >   AR librte_ring.a
> >   SYMLINK-FILE include/rte_ring.h
> >   INSTALL-LIB librte_ring.a
> > == Build lib/librte_mempool
> >   CC rte_mempool.o
> > make[3]: *** No rule to make target `rte_mempool_ops.o', needed by 
> > `librte_mempool.a'.  Stop.  
> 
> It should be fixed now.

OK, confirmed. It seems that I only receive notifications of failures :).

Jan


[dpdk-dev] [PATCH v4 1/5] pdump: fix default socket path

2016-06-24 Thread Pattan, Reshma


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, June 24, 2016 3:55 PM
> To: Pattan, Reshma 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 1/5] pdump: fix default socket path
> 
> 2016-06-24 14:54, Reshma Pattan:
> > +#define SOCKET_DIR   "/pdump_sockets"
> 
> I think the default socket directory should contain dpdk as prefix.
> Like dpdk-pdump-sockets (I think dash is preferred for filenames).
> I wonder whether it should be a hidden directory:
>   ~/.dpdk-pdump-sockets
> And after all, why not simply
>   ~/.dpdk/

Hmm, I would keep the name dpdk-pdump-sockets, as the library creates this only
for pdump sockets.
A hidden directory just gives the advantage of not cluttering the display. If
your comment is about the same, I can add a "." to the directory name :-).

> It would allow other DPDK applications to put some files.


[dpdk-dev] [PATCH v6 00/25] Refactor mlx5 to improve performance

2016-06-24 Thread Nélio Laranjeiro
Sorry about this, it will need a v7; a bug was introduced in the
previous version while fixing a checkpatch warning, and kvargs were no
longer interpreted.

I will resend a v7 after verifying everything.

Regards,

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] [PATCH v4 5/5] app/pdump: fix type casting of ring size

2016-06-24 Thread Reshma Pattan
The ring_size value is wrongly type cast to uint16_t.
It should be type cast to uint32_t, as the maximum
ring size is 28 bits long. The wrong type cast wraps
around ring size values bigger than 65535.
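
A quick illustration of the overflow being fixed (262144 is a valid 28-bit
ring size but does not fit in 16 bits):

/* Narrowing to uint16_t silently drops the upper bits of large ring sizes. */
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
        uint32_t requested = 262144; /* 2^18, well within the 28-bit limit */

        printf("as uint16_t: %u\n", (unsigned int)(uint16_t)requested); /* 0 */
        printf("as uint32_t: %u\n", (unsigned int)requested); /* 262144 */
        return 0;
}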

Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")

Signed-off-by: Reshma Pattan 
---
 app/pdump/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index fe4d38a..2087c15 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -362,7 +362,7 @@ parse_pdump(const char *optarg)
&parse_uint_value, &v);
if (ret < 0)
goto free_kvlist;
-   pt->ring_size = (uint16_t) v.val;
+   pt->ring_size = (uint32_t) v.val;
} else
pt->ring_size = RING_SIZE;

-- 
2.5.0



[dpdk-dev] [PATCH v4 4/5] app/pdump: fix string overflow

2016-06-24 Thread Reshma Pattan
Replaced strncpy with snprintf to copy
the strings safely.
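
As a minimal illustration (buffer and value are made up): bounding the copy
by strlen(value), as the old code did, neither respects the destination size
nor guarantees NUL termination, while the snprintf() form does both.

/* Why strncpy(dst, src, strlen(src)) is unsafe and the snprintf() form is not. */
#include <stdio.h>
#include <string.h>

int
main(void)
{
        char dst[8];
        const char *src = "0000:01:00.0"; /* longer than dst */

        /* Old pattern: the bound comes from the source, so this would write
         * 12 bytes into an 8-byte buffer and never NUL-terminate it:
         *     strncpy(dst, src, strlen(src));
         */
        snprintf(dst, sizeof(dst), "%s", src); /* bounded and terminated */
        printf("'%s'\n", dst); /* prints '0000:01' */
        return 0;
}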

Coverity issue 127351: string overflow

Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")

Signed-off-by: Reshma Pattan 
---
 app/pdump/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index f8923b9..fe4d38a 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -217,12 +217,12 @@ parse_rxtxdev(const char *key, const char *value, void 
*extra_args)
struct pdump_tuples *pt = extra_args;

if (!strcmp(key, PDUMP_RX_DEV_ARG)) {
-   strncpy(pt->rx_dev, value, strlen(value));
+   snprintf(pt->rx_dev, sizeof(pt->rx_dev), "%s", value);
/* identify the tx stream type for pcap vdev */
if (if_nametoindex(pt->rx_dev))
pt->rx_vdev_stream_type = IFACE;
} else if (!strcmp(key, PDUMP_TX_DEV_ARG)) {
-   strncpy(pt->tx_dev, value, strlen(value));
+   snprintf(pt->tx_dev, sizeof(pt->tx_dev), "%s", value);
/* identify the tx stream type for pcap vdev */
if (if_nametoindex(pt->tx_dev))
pt->tx_vdev_stream_type = IFACE;
-- 
2.5.0



[dpdk-dev] [PATCH v4 3/5] pdump: fix string overflow

2016-06-24 Thread Reshma Pattan
Replaced strncpy with snprintf to copy
the strings safely.

Coverity issue 127350: string overflow

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan 
---
 lib/librte_pdump/rte_pdump.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index 8240387..53a5bf2 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -804,13 +804,15 @@ pdump_prepare_client_request(char *device, uint16_t queue,
req.flags = flags;
req.op =  operation;
if ((operation & ENABLE) != 0) {
-   strncpy(req.data.en_v1.device, device, strlen(device));
+   snprintf(req.data.en_v1.device, sizeof(req.data.en_v1.device),
+   "%s", device);
req.data.en_v1.queue = queue;
req.data.en_v1.ring = ring;
req.data.en_v1.mp = mp;
req.data.en_v1.filter = filter;
} else {
-   strncpy(req.data.dis_v1.device, device, strlen(device));
+   snprintf(req.data.dis_v1.device, sizeof(req.data.dis_v1.device),
+   "%s", device);
req.data.dis_v1.queue = queue;
req.data.dis_v1.ring = NULL;
req.data.dis_v1.mp = NULL;
-- 
2.5.0



[dpdk-dev] [PATCH v4 2/5] pdump: check getenv return value

2016-06-24 Thread Reshma Pattan
Inside pdump_get_socket_path(), getenv can return
a NULL pointer if the match for SOCKET_PATH_HOME is
not found in the environment. A NULL check is added to
return -1 immediately. Since pdump_get_socket_path()
now returns -1, every caller checks the return value
and logs an error message on failure.

Coverity issue 127344:  return value check
Coverity issue 127347:  null pointer dereference

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan 
---
 lib/librte_pdump/rte_pdump.c | 44 +++-
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index 5c335ba..8240387 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -442,7 +442,7 @@ set_pdump_rxtx_cbs(struct pdump_request *p)
 }

 /* get socket path (/var/run if root, $HOME otherwise) */
-static void
+static int
 pdump_get_socket_path(char *buffer, int bufsz, enum rte_pdump_socktype type)
 {
char dir[PATH_MAX] = {0};
@@ -455,9 +455,17 @@ pdump_get_socket_path(char *buffer, int bufsz, enum 
rte_pdump_socktype type)
else {
if (getuid() != 0) {
dir_home = getenv(SOCKET_PATH_HOME);
+   if (!dir_home) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get environment variable"
+   " value for %s, %s:%d\n",
+   SOCKET_PATH_HOME, __func__, __LINE__);
+   return -1;
+   }
strcat(dir, dir_home);
} else
strcat(dir, SOCKET_PATH_VAR_RUN);
+
strcat(dir, SOCKET_DIR);
}

@@ -467,6 +475,8 @@ pdump_get_socket_path(char *buffer, int bufsz, enum 
rte_pdump_socktype type)
else
snprintf(buffer, bufsz, CLIENT_SOCKET, dir, getpid(),
rte_sys_gettid());
+
+   return 0;
 }

 static int
@@ -476,8 +486,14 @@ pdump_create_server_socket(void)
struct sockaddr_un addr;
socklen_t addr_len;

-   pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+   ret = pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
RTE_PDUMP_SOCKET_SERVER);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get server socket path: %s:%d\n",
+   __func__, __LINE__);
+   return -1;
+   }
addr.sun_family = AF_UNIX;

/* remove if file already exists */
@@ -608,8 +624,14 @@ rte_pdump_uninit(void)

struct sockaddr_un addr;

-   pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+   ret = pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
RTE_PDUMP_SOCKET_SERVER);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get server socket path: %s:%d\n",
+   __func__, __LINE__);
+   return -1;
+   }
ret = unlink(addr.sun_path);
if (ret != 0) {
RTE_LOG(ERR, PDUMP,
@@ -643,8 +665,14 @@ pdump_create_client_socket(struct pdump_request *p)
return ret;
}

-   pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
+   ret = pdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path),
RTE_PDUMP_SOCKET_CLIENT);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get client socket path: %s:%d\n",
+   __func__, __LINE__);
+   return -1;
+   }
addr.sun_family = AF_UNIX;
addr_len = sizeof(struct sockaddr_un);

@@ -660,9 +688,15 @@ pdump_create_client_socket(struct pdump_request *p)

serv_len = sizeof(struct sockaddr_un);
memset(&serv_addr, 0, sizeof(serv_addr));
-   pdump_get_socket_path(serv_addr.sun_path,
+   ret = pdump_get_socket_path(serv_addr.sun_path,
sizeof(serv_addr.sun_path),
RTE_PDUMP_SOCKET_SERVER);
+   if (ret != 0) {
+   RTE_LOG(ERR, PDUMP,
+   "Failed to get server socket path: %s:%d\n",
+   __func__, __LINE__);
+   break;
+   }
serv_addr.sun_family = AF_UNIX;

n =  sendto(socket_fd, p, sizeof(struct pdump_request), 0,
-- 
2.5.0
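
A stand-alone sketch of the defensive pattern this patch applies (function and
variable names are illustrative, not the library's): getenv() returns NULL when
the variable is unset, so the result must be checked before building the path.

#include <stdio.h>
#include <stdlib.h>

static int
get_socket_dir(char *buf, size_t len)
{
	const char *home = getenv("HOME");

	if (home == NULL) {
		fprintf(stderr, "HOME is not set\n");
		return -1;      /* propagate the failure to the caller */
	}
	snprintf(buf, len, "%s/pdump_sockets", home);
	return 0;
}

int
main(void)
{
	char dir[256];

	if (get_socket_dir(dir, sizeof(dir)) == 0)
		printf("socket dir: %s\n", dir);
	return 0;
}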



[dpdk-dev] [PATCH v4 1/5] pdump: fix default socket path

2016-06-24 Thread Reshma Pattan
SOCKET_PATH_HOME is meant to name the environment variable "HOME",
so it should not contain "/pdump_sockets" in the macro.
So remove "/pdump_sockets" from SOCKET_PATH_HOME
and create a new macro for "/pdump_sockets". Similarly, remove
"/pdump_sockets" from SOCKET_PATH_VAR_RUN.
Changes are done in pdump_get_socket_path() to accommodate
the new socket path layout.

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan 
---
 lib/librte_pdump/rte_pdump.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index c921f51..5c335ba 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -50,8 +50,9 @@

 #include "rte_pdump.h"

-#define SOCKET_PATH_VAR_RUN "/var/run/pdump_sockets"
-#define SOCKET_PATH_HOME "HOME/pdump_sockets"
+#define SOCKET_PATH_VAR_RUN "/var/run"
+#define SOCKET_PATH_HOME "HOME"
+#define SOCKET_DIR   "/pdump_sockets"
 #define SERVER_SOCKET "%s/pdump_server_socket"
 #define CLIENT_SOCKET "%s/pdump_client_socket_%d_%u"
 #define DEVICE_ID_SIZE 64
@@ -444,17 +445,20 @@ set_pdump_rxtx_cbs(struct pdump_request *p)
 static void
 pdump_get_socket_path(char *buffer, int bufsz, enum rte_pdump_socktype type)
 {
-   const char *dir = NULL;
+   char dir[PATH_MAX] = {0};
+   char *dir_home = NULL;

if (type == RTE_PDUMP_SOCKET_SERVER && server_socket_dir[0] != 0)
-   dir = server_socket_dir;
+   snprintf(dir, sizeof(dir), "%s", server_socket_dir);
else if (type == RTE_PDUMP_SOCKET_CLIENT && client_socket_dir[0] != 0)
-   dir = client_socket_dir;
+   snprintf(dir, sizeof(dir), "%s", client_socket_dir);
else {
-   if (getuid() != 0)
-   dir = getenv(SOCKET_PATH_HOME);
-   else
-   dir = SOCKET_PATH_VAR_RUN;
+   if (getuid() != 0) {
+   dir_home = getenv(SOCKET_PATH_HOME);
+   strcat(dir, dir_home);
+   } else
+   strcat(dir, SOCKET_PATH_VAR_RUN);
+   strcat(dir, SOCKET_DIR);
}

mkdir(dir, 700);
-- 
2.5.0



[dpdk-dev] [PATCH v4 0/5] fix issues in packet capture framework

2016-06-24 Thread Reshma Pattan
This patchset includes the following fixes:
1) fix default socket path in pdump library.
2) fix coverity issues in pdump library.
3) fix coverity issues in pdump tool.
4) fix wrong typecast of ring size in pdump tool.

v4:
added new patch for fixing wrong typecast of ring size
in pdump tool.

v3:
added new patch for fixing default socket paths "HOME" and "/var/run".
reworked coverity fixes on top of the above change.

v2:
fixed code review comment to use snprintf instead of strncpy.

Reshma Pattan (5):
  pdump: fix default socket path
  pdump: check getenv return value
  pdump: fix string overflow
  app/pdump: fix string overflow
  app/pdump: fix type casting of ring size

 app/pdump/main.c |  6 ++--
 lib/librte_pdump/rte_pdump.c | 72 ++--
 2 files changed, 59 insertions(+), 19 deletions(-)

-- 
2.5.0



[dpdk-dev] [PATCH] app/test: fix PCI class probing

2016-06-24 Thread Thomas Monjalon
The PCI test was failing because some fake devices had no PCI class.

Fixes: 1dbba1650c89 ("app/test: remove real PCI ids")

Signed-off-by: Thomas Monjalon 
---
 app/test/test_pci_sysfs/bus/pci/devices/:01:02.0/class | 1 +
 app/test/test_pci_sysfs/bus/pci/devices/:02:ab.0/class | 1 +
 2 files changed, 2 insertions(+)
 create mode 100644 app/test/test_pci_sysfs/bus/pci/devices/:01:02.0/class
 create mode 100644 app/test/test_pci_sysfs/bus/pci/devices/:02:ab.0/class

diff --git a/app/test/test_pci_sysfs/bus/pci/devices/:01:02.0/class 
b/app/test/test_pci_sysfs/bus/pci/devices/:01:02.0/class
new file mode 100644
index 000..22dd936
--- /dev/null
+++ b/app/test/test_pci_sysfs/bus/pci/devices/:01:02.0/class
@@ -0,0 +1 @@
+0x10
diff --git a/app/test/test_pci_sysfs/bus/pci/devices/:02:ab.0/class 
b/app/test/test_pci_sysfs/bus/pci/devices/:02:ab.0/class
new file mode 100644
index 000..22dd936
--- /dev/null
+++ b/app/test/test_pci_sysfs/bus/pci/devices/:02:ab.0/class
@@ -0,0 +1 @@
+0x10
-- 
2.7.0



[dpdk-dev] [PATCH] nfp: modifying guide about using uio modules

2016-06-24 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alejandro Lucero
> Sent: Tuesday, April 26, 2016 12:37 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] nfp: modifying guide about using uio modules
> 
>  - Removing dependency on nfp_uio kernel module. The igb_uio
>kernel modules can be used instead.
> 
> Fixes: 80bc1752f16e ("nfp: add guide")
> 
> Signed-off-by: Alejandro Lucero 

Acked-by: John McNamara 



[dpdk-dev] [PATCH] bnxt: Add Cumulus+ PCI ID

2016-06-24 Thread Ajit Khaparde
On Fri, Jun 24, 2016 at 6:59 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Jun 21, 2016 at 04:58:20PM -0500, Ajit Khaparde wrote:
> > This patch adds support for Cumulus+ Ethernet adapters.
> > These Cumulus+ Ethernet adapters support 10Gb/25Gb/40Gb/50Gb speeds.
> >
> > Signed-off-by: Ajit Khaparde 
>
> Applied to dpdk-next-net/rel_16_07
>
Thanks Bruce. At what point in time will the changes under rel_16_07
be available in the dpdk tree?
git://dpdk.org/dpdk

Thanks


>
> /Bruce
>


[dpdk-dev] [PATCH v16 0/3] mempool: add mempool handler feature

2016-06-24 Thread Thomas Monjalon
2016-06-24 13:20, Jan Viktorin:
> thanks for the patchset. I am sorry, I didn't have any time for DPDK this week
> and didn't test it before applying. The current master produces the following
> error in my regular builds:
> 
>   INSTALL-LIB librte_eal.a
> == Build lib/librte_ring
>   CC rte_ring.o
>   AR librte_ring.a
>   SYMLINK-FILE include/rte_ring.h
>   INSTALL-LIB librte_ring.a
> == Build lib/librte_mempool
>   CC rte_mempool.o
> make[3]: *** No rule to make target `rte_mempool_ops.o', needed by 
> `librte_mempool.a'.  Stop.

It should be fixed now.


[dpdk-dev] [PATCH v2] mk: fix parallel build of test resources

2016-06-24 Thread Thomas Monjalon
The build was failing sometimes when building with multiple
parallel jobs:
# rm build/build/app/test/*res*
# make -j6
objcopy: 'resource.tmp': No such file

The reason is that each resource was built from the same temporary file.
The failure is seen because of a race condition when removing the
temporary file after each resource creation.
It also means that some resources may be created from the wrong source.

The fix is to have a different input file for each resource.
The source file is not directly used because it may have a long path
which is used by objcopy to name the symbols after some transformations.
When linking a tar resource, the input file is already in the current
directory. The hard case is for simply linked resources.
The trick is to create a symbolic link of the source file if it is not
already in the current build directory.
Then there is a replacement of dot by an underscore to predict the
symbol names computed by objcopy which must be redefined.

There is an additional change for the test_resource_c which is both
a real source file and a test resource. An intermediate file
test_resource.res is created to avoid compiling resource.c from the
wrong directory through a symbolic link.

Fixes: 1e9e0a6270 ("app/test: fix resource creation with objcopy on FreeBSD")

Signed-off-by: Thomas Monjalon 
---
v2: fix rebuild error due to link test_resource.c in build directory
---
 app/test/Makefile | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/app/test/Makefile b/app/test/Makefile
index 9fa03fb..1f24dd6 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -43,14 +43,14 @@ define linked_resource
 SRCS-y += $(1).res.o
 $(1).res.o: $(2)
@  echo '  MKRES $$@'
-   $Q ln -fs $$< resource.tmp
+   $Q [ "$$(

[dpdk-dev] [PATCH v16 0/3] mempool: add mempool handler feature

2016-06-24 Thread Jan Viktorin
On Fri, 24 Jun 2016 04:55:39 +
"Wiles, Keith"  wrote:

> On 6/23/16, 11:22 PM, "dev on behalf of Thomas Monjalon"  dpdk.org on behalf of thomas.monjalon at 6wind.com> wrote:
> 
> >> David Hunt (2):
> >>   mempool: support mempool handler operations
> >>   app/test: test mempool handler
> >>   mbuf: make default mempool ops configurable at build  
> >
> >Applied, thanks for the nice feature
> >
> >I'm sorry David, the revision record is v17 ;)  
> 
> Quick David, make two more updates to the patch ?
> 
> Thanks David and Great work !!!
> >
> >  
> 

Hello David,

thanks for the patchset. I am sorry, I didn't have any time for DPDK this week
and didn't test it before applying. The current master produces the following
error in my regular builds:

  INSTALL-LIB librte_eal.a
== Build lib/librte_ring
  CC rte_ring.o
  AR librte_ring.a
  SYMLINK-FILE include/rte_ring.h
  INSTALL-LIB librte_ring.a
== Build lib/librte_mempool
  CC rte_mempool.o
make[3]: *** No rule to make target `rte_mempool_ops.o', needed by 
`librte_mempool.a'.  Stop.
make[2]: *** [librte_mempool] Error 2
make[1]: *** [lib] Error 2
make: *** [all] Error 2
Build step 'Execute shell' marked build as failure
[WARNINGS] Skipping publisher since build result is FAILURE

I have no idea about the reason at the moment. I'll check it soon.

Regards
Jan


[dpdk-dev] [PATCH v3] i40e: fix the type issue of a single VLAN type

2016-06-24 Thread Bruce Richardson
On Wed, Jun 22, 2016 at 10:53:51AM +0800, Beilei Xing wrote:
> In current i40e codebase, if single VLAN header is added in a packet,
> it's treated as inner VLAN. Generally, a single VLAN header is
> treated as the outer VLAN header. So change corresponding register
> for single VLAN.
> 
> Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type")
> 
> Signed-off-by: Beilei Xing 

Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH] bnxt: Add Cumulus+ PCI ID

2016-06-24 Thread Bruce Richardson
On Tue, Jun 21, 2016 at 04:58:20PM -0500, Ajit Khaparde wrote:
> This patch adds support for Cumulus+ Ethernet adapters.
> These Cumulus+ Ethernet adapters support 10Gb/25Gb/40Gb/50Gb speeds.
> 
> Signed-off-by: Ajit Khaparde 

Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH v2 0/6] ena: update PMD to cooperate with latest ENA firmware

2016-06-24 Thread Bruce Richardson
On Tue, Jun 21, 2016 at 02:05:57PM +0200, Jan Medala wrote:
> As requested, big patch splitted into logical pieces for easier review.
> Improved style and fixed icc compiler issues.
> 

Thanks for the patch split. However, many of the patches don't have a commit
message describing (at a high level) what the goal of that patch is and how
it goes about implementing the changes to achieve that goal. Can you add a
paragraph or two of commit message to each patch in the set to help anyone
looking at the code understand what is happening in each patch?

Regards,
/Bruce

> Jan Medala (6):
>   ena: update of ENA communication layer
>   ena: add debug area and host information
>   ena: disable readless communication regarding to HW revision
>   ena: allocate coherent memory in node-aware way
>   ena: fix memory management issue
>   ena: fix for icc compiler
> 
>  drivers/net/ena/base/ena_com.c  | 254 +++---
>  drivers/net/ena/base/ena_com.h  |  82 +++--
>  drivers/net/ena/base/ena_defs/ena_admin_defs.h  | 107 +-
>  drivers/net/ena/base/ena_defs/ena_eth_io_defs.h | 436 
> ++--
>  drivers/net/ena/base/ena_defs/ena_gen_info.h|   4 +-
>  drivers/net/ena/base/ena_eth_com.c  |  32 +-
>  drivers/net/ena/base/ena_eth_com.h  |  14 +
>  drivers/net/ena/base/ena_plat_dpdk.h|  42 ++-
>  drivers/net/ena/ena_ethdev.c| 275 ++-
>  drivers/net/ena/ena_ethdev.h|  40 +++
>  10 files changed, 675 insertions(+), 611 deletions(-)
> 
> -- 
> 2.8.2
> 


[dpdk-dev] [PATCH v2] enic: negative array index write

2016-06-24 Thread Bruce Richardson
On Mon, Jun 20, 2016 at 12:27:46PM -0700, John Daley wrote:
> Negative array index write using variable pos as an index to array
> enic->fdir.nodes. Fixed by add array index check.
> 
> Fixes: fefed3d1e62c ("enic: new driver") Coverity ID 13270
> Signed-off-by: John Daley 
> ---
> 
Applied to dpdk-next-net/rel_16_07

/Bruce
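
A generic sketch of the fix pattern the quoted commit message describes
(hypothetical array, not the enic structures): validate an index that may come
back negative before writing through it.

#include <stdio.h>

#define NODES 64

static int
store_node(int *nodes, int pos, int val)
{
	if (pos < 0 || pos >= NODES)
		return -1;      /* reject out-of-range positions */
	nodes[pos] = val;
	return 0;
}

int
main(void)
{
	int nodes[NODES] = { 0 };

	printf("%d\n", store_node(nodes, -1, 42));  /* -1: rejected */
	printf("%d\n", store_node(nodes, 3, 42));   /*  0: stored */
	return 0;
}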


[dpdk-dev] [PATCH] scripts: add verbose option in build test help

2016-06-24 Thread Thomas Monjalon
The verbose option was available but not advertised.

Fixes: 6e38dfe21389 ("scripts: add verbose test build option")

Signed-off-by: Thomas Monjalon 
---
 scripts/test-build.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/test-build.sh b/scripts/test-build.sh
index 31d5583..7a9f623 100755
--- a/scripts/test-build.sh
+++ b/scripts/test-build.sh
@@ -63,6 +63,7 @@ print_help () {
-hthis help
-jX   use X parallel jobs in "make"
-sshort test with only first config without examples/doc
+   -vverbose build

config: defconfig[[~][+]option1[[~][+]option2...]]
Example: x86_64-native-linuxapp-gcc+debug~RXTX_CALLBACKS
-- 
2.7.0



[dpdk-dev] [PATCH v12 1/2] i40e: support floating VEB config

2016-06-24 Thread Ferruh Yigit
Hi Zhe,

On 6/24/2016 9:29 AM, Zhe Tao wrote:
> Add the new floating VEB related arguments option in the devarg.
> Using this parameter, all the applications can decide whether to use legacy
> VEB/VEPA or floating VEB.
> To enable this feature, the user should pass a devargs parameter to the
> EAL like "-w 84:00.0,enable_floating_veb=1", and the application will
> tell PMD whether to use the floating VEB feature or not.
Is providing the enable_floating_veb=1 devarg enough, or should the
application call a driver API to enable this feature? If so, documenting
what the application needs to do would help app developers.

> Once the floating VEB feature is enabled, all the VFs created by
> this PF device are connected to the floating VEB.
Technically there can be multiple floating VEBs, right? With this
param only one floating VEB is created and all VFs are connected to it. Is
there any use case for creating multiple VEBs with selective VFs connected
to them?

> 
> Also, the user can specify which VFs need to connect to this floating VEB using
> "floating_veb_list".
> For example, "-w 84:00.0,enable_floating_veb=1,floating_veb_list=1/3-4" means VF1,
> VF3 and VF4 connect to the floating VEB, while the other VFs connect to the
> legacy VEB. The "/" is used as the delimiter of the floating VEB list.
Is there a use case to change VF VEB connection on runtime?

> 
> All the VEB/VEPA concepts are not specific for FVL, they are defined in
> the 802.1Qbg spec.
> 
> But the floating VEB has two major differences:
> 1. it doesn't have an uplink connection, which means
> the traffic cannot go to the outside world.
> 2. it doesn't need to connect to the physical port, which means
> the floating VEB still works fine
> when the physical link is down.
> 
> Signed-off-by: Zhe Tao 
> ---

...

>  
> +static int vf_floating_veb_handler(__rte_unused const char *key,
What about floating_veb_list_handler, to be consistent with the argument name?

> +const char *floating_veb_value,
> +void *opaque)
> +{
> + int idx = 0;
> + unsigned count = 0;
> + char *end = NULL;
> + int min, max;
> + bool *vf_floating_veb = opaque;
> +
> + while (isblank(*floating_veb_value))
> + floating_veb_value++;
> +
> + /* Reset floating VEB configuration for VFs */
> + for (idx = 0; idx < I40E_MAX_VF; idx++)
> + vf_floating_veb[idx] = false;
> +
> + min = I40E_MAX_VF;
> + do {
> + while (isblank(*floating_veb_value))
> + floating_veb_value++;
> + if (*floating_veb_value == '\0')
> + return -1;
> + errno = 0;
> + idx = strtoul(floating_veb_value, &end, 10);
> + if (errno || end == NULL)
> + return -1;
> + while (isblank(*end))
> + end++;
> + if (*end == '-') {
> + min = idx;
> + } else if ((*end == '/') || (*end == '\0')) {
> + max = idx;
> + if (min == I40E_MAX_VF)
> + min = idx;
> + if (max >= I40E_MAX_VF)
> + max = I40E_MAX_VF - 1;
> + for (idx = min; idx <= max; idx++) {
> + vf_floating_veb[idx] = true;
> + count++;
> + }
> + min = I40E_MAX_VF;
> + } else {
> + return -1;
> + }
> + floating_veb_value = end + 1;
> + } while (*end != '\0');
> +
> + if (count == 0)
> + return -1;
> +
> + return 0;
> +}
> +
> +static void config_vf_floating_veb(struct rte_devargs *devargs,
According to the DPDK coding convention, the return type should be on a separate line

> +uint16_t floating,
floating itself is confusing, what about floating_veb?

> +bool *vf_floating_veb)
> +{
> + struct rte_kvargs *kvlist;
> + int i;
> + const char *floating_veb_list = ETH_I40E_FLOATING_VEB_LIST_ARG;
> +
> + if (floating == false)
> + return;
> + for (i = 0; i < I40E_MAX_VF; i++)
> + vf_floating_veb[i] = true;
> +
> + if (devargs == NULL)
> + return;
> +
> + kvlist = rte_kvargs_parse(devargs->args, NULL);
> + if (kvlist == NULL)
> + return;
> +
> + if (!rte_kvargs_count(kvlist, floating_veb_list)) {
> + rte_kvargs_free(kvlist);
> + return;
> + }
> +
> + if (rte_kvargs_process(kvlist, floating_veb_list,
> +vf_floating_veb_handler,
> +vf_floating_veb) < 0) {
A comment would be good here, saying that the floating VEB is disabled for all VFs

> + rte_kvargs_free(kvlist);
> + return;
> + }
> + rte_kvargs_free(kvlist);
> +
> + return;
not required

> +}
> +
> +static int 
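
For reference while reading the review above, a simplified, hypothetical
re-implementation of the "floating_veb_list" syntax under discussion ('/'
separates entries, '-' marks an inclusive range), kept separate from the i40e
code: "1/3-4" selects VF1, VF3 and VF4.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_VF 64       /* stand-in for I40E_MAX_VF */

static int
parse_veb_list(const char *list, bool vf[MAX_VF])
{
	char *end;
	long lo, hi;

	while (*list != '\0') {
		lo = strtol(list, &end, 10);
		hi = lo;
		if (*end == '-')
			hi = strtol(end + 1, &end, 10);
		if (lo < 0 || hi >= MAX_VF || lo > hi)
			return -1;
		for (; lo <= hi; lo++)
			vf[lo] = true;
		if (*end == '/')
			end++;
		else if (*end != '\0')
			return -1;
		list = end;
	}
	return 0;
}

int
main(void)
{
	bool vf[MAX_VF] = { false };
	int i;

	if (parse_veb_list("1/3-4", vf) == 0)
		for (i = 0; i < MAX_VF; i++)
			if (vf[i])
				printf("VF%d -> floating VEB\n", i);  /* VF1, VF3, VF4 */
	return 0;
}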

[dpdk-dev] [PATCH] vhost: fix Tx error counting of vhost PMD

2016-06-24 Thread Tetsuya Mukawa
On 2016/06/24 11:17, Yuanhan Liu wrote:
> On Fri, Jun 24, 2016 at 11:04:20AM +0900, Tetsuya Mukawa wrote:
>> According to 'rte_eth_stats' structure comments, 'imissed'
>> should represent RX error counting, but currently 'imissed' is
>> used to count TX error.
>> The patch replaces 'imissed' by 'oerrors'.
>>
>> Fixes: ee584e9710b9 ("vhost: add driver on top of the library")
>> Signed-off-by: Tetsuya Mukawa 
> 
> Acked-by: Yuanhan Liu 
> 
> (And sorry for the delay: I planned to look at the report from
> you at the earlier of this week; but somehow I forgot :(
> 
>   --yliu
> 

Hi Yuanhan,

No need to worry. Thanks for your checking!

Thanks,
Tetsuya


[dpdk-dev] backtracing from within the code

2016-06-24 Thread Thomas Monjalon
2016-06-24 09:25, Dumitrescu, Cristian:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Catalin Vasile
> > I'm trying to add a feature to DPDK and I'm having a hard time printing a
> > backtrace.
> > I tried using this[1] functions for printing, but it does not print more 
> > than one
> > function. Maybe it lacks the symbols it needs.
[...]
> It eventually calls rte_dump_stack() in file 
> lib/librte_eal/linuxapp/eal/eal_debug.c, which calls backtrace(), which is 
> probably what you are looking for. 

Example:
5: [build/app/testpmd(_start+0x29) [0x416f69]]
4: [/usr/lib/libc.so.6(__libc_start_main+0xf0) [0x7eff3b757610]]
3: [build/app/testpmd(main+0x2ff) [0x416b3f]]
2: [build/app/testpmd(init_port_config+0x88) [0x419a78]]
1: [build/lib/librte_eal.so.2.1(rte_dump_stack+0x18) [0x7eff3c126488]]

Please tell us if you have some cases where rte_dump_stack() does not work.
I do not remember what the constraints are to have it working.
Is your binary not stripped?
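
A minimal sketch of calling the helper mentioned above from application code
(the function name here is illustrative): rte_dump_stack() takes no arguments
and is declared in rte_debug.h. For the frames to resolve to function names the
binary should keep its symbols, i.e. not be stripped and typically be linked
with dynamic symbol export (-rdynamic/--export-dynamic).

#include <rte_debug.h>

static void
report_unexpected_state(void)
{
	/* Print the call chain leading to this point to the DPDK log. */
	rte_dump_stack();
}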


[dpdk-dev] [PATCH] ixgbe: use rte_mbuf_prefetch_part2 for cacheline1 access

2016-06-24 Thread Bruce Richardson
On Mon, Jun 20, 2016 at 11:19:19AM +0800, Jianbo Liu wrote:
> On 17 June 2016 at 22:06, Jerin Jacob  
> wrote:
> > made second cache line access behavior same as IA
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> >  drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c 
> > b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> > index 9c1d124..64a329e 100644
> > --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> > +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> > @@ -280,10 +280,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct 
> > rte_mbuf **rx_pkts,
> > vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);
> >
> > if (split_packet) {
> > -   
> > rte_prefetch_non_temporal(&rx_pkts[pos]->cacheline1);
> > -   rte_prefetch_non_temporal(&rx_pkts[pos + 
> > 1]->cacheline1);
> > -   rte_prefetch_non_temporal(&rx_pkts[pos + 
> > 2]->cacheline1);
> > -   rte_prefetch_non_temporal(&rx_pkts[pos + 
> > 3]->cacheline1);
> > +   rte_mbuf_prefetch_part2(rx_pkts[pos]);
> > +   rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> > +   rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> > +   rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
> > }
> >
> > /* D.1 pkt 3,4 convert format from desc to pktmbuf */
> > --
> > 2.5.5
> >
> 
> Reviewed-by: Jianbo Liu 

Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH 4/4] doc: add MTU update to feature matrix for enic

2016-06-24 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 10:22:49PM -0700, John Daley wrote:
> Signed-off-by: John Daley 
> ---

This patch should be squashed into the previous one.

/Bruce



[dpdk-dev] [PATCH 2/4] enic: set the max allowed MTU for the NIC

2016-06-24 Thread Bruce Richardson
On Fri, Jun 24, 2016 at 11:59:19AM +0100, Bruce Richardson wrote:
> On Thu, Jun 16, 2016 at 10:22:47PM -0700, John Daley wrote:
> > The max MTU is set to the max egress packet size allowed by the VIC
> > minus the size of a an IPv4 L2 header with .1Q (18 bytes).
> > 
> 
> I think a bit more detail might be needed here. For example:
> 
> * What was the MTU set to by default before this patch is applied? Was it just
> set to 1518 or something else?
> * What happens, if anything, if buffers bigger than the MTU size are sent 
> down?
This is obviously referring to buffers bigger than MTU on TX. There is also the
question of what happens if buffer sizes smaller than MTU are provided on RX.

> 
> /Bruce


[dpdk-dev] [PATCH 2/4] enic: set the max allowed MTU for the NIC

2016-06-24 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 10:22:47PM -0700, John Daley wrote:
> The max MTU is set to the max egress packet size allowed by the VIC
> minus the size of an IPv4 L2 header with .1Q (18 bytes).
> 

I think a bit more detail might be needed here. For example:

* What was the MTU set to by default before this patch is applied? Was it just
set to 1518 or something else?
* What happens, if anything, if buffers bigger than the MTU size are sent down?

/Bruce


[dpdk-dev] [PATCH 1/4] enic: enable NIC max packet size discovery

2016-06-24 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 10:22:46PM -0700, John Daley wrote:
> Pull in common VNIC code which enables querying for max egress
> packet size.
> 
With this patch applied is the user able to query the max packet size, or is it
just that the driver is able to do so for use by other functions?

/Bruce


[dpdk-dev] [PATCH] bnx2x: Don't reset buf_len in RX mbufs

2016-06-24 Thread Bruce Richardson
On Fri, Jun 17, 2016 at 06:32:06AM +, Harish Patil wrote:
> >
> >Fixes: 540a211084a7 ("bnx2x: driver core")
> >
> >Signed-off-by: Chas Williams <3chas3 at gmail.com>
> >---
> > drivers/net/bnx2x/bnx2x_rxtx.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> >diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c
> >b/drivers/net/bnx2x/bnx2x_rxtx.c
> >index 55d2bd7..c963194 100644
> >--- a/drivers/net/bnx2x/bnx2x_rxtx.c
> >+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
> >@@ -416,7 +416,6 @@ bnx2x_recv_pkts(void *p_rxq, struct rte_mbuf
> >**rx_pkts, uint16_t nb_pkts)
> > rx_mb->next = NULL;
> > rx_mb->pkt_len = rx_mb->data_len = len;
> > rx_mb->port = rxq->port_id;
> >-rx_mb->buf_len = len + pad;
> > rte_prefetch1(rte_pktmbuf_mtod(rx_mb, void *));
> > 
> > /*
> >-- 
> >2.5.5
> >
> >
> 
> Acked-by: Harish Patil 
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH v2] enic: scattered Rx

2016-06-24 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 12:19:05PM -0700, Nelson Escobar wrote:
> For performance reasons, this patch uses 2 VIC RQs per RQ presented to
> DPDK.
> 
> The VIC requires that each descriptor be marked as either a start of
> packet (SOP) descriptor or a non-SOP descriptor.  A one RQ solution
> requires skipping descriptors when receiving small packets and results
> in bad performance when receiving many small packets.
> 
> The 2 RQ solution makes use of the VIC feature that allows a receive
> on primary queue to 'spill over' into another queue if the receive is
> too large to fit in the buffer assigned to the descriptor on the
> primary queue.  This means that there is no skipping of descriptors
> when receiving small packets and results in much better performance.
> 
> Signed-off-by: Nelson Escobar 
> Reviewed-by: John Daley 
> ---
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] NIC support for HPE Ethernet 10Gb 2-port 560FLR-SFP+ Adapter

2016-06-24 Thread Prashant Upadhyaya
Hi,

One of my customers intends to buy HPE Ethernet 10Gb 2-port
560FLR-SFP+ Adapter
(http://www8.hp.com/h20195/v2/getpdf.aspx/c04111435.pdf?ver=8) for
running a DPDK based app.

I have never tested my app with the above NIC (always used X520 to test my app)
If someone has already tried with this NIC, can you please confirm so
that I can give a go ahead for this NIC.

Thanks in advance

Best Regards
-Prashant


[dpdk-dev] [PATCH] vhost: fix Tx error counting of vhost PMD

2016-06-24 Thread Tetsuya Mukawa
According to 'rte_eth_stats' structure comments, 'imissed'
should represent RX error counting, but currently 'imissed' is
used to count TX error.
The patch replaces 'imissed' by 'oerrors'.

Fixes: ee584e9710b9 ("vhost: add driver on top of the library")
Signed-off-by: Tetsuya Mukawa 
---
 drivers/net/vhost/rte_eth_vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index c5bbb87..3b50946 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -594,7 +594,7 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats 
*stats)

stats->ipackets = rx_total;
stats->opackets = tx_total;
-   stats->imissed = tx_missed_total;
+   stats->oerrors = tx_missed_total;
stats->ibytes = rx_total_bytes;
stats->obytes = tx_total_bytes;
 }
-- 
2.7.4



[dpdk-dev] [PATCH v4] e1000: configure VLAN TPID

2016-06-24 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 01:59:46PM +, Zhang, Helin wrote:
> 
> 
> > -Original Message-
> > From: Xing, Beilei
> > Sent: Thursday, June 16, 2016 9:36 PM
> > To: Zhang, Helin 
> > Cc: dev at dpdk.org; Xing, Beilei 
> > Subject: [PATCH v4] e1000: configure VLAN TPID
> > 
> > This patch enables configuring the outer TPID for double VLAN.
> > Note that all other TPID values are read only.
> > 
> > Signed-off-by: Beilei Xing 
> Acked-by: Helin Zhang 
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH v6 25/25] mlx5: resurrect Rx scatter support

2016-06-24 Thread Nelio Laranjeiro
This commit brings back Rx scatter and related support by the MTU update
function. The maximum number of segments per packet is not a fixed value
anymore (previously MLX5_PMD_SGE_WR_N, set to 4 by default) as it caused
performance issues when fewer segments were actually needed as well as
limitations on the maximum packet size that could be received with the
default mbuf size (supporting at most 8576 bytes).

These limitations are now lifted as the number of SGEs is derived from the
MTU (which implies MRU) at queue initialization and during MTU update.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_ethdev.c |  90 ++
 drivers/net/mlx5/mlx5_rxq.c|  77 ++-
 drivers/net/mlx5/mlx5_rxtx.c   | 139 -
 drivers/net/mlx5/mlx5_rxtx.h   |   1 +
 4 files changed, 225 insertions(+), 82 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 698a50e..72f0826 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -725,6 +725,9 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
unsigned int i;
uint16_t (*rx_func)(void *, struct rte_mbuf **, uint16_t) =
mlx5_rx_burst;
+   unsigned int max_frame_len;
+   int rehash;
+   int restart = priv->started;

if (mlx5_is_secondary())
return -E_RTE_SECONDARY;
@@ -738,7 +741,6 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
goto out;
} else
DEBUG("adapter port %u MTU set to %u", priv->port, mtu);
-   priv->mtu = mtu;
/* Temporarily replace RX handler with a fake one, assuming it has not
 * been copied elsewhere. */
dev->rx_pkt_burst = removed_rx_burst;
@@ -746,28 +748,94 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
 * removed_rx_burst() instead. */
rte_wmb();
usleep(1000);
+   /* MTU does not include header and CRC. */
+   max_frame_len = ETHER_HDR_LEN + mtu + ETHER_CRC_LEN;
+   /* Check if at least one queue is going to need a SGE update. */
+   for (i = 0; i != priv->rxqs_n; ++i) {
+   struct rxq *rxq = (*priv->rxqs)[i];
+   unsigned int mb_len;
+   unsigned int size = RTE_PKTMBUF_HEADROOM + max_frame_len;
+   unsigned int sges_n;
+
+   if (rxq == NULL)
+   continue;
+   mb_len = rte_pktmbuf_data_room_size(rxq->mp);
+   assert(mb_len >= RTE_PKTMBUF_HEADROOM);
+   /*
+* Determine the number of SGEs needed for a full packet
+* and round it to the next power of two.
+*/
+   sges_n = log2above((size / mb_len) + !!(size % mb_len));
+   if (sges_n != rxq->sges_n)
+   break;
+   }
+   /*
+* If all queues have the right number of SGEs, a simple rehash
+* of their buffers is enough, otherwise SGE information can only
+* be updated in a queue by recreating it. All resources that depend
+* on queues (flows, indirection tables) must be recreated as well in
+* that case.
+*/
+   rehash = (i == priv->rxqs_n);
+   if (!rehash) {
+   /* Clean up everything as with mlx5_dev_stop(). */
+   priv_special_flow_disable_all(priv);
+   priv_mac_addrs_disable(priv);
+   priv_destroy_hash_rxqs(priv);
+   priv_fdir_disable(priv);
+   priv_dev_interrupt_handler_uninstall(priv, dev);
+   }
+recover:
/* Reconfigure each RX queue. */
for (i = 0; (i != priv->rxqs_n); ++i) {
struct rxq *rxq = (*priv->rxqs)[i];
-   unsigned int mb_len;
-   unsigned int max_frame_len;
+   struct rxq_ctrl *rxq_ctrl =
+   container_of(rxq, struct rxq_ctrl, rxq);
int sp;
+   unsigned int mb_len;
+   unsigned int tmp;

if (rxq == NULL)
continue;
-   /* Calculate new maximum frame length according to MTU and
-* toggle scattered support (sp) if necessary. */
-   max_frame_len = (priv->mtu + ETHER_HDR_LEN +
-(ETHER_MAX_VLAN_FRAME_LEN - ETHER_MAX_LEN));
mb_len = rte_pktmbuf_data_room_size(rxq->mp);
assert(mb_len >= RTE_PKTMBUF_HEADROOM);
+   /* Toggle scattered support (sp) if necessary. */
sp = (max_frame_len > (mb_len - RTE_PKTMBUF_HEADROOM));
-   if (sp) {
-   ERROR("%p: RX scatter is not supported", (void *)dev);
-   ret = ENOTSUP;
-   goto out;
+   /* Provide new values to 
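
A worked example (assumed MTU and mbuf sizes, not values from the patch) of the
SGE computation introduced above; sges_n is kept as a log2, so the per-packet
segment count is always a power of two.

#include <stdio.h>

static unsigned int
log2above(unsigned int v)   /* smallest n such that (1 << n) >= v */
{
	unsigned int n = 0;

	while ((1u << n) < v)
		n++;
	return n;
}

int
main(void)
{
	unsigned int mtu = 9000;                    /* assumed MTU */
	unsigned int headroom = 128;                /* RTE_PKTMBUF_HEADROOM */
	unsigned int mb_len = 2176;                 /* assumed mbuf data room */
	unsigned int max_frame_len = 14 + mtu + 4;  /* Ethernet header + MTU + CRC */
	unsigned int size = headroom + max_frame_len;
	unsigned int sges_n = log2above((size / mb_len) + !!(size % mb_len));

	/* 9146 bytes over 2176-byte buffers -> 5 segments, rounded up to 8 */
	printf("sges_n=%u (%u SGEs per packet)\n", sges_n, 1u << sges_n);
	return 0;
}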

[dpdk-dev] [PATCH v6 24/25] mlx5: make Rx queue reinitialization safer

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

The primary purpose of rxq_rehash() function is to stop and restart
reception on a queue after re-posting buffers. This may fail if the array
that temporarily stores existing buffers for reuse cannot be allocated.

Update rxq_rehash() to work on the target queue directly (not through a
template copy) and avoid this allocation.

rxq_alloc_elts() is modified accordingly to take buffers from an existing
queue directly and update their refcount.

Unlike rxq_rehash(), rxq_setup() must work on a temporary structure but
should not allocate new mbufs from the pool while reinitializing an
existing queue. This is achieved by using the refcount-aware
rxq_alloc_elts() before overwriting queue data.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Vasily Philipov 
---
 drivers/net/mlx5/mlx5_rxq.c | 83 ++---
 1 file changed, 41 insertions(+), 42 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index fbf14fa..b2ddd0d 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -642,7 +642,7 @@ priv_rehash_flows(struct priv *priv)
  */
 static int
 rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int elts_n,
-  struct rte_mbuf **pool)
+  struct rte_mbuf *(*pool)[])
 {
unsigned int i;
int ret = 0;
@@ -654,9 +654,10 @@ rxq_alloc_elts(struct rxq_ctrl *rxq_ctrl, unsigned int 
elts_n,
&(*rxq_ctrl->rxq.wqes)[i];

if (pool != NULL) {
-   buf = *(pool++);
+   buf = (*pool)[i];
assert(buf != NULL);
rte_pktmbuf_reset(buf);
+   rte_pktmbuf_refcnt_update(buf, 1);
} else
buf = rte_pktmbuf_alloc(rxq_ctrl->rxq.mp);
if (buf == NULL) {
@@ -781,7 +782,7 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 }

 /**
- * Reconfigure a RX queue with new parameters.
+ * Reconfigure RX queue buffers.
  *
  * rxq_rehash() does not allocate mbufs, which, if not done from the right
  * thread (such as a control thread), may corrupt the pool.
@@ -798,67 +799,48 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 int
 rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl *rxq_ctrl)
 {
-   struct rxq_ctrl tmpl = *rxq_ctrl;
-   unsigned int mbuf_n;
-   unsigned int desc_n;
-   struct rte_mbuf **pool;
-   unsigned int i, k;
+   unsigned int elts_n = rxq_ctrl->rxq.elts_n;
+   unsigned int i;
struct ibv_exp_wq_attr mod;
int err;

DEBUG("%p: rehashing queue %p", (void *)dev, (void *)rxq_ctrl);
-   /* Number of descriptors and mbufs currently allocated. */
-   desc_n = tmpl.rxq.elts_n;
-   mbuf_n = desc_n;
/* From now on, any failure will render the queue unusable.
 * Reinitialize WQ. */
mod = (struct ibv_exp_wq_attr){
.attr_mask = IBV_EXP_WQ_ATTR_STATE,
.wq_state = IBV_EXP_WQS_RESET,
};
-   err = ibv_exp_modify_wq(tmpl.wq, &mod);
+   err = ibv_exp_modify_wq(rxq_ctrl->wq, &mod);
if (err) {
ERROR("%p: cannot reset WQ: %s", (void *)dev, strerror(err));
assert(err > 0);
return err;
}
-   /* Allocate pool. */
-   pool = rte_malloc(__func__, (mbuf_n * sizeof(*pool)), 0);
-   if (pool == NULL) {
-   ERROR("%p: cannot allocate memory", (void *)dev);
-   return ENOBUFS;
-   }
/* Snatch mbufs from original queue. */
-   k = 0;
-   for (i = 0; (i != desc_n); ++i)
-   pool[k++] = (*rxq_ctrl->rxq.elts)[i];
-   assert(k == mbuf_n);
-   rte_free(pool);
+   claim_zero(rxq_alloc_elts(rxq_ctrl, elts_n, rxq_ctrl->rxq.elts));
+   for (i = 0; i != elts_n; ++i) {
+   struct rte_mbuf *buf = (*rxq_ctrl->rxq.elts)[i];
+
+   assert(rte_mbuf_refcnt_read(buf) == 2);
+   rte_pktmbuf_free_seg(buf);
+   }
/* Change queue state to ready. */
mod = (struct ibv_exp_wq_attr){
.attr_mask = IBV_EXP_WQ_ATTR_STATE,
.wq_state = IBV_EXP_WQS_RDY,
};
-   err = ibv_exp_modify_wq(tmpl.wq, &mod);
+   err = ibv_exp_modify_wq(rxq_ctrl->wq, &mod);
if (err) {
ERROR("%p: WQ state to IBV_EXP_WQS_RDY failed: %s",
  (void *)dev, strerror(err));
goto error;
}
-   /* Post SGEs. */
-   err = rxq_alloc_elts(, desc_n, pool);
-   if (err) {
-   ERROR("%p: cannot reallocate WRs, aborting", (void *)dev);
-   rte_free(pool);
-   assert(err > 0);
-   return err;
-   }
/* Update doorbell counter. */
-   rxq_ctrl->rxq.rq_ci = desc_n;
+   rxq_ctrl->rxq.rq_ci = elts_n;
rte_wmb();
*rxq_ctrl->rxq.rq_db = htonl(rxq_ctrl->rxq.rq_ci);
 

[dpdk-dev] [PATCH v6 23/25] mlx5: remove redundant Rx queue initialization code

2016-06-24 Thread Nelio Laranjeiro
From: Adrien Mazarguil 

Toggling RX checksum offloads is already done at initialization time. This
code does not belong in rxq_rehash().

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_rxq.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index b2f8f9a..fbf14fa 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -798,7 +798,6 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 int
 rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl *rxq_ctrl)
 {
-   struct priv *priv = rxq_ctrl->priv;
struct rxq_ctrl tmpl = *rxq_ctrl;
unsigned int mbuf_n;
unsigned int desc_n;
@@ -811,16 +810,6 @@ rxq_rehash(struct rte_eth_dev *dev, struct rxq_ctrl 
*rxq_ctrl)
/* Number of descriptors and mbufs currently allocated. */
desc_n = tmpl.rxq.elts_n;
mbuf_n = desc_n;
-   /* Toggle RX checksum offload if hardware supports it. */
-   if (priv->hw_csum) {
-   tmpl.rxq.csum = !!dev->data->dev_conf.rxmode.hw_ip_checksum;
-   rxq_ctrl->rxq.csum = tmpl.rxq.csum;
-   }
-   if (priv->hw_csum_l2tun) {
-   tmpl.rxq.csum_l2tun =
-   !!dev->data->dev_conf.rxmode.hw_ip_checksum;
-   rxq_ctrl->rxq.csum_l2tun = tmpl.rxq.csum_l2tun;
-   }
/* From now on, any failure will render the queue unusable.
 * Reinitialize WQ. */
mod = (struct ibv_exp_wq_attr){
-- 
2.1.4


