[dpdk-dev] [PATCH v2] doc: arm64: document DPDK application profiling methods

2016-10-26 Thread Thomas Monjalon
2016-10-05 14:13, Jerin Jacob:
> Signed-off-by: Jerin Jacob 
> ---
> v2:
> -Addressed ARM64 specific review comments(Suggested by Thomas)

I feel more comments could be made, especially about formatting.
You are adding an Introduction chapter without any other section
at the same level.
Some technical terms should be enclosed in backquotes.

Please John, could you guide Jerin or provide an updated version?
Thanks



[dpdk-dev] [PATCH] doc: add note on primary process dependency

2016-10-26 Thread Thomas Monjalon
2016-10-13 16:09, Reshma Pattan:
> The note i.e. "The dpdk-pdump tool can only be used in
> conjunction with a primary process which has the packet
> capture framework initialized already" is added to
> doc/guides/sample_app_ug/pdump.rst to facilitate
> easy understanding on the usage of the tool.
> 
> Suggested-by: Jianfeng Tan 
> Signed-off-by: Reshma Pattan 

Applied, thanks


[dpdk-dev] [PATCH] pdump: revert PCI device name conversion

2016-10-26 Thread Thomas Monjalon
> > Earlier, the ethdev library created device names in the "bus:device.func"
> > format, so the pdump library implemented its own conversion from the
> > user-supplied "domain:bus:device.func" format to "bus:device.func"
> > in order to find the port id by device name through ethdev library calls.
> > Now, after the ethdev and eal rework
> > http://dpdk.org/dev/patchwork/patch/15855/,
> > the device names are created in the format "domain:bus:device.func", so
> > the pdump library conversion is not needed any more, hence the
> > corresponding code is removed.
> > 
> > Signed-off-by: Reshma Pattan 
> 
> Acked-by: Fan Zhang  

Applied, thanks
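
For readers unfamiliar with the two name formats, the conversion that this patch removes can be pictured as dropping the leading domain field. The following is a minimal sketch in C, not the actual pdump code; the function name strip_pci_domain and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the removed pdump code): convert
 * "domain:bus:device.func" (e.g. "0000:02:00.0") to "bus:device.func".
 * Names already in the short form are copied unchanged.
 * Returns 0 on success, -1 if the name has no ':' or out is too small.
 */
int strip_pci_domain(const char *name, char *out, size_t out_len)
{
    const char *first;
    const char *second;
    const char *start;

    assert(name != NULL && out != NULL);

    first = strchr(name, ':');
    if (first == NULL)
        return -1;              /* no ':' at all: not a PCI-style name */

    /* A second ':' means a domain prefix is present; skip past it. */
    second = strchr(first + 1, ':');
    start = (second != NULL) ? first + 1 : name;

    if (strlen(start) + 1 > out_len)
        return -1;              /* output buffer too small */
    strcpy(out, start);
    return 0;
}
```

With the ethdev names now carrying the domain, a lookup by name would use the user-supplied string directly, which is why such a helper is no longer needed.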


[dpdk-dev] [PATCH v4 18/32] net/qede: add missing 100G link speed capability

2016-10-26 Thread Harish Patil

>2016-10-18 21:11, Rasesh Mody:
>> From: Harish Patil 
>> 
>> This patch fixes the missing 100G link speed advertisement
>> when the 100G support was initially added.
>> 
>> Fixes: 2af14ca79c0a ("net/qede: support 100G")
>> 
>> Signed-off-by: Harish Patil 
>[...]
>>  [Features]
>> +Speed capabilities   = Y
>
>This feature should be checked only when it is fully implemented,
>i.e. when you return the real capabilities of the device.
>
>> --- a/drivers/net/qede/qede_ethdev.c
>> +++ b/drivers/net/qede/qede_ethdev.c
>> @@ -599,7 +599,8 @@ qede_dev_info_get(struct rte_eth_dev *eth_dev,
>>   DEV_TX_OFFLOAD_UDP_CKSUM |
>>   DEV_TX_OFFLOAD_TCP_CKSUM);
>>  
>> -dev_info->speed_capa = ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G;
>> +dev_info->speed_capa = ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G |
>> +   ETH_LINK_SPEED_100G;
>>  }
>
>It is only faking the capabilities at driver-level.
>You should check if the underlying device is able to achieve 100G
>before advertising this flag to the application.
>
>I suggest to update this patch to remove the doc update.
>The contract is to fill it only when the code is fixed.
>By the way, we must call on all other drivers to properly implement
>this feature.
>

Hi Thomas,
It's not really faking. The same driver supports all three link speeds.
The required support for 100G was already present in the 16.07 inbox
driver.
We had just missed advertising the 100G link speed via
dev_info->speed_capa.
Hence it is - Fixes: 2af14ca79c0a ("net/qede: support 100G").
Hope it is okay.


Thanks,
Harish



[dpdk-dev] [PATCH v4 07/32] net/qede: fix 32 bit compilation

2016-10-26 Thread Mody, Rasesh
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 26, 2016 9:54 AM
> 
> 2016-10-18 21:11, Rasesh Mody:
> > Fix 32 bit compilation for gcc version 4.3.4.
> >
> > Fixes: ec94dbc57362 ("qede: add base driver")
> >
> > Signed-off-by: Rasesh Mody 
> [...]
> >  ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
> > +ifeq ($(shell gcc -Wno-unused-but-set-variable -Werror -E - < /dev/null > /dev/null 2>&1; echo $$?),0)
> >  CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable
> > +endif
> >  CFLAGS_BASE_DRIVER += -Wno-missing-declarations
> > +ifeq ($(shell gcc -Wno-maybe-uninitialized -Werror -E - < /dev/null > /dev/null 2>&1; echo $$?),0)
> >  CFLAGS_BASE_DRIVER += -Wno-maybe-uninitialized
> > +endif
> >  CFLAGS_BASE_DRIVER += -Wno-strict-prototypes
> >  ifeq ($(shell test $(GCC_VERSION) -ge 60 && echo 1), 1)
> >  CFLAGS_BASE_DRIVER += -Wno-shift-negative-value
> 
> What the hell are you doing here?

In one of our compilation tests on i586, we use gcc version 4.3.4. This
version of gcc gives us the following errors:

cc1: error: unrecognized command line option "-Wno-unused-but-set-variable"
cc1: error: unrecognized command line option "-Wno-maybe-uninitialized"

-Wno-unused-but-set-variable option was added only in gcc version 4.6.0
-Wno-maybe-uninitialized option was added only in gcc version 4.7.0

All the above change does is check whether the -Wno-unused-but-set-variable
and -Wno-maybe-uninitialized options are available with gcc, and only then
include them for compilation.

> 1/ You should better fix "unused-but-set-variable" errors
> 2/ It won't work when cross-compiling because you do not use $(CC)
>   in $(shell gcc

We tested on gcc version 6.2.0 on x86_64 without applying this patch. Errors 
related to "unused-but-set-variable" option were not seen. The only errors we 
saw are as noted above due to an older version of gcc.
We do use $(shell gcc; however, it is used under ifeq
($(CONFIG_RTE_TOOLCHAIN_GCC),y), so I believe it should work when
cross-compiling. For example, in one of our compilation tests with clang
version 3.8.0, with this patch applied, we did not see any errors. Please let
us know if you see otherwise.

However, I do agree it is better to use $(CC). We could change that with a 
follow on patch.

Thanks!
-Rasesh

> 
> I really do not want to look at the qede patches.
> But each time my eyes stop on one of them, I'm struggling.


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Vincent Jardin


On 26 October 2016 at 2:11:26 PM, "Van Haaren, Harry" wrote:

>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
>>
>> So far, I have received constructive feedback from Intel, NXP and Linaro
>> folks.
>> Let me know if anyone else is interested in contributing to the definition
>> of eventdev.
>>
>> If there are no major issues in the proposed spec, then Cavium would like to
>> work on implementing and up-streaming the common code (lib/librte_eventdev/)
>> and an associated HW driver. (Requested minor changes of v2 will be addressed
>> in the next version.)
>
> Hi All,
>
> I will propose a minor change to the rte_event struct, allowing some bits
> to be implementation specific. Currently the rte_event struct has no space
> to allow an implementation to store any metadata about the event. For software
> performance it would be really helpful if there are some bits available for
> the implementation to keep some flags about each event.
>
> I suggest to rework the struct as below which opens 6 bits that were 
> otherwise wasted, and define them as implementation specific. By 
> implementation specific it is understood that the implementation can 
> overwrite any information stored in those bits, and the application must 
> not expect the data to remain after the event is scheduled.
>
> OLD:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t queue_id:8;
>   uint8_t  sched_type; /* Note only 2 bits of 8 are required */
>
> NEW:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t sched_type:2; /* reduced size: 2 bits is enough for the
>                             enqueue types Ordered, Atomic, Parallel */
>   uint32_t implementation:6; /* available for implementation specific
>                                 metadata */
>   uint8_t queue_id; /* still 8 bits as before */

Bitfields are efficient on Octeon. What about the other CPUs you have in
mind? x86 is not as efficient.


>
>
> Thoughts? -Harry
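
For comparison with the bitfield layout, the same 24/2/6 split can be expressed with explicit shifts and masks on a 32-bit word. This is only a sketch: the ev_* names and the bit positions are assumptions, not part of the proposed API, and no claim is made here about which form is faster on a given CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from the proposal; positions are an assumption. */
#define EV_FLOW_BITS   24u
#define EV_SCHED_BITS  2u
#define EV_IMPL_BITS   6u

#define EV_FLOW_MASK   ((1u << EV_FLOW_BITS) - 1u)
#define EV_SCHED_MASK  ((1u << EV_SCHED_BITS) - 1u)
#define EV_IMPL_MASK   ((1u << EV_IMPL_BITS) - 1u)

#define EV_SCHED_SHIFT EV_FLOW_BITS
#define EV_IMPL_SHIFT  (EV_FLOW_BITS + EV_SCHED_BITS)

/* Pack the three fields into one 32-bit word. */
uint32_t ev_pack(uint32_t flow_id, uint32_t sched_type, uint32_t impl)
{
    assert(flow_id <= EV_FLOW_MASK);
    assert(sched_type <= EV_SCHED_MASK);
    assert(impl <= EV_IMPL_MASK);
    return flow_id | (sched_type << EV_SCHED_SHIFT) | (impl << EV_IMPL_SHIFT);
}

/* Extract each field with one shift and one mask. */
uint32_t ev_flow_id(uint32_t w)    { return w & EV_FLOW_MASK; }
uint32_t ev_sched_type(uint32_t w) { return (w >> EV_SCHED_SHIFT) & EV_SCHED_MASK; }
uint32_t ev_impl(uint32_t w)       { return (w >> EV_IMPL_SHIFT) & EV_IMPL_MASK; }
```

Measuring both forms on the target CPUs would be the way to settle the efficiency question raised above.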




[dpdk-dev] rte_eth_dev_config_restore problem

2016-10-26 Thread Igor Ryzhov
Hello everyone,

I think there is a bug in rte_eth_dev_config_restore function.
During restoration of MAC address configuration, all MAC addresses are
restored with mac_addr_add function, but as I think MAC address with index
0 shouldn't be restored in such way, because it is a default MAC address.

This problem can be solved in two ways:
1. Just call mac_addr_set instead of mac_addr_add for index 0.
2. Don't restore address with index 0 at all and let driver do it.

I think the second option is the right one, because:
1. Some drivers don't support mac_addr_set at all, it means that we must
not touch it.
2. Some drivers already support restoration of default MAC address. For
example, look at the ixgbe "ixgbe_init_rx_addrs_generic" function. It
restores default MAC address if it was overridden by user. All that we have
to do is to rewrite hw->mac.addr in mac_addr_set function.

Best regards,
Igor
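
The second option can be sketched as a restore loop that starts at index 1 and leaves index 0 to the driver. Everything below is a stand-in written for illustration (sketch_port, sketch_mac_addr_add and the table size are assumptions), not the ethdev code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_MACS 4   /* hypothetical table size */

struct sketch_mac {
    uint8_t bytes[6];
};

/* Hypothetical per-port state: index 0 holds the default MAC address. */
struct sketch_port {
    struct sketch_mac addrs[SKETCH_MAX_MACS];
    unsigned int add_calls[SKETCH_MAX_MACS]; /* records calls to the stand-in add op */
};

/* An all-zero entry is treated as an unused slot. */
int sketch_mac_is_zero(const struct sketch_mac *m)
{
    static const struct sketch_mac zero = { { 0 } };
    return memcmp(m, &zero, sizeof(zero)) == 0;
}

/* Stand-in for the driver's mac_addr_add callback. */
void sketch_mac_addr_add(struct sketch_port *p, unsigned int idx)
{
    p->add_calls[idx]++;
}

/* Restore secondary addresses only; index 0 is left to the driver. */
unsigned int sketch_restore_macs(struct sketch_port *p)
{
    unsigned int i, restored = 0;

    assert(p != NULL);
    for (i = 1; i < SKETCH_MAX_MACS; i++) {
        if (sketch_mac_is_zero(&p->addrs[i]))
            continue;           /* skip unused slots */
        sketch_mac_addr_add(p, i);
        restored++;
    }
    return restored;
}
```

Starting the loop at 1 means drivers without mac_addr_set support are never asked to touch the default address.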


[dpdk-dev] [PATCH] doc/guides: add more info re VT-d/iommu settings for QAT

2016-10-26 Thread Thomas Monjalon
2016-10-26 16:50, Trahe, Fiona:
>  Hi Thomas,
> 
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2016-10-26 16:20, Fiona Trahe:
> > > add more info re VT-d/iommu settings for QAT remove limitation re
> > > performance tuning
> > 
> > > Sorry, I do not understand what "re" means.
> 
> "re" is commonly used in the English language and means "in reference to" or 
> "about"
> but I'm happy to change to "about" which is the more conventional word.
> There's an interesting explanation of the evolution of the term here:
> https://en.oxforddictionaries.com/definition/re
> Usage
> The traditional view is that re should be used in headings and references, as 
> in Re: Ainsworth versus Chambers, but not as a normal word meaning 'about', 
> as in I saw the deputy head re the incident. However, the evidence suggests 
> that re is now widely used in the second context in official and 
> semi-official contexts, and is now generally accepted. It is hard to see any 
> compelling logical argument against using it as an ordinary English word in 
> this way

Thanks for the detailed explanations, and sorry again for my ignorance :)


[dpdk-dev] [PATCH v4 07/32] net/qede: fix 32 bit compilation

2016-10-26 Thread Thomas Monjalon
2016-10-18 21:11, Rasesh Mody:
> Fix 32 bit compilation for gcc version 4.3.4.
> 
> Fixes: ec94dbc57362 ("qede: add base driver")
> 
> Signed-off-by: Rasesh Mody 
[...]
>  ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
> +ifeq ($(shell gcc -Wno-unused-but-set-variable -Werror -E - < /dev/null > /dev/null 2>&1; echo $$?),0)
>  CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable
> +endif
>  CFLAGS_BASE_DRIVER += -Wno-missing-declarations
> +ifeq ($(shell gcc -Wno-maybe-uninitialized -Werror -E - < /dev/null > /dev/null 2>&1; echo $$?),0)
>  CFLAGS_BASE_DRIVER += -Wno-maybe-uninitialized
> +endif
>  CFLAGS_BASE_DRIVER += -Wno-strict-prototypes
>  ifeq ($(shell test $(GCC_VERSION) -ge 60 && echo 1), 1)
>  CFLAGS_BASE_DRIVER += -Wno-shift-negative-value

What the hell are you doing here?
1/ You should better fix "unused-but-set-variable" errors
2/ It won't work when cross-compiling because you do not use $(CC)
in $(shell gcc

I really do not want to look at the qede patches.
But each time my eyes stop on one of them, I'm struggling.
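
To illustrate point 1/: the warning fires when a variable is assigned but its value is never read, and it can usually be fixed in the source rather than silenced with a compiler flag. A minimal sketch with hypothetical names (sketch_read_status stands in for whatever call produced the unused value); it is not taken from the qede base driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a register read; hypothetical, for illustration only. */
uint32_t sketch_read_status(const uint32_t *reg)
{
    return *reg;
}

/*
 * The pattern that triggers -Wunused-but-set-variable looks like this:
 *
 *     uint32_t status;
 *     status = sketch_read_status(reg);   // assigned, never read
 *
 * Fix A: actually use the value that was read.
 */
int sketch_link_is_up(const uint32_t *reg)
{
    uint32_t status;

    assert(reg != NULL);
    status = sketch_read_status(reg);
    return (status & 0x1u) != 0;
}

/*
 * Fix B: if the call is needed only for its side effect, drop the
 * variable and discard the result explicitly.
 */
void sketch_flush(const uint32_t *reg)
{
    assert(reg != NULL);
    (void)sketch_read_status(reg);
}
```

Either fix removes the need for the -Wno-unused-but-set-variable probe in the Makefile.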



[dpdk-dev] [PATCH v2] eal: fix libabi macro for device generalization patches

2016-10-26 Thread Shreyansh Jain
On Wednesday 26 October 2016 06:30 PM, Shreyansh Jain wrote:
> rte_device/driver generalization patches [1] were merged without a change
> in the LIBABIVER macro. This patch bumps the macro of affected libs.
>
> Also, deprecation notice from 16.07 has been removed and release notes for
> 16.11 added.
>
> Signed-off-by: Shreyansh Jain 
> --
> v2:
>  - Mark bumped libraries in release_16_11.rst file
>  - change code symbol names from text to code layout
>
> ---
>  doc/guides/rel_notes/deprecation.rst   | 12 
>  doc/guides/rel_notes/release_16_11.rst | 21 +++--
>  lib/librte_cryptodev/Makefile  |  2 +-
>  lib/librte_eal/bsdapp/eal/Makefile |  2 +-
>  lib/librte_eal/linuxapp/eal/Makefile   |  2 +-
>  lib/librte_ether/Makefile  |  2 +-
>  6 files changed, 23 insertions(+), 18 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index d5c1490..884a231 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -18,18 +18,6 @@ Deprecation Notices
>``nb_seg_max`` and ``nb_mtu_seg_max`` providing information about number of
>segments limit to be transmitted by device for TSO/non-TSO packets.
>
> -* The ethdev hotplug API is going to be moved to EAL with a notification
> -  mechanism added to crypto and ethdev libraries so that hotplug is now
> -  available to both of them. This API will be stripped of the device 
> arguments
> -  so that it only cares about hotplugging.
> -
> -* Structures embodying pci and vdev devices are going to be reworked to
> -  integrate new common rte_device / rte_driver objects (see
> -  http://dpdk.org/ml/archives/dev/2016-January/031390.html).
> -  ethdev and crypto libraries will then only handle those objects so that 
> they
> -  do not need to care about the kind of devices that are being used, making 
> it
> -  easier to add new buses later.
> -
>  * ABI changes are planned for 16.11 in the ``rte_mbuf`` structure: some 
> fields
>may be reordered to facilitate the writing of ``data_off``, ``refcnt``, and
>``nb_segs`` in one operation, because some platforms have an overhead if 
> the
> diff --git a/doc/guides/rel_notes/release_16_11.rst 
> b/doc/guides/rel_notes/release_16_11.rst
> index 26cdd62..2d5636c 100644
> --- a/doc/guides/rel_notes/release_16_11.rst
> +++ b/doc/guides/rel_notes/release_16_11.rst
> @@ -149,6 +149,23 @@ Resolved Issues
>  EAL
>  ~~~
>
> +* **Improved device/driver hierarchy and generalized hotplugging**
> +
> +  Device and driver relationship has been restructured by introducing generic
> +  classes. This paves way for having PCI, VDEV and other device types as
> +  just instantiated objects rather than classes in themselves. Hotplugging 
> too
> +  has been generalized into EAL so that ethernet or crypto devices can use 
> the
> +  common infrastructure.
> +
> +  * removed ``pmd_type`` as way of segregation of devices
> +  * added ``rte_device`` class and all PCI and VDEV devices inherit from it
> +  * renamed devinit/devuninit handlers to probe/remove to make it more
> +semantically correct with respect to device<=>driver relationship
> +  * moved hotplugging support to EAL
> +  * helpers and support macros have been renamed to make them more synonymous
> +with their device types
> +(e.g. ``PMD_REGISTER_DRIVER`` => ``DRIVER_REGISTER_PCI``)
> +
>
>  Drivers
>  ~~~
> @@ -232,11 +249,11 @@ The libraries prepended with a plus sign were 
> incremented in this version.
>
>  .. code-block:: diff
>
> - libethdev.so.4
> +   + libethdev.so.4

Just noticed:
Should the '4' here reflect the current LIBABIVER number?
If so, I will send this patch again.

>   librte_acl.so.2
>   librte_cfgfile.so.2
>   librte_cmdline.so.2
> - librte_cryptodev.so.1
> +   + librte_cryptodev.so.1
>   librte_distributor.so.1
> + librte_eal.so.3
>   librte_hash.so.2
> diff --git a/lib/librte_cryptodev/Makefile b/lib/librte_cryptodev/Makefile
> index 314a046..aebf5d9 100644
> --- a/lib/librte_cryptodev/Makefile
> +++ b/lib/librte_cryptodev/Makefile
> @@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  LIB = librte_cryptodev.a
>
>  # library version
> -LIBABIVER := 1
> +LIBABIVER := 2
>
>  # build flags
>  CFLAGS += -O3
> diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
> b/lib/librte_eal/bsdapp/eal/Makefile
> index a15b762..122798c 100644
> --- a/lib/librte_eal/bsdapp/eal/Makefile
> +++ b/lib/librte_eal/bsdapp/eal/Makefile
> @@ -48,7 +48,7 @@ LDLIBS += -lgcc_s
>
>  EXPORT_MAP := rte_eal_version.map
>
> -LIBABIVER := 3
> +LIBABIVER := 4
>
>  # specific to bsdapp exec-env
>  SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c
> diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
> b/lib/librte_eal/linuxapp/eal/Makefile
> index 4e206f0..4ad7c85 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -37,7 +37,7 @@ ARCH_DIR ?= $(RTE_ARCH)
>  

[dpdk-dev] [PATCH] eal: fix libabi macro for device generalization patches

2016-10-26 Thread Shreyansh Jain
On Wednesday 26 October 2016 06:08 PM, Shreyansh Jain wrote:
> rte_device/driver generalization patches [1] were merged without a change
> in the LIBABIVER macro. This patch bumps the macro of affected libs.
>
> Also, deprecation notice from 16.07 has been removed and release notes for
> 16.11 added.
>
> [1] http://dpdk.org/ml/archives/dev/2016-September/047087.html
>
> Signed-off-by: Shreyansh Jain 
> ---
>  doc/guides/rel_notes/deprecation.rst   | 12 
>  doc/guides/rel_notes/release_16_11.rst | 16 
>  lib/librte_cryptodev/Makefile  |  2 +-
>  lib/librte_eal/bsdapp/eal/Makefile |  2 +-
>  lib/librte_eal/linuxapp/eal/Makefile   |  2 +-
>  lib/librte_ether/Makefile  |  2 +-
>  6 files changed, 20 insertions(+), 16 deletions(-)
>

Self-NACK.
missed updating the libraries impacted in the list of libraries.
Sent v2.


[dpdk-dev] [PATCH v2] eal: fix libabi macro for device generalization patches

2016-10-26 Thread Shreyansh Jain
rte_device/driver generalization patches [1] were merged without a change
in the LIBABIVER macro. This patch bumps the macro of affected libs.

Also, deprecation notice from 16.07 has been removed and release notes for
16.11 added.

Signed-off-by: Shreyansh Jain 
--
v2:
 - Mark bumped libraries in release_16_11.rst file
 - change code symbol names from text to code layout

---
 doc/guides/rel_notes/deprecation.rst   | 12 
 doc/guides/rel_notes/release_16_11.rst | 21 +++--
 lib/librte_cryptodev/Makefile  |  2 +-
 lib/librte_eal/bsdapp/eal/Makefile |  2 +-
 lib/librte_eal/linuxapp/eal/Makefile   |  2 +-
 lib/librte_ether/Makefile  |  2 +-
 6 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index d5c1490..884a231 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -18,18 +18,6 @@ Deprecation Notices
   ``nb_seg_max`` and ``nb_mtu_seg_max`` providing information about number of
   segments limit to be transmitted by device for TSO/non-TSO packets.

-* The ethdev hotplug API is going to be moved to EAL with a notification
-  mechanism added to crypto and ethdev libraries so that hotplug is now
-  available to both of them. This API will be stripped of the device arguments
-  so that it only cares about hotplugging.
-
-* Structures embodying pci and vdev devices are going to be reworked to
-  integrate new common rte_device / rte_driver objects (see
-  http://dpdk.org/ml/archives/dev/2016-January/031390.html).
-  ethdev and crypto libraries will then only handle those objects so that they
-  do not need to care about the kind of devices that are being used, making it
-  easier to add new buses later.
-
 * ABI changes are planned for 16.11 in the ``rte_mbuf`` structure: some fields
   may be reordered to facilitate the writing of ``data_off``, ``refcnt``, and
   ``nb_segs`` in one operation, because some platforms have an overhead if the
diff --git a/doc/guides/rel_notes/release_16_11.rst 
b/doc/guides/rel_notes/release_16_11.rst
index 26cdd62..2d5636c 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -149,6 +149,23 @@ Resolved Issues
 EAL
 ~~~

+* **Improved device/driver hierarchy and generalized hotplugging**
+
+  Device and driver relationship has been restructured by introducing generic
+  classes. This paves way for having PCI, VDEV and other device types as
+  just instantiated objects rather than classes in themselves. Hotplugging too
+  has been generalized into EAL so that ethernet or crypto devices can use the
+  common infrastructure.
+
+  * removed ``pmd_type`` as way of segregation of devices
+  * added ``rte_device`` class and all PCI and VDEV devices inherit from it
+  * renamed devinit/devuninit handlers to probe/remove to make it more
+semantically correct with respect to device<=>driver relationship
+  * moved hotplugging support to EAL
+  * helpers and support macros have been renamed to make them more synonymous
+with their device types
+(e.g. ``PMD_REGISTER_DRIVER`` => ``DRIVER_REGISTER_PCI``)
+

 Drivers
 ~~~
@@ -232,11 +249,11 @@ The libraries prepended with a plus sign were incremented 
in this version.

 .. code-block:: diff

- libethdev.so.4
+   + libethdev.so.4
  librte_acl.so.2
  librte_cfgfile.so.2
  librte_cmdline.so.2
- librte_cryptodev.so.1
+   + librte_cryptodev.so.1
  librte_distributor.so.1
+ librte_eal.so.3
  librte_hash.so.2
diff --git a/lib/librte_cryptodev/Makefile b/lib/librte_cryptodev/Makefile
index 314a046..aebf5d9 100644
--- a/lib/librte_cryptodev/Makefile
+++ b/lib/librte_cryptodev/Makefile
@@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_cryptodev.a

 # library version
-LIBABIVER := 1
+LIBABIVER := 2

 # build flags
 CFLAGS += -O3
diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index a15b762..122798c 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -48,7 +48,7 @@ LDLIBS += -lgcc_s

 EXPORT_MAP := rte_eal_version.map

-LIBABIVER := 3
+LIBABIVER := 4

 # specific to bsdapp exec-env
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c
diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
b/lib/librte_eal/linuxapp/eal/Makefile
index 4e206f0..4ad7c85 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -37,7 +37,7 @@ ARCH_DIR ?= $(RTE_ARCH)
 EXPORT_MAP := rte_eal_version.map
 VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR)

-LIBABIVER := 3
+LIBABIVER := 4

 VPATH += $(RTE_SDK)/lib/librte_eal/common

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 488b7c8..bc2e5f6 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -41,7 +41,7 @@ CFLAGS += $(WERROR_FLAGS)

 EXPORT_MAP := rte_ether_version.map

-LIBABIVER 

[dpdk-dev] [PATCH v2] doc/guides: add more info about VT-d/iommu settings

2016-10-26 Thread Fiona Trahe
Add more information about VT-d/iommu settings for QAT PMD.
Remove limitation indicating QAT driver is not performance tuned.

Signed-off-by: Fiona Trahe 
---

v2:
 clarified commit message


 doc/guides/cryptodevs/qat.rst | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index 70bc2b1..bbe0b12 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -77,7 +77,6 @@ Limitations
 * Hash only is not supported except SNOW 3G UIA2 and KASUMI F9.
 * Cipher only is not supported except SNOW 3G UEA2, KASUMI F8 and 3DES.
 * Only supports the session-oriented API implementation (session-less APIs are 
not supported).
-* Not performance tuned.
 * SNOW 3G (UEA2) and KASUMI (F8) supported only if cipher length, cipher 
offset fields are byte-aligned.
 * SNOW 3G (UIA2) and KASUMI (F9) supported only if hash length, hash offset 
fields are byte-aligned.
 * No BSD support as BSD QAT kernel driver not available.
@@ -201,7 +200,7 @@ The steps below assume you are:
 * Running DPDK on a platform with one ``DH895xCC`` device.
 * On a kernel at least version 4.4.

-In BIOS ensure that SRIOV is enabled and VT-d is disabled.
+In BIOS ensure that SRIOV is enabled and either a) disable VT-d or b) enable 
VT-d and set ``"intel_iommu=on iommu=pt"`` in the grub file.

 Ensure the QAT driver is loaded on your system, by executing::

@@ -260,7 +259,7 @@ The steps below assume you are:
 * Running DPDK on a platform with one ``C62x`` device.
 * On a kernel at least version 4.5.

-In BIOS ensure that SRIOV is enabled and VT-d is disabled.
+In BIOS ensure that SRIOV is enabled and either a) disable VT-d or b) enable 
VT-d and set ``"intel_iommu=on iommu=pt"`` in the grub file.

 Ensure the QAT driver is loaded on your system, by executing::

@@ -304,7 +303,7 @@ The steps below assume you are:
 * Running DPDK on a platform with one ``C3xxx`` device.
 * On a kernel at least version 4.5.

-In BIOS ensure that SRIOV is enabled and VT-d is disabled.
+In BIOS ensure that SRIOV is enabled and either a) disable VT-d or b) enable 
VT-d and set ``"intel_iommu=on iommu=pt"`` in the grub file.

 Ensure the QAT driver is loaded on your system, by executing::

-- 
2.5.0



[dpdk-dev] [PATCH] eal: fix libabi macro for device generalization patches

2016-10-26 Thread Shreyansh Jain
rte_device/driver generalization patches [1] were merged without a change
in the LIBABIVER macro. This patch bumps the macro of affected libs.

Also, deprecation notice from 16.07 has been removed and release notes for
16.11 added.

[1] http://dpdk.org/ml/archives/dev/2016-September/047087.html

Signed-off-by: Shreyansh Jain 
---
 doc/guides/rel_notes/deprecation.rst   | 12 
 doc/guides/rel_notes/release_16_11.rst | 16 
 lib/librte_cryptodev/Makefile  |  2 +-
 lib/librte_eal/bsdapp/eal/Makefile |  2 +-
 lib/librte_eal/linuxapp/eal/Makefile   |  2 +-
 lib/librte_ether/Makefile  |  2 +-
 6 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index d5c1490..884a231 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -18,18 +18,6 @@ Deprecation Notices
   ``nb_seg_max`` and ``nb_mtu_seg_max`` providing information about number of
   segments limit to be transmitted by device for TSO/non-TSO packets.

-* The ethdev hotplug API is going to be moved to EAL with a notification
-  mechanism added to crypto and ethdev libraries so that hotplug is now
-  available to both of them. This API will be stripped of the device arguments
-  so that it only cares about hotplugging.
-
-* Structures embodying pci and vdev devices are going to be reworked to
-  integrate new common rte_device / rte_driver objects (see
-  http://dpdk.org/ml/archives/dev/2016-January/031390.html).
-  ethdev and crypto libraries will then only handle those objects so that they
-  do not need to care about the kind of devices that are being used, making it
-  easier to add new buses later.
-
 * ABI changes are planned for 16.11 in the ``rte_mbuf`` structure: some fields
   may be reordered to facilitate the writing of ``data_off``, ``refcnt``, and
   ``nb_segs`` in one operation, because some platforms have an overhead if the
diff --git a/doc/guides/rel_notes/release_16_11.rst 
b/doc/guides/rel_notes/release_16_11.rst
index 26cdd62..c3f3bd9 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -149,6 +149,22 @@ Resolved Issues
 EAL
 ~~~

+* **Improved device/driver hierarchy and generalized hotplugging**
+
+  Device and driver relationship has been restructured by introducing generic
+  classes. This paves way for having PCI, VDEV and other device types as
+  just instantiated objects rather than classes in themselves. Hotplugging too
+  has been generalized into EAL so that ethernet or cryptodevices can use the
+  common infrastructure.
+
+  * removed pmd_type as way of segregation of devices
+  * added rte_device class and all PCI and VDEV devices inherit from it
+  * renamed devinit/devuninit handlers to probe/remove to make it more
+semantically correct with respect to device<=>driver relationship
+  * moved hotplugging support to EAL
+  * helpers and support macros have been renamed to make them more synonymous
+with their device types (e.g. PMD_REGISTER_DRIVER => DRIVER_REGISTER_PCI)
+

 Drivers
 ~~~
diff --git a/lib/librte_cryptodev/Makefile b/lib/librte_cryptodev/Makefile
index 314a046..aebf5d9 100644
--- a/lib/librte_cryptodev/Makefile
+++ b/lib/librte_cryptodev/Makefile
@@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_cryptodev.a

 # library version
-LIBABIVER := 1
+LIBABIVER := 2

 # build flags
 CFLAGS += -O3
diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index a15b762..122798c 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -48,7 +48,7 @@ LDLIBS += -lgcc_s

 EXPORT_MAP := rte_eal_version.map

-LIBABIVER := 3
+LIBABIVER := 4

 # specific to bsdapp exec-env
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c
diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
b/lib/librte_eal/linuxapp/eal/Makefile
index 4e206f0..4ad7c85 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -37,7 +37,7 @@ ARCH_DIR ?= $(RTE_ARCH)
 EXPORT_MAP := rte_eal_version.map
 VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR)

-LIBABIVER := 3
+LIBABIVER := 4

 VPATH += $(RTE_SDK)/lib/librte_eal/common

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 488b7c8..bc2e5f6 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -41,7 +41,7 @@ CFLAGS += $(WERROR_FLAGS)

 EXPORT_MAP := rte_ether_version.map

-LIBABIVER := 4
+LIBABIVER := 5

 SRCS-y += rte_ethdev.c

-- 
2.7.4



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Jerin Jacob
On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro
> > folks.
> > Let me know if anyone else is interested in contributing to the definition
> > of eventdev.
> >
> > If there are no major issues in the proposed spec, then Cavium would like to
> > work on implementing and up-streaming the common code (lib/librte_eventdev/)
> > and an associated HW driver. (Requested minor changes of v2 will be addressed
> > in the next version.)
> 
> Hi All,
> 
> I will propose a minor change to the rte_event struct, allowing some bits to
> be implementation specific. Currently the rte_event struct has no space to
> allow an implementation to store any metadata about the event. For software
> performance it would be really helpful if there are some bits available for
> the implementation to keep some flags about each event.

OK.

> 
> I suggest to rework the struct as below which opens 6 bits that were 
> otherwise wasted, and define them as implementation specific. By 
> implementation specific it is understood that the implementation can 
> overwrite any information stored in those bits, and the application must not 
> expect the data to remain after the event is scheduled.
> 
> OLD:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t queue_id:8;
>   uint8_t  sched_type; /* Note only 2 bits of 8 are required */
> 
> NEW:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t sched_type:2; /* reduced size: 2 bits is enough for the
>                             enqueue types Ordered, Atomic, Parallel */
>   uint32_t implementation:6; /* available for implementation specific
>                                 metadata */
>   uint8_t queue_id; /* still 8 bits as before */
> 
> 
> Thoughts? -Harry

Looks good to me. I will add it in v3.




[dpdk-dev] [PATCH v4 18/32] net/qede: add missing 100G link speed capability

2016-10-26 Thread Thomas Monjalon
2016-10-18 21:11, Rasesh Mody:
> From: Harish Patil 
> 
> This patch fixes the missing 100G link speed advertisement
> when the 100G support was initially added.
> 
> Fixes: 2af14ca79c0a ("net/qede: support 100G")
> 
> Signed-off-by: Harish Patil 
[...]
>  [Features]
> +Speed capabilities   = Y

This feature should be checked only when it is fully implemented,
i.e. when you return the real capabilities of the device.

> --- a/drivers/net/qede/qede_ethdev.c
> +++ b/drivers/net/qede/qede_ethdev.c
> @@ -599,7 +599,8 @@ qede_dev_info_get(struct rte_eth_dev *eth_dev,
>DEV_TX_OFFLOAD_UDP_CKSUM |
>DEV_TX_OFFLOAD_TCP_CKSUM);
>  
> - dev_info->speed_capa = ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G;
> + dev_info->speed_capa = ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G |
> +ETH_LINK_SPEED_100G;
>  }

It is only faking the capabilities at driver-level.
You should check if the underlying device is able to achieve 100G
before advertising this flag to the application.

I suggest updating this patch to remove the doc update.
The contract is to fill it in only when the code is fixed.
By the way, we must ask all the other drivers to implement
this feature properly.
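
To illustrate the point, here is a minimal sketch of building the capability mask from what the device reports rather than from a fixed constant. Everything in it is hypothetical: the `SKETCH_SPEED_*` flags stand in for the `ETH_LINK_SPEED_*` values, and `struct sketch_hw` stands in for whatever the driver learns from the hardware at probe time. It is not the qede implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ETH_LINK_SPEED_* bit flags. */
#define SKETCH_SPEED_25G  (1u << 0)
#define SKETCH_SPEED_40G  (1u << 1)
#define SKETCH_SPEED_100G (1u << 2)

/* Hypothetical per-device description, filled from hardware at probe time. */
struct sketch_hw {
	bool supports_25g;
	bool supports_40g;
	bool supports_100g;
};

/* Build the capability mask from what this device can actually do. */
static uint32_t
sketch_speed_capa(const struct sketch_hw *hw)
{
	uint32_t capa = 0;

	assert(hw != NULL);
	if (hw->supports_25g)
		capa |= SKETCH_SPEED_25G;
	if (hw->supports_40g)
		capa |= SKETCH_SPEED_40G;
	if (hw->supports_100g)
		capa |= SKETCH_SPEED_100G;
	return capa;
}
```

With this shape, a 25G/40G adapter never advertises 100G, and the "Speed capabilities" doc entry would describe behaviour the code really has.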


[dpdk-dev] [PATCH] doc/guides: add more info re VT-d/iommu settings for QAT

2016-10-26 Thread Jain, Deepak K

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Fiona Trahe
> Sent: Wednesday, October 26, 2016 4:20 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc/guides: add more info re VT-d/iommu
> settings for QAT
> 
> add more info re VT-d/iommu settings for QAT remove limitation re
> performance tuning
> 
> Signed-off-by: Fiona Trahe 
> ---
>  doc/guides/cryptodevs/qat.rst | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
> index 70bc2b1..bbe0b12 100644
> --- a/doc/guides/cryptodevs/qat.rst
> +++ b/doc/guides/cryptodevs/qat.rst
> 
> --
> 2.5.0
Acked-by: Deepak Kumar Jain


[dpdk-dev] [PATCH v4 00/32] net/qede: update qede pmd to 1.2.0.1 and enable by default

2016-10-26 Thread Mody, Rasesh
Hi Thomas,

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, October 26, 2016 8:20 AM
> 
> 2016-10-24 14:41, Bruce Richardson:
> > On Tue, Oct 18, 2016 at 09:11:14PM -0700, Rasesh Mody wrote:
> > > Please apply to DPDK tree for v16.11 release.
> >
> > Patchset applied to dpdk_next_net/rel_16_11
> 
> It breaks compilation because it is enabled everywhere and zlib.h is still
> included without checking CONFIG_ECORE_ZIPPED_FW.
> The patch removing zlib dependency was not tested without zlib installed.
> I will fix it while applying with this change:

Sorry, we did not test the patch removing the zlib dependency from the latest
patch set on a system where the zlib headers are unavailable.
The zlib.h include is not needed in bcm_osal.c. It was left behind there when
the zlib.h include was added to ecore.h by patch "[PATCH v4 03/32] net/qede: use
FW CONFIG defines as needed". In ecore.h it is protected by an ifdef; however,
the same is not true of bcm_osal.c. Hence the compilation fails when zlib.h is
not available.

--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -6,8 +6,6 @@
  * See LICENSE.qede_pmd for copyright and licensing details.
  */

-#include <zlib.h>
-
 #include 
 #include 


> --- a/drivers/net/qede/base/bcm_osal.c
> +++ b/drivers/net/qede/base/bcm_osal.c
> @@ -6,7 +6,9 @@
>   * See LICENSE.qede_pmd for copyright and licensing details.
>   */
> 
> +#ifdef CONFIG_ECORE_ZIPPED_FW
>  #include <zlib.h>
> +#endif
> 
>  #include 
>  #include 
> 

Above change looks fine. Thanks!

> I won't do any quality review of qede patches but from what I've seen
> before, there is some room for improvements.
> 
> Another nit, important to help reviews, please use --in-reply-to when
> sending a new revision of a patch to keep them in the same thread and allow
> us to understand the progress.
> I plan to do an automatic nack for patches missing the --in-reply-to.

Sure, will do.

Thanks!
-Rasesh


[dpdk-dev] [PATCH v4 18/32] net/qede: add missing 100G link speed capability

2016-10-26 Thread Bruce Richardson
On Wed, Oct 26, 2016 at 05:41:58PM +0200, Thomas Monjalon wrote:
> 2016-10-18 21:11, Rasesh Mody:
> > From: Harish Patil 
> > 
> > This patch fixes the missing 100G link speed advertisement
> > when the 100G support was initially added.
> > 
> > Fixes: 2af14ca79c0a ("net/qede: support 100G")
> > 
> > Signed-off-by: Harish Patil 
> [...]
> >  [Features]
> > +Speed capabilities   = Y
> 
> This feature should be checked only when it is fully implemented,
> i.e. when you return the real capabilities of the device.
> 
> > --- a/drivers/net/qede/qede_ethdev.c
> > +++ b/drivers/net/qede/qede_ethdev.c
> > @@ -599,7 +599,8 @@ qede_dev_info_get(struct rte_eth_dev *eth_dev,
> >  DEV_TX_OFFLOAD_UDP_CKSUM |
> >  DEV_TX_OFFLOAD_TCP_CKSUM);
> >  
> > -   dev_info->speed_capa = ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G;
> > +   dev_info->speed_capa = ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G |
> > +  ETH_LINK_SPEED_100G;
> >  }
> 
> It is only faking the capabilities at driver-level.
> You should check if the underlying device is able to achieve 100G
> before advertising this flag to the application.
> 
> I suggest updating this patch to remove the doc update.
> The contract is to fill it in only when the code is fixed.
> By the way, we must ask all the other drivers to implement
> this feature properly.

I agree that devices should only advertise speeds they support and the
doc should reflect this ability (or lack of this ability) in a driver.

Regards,
/Bruce


[dpdk-dev] [PATCH v2] net/i40e: fix Rx hang when disable LLDP

2016-10-26 Thread Bruce Richardson
On Wed, Oct 26, 2016 at 04:47:31PM +0100, Wu, Jingjing wrote:
> 
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Wednesday, October 26, 2016 11:44 PM
> > To: Wu, Jingjing 
> > Cc: Zhang, Qi Z; Zhang, Helin;
> > dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: fix Rx hang when disable LLDP
> > 
> > On Wed, Oct 26, 2016 at 03:12:41PM +, Wu, Jingjing wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Zhang, Qi Z
> > > > Sent: Thursday, October 20, 2016 4:40 AM
> > > > To: Wu, Jingjing ; Zhang, Helin
> > > > 
> > > > Cc: dev at dpdk.org; Zhang, Qi Z 
> > > > Subject: [PATCH v2] net/i40e: fix Rx hang when disable LLDP
> > > >
> > > > Remove stopping LLDP as a workaround for a known errata which can cause
> > > > Rx hang.
> > > >
> > > > Fixes: 4861cde46116 ("i40e: new poll mode driver")
> > > >
> > > > Signed-off-by: Qi Zhang 
> > >
> > > Acked-by: Jingjing Wu 
> > >
> > I'm hoping to apply this patch to make it in RC2. However, there is an
> > errata mentioned in the commit log, so it would be good to actually
> > include a reference or link to the errata itself, so that the reader can
> > look it up himself. Can you supply such a link to the errata in
> > question?
> > 
> errata #70 in the Public Spec Update with link
> http://www.intel.com/content/www/us/en/embedded/products/networking/xl710-10-40-controller-spec-update.html
> 
> Thanks
> Jingjing
Applied to dpdk-next-net/rel_16_11 with errata link included in commit
message.

thanks,
/Bruce


[dpdk-dev] [PATCH] doc/guides: add more info re VT-d/iommu settings for QAT

2016-10-26 Thread Trahe, Fiona
 Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 26, 2016 4:27 PM
> To: Trahe, Fiona 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc/guides: add more info re VT-d/iommu
> settings for QAT
> 
> 2016-10-26 16:20, Fiona Trahe:
> > add more info re VT-d/iommu settings for QAT remove limitation re
> > performance tuning
> 
> Sorry, I do not understand what means "re".

"re" is commonly used in the English language and means "in reference to" or
"about", but I'm happy to change it to "about", which is the more conventional
word.
There's an interesting explanation of the evolution of the term here:
https://en.oxforddictionaries.com/definition/re
Usage
The traditional view is that re should be used in headings and references, as 
in Re: Ainsworth versus Chambers, but not as a normal word meaning 'about', as 
in I saw the deputy head re the incident. However, the evidence suggests that 
re is now widely used in the second context in official and semi-official 
contexts, and is now generally accepted. It is hard to see any compelling 
logical argument against using it as an ordinary English word in this way

> Please use an uppercase at the beginning of each sentence and a dot at the
> end. It helps reading.
Ok.  
I was confusing this with the style needed for the email subject, which is the 
opposite!


[dpdk-dev] [PATCH v2] net/i40e: fix Rx hang when disable LLDP

2016-10-26 Thread Bruce Richardson
On Wed, Oct 26, 2016 at 03:12:41PM +, Wu, Jingjing wrote:
> 
> 
> > -Original Message-
> > From: Zhang, Qi Z
> > Sent: Thursday, October 20, 2016 4:40 AM
> > To: Wu, Jingjing ; Zhang, Helin
> > 
> > Cc: dev at dpdk.org; Zhang, Qi Z 
> > Subject: [PATCH v2] net/i40e: fix Rx hang when disable LLDP
> > 
> > Remove stopping LLDP as a workaround for a known errata which can cause
> > Rx hang.
> > 
> > Fixes: 4861cde46116 ("i40e: new poll mode driver")
> > 
> > Signed-off-by: Qi Zhang 
> 
> Acked-by: Jingjing Wu 
> 
I'm hoping to apply this patch to make it in RC2. However, there is an
errata mentioned in the commit log, so it would be good to actually
include a reference or link to the errata itself, so that the reader can
look it up himself. Can you supply such a link to the errata in
question?

/Bruce


[dpdk-dev] [PATCH v7] net/ixgbe: support multiqueue mode VMDq DCB with SRIOV

2016-10-26 Thread Bruce Richardson
On Wed, Oct 26, 2016 at 04:28:40PM +0100, Bernard Iremonger wrote:
> The following changes have been made to allow Data Centre Bridge
> (DCB) configuration when SRIOV is enabled.
> 
> Modify ixgbe_check_mq_mode function,
> when SRIOV is enabled, enable mq_mode
> ETH_MQ_RX_VMDQ_DCB and ETH_MQ_TX_VMDQ_DCB.
> 
> Modify ixgbe_dcb_tx_hw_config function,
> replace the struct ixgbe_hw parameter with a
> struct rte_eth_dev parameter and handle SRIOV enabled.
> 
> Modify ixgbe_dev_mq_rx_configure function,
> when SRIOV is enabled, enable mq_mode ETH_MQ_RX_VMDQ_DCB.
> 
> Modify ixgbe_configure_dcb function,
> revise check on dev->data->nb_rx_queues.
> 
> Signed-off-by: Rahul R Shah 
> Signed-off-by: Bernard Iremonger 
> Acked-by: Wenzhuo Lu 

Applied to dpdk-next-net/rel_16_11 with commit message cut down to just
the high-level functional change.

/Bruce



[dpdk-dev] [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB with SRIOV

2016-10-26 Thread Wu, Jingjing


> -Original Message-
> From: Iremonger, Bernard
> Sent: Thursday, October 27, 2016 12:10 AM
> To: Wu, Jingjing ; dev at dpdk.org; Shah, Rahul R
> ; Lu, Wenzhuo ; 
> Dumitrescu, Cristian
> 
> Subject: RE: [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB with 
> SRIOV
> 
> Hi Jingjing,
> 
> 
> 
> > > > Subject: [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB
> > > > with SRIOV
> > > >
> > > > The following changes have been made to allow Data Centre Bridge
> > > > (DCB) configuration when SRIOV is enabled.
> > > >
> > > > Modify ixgbe_check_mq_mode function, when SRIOV is enabled, enable
> > > > mq_mode ETH_MQ_RX_VMDQ_DCB and ETH_MQ_TX_VMDQ_DCB.
> > > >
> > > > Modify ixgbe_dcb_tx_hw_config function, replace the struct ixgbe_hw
> > > > parameter with a struct rte_eth_dev parameter and handle SRIOV
> > > > enabled.
> > > >
> > > > Modify ixgbe_dev_mq_rx_configure function, when SRIOV is enabled,
> > > > enable mq_mode ETH_MQ_RX_VMDQ_DCB.
> > > >
> > > > Modify ixgbe_configure_dcb function, revise check on
> > > > dev->data->nb_rx_queues.
> > > >
> > > > Signed-off-by: Rahul R Shah 
> > > > Signed-off-by: Bernard Iremonger 
> > > > Acked-by: Wenzhuo Lu 
> > > > ---
> > > >  drivers/net/ixgbe/ixgbe_ethdev.c | 11 ++-
> > > >  drivers/net/ixgbe/ixgbe_rxtx.c   | 35 ++
> > ---
> > > --
> > > >  2 files changed, 28 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > index 4ca5747..4d5ce83 100644
> > > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > > @@ -1977,6 +1977,9 @@ ixgbe_check_mq_mode(struct rte_eth_dev
> > > *dev)
> > > > /* check multi-queue mode */
> > > > switch (dev_conf->rxmode.mq_mode) {
> > > > case ETH_MQ_RX_VMDQ_DCB:
> > > > +   PMD_INIT_LOG(INFO, "ETH_MQ_RX_VMDQ_DCB
> > > mode supported in SRIOV");
> > > > +   dev->data->dev_conf.rxmode.mq_mode =
> > > ETH_MQ_RX_VMDQ_DCB;
> > > This line is duplicated, mq_mode is ETH_MQ_RX_VMDQ_DCB already.
> >
> > The mq_mode is assigned at this point in the other cases. This case is coded
> > in line with the other cases.
> >
> > >  and it's better to check if the nb_queue is valid.
> >
> The nb_rx_q and nb_tx_q are checked after the switch statements at line 2027
> with the v6 patch applied.
Thanks, it's fine then. :)
> Regards,
> 
> Bernard.


[dpdk-dev] [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB with SRIOV

2016-10-26 Thread Iremonger, Bernard
Hi Jingjing,



> > > Subject: [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB
> > > with SRIOV
> > >
> > > The following changes have been made to allow Data Centre Bridge
> > > (DCB) configuration when SRIOV is enabled.
> > >
> > > Modify ixgbe_check_mq_mode function, when SRIOV is enabled, enable
> > > mq_mode ETH_MQ_RX_VMDQ_DCB and ETH_MQ_TX_VMDQ_DCB.
> > >
> > > Modify ixgbe_dcb_tx_hw_config function, replace the struct ixgbe_hw
> > > parameter with a struct rte_eth_dev parameter and handle SRIOV
> > > enabled.
> > >
> > > Modify ixgbe_dev_mq_rx_configure function, when SRIOV is enabled,
> > > enable mq_mode ETH_MQ_RX_VMDQ_DCB.
> > >
> > > Modify ixgbe_configure_dcb function, revise check on
> > > dev->data->nb_rx_queues.
> > >
> > > Signed-off-by: Rahul R Shah 
> > > Signed-off-by: Bernard Iremonger 
> > > Acked-by: Wenzhuo Lu 
> > > ---
> > >  drivers/net/ixgbe/ixgbe_ethdev.c | 11 ++-
> > >  drivers/net/ixgbe/ixgbe_rxtx.c   | 35 ++
> ---
> > --
> > >  2 files changed, 28 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > index 4ca5747..4d5ce83 100644
> > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > @@ -1977,6 +1977,9 @@ ixgbe_check_mq_mode(struct rte_eth_dev
> > *dev)
> > >   /* check multi-queue mode */
> > >   switch (dev_conf->rxmode.mq_mode) {
> > >   case ETH_MQ_RX_VMDQ_DCB:
> > > + PMD_INIT_LOG(INFO, "ETH_MQ_RX_VMDQ_DCB
> > mode supported in SRIOV");
> > > + dev->data->dev_conf.rxmode.mq_mode =
> > ETH_MQ_RX_VMDQ_DCB;
> > This line is duplicated, mq_mode is ETH_MQ_RX_VMDQ_DCB already.
> 
> The mq_mode is assigned at this point in the other cases. This case is coded
> in line with the other cases.
> 
> >  and it's better to check if the nb_queue is valid.
> 
The nb_rx_q and nb_tx_q are checked after the switch statements at line 2027 
with the v6 patch applied.

Regards,

Bernard.


[dpdk-dev] [PATCH v2] net/i40e: fix Rx hang when disable LLDP

2016-10-26 Thread Wu, Jingjing


> -Original Message-
> From: Richardson, Bruce
> Sent: Wednesday, October 26, 2016 11:44 PM
> To: Wu, Jingjing 
> Cc: Zhang, Qi Z; Zhang, Helin;
> dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: fix Rx hang when disable LLDP
> 
> On Wed, Oct 26, 2016 at 03:12:41PM +, Wu, Jingjing wrote:
> >
> >
> > > -Original Message-
> > > From: Zhang, Qi Z
> > > Sent: Thursday, October 20, 2016 4:40 AM
> > > To: Wu, Jingjing ; Zhang, Helin
> > > 
> > > Cc: dev at dpdk.org; Zhang, Qi Z 
> > > Subject: [PATCH v2] net/i40e: fix Rx hang when disable LLDP
> > >
> > > Remove stopping LLDP as a workaround for a known errata which can cause
> > > Rx hang.
> > >
> > > Fixes: 4861cde46116 ("i40e: new poll mode driver")
> > >
> > > Signed-off-by: Qi Zhang 
> >
> > Acked-by: Jingjing Wu 
> >
> I'm hoping to apply this patch to make it in RC2. However, there is an
> errata mentioned in the commit log, so it would be good to actually
> include a reference or link to the errata itself, so that the reader can
> look it up himself. Can you supply such a link to the errata in
> question?
> 
errata #70 in the Public Spec Update with link
http://www.intel.com/content/www/us/en/embedded/products/networking/xl710-10-40-controller-spec-update.html

Thanks
Jingjing


[dpdk-dev] [PATCH v2] eal: fix libabi macro for device generalization patches

2016-10-26 Thread Ferruh Yigit
Hi Shreyansh,

On 10/26/2016 2:12 PM, Shreyansh Jain wrote:
> On Wednesday 26 October 2016 06:30 PM, Shreyansh Jain wrote:
>> rte_device/driver generalization patches [1] were merged without a change
>> in the LIBABIVER macro. This patch bumps the macro of affected libs.
>>
>> Also, deprecation notice from 16.07 has been removed and release notes for
>> 16.11 added.
>>
>> Signed-off-by: Shreyansh Jain 
>> --
>> v2:
>>  - Mark bumped libraries in release_16_11.rst file
>>  - change code symbol names from text to code layout
>>
>> ---

<...>

>>  .. code-block:: diff
>>
>> - libethdev.so.4
>> +   + libethdev.so.4
> 
> Just noticed:
> Should the '4' here reflect the current LIBABIVER number?
> If so, I will send this patch again.

Yes, as you guessed, it should be:
- libethdev.so.4
+   + libethdev.so.5

<...>

>> diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
>> b/lib/librte_eal/bsdapp/eal/Makefile
>> index a15b762..122798c 100644
>> --- a/lib/librte_eal/bsdapp/eal/Makefile
>> +++ b/lib/librte_eal/bsdapp/eal/Makefile
>> @@ -48,7 +48,7 @@ LDLIBS += -lgcc_s
>>
>>  EXPORT_MAP := rte_eal_version.map
>>
>> -LIBABIVER := 3
>> +LIBABIVER := 4

The eal version seems to have been increased already for this release, 2 => 3, in:
d7e61ad3ae36 ("log: remove deprecated history dump")

So there is NO need to increase it again; sorry for the late notice, I just
noticed it.
Only librte_ether and librte_cryptodev require the increase.

<...>

Thanks,
ferruh



[dpdk-dev] [PATCH v2] net/i40e: fix Rx hang when disable LLDP

2016-10-26 Thread Wu, Jingjing


> -Original Message-
> From: Zhang, Qi Z
> Sent: Thursday, October 20, 2016 4:40 AM
> To: Wu, Jingjing ; Zhang, Helin
> 
> Cc: dev at dpdk.org; Zhang, Qi Z 
> Subject: [PATCH v2] net/i40e: fix Rx hang when disable LLDP
> 
> Remove stopping LLDP as a workaround for a known errata which can cause
> Rx hang.
> 
> Fixes: 4861cde46116 ("i40e: new poll mode driver")
> 
> Signed-off-by: Qi Zhang 

Acked-by: Jingjing Wu 




[dpdk-dev] [PATCH v11 6/6] testpmd: use Tx preparation in csum engine

2016-10-26 Thread Tomasz Kulasek
Removed pseudo header calculation for udp/tcp/tso packets from
application and used Tx preparation API for packet preparation and
verification.

Adding this additional step to the csum engine costs about 3-4% in
performance on my setup with the ixgbe driver. The drop is caused mostly
by the need to access and modify the packet data again.

Signed-off-by: Tomasz Kulasek 
---
 app/test-pmd/csumonly.c |   36 +---
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..6f33ae9 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -112,15 +112,6 @@ struct simple_gre_hdr {
 } __attribute__((__packed__));

 static uint16_t
-get_psd_sum(void *l3_hdr, uint16_t ethertype, uint64_t ol_flags)
-{
-   if (ethertype == _htons(ETHER_TYPE_IPv4))
-   return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
-   else /* assume ethertype == ETHER_TYPE_IPv6 */
-   return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
-}
-
-static uint16_t
 get_udptcp_checksum(void *l3_hdr, void *l4_hdr, uint16_t ethertype)
 {
if (ethertype == _htons(ETHER_TYPE_IPv4))
@@ -370,32 +361,24 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
/* do not recalculate udp cksum if it was 0 */
if (udp_hdr->dgram_cksum != 0) {
udp_hdr->dgram_cksum = 0;
-   if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_UDP_CKSUM) {
+   if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_UDP_CKSUM)
ol_flags |= PKT_TX_UDP_CKSUM;
-   udp_hdr->dgram_cksum = get_psd_sum(l3_hdr,
-   info->ethertype, ol_flags);
-   } else {
+   else
udp_hdr->dgram_cksum =
get_udptcp_checksum(l3_hdr, udp_hdr,
info->ethertype);
-   }
}
} else if (info->l4_proto == IPPROTO_TCP) {
tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + info->l3_len);
tcp_hdr->cksum = 0;
-   if (tso_segsz) {
+   if (tso_segsz)
ol_flags |= PKT_TX_TCP_SEG;
-   tcp_hdr->cksum = get_psd_sum(l3_hdr, info->ethertype,
-   ol_flags);
-   } else if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_TCP_CKSUM) {
+   else if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_TCP_CKSUM)
ol_flags |= PKT_TX_TCP_CKSUM;
-   tcp_hdr->cksum = get_psd_sum(l3_hdr, info->ethertype,
-   ol_flags);
-   } else {
+   else
tcp_hdr->cksum =
get_udptcp_checksum(l3_hdr, tcp_hdr,
info->ethertype);
-   }
} else if (info->l4_proto == IPPROTO_SCTP) {
sctp_hdr = (struct sctp_hdr *)((char *)l3_hdr + info->l3_len);
sctp_hdr->cksum = 0;
@@ -648,6 +631,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
void *l3_hdr = NULL, *outer_l3_hdr = NULL; /* can be IPv4 or IPv6 */
uint16_t nb_rx;
uint16_t nb_tx;
+   uint16_t nb_prep;
uint16_t i;
uint64_t rx_ol_flags, tx_ol_flags;
uint16_t testpmd_ol_flags;
@@ -857,7 +841,13 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
printf("\n");
}
}
-   nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
+   nb_prep = rte_eth_tx_prep(fs->tx_port, fs->tx_queue, pkts_burst,
+   nb_rx);
+   if (nb_prep != nb_rx)
+   printf("Preparing packet burst to transmit failed: %s\n",
+   rte_strerror(rte_errno));
+
+   nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_prep);
/*
 * Retry if necessary
 */
-- 
1.7.9.5
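
The prepare-then-transmit flow used in the patch above can be sketched in isolation as follows. This is a minimal mock under stated assumptions, not testpmd code: `sketch_tx_prep` and `sketch_tx_burst` are hypothetical stand-ins for `rte_eth_tx_prep` and `rte_eth_tx_burst`, and the segment limit is an arbitrary value chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical packet: only the field the mock checks. */
struct sketch_pkt {
	uint16_t nb_segs;
};

#define SKETCH_MAX_SEGS 8 /* assumed limit, for illustration only */

/* Mock of a Tx preparation call: returns how many leading packets passed. */
static uint16_t
sketch_tx_prep(struct sketch_pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		if (pkts[i]->nb_segs > SKETCH_MAX_SEGS)
			break;
	return i;
}

/* Mock of a Tx burst call that accepts everything it is given. */
static uint16_t
sketch_tx_burst(struct sketch_pkt **pkts, uint16_t nb_pkts)
{
	(void)pkts;
	return nb_pkts;
}

/* Prepare first, then transmit only the packets that were prepared. */
static uint16_t
sketch_forward(struct sketch_pkt **pkts, uint16_t nb_rx)
{
	uint16_t nb_prep;

	assert(pkts != NULL);
	nb_prep = sketch_tx_prep(pkts, nb_rx);
	if (nb_prep != nb_rx)
		fprintf(stderr, "packet %u failed Tx preparation\n",
			(unsigned int)nb_prep);
	return sketch_tx_burst(pkts, nb_prep);
}
```

The return value of the preparation step is the index of the first rejected packet, so passing it as the burst size transmits only the packets in front of it. What happens to the rejected packet and those behind it (drop, fix up, retry) is left to the caller in this sketch.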



[dpdk-dev] [PATCH v11 5/6] ixgbe: add Tx preparation

2016-10-26 Thread Tomasz Kulasek
Signed-off-by: Tomasz Kulasek 
---
 drivers/net/ixgbe/ixgbe_ethdev.c |3 ++
 drivers/net/ixgbe/ixgbe_ethdev.h |5 +++-
 drivers/net/ixgbe/ixgbe_rxtx.c   |   58 +-
 drivers/net/ixgbe/ixgbe_rxtx.h   |2 ++
 4 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 4ca5747..4c6a8e1 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -517,6 +517,8 @@ static const struct rte_eth_desc_lim tx_desc_lim = {
.nb_max = IXGBE_MAX_RING_DESC,
.nb_min = IXGBE_MIN_RING_DESC,
.nb_align = IXGBE_TXD_ALIGN,
+   .nb_seg_max = IXGBE_TX_MAX_SEG,
+   .nb_mtu_seg_max = IXGBE_TX_MAX_SEG,
 };

 static const struct eth_dev_ops ixgbe_eth_dev_ops = {
@@ -1103,6 +1105,7 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
eth_dev->dev_ops = &ixgbe_eth_dev_ops;
eth_dev->rx_pkt_burst = &ixgbe_recv_pkts;
eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
+   eth_dev->tx_pkt_prep = &ixgbe_prep_pkts;

/*
 * For secondary processes, we don't initialise any further as primary
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 4ff6338..e229cf5 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -396,6 +396,9 @@ uint16_t ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 uint16_t ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

+uint16_t ixgbe_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+
 int ixgbe_dev_rss_hash_update(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 2ce8234..031414c 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   Copyright 2014 6WIND S.A.
  *   All rights reserved.
  *
@@ -70,6 +70,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "ixgbe_logs.h"
 #include "base/ixgbe_api.h"
@@ -87,6 +88,9 @@
PKT_TX_TCP_SEG | \
PKT_TX_OUTER_IP_CKSUM)

+#define IXGBE_TX_OFFLOAD_NOTSUP_MASK \
+   (PKT_TX_OFFLOAD_MASK ^ IXGBE_TX_OFFLOAD_MASK)
+
 #if 1
 #define RTE_PMD_USE_PREFETCH
 #endif
@@ -905,6 +909,56 @@ end_of_tx:

 /*
  *
+ *  TX prep functions
+ *
+ **/
+uint16_t
+ixgbe_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+   int i, ret;
+   uint64_t ol_flags;
+   struct rte_mbuf *m;
+   struct ixgbe_tx_queue *txq = (struct ixgbe_tx_queue *)tx_queue;
+
+   for (i = 0; i < nb_pkts; i++) {
+   m = tx_pkts[i];
+   ol_flags = m->ol_flags;
+
+   /**
+* Check if packet meets requirements for number of segments
+*
+* NOTE: for ixgbe it's always (40 - WTHRESH) for both TSO and non-TSO
+*/
+
+   if (m->nb_segs > IXGBE_TX_MAX_SEG - txq->wthresh) {
+   rte_errno = -EINVAL;
+   return i;
+   }
+
+   if (ol_flags & IXGBE_TX_OFFLOAD_NOTSUP_MASK) {
+   rte_errno = -ENOTSUP;
+   return i;
+   }
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+   ret = rte_validate_tx_offload(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+#endif
+   ret = rte_phdr_cksum_fix(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+   }
+
+   return i;
+}
+
+/*
+ *
  *  RX functions
  *
  **/
@@ -2282,6 +2336,7 @@ ixgbe_set_tx_function(struct rte_eth_dev *dev, struct ixgbe_tx_queue *txq)
if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == IXGBE_SIMPLE_FLAGS)
&& (txq->tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST)) {
PMD_INIT_LOG(DEBUG, "Using simple tx code path");
+   dev->tx_pkt_prep = NULL;
 #ifdef RTE_IXGBE_INC_VECTOR
if (txq->tx_rs_thresh 

[dpdk-dev] [PATCH v11 4/6] i40e: add Tx preparation

2016-10-26 Thread Tomasz Kulasek
Signed-off-by: Tomasz Kulasek 
---
 drivers/net/i40e/i40e_ethdev.c |3 ++
 drivers/net/i40e/i40e_rxtx.c   |   72 +++-
 drivers/net/i40e/i40e_rxtx.h   |8 +
 3 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 5af0e43..dab0d48 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -936,6 +936,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
dev->dev_ops = &i40e_eth_dev_ops;
dev->rx_pkt_burst = i40e_recv_pkts;
dev->tx_pkt_burst = i40e_xmit_pkts;
+   dev->tx_pkt_prep = i40e_prep_pkts;

/* for secondary processes, we don't initialise any further as primary
 * has already done this work. Only check we don't need a different
@@ -2629,6 +2630,8 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
.nb_max = I40E_MAX_RING_DESC,
.nb_min = I40E_MIN_RING_DESC,
.nb_align = I40E_ALIGN_RING_DESC,
+   .nb_seg_max = I40E_TX_MAX_SEG,
+   .nb_mtu_seg_max = I40E_TX_MAX_MTU_SEG,
};

if (pf->flags & I40E_FLAG_VMDQ) {
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 7ae7d9f..7f6d3d8 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -50,6 +50,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include "i40e_logs.h"
 #include "base/i40e_prototype.h"
@@ -79,6 +81,17 @@
PKT_TX_TCP_SEG | \
PKT_TX_OUTER_IP_CKSUM)

+#define I40E_TX_OFFLOAD_MASK (  \
+   PKT_TX_IP_CKSUM |   \
+   PKT_TX_L4_MASK |\
+   PKT_TX_OUTER_IP_CKSUM | \
+   PKT_TX_TCP_SEG |\
+   PKT_TX_QINQ_PKT |   \
+   PKT_TX_VLAN_PKT)
+
+#define I40E_TX_OFFLOAD_NOTSUP_MASK \
+   (PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK)
+
 static uint16_t i40e_xmit_pkts_simple(void *tx_queue,
  struct rte_mbuf **tx_pkts,
  uint16_t nb_pkts);
@@ -1411,6 +1424,61 @@ i40e_xmit_pkts_simple(void *tx_queue,
return nb_tx;
 }

+/*
+ *
+ *  TX prep functions
+ *
+ **/
+uint16_t
+i40e_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts)
+{
+   int i, ret;
+   uint64_t ol_flags;
+   struct rte_mbuf *m;
+
+   for (i = 0; i < nb_pkts; i++) {
+   m = tx_pkts[i];
+   ol_flags = m->ol_flags;
+
+   /**
+* m->nb_segs is uint8_t, so m->nb_segs is always less than
+* I40E_TX_MAX_SEG.
+* We check only a condition for m->nb_segs > I40E_TX_MAX_MTU_SEG.
+*/
+   if (!(ol_flags & PKT_TX_TCP_SEG)) {
+   if (m->nb_segs > I40E_TX_MAX_MTU_SEG) {
+   rte_errno = -EINVAL;
+   return i;
+   }
+   } else if ((m->tso_segsz < I40E_MIN_TSO_MSS) ||
+   (m->tso_segsz > I40E_MAX_TSO_MSS)) {
+   /* MSS outside the range (256B - 9674B) are considered malicious */
+   rte_errno = -EINVAL;
+   return i;
+   }
+
+   if (ol_flags & I40E_TX_OFFLOAD_NOTSUP_MASK) {
+   rte_errno = -ENOTSUP;
+   return i;
+   }
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+   ret = rte_validate_tx_offload(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+#endif
+   ret = rte_phdr_cksum_fix(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+   }
+   return i;
+}
+
 /*
  * Find the VSI the queue belongs to. 'queue_idx' is the queue index
  * application used, which assume having sequential ones. But from driver's
@@ -2763,9 +2831,11 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
dev->tx_pkt_burst = i40e_xmit_pkts_simple;
}
+   dev->tx_pkt_prep = NULL;
} else {
PMD_INIT_LOG(DEBUG, "Xmit tx finally be used.");
dev->tx_pkt_burst = i40e_xmit_pkts;
+

[dpdk-dev] [PATCH v11 3/6] fm10k: add Tx preparation

2016-10-26 Thread Tomasz Kulasek
Signed-off-by: Tomasz Kulasek 
---
 drivers/net/fm10k/fm10k.h|6 +
 drivers/net/fm10k/fm10k_ethdev.c |5 
 drivers/net/fm10k/fm10k_rxtx.c   |   50 +-
 3 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 05aa1a2..c6fed21 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -69,6 +69,9 @@
 #define FM10K_MAX_RX_DESC  (FM10K_MAX_RX_RING_SZ / sizeof(union fm10k_rx_desc))
 #define FM10K_MAX_TX_DESC  (FM10K_MAX_TX_RING_SZ / sizeof(struct fm10k_tx_desc))

+#define FM10K_TX_MAX_SEG UINT8_MAX
+#define FM10K_TX_MAX_MTU_SEG UINT8_MAX
+
 /*
  * byte aligment for HW RX data buffer
  * Datasheet requires RX buffer addresses shall either be 512-byte aligned or
@@ -356,6 +359,9 @@ fm10k_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

+uint16_t fm10k_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+
 int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
 int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
 void fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue *rxq);
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index c804436..dffb6d1 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -1446,6 +1446,8 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
.nb_max = FM10K_MAX_TX_DESC,
.nb_min = FM10K_MIN_TX_DESC,
.nb_align = FM10K_MULT_TX_DESC,
+   .nb_seg_max = FM10K_TX_MAX_SEG,
+   .nb_mtu_seg_max = FM10K_TX_MAX_MTU_SEG,
};

dev_info->speed_capa = ETH_LINK_SPEED_1G | ETH_LINK_SPEED_2_5G |
@@ -2754,8 +2756,10 @@ fm10k_set_tx_function(struct rte_eth_dev *dev)
fm10k_txq_vec_setup(txq);
}
dev->tx_pkt_burst = fm10k_xmit_pkts_vec;
+   dev->tx_pkt_prep = NULL;
} else {
dev->tx_pkt_burst = fm10k_xmit_pkts;
+   dev->tx_pkt_prep = fm10k_prep_pkts;
PMD_INIT_LOG(DEBUG, "Use regular Tx func");
}
 }
@@ -2834,6 +2838,7 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
dev->dev_ops = &fm10k_eth_dev_ops;
dev->rx_pkt_burst = &fm10k_recv_pkts;
dev->tx_pkt_burst = &fm10k_xmit_pkts;
+   dev->tx_pkt_prep = &fm10k_prep_pkts;

/* only initialize in the primary process */
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index 32cc7ff..5fc4d5a 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2013-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2013-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -35,6 +35,7 @@

 #include 
 #include 
+#include 
 #include "fm10k.h"
 #include "base/fm10k_type.h"

@@ -65,6 +66,15 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
 }
 #endif

+#define FM10K_TX_OFFLOAD_MASK (  \
+   PKT_TX_VLAN_PKT |\
+   PKT_TX_IP_CKSUM |\
+   PKT_TX_L4_MASK | \
+   PKT_TX_TCP_SEG)
+
+#define FM10K_TX_OFFLOAD_NOTSUP_MASK \
+   (PKT_TX_OFFLOAD_MASK ^ FM10K_TX_OFFLOAD_MASK)
+
 /* @note: When this function is changed, make corresponding change to
  * fm10k_dev_supported_ptypes_get()
  */
@@ -597,3 +607,41 @@ fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,

return count;
 }
+
+uint16_t
+fm10k_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts)
+{
+   int i, ret;
+   struct rte_mbuf *m;
+
+   for (i = 0; i < nb_pkts; i++) {
+   m = tx_pkts[i];
+
+   if ((m->ol_flags & PKT_TX_TCP_SEG) &&
+   (m->tso_segsz < FM10K_TSO_MINMSS)) {
+   rte_errno = -EINVAL;
+   return i;
+   }
+
+   if (m->ol_flags & FM10K_TX_OFFLOAD_NOTSUP_MASK) {
+   rte_errno = -ENOTSUP;
+   return i;
+   }
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+   ret = rte_validate_tx_offload(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+#endif
+   ret = rte_phdr_cksum_fix(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+   }
+
+   return i;
+}
-- 
1.7.9.5



[dpdk-dev] [PATCH v11 2/6] e1000: add Tx preparation

2016-10-26 Thread Tomasz Kulasek
Signed-off-by: Tomasz Kulasek 
---
 drivers/net/e1000/e1000_ethdev.h |   11 
 drivers/net/e1000/em_ethdev.c|5 +++-
 drivers/net/e1000/em_rxtx.c  |   48 ++-
 drivers/net/e1000/igb_ethdev.c   |4 +++
 drivers/net/e1000/igb_rxtx.c |   52 +-
 5 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 6c25c8d..bd0f277 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -138,6 +138,11 @@
 #define E1000_MISC_VEC_ID   RTE_INTR_VEC_ZERO_OFFSET
 #define E1000_RX_VEC_START  RTE_INTR_VEC_RXTX_OFFSET

+#define IGB_TX_MAX_SEG UINT8_MAX
+#define IGB_TX_MAX_MTU_SEG UINT8_MAX
+#define EM_TX_MAX_SEG  UINT8_MAX
+#define EM_TX_MAX_MTU_SEG  UINT8_MAX
+
 /* structure for interrupt relative data */
 struct e1000_interrupt {
uint32_t flags;
@@ -315,6 +320,9 @@ void eth_igb_tx_init(struct rte_eth_dev *dev);
 uint16_t eth_igb_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

+uint16_t eth_igb_prep_pkts(void *txq, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+
 uint16_t eth_igb_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);

@@ -376,6 +384,9 @@ void eth_em_tx_init(struct rte_eth_dev *dev);
 uint16_t eth_em_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

+uint16_t eth_em_prep_pkts(void *txq, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+
 uint16_t eth_em_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 7cf5f0c..17b45cb 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -300,6 +300,7 @@ eth_em_dev_init(struct rte_eth_dev *eth_dev)
eth_dev->dev_ops = &eth_em_ops;
eth_dev->rx_pkt_burst = (eth_rx_burst_t)&eth_em_recv_pkts;
eth_dev->tx_pkt_burst = (eth_tx_burst_t)&eth_em_xmit_pkts;
+   eth_dev->tx_pkt_prep = (eth_tx_prep_t)&eth_em_prep_pkts;

/* for secondary processes, we don't initialise any further as primary
 * has already done this work. Only check we don't need a different
@@ -1067,6 +1068,8 @@ eth_em_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
.nb_max = E1000_MAX_RING_DESC,
.nb_min = E1000_MIN_RING_DESC,
.nb_align = EM_TXD_ALIGN,
+   .nb_seg_max = EM_TX_MAX_SEG,
+   .nb_mtu_seg_max = EM_TX_MAX_MTU_SEG,
};

dev_info->speed_capa = ETH_LINK_SPEED_10M_HD | ETH_LINK_SPEED_10M |
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index 41f51c0..5bd3c99 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -66,6 +66,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "e1000_logs.h"
@@ -77,6 +78,14 @@

 #define E1000_RXDCTL_GRAN  0x01000000 /* RXDCTL Granularity */

+#define E1000_TX_OFFLOAD_MASK ( \
+   PKT_TX_IP_CKSUM |   \
+   PKT_TX_L4_MASK |\
+   PKT_TX_VLAN_PKT)
+
+#define E1000_TX_OFFLOAD_NOTSUP_MASK \
+   (PKT_TX_OFFLOAD_MASK ^ E1000_TX_OFFLOAD_MASK)
+
 /**
  * Structure associated with each descriptor of the RX ring of a RX queue.
  */
@@ -618,6 +627,43 @@ end_of_tx:

 /*
  *
+ *  TX prep functions
+ *
+ **/
+uint16_t
+eth_em_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts)
+{
+   int i, ret;
+   struct rte_mbuf *m;
+
+   for (i = 0; i < nb_pkts; i++) {
+   m = tx_pkts[i];
+
+   if (m->ol_flags & E1000_TX_OFFLOAD_NOTSUP_MASK) {
+   rte_errno = -ENOTSUP;
+   return i;
+   }
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+   ret = rte_validate_tx_offload(m);
+   if (ret != 0) {
+   rte_errno = ret;
+   return i;
+   }
+#endif
+   ret = rte_phdr_cksum_fix(m);
+   if (ret != 0) {
+   rte_errno = 

[dpdk-dev] [PATCH v11 1/6] ethdev: add Tx preparation

2016-10-26 Thread Tomasz Kulasek
Added API for `rte_eth_tx_prep`

uint16_t rte_eth_tx_prep(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

Added fields to the `struct rte_eth_desc_lim`:

uint16_t nb_seg_max;
/**< Max number of segments per whole packet. */

uint16_t nb_mtu_seg_max;
/**< Max number of segments per one MTU */

Added functions:

int rte_validate_tx_offload(struct rte_mbuf *m)
to validate the general requirements for the Tx offloads set in the
packet's mbuf, such as flag completeness. In the current implementation
this function is called optionally, only when RTE_LIBRTE_ETHDEV_DEBUG
is enabled.

int rte_phdr_cksum_fix(struct rte_mbuf *m)
to fix pseudo header checksum for TSO and non-TSO tcp/udp packets
before hardware tx checksum offload.
 - for non-TSO tcp/udp packets full pseudo-header checksum is
   counted and set.
 - for TSO the IP payload length is not included.
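
The two cases above can be illustrated with a minimal, self-contained
sketch. It is not the rte_phdr_cksum_fix() implementation: the helper
names (ones_sum16, ipv4_phdr_sum) are hypothetical, only IPv4 is
covered, and the TSO case is modelled simply by using a zero length in
the pseudo-header. The value is returned uncomplemented in host order;
how a driver stores it in the packet is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One's-complement sum of big-endian 16-bit words, folded to 16 bits. */
static uint16_t
ones_sum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Hypothetical helper: sum over the IPv4 pseudo-header
 * (src, dst, zero, protocol, L4 length). For TSO the length
 * field is left at zero (assumption made for this sketch).
 */
static uint16_t
ipv4_phdr_sum(const uint8_t src[4], const uint8_t dst[4], uint8_t proto,
              uint16_t l4_len, int is_tso)
{
    uint8_t ph[12];
    uint16_t len = is_tso ? 0 : l4_len;

    assert(src != NULL && dst != NULL);
    memcpy(&ph[0], src, 4);
    memcpy(&ph[4], dst, 4);
    ph[8] = 0;
    ph[9] = proto;
    ph[10] = (uint8_t)(len >> 8);
    ph[11] = (uint8_t)(len & 0xff);
    return ones_sum16(ph, sizeof(ph));
}
```

The only difference between the two paths in this sketch is the length
word, which is why the TSO and non-TSO results differ by exactly the L4
length.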

PERFORMANCE TESTS
-----------------

This feature was tested with modified csum engine from test-pmd.

The packet checksum preparation was moved from application to Tx
preparation step placed before burst.

We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)

We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.

IMPACT:

1) With the Tx preparation callback unimplemented, the performance
   impact is negligible.
2) With the packet condition check but without checksum modifications
   (nb_segs, available offloads, etc.), the result is
   14626628/14252168 (~2.62% drop).
3) With full support in the ixgbe driver (point 2 + packet checksum
   initialization), the result is 14060924/13588094 (~3.48% drop).

Signed-off-by: Tomasz Kulasek 
---
 config/common_base|1 +
 lib/librte_ether/rte_ethdev.h |  103 +
 lib/librte_mbuf/rte_mbuf.h|   64 +
 lib/librte_net/rte_net.h  |   85 ++
 4 files changed, 253 insertions(+)

diff --git a/config/common_base b/config/common_base
index c7fd3db..619284b 100644
--- a/config/common_base
+++ b/config/common_base
@@ -120,6 +120,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
+CONFIG_RTE_ETHDEV_TX_PREP=y

 #
 # Support NIC bypass logic
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 38641e8..cf6f68e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -182,6 +182,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 
 #include "rte_ether.h"
 #include "rte_eth_ctrl.h"
 #include "rte_dev_info.h"
@@ -699,6 +700,8 @@ struct rte_eth_desc_lim {
uint16_t nb_max;   /**< Max allowed number of descriptors. */
uint16_t nb_min;   /**< Min allowed number of descriptors. */
uint16_t nb_align; /**< Number of descriptors should be aligned to. */
+   uint16_t nb_seg_max; /**< Max number of segments per whole packet. 
*/
+   uint16_t nb_mtu_seg_max; /**< Max number of segments per one MTU */
 };

 /**
@@ -1188,6 +1191,11 @@ typedef uint16_t (*eth_tx_burst_t)(void *txq,
   uint16_t nb_pkts);
 /**< @internal Send output packets on a transmit queue of an Ethernet device. 
*/

+typedef uint16_t (*eth_tx_prep_t)(void *txq,
+  struct rte_mbuf **tx_pkts,
+  uint16_t nb_pkts);
+/**< @internal Prepare output packets on a transmit queue of an Ethernet 
device. */
+
 typedef int (*flow_ctrl_get_t)(struct rte_eth_dev *dev,
   struct rte_eth_fc_conf *fc_conf);
 /**< @internal Get current flow control parameter on an Ethernet device */
@@ -1622,6 +1630,7 @@ struct rte_eth_rxtx_callback {
 struct rte_eth_dev {
eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
+   eth_tx_prep_t tx_pkt_prep; /**< Pointer to PMD transmit prepare 
function. */
struct rte_eth_dev_data *data;  /**< Pointer to device data */
const struct eth_driver *driver;/**< Driver for this device */
const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
@@ -2816,6 +2825,100 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, 
nb_pkts);
 }

+/**
+ * Process a burst of output packets on a transmit queue of an Ethernet device.
+ *
+ * The rte_eth_tx_prep() function is invoked to prepare output packets to be
+ * transmitted on the output queue *queue_id* of the Ethernet device designated
+ * by its *port_id*.
+ 

[dpdk-dev] [PATCH v11 0/6] add Tx preparation

2016-10-26 Thread Tomasz Kulasek
As discussed in that thread:

http://dpdk.org/ml/archives/dev/2015-September/023603.html

Different NIC models depending on HW offload requested might impose
different requirements on packets to be TX-ed in terms of:

 - Max number of fragments per packet allowed
 - Max number of fragments per TSO segments
 - The way pseudo-header checksum should be pre-calculated
 - L3/L4 header fields filling
 - etc.


MOTIVATION:
-----------

1) Some work cannot (and should not) be done in rte_eth_tx_burst.
   However, this work is sometimes required, and today it is left to
   the application.

2) Different hardware may have different requirements for TX offloads;
   a different subset may be supported, and so on.

3) Some parameters (e.g. number of segments in the ixgbe driver) may
   hang the device. These parameters may vary between devices.

   For example, i40e HW allows 8 fragments per packet, but that is after
   TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.

4) Fields in the packet may require different initialization (e.g.
   pseudo-header checksum precalculation, sometimes done in a
   different way depending on packet type, and so on). Today the
   application needs to take care of it.

5) Using an additional API (rte_eth_tx_prep) before rte_eth_tx_burst
   allows the packet burst to be prepared in a form acceptable to the
   specific device.

6) Some additional checks may be done in debug mode, keeping the
   tx_burst implementation clean.


PROPOSAL:
---------

To help users deal with all these variations we propose to:

1) Introduce the rte_eth_tx_prep() function to do the necessary
   preparation of a packet burst so that it can be safely transmitted
   on the device with the desired HW offloads (set/reset checksum field
   according to the hardware requirements) and to check HW constraints
   (number of segments per packet, etc).

   Since the limitations and requirements may differ between devices,
   this requires extending the rte_eth_dev structure with a new
   function pointer "tx_pkt_prep", which can be implemented in the
   driver to prepare and verify packets in a device-specific way before
   the burst. This should prevent the application from sending
   malformed packets.

2) Introduce new fields in rte_eth_desc_lim: nb_seg_max and
   nb_mtu_seg_max, providing information about the maximum number of
   segments in TSO and non-TSO packets accepted by the device.

   This information helps the application avoid creating malformed
   packets, or limit them.
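
As a rough illustration of how an application might use the two limits,
here is a minimal sketch. The struct and function names
(tx_seg_limits, tx_segs_within_limits) are hypothetical stand-ins, not
part of the patch, and the mapping "TSO packet is checked against
nb_seg_max, non-TSO packet against nb_mtu_seg_max" is an assumption
based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two new rte_eth_desc_lim fields. */
struct tx_seg_limits {
    uint16_t nb_seg_max;     /* max segments per whole packet */
    uint16_t nb_mtu_seg_max; /* max segments per one MTU */
};

/*
 * Return true when a packet made of nb_segs segments fits the limit
 * that applies to it (assumed: whole-packet limit for TSO, per-MTU
 * limit otherwise).
 */
static bool
tx_segs_within_limits(const struct tx_seg_limits *lim, uint16_t nb_segs,
                      bool is_tso)
{
    assert(lim != NULL);
    return nb_segs <= (is_tso ? lim->nb_seg_max : lim->nb_mtu_seg_max);
}
```

An application could run such a check while building the mbuf chain,
before the burst ever reaches the Tx preparation step.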


APPLICATION (USE CASE):
-----------------------

1) The application should initialize the burst of packets to send and
   set the required Tx offload flags and fields, like l2_len, l3_len,
   l4_len, and tso_segsz.

2) The application passes the burst to rte_eth_tx_prep to check the
   conditions required to send the packets through the NIC.

3) The result of rte_eth_tx_prep can be used to send the valid packets
   and/or to restore the invalid ones if the function fails.

e.g.

    for (i = 0; i < nb_pkts; i++) {

        /* initialize or process packet */

        bufs[i]->tso_segsz = 800;
        bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
                | PKT_TX_IP_CKSUM;
        bufs[i]->l2_len = sizeof(struct ether_hdr);
        bufs[i]->l3_len = sizeof(struct ipv4_hdr);
        bufs[i]->l4_len = sizeof(struct tcp_hdr);
    }

    /* Prepare burst of TX packets */
    nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);

    if (nb_prep < nb_pkts) {
        printf("tx_prep failed\n");

        /* Here nb_prep is the index of the first invalid packet.
         * rte_eth_tx_prep can be called again on the remaining
         * packets to find further invalid ones.
         */

    }

    /* Send burst of TX packets */
    nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);

    /* Free any unsent packets. */
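
The comment about calling the preparation step again on the remaining
packets can be sketched as a loop. This is only an illustration under
stated assumptions: struct pkt, mock_prep and prep_skip_invalid are
hypothetical names, and mock_prep merely imitates the documented
contract (return the number of leading packets that passed) so the
sketch builds without DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet type; 'valid' stands for "passes Tx prep". */
struct pkt {
    int valid;
};

/* Same shape as the prep contract: number of leading good packets. */
typedef uint16_t (*prep_fn)(struct pkt **pkts, uint16_t nb_pkts);

/* Mock stand-in for rte_eth_tx_prep(): stops at first invalid packet. */
static uint16_t
mock_prep(struct pkt **pkts, uint16_t nb_pkts)
{
    uint16_t i;

    for (i = 0; i < nb_pkts; i++)
        if (!pkts[i]->valid)
            break;
    return i;
}

/*
 * Re-run prep after each failure. Valid packets are compacted to the
 * front of pkts[]; rejected ones are collected in rejected[], which
 * must have room for nb_pkts entries. Returns the number kept.
 */
static uint16_t
prep_skip_invalid(prep_fn prep, struct pkt **pkts, uint16_t nb_pkts,
                  struct pkt **rejected, uint16_t *nb_rejected)
{
    uint16_t kept = 0, pos = 0, nrej = 0;

    assert(prep != NULL && pkts != NULL);
    assert(rejected != NULL && nb_rejected != NULL);

    while (pos < nb_pkts) {
        uint16_t ok = prep(&pkts[pos], (uint16_t)(nb_pkts - pos));
        uint16_t i;

        for (i = 0; i < ok; i++)
            pkts[kept++] = pkts[pos++];
        if (pos < nb_pkts)
            rejected[nrej++] = pkts[pos++]; /* first invalid packet */
    }
    *nb_rejected = nrej;
    return kept;
}
```

After such a loop the first 'kept' entries could be handed to the burst
function, and the rejected ones fixed up or freed by the application.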

v11 changes:
 - updated comments
 - added information to the API description about packet data
   requirements/limitations.

v10 changes:
 - moved driver's Tx callback check in rte_eth_tx_prep after queue_id check

v9 changes:
 - fixed headers structure fragmentation check
 - moved fragmentation check into rte_validate_tx_offload()

v8 changes:
 - mbuf argument in rte_validate_tx_offload declared as const

v7 changes:
 - comments reworded/added
 - changed errno values returned from Tx prep API
 - added check in rte_phdr_cksum_fix that headers are in the first
   data segment and can be safely modified
 - moved rte_validate_tx_offload to rte_mbuf
 - moved rte_phdr_cksum_fix to rte_net.h
 - removed rte_pkt.h new file as useless

v6 changes:
 - added performance impact test results to the patch description

v5 changes:
 - rebased csum engine modification
 - added information to the csum engine about performance tests
 - some performance improvements

v4 changes:
 - tx_prep is now set to default behavior (NULL) for simple/vector path
   in fm10k, i40e and ixgbe drivers to increase performance, when
   Tx offloads are intentionally not available

v3 changes:
 - 

[dpdk-dev] [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB with SRIOV

2016-10-26 Thread Wu, Jingjing


> -Original Message-
> From: Iremonger, Bernard
> Sent: Wednesday, October 26, 2016 12:51 AM
> To: dev at dpdk.org; Shah, Rahul R ; Lu, Wenzhuo
> ; Dumitrescu, Cristian  intel.com>; Wu, Jingjing
> 
> Cc: Iremonger, Bernard 
> Subject: [PATCH v6 1/2] net/ixgbe: support multiqueue mode VMDq DCB with SRIOV
> 
> The folowing changes have been made to allow Data Centre Bridge
> (DCB) configuration when SRIOV is enabled.
> 
> Modify ixgbe_check_mq_mode function,
> when SRIOV is enabled, enable mq_mode
> ETH_MQ_RX_VMDQ_DCB and ETH_MQ_TX_VMDQ_DCB.
> 
> Modify ixgbe_dcb_tx_hw_config function,
> replace the struct ixgbe_hw parameter with a
> struct rte_eth_dev parameter and handle SRIOV enabled.
> 
> Modify ixgbe_dev_mq_rx_configure function,
> when SRIOV is enabled, enable mq_mode ETH_MQ_RX_VMDQ_DCB.
> 
> Modify ixgbe_configure_dcb function,
> revise check on dev->data->nb_rx_queues.
> 
> Signed-off-by: Rahul R Shah 
> Signed-off-by: Bernard Iremonger 
> Acked-by: Wenzhuo Lu 
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 11 ++-
>  drivers/net/ixgbe/ixgbe_rxtx.c   | 35 ++-
>  2 files changed, 28 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 4ca5747..4d5ce83 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -1977,6 +1977,9 @@ ixgbe_check_mq_mode(struct rte_eth_dev *dev)
>   /* check multi-queue mode */
>   switch (dev_conf->rxmode.mq_mode) {
>   case ETH_MQ_RX_VMDQ_DCB:
> + PMD_INIT_LOG(INFO, "ETH_MQ_RX_VMDQ_DCB mode supported 
> in SRIOV");
> + dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_DCB;
This assignment is redundant: mq_mode is already ETH_MQ_RX_VMDQ_DCB.
It would also be better to check that nb_queue is valid.




[dpdk-dev] [PATCH v6 2/2] app/test_pmd: fix DCB configuration

2016-10-26 Thread Wu, Jingjing

>   if (dcb_mode == DCB_VT_ENABLED) {
> - nb_rxq = rte_port->dev_info.max_rx_queues;
> - nb_txq = rte_port->dev_info.max_tx_queues;
> + nb_rxq = 1;
> + nb_txq = 1;
Previously, the 'vt' argument of the dcb command was used to distinguish
whether VMDQ is involved; it does not indicate whether SRIOV is enabled.
I guess you want to use mode=ETH_MQ_RX_VMDQ_DCB to cover both the VMDQ+DCB
and SRIOV+DCB cases.
But if nb_rxq is set to 1, the VMDQ+DCB case will not work in testpmd. And
even if we don't care about the VMDQ+DCB case, setting the number of queues
to 1 makes no sense for DCB (queues are based on TC).

Thanks
Jingjing


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Bruce Richardson
On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> > 
> > Hi All,
> > 
> > I will propose a minor change to the rte_event struct, allowing some bits 
> > to be implementation specific. Currently the rte_event struct has no space 
> > to allow an implementation store any metadata about the event. For software 
> > performance it would be really helpful if there are some bits available for 
> > the implementation to keep some flags about each event.
> 
> OK.
> 
> > 
> > I suggest to rework the struct as below which opens 6 bits that were 
> > otherwise wasted, and define them as implementation specific. By 
> > implementation specific it is understood that the implementation can 
> > overwrite any information stored in those bits, and the application must 
> > not expect the data to remain after the event is scheduled.
> > 
> > OLD:
> > struct rte_event {
> > uint32_t flow_id:24;
> > uint32_t queue_id:8;
> > uint8_t  sched_type; /* Note only 2 bits of 8 are required */
> > 
> > NEW:
> > struct rte_event {
> > uint32_t flow_id:24;
> > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the 
> > enqueue types Ordered,Atomic,Parallel.*/
> > uint32_t implementation:6; /* available for implementation specific 
> > metadata */
> > uint8_t queue_id; /* still 8 bits as before */
> > 
> > 
> > Thoughts? -Harry
> 
> Looks good to me. I will add it in v3.
> 
Thanks. One other suggestion is that it might be useful to provide
support for having typed queues explicitly in the API. Right now, when
you create a queue, the queue_conf structure takes as parameters how
many atomic flows are needed for the queue, or how many reorder
slots need to be reserved for it. This implicitly hints at the type of
traffic which will be sent to the queue, but I'm wondering if it's
better to make it explicit. There are certain optimisations that can be
looked at if we know that a queue only handles packets of a particular
type. [Not having to handle reordering when pulling events from a core
can be a big win for software!].

How about adding: "allowed_event_types" as a field to
rte_event_queue_conf, with possible values:
* atomic
* ordered
* parallel
* mixed - allowing all 3 types. I think allowing 2 of three types might
make things too complicated.

An open question would then be how to behave when the queue type and
requested event type conflict. We can either throw an error, or just
ignore the event type and always treat enqueued events as being of the
queue type. I prefer the latter, because it's faster not having to
error-check, and it pushes the responsibility on the app to know what
it's doing.

/Bruce
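
The preferred behaviour described above (ignore the requested event type
when the queue has a fixed type) could look roughly like this. The enum
and function names are hypothetical and exist only for this sketch; they
are not taken from rte_eventdev.h.

```c
#include <assert.h>

/* Hypothetical per-event scheduling types. */
enum sched_type {
    SCHED_ORDERED,
    SCHED_ATOMIC,
    SCHED_PARALLEL
};

/* Hypothetical values for an "allowed_event_types" queue field. */
enum queue_event_types {
    QUEUE_ATOMIC,
    QUEUE_ORDERED,
    QUEUE_PARALLEL,
    QUEUE_MIXED /* all three types allowed */
};

/*
 * Resolve the type actually used for an enqueued event: a typed queue
 * overrides whatever the event requested, with no error check; only a
 * mixed queue honours the per-event type.
 */
static enum sched_type
effective_sched_type(enum queue_event_types q, enum sched_type requested)
{
    switch (q) {
    case QUEUE_ATOMIC:
        return SCHED_ATOMIC;
    case QUEUE_ORDERED:
        return SCHED_ORDERED;
    case QUEUE_PARALLEL:
        return SCHED_PARALLEL;
    case QUEUE_MIXED:
        return requested;
    }
    assert(0 && "unknown queue type");
    return requested;
}
```

With this shape the enqueue path has no failure branch, which is the
performance argument made above; the cost is that a mismatched event is
silently rescheduled as the queue type.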


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Bruce Richardson
On Tue, Oct 25, 2016 at 11:19:05PM +0530, Jerin Jacob wrote:
> On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> > Thanks to Intel and NXP folks for the positive and constructive feedback
> > I've received so far. Here is the updated RFC(v2).
> > 
> > I've attempted to address as many comments as possible.
> > 
> > This series adds rte_eventdev.h to the DPDK tree with
> > adequate documentation in doxygen format.
> > 
> > Updates are also available online:
> > 
> > Related draft header file (this patch):
> > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> > 
> > PDF version(doxgen output):
> > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> > 
> > Repo:
> > https://github.com/jerinjacobk/libeventdev
> >
> 
> Hi Community,
> 
> So far, I have received constructive feedback from Intel, NXP and Linaro 
> folks.
> Let me know, if anyone else interested in contributing to the definition of 
> eventdev?
> 
> If there are no major issues in proposed spec, then Cavium would like work on
> implementing and up-streaming the common code(lib/librte_eventdev/) and
> an associated HW driver.(Requested minor changes of v2 will be addressed
> in next version).
> 
> We are planning to submit the work for 17.02 or 17.05 release(based on
> how implementation goes).
> 

Hi Jerin,

thanks for driving this. In terms of the common code framework, when
would you see that you might have something to upstream for that? As you
know, we've been working on a software implementation which we are now
looking to move to the eventdev APIs, and which also needs this common
code to support it. 

If it can accelerate this effort, we can perhaps provide as an RFC
the common code part that we have implemented for our work, or else we
are happy to migrate to use common code you provide if it can be
upstreamed fairly soon.

Regards,
/Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Van Haaren, Harry
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> 
> So far, I have received constructive feedback from Intel, NXP and Linaro 
> folks.
> Let me know, if anyone else interested in contributing to the definition of 
> eventdev?
> 
> If there are no major issues in proposed spec, then Cavium would like work on
> implementing and up-streaming the common code(lib/librte_eventdev/) and
> an associated HW driver.(Requested minor changes of v2 will be addressed
> in next version).

Hi All,

I would like to propose a minor change to the rte_event struct, allowing
some bits to be implementation specific. Currently the rte_event struct has
no space to allow an implementation to store any metadata about the event.
For software performance it would be really helpful if some bits were
available for the implementation to keep flags about each event.

I suggest reworking the struct as below, which opens up 6 bits that were
otherwise wasted, and defining them as implementation specific. By
implementation specific it is understood that the implementation can
overwrite any information stored in those bits, and the application must
not expect the data to remain after the event is scheduled.

OLD:
struct rte_event {
    uint32_t flow_id:24;
    uint32_t queue_id:8;
    uint8_t  sched_type; /* Note only 2 bits of 8 are required */

NEW:
struct rte_event {
    uint32_t flow_id:24;
    uint32_t sched_type:2; /* reduced size: 2 bits are enough for the
                            * enqueue types Ordered, Atomic, Parallel. */
    uint32_t implementation:6; /* available for implementation-specific
                                * metadata */
    uint8_t queue_id; /* still 8 bits as before */


Thoughts? -Harry
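
A minimal sketch of the proposed layout, for illustration only: the
struct below (event_hdr_sketch) copies just the fields shown above, and
impl_set_flags is a hypothetical helper showing an implementation
writing its own metadata into the 6 spare bits without touching the
other fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the fields quoted in the proposal; the rest is omitted. */
struct event_hdr_sketch {
    uint32_t flow_id:24;
    uint32_t sched_type:2;     /* Ordered, Atomic, Parallel */
    uint32_t implementation:6; /* implementation-specific metadata */
    uint8_t  queue_id;
};

/*
 * Hypothetical helper: the implementation stores up to 6 bits of its
 * own flags; anything above bit 5 is masked off.
 */
static void
impl_set_flags(struct event_hdr_sketch *ev, unsigned int flags)
{
    assert(ev != NULL);
    ev->implementation = flags & 0x3Fu;
}
```

Because the implementation may overwrite these bits at any time, an
application following the proposal would treat them as scratch space
owned by the scheduler.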


[dpdk-dev] [PATCH v10 11/25] eal/pci: helpers for device name parsing/update

2016-10-26 Thread Shreyansh Jain
Hello Reshma,

On Tuesday 25 October 2016 09:19 PM, Pattan, Reshma wrote:
> Hi Shreyansh,
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shreyansh Jain
>> Sent: Friday, September 16, 2016 5:30 AM
>> To: dev at dpdk.org
>> Cc: viktorin at rehivetech.com; David Marchand ;
>> hemant.agrawal at nxp.com; Thomas Monjalon
>> ; Shreyansh Jain 
>> Subject: [dpdk-dev] [PATCH v10 11/25] eal/pci: helpers for device name
>> parsing/update
>>
>> From: David Marchand 
>>
>> - Move rte_eth_dev_create_unique_device_name() from ether/rte_ethdev.c to
>>   common/include/rte_pci.h as rte_eal_pci_device_name(). Being a common
>>   method, can be used across crypto/net PCI PMDs.
>> - Remove crypto specific routine and fallback to common name function.
>> - Introduce a eal private Update function for PCI device naming.
>>
>> Signed-off-by: David Marchand 
>> [Shreyansh: Merge crypto/pci helper patches]
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_cryptodev/rte_cryptodev.c| 27 +++---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c | 49
>> +
>>  lib/librte_eal/common/eal_private.h | 13 +
>>  lib/librte_eal/common/include/rte_pci.h | 24 
>>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 13 +
>>  lib/librte_ether/rte_ethdev.c   | 24 +++-
>>  6 files changed, 107 insertions(+), 43 deletions(-)
>>
>> diff --git a/lib/librte_cryptodev/rte_cryptodev.c
>> b/lib/librte_cryptodev/rte_cryptodev.c
>> index 2a3b649..c81e366 100644
>> --- a/lib/librte_cryptodev/rte_cryptodev.c
>> +++ b/lib/librte_cryptodev/rte_cryptodev.c
>> @@ -365,23 +365,6 @@ rte_cryptodev_pmd_allocate(const char *name, int
>> socket_id)
>>  return cryptodev;
>>  }
>>
>>   *
>>   * This function is private to EAL.
>> diff --git a/lib/librte_eal/common/include/rte_pci.h
>> b/lib/librte_eal/common/include/rte_pci.h
>> index cf81898..e1f695f 100644
>> --- a/lib/librte_eal/common/include/rte_pci.h
>> +++ b/lib/librte_eal/common/include/rte_pci.h
>> @@ -82,6 +82,7 @@ extern "C" {
>>  /** Formatting string for PCI device identifier: Ex: :00:01.0 */  
>> #define
>> PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
>> +#define PCI_PRI_STR_SIZE sizeof(":XX:XX.X")
>>
>>  /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
>> #define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8 @@ -308,6
>>
>> +static inline void
>> +rte_eal_pci_device_name(const struct rte_pci_addr *addr,
>> +char *output, size_t size)
>> +{
>> +RTE_VERIFY(size >= PCI_PRI_STR_SIZE);
>> +RTE_VERIFY(snprintf(output, size, PCI_PRI_FMT,
>> +addr->domain, addr->bus,
>> +addr->devid, addr->function) >= 0); }
>> +
>>
>> +int
>> +pci_update_device(const struct rte_pci_addr *addr) {
>> +char filename[PATH_MAX];
>> +
>> +snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT,
>> + pci_get_sysfs_path(), addr->domain, addr->bus, addr->devid,
>> + addr->function);
>> +
>> +return pci_scan_one(filename, addr->domain, addr->bus, addr->devid,
>> +addr->function);
>> +}
>> +
>
>
> Earlier device names were created in the format "bus:deviceid.function" as 
> per the below ethdev API.
> Now after above new eal API the name format is "domain:bus:deviceid.func" was 
> that intentional  and why is that so.

Yes, this is intentional.
It is to bring the naming in sync with the device name being used for 
scanning on the bus (/sys/bus/pci/devices/AAAA:BB:CC.D/).
Also, it was proposed in a separate patch [1] but merged in this series.

[1] http://dpdk.org/ml/archives/dev/2016-July/044614.html

(Just as a note: I am not the original author of this patch but above is 
what I understood and acked it).
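
To make the difference between the two naming schemes concrete, here is
a small self-contained sketch. The format strings are the ones quoted in
this thread (PCI_PRI_FMT and the removed "%d:%d.%d"); the struct and
function names are hypothetical and only mimic rte_pci_addr for the
purpose of the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_pci_addr. */
struct pci_addr_sketch {
    uint16_t domain;
    uint8_t bus;
    uint8_t devid;
    uint8_t function;
};

/* New style: "domain:bus:devid.function", hex, zero padded. */
static int
pci_name_full(const struct pci_addr_sketch *a, char *out, size_t size)
{
    assert(a != NULL && out != NULL);
    return snprintf(out, size,
                    "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8,
                    a->domain, a->bus, a->devid, a->function);
}

/* Old ethdev style: "bus:devid.function", decimal, no padding. */
static int
pci_name_old(const struct pci_addr_sketch *a, char *out, size_t size)
{
    assert(a != NULL && out != NULL);
    return snprintf(out, size, "%d:%d.%d", a->bus, a->devid, a->function);
}

/* A lookup by the old-style name no longer matches the new one. */
static int
pci_names_differ(const struct pci_addr_sketch *a)
{
    char full[32], old[32];

    pci_name_full(a, full, sizeof(full));
    pci_name_old(a, old, sizeof(old));
    return strcmp(full, old) != 0;
}
```

Any code that looks a port up by name would need to build the name in
the new format for the lookup to succeed.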

>
>> -static int
>> -rte_eth_dev_create_unique_device_name(char *name, size_t size,
>> -struct rte_pci_device *pci_dev)
>> -{
>> -int ret;
>> -
>> -ret = snprintf(name, size, "%d:%d.%d",
>> -pci_dev->addr.bus, pci_dev->addr.devid,
>> -pci_dev->addr.function);
>> -if (ret < 0)
>> -return ret;
>> -return 0;
>> -}
>> -
>

-
Shreyansh


[dpdk-dev] [PATCH v2] release 16.07.1

2016-10-26 Thread Thomas Monjalon
From: Yuanhan Liu 

A quick download button is added for the latest stable release.

Signed-off-by: Yuanhan Liu 
Signed-off-by: Thomas Monjalon 
---
v2:
- buttons on one row
- md5 of tarball
---
 content.css   |  4 ++--
 download.html | 10 +++---
 rel.html  |  6 +++---
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/content.css b/content.css
index 4bd7240..4b469bd 100644
--- a/content.css
+++ b/content.css
@@ -105,11 +105,11 @@ section .button:hover {
 }
 section#download .button,
 section#rel .button {
-   width: 26%;
+   width: 18%;
 }
 section#download .button + .button,
 section#rel .button + .button {
-   margin-left: 11%;
+   margin-left: 9.3%;
 }
 section#events .button {
padding: 1em 3em;
diff --git a/download.html b/download.html
index 919c41e..dc5c802 100644
--- a/download.html
+++ b/download.html
@@ -41,15 +41,19 @@
Download
http://fast.dpdk.org/rel/dpdk-16.07.tar.xz; 
class="button">
archive
-   Latest Version: 16.07
+   Latest Major16.07
+   http://fast.dpdk.org/rel/dpdk-16.07.1.tar.xz; class="button">
+   archive
+   Latest Stable16.07.1

view_list
-   Other Versions
+   Other Versions

access_time
-   Quick Start Instructions
+   Quick Start


Applications
diff --git a/rel.html b/rel.html
index b95403a..ffdc513 100644
--- a/rel.html
+++ b/rel.html
@@ -59,9 +59,9 @@
md5


-   http://fast.dpdk.org/rel/dpdk-16.07.tar.xz;>DPDK 16.07
-   2016 July 28
-   690a2bb570103e58d12f9806e8bf21be
+   http://fast.dpdk.org/rel/dpdk-16.07.1.tar.xz;>DPDK 16.07.1
+   2016 October 26
+   28b3831cebfa94ca533418c17de25986


http://fast.dpdk.org/rel/dpdk-16.04.tar.xz;>DPDK 16.04
-- 
2.7.0



[dpdk-dev] [PATCH 2/2] net/mlx5: fix support for newer link speeds

2016-10-26 Thread Nelio Laranjeiro
Not all speed capabilities can be reported properly before Linux 4.8 (25G,
50G and 100G speeds are missing). Moreover, the API to retrieve them has
only existed since Linux 4.5. This commit therefore implements
compatibility code for all versions.

Fixes: e274f5732225 ("ethdev: add speed capabilities")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/Makefile  |  15 +
 drivers/net/mlx5/mlx5_ethdev.c | 123 -
 2 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 2c13c30..cf87f0b 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -121,6 +121,21 @@ mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
infiniband/mlx5_hw.h \
enum MLX5_OPCODE_TSO \
$(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
+   HAVE_ETHTOOL_LINK_MODE_25G \
+   /usr/include/linux/ethtool.h \
+   enum ETHTOOL_LINK_MODE_25000baseCR_Full_BIT \
+   $(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
+   HAVE_ETHTOOL_LINK_MODE_50G \
+   /usr/include/linux/ethtool.h \
+   enum ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT \
+   $(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
+   HAVE_ETHTOOL_LINK_MODE_100G \
+   /usr/include/linux/ethtool.h \
+   enum ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT \
+   $(AUTOCONF_OUTPUT)

 # Create mlx5_autoconf.h or update it in case it differs from the new one.

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 042c9bc..2d49f86 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -627,15 +627,15 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 }

 /**
- * DPDK callback to retrieve physical link information (unlocked version).
+ * Retrieve physical link information (unlocked version using legacy ioctl).
  *
  * @param dev
  *   Pointer to Ethernet device structure.
  * @param wait_to_complete
  *   Wait for request completion (ignored).
  */
-int
-mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
+static int
+mlx5_link_update_unlocked_gset(struct rte_eth_dev *dev, int wait_to_complete)
 {
struct priv *priv = mlx5_get_priv(dev);
struct ethtool_cmd edata = {
@@ -691,6 +691,123 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int 
wait_to_complete)
 }

 /**
+ * Retrieve physical link information (unlocked version using new ioctl from
+ * Linux 4.5).
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param wait_to_complete
+ *   Wait for request completion (ignored).
+ */
+static int
+mlx5_link_update_unlocked_gs(struct rte_eth_dev *dev, int wait_to_complete)
+{
+#ifdef ETHTOOL_GLINKSETTINGS
+   struct priv *priv = mlx5_get_priv(dev);
+   struct ethtool_link_settings edata = {
+   .cmd = ETHTOOL_GLINKSETTINGS,
+   };
+   struct ifreq ifr;
+   struct rte_eth_link dev_link;
+   uint64_t sc;
+
+   (void)wait_to_complete;
+   if (priv_ifreq(priv, SIOCGIFFLAGS, &ifr)) {
+   WARN("ioctl(SIOCGIFFLAGS) failed: %s", strerror(errno));
+   return -1;
+   }
+   memset(&dev_link, 0, sizeof(dev_link));
+   dev_link.link_status = ((ifr.ifr_flags & IFF_UP) &&
+   (ifr.ifr_flags & IFF_RUNNING));
+   ifr.ifr_data = (void *)&edata;
+   if (priv_ifreq(priv, SIOCETHTOOL, &ifr)) {
+   DEBUG("ioctl(SIOCETHTOOL, ETHTOOL_GLINKSETTINGS) failed: %s",
+ strerror(errno));
+   return -1;
+   }
+   dev_link.link_speed = edata.speed;
+   sc = edata.link_mode_masks[0] |
+   ((uint64_t)edata.link_mode_masks[1] << 32);
+   priv->link_speed_capa = 0;
+   /* Link speeds available in kernel v4.5. */
+   if (sc & ETHTOOL_LINK_MODE_Autoneg_BIT)
+   priv->link_speed_capa |= ETH_LINK_SPEED_AUTONEG;
+   if (sc & (ETHTOOL_LINK_MODE_1000baseT_Full_BIT |
+ ETHTOOL_LINK_MODE_1000baseKX_Full_BIT))
+   priv->link_speed_capa |= ETH_LINK_SPEED_1G;
+   if (sc & (ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT |
+ ETHTOOL_LINK_MODE_10000baseKR_Full_BIT |
+ ETHTOOL_LINK_MODE_10000baseR_FEC_BIT))
+   priv->link_speed_capa |= ETH_LINK_SPEED_10G;
+   if (sc & (ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT |
+ ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT))
+   priv->link_speed_capa |= ETH_LINK_SPEED_20G;
+   if (sc & (ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT |
+ ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT |
+ ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT |
+ ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT))
+   priv->link_speed_capa |= ETH_LINK_SPEED_40G;
+   if (sc & 
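
A note on the mask tests in this hunk: in the kernel's linux/ethtool.h the
ETHTOOL_LINK_MODE_*_BIT names appear to be bit indices rather than bit masks,
so a test against the combined 64-bit word probably needs a shift first. The
following is a minimal sketch of that idea, not the driver's code. The DEMO_*
names and helper functions are hypothetical, and the index values are
stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for a few link-mode bit indices. A real build would
 * use the ETHTOOL_LINK_MODE_*_BIT names from <linux/ethtool.h> instead.
 */
enum demo_link_mode_bit {
	DEMO_MODE_Autoneg_BIT = 6,
	DEMO_MODE_10000baseKR_Full_BIT = 19,
	DEMO_MODE_100000baseKR4_Full_BIT = 36,
};

/* Join the two 32-bit mask words, low word first, into one 64-bit word. */
static inline uint64_t
demo_combine_masks(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Turn a bit index into a single-bit mask of the 64-bit word. */
static inline uint64_t
demo_mode_mask(unsigned int bit)
{
	assert(bit < 64); /* shifting by 64 or more would be undefined */
	return UINT64_C(1) << bit;
}

/* Non-zero when the mode identified by bit index 'bit' is set in 'sc'. */
static inline int
demo_has_mode(uint64_t sc, unsigned int bit)
{
	return (sc & demo_mode_mask(bit)) != 0;
}
```

With this shape, a 10G check would read
demo_has_mode(sc, DEMO_MODE_10000baseKR_Full_BIT). ANDing sc with the raw
index (19 in this sketch) would test bits 0, 1 and 4 instead.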

[dpdk-dev] [PATCH 1/2] net/mlx5: fix link speed capability information

2016-10-26 Thread Nelio Laranjeiro
Make hard-coded values dynamic to return correct link speed capabilities
(not all ConnectX-4 NICs support everything).

Fixes: e274f5732225 ("ethdev: add speed capabilities")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5.h|  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 25 +++--
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 000fb38..d7976cb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -139,6 +139,7 @@ struct priv {
unsigned int reta_idx_n; /* RETA index size. */
struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
struct fdir_queue *fdir_drop_queue; /* Flow director drop queue. */
+   uint32_t link_speed_capa; /* Link speed capabilities. */
rte_spinlock_t lock; /* Lock for control functions. */
 };

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index b8b3ea9..042c9bc 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -604,15 +604,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *info)
info->hash_key_size = ((*priv->rss_conf) ?
   (*priv->rss_conf)[0]->rss_key_len :
   0);
-   info->speed_capa =
-   ETH_LINK_SPEED_1G |
-   ETH_LINK_SPEED_10G |
-   ETH_LINK_SPEED_20G |
-   ETH_LINK_SPEED_25G |
-   ETH_LINK_SPEED_40G |
-   ETH_LINK_SPEED_50G |
-   ETH_LINK_SPEED_56G |
-   ETH_LINK_SPEED_100G;
+   info->speed_capa = priv->link_speed_capa;
priv_unlock(priv);
 }

@@ -647,7 +639,7 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int 
wait_to_complete)
 {
struct priv *priv = mlx5_get_priv(dev);
struct ethtool_cmd edata = {
-   .cmd = ETHTOOL_GSET
+   .cmd = ETHTOOL_GSET /* Deprecated since Linux v4.5. */
};
struct ifreq ifr;
struct rte_eth_link dev_link;
@@ -672,6 +664,19 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int 
wait_to_complete)
dev_link.link_speed = 0;
else
dev_link.link_speed = link_speed;
+   priv->link_speed_capa = 0;
+   if (edata.supported & SUPPORTED_Autoneg)
+   priv->link_speed_capa |= ETH_LINK_SPEED_AUTONEG;
+   if (edata.supported & (SUPPORTED_1000baseT_Full |
+  SUPPORTED_1000baseKX_Full))
+   priv->link_speed_capa |= ETH_LINK_SPEED_1G;
+   if (edata.supported & SUPPORTED_10000baseKR_Full)
+   priv->link_speed_capa |= ETH_LINK_SPEED_10G;
+   if (edata.supported & (SUPPORTED_40000baseKR4_Full |
+  SUPPORTED_40000baseCR4_Full |
+  SUPPORTED_40000baseSR4_Full |
+  SUPPORTED_40000baseLR4_Full))
+   priv->link_speed_capa |= ETH_LINK_SPEED_40G;
dev_link.link_duplex = ((edata.duplex == DUPLEX_HALF) ?
ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
-- 
2.1.4



[dpdk-dev] [PATCH 0/2] mlx5: fix link speed capabilities

2016-10-26 Thread Nelio Laranjeiro
Make hard-coded values dynamic to return correct link speed capabilities
(not all ConnectX-4 NICs support everything).

Nelio Laranjeiro (2):
  net/mlx5: fix link speed capability information
  net/mlx5: fix support for newer link speeds

 drivers/net/mlx5/Makefile  |  15 +
 drivers/net/mlx5/mlx5.h|   1 +
 drivers/net/mlx5/mlx5_ethdev.c | 148 +
 3 files changed, 151 insertions(+), 13 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH] pdump: revert PCI device name conversion

2016-10-26 Thread Zhang, Roy Fan
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Tuesday, October 25, 2016 5:32 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma 
> Subject: [dpdk-dev] [PATCH] pdump: revert PCI device name conversion
> 
> Earlier ethdev library created the device names in the "bus:device.func"
> format hence pdump library implemented its own conversion method for
> changing the user passed device name format "domain:bus:device.func" to
> "bus:device.func"
> for finding the port id using device name using ethdev library calls. Now 
> after
> ethdev and eal rework http://dpdk.org/dev/patchwork/patch/15855/,
> the device names are created in the format "domain:bus:device.func", so
> pdump library conversion is not needed any more, hence removed the
> corresponding code.
> 
> Signed-off-by: Reshma Pattan 
> ---
>  lib/librte_pdump/rte_pdump.c | 37 ++---
>  1 file changed, 2 insertions(+), 35 deletions(-)
> 
> diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
> index ea5ccd9..504a1ce 100644
> --- a/lib/librte_pdump/rte_pdump.c
> +++ b/lib/librte_pdump/rte_pdump.c
> @@ -226,29 +226,6 @@ pdump_tx(uint8_t port __rte_unused, uint16_t qidx
> __rte_unused,  }
> 
>  static int
> -pdump_get_dombdf(char *device_id, char *domBDF, size_t len) -{
> - int ret;
> - struct rte_pci_addr dev_addr = {0};
> -
> - /* identify if device_id is pci address or name */
> - ret = eal_parse_pci_DomBDF(device_id, &dev_addr);
> - if (ret < 0)
> - return -1;
> -
> - if (dev_addr.domain)
> - ret = snprintf(domBDF, len, "%u:%u:%u.%u",
> dev_addr.domain,
> - dev_addr.bus, dev_addr.devid,
> - dev_addr.function);
> - else
> - ret = snprintf(domBDF, len, "%u:%u.%u", dev_addr.bus,
> - dev_addr.devid,
> - dev_addr.function);
> -
> - return ret;
> -}
> -
> -static int
>  pdump_regitser_rx_callbacks(uint16_t end_q, uint8_t port, uint16_t queue,
>   struct rte_ring *ring, struct rte_mempool
> *mp,
>   uint16_t operation)
> @@ -885,7 +862,6 @@ rte_pdump_enable_by_deviceid(char *device_id,
> uint16_t queue,
>   void *filter)
>  {
>   int ret = 0;
> - char domBDF[DEVICE_ID_SIZE];
> 
>   ret = pdump_validate_ring_mp(ring, mp);
>   if (ret < 0)
> @@ -894,11 +870,7 @@ rte_pdump_enable_by_deviceid(char *device_id,
> uint16_t queue,
>   if (ret < 0)
>   return ret;
> 
> - if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
> - ret = pdump_prepare_client_request(domBDF, queue, flags,
> - ENABLE, ring, mp, filter);
> - else
> - ret = pdump_prepare_client_request(device_id, queue,
> flags,
> + ret = pdump_prepare_client_request(device_id, queue, flags,
>   ENABLE, ring, mp, filter);
> 
>   return ret;
> @@ -928,17 +900,12 @@ rte_pdump_disable_by_deviceid(char *device_id,
> uint16_t queue,
>   uint32_t flags)
>  {
>   int ret = 0;
> - char domBDF[DEVICE_ID_SIZE];
> 
>   ret = pdump_validate_flags(flags);
>   if (ret < 0)
>   return ret;
> 
> - if (pdump_get_dombdf(device_id, domBDF, sizeof(domBDF)) > 0)
> - ret = pdump_prepare_client_request(domBDF, queue, flags,
> - DISABLE, NULL, NULL, NULL);
> - else
> - ret = pdump_prepare_client_request(device_id, queue,
> flags,
> + ret = pdump_prepare_client_request(device_id, queue, flags,
>   DISABLE, NULL, NULL, NULL);
> 
>   return ret;
> --
> 2.7.4

Acked-by: Fan Zhang  


[dpdk-dev] mbuf changes

2016-10-26 Thread Morten Brørup
> From: Alejandro Lucero [mailto:alejandro.lucero at netronome.com] 
> On Tue, Oct 25, 2016 at 2:05 PM, Bruce Richardson <bruce.richardson at intel.com> wrote:
> > On Tue, Oct 25, 2016 at 05:24:28PM +0530, Shreyansh Jain wrote:
> > > On Monday 24 October 2016 09:55 PM, Bruce Richardson wrote:
> > > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith wrote:
> > > > >
> > > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <mb at smartsharesystems.com> wrote:
> > > > > >
> > > > > > First of all: Thanks for a great DPDK Userspace 2016!
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Continuing the Userspace discussion about Olivier Matz's proposed
> > > > > > mbuf changes...
> > > >
> > > > Thanks for keeping the discussion going!
> > > > > >
> > > > > >
> > > > > >
> > > > > > 1.
> > > > > >
> > > > > > Stephen Hemminger had a noteworthy general comment about keeping 
> > > > > > metadata for the NIC in the appropriate section of the mbuf: 
> > > > > > > Metadata generated by the NIC's RX handler belongs in the first
> > > > > > > cache line, and metadata required by the NIC's TX handler belongs
> > > > > > in the second cache line. This also means that touching the second 
> > > > > > cache line on ingress should be avoided if possible; and Bruce 
> > > > > > Richardson mentioned that for this reason m->next was zeroed on 
> > > > > > free().
> > > > > >
> > > > Thinking about it, I suspect there are more fields we can reset on free
> > > > to save time on alloc. Refcnt, as discussed below is one of them, but so
> > > > too could be the nb_segs field and possibly others.
> > > >
> > > > > >
> > > > > >
> > > > > > 2.
> > > > > >
> > > > > > There seemed to be consensus that the size of m->refcnt should 
> > > > > > match the size of m->port because a packet could be duplicated on 
> > > > > > all physical ports for L3 multicast and L2 flooding.
> > > > > >
> > > > > > Furthermore, although a single physical machine (i.e. a single 
> > > > > > > server) with 255 physical ports probably doesn't exist, it might
> > > > > > contain more than 255 virtual machines with a virtual port each, so 
> > > > > > it makes sense extending these mbuf fields from 8 to 16 bits.
> > > > >
> > > > > I thought we also talked about removing the m->port from the mbuf as 
> > > > > it is not really needed.
> > > > >
> > > > Yes, this was mentioned, and also the option of moving the port value to
> > > > the second cacheline, but it appears that NXP are using the port value
> > > > in their NIC drivers for passing in metadata, so we'd need their
> > > > agreement on any move (or removal).
> > >
> > > I am not sure where NXP's NIC came into picture on this, but now that it 
> > > is
> > > highlighted, this field is required for libevent implementation [1].
> > >
> > > A scheduler sending an event, which can be a packet, would only have
> > > information of a flow_id. From this matching it back to a port, without
> > > mbuf->port, would be very difficult (costly). There may be way around this
> > > but at least in current proposal I think port would be important to have -
> > > even if in second cache line.
> > >
> > > But, off the top of my head, as of now it is not being used for any 
> > > specific
> > > purpose in NXP's PMD implementation.
> > >
> > > Even the SoC patches don't necessarily rely on it except using it because 
> > > it
> > > is available.
> > >
> > > @Bruce: where did you get the NXP context here from?
> > >
> > Oh, I'm just mis-remembering. :-( It was someone else who was looking for
> > this - Netronome, perhaps?
> > 
> > CC'ing Alejandro in the hope I'm remembering correctly second time
> > round!
> > 
> Yes. Thanks Bruce!
> 
> So Netronome uses the port field and, as I commented on the user meeting, we 
> are happy with the field going from 8 to 16 bits.
> 
> In our case, this is something some clients have demanded, and if I'm not 
> wrong (I'll double check this asap), the port value is for knowing where the 
> packet is coming from. Think about a switch in the NIC, with ports linked to 
> VFs/VMs, and one or more physical ports. That port value is not related to 
> DPDK ports but to the switch ports. Code in the host (DPDK or not) can 
> receive packets from the wire or from VFs through the NIC. This is also true 
> for packets received by VMs, but I guess the port value is just interested 
> for host code.

Come to think of it: About ten years ago I was directly involved in the design 
of a silicon architecture for a stackable switch, and the packets on the 
backplane used a proprietary stack header which included a port number. Among 
other purposes, it allowed the management CPU to inject packets addressed to 
any physical port in the stack; and this could easily be more than 255 ports. I 
have seen similar proprietary headers in chips used for consumer wifi routers, 
allowing a single port on the CPU chip to communicate individually with the 4 
LAN ports through the 5 port switch chip on the PCB. These are other 

[dpdk-dev] [PATCH v7 3/7] vhost: simplify mergeable Rx vring reservation

2016-10-26 Thread Yuanhan Liu
On Wed, Oct 26, 2016 at 12:08:49AM +0200, Thomas Monjalon wrote:
> 2016-10-14 17:34, Yuanhan Liu:
> > -static inline uint32_t __attribute__((always_inline))
> > +static inline int __attribute__((always_inline))
> >  copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue 
> > *vq,
> > -   uint16_t end_idx, struct rte_mbuf *m,
> > -   struct buf_vector *buf_vec)
> > +   struct rte_mbuf *m, struct buf_vector *buf_vec,
> > +   uint16_t num_buffers)
> >  {
> > struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> > uint32_t vec_idx = 0;
> > -   uint16_t start_idx = vq->last_used_idx;
> > -   uint16_t cur_idx = start_idx;
> > +   uint16_t cur_idx = vq->last_used_idx;
> > uint64_t desc_addr;
> > uint32_t desc_chain_head;
> > uint32_t desc_chain_len;
> > @@ -394,21 +393,21 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
> > struct vhost_virtqueue *vq,
> > struct rte_mbuf *hdr_mbuf;
> >  
> > if (unlikely(m == NULL))
> > -   return 0;
> > +   return -1;
> >  
> > LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
> > dev->vid, cur_idx, end_idx);
> 
> There is a build error:
>   lib/librte_vhost/virtio_net.c:399:22: error: 'end_idx' undeclared

Oops... you know, my robot has been broken since the holiday :(
I just applied a quick fix. Hopefully, it will start working again...

> It is probably trivial and could be fixed directly in the already applied
> commit in next-virtio.

Yes, and FYI, here is the overall diff I made to fix this bug.

--yliu

---
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index b784dba..eed0b1c 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -443,9 +443,6 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev,
struct rte_mbuf *m,
if (unlikely(m == NULL))
return -1;

-   LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
-   dev->vid, cur_idx, end_idx);
-
desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
return -1;
@@ -555,6 +552,10 @@ virtio_dev_merge_rx(struct virtio_net *dev,
uint16_t queue_id,
break;
}

+   LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
+   dev->vid, vq->last_avail_idx,
+   vq->last_avail_idx + num_buffers);
+
if (copy_mbuf_to_desc_mergeable(dev, pkts[pkt_idx],
buf_vec, num_buffers) < 0) {
vq->shadow_used_idx -= num_buffers;


[dpdk-dev] [PATCH] net/vmxnet3: fix mbuf release on reset/stop

2016-10-26 Thread Yong Wang
During device reset/stop, vmxnet3 releases all mbufs in tx and
rx cmd ring.  For rx, we should go over all ring descriptors and
free using rte_pktmbuf_free_seg() instead of rte_pktmbuf_free()
as the metadata of the mbuf might not be properly initialized
(initialization after mempool creation is done in the rx routine)
and the mbuf should always be a single-segment one when populated.
For tx, we can use the existing way as mbuf, if any, will be a
valid one stashed in the eop.

Fixes: dfaff37fc46d ("vmxnet3: import new vmxnet3 poll mode driver 
implementation")
Signed-off-by: Yong Wang 
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 34 +-
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c 
b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 31f396c..b109168 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -140,10 +140,10 @@ vmxnet3_txq_dump(struct vmxnet3_tx_queue *txq)
 #endif

 static void
-vmxnet3_cmd_ring_release_mbufs(vmxnet3_cmd_ring_t *ring)
+vmxnet3_tx_cmd_ring_release_mbufs(vmxnet3_cmd_ring_t *ring)
 {
while (ring->next2comp != ring->next2fill) {
-   /* No need to worry about tx desc ownership, device is quiesced by now. */
+   /* No need to worry about desc ownership, device is quiesced by now. */
vmxnet3_buf_info_t *buf_info = ring->buf_info + ring->next2comp;

if (buf_info->m) {
@@ -157,9 +157,27 @@ vmxnet3_cmd_ring_release_mbufs(vmxnet3_cmd_ring_t *ring)
 }

 static void
+vmxnet3_rx_cmd_ring_release_mbufs(vmxnet3_cmd_ring_t *ring)
+{
+   uint32_t i;
+
+   for (i = 0; i < ring->size; i++) {
+   /* No need to worry about desc ownership, device is quiesced by now. */
+   vmxnet3_buf_info_t *buf_info = &ring->buf_info[i];
+
+   if (buf_info->m) {
+   rte_pktmbuf_free_seg(buf_info->m);
+   buf_info->m = NULL;
+   buf_info->bufPA = 0;
+   buf_info->len = 0;
+   }
+   vmxnet3_cmd_ring_adv_next2comp(ring);
+   }
+}
+
+static void
 vmxnet3_cmd_ring_release(vmxnet3_cmd_ring_t *ring)
 {
-   vmxnet3_cmd_ring_release_mbufs(ring);
rte_free(ring->buf_info);
ring->buf_info = NULL;
 }
@@ -170,6 +188,8 @@ vmxnet3_dev_tx_queue_release(void *txq)
vmxnet3_tx_queue_t *tq = txq;

if (tq != NULL) {
+   /* Release mbufs */
+   vmxnet3_tx_cmd_ring_release_mbufs(&tq->cmd_ring);
/* Release the cmd_ring */
vmxnet3_cmd_ring_release(&tq->cmd_ring);
}
@@ -182,6 +202,10 @@ vmxnet3_dev_rx_queue_release(void *rxq)
vmxnet3_rx_queue_t *rq = rxq;

if (rq != NULL) {
+   /* Release mbufs */
+   for (i = 0; i < VMXNET3_RX_CMDRING_SIZE; i++)
+   vmxnet3_rx_cmd_ring_release_mbufs(&rq->cmd_ring[i]);
+
/* Release both the cmd_rings */
for (i = 0; i < VMXNET3_RX_CMDRING_SIZE; i++)
vmxnet3_cmd_ring_release(&rq->cmd_ring[i]);
@@ -199,7 +223,7 @@ vmxnet3_dev_tx_queue_reset(void *txq)

if (tq != NULL) {
/* Release the cmd_ring mbufs */
-   vmxnet3_cmd_ring_release_mbufs(&tq->cmd_ring);
+   vmxnet3_tx_cmd_ring_release_mbufs(&tq->cmd_ring);
}

/* Tx vmxnet rings structure initialization*/
@@ -228,7 +252,7 @@ vmxnet3_dev_rx_queue_reset(void *rxq)
if (rq != NULL) {
/* Release both the cmd_rings mbufs */
for (i = 0; i < VMXNET3_RX_CMDRING_SIZE; i++)
-   vmxnet3_cmd_ring_release_mbufs(&rq->cmd_ring[i]);
+   vmxnet3_rx_cmd_ring_release_mbufs(&rq->cmd_ring[i]);
}

ring0 = &rq->cmd_ring[0];
-- 
1.9.1



[dpdk-dev] mbuf changes

2016-10-26 Thread Alejandro Lucero
On Tue, Oct 25, 2016 at 2:05 PM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Oct 25, 2016 at 05:24:28PM +0530, Shreyansh Jain wrote:
> > On Monday 24 October 2016 09:55 PM, Bruce Richardson wrote:
> > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith wrote:
> > > >
> > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <
> mb at smartsharesystems.com> wrote:
> > > > >
> > > > > First of all: Thanks for a great DPDK Userspace 2016!
> > > > >
> > > > >
> > > > >
> > > > > Continuing the Userspace discussion about Olivier Matz's proposed
> mbuf changes...
> > >
> > > Thanks for keeping the discussion going!
> > > > >
> > > > >
> > > > >
> > > > > 1.
> > > > >
> > > > > Stephen Hemminger had a noteworthy general comment about keeping
> metadata for the NIC in the appropriate section of the mbuf: Metadata
> generated by the NIC's RX handler belongs in the first cache line, and
> metadata required by the NIC's TX handler belongs in the second cache line.
> This also means that touching the second cache line on ingress should be
> avoided if possible; and Bruce Richardson mentioned that for this reason
> m->next was zeroed on free().
> > > > >
> > > Thinking about it, I suspect there are more fields we can reset on free
> > > to save time on alloc. Refcnt, as discussed below is one of them, but
> so
> > > too could be the nb_segs field and possibly others.
> > >
> > > > >
> > > > >
> > > > > 2.
> > > > >
> > > > > There seemed to be consensus that the size of m->refcnt should
> match the size of m->port because a packet could be duplicated on all
> physical ports for L3 multicast and L2 flooding.
> > > > >
> > > > > Furthermore, although a single physical machine (i.e. a single
> server) with 255 physical ports probably doesn't exist, it might contain
> more than 255 virtual machines with a virtual port each, so it makes sense
> extending these mbuf fields from 8 to 16 bits.
> > > >
> > > > I thought we also talked about removing the m->port from the mbuf as
> it is not really needed.
> > > >
> > > Yes, this was mentioned, and also the option of moving the port value
> to
> > > the second cacheline, but it appears that NXP are using the port value
> > > in their NIC drivers for passing in metadata, so we'd need their
> > > agreement on any move (or removal).
> >
> > I am not sure where NXP's NIC came into picture on this, but now that it
> is
> > highlighted, this field is required for libevent implementation [1].
> >
> > A scheduler sending an event, which can be a packet, would only have
> > information of a flow_id. From this matching it back to a port, without
> > mbuf->port, would be very difficult (costly). There may be way around
> this
> > but at least in current proposal I think port would be important to have
> -
> > even if in second cache line.
> >
> > But, off the top of my head, as of now it is not being used for any
> specific
> > purpose in NXP's PMD implementation.
> >
> > Even the SoC patches don't necessarily rely on it except using it
> because it
> > is available.
> >
> > @Bruce: where did you get the NXP context here from?
> >
> Oh, I'm just mis-remembering. :-( It was someone else who was looking for
> this - Netronome, perhaps?
>
> CC'ing Alejandro in the hope I'm remembering correctly second time
> round!
>
>
Yes. Thanks Bruce!

So Netronome uses the port field and, as I commented at the user meeting,
we are happy with the field going from 8 to 16 bits.

In our case, this is something some clients have demanded, and if I'm not
wrong (I'll double check this asap), the port value is for knowing where
the packet is coming from. Think about a switch in the NIC, with ports
linked to VFs/VMs, and one or more physical ports. That port value is not
related to DPDK ports but to the switch ports. Code in the host (DPDK or
not) can receive packets from the wire or from VFs through the NIC. This is
also true for packets received by VMs, but I guess the port value is only
of interest to host code.



> /Bruce
>


[dpdk-dev] DPDK & ASLR

2016-10-26 Thread Mcnamara, John
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jon DeVree
> Sent: Wednesday, October 26, 2016 5:20 AM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] DPDK & ASLR
> 
> On Tue, Oct 25, 2016 at 15:18:03 -0700, Samir Shah wrote:
> > Does ASLR need to be turned off system-wide, or DPDK-processes wide?
> > Could we use setarch/personality to disable ASLR for just the DPDK
> > process and leave it enabled for the rest of the system? Any
> > experience to say if that would work or not?
> >
> 
> I'm using setarch/personality to disable it only in the processes using
> dpdk without any trouble.

Hi Jon,

That is interesting. Do you have more details on how to do this?

Also, it might be worth adding this to the DPDK documentation:

http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html#multi-process-limitations

John
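
For readers following up on the question above, here is a minimal sketch of
the per-process approach. It assumes Linux and glibc's personality() from
<sys/personality.h>; the helper names are hypothetical and nothing here is
taken from DPDK. The idea is to set ADDR_NO_RANDOMIZE and then re-exec, which
is roughly what "setarch $(uname -m) -R <command>" does from the shell.

```c
#include <assert.h>
#include <sys/personality.h>
#include <unistd.h>

/* Return 'persona' with address-space layout randomization switched off. */
static inline unsigned long
persona_without_aslr(unsigned long persona)
{
	return persona | ADDR_NO_RANDOMIZE;
}

/*
 * Hypothetical helper: disable ASLR for this process only, then re-exec the
 * same binary so the new layout policy applies from the very start.
 * Returns 0 when ASLR is already off, -1 on failure, and does not return
 * when the re-exec succeeds.
 */
static inline int
reexec_without_aslr(char **argv)
{
	int cur;

	assert(argv != NULL);
	cur = personality(0xffffffff); /* query only, no change */
	if (cur == -1)
		return -1;
	if ((unsigned long)cur & ADDR_NO_RANDOMIZE)
		return 0; /* already disabled, nothing to do */
	if (personality(persona_without_aslr((unsigned long)cur)) == -1)
		return -1;
	execv("/proc/self/exe", argv);
	return -1; /* execv() only returns on error */
}
```

A program would call reexec_without_aslr(argv) as the first step in main(),
before any EAL initialization, and the rest of the system keeps ASLR enabled.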





[dpdk-dev] [PATCH v10 11/25] eal/pci: helpers for device name parsing/update

2016-10-26 Thread Pattan, Reshma
Hi,


> -----Original Message-----
> From: Shreyansh Jain [mailto:shreyansh.jain at nxp.com]
> Sent: Wednesday, October 26, 2016 7:23 AM
> To: Pattan, Reshma 
> Cc: dev at dpdk.org; viktorin at rehivetech.com; David Marchand
> ; hemant.agrawal at nxp.com; Thomas Monjalon
> 
> Subject: Re: [PATCH v10 11/25] eal/pci: helpers for device name parsing/update
> 
> Hello Reshma,
> 
> On Tuesday 25 October 2016 09:19 PM, Pattan, Reshma wrote:
> > Hi Shreyansh,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shreyansh Jain
> >> Sent: Friday, September 16, 2016 5:30 AM
> >> To: dev at dpdk.org
> >> Cc: viktorin at rehivetech.com; David Marchand
> >> ; hemant.agrawal at nxp.com; Thomas
> Monjalon
> >> ; Shreyansh Jain 
> >> Subject: [dpdk-dev] [PATCH v10 11/25] eal/pci: helpers for device
> >> name parsing/update
> >>
> >> From: David Marchand 
> >>
> >> - Move rte_eth_dev_create_unique_device_name() from ether/rte_ethdev.c
> to
> >>   common/include/rte_pci.h as rte_eal_pci_device_name(). Being a common
> >>   method, can be used across crypto/net PCI PMDs.
> >> - Remove crypto specific routine and fallback to common name function.
> >> - Introduce a eal private Update function for PCI device naming.
> >>
> >> Signed-off-by: David Marchand 
> >> [Shreyansh: Merge crypto/pci helper patches]
> >> Signed-off-by: Shreyansh Jain 
> >> ---
> >>  lib/librte_cryptodev/rte_cryptodev.c| 27 +++---
> >>  lib/librte_eal/bsdapp/eal/eal_pci.c | 49
> >> +
> >>  lib/librte_eal/common/eal_private.h | 13 +
> >>  lib/librte_eal/common/include/rte_pci.h | 24 
> >>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 13 +
> >>  lib/librte_ether/rte_ethdev.c   | 24 +++-
> >>  6 files changed, 107 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/lib/librte_cryptodev/rte_cryptodev.c
> >> b/lib/librte_cryptodev/rte_cryptodev.c
> >> index 2a3b649..c81e366 100644
> >> --- a/lib/librte_cryptodev/rte_cryptodev.c
> >> +++ b/lib/librte_cryptodev/rte_cryptodev.c
> >> @@ -365,23 +365,6 @@ rte_cryptodev_pmd_allocate(const char *name, int
> >> socket_id)
> >>return cryptodev;
> >>  }
> >>
> >>   *
> >>   * This function is private to EAL.
> >> diff --git a/lib/librte_eal/common/include/rte_pci.h
> >> b/lib/librte_eal/common/include/rte_pci.h
> >> index cf81898..e1f695f 100644
> >> --- a/lib/librte_eal/common/include/rte_pci.h
> >> +++ b/lib/librte_eal/common/include/rte_pci.h
> >> @@ -82,6 +82,7 @@ extern "C" {
> >>  /** Formatting string for PCI device identifier: Ex: 0000:00:01.0 */
> >> #define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
> >> +#define PCI_PRI_STR_SIZE sizeof("XXXX:XX:XX.X")
> >>
> >>  /** Short formatting string, without domain, for PCI device: Ex:
> >> 00:01.0 */ #define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%"
> >> PRIx8 @@ -308,6
> >>
> >> +static inline void
> >> +rte_eal_pci_device_name(const struct rte_pci_addr *addr,
> >> +  char *output, size_t size)
> >> +{
> >> +  RTE_VERIFY(size >= PCI_PRI_STR_SIZE);
> >> +  RTE_VERIFY(snprintf(output, size, PCI_PRI_FMT,
> >> +  addr->domain, addr->bus,
> >> +  addr->devid, addr->function) >= 0); }
> >> +
> >>
> >> +int
> >> +pci_update_device(const struct rte_pci_addr *addr) {
> >> +  char filename[PATH_MAX];
> >> +
> >> +  snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT,
> >> +   pci_get_sysfs_path(), addr->domain, addr->bus, addr->devid,
> >> +   addr->function);
> >> +
> >> +  return pci_scan_one(filename, addr->domain, addr->bus, addr->devid,
> >> +  addr->function);
> >> +}
> >> +
> >
> >
> > Earlier, device names were created in the format "bus:deviceid.function" as
> > per the ethdev API below. Now, after the new EAL API above, the name format
> > is "domain:bus:deviceid.func". Was that intentional, and if so, why?
> 
> Yes, this is intentional.
> It is to bring the naming in sync with the device name being used for
> scanning on the bus (/sys/bus/pci/devices/AAAA:BB:CC.D/).

Fair enough and thanks for clarification.

Thanks,
Reshma
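
The two name formats discussed above can be sketched in plain C as follows.
This is a minimal illustration only: struct demo_pci_addr and the demo_*
helpers are hypothetical stand-ins, not the EAL definitions, and the format
string simply mirrors the PCI_PRI_FMT shown in the quoted patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_pci_addr, for illustration only. */
struct demo_pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Same layout as the PCI_PRI_FMT quoted above: "domain:bus:devid.function". */
#define DEMO_PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
#define DEMO_PCI_PRI_STR_SIZE sizeof("XXXX:XX:XX.X")

/* Write the full name into 'out'; returns 0 on success, -1 if it cannot fit. */
static inline int
demo_pci_name(const struct demo_pci_addr *addr, char *out, size_t size)
{
	int n;

	assert(addr != NULL && out != NULL);
	if (size < DEMO_PCI_PRI_STR_SIZE)
		return -1;
	n = snprintf(out, size, DEMO_PCI_PRI_FMT,
		     addr->domain, addr->bus, addr->devid, addr->function);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Legacy "bus:devid.function" view: skip everything up to the first colon. */
static inline const char *
demo_pci_short_view(const char *full)
{
	const char *colon = strchr(full, ':');

	return colon != NULL ? colon + 1 : full;
}
```

Under these assumptions, a device at domain 0, bus 0, device 1, function 0 is
named "0000:00:01.0" in the new scheme, where the older scheme would have
produced "00:01.0".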


[dpdk-dev] [PATCH] examples/ipsec-secgw: Update checksum while decrementing ttl

2016-10-26 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: Akhil Goyal [mailto:akhil.goyal at nxp.com]
> Sent: Wednesday, October 19, 2016 1:38 AM
> To: De Lara Guarch, Pablo; Gonzalez Monroy, Sergio; dev at dpdk.org
> Subject: RE: [PATCH] examples/ipsec-secgw: Update checksum while
> decrementing ttl
> 
> 
> 
> -----Original Message-----
> From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch at intel.com]
> Sent: Monday, October 17, 2016 10:35 PM
> To: Gonzalez Monroy, Sergio ; Akhil
> Goyal ; dev at dpdk.org
> Subject: RE: [PATCH] examples/ipsec-secgw: Update checksum while
> decrementing ttl
> 
> 
> 
> > -----Original Message-----
> > From: Gonzalez Monroy, Sergio
> > Sent: Monday, October 10, 2016 5:05 AM
> > To: De Lara Guarch, Pablo; Akhil Goyal; dev at dpdk.org
> > Subject: Re: [PATCH] examples/ipsec-secgw: Update checksum while
> > decrementing ttl
> >
> > On 07/10/2016 21:53, De Lara Guarch, Pablo wrote:
> > >> -----Original Message-----
> > >> From: Akhil Goyal [mailto:akhil.goyal at nxp.com]
> > >> Sent: Tuesday, October 04, 2016 11:33 PM
> > >> To: De Lara Guarch, Pablo; Gonzalez Monroy, Sergio; dev at dpdk.org
> > >> Subject: Re: [PATCH] examples/ipsec-secgw: Update checksum while
> > >> decrementing ttl
> > >>
> > >> On 10/5/2016 6:04 AM, De Lara Guarch, Pablo wrote:
> > >>>
> >  -----Original Message-----
> >  From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio
> > Gonzalez
> >  Monroy
> >  Sent: Monday, September 26, 2016 6:28 AM
> >  To: akhil.goyal at nxp.com; dev at dpdk.org
> >  Subject: Re: [dpdk-dev] [PATCH] examples/ipsec-secgw: Update
> > checksum
> >  while decrementing ttl
> > 
> >  Hi Akhil,
> > 
> >  This application relies on checksum offload in both outbound and
> > >> inbound
> >  paths (PKT_TX_IP_CKSUM flag).
> > >> [Akhil]Agreed that the application relies on checksum offload, but
> > >> here we are talking about the inner ip header. Inner IP checksum
> > >> will be updated on the next end point after decryption. This would
> > >> expect that the next end point must have checksum offload
> > >> capability. What if we are capturing the encrypted packets on
> > >> wireshark or say send it to some other machine which does not run
> > >> DPDK and do not know about
> > checksum
> > >> offload, then wireshark/other machine will not be able to get the
> > >> correct the checksum and will show error.
> >
> > Understood, we need to have a valid inner checksum.
> > RFC1624 states that the computation would be incorrect in
> > corner/boundary case.
> > I reckon you are basing your incremental update on RFC1141?
> >
> > Also I think you should take care of endianness and increment the
> > checksum with
> > host_to_be(0x0100) instead of +1.
> >
> >  Because we assume that we always forward the packet in both
> >  paths,
> > we
> >  decrement the ttl in both inbound and outbound.
> >  You seem to only increment (recalculate) the checksum of the
> >  inner IP header in the outbound path but not the inbound path.
> > >> [Akhil]Correct I missed out the inbound path.
> >  Also, in the inbound path you have to consider a possible ECN
> >  value
> > >> update.
> > >> [Akhil]If I take care of the ECN then it would mean I need to
> > >> calculate the checksum completely, incremental checksum wont give
> correct results.
> > >> This would surely impact performance. Any suggestion on how should
> > >> we take care of ECN update. Should I recalculate the checksum and
> > >> send the patch for ECN update? Or do we have a better solution.
> >
> > If I am understanding the RFCs mentioned above correctly, you should
> > be able to do incremental checksum update for any 16bit field/value of
> > the IP header.
> > I don't see any reason why you couldn't do something like that, except
> > that you would have to follow the full equation instead of just adding
> > 0x0100, which would be always the case when decrementing TTL.
> >
> > What do you think?
> 
> Any comments, Akhil?
> 
> Ok.. will send next version soon.

Hi Akhil,
Are you sending that version soon? It won't make it into RC2, but it may be
merged for RC3.

Thanks,
Pablo
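
For reference, the incremental update discussed above can be sketched as
follows. This is only an illustration of RFC 1624 eqn. 3,
HC' = ~(~HC + ~m + m'), applied to the 16-bit word holding TTL and protocol;
it is not the patch under review. The function names are made up for this
sketch, and the header is treated as a raw 20-byte IPv4 header without
options. Words are loaded in memory order, so the result does not depend on
host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 32-bit accumulator into a 16-bit ones'-complement sum. */
static uint16_t csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full checksum over len bytes (len must be even); used as a reference. */
static uint16_t csum_full(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	uint16_t word;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		memcpy(&word, buf + i, sizeof(word));
		sum += word;
	}
	return (uint16_t)~csum_fold(sum);
}

/* RFC 1624 eqn. 3: new checksum after one 16-bit word changes. */
static uint16_t csum_update16(uint16_t hc, uint16_t old_word, uint16_t new_word)
{
	uint32_t sum = (uint16_t)~hc;

	sum += (uint16_t)~old_word;
	sum += new_word;
	return (uint16_t)~csum_fold(sum);
}

/* Decrement TTL (byte 8) and patch the header checksum (bytes 10-11). */
static void ipv4_dec_ttl(uint8_t hdr[20])
{
	uint16_t old_word, new_word, hc;

	memcpy(&old_word, hdr + 8, sizeof(old_word));
	memcpy(&hc, hdr + 10, sizeof(hc));
	hdr[8]--;
	memcpy(&new_word, hdr + 8, sizeof(new_word));
	hc = csum_update16(hc, old_word, new_word);
	memcpy(hdr + 10, &hc, sizeof(hc));
}
```

The same csum_update16() helper would apply to any other 16-bit word of the
header, for example the one carrying the ECN bits, which is the point Sergio
makes about following the full equation rather than adding a constant.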



[dpdk-dev] DPDK & ASLR

2016-10-26 Thread Jon DeVree
On Tue, Oct 25, 2016 at 15:18:03 -0700, Samir Shah wrote:
> Does ASLR need to be turned off system-wide, or DPDK-processes wide? Could
> we use setarch/personality to disable ASLR for just the DPDK process and
> leave it enabled for the rest of the system? Any experience to say if that
> would work or not?
> 

I'm using setarch/personality to disable it only in the processes using
dpdk without any trouble.

-- 
Jon
Doge Wrangler
X(7): A program for managing terminal windows. See also screen(1) and tmux(1).
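
A minimal sketch of the per-process approach described above, assuming Linux
and glibc: set ADDR_NO_RANDOMIZE with personality(2) and then exec the DPDK
application, which is roughly what "setarch -R" does. The helper names are
invented for this sketch, and the call may be refused in sandboxes that
filter personality().

```c
#include <sys/personality.h>
#include <unistd.h>

/* Pure helper: the persona flags with address randomization disabled. */
static unsigned long persona_no_aslr(unsigned long current)
{
	return current | ADDR_NO_RANDOMIZE;
}

/*
 * Disable ASLR for this process and exec the target program.
 * The persona is inherited across execve(), so only the launched
 * program (and its children) run without randomization.
 * Returns -1 on failure; does not return on success.
 */
static int exec_without_aslr(char *const argv[])
{
	/* 0xffffffff queries the current persona without changing it. */
	int current = personality(0xffffffff);

	if (current == -1)
		return -1;
	if (personality(persona_no_aslr((unsigned long)current)) == -1)
		return -1;
	execvp(argv[0], argv);
	return -1; /* only reached if execvp() failed */
}
```

The rest of the system keeps its normal randomize_va_space setting; only
processes started through such a wrapper are affected.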


[dpdk-dev] [PATCH v7 3/7] vhost: simplify mergeable Rx vring reservation

2016-10-26 Thread Thomas Monjalon
2016-10-14 17:34, Yuanhan Liu:
> -static inline uint32_t __attribute__((always_inline))
> +static inline int __attribute__((always_inline))
>  copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
> - uint16_t end_idx, struct rte_mbuf *m,
> - struct buf_vector *buf_vec)
> + struct rte_mbuf *m, struct buf_vector *buf_vec,
> + uint16_t num_buffers)
>  {
>   struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>   uint32_t vec_idx = 0;
> - uint16_t start_idx = vq->last_used_idx;
> - uint16_t cur_idx = start_idx;
> + uint16_t cur_idx = vq->last_used_idx;
>   uint64_t desc_addr;
>   uint32_t desc_chain_head;
>   uint32_t desc_chain_len;
> @@ -394,21 +393,21 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
>   struct rte_mbuf *hdr_mbuf;
>  
>   if (unlikely(m == NULL))
> - return 0;
> + return -1;
>  
>   LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
>   dev->vid, cur_idx, end_idx);

There is a build error:
lib/librte_vhost/virtio_net.c:399:22: error: 'end_idx' undeclared

It is probably trivial and could be fixed directly in the already applied
commit in next-virtio.


[dpdk-dev] [PATCH v2] lib/ether: prevent duplicate callback on list

2016-10-26 Thread Thomas Monjalon
2016-10-20 09:34, E. Scott Daniels:
> This change prevents the attempt to add a structure which is
> already on the callback list. If a struct with matching
> parameters is found on the list, then no action is taken. If
> a struct with matching parameters is found on the list, then
> no action is taken.

Callback is not duplicate anymore but the last sentence is duplicate :)

> Fixes: ac2f69c ("ethdev: fix crash if malloc of user callback fails")
> 
> Signed-off-by: E. Scott Daniels 
> ---
> 
> V2:
> * Correct the component name; changed from net/ixgbe.
> * Add Fixes tag.
> * Acked-by: Wenzhuo Lu 

The Acked-by tag must be added below your Signed-off-by.

Applied with above nits fixed, thanks
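
The idea of the fix can be sketched as below: walk the list first and return
early when an entry with the same parameters already exists. This is an
illustration only, not the ethdev code; the struct, the function names and
the use of a plain TAILQ without locking are assumptions made for this
sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/queue.h>

typedef void (*cb_fn)(int event, void *arg);

struct cb_entry {
	TAILQ_ENTRY(cb_entry) next;
	cb_fn fn;
	void *arg;
	int event;
};

TAILQ_HEAD(cb_list, cb_entry);

/* Example callback used only to exercise the list. */
static void sample_cb(int event, void *arg)
{
	(void)event;
	(void)arg;
}

/* Register a callback; a duplicate (same fn, arg, event) is a no-op. */
static int cb_register(struct cb_list *list, int event, cb_fn fn, void *arg)
{
	struct cb_entry *e;

	TAILQ_FOREACH(e, list, next) {
		if (e->fn == fn && e->arg == arg && e->event == event)
			return 0; /* already on the list, nothing to do */
	}

	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;
	e->fn = fn;
	e->arg = arg;
	e->event = event;
	TAILQ_INSERT_TAIL(list, e, next);
	return 0;
}

/* Count entries, to show that a second registration adds nothing. */
static size_t cb_count(struct cb_list *list)
{
	struct cb_entry *e;
	size_t n = 0;

	TAILQ_FOREACH(e, list, next)
		n++;
	return n;
}
```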


[dpdk-dev] [dpdk-stable] [PATCH v2] mempool: fix search of maximum contiguous pages

2016-10-26 Thread Thomas Monjalon
2016-10-25 17:01, Olivier Matz:
> From: Wei Dai 
> 
> paddr[i] + pg_sz always points to the start physical address of the
> 2nd page after paddr[i], so only up to 2 pages can be combined to
> be used. With this revision, more than 2 pages can be used.
> 
> Fixes: 84121f197187 ("mempool: store memory chunks in a list")
> 
> Signed-off-by: Wei Dai 
> Signed-off-by: Olivier Matz 

Applied, thanks
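
The bug described in the commit message can be illustrated with a small
standalone sketch: each candidate page has to be compared with the page just
before it, not with the first page of the run. This is not the mempool code;
the function name and the use of uint64_t for physical addresses are
assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many physically contiguous pages start at index 'start'.
 * Comparing paddr[start] + pg_sz with paddr[start + n] (the old
 * behaviour described above) can only ever match for n == 1, so a
 * run would never grow beyond 2 pages.
 */
static size_t contig_pages(const uint64_t *paddr, size_t pg_num,
			   size_t start, uint64_t pg_sz)
{
	size_t n;

	if (start >= pg_num)
		return 0;

	for (n = 1; start + n < pg_num; n++) {
		/* compare against the previous page in the run */
		if (paddr[start + n - 1] + pg_sz != paddr[start + n])
			break;
	}
	return n;
}
```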


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Jerin Jacob
On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> Thanks to Intel and NXP folks for the positive and constructive feedback
> I've received so far. Here is the updated RFC(v2).
> 
> I've attempted to address as many comments as possible.
> 
> This series adds rte_eventdev.h to the DPDK tree with
> adequate documentation in doxygen format.
> 
> Updates are also available online:
> 
> Related draft header file (this patch):
> https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> 
> PDF version (doxygen output):
> https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> 
> Repo:
> https://github.com/jerinjacobk/libeventdev
>

Hi Community,

So far, I have received constructive feedback from Intel, NXP and Linaro folks.
Let me know if anyone else is interested in contributing to the definition of
eventdev.

If there are no major issues in the proposed spec, then Cavium would like to
work on implementing and upstreaming the common code (lib/librte_eventdev/)
and an associated HW driver. (The minor changes requested for v2 will be
addressed in the next version.)

We are planning to submit the work for the 17.02 or 17.05 release (based on
how the implementation goes).

/Jerin
Cavium


[dpdk-dev] [PATCH] testpmd: fix fdir command on MAC and tunnel modes

2016-10-26 Thread Thomas Monjalon
2016-09-27 11:01, Frederico Cadete:
> On Tue, Sep 27, 2016 at 4:42 AM, Wu, Jingjing wrote:
> > From: Frederico.Cadete-
> >> The flow_director_filter command has a pf|vf option for most modes
> >> except for MAC-VLAN and tunnel. On Intel NICs these modes are not
> >> supported under virtualized environments.
> >> But the application was checking that this field was parsed for these 
> >> cases,
> >> even though this token is not registered with the cmdline parser.
> >>
> >> This patch skips checking of this field for the commands that don't accept 
> >> it.
> >>
> >> Signed-off-by: Frederico Cadete 
[...]
> >
> > Thanks for the patch.
> 
> And thanks a lot for the review.
> 
> > But with this change the pf_vf field cannot be omitted either.
> > I think it still looks confusing because it will allow any meaningless
> > string.
> 
> Sorry, I am not aware that it can be omitted.
> For MAC/VLAN and tunnel mode it does not and will not allow any
> meaningless string.
> At least that was my intention :)
> 
> The cmdline parser expects "... flexbytes (flexbytes_value) (drop|fwd)
> queue ..." .
> This is what is documented [1] and the command's cmdline_parse_inst_t
> [2] matches this.
> If you put something in-between "(drop|fwd)" and "queue" it is
> rejected by the parser
> in librte_cmdline.
> 
> > In MAC_VLAN or TUNNEL mode, why not just use pf.
> 
> With the current code, because if you write that in the command, it is
> rejected by the parser :)
> 
> Do you mean it would be preferable to make these commands always take
> such an argument,
> and only at the NIC driver check that it must equal PF for MAC_VLAN or
> TUNNEL mode?
> The command becomes a bit more complicated for the current Intel
> NICs, but as I understand
> it currently does not work anyway. Unless I'm missing something else.
> 
> >
> > Maybe optional field support in the DPDK cmdline library is exactly what
> > we are waiting for :)
> 
> Laudable goal! My excuses but it's beyond my current skills and bandwidth :/

Thanks Frederico.
Your approach has been re-submitted and fixed by Wenzhuo:
http://dpdk.org/patch/16679
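
The approach discussed in this thread can be sketched roughly as follows:
validate the pf|vf token only for the modes whose command syntax contains it.
This is a hypothetical illustration, not the testpmd code; the enum, the
function names and the accepted token forms ("pf" or a string starting with
"vf") are assumptions made for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the flow director modes mentioned above. */
enum fdir_mode_sketch {
	MODE_PERFECT,
	MODE_SIGNATURE,
	MODE_MAC_VLAN,
	MODE_TUNNEL
};

/* MAC-VLAN and tunnel commands carry no pf|vf token. */
static int mode_takes_pf_vf(enum fdir_mode_sketch mode)
{
	return mode != MODE_MAC_VLAN && mode != MODE_TUNNEL;
}

/* Return 0 if the pf|vf token is acceptable for this mode, -1 otherwise. */
static int check_pf_vf(enum fdir_mode_sketch mode, const char *pf_vf)
{
	if (!mode_takes_pf_vf(mode))
		return 0; /* token is never parsed here, so do not inspect it */
	if (pf_vf == NULL)
		return -1;
	if (strcmp(pf_vf, "pf") == 0)
		return 0;
	if (strncmp(pf_vf, "vf", 2) == 0)
		return 0;
	return -1;
}
```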


[dpdk-dev] [PATCH v2] app/testpmd: fix PF/VF check of flow director

2016-10-26 Thread Thomas Monjalon
2016-10-19 09:12, Wenzhuo Lu:
> Parameters pf & vf are added into most of flow director
> filter CLIs.
> But mac-vlan and tunnel filters don't have these parameters,
> the parameters should not be checked for mac-vlan and tunnel
> filters.
> 
> Fixes: e6a68c013353 ("app/testpmd: extend commands for flow director in VF")
> 
> Signed-off-by: Wenzhuo Lu 
> Acked-by: Pablo de Lara 

This bug was reported and fixed by Frederico Cadete:
http://dpdk.org/patch/15264
We have waited long to have a review saying it requires
an optional parameter in the command line.
And finally you re-post a fixed version of the same approach
without any comment to the original thread or a reference here.
Please be more careful with occasional contributors.

Applied with
Reported-by: Frederico Cadete