[dpdk-dev] Unsupported SFP+ Module - Hardware Initialization Failure -19

2016-08-30 Thread Ajit Khaparde
On Tue, Aug 30, 2016 at 11:02 AM, Alex Forster  wrote:

> Hi guys,
>
> I have a problem again that I had about a year ago[1]. My Finisar
> FTL410QE2C (multimode QSFP) transceivers aren?t working with my Intel
> X520-QDA1?s. I have 7 servers, each with two X520?s and two QSFP
> transceivers in them, totaling 14 NICs and transceivers, and all of them
> fail with the same error: Unsupported SFP+ Module / Hardware Initialization
> Failure: -19.
>
> This doesn?t appear to be specifically a DPDK issue, but rather an IXGBE
> issue, since (a) the stock Debian 8 IXGBE, (b) the latest out of tree
> IXGBE, and (c) DPDK?s copy of IXGBE all fail in this same way. I?m happy to
> take this to an IXGBE mailing list if somebody can point me to one, but
> just in case anyone here can help, here?s the info I?ve gathered?
>
> * The exact function where IXGBE bails is here, annotated with the values
> read from the transceiver?s EEPROMs: https://www.googledrive.com/
> host/0B_-81lR_g8lyNDJ6QkkxS28yQzQ
> * I?ve modified DPDK?s copy of IXGBE to dump the whole EEPROM of these
> transceivers, which are here: https://gist.github.com/alexforster/
> 5da30cd89d49877d5d6908e9628baee8
>
> Here?s (I think) an important part: Last time I had this problem, I only
> had one server, and one of the two transceivers worked. I fixed the other
> transceiver in that server by replacing it with an identical model. Now,
> suddenly, neither of those transceivers (which have worked for ~11 months)
> are recognized. I don?t know how I could have done something that would
> have affected the EEPROM output, but I suspect I may have.
>
> Anybody able to offer any advice?
>
> Alex Forster
>
> [1] http://dpdk.org/ml/archives/dev/2015-October/024973.html


Adding e1000-devel at lists.sourceforge.net


[dpdk-dev] [PATCH] crypto/qat: optimisation of request copy

2016-08-30 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Griffin, John
> Sent: Thursday, August 04, 2016 9:03 AM
> To: Trahe, Fiona; dev at dpdk.org
> Cc: Doherty, Declan; De Lara Guarch, Pablo
> Subject: Re: [dpdk-dev] [PATCH] crypto/qat: optimisation of request copy
> 
> On 04/08/16 13:00, Fiona Trahe wrote:
> > From: Fiona Trahe 
> >
> > using rte_mov128 instead of structure assignment to copy
> > template request from session context into request
> >
> > Signed-off-by: Fiona Trahe 
> >
> > ---
> >   drivers/crypto/qat/qat_crypto.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> 
> Acked-by: John Griffin 

Applied to dpdk-next-crypto.
Thanks,
Pablo



[dpdk-dev] [PATCH v1] crypto/qat: make the session struct variable in size

2016-08-30 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Trahe, Fiona
> Sent: Tuesday, August 09, 2016 6:30 AM
> To: Griffin, John; dev at dpdk.org
> Cc: Jain, Deepak K; De Lara Guarch, Pablo; Trahe, Fiona
> Subject: RE: [PATCH v1] crypto/qat: make the session struct variable in size
> 
> 
> 
> -Original Message-
> From: Griffin, John
> Sent: Thursday, August 4, 2016 4:46 PM
> To: dev at dpdk.org
> Cc: Griffin, John ; Trahe, Fiona
> ; Jain, Deepak K ; De 
> Lara
> Guarch, Pablo 
> Subject: [PATCH v1] crypto/qat: make the session struct variable in size
> 
> This patch changes the qat firmware session data structure from a fixed size
> to a variable size which is dependent on the size of the chosen algorithm.
> This reduces the amount of bytes which are transferred across PCIe and thus
> helps to increase qat performance when the accelerator is bound by PCIe.
> 
> Signed-off-by: John Griffin 
> ---
> v1:
> * Fixed a compile issue with icc.
> 
> Acked-by: Fiona Trahe 

Applied to dpdk-next-crypto.
Thanks,

Pablo


[dpdk-dev] [PATCH] maintainers: claim responsability for crypto subtree

2016-08-30 Thread Pablo de Lara
>From 16.07, I will be the maintainer of the crypto subtree.

This will include:
- app/test/test_cryptodev*
- doc/guides/cryptodevs/
- drivers/crypto/
- examples/l2fwd-crypto/
- examples/ipsec-secgw/
- lib/librte_cryptodev/

Signed-off-by: Pablo de Lara 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index bc9aa02..ee5118e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -398,6 +398,7 @@ F: drivers/net/null/

 Crypto Drivers
 --
+T: Pablo de Lara 

 Intel AES-NI GCM PMD
 M: Declan Doherty 
-- 
2.7.4



[dpdk-dev] [PATCH] mk: add missing *CPPFLAGS to rte.compile-pre.mk

2016-08-30 Thread Luca Boccassi
Some targets in mk/internal/rte.compile-pre.mk are calling CC or
HOSTCC without passing CPPFLAGS, EXTRA_CPPFLAGS or HOST_CPPFLAGS,
HOST_EXTRA_CPPFLAGS.
On Debian/Ubuntu builds this means that preprocessor flags set by the
dpkg-buildpackage environment, like hardening flags, are not
correctly passed to all objects builds.

Signed-off-by: Luca Boccassi 
---
 mk/internal/rte.compile-pre.mk | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mk/internal/rte.compile-pre.mk b/mk/internal/rte.compile-pre.mk
index f740179..7a1a62a 100644
--- a/mk/internal/rte.compile-pre.mk
+++ b/mk/internal/rte.compile-pre.mk
@@ -74,18 +74,18 @@ CMDS-all := $(CMDS-y) $(CMDS-n) $(CMDS-)

 # command to compile a .c file to generate an object
 ifeq ($(USE_HOST),1)
-C_TO_O = $(HOSTCC) -Wp,-MD,$(call obj2dep,$(@)).tmp $(HOST_CFLAGS) \
-   $(CFLAGS_$(@)) $(HOST_EXTRA_CFLAGS) -o $@ -c $<
+C_TO_O = $(HOSTCC) -Wp,-MD,$(call obj2dep,$(@)).tmp $(HOST_CPPFLAGS) 
$(HOST_CFLAGS) \
+   $(CFLAGS_$(@)) $(HOST_EXTRA_CPPFLAGS) $(HOST_EXTRA_CFLAGS) -o $@ -c $<
 C_TO_O_STR = $(subst ','\'',$(C_TO_O)) #'# fix syntax highlight
 C_TO_O_DISP = $(if $(V),"$(C_TO_O_STR)","  HOSTCC $(@)")
 else
-C_TO_O = $(CC) -Wp,-MD,$(call obj2dep,$(@)).tmp $(CFLAGS) \
-   $(CFLAGS_$(@)) $(EXTRA_CFLAGS) -o $@ -c $<
+C_TO_O = $(CC) -Wp,-MD,$(call obj2dep,$(@)).tmp $(CPPFLAGS) $(CFLAGS) \
+   $(CFLAGS_$(@)) $(EXTRA_CPPFLAGS) $(EXTRA_CFLAGS) -o $@ -c $<
 C_TO_O_STR = $(subst ','\'',$(C_TO_O)) #'# fix syntax highlight
 C_TO_O_DISP = $(if $(V),"$(C_TO_O_STR)","  CC $(@)")
 endif
 PMDINFO_GEN = $(RTE_SDK_BIN)/app/dpdk-pmdinfogen $@ $@.pmd.c
-PMDINFO_CC = $(CC) $(CFLAGS) -c -o $@.pmd.o $@.pmd.c
+PMDINFO_CC = $(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@.pmd.o $@.pmd.c
 PMDINFO_LD = $(CROSS)ld $(LDFLAGS) -r -o $@.o $@.pmd.o $@
 PMDINFO_TO_O = if grep -q 'PMD_REGISTER_DRIVER(.*)' $<; then \
echo "$(if $V,$(PMDINFO_GEN),  PMDINFO $@.pmd.c)" && \
-- 
2.1.4



[dpdk-dev] [PATCH v8 01/25] eal: define macro container_of

2016-08-30 Thread Shreyansh Jain
Hi Thomas,

On Tuesday 30 August 2016 04:00 PM, Thomas Monjalon wrote:
> 2016-08-30 09:57, Shreyansh Jain:
>> On Monday 29 August 2016 10:13 PM, Ferruh Yigit wrote:
>>> This gives compilation error for mlx5, because the libraries mlx depends
>>> defines same macro:
>>> /rte_common.h:338:9: error: 'container_of' macro redefined
>>> /usr/include/infiniband/verbs.h:77:9: note: previous definition is here
>>
>> I thought testing with scripts/test-build.sh and default configuration
>> would compile all drivers - I was wrong. I will retest the patches and
>> release again.
>>
>> Is there a better way to test that no driver breaks? Any particular
>> parameters I should use for test-build.sh?
>
> Yes I suggest to create a file ~/.config/dpdk/devel.config to adapt the
> configuration to your system.
> Once you have installed the required dependencies, you can make this kind
> of configuration:

Ok.

>
> mlxdep=$root/mlx/mofed-3.3-1.0.0.0
> szedep=$root/sze/usr-1.1.4

What does '$root' here refer to?
I am assuming 'mofed-3.3-1.0.0.0' and 'usr-1.1.4' are part of some 
dependencies that I should be revolving. Is that so?
As of now I don't have much idea about this - I will have a look and 
ping back in case I am stuck.

> if echo $DPDK_TARGET | grep -q '^x86_64' ; then
> export DPDK_DEP_ARCHIVE=y
> export DPDK_DEP_ZLIB=y
> export DPDK_DEP_PCAP=y
> export DPDK_DEP_SSL=y
> export DPDK_DEP_MOFED=y
> export DPDK_DEP_SZE=y
> export DPDK_DEP_CFLAGS="-I$mlxdep/include -I$szedep/include"
> export DPDK_DEP_LDFLAGS="-L$mlxdep/lib -L$szedep/lib64 
> -rpath=$szedep/lib64"
> export AESNI_MULTI_BUFFER_LIB_PATH=$root/aesni/ipsec-043
> export LIBSSO_SNOW3G_PATH=$root/libsso/libsso-snow3g-0.3.1
> export LIBSSO_KASUMI_PATH=$root/libsso/libsso-kasumi-0.3.1
> fi

Thanks. I will try the above.

>
>> I used 'x86_64-native-linuxapp-gcc+default+debug+shared' for all patches.
>
> It is a good idea to test also with clang (x86_64-native-linuxapp-clang)
> and another arch (e.g. arm64-thunderx-linuxapp-gcc).

Before releasing v9, I will do these steps.
Thank you for suggestions.

-
Shreyansh



[dpdk-dev] [PATCH] doc/guides: add info on how to enable QAT

2016-08-30 Thread Thomas Monjalon
2016-08-30 13:57, Wiles, Keith:
> > 2016-08-30 14:26, Eoin Breen:
> >> --- a/doc/guides/cryptodevs/qat.rst
> >> +++ b/doc/guides/cryptodevs/qat.rst
> >> @@ -78,6 +78,11 @@ Installation
> >> To use the DPDK QAT PMD an SRIOV-enabled QAT kernel driver is required. The
> >> VF devices exposed by this driver will be used by QAT PMD.
> >> 
> >> +To enable QAT in DPDK you must change the ./config/common_base file. 
> >> Change the
> >> +line 'CONFIG_RTE_LIBRTE_PMD_QAT=n' to 'CONFIG_RTE_LIBRTE_PMD_QAT=y' to do 
> >> this.
> > 
> > No, the recommended way is to change the value in the generated config
> > file (.config).
> 
> The way I have been changing the default configuration options is to copy the 
> config/defconfig_XYZ file like defconfig_x86_64-native-linuxapp-gcc to a new 
> name say defconfig-x86_64-qat-linuxapp-gcc. Then edit that file and add the 
> CONFIG_RTE_LIBRTE_PMD_QAT=y to the bottom of the file. Then ?make install 
> T=x86_64-qat-linuxapp-gcc -j?.
> 
> Is this not a better way to build a new configuration for a specific reason?

Yes you can also build your own defconfig.
I think it is better to stick to simply change the generated file between
"make config" and "make" for the documentation.


[dpdk-dev] [PATCH v8 22/25] eal/pci: inherit rte_driver by rte_pci_driver

2016-08-30 Thread Ferruh Yigit
On 8/26/2016 2:57 PM, Shreyansh Jain wrote:
> Remove the 'name' member from rte_pci_driver and move to generic rte_driver.
> 
> Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(..)
> as well as assigning a name to eth_driver.pci_drv.name member.
> In this patch, only the original DRIVER_REGISTER_PCI(..) name has been
> populated into the rte_driver.name member - assignments through eth_driver
> has been removed.
> 
> Signed-off-by: Jan Viktorin 
> Signed-off-by: Shreyansh Jain 
> ---

There are a few name fields:

1) eth_dev->data->name
2) eth_dev->data->drv_name
3) rte_driver->name
4) dev_info->driver_name


What should be the relation between them?

I guess 1) is device_name, 2, 3, 4 are same thing and driver_name.

If this is correct, virtual drivers needs to be updated for this,
because for them 3 != (2 == 4). They all use global variable for 2 & 4.

And what do you think removing 2) completely?
I guess it exists for virtual devices, since for them eth_driver is not
exists and not able to access to rte_driver->name from eth_dev, but this
is solvable.


Thanks,
ferruh


[dpdk-dev] [PATCH v8 10/25] eal/pci: Helpers for device name parsing/update

2016-08-30 Thread Pattan, Reshma
Hi,

> +/**
> + * Utility function to write a pci device name, this device name can
> +later be
> + * used to retrieve the corresponding rte_pci_addr using
> +eal_parse_pci_*
> + * BDF helpers.
> + *
> + * @param addr
> + *   The PCI Bus-Device-Function address
> + * @param output
> + *   The output buffer string
> + * @param size
> + *   The output buffer size
> + * @return
> + *  0 on success, negative on error.
> + */

This function doesn't have any return value. Need to change the @return 
description.

> +static inline void
> +rte_eal_pci_device_name(const struct rte_pci_addr *addr,
> + char *output, size_t size)
> +{
> + RTE_VERIFY(size >= PCI_PRI_STR_SIZE);
> + RTE_VERIFY(snprintf(output, size, PCI_PRI_FMT,
> + addr->domain, addr->bus,
> + addr->devid, addr->function) >= 0); }
> +

Thanks,
Reshma


[dpdk-dev] Unsupported SFP+ Module - Hardware Initialization Failure -19

2016-08-30 Thread Alex Forster
Hi guys,

I have a problem again that I had about a year ago[1]. My Finisar FTL410QE2C 
(multimode QSFP) transceivers aren?t working with my Intel X520-QDA1?s. I have 
7 servers, each with two X520?s and two QSFP transceivers in them, totaling 14 
NICs and transceivers, and all of them fail with the same error: Unsupported 
SFP+ Module / Hardware Initialization Failure: -19.

This doesn?t appear to be specifically a DPDK issue, but rather an IXGBE issue, 
since (a) the stock Debian 8 IXGBE, (b) the latest out of tree IXGBE, and (c) 
DPDK?s copy of IXGBE all fail in this same way. I?m happy to take this to an 
IXGBE mailing list if somebody can point me to one, but just in case anyone 
here can help, here?s the info I?ve gathered?

* The exact function where IXGBE bails is here, annotated with the values read 
from the transceiver?s EEPROMs: 
https://www.googledrive.com/host/0B_-81lR_g8lyNDJ6QkkxS28yQzQ
* I?ve modified DPDK?s copy of IXGBE to dump the whole EEPROM of these 
transceivers, which are here: 
https://gist.github.com/alexforster/5da30cd89d49877d5d6908e9628baee8

Here?s (I think) an important part: Last time I had this problem, I only had 
one server, and one of the two transceivers worked. I fixed the other 
transceiver in that server by replacing it with an identical model. Now, 
suddenly, neither of those transceivers (which have worked for ~11 months) are 
recognized. I don?t know how I could have done something that would have 
affected the EEPROM output, but I suspect I may have.

Anybody able to offer any advice?

Alex Forster

[1] http://dpdk.org/ml/archives/dev/2015-October/024973.html


[dpdk-dev] [PATCH] doc/guides: add info on how to enable QAT

2016-08-30 Thread Thomas Monjalon
2016-08-30 14:26, Eoin Breen:
> --- a/doc/guides/cryptodevs/qat.rst
> +++ b/doc/guides/cryptodevs/qat.rst
> @@ -78,6 +78,11 @@ Installation
>  To use the DPDK QAT PMD an SRIOV-enabled QAT kernel driver is required. The
>  VF devices exposed by this driver will be used by QAT PMD.
>  
> +To enable QAT in DPDK you must change the ./config/common_base file. Change 
> the
> +line 'CONFIG_RTE_LIBRTE_PMD_QAT=n' to 'CONFIG_RTE_LIBRTE_PMD_QAT=y' to do 
> this.

No, the recommended way is to change the value in the generated config
file (.config).

PS: please avoid confidential disclaimer


[dpdk-dev] [PATCH 1/2] examples/l2fwd: Add new option to enable/disable MAC addresses tweaking

2016-08-30 Thread Mcnamara, John

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maxime Coquelin
> Sent: Tuesday, July 26, 2016 4:56 PM
> To: Richardson, Bruce ; De Lara Guarch, Pablo
> 
> Cc: dev at dpdk.org; Maxime Coquelin 
> Subject: [dpdk-dev] [PATCH 1/2] examples/l2fwd: Add new option to
> enable/disable MAC addresses tweaking
> 
> l2fwd could be useful for testing virtual devices without the need of
> physical ones.
> 
> To achieve this, this patch adds a new option to enable/disable the MAC
> addresses tweaking done at forwarding time: --[no-]mac-tweaking

Acked-by: John McNamara 


[dpdk-dev] [PATCH 0/2] examples/l2fwd: Add option to enable/disable MAC addresses tweaking

2016-08-30 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maxime Coquelin
> Sent: Tuesday, July 26, 2016 4:56 PM
> To: Richardson, Bruce ; De Lara Guarch, Pablo
> 
> Cc: dev at dpdk.org; Maxime Coquelin 
> Subject: [dpdk-dev] [PATCH 0/2] examples/l2fwd: Add option to
> enable/disable MAC addresses tweaking
> 
> This series adds a new option to enable/disable MAC addresses tweaking in
> l2fwd example.
> 
> Doing that, we can enable basic VM 2 VM communication easily, without
> external projects dependencies, nor real NIC (as with vhost example).

Hi,

This looks like a useful feature. However, I don't know if "tweaking" is
the best description of the feature. Maybe "updating" would be better.

John


[dpdk-dev] [PATCH 3/4] doc: add basic invocation info for dpdk-devbind

2016-08-30 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Christian Ehrhardt
> Sent: Thursday, August 4, 2016 12:17 PM
> To: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com;
> dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 3/4] doc: add basic invocation info for dpdk-
> devbind
> 
> +
> +OPTIONS
> +---
> +
> +* ``--help, --usage``
> +
> +Display usage information and quit
> +
> +* ``-s, --status``
> +
> +Print the current status of all known network interfaces.
> +For each device, it displays the PCI domain, bus, slot and
> function,
> +along with a text description of the device. Depending upon
> whether the
> +device is being used by a kernel driver, the igb_uio driver, or
> no
> +driver, other relevant information will be displayed:
> +* the Linux interface name e.g. if=eth0
> +* the driver being used e.g. drv=igb_uio
> +* any suitable drivers not currently using that device
> +e.g. unused=igb_uio
> +NOTE: if this flag is passed along with a bind/unbind option, the
> +status display will always occur after the other operations have
> taken
> +place.

There are a few RST errors in this file. One of theme relates to the second
level bullet list above. There should be a blank line before and after the list.

Also, "e.g. unused=igb_uio" should be joined to, or aligned with, the previous
line.

Also, it would be better to quote any fixed width strings in the docs with 
quotes, like ``unused=igb_uio``. This could be applied to any of the 
``--options``
in the text as well.
> +Examples
> +
> +
> +To display current device status:
> +.. code-block:: console
> +
> +   dpdk-devbind --status
> +

All the "code-block" directives should have a blank line before them.
However it is probably better to use the simpler :: directive, like:

To display current device status::

dpdk-devbind --status



[dpdk-dev] [PATCH 2/4] doc: add basic invocation info for dpdk-pmdinfo

2016-08-30 Thread Mcnamara, John


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Christian Ehrhardt
> Sent: Thursday, August 4, 2016 12:17 PM
> To: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com;
> dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 2/4] doc: add basic invocation info for dpdk-
> pmdinfo
> 
> This summarizes the "how to call dpdk-pmdinfo" in one place to be picked
> up by html/pdf/man-page docs.
> 
> ...
>
>  # The following hook functions add some simple handling for the :numref:
> diff --git a/doc/guides/sample_app_ug/index.rst
> b/doc/guides/sample_app_ug/index.rst
> index 96bb317..7801688 100644
> --- a/doc/guides/sample_app_ug/index.rst
> +++ b/doc/guides/sample_app_ug/index.rst
> @@ -77,6 +77,7 @@ Sample Applications User Guide

I think these docs would be better in a "doc/guides/tools" directory.
That would be clearer in terms to the documentation structure and
also in terms of their functionality



> +dpdk-pmdinfo Application
> +
> +
> +The ``dpdk-pmdinfo`` tool is a Data Plane Development Kit (DPDK) tool
> +that can dump a PMDs hardware support info.

To avoid using "tool" twice in the sentence you could use "utility" the
second time.



> +
> +   .. Note::
> +
> +  * The actual data is stored in the object files as
> + PMD_INFO_STRING

I'd leave this note out of the description.


> +Running the Application
> +---
> +
> +The tool has a number of command line options:
> +
> +.. code-block:: console
> +
> +
> +   dpdk-pmdinfo [-hrtp] [-d 
> +
> +   -h, --helpshow a short help message and exit
> +   -r, --raw Dump as raw json strings
> +   -d FILE, --pcidb=FILE
> + specify a pci database to get vendor names from
> +   -t, --table   output information on hw support as a hex table
> +   -p, --plugindir   scan dpdk for autoload plugins
> +

One of the descriptions is in sentence case and the others aren't. It should
be one or the other. I would use sentence case for all.




[dpdk-dev] [PATCH 1/4] doc: rendering and installation of man pages

2016-08-30 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Christian Ehrhardt
> Sent: Thursday, August 4, 2016 12:17 PM
> To: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com;
> dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 1/4] doc: rendering and installation of man
> pages
> 

Acked-by: John McNamara 


[dpdk-dev] [PATCH 0/4] provide man pages for binaries provided by DPDK

2016-08-30 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Christian Ehrhardt
> Sent: Thursday, August 4, 2016 12:17 PM
> To: christian.ehrhardt at canonical.com; thomas.monjalon at 6wind.com;
> dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/4] provide man pages for binaries provided by
> DPDK
> 
> Hi,
> this is about providing manpages for the binaries installed by DPDK.
> Eventually people using commands expect at least something reasonable
> avalable behind "man command".
> 

Hi Christian,

Thanks for that. It is really useful and the output looks very good.

I'll make a few minor comments to the patches.

John


[dpdk-dev] [PATCH v8 00/25] Introducing rte_driver/rte_device generalization

2016-08-30 Thread Ferruh Yigit
On 8/26/2016 2:56 PM, Shreyansh Jain wrote:
> Based on master (e22856313fff2)
> 
> Background:
> ===
> 
> It includes two different patch-sets floated on ML earlier:
>  * Original patch series is from David Marchand [1], [2].
>   `- This focused mainly on PCI (PDEV) part
>   `- v7 of this was posted by me [8] in August/2016
>  * Patch series [4] from Jan Viktorin
>   `- This focused on VDEV and rte_device integration
> 
> Introduction:
> =
> 
> This patch series introduces a generic device model, moving away from PCI 
> centric code layout. Key change is to introduce rte_driver/rte_device 
> structures at the top level which are inherited by 
> rte_XXX_driver/rte_XXX_device - where XXX belongs to {pci, vdev, soc (in 
> future),...}.
> 
> Key motivation for this series is to move away from PCI centric design of 
> EAL to a more hierarchical device model - pivoted around a generic driver 
> and device. Each specific driver and device can inherit the common 
> properties of the generic set and build upon it through driver/device 
> specific functions.
> 
> Earlier, the EAL device initialization model was:
> (Refer: [3])
> 
> --
>  Constructor:
>   |- PMD_DRIVER_REGISTER(rte_driver)
>  `-  insert into dev_driver_list, rte_driver object
> 
>  rte_eal_init():
>   |- rte_eal_pci_init()
>   |  `- scan and fill pci_device_list from sysfs
>   |
>   |- rte_eal_dev_init()
>   |  `- For each rte_driver in dev_driver_list
>   | `- call the rte_driver->init() function
>   ||- PMDs designed to call rte_eth_driver_register(eth_driver)
>   ||- eth_driver have rte_pci_driver embedded in them
>   |`- rte_eth_driver_register installs the 
>   |   rte_pci_driver->devinit/devuninit callbacks.
>   |
>   |- rte_eal_pci_probe()
>   |  |- For each device detected, dev_driver_list is parsed and matching is
>   |  |  done.
>   |  |- For each matching device, the rte_pci_driver->devinit() is called.
>   |  |- Default map is to rte_eth_dev_init() which in turn creates a
>   |  |  new ethernet device (eth_dev)
>   |  |  `- eth_drv->eth_dev_init() is called which is implemented by 
>   `--|individual PMD drivers.
> 
> --
> 
> The structure of driver looks something like:
> 
>  ++ ._.
>  | rte_driver <-| PMD |___
>  |  .init | `-`   \
>  +.---+  | \
>   `-.| What PMD actually is
>  \   |  |
>   +--v+ |
>   | eth_driver| |
>   | .eth_dev_init | |
>   +.--+ |
>`-.  |
>   \ |
>+v---+
>| rte_pci_driver |
>| .pci_devinit   |
>++
> 
>   and all devices are part of a following linked lists:
> - dev_driver_list for all rte_drivers
> - pci_device_list for all devices, whether PCI or VDEV
> 
> 
> From the above:
>  * a PMD initializes a rte_driver, eth_driver even though actually it is a 
>pci_driver
>  * initialization routines are passed from rte_driver->pci_driver->eth_driver
>even though they should ideally be rte_eal_init()->rte_pci_driver()
>  * For a single driver/device type model, this is not necessarily a
>functional issue - but more of a design language.
>  * But, when number of driver/device type increase, this would create problem
>in how driver<=>device links are represented.
> 
> Proposed Architecture:
> ==
> 
> A nice representation has already been created by David in [3]. Copying that
> here:
> 
> +--+ +---+
> |  | |   |
> | rte_pci_device   | | rte_pci_driver|
> |  | |   |
> +-+ | +--+ | | +---+ |
> | | | |  | | | |   | |
> | rte_eth_dev +---> rte_device   +-> rte_driver| |
> | | | |  char name[] | | | |  char name[]  | |
> +-+ | |  | | | |  int init(rte_device *)   | |
> | +--+ | | |  int uninit(rte_device *) | |
> |  | | |   | |
> +--+ | +---+ |
>  |   |
>  +---+
> 
> - for ethdev on top of vdev devices
> 
> +--+ +---+
> |  | |   |
> | drv specific | | rte_vdev_driver   |
> |  

[dpdk-dev] [PATCH] doc/guides: add info on how to enable QAT

2016-08-30 Thread Eoin Breen
Signed-off-by: Eoin Breen 
---
 doc/guides/cryptodevs/qat.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index cae1958..db03470 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -78,6 +78,11 @@ Installation
 To use the DPDK QAT PMD an SRIOV-enabled QAT kernel driver is required. The
 VF devices exposed by this driver will be used by QAT PMD.

+To enable QAT in DPDK you must change the ./config/common_base file. Change the
+line 'CONFIG_RTE_LIBRTE_PMD_QAT=n' to 'CONFIG_RTE_LIBRTE_PMD_QAT=y' to do this.
+You must then configure and build dpdk, for example using the commands:
+make T=x86_64-native-linuxapp-gcc config; make
+
 If you are running on kernel 4.4 or greater, see instructions for
 `Installation using kernel.org driver`_ below. If you are on a kernel earlier
 than 4.4, see `Installation using 01.org QAT driver`_.
-- 
2.5.5

--
Intel Research and Development Ireland Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263


This e-mail and any attachments may contain confidential material for the sole
use of the intended recipient(s). Any review or distribution by others is
strictly prohibited. If you are not the intended recipient, please contact the
sender and delete all copies.



[dpdk-dev] [PATCH] doc/guides: add info on how to enable QAT

2016-08-30 Thread Wiles, Keith

Regards,
Keith

> On Aug 30, 2016, at 8:46 AM, Thomas Monjalon  
> wrote:
> 
> 2016-08-30 14:26, Eoin Breen:
>> --- a/doc/guides/cryptodevs/qat.rst
>> +++ b/doc/guides/cryptodevs/qat.rst
>> @@ -78,6 +78,11 @@ Installation
>> To use the DPDK QAT PMD an SRIOV-enabled QAT kernel driver is required. The
>> VF devices exposed by this driver will be used by QAT PMD.
>> 
>> +To enable QAT in DPDK you must change the ./config/common_base file. Change 
>> the
>> +line 'CONFIG_RTE_LIBRTE_PMD_QAT=n' to 'CONFIG_RTE_LIBRTE_PMD_QAT=y' to do 
>> this.
> 
> No, the recommended way is to change the value in the generated config
> file (.config).

The way I have been changing the default configuration options is to copy the 
config/defconfig_XYZ file like defconfig_x86_64-native-linuxapp-gcc to a new 
name say defconfig-x86_64-qat-linuxapp-gcc. Then edit that file and add the 
CONFIG_RTE_LIBRTE_PMD_QAT=y to the bottom of the file. Then ?make install 
T=x86_64-qat-linuxapp-gcc -j?.

Is this not a better way to build a new configuration for a specific reason?

> 
> PS: please avoid confidential disclaimer



[dpdk-dev] support BCM5719 driver

2016-08-30 Thread Keren Hochman
Hi, 

I tried to bind BCM5719 driver to DPDK compatible driver: (I tried both to 
uio_pci_generic and igb_uio)

Network devices using DPDK-compatible driver

:03:00.2 'NetXtreme BCM5719 Gigabit Ethernet PCIe' drv=uio_pci_generic 
unused=tg3,igb_uio
:03:00.3 'NetXtreme BCM5719 Gigabit Ethernet PCIe' drv=igb_uio 
unused=tg3,uio_pci_generic

But when I ran from the dpdk example: link_status_interrupt I received an error 
No ethernet port. 

(When I tried to bind I350 Gigabit Network Connection it worked.

Thank you, Keren



[dpdk-dev] [PATCH v8 01/25] eal: define macro container_of

2016-08-30 Thread Thomas Monjalon
2016-08-30 09:57, Shreyansh Jain:
> On Monday 29 August 2016 10:13 PM, Ferruh Yigit wrote:
> > This gives compilation error for mlx5, because the libraries mlx depends
> > defines same macro:
> > /rte_common.h:338:9: error: 'container_of' macro redefined
> > /usr/include/infiniband/verbs.h:77:9: note: previous definition is here
> 
> I thought testing with scripts/test-build.sh and default configuration 
> would compile all drivers - I was wrong. I will retest the patches and 
> release again.
> 
> Is there a better way to test that no driver breaks? Any particular 
> parameters I should use for test-build.sh?

Yes I suggest to create a file ~/.config/dpdk/devel.config to adapt the
configuration to your system.
Once you have installed the required dependencies, you can make this kind
of configuration:

mlxdep=$root/mlx/mofed-3.3-1.0.0.0
szedep=$root/sze/usr-1.1.4
if echo $DPDK_TARGET | grep -q '^x86_64' ; then
export DPDK_DEP_ARCHIVE=y
export DPDK_DEP_ZLIB=y
export DPDK_DEP_PCAP=y
export DPDK_DEP_SSL=y
export DPDK_DEP_MOFED=y
export DPDK_DEP_SZE=y
export DPDK_DEP_CFLAGS="-I$mlxdep/include -I$szedep/include"
export DPDK_DEP_LDFLAGS="-L$mlxdep/lib -L$szedep/lib64 -rpath=$szedep/lib64"
export AESNI_MULTI_BUFFER_LIB_PATH=$root/aesni/ipsec-043
export LIBSSO_SNOW3G_PATH=$root/libsso/libsso-snow3g-0.3.1
export LIBSSO_KASUMI_PATH=$root/libsso/libsso-kasumi-0.3.1
fi

> I used 'x86_64-native-linuxapp-gcc+default+debug+shared' for all patches.

It is a good idea to test also with clang (x86_64-native-linuxapp-clang)
and another arch (e.g. arm64-thunderx-linuxapp-gcc).



[dpdk-dev] [RFC] drivers: advertise kmod dependencies in pmdinfo

2016-08-30 Thread Olivier Matz
Hi Matej,

On 08/30/2016 10:40 AM, Matej Vido wrote:
> On 26.08.2016 15:20, Olivier Matz wrote:
> 
>> Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
>> declare the list of kernel modules required to run properly.
>>
>> Today, most PCI drivers require uio/vfio.
>>
>> Signed-off-by: Olivier Matz 
>> ---
> [..]
>>  
>> diff --git a/drivers/net/szedata2/rte_eth_szedata2.c
>> b/drivers/net/szedata2/rte_eth_szedata2.c
>> index 483d789..409e71f 100644
>> --- a/drivers/net/szedata2/rte_eth_szedata2.c
>> +++ b/drivers/net/szedata2/rte_eth_szedata2.c
>> @@ -1602,3 +1602,5 @@ static struct rte_driver rte_szedata2_driver = {
>> PMD_REGISTER_DRIVER(rte_szedata2_driver, RTE_SZEDATA2_DRIVER_NAME);
>>   DRIVER_REGISTER_PCI_TABLE(RTE_SZEDATA2_DRIVER_NAME,
>> rte_szedata2_pci_id_table);
>> +DRIVER_REGISTER_KMOD_DEP(RTE_SZEDATA2_DRIVER_NAME,
>> +"uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci");
> Hi Olivier,
> 
> szedata2 doesn't require uio/vfio modules. Instead the following lines
> could be used:
> 
> +DRIVER_REGISTER_KMOD_DEP(RTE_SZEDATA2_DRIVER_NAME,
> +"combo6core,combov3,szedata2,szedata2_cv3");
> 

ok, I will update it for next revision, thanks !

Olivier


[dpdk-dev] [PATCH v8 22/25] eal/pci: inherit rte_driver by rte_pci_driver

2016-08-30 Thread Shreyansh Jain
On Monday 29 August 2016 10:19 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:57 PM, Shreyansh Jain wrote:
>> Remove the 'name' member from rte_pci_driver and move to generic rte_driver.
>>
>> Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(..)
>> as well as assigning a name to eth_driver.pci_drv.name member.
>> In this patch, only the original DRIVER_REGISTER_PCI(..) name has been
>> populated into the rte_driver.name member - assignments through eth_driver
>> has been removed.
>>
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  app/test/test_pci.c | 10 +++---
>>  app/test/virtual_pmd.c  |  2 +-
>>  drivers/crypto/qat/qat_qp.c |  2 +-
>>  drivers/net/bnx2x/bnx2x_ethdev.c|  2 --
>
> bnx2x_rxtx.c also needs to be updated:
>
>   CC bnx2x_rxtx.o
> /drivers/net/bnx2x/bnx2x_rxtx.c:22:25: error: no member named 'name'
> in 'struct rte_pci_driver'
> dev->driver->pci_drv.name, ring_name,
> dev->data->port_id, queue_id);
>
> 
>
>
> Also szedata2 driver seems missed:
>
>   CC rte_eth_szedata2.o
> /drivers/net/szedata2/rte_eth_szedata2.c:1575:4: error: field
> designator 'name' does not refer to any field in type 'struct
> rte_pci_driver'
> .name = RTE_SZEDATA2_PCI_DRIVER_NAME,
>  ^
>
>
>

Ok - I will fix this. It seems I really need a better way to test before 
sending out the patches.

-
Shreyansh   


[dpdk-dev] [PATCH v8 25/25] eal/pci: Create rte_device list and fallback on its members

2016-08-30 Thread Shreyansh Jain
On Monday 29 August 2016 10:23 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:57 PM, Shreyansh Jain wrote:
>> Now that rte_device is available, drivers can start using its members (numa,
>> name) as well as link themselves into another rte_device list.
>>
>> As of now no one is using this list, but can be used for moving over all
>> devices (pdev/vdev/Xdev) and perform bulk actions (like cleanup).
>>
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  app/test/virtual_pmd.c  |  4 ++--
>>  drivers/net/fm10k/fm10k_ethdev.c|  6 +++---
>>  drivers/net/i40e/i40e_ethdev.c  |  6 --
>>  drivers/net/virtio/virtio_pci.c |  5 +++--
>>  lib/librte_eal/bsdapp/eal/eal_pci.c |  2 +-
>>  lib/librte_eal/common/eal_common_pci.c  | 11 ++-
>>  lib/librte_eal/common/include/rte_pci.h |  3 +--
>>  lib/librte_eal/linuxapp/eal/eal_pci.c   |  7 +--
>>  lib/librte_ether/rte_ethdev.c   |  2 +-
>>  9 files changed, 26 insertions(+), 20 deletions(-)
>
> mlx5.c needs to be updated too:
>
>   CC mlx5.o
> /drivers/net/mlx5/mlx5.c:514:34: error: no member named 'devargs' in
> 'struct rte_pci_device'
> err = mlx5_args(priv, pci_dev->devargs);
>   ~~~  ^
>
>

Ok. I will fix this. Thanks.

-
Shreyansh



[dpdk-dev] [PATCH v8 17/25] drivers: convert PMD_VDEV drivers to use rte_vdev_driver

2016-08-30 Thread Shreyansh Jain
On Monday 29 August 2016 10:27 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:56 PM, Shreyansh Jain wrote:
>> All PMD_VDEV drivers can now use rte_vdev_driver instead of the
>> rte_driver (which is embedded in the rte_vdev_driver).
>>
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>
> 
>
>> diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.c 
>> b/drivers/net/xenvirt/rte_eth_xenvirt.c
>> index 6b15381..fa00e52 100644
>> --- a/drivers/net/xenvirt/rte_eth_xenvirt.c
>> +++ b/drivers/net/xenvirt/rte_eth_xenvirt.c
>> @@ -759,12 +759,14 @@ rte_pmd_xenvirt_devuninit(const char *name)
>>  return 0;
>>  }
>
> xenvirt missing header file rte_vdev.h:
>
>   CC rte_eth_xenvirt.o
> /drivers/net/xenvirt/rte_eth_xenvirt.c:762:31: error: variable has
> incomplete type 'struct rte_vdev_driver'
> static struct rte_vdev_driver pmd_xenvirt_drv = {

Thanks for highlighting. I will fix this.

>
>
>>
>> -static struct rte_driver pmd_xenvirt_drv = {
>> -.type = PMD_VDEV,
>> -.init = rte_pmd_xenvirt_devinit,
>> -.uninit = rte_pmd_xenvirt_devuninit,
>> +static struct rte_vdev_driver pmd_xenvirt_drv = {
>> +.driver = {
>> +.type = PMD_VDEV,
>> +.init = rte_pmd_xenvirt_devinit,
>> +.uninit = rte_pmd_xenvirt_devuninit
>> +},
>>  };
>>
>> -PMD_REGISTER_DRIVER(pmd_xenvirt_drv, eth_xenvirt);
>> +DRIVER_REGISTER_VDEV(eth_xenvirt, pmd_xenvirt_drv);
>>  DRIVER_REGISTER_PARAM_STRING(eth_xenvirt,
>>  "mac=");
>>
>
>

-
Shreyansh



[dpdk-dev] [PATCH v8 09/25] driver: Remove driver register callbacks for crypto/net

2016-08-30 Thread Shreyansh Jain
Hi Ferruh,

On Monday 29 August 2016 10:50 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:56 PM, Shreyansh Jain wrote:
>> Now that all pdev are pci drivers, we don't need to register crypto and 
>> ethdev
>> drivers through a dedicated channel.
>>
>> Signed-off-by: David Marchand 
>> Signed-off-by: Shreyansh Jain 
>
> 
>
>> -void rte_eth_driver_register(struct eth_driver *eth_drv);
>
> Function needs to be removed from .map file

Actually here I was a little confused.
This is part of DPDK_2.2 - an old release. Would removing from an old 
release break compatibility for any system using the new release but 
still adhering to older APIs?
Also, there is the case of this comment [1] from Neil Horman (and 
counter argument [2]) about not removing 
'rte_cryptodev_pmd_driver_register' from map file because there was no 
deprecation notice.

Overall situation is that I have maintained the previous patch of 
'rte_cryptodev_pmd_driver_register' removal but refrained from removing 
new symbol - waiting for a little more clarification.

Is it OK to remove a symbol from an older release without a deprecation 
notice? Or, should a deprecation notice follow this patch?

[1] http://dpdk.org/ml/archives/dev/2016-June/042439.html
[2] http://dpdk.org/ml/archives/dev/2016-June/042444.html

>
>> -
>> -/**
>>   * Convert a numerical speed in Mbps to a bitmap flag that can be used in
>>   * the bitmap link_speeds of the struct rte_eth_conf
>>   *
>>
>
>

-
Shreyansh


[dpdk-dev] [PATCH v8 01/25] eal: define macro container_of

2016-08-30 Thread Shreyansh Jain
Hi Ferruh,

Forgot to add a comment in previous reply to this email:

On Monday 29 August 2016 10:13 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:56 PM, Shreyansh Jain wrote:
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_eal/common/include/rte_common.h | 16 
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/rte_common.h 
>> b/lib/librte_eal/common/include/rte_common.h
>> index 332f2a4..a9b6792 100644
>> --- a/lib/librte_eal/common/include/rte_common.h
>> +++ b/lib/librte_eal/common/include/rte_common.h
>> @@ -322,6 +322,22 @@ rte_bsf32(uint32_t v)
>>  #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
>>  #endif
>>
>> +/**
>> + * Return pointer to the wrapping struct instance.
>> + * Example:
>> + *
>> + *  struct wrapper {
>> + *  ...
>> + *  struct child c;
>> + *  ...
>> + *  };
>> + *
>> + *  struct child *x = obtain(...);
>> + *  struct wrapper *w = container_of(x, struct wrapper, c);
>> + */
>> +#define container_of(p, type, member) \
>> +((type *) (((char *) (p)) - offsetof(type, member)))
>> +
>>  #define _RTE_STR(x) #x
>>  /** Take a macro value and get a string version of it */
>>  #define RTE_STR(x) _RTE_STR(x)
>>
>
> This gives compilation error for mlx5, because the libraries mlx depends
> defines same macro:
> /rte_common.h:338:9: error: 'container_of' macro redefined
> /usr/include/infiniband/verbs.h:77:9: note: previous definition is here
>
> Does it make sense to protect macro with
> #ifndef container_of
> 
> #endif

Sounds good - probably would prevent double definition in future if 
someone includes any linux header having similar definition.

Generally the container_of definitions are consistent - that is, they 
would invariably use the offsetof from member. In which case, creating a 
new dpdk_* would only duplicate. Thus, I prefer the #ifndef.

>
> OR
>
> add a dpdk prefix?
>
>
> Regards,
> ferruh
>

-
Shreyansh



[dpdk-dev] [PATCH v8 01/25] eal: define macro container_of

2016-08-30 Thread Shreyansh Jain
Hi Ferruh,

On Monday 29 August 2016 10:13 PM, Ferruh Yigit wrote:
> On 8/26/2016 2:56 PM, Shreyansh Jain wrote:
>> Signed-off-by: Jan Viktorin 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_eal/common/include/rte_common.h | 16 
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/rte_common.h 
>> b/lib/librte_eal/common/include/rte_common.h
>> index 332f2a4..a9b6792 100644
>> --- a/lib/librte_eal/common/include/rte_common.h
>> +++ b/lib/librte_eal/common/include/rte_common.h
>> @@ -322,6 +322,22 @@ rte_bsf32(uint32_t v)
>>  #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
>>  #endif
>>
>> +/**
>> + * Return pointer to the wrapping struct instance.
>> + * Example:
>> + *
>> + *  struct wrapper {
>> + *  ...
>> + *  struct child c;
>> + *  ...
>> + *  };
>> + *
>> + *  struct child *x = obtain(...);
>> + *  struct wrapper *w = container_of(x, struct wrapper, c);
>> + */
>> +#define container_of(p, type, member) \
>> +((type *) (((char *) (p)) - offsetof(type, member)))
>> +
>>  #define _RTE_STR(x) #x
>>  /** Take a macro value and get a string version of it */
>>  #define RTE_STR(x) _RTE_STR(x)
>>
>
> This gives compilation error for mlx5, because the libraries mlx depends
> defines same macro:
> /rte_common.h:338:9: error: 'container_of' macro redefined
> /usr/include/infiniband/verbs.h:77:9: note: previous definition is here

I thought testing with scripts/test-build.sh and default configuration 
would compile all drivers - I was wrong. I will retest the patches and 
release again.

Is there a better way to test that no driver breaks? Any particular 
parameters I should use for test-build.sh?

I used 'x86_64-native-linuxapp-gcc+default+debug+shared' for all patches.

>
> Does it make sense to protect macro with
> #ifndef container_of
> 
> #endif
>
> OR
>
> add a dpdk prefix?
>
>
> Regards,
> ferruh
>

-
Shreyansh



[dpdk-dev] [PATCH v2] add mtu set in virtio

2016-08-30 Thread Maxime Coquelin
Hi Souvik,

On 08/30/2016 01:02 AM, souvikdey33 wrote:
> Signed-off-by: Souvik Dey 
>
> Fixes: 1fb8e8896ca8 ("Signed-off-by: Souvik Dey ")
> Reviewed-by: Stephen Hemminger 
>
> Virtio interfaces should also support setting of mtu, as in case of cloud
> it is expected to have the consistent mtu across the infrastructure that
> the dhcp server sends and not hardcoded to 1500(default).
> ---
>  drivers/net/virtio/virtio_ethdev.c | 12 
>  1 file changed, 12 insertions(+)

FYI, there are some on-going changes in the VIRTIO specification
so that the VHOST interface exposes its MTU to its VIRTIO peer.
It may also be used as an alternative of what you patch achieves.

I am working on its implementation in Qemu/DPDK, our goal being to
reduce performance drops for small packets with Rx mergeable buffers
feature enabled.

Regards,
Maxime


[dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

2016-08-30 Thread Neil Horman
On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:
> Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
> declare the list of kernel modules required to run properly.
> 
> Today, most PCI drivers require uio/vfio.
> 
> Signed-off-by: Olivier Matz 
> 
> ---
> In this RFC, I supposed that all PCI drivers require a the loading of a
> uio/vfio module (except mlx*), this may be wrong.
> Comments are welcome!
> 
> 
>  buildtools/pmdinfogen/pmdinfogen.c  |  1 +
>  buildtools/pmdinfogen/pmdinfogen.h  |  1 +
>  drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
>  drivers/net/bnx2x/bnx2x_ethdev.c|  4 
>  drivers/net/bnxt/bnxt_ethdev.c  |  2 ++
>  drivers/net/cxgbe/cxgbe_ethdev.c|  2 ++
>  drivers/net/e1000/em_ethdev.c   |  2 ++
>  drivers/net/e1000/igb_ethdev.c  |  4 
>  drivers/net/ena/ena_ethdev.c|  2 ++
>  drivers/net/enic/enic_ethdev.c  |  2 ++
>  drivers/net/fm10k/fm10k_ethdev.c|  2 ++
>  drivers/net/i40e/i40e_ethdev.c  |  2 ++
>  drivers/net/i40e/i40e_ethdev_vf.c   |  2 ++
>  drivers/net/ixgbe/ixgbe_ethdev.c|  4 
>  drivers/net/mlx4/mlx4.c |  2 ++
>  drivers/net/mlx5/mlx5.c |  3 +++
>  drivers/net/nfp/nfp_net.c   |  2 ++
>  drivers/net/qede/qede_ethdev.c  |  4 
>  drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
>  drivers/net/thunderx/nicvf_ethdev.c |  2 ++
>  drivers/net/virtio/virtio_ethdev.c  |  2 ++
>  drivers/net/vmxnet3/vmxnet3_ethdev.c|  2 ++
>  lib/librte_eal/common/include/rte_dev.h | 14 ++
>  tools/dpdk-pmdinfo.py   |  5 -
>  24 files changed, 69 insertions(+), 1 deletion(-)
> 

Generally speaking, I like the idea, it makes sense to me in terms of using
pmdinfo to export this information

That said, This may need to be a set of macros.  By that I mean (and correct me
if I'm wrong here), but the relationship between pmd's and kernel modules is in
some cases, more complex than a 'requires' or 'depends' relationship.  That is
to say, some pmd may need user space hardware access, but can use either uio OR
vfio, but doesn't need both, and can continue to function if only one is
available.  Other PMD's may be able to use vfio or uio, but can still function
without either.  And some, as your patch implements, simply require one or the
other to function.  As such it seems like you may want a few macros, in the form
of:

DRIVER_REGISTER_KMOD_REQUEST - List of modules to attempt loading, ignore any
failures 
DRIVER_REGISTER_KMOD_REQUIRE - List of modules required to be loaded after
request macro completes, fail if any are not loaded

Thats just spitballing, mind you, theres probably a better way to do it, but the
idea is to list a set of modules you would like to have, and then create a
parsable syntax to describe the modules that need to be loaded after the request
is complete so that you can accurately codify the situations I described above.

Neil

> diff --git a/buildtools/pmdinfogen/pmdinfogen.c 
> b/buildtools/pmdinfogen/pmdinfogen.c
> index e1bf2e4..1e5b6f3 100644
> --- a/buildtools/pmdinfogen/pmdinfogen.c
> +++ b/buildtools/pmdinfogen/pmdinfogen.c
> @@ -269,6 +269,7 @@ struct opt_tag {
>  
>  static const struct opt_tag opt_tags[] = {
>   {"_param_string_export", "params"},
> + {"_kmod_dep_export", "kmod"},
>  };
>  
>  static int complete_pmd_entry(struct elf_info *info, struct pmd_driver *drv)
> diff --git a/buildtools/pmdinfogen/pmdinfogen.h 
> b/buildtools/pmdinfogen/pmdinfogen.h
> index 1da2966..2fab2aa 100644
> --- a/buildtools/pmdinfogen/pmdinfogen.h
> +++ b/buildtools/pmdinfogen/pmdinfogen.h
> @@ -85,6 +85,7 @@ else \
>  
>  enum opt_params {
>   PMD_PARAM_STRING = 0,
> + PMD_KMOD_DEP,
>   PMD_OPT_MAX
>  };
>  
> diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
> b/drivers/crypto/qat/rte_qat_cryptodev.c
> index 82ab047..fc62be9 100644
> --- a/drivers/crypto/qat/rte_qat_cryptodev.c
> +++ b/drivers/crypto/qat/rte_qat_cryptodev.c
> @@ -135,4 +135,6 @@ static struct rte_driver pmd_qat_drv = {
>  
>  PMD_REGISTER_DRIVER(pmd_qat_drv, CRYPTODEV_NAME_QAT_SYM_PMD);
>  DRIVER_REGISTER_PCI_TABLE(CRYPTODEV_NAME_QAT_SYM_PMD, pci_id_qat_map);
> +DRIVER_REGISTER_KMOD_DEP(CRYPTODEV_NAME_QAT_SYM_PMD,
> + "uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci");
>  
> diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c 
> b/drivers/net/bnx2x/bnx2x_ethdev.c
> index f3ab355..ba8831a 100644
> --- a/drivers/net/bnx2x/bnx2x_ethdev.c
> +++ b/drivers/net/bnx2x/bnx2x_ethdev.c
> @@ -667,5 +667,9 @@ static struct rte_driver rte_bnx2xvf_driver = {
>  
>  PMD_REGISTER_DRIVER(rte_bnx2x_driver, bnx2x);
>  DRIVER_REGISTER_PCI_TABLE(bnx2x, pci_id_bnx2x_map);
> +DRIVER_REGISTER_KMOD_DEP(bnx2x,
> + "uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci");
>  PMD_REGISTER_DRIVER(rte_bnx2xvf_driver, bnx2xvf);
>  DRIVER_REGISTER_PCI_TABLE(bnx2xvf, pci_id_bnx2xvf_map);
> 

[dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

2016-08-30 Thread Mussar, Gary
-Original Message-
From: Dey, Souvik [mailto:so...@sonusnet.com] 
Sent: Monday, August 29, 2016 7:17 PM
To: Mussar, Gary; Stephen Hemminger
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

Hi,

I already followed the 100% python way and submitted the v3 of this patch. 
http://dpdk.org/dev/patchwork/patch/15378/
How will your patch be different in solving the issue. There will always be 
multiple ways to solving things right.

GM> When I first tackled this problem I used Popen() and got the exact same 
feedback about using 100% python. The version I posted yesterday satisfied the 
internal reviewers. 


V3 of my submitted patch:

diff --git a/tools/dpdk-devbind.py b/tools/dpdk-devbind.py index 
b69ca2a..c0b46ee 100755
--- a/tools/dpdk-devbind.py
+++ b/tools/dpdk-devbind.py
@@ -36,6 +36,7 @@  import sys
 import os
 import getopt
 import subprocess
+
 from os.path import exists, abspath, dirname, basename

 # The PCI base class for NETWORK devices @@ -222,8 +223,19 @@  def 
get_pci_device_details(dev_id):
 device[name] = value
 # check for a unix interface name
 sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
+# the path for virtio devices are different, so get the correct path
+virtio = "/sys/bus/pci/devices/%s/" % dev_id
+ls = subprocess.Popen(['ls', virtio], stdout=subprocess.PIPE)
+grep = subprocess.Popen('grep virt'.split(), stdin=ls.stdout,
+stdout=subprocess.PIPE)
+ls.stdout.close()
+virtio = grep.communicate()[0].rstrip()
+ls.wait()
+virtio_sys_path = "/sys/bus/pci/devices/%s/%s/net/" % (dev_id, 
+ virtio)
 if exists(sys_path):
 device["Interface"] = ",".join(os.listdir(sys_path))
+elif exists(virtio_sys_path):
+device["Interface"] = ",".join(os.listdir(virtio_sys_path))
 else:
 device["Interface"] = ""
 # check if a port is used for ssh connection


-Original Message-
From: Mussar, Gary [mailto:gmus...@ciena.com]
Sent: Monday, August 29, 2016 11:10 AM
To: Dey, Souvik ; Stephen Hemminger 
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

We did this slightly differently. This is 100% python and is a bit more 
general. We search for the first "net" directory under the specific device 
directory.

---
--- tools/dpdk-devbind.py   2016-08-29 11:02:35.594202888 -0400
+++ ../dpdk/tools/dpdk-devbind.py 2016-08-29 11:00:34.897677233 -0400
@@ -221,11 +221,11 @@
 name = name.strip(":") + "_str"
 device[name] = value
 # check for a unix interface name
-sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
-if exists(sys_path):
-device["Interface"] = ",".join(os.listdir(sys_path))
-else:
-device["Interface"] = ""
+device["Interface"] = ""
+for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" % dev_id):
+if "net" in dirs:
+device["Interface"] = 
",".join(os.listdir(os.path.join(base,"net")))
+break
 # check if a port is used for ssh connection
 device["Ssh_if"] = False
 device["Active"] = ""
---

Gary

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Dey, Souvik
Sent: Friday, August 26, 2016 8:21 PM
To: Stephen Hemminger
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

Hi ,
I have already updated it and have re submitted the patch v3. Can you 
please check that http://dpdk.org/dev/patchwork/patch/15378/
--
Regards,
Souvik

-Original Message-
From: Stephen Hemminger [mailto:step...@networkplumber.org]
Sent: Friday, August 26, 2016 11:55 AM
To: Dey, Souvik 
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

On Wed, 24 Aug 2016 22:25:46 -0400
souvikdey33  wrote:

> +#The path for virtio devices are different. Get the correct path.
> + virtio = "/sys/bus/pci/devices/%s/" % dev_id
> +cmd = " ls %s | grep 'virt' " %virtio
> +virtio = commands.getoutput(cmd)
> +virtio_sys_path = "/sys/bus/pci/devices/%s/%s/net/" %
> +(dev_id,virtio)
>  if exists(sys_path):
>  device["Interface"] = ",".join(os.listdir(sys_path))

There should be a way to do this in python without going out to shell.
This would be safer and more secure.

The code already uses os.listdir() (which is the python library version of ls) 
in later section. Why not use that here to check for virtio bus.


[dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

2016-08-30 Thread Neil Horman
On Mon, Aug 29, 2016 at 11:16:35PM +, Dey, Souvik wrote:
> Hi,
> 
> I already followed the 100% python way and submitted the v3 of this patch. 
> http://dpdk.org/dev/patchwork/patch/15378/
> How will your patch be different in solving the issue. There will always be 
> multiple ways to solving things right.
> 
As stephen says, using popen is a bit of a hack here.  You could easily use one
of several python-sysfs libraries to simplify the sysfs enumeration and
discovery process
Neil

> 
> V3 of my submitted patch:
> 
> diff --git a/tools/dpdk-devbind.py b/tools/dpdk-devbind.py
> index b69ca2a..c0b46ee 100755
> --- a/tools/dpdk-devbind.py
> +++ b/tools/dpdk-devbind.py
> @@ -36,6 +36,7 @@  import sys
>  import os
>  import getopt
>  import subprocess
> +
>  from os.path import exists, abspath, dirname, basename
>  
>  # The PCI base class for NETWORK devices
> @@ -222,8 +223,19 @@  def get_pci_device_details(dev_id):
>  device[name] = value
>  # check for a unix interface name
>  sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
> +# the path for virtio devices are different, so get the correct path
> +virtio = "/sys/bus/pci/devices/%s/" % dev_id
> +ls = subprocess.Popen(['ls', virtio], stdout=subprocess.PIPE)
> +grep = subprocess.Popen('grep virt'.split(), stdin=ls.stdout,
> +stdout=subprocess.PIPE)
> +ls.stdout.close()
> +virtio = grep.communicate()[0].rstrip()
> +ls.wait()
> +virtio_sys_path = "/sys/bus/pci/devices/%s/%s/net/" % (dev_id, virtio)
>  if exists(sys_path):
>  device["Interface"] = ",".join(os.listdir(sys_path))
> +elif exists(virtio_sys_path):
> +device["Interface"] = ",".join(os.listdir(virtio_sys_path))
>  else:
>  device["Interface"] = ""
>  # check if a port is used for ssh connection
> 
> 
> -Original Message-
> From: Mussar, Gary [mailto:gmussar at ciena.com] 
> Sent: Monday, August 29, 2016 11:10 AM
> To: Dey, Souvik ; Stephen Hemminger  networkplumber.org>
> Cc: nhorman at tuxdriver.com; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.
> 
> We did this slightly differently. This is 100% python and is a bit more 
> general. We search for the first "net" directory under the specific device 
> directory.
> 
> ---
> --- tools/dpdk-devbind.py   2016-08-29 11:02:35.594202888 -0400
> +++ ../dpdk/tools/dpdk-devbind.py 2016-08-29 11:00:34.897677233 -0400
> @@ -221,11 +221,11 @@
>  name = name.strip(":") + "_str"
>  device[name] = value
>  # check for a unix interface name
> -sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
> -if exists(sys_path):
> -device["Interface"] = ",".join(os.listdir(sys_path))
> -else:
> -device["Interface"] = ""
> +device["Interface"] = ""
> +for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" % dev_id):
> +if "net" in dirs:
> +device["Interface"] = 
> ",".join(os.listdir(os.path.join(base,"net")))
> +break
>  # check if a port is used for ssh connection
>  device["Ssh_if"] = False
>  device["Active"] = ""
> ---
> 
> Gary
> 
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Dey, Souvik
> Sent: Friday, August 26, 2016 8:21 PM
> To: Stephen Hemminger
> Cc: nhorman at tuxdriver.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.
> 
> Hi ,
>   I have already updated it and have re submitted the patch v3. Can you 
> please check that http://dpdk.org/dev/patchwork/patch/15378/
> --
> Regards,
> Souvik
> 
> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org] 
> Sent: Friday, August 26, 2016 11:55 AM
> To: Dey, Souvik 
> Cc: nhorman at tuxdriver.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.
> 
> On Wed, 24 Aug 2016 22:25:46 -0400
> souvikdey33  wrote:
> 
> > +#The path for virtio devices are different. Get the correct path.
> > +   virtio = "/sys/bus/pci/devices/%s/" % dev_id
> > +cmd = " ls %s | grep 'virt' " %virtio
> > +virtio = commands.getoutput(cmd)
> > +virtio_sys_path = "/sys/bus/pci/devices/%s/%s/net/" % 
> > +(dev_id,virtio)
> >  if exists(sys_path):
> >  device["Interface"] = ",".join(os.listdir(sys_path))
> 
> There should be a way to do this in python without going out to shell.
> This would be safer and more secure.
> 
> The code already uses os.listdir() (which is the python library version of 
> ls) in later section. Why not use that here to check for virtio bus.
> 


[dpdk-dev] support BCM5719 driver

2016-08-30 Thread Stephen Hemminger
On Tue, 30 Aug 2016 13:54:41 +0300
Keren Hochman  wrote:

> Hi, 
> 
> I tried to bind BCM5719 driver to DPDK compatible driver: (I tried both to 
> uio_pci_generic and igb_uio)
> 
> Network devices using DPDK-compatible driver
> 
> :03:00.2 'NetXtreme BCM5719 Gigabit Ethernet PCIe' drv=uio_pci_generic 
> unused=tg3,igb_uio
> :03:00.3 'NetXtreme BCM5719 Gigabit Ethernet PCIe' drv=igb_uio 
> unused=tg3,uio_pci_generic
> 
> But when I ran from the dpdk example: link_status_interrupt I received an 
> error No ethernet port. 
> 
> (When I tried to bind I350 Gigabit Network Connection it worked.
> 
> Thank you, Keren
> 

The BCM5719 device is not supported by the current DPDK driver.
Only the 10G Broadcom NetExtreme II devices are supported.


[dpdk-dev] VMXNET3 PMD

2016-08-30 Thread Aravamudan Srivathsan
Hi

We are developing DPDK based app in VMWARE EXSI 6.0. We have 1 G card 
configured as VMXNET3 device. With 16.07 dpdk unmodified the system reboots 
when binding the igb_uio driver to the PCI device.
I followed the following guide.
http://dpdk.org/doc/guides/_images/vmxnet3_int.png
http://dpdk.org/doc/guides/nics/vmxnet3.html
I could not get any logs as the system hangs once the igb driver is loaded.
Please point me to the right direction for debugging the issue.

Srivatshan


[dpdk-dev] [PATCH] vhost: add pmd xstats

2016-08-30 Thread Yao, Lei A
Hi, Qian

The test setup at my side is Vhost/VirtIO loopback with 64B packets.


-Original Message-
From: Xu, Qian Q 
Sent: Tuesday, August 30, 2016 11:03 AM
To: Yao, Lei A ; Yang, Zhiyong ; Panu Matilainen ; Thomas Monjalon 
; Yuanhan Liu 
Cc: dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH] vhost: add pmd xstats

Lei
Could you list the test setup for below findings? I think we need at least to 
check below tests for mergeable=on/off path: 
1. Vhost/virtio loopback
2. PVP test : virtio-pmd IO fwd and virtio-net IPV4 fwd

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yao, Lei A
Sent: Tuesday, August 30, 2016 10:46 AM
To: Yang, Zhiyong ; Panu Matilainen ; Thomas Monjalon ; Yuanhan Liu 

Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats

Hi, Zhiyong

I have tested more  xstats performance drop data at my side.
Vhost Xstats patch  with mergeable on :  ~3% Vhost Xstats patch with  mergeable 
off : ~9%

Because Zhihong also submit patch to improve the performance on for the 
mergeable on: http://dpdk.org/dev/patchwork/patch/15245/  ~15249. If both 
patch integrated, the performance drop will be much higher 
Vhsot Xstats patch + Vhost mergeable on patch   with mergeable on : the 
performance drop is around  6%


Best Regards
Lei

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yang, Zhiyong
Sent: Thursday, August 25, 2016 5:22 PM
To: Panu Matilainen ; Thomas Monjalon ; Yuanhan Liu 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats



> -Original Message-
> From: Panu Matilainen [mailto:pmatilai at redhat.com]
> Sent: Wednesday, August 24, 2016 8:37 PM
> To: Thomas Monjalon ; Yuanhan Liu 
> 
> Cc: dev at dpdk.org; Yang, Zhiyong 
> Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats
> 
> On 08/24/2016 11:44 AM, Thomas Monjalon wrote:
> > 2016-08-24 13:46, Yuanhan Liu:
> >> On Tue, Aug 23, 2016 at 12:45:54PM +0300, Panu Matilainen wrote:
> >> Since collecting data of vhost_update_packet_xstats will have 
> >> some effect on RX/TX performance, so, Setting compiling switch 
> >> CONFIG_RTE_LIBRTE_PMD_VHOST_UPDATE_XSTATS=n by default
> in the
> > file
> >> config/common_base, if needing xstats data, you can enable it(y).
> >
> > NAK, such things need to be switchable at run-time.
> >
> > - Panu -
> 
>  Considering the following reasons using the compiler switch, not 
>  command-line at run-time.
> 
>  1.Similar xstats update functions are always collecting stats 
>  data in the background when rx/tx are running, such as the 
>  physical NIC or virtio, which have no switch. Compiler switch for 
>  vhost pmd xstats is added as a option when performance is viewed 
>  as critical
> factor.
> 
>  2. No data structure and API in any layer support the xstats 
>  update switch at run-time. Common data structure (struct
>  rte_eth_dev_data) has no device-specific data member, if 
>  implementing enable/disable of vhost_update _packet_xstats at 
>  run-time, must define a
>  flag(device-specific) in it, because the definition of struct 
>  vhost_queue in the driver code (eth_vhost_rx/eth_vhost_tx
> processing)is not visible from device perspective.
> 
>  3. I tested RX/TX with v1 patch (y) as reference based on
>  Intel(R)
>  Xeon(R) CPU E5-2699 v3 @ 2.30GHz, for 64byts packets in burst 
>  mode,
>  32 packets in one RX/TX processing. Overhead of 
>  vhost_update_packet_xstats is less than 3% for the rx/tx 
>  processing. It looks that vhost_update_packet_xstats has a 
>  limited
> effect on performance drop.
> >>>
> >>> Well, either the performance overhead is acceptable and it should 
> >>> always be on (like with physical NICs I think). Or it is not. In 
> >>> which case it needs to be turnable on and off, at run-time.
> >>> Rebuilding is not an option in the world of distros.
> >>
> >> I think the less than 3% overhead is acceptable here, that I agree 
> >> with Panu we should always keep it on. If someone compains it later 
> >> that even 3% is too big for them, let's consider to make it be 
> >> switchable at run-time. Either we could introduce a generic eth API 
> >> for that, Or just introduce a vhost one if that doesn't make too 
> >> much sense to other eth drivers.
> >
> > +1
> > It may have sense to introduce a generic run-time option for stats.
> >
> 
> Yup, sounds good.
> 
It sounds better , if DPDK can add generic API and structure to the switch of 
xstats update. So, any device can use it at run time if necessary.

Can we define one bit data member (xstats_update) in the data structure struct 
rte_eth_dev_data?
such as:
uint8_t promiscuous : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
all_multicast : 1, /**< RX all multicast mode ON(1) / 

[dpdk-dev] [PATCH] vhost: add pmd xstats

2016-08-30 Thread Xu, Qian Q
Lei
Could you list the test setup for below findings? I think we need at least to 
check below tests for mergeable=on/off path: 
1. Vhost/virtio loopback
2. PVP test : virtio-pmd IO fwd and virtio-net IPV4 fwd

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yao, Lei A
Sent: Tuesday, August 30, 2016 10:46 AM
To: Yang, Zhiyong ; Panu Matilainen ; Thomas Monjalon ; Yuanhan Liu 

Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats

Hi, Zhiyong

I have tested more  xstats performance drop data at my side.
Vhost Xstats patch  with mergeable on :  ~3% Vhost Xstats patch with  mergeable 
off : ~9%

Because Zhihong also submit patch to improve the performance on for the 
mergeable on: http://dpdk.org/dev/patchwork/patch/15245/  ~15249. If both 
patch integrated, the performance drop will be much higher 
Vhsot Xstats patch + Vhost mergeable on patch   with mergeable on : the 
performance drop is around  6%


Best Regards
Lei

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yang, Zhiyong
Sent: Thursday, August 25, 2016 5:22 PM
To: Panu Matilainen ; Thomas Monjalon ; Yuanhan Liu 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats



> -Original Message-
> From: Panu Matilainen [mailto:pmatilai at redhat.com]
> Sent: Wednesday, August 24, 2016 8:37 PM
> To: Thomas Monjalon ; Yuanhan Liu 
> 
> Cc: dev at dpdk.org; Yang, Zhiyong 
> Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats
> 
> On 08/24/2016 11:44 AM, Thomas Monjalon wrote:
> > 2016-08-24 13:46, Yuanhan Liu:
> >> On Tue, Aug 23, 2016 at 12:45:54PM +0300, Panu Matilainen wrote:
> >> Since collecting data of vhost_update_packet_xstats will have 
> >> some effect on RX/TX performance, so, Setting compiling switch 
> >> CONFIG_RTE_LIBRTE_PMD_VHOST_UPDATE_XSTATS=n by default
> in the
> > file
> >> config/common_base, if needing xstats data, you can enable it(y).
> >
> > NAK, such things need to be switchable at run-time.
> >
> > - Panu -
> 
>  Considering the following reasons using the compiler switch, not 
>  command-line at run-time.
> 
>  1.Similar xstats update functions are always collecting stats 
>  data in the background when rx/tx are running, such as the 
>  physical NIC or virtio, which have no switch. Compiler switch for 
>  vhost pmd xstats is added as a option when performance is viewed 
>  as critical
> factor.
> 
>  2. No data structure and API in any layer support the xstats 
>  update switch at run-time. Common data structure (struct
>  rte_eth_dev_data) has no device-specific data member, if 
>  implementing enable/disable of vhost_update _packet_xstats at 
>  run-time, must define a
>  flag(device-specific) in it, because the definition of struct 
>  vhost_queue in the driver code (eth_vhost_rx/eth_vhost_tx
> processing)is not visible from device perspective.
> 
>  3. I tested RX/TX with v1 patch (y) as reference based on
>  Intel(R)
>  Xeon(R) CPU E5-2699 v3 @ 2.30GHz, for 64byts packets in burst 
>  mode,
>  32 packets in one RX/TX processing. Overhead of 
>  vhost_update_packet_xstats is less than 3% for the rx/tx 
>  processing. It looks that vhost_update_packet_xstats has a 
>  limited
> effect on performance drop.
> >>>
> >>> Well, either the performance overhead is acceptable and it should 
> >>> always be on (like with physical NICs I think). Or it is not. In 
> >>> which case it needs to be turnable on and off, at run-time.
> >>> Rebuilding is not an option in the world of distros.
> >>
> >> I think the less than 3% overhead is acceptable here, that I agree 
> >> with Panu we should always keep it on. If someone compains it later 
> >> that even 3% is too big for them, let's consider to make it be 
> >> switchable at run-time. Either we could introduce a generic eth API 
> >> for that, Or just introduce a vhost one if that doesn't make too 
> >> much sense to other eth drivers.
> >
> > +1
> > It may have sense to introduce a generic run-time option for stats.
> >
> 
> Yup, sounds good.
> 
It sounds better , if DPDK can add generic API and structure to the switch of 
xstats update. So, any device can use it at run time if necessary.

Can we define one bit data member (xstats_update) in the data structure struct 
rte_eth_dev_data?
such as:
uint8_t promiscuous : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
lro : 1,   /**< RX LRO is ON(1) / OFF(0) */
xstats_update: 1;   /**< xstats update is ON(1) / OFF(0) */

Define 3 functions:

void rte_eth_xstats_update_enable(uint8_t port_id); void 

[dpdk-dev] [PATCH] vhost: add pmd xstats

2016-08-30 Thread Yao, Lei A
Hi, Zhiyong

I have tested more  xstats performance drop data at my side.
Vhost Xstats patch  with mergeable on :  ~3%
Vhost Xstats patch with  mergeable off : ~9%

Because Zhihong also submit patch to improve the performance on for the 
mergeable on: http://dpdk.org/dev/patchwork/patch/15245/  ~15249. If both 
patch integrated, the performance drop will be much higher 
Vhsot Xstats patch + Vhost mergeable on patch   with mergeable on : the 
performance drop is around  6%


Best Regards
Lei

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yang, Zhiyong
Sent: Thursday, August 25, 2016 5:22 PM
To: Panu Matilainen ; Thomas Monjalon ; Yuanhan Liu 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats



> -Original Message-
> From: Panu Matilainen [mailto:pmatilai at redhat.com]
> Sent: Wednesday, August 24, 2016 8:37 PM
> To: Thomas Monjalon ; Yuanhan Liu 
> 
> Cc: dev at dpdk.org; Yang, Zhiyong 
> Subject: Re: [dpdk-dev] [PATCH] vhost: add pmd xstats
> 
> On 08/24/2016 11:44 AM, Thomas Monjalon wrote:
> > 2016-08-24 13:46, Yuanhan Liu:
> >> On Tue, Aug 23, 2016 at 12:45:54PM +0300, Panu Matilainen wrote:
> >> Since collecting data of vhost_update_packet_xstats will have 
> >> some effect on RX/TX performance, so, Setting compiling switch 
> >> CONFIG_RTE_LIBRTE_PMD_VHOST_UPDATE_XSTATS=n by default
> in the
> > file
> >> config/common_base, if needing xstats data, you can enable it(y).
> >
> > NAK, such things need to be switchable at run-time.
> >
> > - Panu -
> 
>  Considering the following reasons using the compiler switch, not 
>  command-line at run-time.
> 
>  1.Similar xstats update functions are always collecting stats 
>  data in the background when rx/tx are running, such as the 
>  physical NIC or virtio, which have no switch. Compiler switch for 
>  vhost pmd xstats is added as a option when performance is viewed 
>  as critical
> factor.
> 
>  2. No data structure and API in any layer support the xstats 
>  update switch at run-time. Common data structure (struct 
>  rte_eth_dev_data) has no device-specific data member, if 
>  implementing enable/disable of vhost_update _packet_xstats at 
>  run-time, must define a
>  flag(device-specific) in it, because the definition of struct 
>  vhost_queue in the driver code (eth_vhost_rx/eth_vhost_tx
> processing)is not visible from device perspective.
> 
>  3. I tested RX/TX with v1 patch (y) as reference based on 
>  Intel(R)
>  Xeon(R) CPU E5-2699 v3 @ 2.30GHz, for 64byts packets in burst 
>  mode,
>  32 packets in one RX/TX processing. Overhead of 
>  vhost_update_packet_xstats is less than 3% for the rx/tx 
>  processing. It looks that vhost_update_packet_xstats has a 
>  limited
> effect on performance drop.
> >>>
> >>> Well, either the performance overhead is acceptable and it should 
> >>> always be on (like with physical NICs I think). Or it is not. In 
> >>> which case it needs to be turnable on and off, at run-time.
> >>> Rebuilding is not an option in the world of distros.
> >>
> >> I think the less than 3% overhead is acceptable here, that I agree 
> >> with Panu we should always keep it on. If someone compains it later 
> >> that even 3% is too big for them, let's consider to make it be 
> >> switchable at run-time. Either we could introduce a generic eth API 
> >> for that, Or just introduce a vhost one if that doesn't make too 
> >> much sense to other eth drivers.
> >
> > +1
> > It may have sense to introduce a generic run-time option for stats.
> >
> 
> Yup, sounds good.
> 
It sounds better , if DPDK can add generic API and structure to the switch of 
xstats update. So, any device can use it at run time if necessary.

Can we define one bit data member (xstats_update) in the data structure struct 
rte_eth_dev_data?
such as:
uint8_t promiscuous : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
lro : 1,   /**< RX LRO is ON(1) / OFF(0) */
xstats_update: 1;   /**< xstats update is ON(1) / OFF(0) */

Define 3 functions:

void rte_eth_xstats_update_enable(uint8_t port_id); void 
rte_eth_xstats_update_disable(uint8_t port_id); int 
rte_eth_xstats_update_get(uint8_t port_id);

Or define two:

/* uint8_t xstats_update ; 1 on, 0 off*/ void 
rte_eth_xstats_update_enable(uint8_t port_id, uint8_t xstats_update); int 
rte_eth_xstats_update_get(uint8_t port_id);

In the struct eth_dev_ops, adding two functions to pass xstats_update to 
driver, because the rxq/txq can't access xstats_update directly.
So, add a xstats flag per queue data structure. 
for example
struct vhost_queue {
..
  

[dpdk-dev] [PATCH 1/2] net/ixgbe: fix mbufs leakage during Rx queue release

2016-08-30 Thread Lu, Wenzhuo
Hi,

> -Original Message-
> From: Kylulin, Yury
> Sent: Tuesday, August 30, 2016 12:51 AM
> To: Zhang, Helin; Ananyev, Konstantin; Wu, Jingjing
> Cc: Lu, Wenzhuo; dev at dpdk.org; Kylulin, Yury
> Subject: [PATCH 1/2] net/ixgbe: fix mbufs leakage during Rx queue release
> 
> For the vector PMD release all mbufs from the Rx queue if no packets received
> after device start.
> 
> Fixes: 11b220c6498d ("ixgbe: fix release queue mbufs")
> 
> Signed-off-by: Yury Kylulin 
Acked-by: Wenzhuo Lu 


[dpdk-dev] [PATCH v4 6/6] vhost: optimize cache access

2016-08-30 Thread Zhihong Wang
This patch reorders the code to delay virtio header write to optimize cache
access efficiency for cases where the mrg_rxbuf feature is turned on. It
reduces CPU pipeline stall cycles significantly.

---
Changes in v3:

 1. Remove unnecessary memset which causes frontend stall on SNB & IVB.

 2. Rename variables to follow naming convention.

Signed-off-by: Zhihong Wang 
---
 lib/librte_vhost/vhost_rxtx.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index ddc7b21..fc5dc4a 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -196,6 +196,7 @@ enqueue_packet(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
uint32_t mbuf_len;
uint32_t mbuf_avail;
uint32_t copy_len;
+   uint32_t copy_virtio_hdr;
uint32_t extra_buffers = 0;

/* start with the first mbuf of the packet */
@@ -210,12 +211,12 @@ enqueue_packet(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
if (unlikely(!desc_addr))
goto error;

-   /* handle virtio header */
+   /*
+* handle virtio header, the actual write operation is delayed
+* for cache optimization, to reduce CPU pipeline stall cycles.
+*/
virtio_hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr;
-   virtio_enqueue_offload(mbuf, &(virtio_hdr->hdr));
-   if (is_mrg_rxbuf)
-   virtio_hdr->num_buffers = extra_buffers + 1;
-
+   copy_virtio_hdr = 1;
vhost_log_write(dev, desc->addr, dev->vhost_hlen);
PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
desc_offset = dev->vhost_hlen;
@@ -266,8 +267,15 @@ enqueue_packet(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
goto error;
}

-   /* copy mbuf data */
+   /* copy virtio header and mbuf data */
copy_len = RTE_MIN(desc->len - desc_offset, mbuf_avail);
+   if (copy_virtio_hdr) {
+   copy_virtio_hdr = 0;
+   virtio_enqueue_offload(mbuf, &(virtio_hdr->hdr));
+   if (is_mrg_rxbuf)
+   virtio_hdr->num_buffers = extra_buffers + 1;
+   }
+
rte_memcpy((void *)(uintptr_t)desc_addr,
rte_pktmbuf_mtod_offset(mbuf, void *,
mbuf_len - mbuf_avail),
-- 
2.7.4



[dpdk-dev] [PATCH v4 5/6] vhost: batch update used ring

2016-08-30 Thread Zhihong Wang
This patch enables batch update of the used ring for better efficiency.

---
Changes in v4:

 1. Free shadow used ring in the right place.

 2. Add failure check for shadow used ring malloc.

Signed-off-by: Zhihong Wang 
---
 lib/librte_vhost/vhost-net.h  |  4 +++
 lib/librte_vhost/vhost_rxtx.c | 62 ---
 lib/librte_vhost/virtio-net.c | 42 ++---
 3 files changed, 95 insertions(+), 13 deletions(-)

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index 51fdf3d..a15182c 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -85,6 +85,10 @@ struct vhost_virtqueue {

/* Physical address of used ring, for logging */
uint64_tlog_guest_addr;
+
+   /* Shadow used ring for performance */
+   struct vring_used_elem  *shadow_used_ring;
+   uint32_tshadow_used_idx;
 } __rte_cache_aligned;

 /* Old kernels have no such macro defined */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 927896c..ddc7b21 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -134,16 +134,51 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct 
virtio_net_hdr *net_hdr)
 }

 static inline void __attribute__((always_inline))
-update_used_ring(struct virtio_net *dev, struct vhost_virtqueue *vq,
-   uint32_t desc_chain_head, uint32_t desc_chain_len)
+update_used_ring(struct vhost_virtqueue *vq, uint32_t desc_chain_head,
+   uint32_t desc_chain_len)
 {
-   uint32_t used_idx_round = vq->last_used_idx & (vq->size - 1);
+   vq->shadow_used_ring[vq->shadow_used_idx].id = desc_chain_head;
+   vq->shadow_used_ring[vq->shadow_used_idx].len = desc_chain_len;
+   vq->shadow_used_idx++;
+}
+
+static inline void __attribute__((always_inline))
+flush_used_ring(struct virtio_net *dev, struct vhost_virtqueue *vq,
+   uint32_t used_idx_start)
+{
+   if (used_idx_start + vq->shadow_used_idx < vq->size) {
+   rte_memcpy(>used->ring[used_idx_start],
+   >shadow_used_ring[0],
+   vq->shadow_used_idx *
+   sizeof(struct vring_used_elem));
+   vhost_log_used_vring(dev, vq,
+   offsetof(struct vring_used,
+   ring[used_idx_start]),
+   vq->shadow_used_idx *
+   sizeof(struct vring_used_elem));
+   } else {
+   uint32_t part_1 = vq->size - used_idx_start;
+   uint32_t part_2 = vq->shadow_used_idx - part_1;

-   vq->used->ring[used_idx_round].id = desc_chain_head;
-   vq->used->ring[used_idx_round].len = desc_chain_len;
-   vhost_log_used_vring(dev, vq, offsetof(struct vring_used,
-   ring[used_idx_round]),
-   sizeof(vq->used->ring[used_idx_round]));
+   rte_memcpy(>used->ring[used_idx_start],
+   >shadow_used_ring[0],
+   part_1 *
+   sizeof(struct vring_used_elem));
+   vhost_log_used_vring(dev, vq,
+   offsetof(struct vring_used,
+   ring[used_idx_start]),
+   part_1 *
+   sizeof(struct vring_used_elem));
+   rte_memcpy(>used->ring[0],
+   >shadow_used_ring[part_1],
+   part_2 *
+   sizeof(struct vring_used_elem));
+   vhost_log_used_vring(dev, vq,
+   offsetof(struct vring_used,
+   ring[0]),
+   part_2 *
+   sizeof(struct vring_used_elem));
+   }
 }

 static inline uint32_t __attribute__((always_inline))
@@ -208,7 +243,7 @@ enqueue_packet(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
goto error;
} else if (is_mrg_rxbuf) {
/* start with the next desc chain */
-   update_used_ring(dev, vq, desc_chain_head,
+   update_used_ring(vq, desc_chain_head,
desc_chain_len);
vq->last_used_idx++;
extra_buffers++;
@@ -245,7 +280,7 @@ enqueue_packet(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
desc_chain_len += copy_len;
}

-   update_used_ring(dev, vq, desc_chain_head, desc_chain_len);
+   update_used_ring(vq, desc_chain_head, desc_chain_len);
vq->last_used_idx++;

return 0;
@@ -276,6 +311,7 @@ 

[dpdk-dev] [PATCH v4 4/6] vhost: add desc prefetch

2016-08-30 Thread Zhihong Wang
This patch adds descriptor prefetch to hide cache access latency.

Signed-off-by: Zhihong Wang 
---
 lib/librte_vhost/vhost_rxtx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 629e8ae..927896c 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -304,6 +304,12 @@ rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
/* start enqueuing packets 1 by 1 */
avail_idx = *((volatile uint16_t *)>avail->idx);
while (pkt_left && avail_idx != vq->last_used_idx) {
+   /* prefetch the next desc */
+   if (pkt_left > 1 && avail_idx != vq->last_used_idx + 1)
+   rte_prefetch0(>desc[vq->avail->ring[
+   (vq->last_used_idx + 1) &
+   (vq->size - 1)]]);
+
if (enqueue_packet(dev, vq, avail_idx, pkts[pkt_idx],
is_mrg_rxbuf))
break;
-- 
2.7.4



[dpdk-dev] [PATCH v4 3/6] vhost: remove useless volatile

2016-08-30 Thread Zhihong Wang
This patch removes useless volatile attribute to allow compiler
optimization.

Signed-off-by: Zhihong Wang 
---
 lib/librte_vhost/vhost-net.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index 38593a2..51fdf3d 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -71,7 +71,7 @@ struct vhost_virtqueue {
uint32_tsize;

/* Last index used on the available ring */
-   volatile uint16_t   last_used_idx;
+   uint16_tlast_used_idx;
 #define VIRTIO_INVALID_EVENTFD (-1)
 #define VIRTIO_UNINITIALIZED_EVENTFD   (-2)

-- 
2.7.4



[dpdk-dev] [PATCH v4 2/6] vhost: rewrite enqueue

2016-08-30 Thread Zhihong Wang
This patch implements the vhost logic from scratch into a single function
designed for high performance and better maintainability.

This is the baseline version of the new code, more optimization will be
added in the following patches in this patch set.

---
Changes in v4:

 1. Refactor the code for clearer logic.

 2. Add PRINT_PACKET for debugging.

---
Changes in v3:

 1. Rewrite enqueue and delete the obsolete in the same patch.

Signed-off-by: Zhihong Wang 
---
 lib/librte_vhost/vhost_rxtx.c | 525 --
 1 file changed, 145 insertions(+), 380 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 5806f99..629e8ae 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -91,7 +91,7 @@ is_valid_virt_queue_idx(uint32_t idx, int is_tx, uint32_t 
qp_nb)
return (is_tx ^ (idx & 1)) == 0 && idx < qp_nb * VIRTIO_QNUM;
 }

-static void
+static inline void __attribute__((always_inline))
 virtio_enqueue_offload(struct rte_mbuf *m_buf, struct virtio_net_hdr *net_hdr)
 {
if (m_buf->ol_flags & PKT_TX_L4_MASK) {
@@ -112,6 +112,10 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct 
virtio_net_hdr *net_hdr)
cksum));
break;
}
+   } else {
+   net_hdr->flags = 0;
+   net_hdr->csum_start = 0;
+   net_hdr->csum_offset = 0;
}

if (m_buf->ol_flags & PKT_TX_TCP_SEG) {
@@ -122,437 +126,198 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct 
virtio_net_hdr *net_hdr)
net_hdr->gso_size = m_buf->tso_segsz;
net_hdr->hdr_len = m_buf->l2_len + m_buf->l3_len
+ m_buf->l4_len;
+   } else {
+   net_hdr->gso_type = 0;
+   net_hdr->hdr_len = 0;
+   net_hdr->gso_size = 0;
}
 }

-static inline void
-copy_virtio_net_hdr(struct virtio_net *dev, uint64_t desc_addr,
-   struct virtio_net_hdr_mrg_rxbuf hdr)
+static inline void __attribute__((always_inline))
+update_used_ring(struct virtio_net *dev, struct vhost_virtqueue *vq,
+   uint32_t desc_chain_head, uint32_t desc_chain_len)
 {
-   if (dev->vhost_hlen == sizeof(struct virtio_net_hdr_mrg_rxbuf))
-   *(struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr = hdr;
-   else
-   *(struct virtio_net_hdr *)(uintptr_t)desc_addr = hdr.hdr;
+   uint32_t used_idx_round = vq->last_used_idx & (vq->size - 1);
+
+   vq->used->ring[used_idx_round].id = desc_chain_head;
+   vq->used->ring[used_idx_round].len = desc_chain_len;
+   vhost_log_used_vring(dev, vq, offsetof(struct vring_used,
+   ring[used_idx_round]),
+   sizeof(vq->used->ring[used_idx_round]));
 }

-static inline int __attribute__((always_inline))
-copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
- struct rte_mbuf *m, uint16_t desc_idx)
+static inline uint32_t __attribute__((always_inline))
+enqueue_packet(struct virtio_net *dev, struct vhost_virtqueue *vq,
+   uint16_t avail_idx, struct rte_mbuf *mbuf,
+   uint32_t is_mrg_rxbuf)
 {
-   uint32_t desc_avail, desc_offset;
-   uint32_t mbuf_avail, mbuf_offset;
-   uint32_t cpy_len;
+   struct virtio_net_hdr_mrg_rxbuf *virtio_hdr;
struct vring_desc *desc;
uint64_t desc_addr;
-   struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
-
-   desc = >desc[desc_idx];
+   uint32_t desc_chain_head;
+   uint32_t desc_chain_len;
+   uint32_t desc_current;
+   uint32_t desc_offset;
+   uint32_t mbuf_len;
+   uint32_t mbuf_avail;
+   uint32_t copy_len;
+   uint32_t extra_buffers = 0;
+
+   /* start with the first mbuf of the packet */
+   mbuf_len = rte_pktmbuf_data_len(mbuf);
+   mbuf_avail = mbuf_len;
+
+   /* get the current desc */
+   desc_current = vq->avail->ring[(vq->last_used_idx) & (vq->size - 1)];
+   desc_chain_head = desc_current;
+   desc = >desc[desc_current];
desc_addr = gpa_to_vva(dev, desc->addr);
-   /*
-* Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
-* performance issue with some versions of gcc (4.8.4 and 5.3.0) which
-* otherwise stores offset on the stack instead of in a register.
-*/
-   if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr)
-   return -1;
+   if (unlikely(!desc_addr))
+   goto error;

-   rte_prefetch0((void *)(uintptr_t)desc_addr);
+   /* handle virtio header */
+   virtio_hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr;
+   virtio_enqueue_offload(mbuf, &(virtio_hdr->hdr));
+   if (is_mrg_rxbuf)
+   virtio_hdr->num_buffers = extra_buffers + 

[dpdk-dev] [PATCH v4 1/6] vhost: fix windows vm hang

2016-08-30 Thread Zhihong Wang
This patch fixes a Windows VM compatibility issue in DPDK 16.07 vhost code,
which causes the guest to hang once any packets are enqueued when mrg_rxbuf
is turned on.

How to test?

 1. Start testpmd in the host with a vhost port.

 2. Start a Windows VM image with qemu and connect to the vhost port.

 3. Start io forwarding with tx_first in host testpmd.

For 16.07 code, the Windows VM will hang once any packets are enqueued.

Cc: 
Signed-off-by: Zhihong Wang 
---
 lib/librte_vhost/vhost_rxtx.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 08a73fd..5806f99 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -384,6 +384,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
uint16_t start_idx = vq->last_used_idx;
uint16_t cur_idx = start_idx;
uint64_t desc_addr;
+   uint32_t desc_chain_head;
+   uint32_t desc_chain_len;
uint32_t mbuf_offset, mbuf_avail;
uint32_t desc_offset, desc_avail;
uint32_t cpy_len;
@@ -412,6 +414,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct 
vhost_virtqueue *vq,

desc_avail  = buf_vec[vec_idx].buf_len - dev->vhost_hlen;
desc_offset = dev->vhost_hlen;
+   desc_chain_head = buf_vec[vec_idx].desc_idx;
+   desc_chain_len = desc_offset;

mbuf_avail  = rte_pktmbuf_data_len(m);
mbuf_offset = 0;
@@ -419,19 +423,21 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
struct vhost_virtqueue *vq,
/* done with current desc buf, get the next one */
if (desc_avail == 0) {
desc_idx = buf_vec[vec_idx].desc_idx;
+   vec_idx++;

if (!(vq->desc[desc_idx].flags & VRING_DESC_F_NEXT)) {
/* Update used ring with desc information */
used_idx = cur_idx++ & (vq->size - 1);
-   vq->used->ring[used_idx].id  = desc_idx;
-   vq->used->ring[used_idx].len = desc_offset;
+   vq->used->ring[used_idx].id = desc_chain_head;
+   vq->used->ring[used_idx].len = desc_chain_len;
vhost_log_used_vring(dev, vq,
offsetof(struct vring_used,
 ring[used_idx]),
sizeof(vq->used->ring[used_idx]));
+   desc_chain_head = buf_vec[vec_idx].desc_idx;
+   desc_chain_len = 0;
}

-   vec_idx++;
desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
if (unlikely(!desc_addr))
return 0;
@@ -463,11 +469,12 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
struct vhost_virtqueue *vq,
mbuf_offset += cpy_len;
desc_avail  -= cpy_len;
desc_offset += cpy_len;
+   desc_chain_len += cpy_len;
}

used_idx = cur_idx & (vq->size - 1);
-   vq->used->ring[used_idx].id = buf_vec[vec_idx].desc_idx;
-   vq->used->ring[used_idx].len = desc_offset;
+   vq->used->ring[used_idx].id = desc_chain_head;
+   vq->used->ring[used_idx].len = desc_chain_len;
vhost_log_used_vring(dev, vq,
offsetof(struct vring_used, ring[used_idx]),
sizeof(vq->used->ring[used_idx]));
-- 
2.7.4



[dpdk-dev] [PATCH v4 0/6] vhost: optimize enqueue

2016-08-30 Thread Zhihong Wang
This patch set optimizes the vhost enqueue function.

It implements the vhost logic from scratch into a single function designed
for high performance and good maintainability, and improves CPU efficiency
significantly by optimizing cache access, which means:

 *  Higher maximum throughput can be achieved for fast frontends like DPDK
virtio pmd.

 *  Better scalability can be achieved that each vhost core can support
more connections because it takes less cycles to handle each single
frontend.

This patch set contains:

 1. A Windows VM compatibility fix for vhost enqueue in 16.07 release.

 2. A baseline patch to rewrite the vhost logic.

 3. A series of optimization patches added upon the baseline.

The main optimization techniques are:

 1. Reorder code to reduce CPU pipeline stall cycles.

 2. Batch update the used ring for better efficiency.

 3. Prefetch descriptor to hide cache latency.

 4. Remove useless volatile attribute to allow compiler optimization.

Code reordering and batch used ring update bring most of the performance
improvements.

In the existing code there're 2 callbacks for vhost enqueue:

 *  virtio_dev_merge_rx for mrg_rxbuf turned on cases.

 *  virtio_dev_rx for mrg_rxbuf turned off cases.

The performance of the existing code is not optimal, especially when the
mrg_rxbuf feature turned on. Besides, having 2 callback paths increases
maintenance efforts.

Also, there's a compatibility issue in the existing code which causes
Windows VM to hang when the mrg_rxbuf feature turned on.

---
Changes in v4:

 1. Fix a Windows VM compatibility issue.

 2. Free shadow used ring in the right place.

 3. Add failure check for shadow used ring malloc.

 4. Refactor the code for clearer logic.

 5. Add PRINT_PACKET for debugging.

---
Changes in v3:

 1. Remove unnecessary memset which causes frontend stall on SNB & IVB.

 2. Rename variables to follow naming convention.

 3. Rewrite enqueue and delete the obsolete in the same patch.

---
Changes in v2:

 1. Split the big function into several small ones.

 2. Use multiple patches to explain each optimization.

 3. Add comments.

Zhihong Wang (6):
  vhost: fix windows vm hang
  vhost: rewrite enqueue
  vhost: remove useless volatile
  vhost: add desc prefetch
  vhost: batch update used ring
  vhost: optimize cache access

 lib/librte_vhost/vhost-net.h  |   6 +-
 lib/librte_vhost/vhost_rxtx.c | 572 +++---
 lib/librte_vhost/virtio-net.c |  42 +++-
 3 files changed, 244 insertions(+), 376 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

2016-08-30 Thread Dey, Souvik
Hi,

I already followed the 100% python way and submitted the v3 of this patch. 
http://dpdk.org/dev/patchwork/patch/15378/
How will your patch be different in solving the issue. There will always be 
multiple ways to solving things right.


V3 of my submitted patch:

diff --git a/tools/dpdk-devbind.py b/tools/dpdk-devbind.py
index b69ca2a..c0b46ee 100755
--- a/tools/dpdk-devbind.py
+++ b/tools/dpdk-devbind.py
@@ -36,6 +36,7 @@  import sys
 import os
 import getopt
 import subprocess
+
 from os.path import exists, abspath, dirname, basename

 # The PCI base class for NETWORK devices
@@ -222,8 +223,19 @@  def get_pci_device_details(dev_id):
 device[name] = value
 # check for a unix interface name
 sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
+# the path for virtio devices are different, so get the correct path
+virtio = "/sys/bus/pci/devices/%s/" % dev_id
+ls = subprocess.Popen(['ls', virtio], stdout=subprocess.PIPE)
+grep = subprocess.Popen('grep virt'.split(), stdin=ls.stdout,
+stdout=subprocess.PIPE)
+ls.stdout.close()
+virtio = grep.communicate()[0].rstrip()
+ls.wait()
+virtio_sys_path = "/sys/bus/pci/devices/%s/%s/net/" % (dev_id, virtio)
 if exists(sys_path):
 device["Interface"] = ",".join(os.listdir(sys_path))
+elif exists(virtio_sys_path):
+device["Interface"] = ",".join(os.listdir(virtio_sys_path))
 else:
 device["Interface"] = ""
 # check if a port is used for ssh connection


-Original Message-
From: Mussar, Gary [mailto:gmus...@ciena.com] 
Sent: Monday, August 29, 2016 11:10 AM
To: Dey, Souvik ; Stephen Hemminger 
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

We did this slightly differently. This is 100% python and is a bit more 
general. We search for the first "net" directory under the specific device 
directory.

---
--- tools/dpdk-devbind.py   2016-08-29 11:02:35.594202888 -0400
+++ ../dpdk/tools/dpdk-devbind.py 2016-08-29 11:00:34.897677233 -0400
@@ -221,11 +221,11 @@
 name = name.strip(":") + "_str"
 device[name] = value
 # check for a unix interface name
-sys_path = "/sys/bus/pci/devices/%s/net/" % dev_id
-if exists(sys_path):
-device["Interface"] = ",".join(os.listdir(sys_path))
-else:
-device["Interface"] = ""
+device["Interface"] = ""
+for base, dirs, files in os.walk("/sys/bus/pci/devices/%s/" % dev_id):
+if "net" in dirs:
+device["Interface"] = 
",".join(os.listdir(os.path.join(base,"net")))
+break
 # check if a port is used for ssh connection
 device["Ssh_if"] = False
 device["Active"] = ""
---

Gary

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Dey, Souvik
Sent: Friday, August 26, 2016 8:21 PM
To: Stephen Hemminger
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

Hi ,
I have already updated it and have re submitted the patch v3. Can you 
please check that http://dpdk.org/dev/patchwork/patch/15378/
--
Regards,
Souvik

-Original Message-
From: Stephen Hemminger [mailto:step...@networkplumber.org] 
Sent: Friday, August 26, 2016 11:55 AM
To: Dey, Souvik 
Cc: nhorman at tuxdriver.com; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v1] dpdk-devbind.py: Virtio interface issue.

On Wed, 24 Aug 2016 22:25:46 -0400
souvikdey33  wrote:

> +#The path for virtio devices are different. Get the correct path.
> + virtio = "/sys/bus/pci/devices/%s/" % dev_id
> +cmd = " ls %s | grep 'virt' " %virtio
> +virtio = commands.getoutput(cmd)
> +virtio_sys_path = "/sys/bus/pci/devices/%s/%s/net/" % 
> +(dev_id,virtio)
>  if exists(sys_path):
>  device["Interface"] = ",".join(os.listdir(sys_path))

There should be a way to do this in python without going out to shell.
This would be safer and more secure.

The code already uses os.listdir() (which is the python library version of ls) 
in later section. Why not use that here to check for virtio bus.