[dpdk-dev] New to DPDK

2015-01-14 Thread Wiles, Keith
Most people, I guess, use a Xeon-class motherboard with one or two sockets,
running Linux with supported NICs. I use a motherboard like the one
below running Ubuntu 12.04 with 12G RAM and 82599 NICs. You can find
more supported NICs in the documentation, and you need to find the rest of
the parts :-) You do not need much disk space (I have a 500G disk), and you
can use less memory, but that is something you need to decide on.

http://www.intel.com/content/www/us/en/motherboards/server-motherboards/server-board-w2600cr.html


On 1/14/15, 12:33 PM, "Ravi Rao"  wrote:

>Hi All,
>I am a newbie to DPDK. Can one of you please let me know if there is
>any reference board available which I can use to build and
>try out the dpdk stuff on.
>Regards,
>Ravi



[dpdk-dev] [PATCH v2 3/5] vhost: enable promisc mode and config VMDQ offload register for multicast feature

2015-01-14 Thread Thomas Monjalon
Hi Huawei,

2015-01-08 10:07, Xie, Huawei:
> CTRL_RX is dependent on CTRL_VQ.
> CTRL_VQ should be enabled if CTRL_RX is enabled.
> Observed that virtio-net driver will crash if CTRL_VQ isn't enabled in 
> vhost-user case.
>   /* Caller should know better */
>   BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ||
>   (out + in > VIRTNET_SEND_COMMAND_SG_MAX)); 

Do you plan to send a patch?

-- 
Thomas


[dpdk-dev] Packet Rx issue with DPDK1.8

2015-01-14 Thread Thomas Monjalon
2015-01-09 09:26, Bruce Richardson:
> On Fri, Jan 09, 2015 at 06:45:57AM +, Prashant Upadhyaya wrote:
> > Hi Bruce,
> > 
> > I tried with your suggestion.
> > 
> > When I disable the _vec function with the following config change, the
> > usecase works for me. So it points to some issue in the _vec function.
> > 
> > CONFIG_RTE_IXGBE_INC_VECTOR=y, I changed this parameter to
> > CONFIG_RTE_IXGBE_INC_VECTOR=n
> > 
> > There appears to be some gotcha in the function, so somebody may
> > want to run some tests again, perhaps with jumbo frames enabled (and
> > sending small normal frames).
> > 
> > Regards
> > -Prashant
> 
> Yes, we'll perform additional testing and see what we can turn up.

Bruce, any news about this bug?
It seems important and could justify (together with other bugs) releasing a
version 1.8.1.

Thanks
-- 
Thomas


[dpdk-dev] Q on Support for I217 and I218 Intel chipsets.

2015-01-14 Thread Thomas Monjalon
2015-01-09 04:41, Ravi Kerur:
> Thomas,
> 
> Please let me know how I can move forward on this. If I confine changes in
> e1000/ directory to e1000_osdep.h file only and the rest in PMD will that
> work? The reason I ask is because of following comment in README file.
> 
> ...
> Few changes to the original FreeBSD sources were made to:
> - Adopt it for PMD usage mode:
> e1000_osdep.c
> e1000_osdep.h
> ...

This is an Intel driver, so you should ask whoever is responsible for this code
at Intel.
The problem is that nobody is clearly identified as responsible for this
driver.

The rule is to not change the base driver, even osdep files.
But it would be better to have an exception here.


PS: please avoid top-posting.

> On Mon, Jan 5, 2015 at 8:40 AM, Ravi Kerur  wrote:
> 
> > Inline 
> >
> > On Mon, Jan 5, 2015 at 12:55 AM, Thomas Monjalon <
> > thomas.monjalon at 6wind.com> wrote:
> >
> >> 2015-01-04 15:28, Ravi Kerur:
> >> > We have a Gigabyte H97N motherboard which has I217 Intel chipset which uses
> >> > e1000e drivers. I looked into lib/librte_pmd_e1000 directory and I do see
> >> > that e1000e code is integrated but missing some support for read/write from
> >> > flash_address and other minor things. I have made changes shown below and
> >> > have done some testing with testpmd utility and now have following questions
> >> >
> >> > 1. What amount of testing is required to qualify patch as successfully
> >> > tested on new chipsets
> >>
> >> There is no good answer to this question. Generally, you must be sure that
> >> you don't break anything.
> >> So you must test the code paths you have changed.
> >>
> >
> >  yes I have done testing on Ubuntu for I217 using testpmd.
> >
> >>
> >> > 2. FreeBSD testing, currently we have Ubuntu 14.04 installed on existing
> >> > H97N motherboard and testing is done solely on Linux. We plan to get
> >> > another motherboard which will have I218 chipset and still deciding whether
> >> > to go with FreeBSD or Ubuntu. So the question I have is what amount of
> >> > testing should be done on FreeBSD? I don't think setup.sh/dpdk_nic_bind.py
> >> > works on FreeBSD yet hence the question on testing.
> >>
> >> FreeBSD testing is required when patching common EAL, scripts or
> >> makefiles.
> >>
> >> > >  lib/librte_pmd_e1000/e1000/e1000_api.c   | 21 +
> >> > >  lib/librte_pmd_e1000/e1000/e1000_api.h   |  1 +
> >> > >  lib/librte_pmd_e1000/e1000/e1000_osdep.h | 24 +++-
> >>
> >> These files are part of the base driver.
> >> The rule is to not patch them and try to do the changes in PMD only.
> >> There can be exceptions if an Intel maintainer acknowledges it.
> >>
> >
> >   Changes in these files are modifying existing macros
> >
> > E1000_READ_FLASH_REG,
> > E1000_WRITE_FLASH_REG
> > ...
> >
> > If it is not recommended to modify these files, should I move macros into
> > some PMD file?
> >
> > Thanks.



[dpdk-dev] [PATCH 1/2] doc: Add 'make pdf' target to convert guide docs to pdf.

2015-01-14 Thread John McNamara
This patch adds a high level 'make pdf' target to generate
pdf documents from the sphinx/rst user guides.

Signed-off-by: John McNamara 
---
 doc/api/sphinx-latex-update.pl  |   71 ++
 doc/guides/freebsd_gsg/conf.py  |   86 +++
 doc/guides/linux_gsg/conf.py|   86 +++
 doc/guides/prog_guide/conf.py   |   86 +++
 doc/guides/rel_notes/conf.py|   85 ++
 doc/guides/rel_notes/supported_features.rst |2 +-
 doc/guides/sample_app_ug/conf.py|   86 +++
 doc/guides/sample_app_ug/test_pipeline.rst  |6 +-
 doc/guides/testpmd_app_ug/conf.py   |   85 ++
 mk/rte.sdkdoc.mk|   28 +++--
 mk/rte.sdkroot.mk   |3 +-
 11 files changed, 614 insertions(+), 10 deletions(-)
 create mode 100644 doc/api/sphinx-latex-update.pl
 create mode 100644 doc/guides/freebsd_gsg/conf.py
 create mode 100644 doc/guides/linux_gsg/conf.py
 create mode 100644 doc/guides/prog_guide/conf.py
 create mode 100644 doc/guides/rel_notes/conf.py
 create mode 100644 doc/guides/sample_app_ug/conf.py
 create mode 100644 doc/guides/testpmd_app_ug/conf.py

diff --git a/doc/api/sphinx-latex-update.pl b/doc/api/sphinx-latex-update.pl
new file mode 100644
index 000..d41c695
--- /dev/null
+++ b/doc/api/sphinx-latex-update.pl
@@ -0,0 +1,71 @@
+#!/usr/bin/perl -i
+
+#   BSD LICENSE
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#   * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+#   * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+# Utility program to post-process the Sphinx LaTex docs prior to
+# generating the latexpdf output.
+
+use strict;
+use warnings;
+
+
+while (<>) {
+
+    # Convert escaped single quotes back to real single quote so that
+    # the Latex upquote package has an effect.
+    s/\\PYGZsq{}/'/g;
+
+    # Remove italic from the Pygments formatting.
+    s/\\let\\PYG\@it=\\textit//g;
+
+    # Change the comments color in the Pygments formatting.
+    s/0\.25,0\.50,0\.56/0.40,0.69,0.33/;
+
+    # Use PNG instead of SVG (which isn't well supported by LaTeX).
+    if ( /\\includegraphics/ ) {
+        s/\.svg/.png/;
+    }
+
+    # Center the images.
+    if ( /^\\includegraphics/ ) {
+        print "\\begin{center}\n";
+        print;
+        print "\\end{center}\n";
+
+        next;
+    }
+
+    print;
+}
+
+
+__END__
diff --git a/doc/guides/freebsd_gsg/conf.py b/doc/guides/freebsd_gsg/conf.py
new file mode 100644
index 000..65a7ede
--- /dev/null
+++ b/doc/guides/freebsd_gsg/conf.py
@@ -0,0 +1,86 @@
+#   BSD LICENSE
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#   * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+#   * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from thi

[dpdk-dev] [PATCH RFC 0/1] Add 'make pdf' target to convert guide docs to pdf.

2015-01-14 Thread John McNamara
This patch adds support for creating PDF versions of the user guides.
Specifically:

* The Programmer's Guide
* The Linux Getting Started Guide
* The FreeBSD Getting Started Guide
* The Sample Applications User Guide
* The TestPMD User Guide
* The Release Notes

The local and online HTML documentation is very useful but we have had
internal and external requests from people who also liked the PDF
documentation in older releases.

The PDF generation is fully automated and uses the same Sphinx build system
and RST files used for the HTML docs but uses the 'latexpdf' target. In
addition to the standard Sphinx Python modules it requires the TeX/LaTeX
toolchain. For best results it requires a TeX Live 'Full' installation.

The PDF documents are generated as follows:

make pdf
# or
make doc-pdf

The PDFs aren't generated as part of the 'make doc' rule since they can take
some 1-3 minutes to build and since they have a large toolchain dependency.

This patch doesn't include PDF generation of the DPDK API document. That will
be submitted later in a separate patch.

I have omitted the 2/2 part of the patch with the PNG files from the RFC.


John McNamara (2):
  doc: Add 'make pdf' target to convert guide docs to pdf.
  doc: Add PNG files for 'make pdf' target.

 doc/api/sphinx-latex-update.pl |   71 
 doc/guides/freebsd_gsg/conf.py |   86 
 doc/guides/freebsd_gsg/img/Intel-logo.png  |  Bin 0 -> 7560 bytes
 doc/guides/linux_gsg/conf.py   |   86 
 doc/guides/linux_gsg/img/Intel-logo.png|  Bin 0 -> 7560 bytes
 doc/guides/prog_guide/conf.py  |   86 
 doc/guides/prog_guide/img/Intel-logo.png   |  Bin 0 -> 7560 bytes
 .../prog_guide/img/architecture-overview.png   |  Bin 0 -> 69418 bytes
 doc/guides/prog_guide/img/bond-mode-0.png  |  Bin 0 -> 31581 bytes
 doc/guides/prog_guide/img/bond-mode-1.png  |  Bin 0 -> 25550 bytes
 doc/guides/prog_guide/img/bond-mode-2.png  |  Bin 0 -> 33645 bytes
 doc/guides/prog_guide/img/bond-mode-3.png  |  Bin 0 -> 33548 bytes
 doc/guides/prog_guide/img/bond-mode-4.png  |  Bin 0 -> 36763 bytes
 doc/guides/prog_guide/img/bond-mode-5.png  |  Bin 0 -> 40778 bytes
 doc/guides/prog_guide/img/bond-overview.png|  Bin 0 -> 25065 bytes
 doc/guides/prog_guide/img/linuxapp_launch.png  |  Bin 0 -> 125118 bytes
 doc/guides/prog_guide/img/mbuf1.png|  Bin 0 -> 37843 bytes
 doc/guides/prog_guide/img/mbuf2.png|  Bin 0 -> 58682 bytes
 doc/guides/prog_guide/img/memory-management.png|  Bin 0 -> 22904 bytes
 doc/guides/prog_guide/img/memory-management2.png   |  Bin 0 -> 25411 bytes
 doc/guides/prog_guide/img/mempool.png  |  Bin 0 -> 50966 bytes
 doc/guides/prog_guide/img/multi_process_memory.png |  Bin 0 -> 52930 bytes
 doc/guides/prog_guide/img/ring-dequeue1.png|  Bin 0 -> 29169 bytes
 doc/guides/prog_guide/img/ring-dequeue2.png|  Bin 0 -> 30334 bytes
 doc/guides/prog_guide/img/ring-dequeue3.png|  Bin 0 -> 27677 bytes
 doc/guides/prog_guide/img/ring-enqueue1.png|  Bin 0 -> 28386 bytes
 doc/guides/prog_guide/img/ring-enqueue2.png|  Bin 0 -> 29329 bytes
 doc/guides/prog_guide/img/ring-enqueue3.png|  Bin 0 -> 28907 bytes
 doc/guides/prog_guide/img/ring-modulo1.png |  Bin 0 -> 21666 bytes
 doc/guides/prog_guide/img/ring-modulo2.png |  Bin 0 -> 21814 bytes
 doc/guides/prog_guide/img/ring-mp-enqueue1.png |  Bin 0 -> 35928 bytes
 doc/guides/prog_guide/img/ring-mp-enqueue2.png |  Bin 0 -> 43924 bytes
 doc/guides/prog_guide/img/ring-mp-enqueue3.png |  Bin 0 -> 43581 bytes
 doc/guides/prog_guide/img/ring-mp-enqueue4.png |  Bin 0 -> 43648 bytes
 doc/guides/prog_guide/img/ring-mp-enqueue5.png |  Bin 0 -> 29787 bytes
 doc/guides/prog_guide/img/ring1.png|  Bin 0 -> 21466 bytes
 doc/guides/rel_notes/conf.py   |   85 +++
 doc/guides/rel_notes/img/Intel-logo.png|  Bin 0 -> 7560 bytes
 doc/guides/rel_notes/supported_features.rst|2 +-
 doc/guides/sample_app_ug/conf.py   |   86 
 doc/guides/sample_app_ug/img/Intel-logo.png|  Bin 0 -> 7560 bytes
 doc/guides/sample_app_ug/img/dist_app.png  |  Bin 0 -> 14191 bytes
 doc/guides/sample_app_ug/img/dist_perf.png |  Bin 0 -> 12355 bytes
 .../sample_app_ug/img/exception_path_example.png   |  Bin 0 -> 57544 bytes
 .../sample_app_ug/img/l2_fwd_benchmark_setup.png   |  Bin 0 -> 21985 bytes
 .../sample_app_ug/img/vm_power_mgr_highlevel.png   |  Bin 0 -> 192526 bytes
 .../img/vm_power_mgr_vm_request_seq.png|  Bin 0 -> 59573 bytes
 doc/guides/sample_app_ug/img/vmdq_dcb_example.png  |  Bin 0 -> 36777 bytes
 doc/guides/sample_app_ug/test_pipel

[dpdk-dev] Packet Rx issue with DPDK1.8

2015-01-14 Thread Bruce Richardson
On Wed, Jan 14, 2015 at 05:30:15PM +0100, Thomas Monjalon wrote:
> 2015-01-09 09:26, Bruce Richardson:
> > On Fri, Jan 09, 2015 at 06:45:57AM +, Prashant Upadhyaya wrote:
> > > Hi Bruce,
> > > 
> > > I tried with your suggestion.
> > > 
> > > When I disable the _vec function with the following config change, the
> > > usecase works for me. So it points to some issue in the _vec function.
> > > 
> > > CONFIG_RTE_IXGBE_INC_VECTOR=y, I changed this parameter to
> > > CONFIG_RTE_IXGBE_INC_VECTOR=n
> > > 
> > > > There appears to be some gotcha in the function, so somebody may
> > > > want to run some tests again, perhaps with jumbo frames enabled (and
> > > > sending small normal frames).
> > > 
> > > Regards
> > > -Prashant
> > 
> > Yes, we'll perform additional testing and see what we can turn up.
> 
> Bruce, any news about this bug?
> It seems important and could justify (together with other bugs) releasing a
> version 1.8.1.
> 
> Thanks
> -- 
> Thomas

No updates on my end yet. I'll keep you informed as I get any extra info here.

/Bruce


[dpdk-dev] KNI interface operational state UP issue

2015-01-14 Thread Aziz Hajee
Yes, DPDK reports the port as UP.
With the following change to the registered callback function, which comments
out the stop/start of the DPDK interface, I can bring the KNI vEthX interfaces up.
(I guess the reason is that ret now keeps its default value of 0, so no
error is returned from stop/start.)

/* Callback for request of configuring network interface up/down */
static int
kni_config_network_interface(uint8_t port_id, uint8_t if_up)
{
	int ret = 0;

	if (port_id >= rte_eth_dev_count() || port_id >= RTE_MAX_ETHPORTS) {
		RTE_LOG(ERR, KNI, "Invalid port id %d\n", port_id);
		return -EINVAL;
	}

	RTE_LOG(INFO, KNI, "Configure network interface of %d %s\n",
		port_id, if_up ? "up" : "down");

	/* stop/start commented out for this experiment, so ret stays 0 */
	//if (if_up != 0) { /* Configure network interface up */
	//	rte_eth_dev_stop(port_id);
	//	ret = rte_eth_dev_start(port_id);
	//} else /* Configure network interface down */
	//	rte_eth_dev_stop(port_id);

	if (ret < 0)
		RTE_LOG(ERR, KNI, "Failed to start port %d\n", port_id);

	return ret;
}

Question:
1. Where does the vEth0 ifname given to the ifconfig command get mapped
to the DPDK port_id 0 in the callback?

thanks.
-aziz
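
To make the guess about the return value concrete, here is a minimal, self-contained sketch of the callback's control flow. It is not DPDK code: start_ok() and start_fail() are hypothetical stand-ins for a port start call such as rte_eth_dev_start(), and config_network_if() is an illustrative name. It only shows that the "up" path is the sole place a non-zero status can come from, so removing the start call leaves ret at its initial 0.

```c
#include <stdio.h>

/* Hypothetical stand-ins for a port start routine such as
 * rte_eth_dev_start(); they only report a fixed status. */
typedef int (*start_fn)(unsigned port_id);

int start_ok(unsigned port_id)   { (void)port_id; return 0;  }
int start_fail(unsigned port_id) { (void)port_id; return -1; }

/* Mirrors the control flow of the callback quoted above: only the "up"
 * path can produce a non-zero status, because only the start call has a
 * return value that is kept. */
int config_network_if(unsigned port_id, int if_up, start_fn start)
{
    int ret = 0;

    if (if_up != 0)
        ret = start(port_id);   /* stop + start in the real callback */
    /* "down" path: stop only, nothing to report */

    if (ret < 0)
        fprintf(stderr, "Failed to start port %u\n", port_id);

    return ret;
}
```

Calling config_network_if(0, 1, start_fail) returns -1, while the same request with start_ok, or any "down" request, returns 0. That matches the observation that the interface comes up once the start call can no longer fail, but it says nothing about why the real start call fails.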

===

# ifconfig vEth0
vEth0 Link encap:Ethernet  HWaddr 00:00:00:00:00:00
  BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:13 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:3186 (3.1 KB)  TX bytes:0 (0.0 B)

# ifconfig vEth0 192.16.1.1
# ifconfig vEth0
vEth0 Link encap:Ethernet  HWaddr 90:e2:ba:5f:1a:64
  inet addr:192.16.1.1  Bcast:192.16.1.255  Mask:255.255.255.0
  inet6 addr: fe80::92e2:baff:fe5f:1a64/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:13 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:37 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:3186 (3.1 KB)  TX bytes:0 (0.0 B)



On Mon, Jan 12, 2015 at 3:24 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Fri, Jan 09, 2015 at 05:20:26PM -0800, Aziz Hajee wrote:
> > I am using the dpdk1.6.0r1
> > The rte_kni.ko is loaded:
> > lsmod | grep kni
> > rte_kni   279134  1
> >
> > however, the ifconfig vEth0, and vEth1 does not show link up ?
> > How do i get the operational state up for these interfaces.
> > $ sudo tcpdump -i vEth0
> > tcpdump: vEth0: That device is not up
> >
> > ifconfig vEth0
> > vEth0 Link encap:Ethernet  HWaddr 00:00:00:00:00:00
> >   BROADCAST MULTICAST  MTU:1500  Metric:1
> >   RX packets:12 errors:0 dropped:0 overruns:0 frame:0
> >   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >   collisions:0 txqueuelen:1000
> >   RX bytes:3388 (3.3 KB)  TX bytes:0 (0.0 B)
> >
> >  ifconfig vEth1
> > vEth1 Link encap:Ethernet  HWaddr 00:00:00:00:00:00
> >   BROADCAST MULTICAST  MTU:1500  Metric:1
> >   RX packets:60 errors:0 dropped:0 overruns:0 frame:0
> >   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >   collisions:0 txqueuelen:1000
> >   RX bytes:10252 (10.2 KB)  TX bytes:0 (0.0 B)
> >
> > These KNI interfaces are created as per dmesg below from the CREATE IOCTL.
> > sudo ifconfig vEth0 192.168.0.11 netmask 255.255.0.0
> > SIOCSIFFLAGS: Timer expired
> > aziz at fast-1:~/stm15-0108/stm/dpdk/dpdk-1.6.0r1_ss/lib/librte_eal/linuxapp/kni$ ifconfig vEth0
> > vEth0 Link encap:Ethernet  HWaddr 90:e2:ba:5f:1a:64
> >   inet addr:192.168.0.11  Bcast:192.168.255.255  Mask:255.255.0.0
> >   BROADCAST MULTICAST  MTU:1500  Metric:1
> >   RX packets:50 errors:0 dropped:0 overruns:0 frame:0
> >   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >   collisions:0 txqueuelen:1000
> >   RX bytes:14488 (14.4 KB)  TX bytes:0 (0.0 B)
> >
> > Trying to set the vEth0 up, looks like it is doing the callback in the dpdk
> > to the corresponding PMD NIC interface, and not the vEth0 kernel interface.
> >
> With KNI, the actual underlying NIC interface is still under the control of the
> DPDK application. What happens is that any ethtool requests that go to the kernel
> driver, get passed into the userspace DPDK application to make the actual changes
> to the hardware port. Does DPDK itself report the port as being up?
>
> /Bruce
>


[dpdk-dev] [PATCH v3 4/4] docs: Add ABI documentation

2015-01-14 Thread Thomas Monjalon
2014-12-23 10:51, Neil Horman:
> Adding a document describing rudimentary ABI policy and adding notice space for
> any deprecation announcements

We had a good discussion about the policy and its impact:
http://thread.gmane.org/gmane.comp.networking.dpdk.devel/8367/focus=8461
Sadly nobody else discussed it.
I think we should integrate some of the conclusions in this documentation.

> --- /dev/null
> +++ b/doc/abi.txt
> @@ -0,0 +1,17 @@
> +ABI policy:
> + ABI versions are set at the time of major release labeling, and ABI may
> +change multiple times between the last labeling and the HEAD label of the git
> +tree without warning
> +
> + ABI versions, once released are available until such time as their
> +deprecation has been noted here for at least one major release cycle, after it
> +has been tagged.  E.g. the ABI for DPDK 1.8 is shipped, and then the decision to
> +remove it is made during the development of DPDK 1.9.  The decision will be
> +recorded here, shipped with the DPDK 1.9 release, and actually removed when DPDK
> +1.10 ships.
> +
> + ABI versions may be deprecated in whole, or in part as needed by a given
> +update.
> +
> +Deprecation Notices:
> +

You could upgrade your example to 2.0/2.1.

Thanks
-- 
Thomas


[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-14 Thread Kamraan Nasim
Many thanks Helin and Bruce :)

Now if 1Gb NICs don't support fdir filters then I'm wondering how we would
count the number of packets matching a filter.

Regular 5tuple filters don't have any stats similar to "fdirmatch" (in the
rte_eth_stats struct).
One way I can think of is to use the regular ibytes/ipackets stats for the
queue to which the packets are being redirected in the 5tuple filter, but
this seems a bit hacky, and there is no way to distinguish this packet
throughput from the regular traffic that the NIC is forwarding to that
specific queue.

Is there a way to EXCLUSIVELY bind a 5tuple filter to an RSS queue so that
only matched traffic is forwarded there?


--Kam

On Wed, Jan 14, 2015 at 5:27 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Jan 13, 2015 at 11:21:08PM -0500, Kamraan Nasim wrote:
> > Hello,
> >
> > I've been using DPDK fdir filter APIs for 82599 NIC(Niantic) and they work
> > very well.
> >
> > Was wondering if these could also be used for I210, 1Gbps NICs?
> >
> > The other option is to use 5tuple filters(rte_eth_dev_add_5tuple_filter
> > <http://dpdk.org/doc/api/rte__ethdev_8h.html#aaa28adafa65a4f47d4aeceaf1b08381b>),
> > however these do not support IPv6 yet.
> >
> >
> > Have people in the community had any luck with configuring L3/L4 hardware
> > filters for the I210 NIC?
> >
> > Thanks,
> > Kam
>
> Flow director filters are not supported for 1G NICs. Sorry.
>
> /Bruce
>


[dpdk-dev] [PATCH v3 3/4] Add library version extenstion

2015-01-14 Thread Thomas Monjalon
2014-12-23 10:51, Neil Horman:
> To differentiate libraries that break ABI, we add a library version number
> suffix to the library, which must be incremented when a given library's ABI is
> broken.  This patch enforces that addition, sets the initial abi soname
> extension to 1 for each library and creates a symlink to the base SONAME so that
> the test applications will link properly.

[...]

> --- a/mk/rte.lib.mk
> +++ b/mk/rte.lib.mk
> @@ -37,10 +37,9 @@ include $(RTE_SDK)/mk/internal/rte.depdirs-pre.mk
>  
>  # VPATH contains at least SRCDIR
>  VPATH += $(SRCDIR)
> -
>  ifeq ($(RTE_BUILD_SHARED_LIB),y)
> -LIB := $(patsubst %.a,%.so,$(LIB))
>  
> +LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
>  CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
>  
>  endif
> @@ -63,6 +62,7 @@ build: _postbuild
>  
>  exe2cmd = $(strip $(call dotfile,$(patsubst %,%.cmd,$(1))))
>  
> +

Newline changes seem weird.

>  ifeq ($(LINK_USING_CC),1)
>  # Override the definition of LD here, since we're linking with CC
>  LD := $(CC) $(CPU_CFLAGS)
> @@ -113,6 +113,10 @@ lib_dir = [ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib;
>  #
>  ifeq ($(RTE_BUILD_SHARED_LIB),y)
>  $(LIB): $(OBJS-y) $(DEP_$(LIB)) FORCE
> +ifeq ($(LIBABIVER),)
> + @echo "Must Specify a $(LIB) ABI version"
> + @exit 1

I think (not sure) that @false is better handled than @exit in case of parallel 
processing.

> +endif
>   @[ -d $(dir $@) ] || mkdir -p $(dir $@)
>   $(if $(D),\
>   @echo -n "$< -> $@ " ; \
> @@ -126,6 +130,7 @@ $(LIB): $(OBJS-y) $(DEP_$(LIB)) FORCE
>   $(depfile_missing),\
>   $(depfile_newer)),\
>   $(O_TO_S_DO))
> +
>  ifeq ($(RTE_BUILD_COMBINE_LIBS),y)
>   $(if $(or \
>  $(file_missing),\
> @@ -163,9 +168,13 @@ endif
>  # install lib in $(RTE_OUTPUT)/lib
>  #
>  $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
> + $(eval LIBSONAME := $(basename $(LIB)))
>   @echo "  INSTALL-LIB $(LIB)"
>   @[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
>   $(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
> +ifeq ($(RTE_BUILD_SHARED_LIB),y)
> + $(Q)ln -s -f ./$(LIB) $(RTE_OUTPUT)/lib/$(LIBSONAME)
> +endif

Why use ./ ?
Why use the eval trick for $(LIBSONAME) instead of $(basename $(LIB))?
Even better, you could use $< instead of $(LIB), matter of taste.

-- 
Thomas
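
For readers less familiar with the make functions in the quoted hunks, the naming scheme can be illustrated outside the build system. The sketch below is plain C, not part of the patch, and the function names versioned_name() and link_name() are made up for this illustration. It mimics what $(patsubst %.a,%.so.$(LIBABIVER),$(LIB)) and $(basename $(LIB)) produce for a simple file name, assuming a library called librte_acl.a and an ABI version of 1.

```c
#include <stdio.h>
#include <string.h>

/* Build "<stem>.so.<abiver>" from "<stem>.a", as the patsubst call in the
 * patch does for matching names. Returns 0 on success, -1 otherwise
 * (unlike make, which leaves non-matching names unchanged). */
int versioned_name(const char *lib, const char *abiver,
                   char *out, size_t outlen)
{
    size_t len = strlen(lib);
    int n;

    if (len < 2 || strcmp(lib + len - 2, ".a") != 0)
        return -1;                      /* not a static archive name */

    n = snprintf(out, outlen, "%.*s.so.%s", (int)(len - 2), lib, abiver);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Strip the last ".suffix", which is what make's $(basename ...) does for
 * a plain file name without a directory part. */
int link_name(const char *lib, char *out, size_t outlen)
{
    const char *dot = strrchr(lib, '.');
    size_t len = dot ? (size_t)(dot - lib) : strlen(lib);

    if (len + 1 > outlen)
        return -1;
    memcpy(out, lib, len);
    out[len] = '\0';
    return 0;
}
```

With these inputs the versioned file is librte_acl.so.1 and the symlink name is librte_acl.so, which is the pair the install rule creates with ln -s -f.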


[dpdk-dev] [PATCH v3 2/4] Provide initial versioning for all DPDK libraries

2015-01-14 Thread Thomas Monjalon
2014-12-23 10:51, Neil Horman:
> Add linker version script files to each DPDK library to put a stake in the
> ground from which we can start cleaning up API's

[...]

>  lib/librte_acl/Makefile|   2 +
>  lib/librte_acl/rte_acl_version.map |  21 
>  lib/librte_cfgfile/Makefile|   2 +
>  lib/librte_cfgfile/rte_cfgfile_version.map |  14 +++
>  lib/librte_cmdline/Makefile|   2 +
>  lib/librte_cmdline/rte_cmdline_version.map |  69 +
>  lib/librte_distributor/Makefile|   2 +
>  lib/librte_distributor/rte_distributor_version.map |  16 +++
>  lib/librte_eal/bsdapp/eal/Makefile |   2 +
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map  |  90 
>  lib/librte_eal/linuxapp/eal/Makefile   |   2 +
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map|  90 
>  lib/librte_ether/Makefile  |   2 +
>  lib/librte_ether/rte_ether_version.map | 113 
> +
>  lib/librte_hash/Makefile   |   2 +
>  lib/librte_hash/rte_hash_version.map   |  18 
>  lib/librte_ip_frag/Makefile|   2 +
>  lib/librte_ip_frag/rte_ipfrag_version.map  |  14 +++
>  lib/librte_ivshmem/Makefile|   2 +
>  lib/librte_ivshmem/rte_ivshmem_version.map |  13 +++
>  lib/librte_kni/Makefile|   2 +
>  lib/librte_kni/rte_kni_version.map |  20 
>  lib/librte_kvargs/Makefile |   2 +
>  lib/librte_kvargs/rte_kvargs_version.map   |  10 ++
>  lib/librte_lpm/Makefile|   2 +
>  lib/librte_lpm/rte_lpm_version.map |  24 +
>  lib/librte_malloc/Makefile |   2 +
>  lib/librte_malloc/rte_malloc_version.map   |  19 
>  lib/librte_mbuf/Makefile   |   2 +
>  lib/librte_mbuf/rte_mbuf_version.map   |  14 +++
>  lib/librte_mempool/Makefile|   2 +
>  lib/librte_mempool/rte_mempool_version.map |  18 
>  lib/librte_meter/Makefile  |   2 +
>  lib/librte_meter/rte_meter_version.map |  13 +++
>  lib/librte_pipeline/Makefile   |   2 +
>  lib/librte_pipeline/rte_pipeline_version.map   |  23 +
>  lib/librte_pmd_af_packet/Makefile  |   2 +
>  .../rte_pmd_af_packet_version.map  |   7 ++
>  lib/librte_pmd_bond/Makefile   |   2 +
>  lib/librte_pmd_bond/rte_eth_bond_version.map   |  21 
>  lib/librte_pmd_e1000/Makefile  |   2 +
>  lib/librte_pmd_e1000/rte_pmd_e1000_version.map |   5 +
>  lib/librte_pmd_enic/Makefile   |   2 +
>  lib/librte_pmd_enic/rte_pmd_enic_version.map   |   5 +
>  lib/librte_pmd_i40e/Makefile   |   2 +
>  lib/librte_pmd_i40e/rte_pmd_i40e_version.map   |   5 +
>  lib/librte_pmd_ixgbe/Makefile  |   2 +
>  lib/librte_pmd_ixgbe/rte_pmd_ixgbe_version.map |   5 +
>  lib/librte_pmd_pcap/Makefile   |   2 +
>  lib/librte_pmd_pcap/rte_pmd_pcap_version.map   |   5 +
>  lib/librte_pmd_ring/Makefile   |   2 +
>  lib/librte_pmd_ring/rte_eth_ring.c |   2 +-
>  lib/librte_pmd_ring/rte_eth_ring.h |   6 --
>  lib/librte_pmd_ring/rte_eth_ring_version.map   |  10 ++
>  lib/librte_pmd_virtio/Makefile |   1 +
>  lib/librte_pmd_virtio/rte_pmd_virtio_version.map   |   5 +
>  lib/librte_pmd_vmxnet3/Makefile|   2 +
>  lib/librte_pmd_vmxnet3/rte_pmd_vmxnet3_version.map |   5 +
>  lib/librte_pmd_xenvirt/Makefile|   2 +
>  lib/librte_pmd_xenvirt/rte_eth_xenvirt_version.map |   8 ++
>  lib/librte_port/Makefile   |   2 +
>  lib/librte_port/rte_port_version.map   |  18 
>  lib/librte_power/Makefile  |   2 +
>  lib/librte_power/rte_power_version.map |  18 
>  lib/librte_ring/Makefile   |   2 +
>  lib/librte_ring/rte_ring_version.map   |  12 +++
>  lib/librte_sched/Makefile  |   2 +
>  lib/librte_sched/rte_sched_version.map |  22 
>  lib/librte_table/Makefile  |   2 +
>  lib/librte_table/rte_table_version.map |  22 
>  lib/librte_timer/Makefile  |   2 +
>  lib/librte_timer/rte_timer_version.map |  16 +++
>  lib/librte_vhost/Makefile  |   2 +
>  lib/librte_vhost/rte_vhost_version.map |  14 +++

Honestly, this patch is difficult to review.
How have you populated .map files? Did you use some script?

[...]

> -

[dpdk-dev] [PATCH v3 1/4] compat: Add infrastructure to support symbol versioning

2015-01-14 Thread Thomas Monjalon
Hi Neil,

2014-12-23 10:51, Neil Horman:
> Add initial pass header files to support symbol versioning.

[...]

> +#   Copyright(c) 2010-2014 Neil Horman 

Why these dates?

> +#   All rights reserved.

I think this line is not required anymore:
http://en.wikipedia.org/wiki/All_rights_reserved

[...]

> +#ifndef _RTE_COMPAT_H_
> +#define _RTE_COMPAT_H_

Why use underscores?
I think such names are reserved:
http://en.wikipedia.org/wiki/Include_guard#Use_of_.23include_guards

> +#define SA(x) #x

It should be prefixed. But it's better to use RTE_STR.

> +#ifdef RTE_BUILD_SHARED_LIB
> +
> +/*
> + * Provides backwards compatibility when updating exported functions.
> + * When a symol is exported from a library to provide an API, it also provides a
> + * calling convention (ABI) that is embodied in its name, return type,
> + * arguments, etc.  On occasion that function may need to change to accomodate
> + * new functionality, behavior, etc.  When that occurs, it is desireable to
> + * allow for backwards compatibility for a time with older binaries that are
> + * dynamically linked to the dpdk.  to support that the __vsym and

Should be "To support that," with uppercase and comma.

> + * VERSION_SYMBOL macros are created.  They, in conjunction with the
> + * <library>_version.map file for a given library allow for multiple versions of
> + * a symbol to exist in a shared library so that older binaries need not be
> + * immediately recompiled. Their use is outlined in the following example:
> + * Assumptions: DPDK 1.(X) contains a function int foo(char *string)
> + *  DPDK 1.(X+1) needs to change foo to be int foo(int index)
> + *
> + * To accomplish this:
> + * 1) Edit lib/<library>/library_version.map to add a DPDK_1.(X+1) node, in which
> + * foo is exported as a global symbol.
> + *
> + * 2) rename the existing function int foo(char *string) to 
> + *   int __vsym foo_v18(char *string)
> + *
> + * 3) Add this macro immediately below the function
> + *   VERSION_SYMBOL(foo, _v18, 1.8);
> + *
> + * 4) Implement a new version of foo.
> + *   char foo(int value, int otherval) { ...}
> + *
> + * 5) Mark the newest version as the default version
> + *   BIND_DEFAULT_SYMBOL(foo, 1.9);
> + *
> + */

Thanks for this good tutorial.

> +#define VERSION_SYMBOL(b, e, v) __asm__(".symver " SA(b) SA(e) ", "SA(b)"@DPDK_"SA(v))
> +#define BASE_SYMBOL(b, n) __asm__(".symver " SA(n) ", "SA(b)"@")
> +#define BIND_DEFAULT_SYMBOL(b, v) __asm__(".symver " SA(b) ", "SA(b)"@@DPDK_"SA(v))
> +#define __vsym __attribute__((used))

OK. It would be simpler to read if b, e, v and n were formally defined in a 
comment.

> +#else
[...]
> +/*
> + * RTE_BUILD_SHARED_LIB
> + */

This type of comment is strange. It makes me think that we are in the case
RTE_BUILD_SHARED_LIB=y

> +#endif

[...]

> +
> +CPU_LDFLAGS += --version-script=$(EXPORT_MAP)

Why this variable name? VERSION_SCRIPT or VERSION_MAP seems more appropriate.

> +
>  endif
>  
> +

Why this newline?

>  _BUILD = $(LIB)

-- 
Thomas


[dpdk-dev] Why nothing since 1.8.0?

2015-01-14 Thread Neil Horman
On Wed, Jan 14, 2015 at 12:23:52PM -0800, Stephen Hemminger wrote:
> Ok, so 1.8.0 came out almost a month ago and none of the patches
> that were deferred waiting for the release got merged since then.
> Last commit in git is the 1.8.0 release.
> 
> Where is the post-merge window bundle, where are the later commits?
> Lots of patches are sitting rotting in patchwork...
> 
> 

+1, I've had the same questions.
Neil



[dpdk-dev] [PATCH v3 1/4] compat: Add infrastructure to support symbol versioning

2015-01-14 Thread Neil Horman
On Wed, Jan 14, 2015 at 04:25:19PM +0100, Thomas Monjalon wrote:
> Hi Neil,
> 
> 2014-12-23 10:51, Neil Horman:
> > Add initial pass header files to support symbol versioning.
> 
> [...]
> 
> > +#   Copyright(c) 2010-2014 Neil Horman 
> 
> Why these dates?
> 
Because I copied the Makefile from librte_acl, and modified the name but not the
dates.

> > +#   All rights reserved.
> 
> I think this line is not required anymore:
>   http://en.wikipedia.org/wiki/All_rights_reserved
> 
Hmm, apparently so.  However, since it exists in every other copyright notice in
the tree, I'd just as soon keep this language consistent, and make a tree wide
change in a separate patch if the consensus is to do so.

> [...]
> 
> > +#ifndef _RTE_COMPAT_H_
> > +#define _RTE_COMPAT_H_
> 
> Why using underscores?
> I think it's reserved:
>   http://en.wikipedia.org/wiki/Include_guard#Use_of_.23include_guards
> 
It's reserved for the implementation, and must not be used by a user using the
header file.  It's ok, and is common practice.  See every other symlinked header
file in the DPDK.

> > +#define SA(x) #x
> 
> It should be prefixed. But it's better to use RTE_STR.
> 
very well

> > +#ifdef RTE_BUILD_SHARED_LIB
> > +
> > +/*
> > + * Provides backwards compatibility when updating exported functions.
> > + * When a symol is exported from a library to provide an API, it also provides a
> > + * calling convention (ABI) that is embodied in its name, return type,
> > + * arguments, etc.  On occasion that function may need to change to accomodate
> > + * new functionality, behavior, etc.  When that occurs, it is desireable to
> > + * allow for backwards compatibility for a time with older binaries that are
> > + * dynamically linked to the dpdk.  to support that the __vsym and
> 
> Should be "To support that," with uppercase and comma.
> 
yup

> > + * VERSION_SYMBOL macros are created.  They, in conjunction with the
> > + * _version.map file for a given library allow for multiple versions of
> > + * a symbol to exist in a shared library so that older binaries need not be
> > + * immediately recompiled. Their use is outlined in the following example:
> > + * Assumptions: DPDK 1.(X) contains a function int foo(char *string)
> > + *  DPDK 1.(X+1) needs to change foo to be int foo(int index)
> > + *
> > + * To accomplish this:
> > + * 1) Edit lib//library_version.map to add a DPDK_1.(X+1) node, in which
> > + * foo is exported as a global symbol.
> > + *
> > + * 2) rename the existing function int foo(char *string) to 
> > + * int __vsym foo_v18(char *string)
> > + *
> > + * 3) Add this macro immediately below the function
> > + * VERSION_SYMBOL(foo, _v18, 1.8);
> > + *
> > + * 4) Implement a new version of foo.
> > + * char foo(int value, int otherval) { ...}
> > + *
> > + * 5) Mark the newest version as the default version
> > + * BIND_DEFAULT_SYMBOL(foo, 1.9);
> > + *
> > + */
> 
> Thanks for this good tutorial.
> 
> > +#define VERSION_SYMBOL(b, e, v) __asm__(".symver " SA(b) SA(e) ", "SA(b)"@DPDK_"SA(v))
> > +#define BASE_SYMBOL(b, n) __asm__(".symver " SA(n) ", "SA(b)"@")
> > +#define BIND_DEFAULT_SYMBOL(b, v) __asm__(".symver " SA(b) ", "SA(b)"@@DPDK_"SA(v))
> > +#define __vsym __attribute__((used))
> 
> OK. It would be simpler to read if b, e, v and n were formally defined in a 
> comment.
> 
> > +#else
> [...]
> > +/*
> > + * RTE_BUILD_SHARED_LIB
> > + */
> 
> This type of comment is strange. It makes me think that we are in the case
> RTE_BUILD_SHARED_LIB=y
> 
> > +#endif
> 
> [...]
> 
> > +
> > +CPU_LDFLAGS += --version-script=$(EXPORT_MAP)
> 
> Why this variable name? VERSION_SCRIPT or VERSION_MAP seems more appropriate.
> 
> > +
> >  endif
> >  
> > +
> 
> Why this newline?
> 
> >  _BUILD = $(LIB)
> 
> -- 
> Thomas
> 


[dpdk-dev] [PATCH v3 4/4] docs: Add ABI documentation

2015-01-14 Thread Neil Horman
On Wed, Jan 14, 2015 at 04:59:51PM +0100, Thomas Monjalon wrote:
> 2014-12-23 10:51, Neil Horman:
> > Adding a document describing rudimentary ABI policy and adding notice space for
> > any deprecation announcements
> 
> We had a good discussion about the policy and its impact:
>   http://thread.gmane.org/gmane.comp.networking.dpdk.devel/8367/focus=8461
> Sadly nobody else discussed it.
> I think we should integrate some of the conclusions in this documentation.
> 
I'm certainly open to that.  However, I felt like that conversation centered
more around the debate for the need for ABI versioning, not the mechanics
thereof.  Are there specific sections of that conversation that you are looking
to incorporate, or specific topics?

> > --- /dev/null
> > +++ b/doc/abi.txt
> > @@ -0,0 +1,17 @@
> > +ABI policy:
> > +   ABI versions are set at the time of major release labeling, and ABI may
> > +change multiple times between the last labeling and the HEAD label of the git
> > +tree without warning
> > +
> > +   ABI versions, once released are available until such time as their
> > +deprecation has been noted here for at least one major release cycle, after it
> > +has been tagged.  E.g. the ABI for DPDK 1.8 is shipped, and then the decision to
> > +remove it is made during the development of DPDK 1.9.  The decision will be
> > +recorded here, shipped with the DPDK 1.9 release, and actually removed when DPDK
> > +1.10 ships.
> > +
> > +   ABI versions may be deprecated in whole, or in part as needed by a given
> > +update.
> > +
> > +Deprecation Notices:
> > +
> 
> You could upgrade your example to 2.0/2.1.
> 
Sure, though I think doing so is rather arbitrary, as it's going to be
immediately dated as soon as version 2.1 releases.  But I can do that if you
like when we square up the documentation question above.
Neil

> Thanks
> -- 
> Thomas
> 
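
The timeline in the quoted policy text can be sketched as a small lookup: the release that ships the deprecation notice is followed by the first release in which the ABI may actually be removed. This is only an illustration of the rule as worded above. The release list is taken from the 1.8/1.9/1.10 example in the patch, and `demo_releases` and `demo_removal_release` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Release sequence from the example in the policy text (an assumption). */
static const char *const demo_releases[] = { "1.8", "1.9", "1.10" };
#define DEMO_NUM_RELEASES (sizeof(demo_releases) / sizeof(demo_releases[0]))

/*
 * Given the release that ships the deprecation notice, return the first
 * release in which the deprecated ABI may be removed, i.e. the next one
 * in the sequence. Returns NULL if the release is unknown or if no later
 * release is listed.
 */
const char *demo_removal_release(const char *notice_release)
{
	size_t i;

	assert(notice_release != NULL);
	for (i = 0; i + 1 < DEMO_NUM_RELEASES; i++) {
		if (strcmp(demo_releases[i], notice_release) == 0)
			return demo_releases[i + 1];
	}
	return NULL;
}
```

With the example from the patch, a notice shipped in "1.9" yields "1.10", so an ABI shipped with 1.8 stays available for one full release cycle after the notice.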


[dpdk-dev] New to DPDK

2015-01-14 Thread Ravi Rao
Thanks a lot for the quick response.
On 01/14/2015 01:27 PM, Wiles, Keith wrote:
> Most people I guess use a Xeon CPU class MB with one or two sockets
> running Linux with a supported NICs. I use a motherboard like the one
> below running Ubuntu 12.04 with 12G RAM and the 82599 NICs. You can find
> more supported NICs in the documentation and you need find the rest of the
> parts :-) You do not need much disk space I have a 500G disk and you can
> use less memory, but that is something you need to decide on.
>
> http://www.intel.com/content/www/us/en/motherboards/server-motherboards/ser
> ver-board-w2600cr.html
>
>
> On 1/14/15, 12:33 PM, "Ravi Rao"  wrote:
>
>> Hi All,
>> I am a newbee to DPDK. Can one of you please let me know if there is
>> any reference board that is available which I can use to build and
>> tryout the dpdk stuff on.
>> Regards,
>> Ravi



[dpdk-dev] Port link speed and link duplex always set to auto-negotiate & manual link speed configuration of 100Mb link speed not possible

2015-01-14 Thread Thomas Monjalon
Hi Marc,

Thanks for the report and the patch.
Unfortunately, there is still no review.
I believe 2 things could help to go further here:
1) your fix needs to be sent in the format described here:
http://dpdk.org/dev#send
2) a maintainer of e1000 PMD should be identified


2015-01-08 16:26, Marc E. Cooper:
> I believe there are defects in the code supporting manual configuration of
> port link speed and link duplex to values other than auto-negotiate.  The
> TESTPMD application from DPDK version 1.8.0 was executed on 2 different
> systems having 4x1G NICs with the following Ethernet controllers:
> 
> * Intel Corporation 82576 Gigabit Network Connection (rev 01)
> * Intel Corporation I350 Gigabit Network Connection (rev 01)
> 
> There appear to be two issues in the code:
> 
> * "hw->mac.autoneg" is always set to true (1).  The force speed and duplex
> code path is not followed.
> * "hw->mac.forced_speed_duplex" is never set.  A forced link speed
> configuration will always default to 10Mb regardless of whether configured
> for 100Mb or 10Mb.  For example, e1000_phy_force_speed_duplex_setup() will
> always configure link speed to 10Mb since it checks for the following
> condition "if (mac->forced_speed_duplex & E1000_ALL_100_SPEED)".
> 
> Changes are needed within "igb_ethdev.c" and "em_ethdev.c" within
> "lib/librte_pmd_e1000".  The switch statements that set up link speed and
> link duplex within these files need to manually set "hw->mac.autoneg" and
> "hw->mac.forced_speed_duplex" as shown below:
> 
> [root at box librte_pmd_e1000]# pwd
> /home/marc/dpdk/dpdk-1.8.0/lib/librte_pmd_e1000
> [root at box librte_pmd_e1000]# diff  -p igb_ethdev.c.orig igb_ethdev.c
> *** igb_ethdev.c.orig 2015-01-08 09:59:52.937215791 -0500
> --- igb_ethdev.c  2015-01-08 10:01:44.073730592 -0500
> *** eth_igb_start(struct rte_eth_dev *dev)
> *** 871,876 
> --- 871,878 
>   hw->phy.autoneg_advertised = ADVERTISE_10_FULL;
>   else
>   goto error_invalid_config;
> +   hw->mac.autoneg = 0;
> +   hw->mac.forced_speed_duplex |= hw->phy.autoneg_advertised;
>   break;
>   case ETH_LINK_SPEED_100:
>   if (dev->data->dev_conf.link_duplex == ETH_LINK_AUTONEG_DUPLEX)
> *** eth_igb_start(struct rte_eth_dev *dev)
> *** 881,886 
> --- 883,890 
>   hw->phy.autoneg_advertised = ADVERTISE_100_FULL;
>   else
>   goto error_invalid_config;
> +   hw->mac.autoneg = 0;
> +   hw->mac.forced_speed_duplex |= hw->phy.autoneg_advertised;
>   break;
>   case ETH_LINK_SPEED_1000:
>   if ((dev->data->dev_conf.link_duplex == 
> ETH_LINK_AUTONEG_DUPLEX) ||
> *** eth_igb_start(struct rte_eth_dev *dev)
> *** 888,893 
> --- 892,899 
>   hw->phy.autoneg_advertised = ADVERTISE_1000_FULL;
>   else
>   goto error_invalid_config;
> +   hw->mac.autoneg = 0;
> +   hw->mac.forced_speed_duplex |= hw->phy.autoneg_advertised;
>   break;
>   case ETH_LINK_SPEED_1:
>   default:
> [root at box librte_pmd_e1000]#
> 
> 
> After only setting hw->mac.autoneg = 0 in the switch statement cases
> described above I experimented with configuring the peer (switch) ports
> for 100Mb full duplex.  Within TESTPMD the ports were also configured for
> 100Mb full duplex using "port config all speed 100 duplex full".  The
> links failed to establish.  I enabled debug statements within
> "e1000_phy_force_speed_duplex_setup()" found in
> "lib/librte_pmd_e1000/e1000/e1000_phy.c" to display whether 100Mb or 10Mb
> was being forced.  It was observed that 10Mb was being forced when the
> link speed was manually configured for 100Mb.  Setting
> "hw->mac.forced_speed_duplex" as shown above seemed to resolve this issue
> and the links came up.
>  
> 
> Are these known issues?
> 
> -Marc
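
The behaviour Marc describes can be modelled in a few lines. The sketch below is a simplified stand-in, not the e1000 base driver code: the `DEMO_*` bit values, `struct demo_mac` and both functions are hypothetical, and only the two fields and the `& ALL_100_SPEED` test are taken from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for this sketch only. */
#define DEMO_ADVERTISE_10_HALF   0x01
#define DEMO_ADVERTISE_10_FULL   0x02
#define DEMO_ADVERTISE_100_HALF  0x04
#define DEMO_ADVERTISE_100_FULL  0x08
#define DEMO_ALL_100_SPEED \
	(DEMO_ADVERTISE_100_HALF | DEMO_ADVERTISE_100_FULL)

/* Simplified model of the two MAC fields named in the report. */
struct demo_mac {
	int autoneg;                  /* 1: auto-negotiate, 0: force the link */
	uint16_t forced_speed_duplex; /* which speed/duplex to force */
};

/* Return the forced speed in Mb/s, or 0 when auto-negotiation is used. */
int demo_forced_speed(const struct demo_mac *mac)
{
	assert(mac != NULL);
	if (mac->autoneg)
		return 0;
	/* Same shape as the quoted check: 100Mb only if a 100Mb bit is set. */
	if (mac->forced_speed_duplex & DEMO_ALL_100_SPEED)
		return 100;
	return 10;
}

/*
 * Model a request for 100Mb full duplex.
 * clear_autoneg: apply "hw->mac.autoneg = 0".
 * set_forced:    apply "hw->mac.forced_speed_duplex |= advertised".
 */
int demo_configure_100_full(int clear_autoneg, int set_forced)
{
	struct demo_mac mac = { 1, 0 };
	uint16_t advertised = DEMO_ADVERTISE_100_FULL;

	if (clear_autoneg)
		mac.autoneg = 0;
	if (set_forced)
		mac.forced_speed_duplex |= advertised;
	return demo_forced_speed(&mac);
}
```

In this model, `demo_configure_100_full(0, 0)` stays on auto-negotiation, `demo_configure_100_full(1, 0)` forces 10Mb (the failure seen after only clearing autoneg), and `demo_configure_100_full(1, 1)` forces 100Mb, which matches the sequence of observations in the report.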



[dpdk-dev] [PATCH v2 00/17] ACL: New AVX2 classify method and several other enhancements.

2015-01-14 Thread Neil Horman
On Mon, Jan 12, 2015 at 07:16:04PM +, Konstantin Ananyev wrote:
> v2 changes:
> - When build with the compilers that don't support AVX2 instructions,
> make rte_acl_classify_avx2() do nothing and return an error.
> - Remove unneeded 'ifdef __AVX2__' in acl_run_avx2.*.
> - Reorder order of patches in the set, to keep RTE_LIBRTE_ACL_STANDALONE=y
> always buildable.
> 
> This patch series contain several fixes and enhancements for ACL library.
> See complete list below.
> Two main changes that are externally visible:
> - Introduce new classify method:  RTE_ACL_CLASSIFY_AVX2.
> It uses AVX2 instructions and 256 bit wide data types
> to perform internal trie traversal.
> That helps to increase classify() throughput.
> This method is selected as default one on CPUs that supports AVX2.
> - Introduce new field in the build config structure: max_size.
> It specifies maximum size that internal RT structure for given context
> can reach.
> The purpose of that is to allow user to decide about space/performance trade-off
> (faster classify() vs less space for RT internal structures)
> for each given set of rules.
> 
> Konstantin Ananyev (17):
>   fix fix compilation issues with RTE_LIBRTE_ACL_STANDALONE=y
>   app/test: few small fixes fot test_acl.c
>   librte_acl: make data_indexes long enough to survive idle transitions.
>   librte_acl: remove build phase heuristsic with negative perfomance
> effect.
>   librte_acl: fix a bug at build phase that can cause matches beeing
> overwirtten.
>   librte_acl: introduce DFA nodes compression (group64) for identical
> entries.
>   librte_acl: build/gen phase - simplify the way match nodes are
> allocated.
>   librte_acl: make scalar RT code to be more similar to vector one.
>   librte_acl: a bit of RT code deduplication.
>   EAL: introduce rte_ymm and relatives in rte_common_vect.h.
>   librte_acl: add AVX2 as new rte_acl_classify() method
>   test-acl: add ability to manually select RT method.
>   librte_acl: Remove search_sse_2 and relatives.
>   libter_acl: move lo/hi dwords shuffle out from calc_addr
>   libte_acl: make calc_addr a define to deduplicate the code.
>   libte_acl: introduce max_size into rte_acl_config.
>   libte_acl: remove unused macros.
> 
>  app/test-acl/main.c | 126 +++--
>  app/test/test_acl.c |   8 +-
>  examples/l3fwd-acl/main.c   |   3 +-
>  examples/l3fwd/main.c   |   2 +-
>  lib/librte_acl/Makefile |  18 +
>  lib/librte_acl/acl.h|  58 ++-
>  lib/librte_acl/acl_bld.c| 392 +++-
>  lib/librte_acl/acl_gen.c| 268 +++
>  lib/librte_acl/acl_run.h|   7 +-
>  lib/librte_acl/acl_run_avx2.c   |  54 +++
>  lib/librte_acl/acl_run_avx2.h   | 284 
>  lib/librte_acl/acl_run_scalar.c |  65 ++-
>  lib/librte_acl/acl_run_sse.c| 585 +---
>  lib/librte_acl/acl_run_sse.h| 357 +++
>  lib/librte_acl/acl_vect.h   | 132 +++---
>  lib/librte_acl/rte_acl.c|  47 +-
>  lib/librte_acl/rte_acl.h|   4 +
>  lib/librte_acl/rte_acl_osdep_alone.h|  47 +-
>  lib/librte_eal/common/include/rte_common_vect.h |  39 +-
>  lib/librte_lpm/rte_lpm.h|   2 +-
>  20 files changed, 1444 insertions(+), 1054 deletions(-)
>  create mode 100644 lib/librte_acl/acl_run_avx2.c
>  create mode 100644 lib/librte_acl/acl_run_avx2.h
>  create mode 100644 lib/librte_acl/acl_run_sse.h
> 
> -- 
> 1.8.5.3
> 
> 
Series
Acked-by: Neil Horman 

Nice work
Neil



[dpdk-dev] Fast Path Query

2015-01-14 Thread Deepak Sehrawat
Hi Helin,

If we use exception_path or KNI, which extracts packets from the Linux kernel
(for DPDK application processing), will it still remain a fast path? Won't it
impact performance, since the Linux interrupt framework also comes into the
picture here?


On Wed, Jan 14, 2015 at 1:17 PM, Zhang, Helin  wrote:

> Hi Deepak
>
> If a NIC port is controlled by DPDK, all packets received by that port
> will go directly to DPDK, and Linux kernel doesn't know those packets
> anymore.
> But, the packets received by DPDK can be put into kernel by two special
> ways. They are exception_path and KNI. Please check the examples/ for more
> details.
> In the future, a port may be co-controlled by both Linux and DPDK. Part of
> queues will be controlled by Linux kernel driver, part of queues will be
> controlled by DPDK. Check the DPDK roadmap for more details.
>
> Regards,
> Helin
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Deepak Sehrawat
> > Sent: Wednesday, January 14, 2015 2:16 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Fast Path Query
> >
> > Hi All,
> >
> > I have a use-case where my slow path application (control path) is to run on
> > Linux where as my data path is to run as DPDK application. Because both
> > control and data packets are going to be received via same NIC card, how will
> > these two flows be separated and passed on to Linux control app and DPDK
> > data path app respectively? In short I want to understand how NIC received
> > packets are separated between Linux Eth driver and PMD (poll mode
> > driver) of DPDK?
> >
> > Thanks for the help in advance.
> >
> > Thanks,
> > Deepak
>





[dpdk-dev] New to DPDK

2015-01-14 Thread Ravi Rao
Hi All,
I am a newbie to DPDK. Can one of you please let me know if there is
any reference board available that I can use to build and try out
DPDK on?
Regards,
Ravi


[dpdk-dev] IPv6 Offload Capabilities

2015-01-14 Thread Thomas Monjalon
Hi Matthew,

2015-01-05 21:25, Matthew Hall:
> > > 2) The checksum operations are kind of a hodgepodge and don't always have a
> > > consistent vision to them... some things like the 16-bit-based IP checksum
> > > appear to be missing any routine, including any accelerated one when the
> > > offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or
> > > other weird crap like IPv6 pseudo headers, even contemplating those gives me
> > > a headache, but at least my greenfield code for it works now).
> > 
> > Please detail which function is missing for which usage.
> 
> rte_hash_crc exists, rte_hash_crc_4byte exists, there is no rte_hash_ip_cksum
> to use when checksum offloading doesn't work for some reason (in BSD it's
> called in_cksum). The jhash and CRC API's don't look to be consistent /
> compatible. An expandable API with some enum of hash algorithms and a standard
> calling convention for accelerated / special algorithms (like ones which
> assume 4-byte input) would make this more generic.

[...]

> But the larger architectural point was my proposed goal that all of the
> various kinds of hashes (flow hashes, checksums / packet hashes, table lookup
> hashes, etc.) could use a consistent pluggable API so we could easily move
> back and forth between them and write clean consistent code any time a hash is
> being used.

Thank you for your detailed comments.
Are you saying that you want to work on such a hash API for DPDK?

-- 
Thomas
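
For readers unfamiliar with the routine Matthew says is missing, here is a minimal software sketch of the 16-bit one's-complement Internet checksum (the algorithm BSD calls in_cksum). `demo_in_cksum` and the sample header are hypothetical and not part of DPDK; the sketch assumes the buffer is small enough (well under 128 KB) that the 32-bit accumulator cannot overflow, and it makes no attempt at acceleration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement checksum over a byte buffer, returned in host byte
 * order. Hypothetical helper name; not an existing DPDK function.
 */
uint16_t demo_in_cksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(data != NULL || len == 0);

	/* Add the buffer up as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is treated as the high byte of a last word. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* Sample 20-byte IPv4 header with its checksum field (bytes 10-11) zeroed. */
static const uint8_t demo_ipv4_hdr[20] = {
	0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
	0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
	0xc0, 0xa8, 0x00, 0xc7
};

/* Checksum of the sample header. */
uint16_t demo_sample_cksum(void)
{
	return demo_in_cksum(demo_ipv4_hdr, sizeof(demo_ipv4_hdr));
}
```

The same function applies to ICMP or to a pseudo-header plus payload, as long as the caller lays the bytes out in one contiguous buffer; that layout step is where the IPv6 pseudo-header handling mentioned above would have to go.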


[dpdk-dev] Why nothing since 1.8.0?

2015-01-14 Thread Stephen Hemminger
Ok, so 1.8.0 came out almost a month ago and none of the patches
that were deferred waiting for the release got merged since then.
Last commit in git is the 1.8.0 release.

Where is the post-merge window bundle, where are the later commits?
Lots of patches are sitting rotting in patchwork...



[dpdk-dev] Fast Path Query

2015-01-14 Thread Deepak Sehrawat
Hi All,

I have a use-case where my slow path application (control path) is to run
on Linux whereas my data path is to run as a DPDK application. Because both
control and data packets are going to be received via the same NIC, how
will these two flows be separated and passed on to the Linux control app and
the DPDK data path app respectively? In short, I want to understand how packets
received by the NIC are separated between the Linux Ethernet driver and the PMD
(poll mode driver) of DPDK.

Thanks for the help in advance.

Thanks,
Deepak


[dpdk-dev] [PATCH v3 2/4] Provide initial versioning for all DPDK libraries

2015-01-14 Thread Neil Horman
On Wed, Jan 14, 2015 at 04:29:29PM +0100, Thomas Monjalon wrote:
> 2014-12-23 10:51, Neil Horman:
> > Add linker version script files to each DPDK library to put a stake in the
> > ground from which we can start cleaning up API's
> 
> [...]
> 
> >  lib/librte_acl/Makefile|   2 +
> >  lib/librte_acl/rte_acl_version.map |  21 
> >  lib/librte_cfgfile/Makefile|   2 +
> >  lib/librte_cfgfile/rte_cfgfile_version.map |  14 +++
> >  lib/librte_cmdline/Makefile|   2 +
> >  lib/librte_cmdline/rte_cmdline_version.map |  69 +
> >  lib/librte_distributor/Makefile|   2 +
> >  lib/librte_distributor/rte_distributor_version.map |  16 +++
> >  lib/librte_eal/bsdapp/eal/Makefile |   2 +
> >  lib/librte_eal/bsdapp/eal/rte_eal_version.map  |  90 
> >  lib/librte_eal/linuxapp/eal/Makefile   |   2 +
> >  lib/librte_eal/linuxapp/eal/rte_eal_version.map|  90 
> >  lib/librte_ether/Makefile  |   2 +
> >  lib/librte_ether/rte_ether_version.map | 113 +
> >  lib/librte_hash/Makefile   |   2 +
> >  lib/librte_hash/rte_hash_version.map   |  18 
> >  lib/librte_ip_frag/Makefile|   2 +
> >  lib/librte_ip_frag/rte_ipfrag_version.map  |  14 +++
> >  lib/librte_ivshmem/Makefile|   2 +
> >  lib/librte_ivshmem/rte_ivshmem_version.map |  13 +++
> >  lib/librte_kni/Makefile|   2 +
> >  lib/librte_kni/rte_kni_version.map |  20 
> >  lib/librte_kvargs/Makefile |   2 +
> >  lib/librte_kvargs/rte_kvargs_version.map   |  10 ++
> >  lib/librte_lpm/Makefile|   2 +
> >  lib/librte_lpm/rte_lpm_version.map |  24 +
> >  lib/librte_malloc/Makefile |   2 +
> >  lib/librte_malloc/rte_malloc_version.map   |  19 
> >  lib/librte_mbuf/Makefile   |   2 +
> >  lib/librte_mbuf/rte_mbuf_version.map   |  14 +++
> >  lib/librte_mempool/Makefile|   2 +
> >  lib/librte_mempool/rte_mempool_version.map |  18 
> >  lib/librte_meter/Makefile  |   2 +
> >  lib/librte_meter/rte_meter_version.map |  13 +++
> >  lib/librte_pipeline/Makefile   |   2 +
> >  lib/librte_pipeline/rte_pipeline_version.map   |  23 +
> >  lib/librte_pmd_af_packet/Makefile  |   2 +
> >  .../rte_pmd_af_packet_version.map  |   7 ++
> >  lib/librte_pmd_bond/Makefile   |   2 +
> >  lib/librte_pmd_bond/rte_eth_bond_version.map   |  21 
> >  lib/librte_pmd_e1000/Makefile  |   2 +
> >  lib/librte_pmd_e1000/rte_pmd_e1000_version.map |   5 +
> >  lib/librte_pmd_enic/Makefile   |   2 +
> >  lib/librte_pmd_enic/rte_pmd_enic_version.map   |   5 +
> >  lib/librte_pmd_i40e/Makefile   |   2 +
> >  lib/librte_pmd_i40e/rte_pmd_i40e_version.map   |   5 +
> >  lib/librte_pmd_ixgbe/Makefile  |   2 +
> >  lib/librte_pmd_ixgbe/rte_pmd_ixgbe_version.map |   5 +
> >  lib/librte_pmd_pcap/Makefile   |   2 +
> >  lib/librte_pmd_pcap/rte_pmd_pcap_version.map   |   5 +
> >  lib/librte_pmd_ring/Makefile   |   2 +
> >  lib/librte_pmd_ring/rte_eth_ring.c |   2 +-
> >  lib/librte_pmd_ring/rte_eth_ring.h |   6 --
> >  lib/librte_pmd_ring/rte_eth_ring_version.map   |  10 ++
> >  lib/librte_pmd_virtio/Makefile |   1 +
> >  lib/librte_pmd_virtio/rte_pmd_virtio_version.map   |   5 +
> >  lib/librte_pmd_vmxnet3/Makefile|   2 +
> >  lib/librte_pmd_vmxnet3/rte_pmd_vmxnet3_version.map |   5 +
> >  lib/librte_pmd_xenvirt/Makefile|   2 +
> >  lib/librte_pmd_xenvirt/rte_eth_xenvirt_version.map |   8 ++
> >  lib/librte_port/Makefile   |   2 +
> >  lib/librte_port/rte_port_version.map   |  18 
> >  lib/librte_power/Makefile  |   2 +
> >  lib/librte_power/rte_power_version.map |  18 
> >  lib/librte_ring/Makefile   |   2 +
> >  lib/librte_ring/rte_ring_version.map   |  12 +++
> >  lib/librte_sched/Makefile  |   2 +
> >  lib/librte_sched/rte_sched_version.map |  22 
> >  lib/librte_table/Makefile  |   2 +
> >  lib/librte_table/rte_table_version.map |  22 
> >  lib/librte_timer/Makefile  |   2 +
> >  lib/librte_timer/rte_timer_version.map |  16 +++
> >  lib/librte_vhos

[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-14 Thread Bruce Richardson
On Tue, Jan 13, 2015 at 11:21:08PM -0500, Kamraan Nasim wrote:
> Hello,
> 
> I've been using DPDK fdir filter APIs for 82599 NIC(Niantic) and they work
> very well.
> 
> I was wondering if these could also be used for I210, 1Gbps NICs?
> 
> The other option is to use 5tuple filters (rte_eth_dev_add_5tuple_filter),
> however these do not support IPv6 yet.
> 
> 
> Have people in the community had any luck with configuring L3/L4 hardware
> filters for the I210 NIC?
> 
> Thanks,
> Kam

Flow director filters are not supported for 1G NICs. Sorry.

/Bruce


[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-14 Thread Bruce Richardson
On Tue, Jan 13, 2015 at 03:24:15PM -0800, Stephen Hemminger wrote:
> On Tue, 13 Jan 2015 09:22:00 +
> Cian Ferriter  wrote:
> 
> > Change the socket id that is passed to rte_memzone_reserve from
> > the socket id of current logical core to the socket id of the
> > master_lcore.
> > ---
> >  lib/librte_ether/rte_ethdev.c |2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >  mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c
> > 
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > old mode 100644
> > new mode 100755
> > index 95f2ceb..835540d
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
> > if (rte_eal_process_type() == RTE_PROC_PRIMARY){
> > mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> > RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> > -   rte_socket_id(), flags);
> > +   rte_lcore_to_socket_id(rte_get_master_lcore()), flags);
> > } else
> > mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> > if (mz == NULL)
> 
> 
> Why is this a memzone at all?
> Seems like it should be allocated on a per-device basis on the same NUMA node
> of the device. Probably with rte_malloc_socket().
> 
You can't look up a malloced area of memory in a secondary process, since it
doesn't have a name.
Question is: for normal apps, does the eth_dev_data ever drop out of cache? If
not, the numa node used for memory doesn't matter.

/Bruce


[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-14 Thread Bruce Richardson
On Tue, Jan 13, 2015 at 06:05:25PM +, Ferriter, Cian wrote:
> Comments on alternative solutions:
> 1) how would this solution work when there is no NIC present, and
> "rte_eth_from_rings" is called? Here, could you have an else where the socket
> id of the master core is passed to the "memzone_reserve"?
> 2) how would you advise making this change? I have looked at where
> "rte_eth_dev_allocate" is being called and in all but one case, there is a
> "numa_id" that could be passed in. This isn't the case for
> "rte_eth_dev_init" however, is there an easy solution for this? Would there
> now need to be an "rte_eth_dev_data" struct for each socket that there is a
> NIC attached to, reserving memory from that socket?
> 
> Cian

While I think the issues you highlight can probably be overcome, I'm not so
sure any more how much it matters what numa node this is allocated on. The 
ethdev data for any port in use by an app should be in the cache. In that case,
if it doesn't matter, your original suggestion would work fine.

/Bruce

> 
> -Original Message-
> From: Richardson, Bruce 
> Sent: Tuesday, January 13, 2015 1:56 PM
> To: Ferriter, Cian
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve
> 
> On Tue, Jan 13, 2015 at 09:23:16AM +, Ferriter, Cian wrote:
> > Passing a socket id of "rte_socket_id()" can cause problems in non DPDK
> > applications as there is a dependency on the current logical core we are
> > running on.
> > Passing "rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id
> > to rte_memzone_reserve resolves these issues as the master lcore doesn't
> > change.
> > 
> 
> The only trouble is that when affinitizing the memory for the NICs to the 
> socket of the master lcore, it gives us no way to correctly configure an app 
> to use NICs connected to two different sockets on the one system. All memory 
> for all NICs will end up on the same socket. Two possible alternative 
> solutions:
> 1) affinitize memory to the socket the NIC is connected to
> 2) add a socket parameter to the API calls to allow the user complete control 
> over their memory allocations
> 
> Obviously the second one breaks backward compatibility (assume we modify 
> existing API call), but is more powerful.
> 
> Thoughts?
> 
> /Bruce
> 
> > -Original Message-
> > From: Ferriter, Cian
> > Sent: Tuesday, January 13, 2015 9:22 AM
> > To: dev at dpdk.org
> > Cc: Ferriter, Cian
> > Subject: [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve
> > 
> > Change the socket id that is passed to rte_memzone_reserve from the socket 
> > id of current logical core to the socket id of the master_lcore.
> > ---
> >  lib/librte_ether/rte_ethdev.c |2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >  mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c
> > 
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > old mode 100644
> > new mode 100755
> > index 95f2ceb..835540d
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
> > if (rte_eal_process_type() == RTE_PROC_PRIMARY){
> > mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> > RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> > -   rte_socket_id(), flags);
> > +   rte_lcore_to_socket_id(rte_get_master_lcore()), flags);
> > } else
> > mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> > if (mz == NULL)
> > --
> > 1.7.4.1
> > 


[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-14 Thread Wodkowski, PawelX
> > >
> > >   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
> > >
> > > Rationale:
> > >
> > > rx and tx number of queue might be different if RX and TX are
> > >
> > > configured in different mode. This allow to inform VF about
> > >
> > > proper number of queues.
> >
> >
> > Nice move! Ouyang, this is a nice answer to my recent remarks about your
> > PATCH4 in "Enable VF RSS for Niantic" series.
> 
> After I responded to your last comments, I saw this :-). I am sure we both agree
> it is the right way to resolve it in the vmdq dcb case.
> 

I am now dividing this patch following your suggestions and I am a little confused.

In this (DCB in SRIOV) case the primary cause for splitting nb_q_per_pool into
nb_rx_q_per_pool and nb_tx_q_per_pool was this code:

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index af9e261..be3afe4 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -537,8 +537,8 @@
default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
/* if nothing mq mode configure, use default scheme */
dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_ONLY;
-   if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
-   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = 1;
+   if (RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool > 1)
+   RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = 1;
break;
}

@@ -553,17 +553,18 @@
default: /* ETH_MQ_TX_VMDQ_ONLY or ETH_MQ_TX_NONE */
/* if nothing mq mode configure, use default scheme */
dev->data->dev_conf.txmode.mq_mode = ETH_MQ_TX_VMDQ_ONLY;
-   if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
-   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = 1;
+   if (RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool > 1)
+   RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool = 1;
break;
}

/* check valid queue number */
-   if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) ||
-   (nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)) {
+   if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool) ||
+   (nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool)) {
PMD_DEBUG_TRACE("ethdev port_id=%d SRIOV active, "
-   "queue number must less equal to %d\n",
-   port_id, RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool);
+   "rx/tx queue number must less equal to %d/%d\n",
+   port_id, RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool,
+   RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool);
return (-EINVAL);
}
} else {
--

This introduced an issue when RX and TX were configured in different ways. The
problem was that RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool was common to RX and TX,
and it gets changed here. So I did the above. But when testpmd was adjusted for
DCB in SRIOV there was another issue. Testpmd pre-configures ports by default,
and since nb_rx_q_per_pool and nb_tx_q_per_pool had already been reset to 1,
there was no way to use them for DCB in SRIOV. So I did another modification:

> + uint16_t nb_rx_q_per_pool = RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool;
> + uint16_t nb_tx_q_per_pool = RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool;
> +
>   switch (dev_conf->rxmode.mq_mode) {
> - case ETH_MQ_RX_VMDQ_RSS:
>   case ETH_MQ_RX_VMDQ_DCB:
> + break;
> + case ETH_MQ_RX_VMDQ_RSS:
>   case ETH_MQ_RX_VMDQ_DCB_RSS:
> - /* DCB/RSS VMDQ in SRIOV mode, not implement yet */
> + /* RSS, DCB+RSS VMDQ in SRIOV mode, not implement yet */
>   PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
>   " SRIOV active, "
>   "unsupported VMDQ mq_mode rx %u\n",
> @@ -537,37 +560,32 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>   default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
>   /* if nothing mq mode configure, use default scheme */
>   dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_ONLY;
> - if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
> - RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = 1;
> + if (nb_rx_q_per_pool > 1)
> + nb_rx_q_per_pool = 1;
>   break;
>   }
>   
> 
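
The idea in the second modification (clamp local copies of the per-pool
limits instead of overwriting the stored values) can be sketched in plain C.
This is only an illustration under stated assumptions: struct sriov_limits,
check_queue_counts() and the two *_default_mode arguments are hypothetical
names, not part of DPDK, and the excerpt above does not show how the real
patch finally stores the result.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-port SR-IOV limits discussed above. */
struct sriov_limits {
    uint16_t nb_rx_q_per_pool;
    uint16_t nb_tx_q_per_pool;
};

/*
 * Validate requested queue counts against local copies of the limits.
 * rx_default_mode / tx_default_mode model the "default:" mq_mode case,
 * where the limit is reduced to one queue per pool. The stored limits are
 * never written, so an early pre-configuration cannot shrink them for a
 * later configuration attempt.
 */
static int
check_queue_counts(const struct sriov_limits *stored,
                   uint16_t nb_rx_q, uint16_t nb_tx_q,
                   int rx_default_mode, int tx_default_mode)
{
    uint16_t rx_limit = stored->nb_rx_q_per_pool;
    uint16_t tx_limit = stored->nb_tx_q_per_pool;

    if (rx_default_mode && rx_limit > 1)
        rx_limit = 1;
    if (tx_default_mode && tx_limit > 1)
        tx_limit = 1;

    if (nb_rx_q > rx_limit || nb_tx_q > tx_limit)
        return -EINVAL;

    return 0;
}
```

With this shape, a first call in the default mode followed by a later call
in a multi-queue mode still sees the original per-pool limits.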

[dpdk-dev] Fast Path Query

2015-01-14 Thread Gal Sagie
Your application (which you build on top of DPDK) needs to filter which
traffic is control traffic and inject
it into the network stack.
You can leverage DPDK KNI for that (
http://dpdk.org/doc/guides/prog_guide/kernel_nic_interface.html)
Keep in mind you also need to take care of the TX side.
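
As a rough illustration of the filtering step, here is a minimal sketch in
plain C. It assumes, purely for the example, that "control traffic" means
ARP frames; the function name is_control_frame() and that policy are
hypothetical, and a real application would choose its own rules (for
instance specific protocols or ports).

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN   14u      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETHERTYPE_ARP 0x0806u  /* standard EtherType value for ARP */

/*
 * Return 1 if the frame should be handed to the kernel (slow path),
 * 0 if it should stay in the DPDK fast path. Frames too short to carry
 * an Ethernet header are never treated as control traffic here.
 */
static int
is_control_frame(const uint8_t *frame, size_t len)
{
    uint16_t ether_type;

    if (len < ETH_HDR_LEN)
        return 0;

    /* EtherType is stored big-endian at byte offsets 12 and 13. */
    ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
    return ether_type == ETHERTYPE_ARP;
}
```

In a DPDK application the frames for which this returns 1 would be the ones
passed to the KNI device, and everything else would continue through the
normal forwarding code.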


On Wed, Jan 14, 2015 at 8:16 AM, Deepak Sehrawat 
wrote:

> Hi All,
>
> I have a use-case where my slow path application (control path) is to run
> on Linux where as my data path is to run as DPDK application. Because both
> control and data packets are going to be received via same NIC card, how
> will these two flows be separated and passed on to Linux control app and
> DPDK data path app respectively? In short I want to understand how NIC
> received packets are separated between Linux Eth driver and PMD (poll mode
> driver) of DPDK?
>
> Thanks for the help in advance.
>
> Thanks,
> Deepak
>



-- 
Best Regards ,

The G.


[dpdk-dev] Fast Path Query

2015-01-14 Thread Zhang, Helin
Hi Deepak

The exception path and KNI just provide a path for exchanging packets
between user space and kernel space. All packet I/O is still done in user
space by DPDK, and the Linux kernel still doesn't know about the NIC port,
so no standard Linux interrupt is involved.

Regards,
Helin

From: Deepak Sehrawat [mailto:d.sehra...@gmail.com]
Sent: Wednesday, January 14, 2015 3:58 PM
To: Zhang, Helin
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Fast Path Query


Hi Helin,

If we use exception_path or KNI, which extracts packet from Linux kernel (for 
DPDK application processing), will it still remain fast path? Will it not 
impact the performance; as Linux interrupt framework shall also come into 
picture here?


On Wed, Jan 14, 2015 at 1:17 PM, Zhang, Helin <helin.zhang at intel.com> wrote:
Hi Deepak

If a NIC port is controlled by DPDK, all packets received by that port go
directly to DPDK, and the Linux kernel no longer sees those packets.
However, packets received by DPDK can be passed to the kernel in two special
ways: exception_path and KNI. Please check the examples/ for more details.
In the future, a port may be co-controlled by both Linux and DPDK: some
queues will be controlled by the Linux kernel driver and others by DPDK.
Check the DPDK roadmap for more details.

Regards,
Helin

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On 
> Behalf Of Deepak Sehrawat
> Sent: Wednesday, January 14, 2015 2:16 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Fast Path Query
>
> Hi All,
>
> I have a use-case where my slow path application (control path) is to run on
> Linux where as my data path is to run as DPDK application. Because both
> control and data packets are going to be received via same NIC card, how will
> these two flows be separated and passed on to Linux control app and DPDK
> data path app respectively? In short I want to understand how NIC received
> packets are separated between Linux Eth driver and PMD (poll mode
> driver) of DPDK?
>
> Thanks for the help in advance.
>
> Thanks,
> Deepak



--
==
 The Harder I work,
 The Luckier I Get


DEEPAK $EHRAWAT
Senior Principal Engineer
Hughes Systique Corporation,
D-8, InfoCity Phase- II,
Sector-33, Gurgaon-122001
Mobile No. 9818228349
deepak.sehrawat at hsc.com
==


[dpdk-dev] Fast Path Query

2015-01-14 Thread Zhang, Helin
Hi Deepak

If a NIC port is controlled by DPDK, all packets received by that port go
directly to DPDK, and the Linux kernel no longer sees those packets.
However, packets received by DPDK can be passed to the kernel in two special
ways: exception_path and KNI. Please check the examples/ for more details.
In the future, a port may be co-controlled by both Linux and DPDK: some
queues will be controlled by the Linux kernel driver and others by DPDK.
Check the DPDK roadmap for more details.

Regards,
Helin

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Deepak Sehrawat
> Sent: Wednesday, January 14, 2015 2:16 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Fast Path Query
> 
> Hi All,
> 
> I have a use-case where my slow path application (control path) is to run on
> Linux where as my data path is to run as DPDK application. Because both
> control and data packets are going to be received via same NIC card, how will
> these two flows be separated and passed on to Linux control app and DPDK
> data path app respectively? In short I want to understand how NIC received
> packets are separated between Linux Eth driver and PMD (poll mode
> driver) of DPDK?
> 
> Thanks for the help in advance.
> 
> Thanks,
> Deepak


[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-14 Thread Zhang, Helin
FD should work on all supported NICs, though not all FD features may have
been implemented. You can see from the mailing list that some rework is
ongoing.

Regards,
Helin

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Kamraan Nasim
> Sent: Wednesday, January 14, 2015 12:21 PM
> To: dev at dpdk.org
> Cc: Ean Houts; Jun Du
> Subject: [dpdk-dev] Does I210 NIC support Flow director filters?
> 
> Hello,
> 
> I've been using DPDK fdir filter APIs for 82599 NIC(Niantic) and they work
> very well.
> 
> Was wondering if these could also be used for I210, 1Gbps NICs?
> 
> The other option is to use 5tuple filters (rte_eth_dev_add_5tuple_filter),
> however these do not support IPv6 yet.
> 
> 
> Have people in the community had any luck with configuring L3/L4 hardware
> filters for the I210 NIC?
> 
> Thanks,
> Kam


[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-14 Thread Liu, Jijiang
Hi Olivier,

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Tuesday, January 13, 2015 5:56 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org; Ananyev, Konstantin
> Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> csum forwarding engine
> 
> Hi Jijiang,
> 
> On 01/13/2015 04:04 AM, Liu, Jijiang wrote:
> > the following two commands are.
> >
> > 1. tx_checksum set sw-tunnel-mode on/off
> >
> > 2. tx_checksum set hw-tunnel-mode on/off
> >
> > For command 1, If the sw-tunnel-mode is set/clear, which will
> > set/clear a testpmd flag that is used in the process of analyzing
> > incoming packet., the pseudo-codes are list below,
> >
> > If (sw-tunnel-mode)
> >
> > Csum fwd engine will analyze if incoming packet is a tunneling packet.
> > tunnel = 1;
> > else
> > Csum fwd engine will not analyze if incoming packet is a tunneling
> > packet, and treat all the incoming packets as non-tunneling packets.
> > It is used for A.
> 
> What about "recognize-tunnel" instead of "sw-tunnel-mode"?
> Or "parse-tunnel"?

Ok,  "parse-tunnel" or "parse-tunnel-pkt" is better.
Thanks.


> To me, using "sw-" or "hw-" prefix is confusing because in any case the
> checksums can be calculated in software or hardware depending on
> "tx_checksum set outer-ip hw|sw".
> 
> Moreover, this command has an impact on receive side, but the name is still
> "tx_checksum". Maybe this is also confusing.
Ok,  how about this?

set  checksum parse-tunnel-pkt on|off  (port-id)

> > For command 2, If the hw-tunnel-mode is set/clear, which will
> > set/clear a testpmd flag that is used in the process of how to handle
> > tunneling packet, the pseudo-codes are list below,
> >
> > if (tunnel == 1) { // this is a tunneling packet
> >   If (hw-tunnel-mode)
> >     ol_flags |= PKT_TX_UDP_TUNNEL_PKT;
> >
> >     Csum fwd engine set PKT_TX_UDP_TUNNEL_PKT offload flag, which
> >     means to tell HW treat the transmit packet as a tunneling packet
> >     to do checksum offload.
> >     It is used for B.1
> >   Else
> >     Csum fwd engine doesn't set PKT_TX_UDP_TUNNEL_PKT offload flag,
> >     which means tell HW to treat the packet as ordinary
> >     (non-tunnelled) packet.
> >     It is used for B.2
> > }
> 
> What about:
>tx_checksum set tunnel-method normal|outer
> It would select if we use lX_len or outer_lX_len. Is it what you mean?

tx_checksum set tunnel-method normal|outer

Let me explain the differences in the TX checksum mechanism between
ixgbe (82599) and i40e (40G NIC).

For 82599, there is only one register used for L3 checksum offload. So for
a tunneling packet, the hardware is unable to recognize that the packet is a
tunneling packet, and the register cannot serve both outer and inner L3
checksum offload at the same time, only one or the other.

For i40e (40G NIC), there are two registers used for L3 TX checksum offload,
so for a tunneling packet the outer and inner L3 checksum offload can be done
by hardware at the same time. A prerequisite is that we must tell the
hardware that the packet is a tunneling packet by setting a register (the
PKT_TX_UDP_TUNNEL_PKT offload flag is used to indicate that this register
should be set).

As for other NICs, I think the working mechanism should be the same as the
i40e if they can recognize tunneling packets.

So my idea:
tx_checksum set tunnel-method  tunnel-pkt on|off

or
tx_checksum set tunnel-pkt on|off

What do you think?
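
To make the interaction of the two proposed switches concrete, here is a
small sketch in plain C of the decision described in the quoted pseudo-code.
It is only an illustration: struct csum_opts, tunnel_ol_flags() and the
SKETCH_TX_UDP_TUNNEL_PKT bit are hypothetical stand-ins, and the bit value is
not the real value of DPDK's PKT_TX_UDP_TUNNEL_PKT.

```c
#include <stdint.h>

/* Hypothetical bit standing in for PKT_TX_UDP_TUNNEL_PKT (not DPDK's value). */
#define SKETCH_TX_UDP_TUNNEL_PKT (1ULL << 0)

/* Hypothetical per-port options mirroring the two commands under discussion. */
struct csum_opts {
    int parse_tunnel; /* "parse-tunnel-pkt on": look for tunnel headers */
    int tunnel_pkt;   /* "tunnel-pkt on": tell hardware it is a tunnel packet */
};

/*
 * Return the tunnel-related offload flags for one packet.
 * looks_tunneled says whether the headers match a tunnel format; it is
 * ignored when parsing is switched off, so every packet is then handled
 * as an ordinary (non-tunneled) packet.
 */
static uint64_t
tunnel_ol_flags(const struct csum_opts *opts, int looks_tunneled)
{
    int tunnel = opts->parse_tunnel && looks_tunneled;

    if (tunnel && opts->tunnel_pkt)
        return SKETCH_TX_UDP_TUNNEL_PKT;

    return 0;
}
```

The flag is only set when both switches are on and the packet really is
tunneled, which corresponds to case B.1 in the quoted text; every other
combination falls back to ordinary handling.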


> And this only makes sense when we use hw checksum right?
yes

> 
> >> And will it be possible to support future hardware that will be able
> >> to compute both outer l3, outer l4, l3 and l4 checksums?

If outer l4 is supported in the future, we can add an outer-udp/tcp option
to the following command:
tx_checksum set outer-ip|ip|sctp|udp|tcp.


> > Yes.
> > Currently, i40e support outer l3, outer l4, l3 and l4 checksums offload
> > at the same time.
Sorry, my bad.
i40e supports only outer l3, l3 and l4.

Fortville can offload the following L3 and L4 integrity checks: IPv4 header(s) 
checksum for "simple" and tunneled packets, Inner TCP or UDP checksum and SCTP 
CRC integrity. Tunneling UDP headers and GRE header are not offloaded while 
Fortville leaves their checksum field as is. If a checksum is required, 
software should provide it as well as the inner checksum value(s) that are 
required for the outer checksum.
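
For the case where software has to provide a checksum itself, the basic
building block is the 16-bit one's-complement Internet checksum (RFC 1071).
The sketch below is a generic plain-C version, not code from DPDK or from
the i40e driver; the name inet_checksum() is an assumption made for this
example. Note that a real UDP checksum also covers a pseudo-header, which is
not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement Internet checksum over a byte buffer (RFC 1071).
 * Bytes are summed as big-endian 16-bit words; an odd trailing byte is
 * padded with zero. The 32-bit accumulator is sufficient for packet-sized
 * buffers (well below 64 KiB of data).
 */
static uint16_t
inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);

    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For the example bytes used in RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the folded
sum is 0xddf2, so this function returns its complement, 0x220d.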

> 
> >> I have another idea, please let me know if you find it clearer or not.
> >> The commands format would be:
> >>
> >> tx_checksum  ...
> >>
> >> [...]
> >>
> >> What do you think?
> >
> > Thanks for your proposal.
> > It is clear for me.
> >
> > But there are two questions for me.
> >
> > As I know, in current command line framework, the option in command line is
> exact match, so you probably have to add duplicated codes when you want to
> 

[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-14 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vlad Zolotarov
> Sent: Tuesday, January 13, 2015 6:14 PM
> To: Jastrzebski, MichalX K; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe
> 
> 
> On 01/12/15 17:50, Michal Jastrzebski wrote:
> > From: Pawel Wodkowski 
> >
> > This patch add support for DCB in SRIOV mode. When no PFC is enabled
> > this feature might be used as multiple queues (up to 8 or 4) for VF.
> >
> > It incorporate following modifications:
> >   - Allow zero rx/tx queues to be passed to rte_eth_dev_configure().
> > Rationale:
> > in SRIOV mode PF use first free VF to RX/TX. If VF count
> > is 16 or 32 all recources are assigned to VFs so PF can
> > be used only for configuration.
> >   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
> > Rationale:
> > rx and tx number of queue might be different if RX and TX are
> > configured in different mode. This allow to inform VF about
> > proper number of queues.
> >   - extern mailbox API for DCB mode
> 
> IMHO each bullet above is worth a separate patch. ;) It would be much easier
> to review.
> 
> thanks,
> vlad
> 
Agree with Vlad
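
For reference, the first bullet in the quoted description (allowing zero
rx/tx queues) comes down to treating a NULL allocation result as an error
only when at least one queue was requested. A minimal sketch in plain C,
using calloc() as a stand-in for rte_zmalloc() and a hypothetical
struct port_data rather than the real ethdev structures:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the queue bookkeeping in dev->data. */
struct port_data {
    void **rx_queues;
    uint16_t nb_rx_queues;
};

/*
 * Allocate the queue pointer array. A zero-sized request may legitimately
 * yield NULL, so NULL only counts as an allocation failure when
 * nb_queues > 0.
 */
static int
rx_queue_config(struct port_data *data, uint16_t nb_queues)
{
    data->rx_queues = calloc(nb_queues, sizeof(data->rx_queues[0]));
    if (data->rx_queues == NULL && nb_queues > 0) {
        data->nb_rx_queues = 0;
        return -ENOMEM;
    }

    data->nb_rx_queues = nb_queues;
    return 0;
}
```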


[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-14 Thread Ouyang, Changchun


> -Original Message-
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Tuesday, January 13, 2015 6:09 PM
> To: Jastrzebski, MichalX K; dev at dpdk.org
> Cc: Ouyang, Changchun
> Subject: Re: [dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe
> 
> 
> On 01/12/15 16:43, Michal Jastrzebski wrote:
> > Date: Mon, 12 Jan 2015 15:39:40 +0100
> > Message-Id:
> > <1421073581-6644-2-git-send-email-michalx.k.jastrzebski at intel.com>
> > X-Mailer: git-send-email 2.1.1
> > In-Reply-To:
> > <1421073581-6644-1-git-send-email-michalx.k.jastrzebski at intel.com>
> > References:
> > <1421073581-6644-1-git-send-email-michalx.k.jastrzebski at intel.com>
> >
> > From: Pawel Wodkowski 
> >
> >
> > This patch add support for DCB in SRIOV mode. When no PFC
> >
> > is enabled this feature might be used as multiple queues
> >
> > (up to 8 or 4) for VF.
> >
> >
> >
> > It incorporate following modifications:
> >
> >   - Allow zero rx/tx queues to be passed to rte_eth_dev_configure().
> >
> > Rationale:
> >
> > in SRIOV mode PF use first free VF to RX/TX. If VF count
> >
> > is 16 or 32 all recources are assigned to VFs so PF can
> >
> > be used only for configuration.
> >
> >   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
> >
> > Rationale:
> >
> > rx and tx number of queue might be different if RX and TX are
> >
> > configured in different mode. This allow to inform VF about
> >
> > proper number of queues.
> 
> 
> Nice move! Ouyang, this is a nice answer to my recent remarks about your
> PATCH4 in "Enable VF RSS for Niantic" series.

I saw this after responding to your last comments :-). I am sure we both
agree it is the right way to resolve it in the VMDq DCB case.

> Michal, could u, pls., respin this series after fixing the formatting and
> (maybe) using "git send-email" for sending? ;)
> 
> thanks,
> vlad
> 
> 
> >
> >   - extern mailbox API for DCB mode
> >
> >
> >
> > Signed-off-by: Pawel Wodkowski 
> >
> > ---
> >
> >   lib/librte_ether/rte_ethdev.c   |   84 +-
> >
> >   lib/librte_ether/rte_ethdev.h   |5 +-
> >
> >   lib/librte_pmd_e1000/igb_pf.c   |3 +-
> >
> >   lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   10 ++--
> >
> >   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |1 +
> >
> >   lib/librte_pmd_ixgbe/ixgbe_pf.c |   98
> ++-
> >
> >   lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |7 ++-
> >
> >   7 files changed, 159 insertions(+), 49 deletions(-)
> >
> >
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c
> >
> > index 95f2ceb..4c1a494 100644
> >
> > --- a/lib/librte_ether/rte_ethdev.c
> >
> > +++ b/lib/librte_ether/rte_ethdev.c
> >
> > @@ -333,7 +333,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev
> > *dev, uint16_t nb_queues)
> >
> > dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
> >
> > sizeof(dev->data->rx_queues[0]) *
> nb_queues,
> >
> > RTE_CACHE_LINE_SIZE);
> >
> > -   if (dev->data->rx_queues == NULL) {
> >
> > +   if (dev->data->rx_queues == NULL && nb_queues > 0) {
> >
> > dev->data->nb_rx_queues = 0;
> >
> > return -(ENOMEM);
> >
> > }
> >
> > @@ -475,7 +475,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev
> > *dev, uint16_t nb_queues)
> >
> > dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
> >
> > sizeof(dev->data->tx_queues[0]) *
> nb_queues,
> >
> > RTE_CACHE_LINE_SIZE);
> >
> > -   if (dev->data->tx_queues == NULL) {
> >
> > +   if (dev->data->tx_queues == NULL && nb_queues > 0) {
> >
> > dev->data->nb_tx_queues = 0;
> >
> > return -(ENOMEM);
> >
> > }
> >
> > @@ -507,6 +507,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
> > uint16_t nb_rx_q, uint16_t nb_tx_q,
> >
> >   const struct rte_eth_conf *dev_conf)
> >
> >   {
> >
> > struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >
> > +   struct rte_eth_dev_info dev_info;
> >
> >
> >
> > if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
> >
> > /* check multi-queue mode */
> >
> > @@ -524,11 +525,33 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
> > uint16_t nb_rx_q, uint16_t nb_tx_q,
> >
> > return (-EINVAL);
> >
> > }
> >
> >
> >
> > +   if ((dev_conf->rxmode.mq_mode ==
> ETH_MQ_RX_VMDQ_DCB) &&
> >
> > +   (dev_conf->txmode.mq_mode ==
> ETH_MQ_TX_VMDQ_DCB)) {
> >
> > +   enum rte_eth_nb_pools rx_pools =
> >
> > +   dev_conf-
> >rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
> >
> > +   enum rte_eth_nb_pools tx_pools =
> >
> > +   dev_conf-
> >tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
> >

[dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode

2015-01-14 Thread Ouyang, Changchun


From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Tuesday, January 13, 2015 5:00 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode


On 01/13/15 03:50, Ouyang, Changchun wrote:


From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Monday, January 12, 2015 9:59 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode


On 01/12/15 05:41, Ouyang, Changchun wrote:


From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Friday, January 09, 2015 9:50 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode


On 01/09/15 07:54, Ouyang, Changchun wrote:





-Original Message-

From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]

Sent: Friday, January 9, 2015 2:49 AM

To: Ouyang, Changchun; dev at dpdk.org

Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode





On 01/08/15 11:19, Vlad Zolotarov wrote:



On 01/07/15 08:32, Ouyang Changchun wrote:

Check mq mode for VMDq RSS and handle it correctly instead of returning
an error. Also remove the limitation that the per-pool queue number has a
max value of 1, because the per-pool queue number can be 2 or 4 in VMDq
RSS mode.

The number of rxq specified in config will determine the mq mode for
VMDq RSS.

Signed-off-by: Changchun Ouyang 

changes in v5:
   - Fix '<' issue, it should be '<=' to test rxq number;
   - Extract a function to remove the embedded switch-case statement.

---
  lib/librte_ether/rte_ethdev.c | 50 ++-
  1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 95f2ceb..8363e26 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -503,6 +503,31 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
 }

 static int
+rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
+{
+    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+    switch (nb_rx_q) {
+    case 1:
+    case 2:
+        RTE_ETH_DEV_SRIOV(dev).active =
+            ETH_64_POOLS;
+        break;
+    case 4:
+        RTE_ETH_DEV_SRIOV(dev).active =
+            ETH_32_POOLS;
+        break;
+    default:
+        return -EINVAL;
+    }
+
+    RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
+    RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
+        dev->pci_dev->max_vfs * nb_rx_q;
+
+    return 0;
+}
+
+static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
               const struct rte_eth_conf *dev_conf)
 {

@@ -510,8 +535,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
     if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
         /* check multi-queue mode */
-        if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_RSS) ||
-            (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
+        if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
             (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS) ||
             (dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
             /* SRIOV only works in VMDq enable mode */
@@ -525,7 +549,6 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
         }

         switch (dev_conf->rxmode.mq_mode) {
-        case ETH_MQ_RX_VMDQ_RSS:
         case ETH_MQ_RX_VMDQ_DCB:
         case ETH_MQ_RX_VMDQ_DCB_RSS:
             /* DCB/RSS VMDQ in SRIOV mode, not implement yet */
@@ -534,6 +557,25 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
                 "unsupported VMDQ mq_mode rx %u\n",
                 port_id, dev_conf->rxmode.mq_mode);
             return (-EINVAL);
+        case ETH_MQ_RX_RSS:
+            PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
+                " SRIOV active, "
+                "Rx mq mode is changed from:"
+                "mq_mode %u into VMDQ mq_mode %u\n",
+                port_id,
+                dev_conf->rxmode.mq_mode,
+                dev->data->dev_conf.rxmode.mq_mode);
+        case ETH_MQ_RX_VMDQ_RSS:
+            dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+            if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)
+                if (rte_eth_dev_check_vf_rss_rxq_num(port_id, nb_rx_q) != 0) {
+                    PMD_DEBUG_TRACE("ethdev port_id=%d"
+                        " SRIOV active, invalid queue"
+                        " number for VMDQ RSS\n",
+                        port_id);



Some nitpicking here: I'd add the allowed values descriptions to the

error message. Something like: "invalid queue number for VMDQ R