[dpdk-dev] [PATCH] E1000: fix for forced speed/duplex config

2016-11-02 Thread Lu, Wenzhuo
Hi Ananda,

> -----Original Message-----
> From: Ananda Sathyanarayana [mailto:ananda at versa-networks.com]
> Sent: Wednesday, November 2, 2016 6:47 AM
> To: Lu, Wenzhuo
> Cc: dev at dpdk.org; Ananda Sathyanarayana
> Subject: [PATCH] E1000: fix for forced speed/duplex config
> 
> From the code, it looks like the hw->mac.autoneg variable is used to switch
> between calling the autoneg function and the forced speed/duplex function. But
> this variable is not modified in the eth_em_start/eth_igb_start routines (it is
> always set to 1), even when the link speed is forced.
> 
> The following discussion thread has more information on this:
> 
> http://dpdk.org/ml/archives/dev/2016-October/049272.html
> 
> Signed-off-by: Ananda Sathyanarayana 
Thanks for the patch. It looks fine to me. But as it's a fix, would you like to 
add a Fixes tag for it?
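
For readers following the thread, here is a minimal sketch of the behaviour the patch description implies: the start routine has to clear the autoneg flag when a fixed speed is requested, otherwise the later dispatch always picks autonegotiation. The struct, enum and function names below are hypothetical stand-ins, not the real e1000 base-code definitions, and the patch itself is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver state discussed above. */
enum link_setup { LINK_SETUP_AUTONEG, LINK_SETUP_FORCED };

struct mac_info {
	int autoneg;             /* 1: autonegotiate, 0: force speed/duplex */
	uint16_t forced_speed;   /* Mb/s, meaningful only when autoneg == 0 */
	int forced_full_duplex;  /* meaningful only when autoneg == 0 */
};

/* Record the requested link configuration; speed_mbps == 0 means autoneg. */
static void
configure_link(struct mac_info *mac, uint16_t speed_mbps, int full_duplex)
{
	assert(mac != NULL);
	if (speed_mbps == 0) {
		mac->autoneg = 1;
		mac->forced_speed = 0;
		mac->forced_full_duplex = 0;
	} else {
		/* The step the patch description says was missing. */
		mac->autoneg = 0;
		mac->forced_speed = speed_mbps;
		mac->forced_full_duplex = full_duplex;
	}
}

/* Dispatch on the flag, as the description says the base code does. */
static enum link_setup
select_link_setup(const struct mac_info *mac)
{
	assert(mac != NULL);
	return mac->autoneg ? LINK_SETUP_AUTONEG : LINK_SETUP_FORCED;
}
```

If the flag is left at 1, select_link_setup() can never return LINK_SETUP_FORCED, which matches the symptom described in the patch.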


[dpdk-dev] [PATCH v3 02/12] net/virtio: setup and start cq in configure callback

2016-11-02 Thread Yao, Lei A
Hi, Olivier

During validation work with v16.11-rc2, I found that this patch causes a
VM crash when virtio bonding is enabled in the VM. Could you check this on
your side? The steps I used are listed below. Thanks a lot.

1. bind PF port to igb_uio.
modprobe uio
insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
./tools/dpdk-devbind.py --bind=igb_uio 84:00.1

2. start vhost switch.
./examples/vhost/build/vhost-switch -c 0x1c -n 4 --socket-mem 4096,4096 -- -p 0x1 --mergeable 0 --vm2vm 0 --socket-file ./vhost-net

3. Boot one VM with four virtio-net devices.
qemu-system-x86_64 \
-name vm0 -enable-kvm \
-chardev socket,path=/tmp/vm0_qga0.sock,server,nowait,id=vm0_qga0 \
-device virtio-serial \
-device virtserialport,chardev=vm0_qga0,name=org.qemu.guest_agent.0 \
-daemonize -monitor unix:/tmp/vm0_monitor.sock,server,nowait \
-net nic,vlan=0,macaddr=00:00:00:c7:56:64,addr=1f \
-net user,vlan=0,hostfwd=tcp:10.239.129.127:6107:22 \
-chardev socket,id=char0,path=./vhost-net \
-netdev type=vhost-user,id=netdev0,chardev=char0,vhostforce \
-device virtio-net-pci,netdev=netdev0,mac=52:54:00:00:00:01 \
-chardev socket,id=char1,path=./vhost-net \
-netdev type=vhost-user,id=netdev1,chardev=char1,vhostforce \
-device virtio-net-pci,netdev=netdev1,mac=52:54:00:00:00:02 \
-chardev socket,id=char2,path=./vhost-net \
-netdev type=vhost-user,id=netdev2,chardev=char2,vhostforce \
-device virtio-net-pci,netdev=netdev2,mac=52:54:00:00:00:03 \
-chardev socket,id=char3,path=./vhost-net \
-netdev type=vhost-user,id=netdev3,chardev=char3,vhostforce \
-device virtio-net-pci,netdev=netdev3,mac=52:54:00:00:00:04 \
-cpu host -smp 8 -m 4096 \
-object memory-backend-file,id=mem,size=4096M,mem-path=/mnt/huge,share=on \
-numa node,memdev=mem -mem-prealloc -drive file=/home/osimg/ubuntu16.img \
-vnc :10

4. On the VM, bind the virtio-net devices to igb_uio.
modprobe uio
insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
tools/dpdk-devbind.py --bind=igb_uio 00:04.0 00:05.0 00:06.0 00:07.0

5. Start the testpmd app.
./x86_64-native-linuxapp-gcc/app/testpmd -c 0x1f -n 4 -- -i --txqflags=0xf00 --disable-hw-vlan-filter

6. Create one bonding device (port 4).
create bonded device 0 0 (the first 0 is the mode, the second is the socket number)
show bonding config 4

7. Add ports 0, 1 and 2 as slaves of port 4.
add bonding slave 0 4
add bonding slave 1 4
add bonding slave 2 4
port start 4

Result: immediately after "port start 4" (port 4 is the bonded port), the VM
shuts down.

BRs
Lei

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
Sent: Thursday, October 13, 2016 10:16 PM
To: dev at dpdk.org; yuanhan.liu at linux.intel.com
Cc: Ananyev, Konstantin ; Chandran, Sugesh 
; Richardson, Bruce ; Tan, Jianfeng ; Zhang, Helin 
; adrien.mazarguil at 6wind.com; stephen at 
networkplumber.org; dprovan at bivio.net; Wang, Xiao W ; maxime.coquelin at redhat.com
Subject: [dpdk-dev] [PATCH v3 02/12] net/virtio: setup and start cq in 
configure callback

Move the configuration of control queue in the configure callback.
This is needed by next commit, which introduces the reinitialization of the 
device in the configure callback to change the feature flags.
Therefore, the control queue will have to be restarted at the same place.

As virtio_dev_cq_queue_setup() is called from a place where
config->max_virtqueue_pairs is not available, we need to store this in
the private structure. It replaces max_rx_queues and max_tx_queues which have 
the same value. The log showing the value of max_rx_queues and max_tx_queues is 
also removed since config->max_virtqueue_pairs is already displayed above.

Signed-off-by: Olivier Matz 
Reviewed-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_ethdev.c | 43 +++---
 drivers/net/virtio/virtio_ethdev.h |  4 ++--
 drivers/net/virtio/virtio_pci.h|  3 +--
 3 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 77ca569..f3921ac 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -552,6 +552,9 @@ virtio_dev_close(struct rte_eth_dev *dev)
if (hw->started == 1)
virtio_dev_stop(dev);

+   if (hw->cvq)
+   virtio_dev_queue_release(hw->cvq->vq);
+
/* reset the NIC */
if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
@@ -1191,16 +1194,7 @@ virtio_init_device(struct rte_eth_dev *eth_dev)
config->max_virtqueue_pairs = 1;
}

-   hw->max_rx_queues =
-   (VIRTIO_MAX_RX_QUEUES < config->max_virtqueue_pairs) ?
-   VIRTIO_MAX_RX_QUEUES : config->max_virtqueue_pairs;
-   hw->max_tx_queues =
-   (VIRTIO_MAX_TX_QUEUES < config->max_virtqueue_pairs) ?
-   VIRTIO_MAX_TX_QUEUES : config->max_virtqueue_pairs;
-
-   

[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path

2016-11-02 Thread Yuanhan Liu
On Tue, Nov 01, 2016 at 10:39:35AM +0100, Thomas Monjalon wrote:
> 2016-11-01 16:15, Yuanhan Liu:
> > On Fri, Oct 28, 2016 at 09:58:51AM +0200, Maxime Coquelin wrote:
> > > Agree, what we need is to be able to disable Virtio PMD features
> > > without having to rebuild the PMD.
> > 
> > I have wanted this feature (or, more precisely, this ability) for a long time.
> > For example, I wish there were an option like "force_legacy" when
> > both legacy and modern exist.
> 
> You can change the behaviour of the driver with a run-time parameter
> as a struct rte_devargs.

Thanks for the tip! Yeah, it's a workable solution, not an ideal one though.
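
As a rough illustration of the run-time parameter idea, the sketch below checks a comma-separated "key=value" device argument string for a flag such as the "force_legacy" option mentioned above. It is plain C written for this note: the function name is hypothetical, the option does not exist in the virtio PMD as far as this thread shows, and a real driver would more likely go through the rte_kvargs helpers than parse the string by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the comma-separated "key=value" list in args contains
 * key with the value "1", otherwise 0. Hypothetical helper, for
 * illustration only.
 */
static int
devarg_flag_enabled(const char *args, const char *key)
{
	size_t key_len;
	const char *p = args;

	assert(key != NULL);
	if (args == NULL)
		return 0;
	key_len = strlen(key);

	while (*p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		/* Match "<key>=1" exactly within this token. */
		if (len == key_len + 2 &&
		    strncmp(p, key, key_len) == 0 &&
		    p[key_len] == '=' && p[key_len + 1] == '1')
			return 1;

		if (end == NULL)
			break;
		p = end + 1; /* skip the comma and move to the next token */
	}
	return 0;
}
```

A driver probe routine could call devarg_flag_enabled(args, "force_legacy") once and pick the legacy code path when it returns 1, with no rebuild needed.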

--yliu


[dpdk-dev] [PATCH 0/3] vhost: comments and doc update due to vhost-cuse removal

2016-11-02 Thread Yuanhan Liu
Here is a small patchset updating the vhost programming guide, the sample
guide and some comments, following the removal of vhost-cuse.

---
Yuanhan Liu (3):
  doc: update vhost programming guide
  doc: update the vhost sample guide
  vhost: update comments

 doc/guides/prog_guide/vhost_lib.rst|  62 +-
 doc/guides/sample_app_ug/img/qemu_virtio_net.png   | Bin 31557 -> 0 bytes
 doc/guides/sample_app_ug/img/tx_dpdk_testpmd.png   | Bin 76019 -> 0 bytes
 doc/guides/sample_app_ug/img/vhost_net_arch.png| Bin 154920 -> 0 bytes
 .../sample_app_ug/img/vhost_net_sample_app.png | Bin 23800 -> 0 bytes
 .../sample_app_ug/img/virtio_linux_vhost.png   | Bin 30290 -> 0 bytes
 doc/guides/sample_app_ug/index.rst |  10 -
 doc/guides/sample_app_ug/tep_termination.rst   |  54 +-
 doc/guides/sample_app_ug/vhost.rst | 869 +++--
 examples/tep_termination/main.c|   7 +-
 examples/vhost/main.c  |   4 +-
 examples/vhost_xen/main.c  |   3 +-
 lib/librte_vhost/rte_virtio_net.h  |   5 +-
 lib/librte_vhost/vhost.c   |   9 +-
 lib/librte_vhost/vhost.h   |   4 +-
 15 files changed, 148 insertions(+), 879 deletions(-)
 delete mode 100644 doc/guides/sample_app_ug/img/qemu_virtio_net.png
 delete mode 100644 doc/guides/sample_app_ug/img/tx_dpdk_testpmd.png
 delete mode 100644 doc/guides/sample_app_ug/img/vhost_net_arch.png
 delete mode 100644 doc/guides/sample_app_ug/img/vhost_net_sample_app.png
 delete mode 100644 doc/guides/sample_app_ug/img/virtio_linux_vhost.png

-- 
1.9.0



[dpdk-dev] [PATCH 1/3] doc: update vhost programming guide

2016-11-02 Thread Yuanhan Liu
vhost-cuse has been removed in this release. Update the doc, with the
vhost-cuse part being removed.

Signed-off-by: Yuanhan Liu 
---
 doc/guides/prog_guide/vhost_lib.rst | 62 +
 1 file changed, 7 insertions(+), 55 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 573a318..4f997d4 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -46,26 +46,8 @@ vhost library should be able to:
 * Know all the necessary information about the vring:

   Information such as where the available ring is stored. Vhost defines some
-  messages to tell the backend all the information it needs to know how to
-  manipulate the vring.
-
-Currently, there are two ways to pass these messages and as a result there are
-two Vhost implementations in DPDK: *vhost-cuse* (where the character devices
-are in user space) and *vhost-user*.
-
-Vhost-cuse creates a user space character device and hook to a function ioctl,
-so that all ioctl commands that are sent from the frontend (QEMU) will be
-captured and handled.
-
-Vhost-user creates a Unix domain socket file through which messages are
-passed.
-
-.. Note::
-
-   Since DPDK v2.2, the majority of the development effort has gone into
-   enhancing vhost-user, such as multiple queue, live migration, and
-   reconnect. Thus, it is strongly advised to use vhost-user instead of
-   vhost-cuse.
+  messages (passed through a Unix domain socket file) to tell the backend all
+  the information it needs to know how to manipulate the vring.


 Vhost API Overview
@@ -75,11 +57,10 @@ The following is an overview of the Vhost API functions:

 * ``rte_vhost_driver_register(path, flags)``

-  This function registers a vhost driver into the system. For vhost-cuse, a
-  ``/dev/path`` character device file will be created. For vhost-user server
-  mode, a Unix domain socket file ``path`` will be created.
+  This function registers a vhost driver into the system. ``path`` specifies
+  the Unix domain socket file path.

-  Currently supported flags are (these are valid for vhost-user only):
+  Currently supported flags are:

   - ``RTE_VHOST_USER_CLIENT``

@@ -171,35 +152,8 @@ The following is an overview of the Vhost API functions:
   default.


-Vhost Implementations
----------------------
-
-Vhost-cuse implementation
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-When vSwitch registers the vhost driver, it will register a cuse device driver
-into the system and creates a character device file. This cuse driver will
-receive vhost open/release/IOCTL messages from the QEMU simulator.
-
-When the open call is received, the vhost driver will create a vhost device
-for the virtio device in the guest.
-
-When the ``VHOST_SET_MEM_TABLE`` ioctl is received, vhost searches the memory
-region to find the starting user space virtual address that maps the memory of
-the guest virtual machine. Through this virtual address and the QEMU pid,
-vhost can find the file QEMU uses to map the guest memory. Vhost maps this
-file into its address space, in this way vhost can fully access the guest
-physical memory, which means vhost could access the shared virtio ring and the
-guest physical address specified in the entry of the ring.
-
-The guest virtual machine tells the vhost whether the virtio device is ready
-for processing or is de-activated through the ``VHOST_NET_SET_BACKEND``
-message. The registered callback from vSwitch will be called.
-
-When the release call is made, vhost will destroy the device.
-
-Vhost-user implementation
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Vhost-user Implementations
+--------------------------

 Vhost-user uses Unix domain sockets for passing messages. This means the DPDK
 vhost-user implementation has two options:
@@ -246,8 +200,6 @@ For ``VHOST_SET_MEM_TABLE`` message, QEMU will send information for each
 memory region and its file descriptor in the ancillary data of the message.
 The file descriptor is used to map that region.

-There is no ``VHOST_NET_SET_BACKEND`` message as in vhost-cuse to signal
-whether the virtio device is ready or stopped. Instead,
 ``VHOST_SET_VRING_KICK`` is used as the signal to put the vhost device into
 the data plane, and ``VHOST_GET_VRING_BASE`` is used as the signal to remove
 the vhost device from the data plane.
-- 
1.9.0



[dpdk-dev] [PATCH 3/3] vhost: update comments

2016-11-02 Thread Yuanhan Liu
vhost-cuse is removed, update corresponding comments that are still
referencing it.

Signed-off-by: Yuanhan Liu 
---
 examples/tep_termination/main.c   | 7 ++-
 examples/vhost/main.c | 4 +---
 examples/vhost_xen/main.c | 3 +--
 lib/librte_vhost/rte_virtio_net.h | 5 ++---
 lib/librte_vhost/vhost.c  | 9 -
 lib/librte_vhost/vhost.h  | 4 +++-
 6 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/examples/tep_termination/main.c b/examples/tep_termination/main.c
index 622f248..1d6d463 100644
--- a/examples/tep_termination/main.c
+++ b/examples/tep_termination/main.c
@@ -1151,8 +1151,7 @@ print_stats(void)
 }

 /**
- * Main function, does initialisation and calls the per-lcore functions. The CUSE
- * device is also registered here to handle the IOCTLs.
+ * Main function, does initialisation and calls the per-lcore functions.
  */
 int
 main(int argc, char *argv[])
@@ -1253,14 +1252,12 @@ main(int argc, char *argv[])
}
rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_MRG_RXBUF);

-   /* Register CUSE device to handle IOCTLs. */
ret = rte_vhost_driver_register((char *)&dev_basename, 0);
if (ret != 0)
-   rte_exit(EXIT_FAILURE, "CUSE device setup failure.\n");
+   rte_exit(EXIT_FAILURE, "failed to register vhost driver.\n");

rte_vhost_driver_callback_register(&virtio_net_device_ops);

-   /* Start CUSE session. */
rte_vhost_driver_session_start();

return 0;
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 91000e8..0709859 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1409,8 +1409,7 @@ create_mbuf_pool(uint16_t nr_port, uint32_t nr_switch_core, uint32_t mbuf_size,
 }

 /*
- * Main function, does initialisation and calls the per-lcore functions. The CUSE
- * device is also registered here to handle the IOCTLs.
+ * Main function, does initialisation and calls the per-lcore functions.
  */
 int
 main(int argc, char *argv[])
@@ -1531,7 +1530,6 @@ main(int argc, char *argv[])

rte_vhost_driver_callback_register(&virtio_net_device_ops);

-   /* Start CUSE session. */
rte_vhost_driver_session_start();
return 0;

diff --git a/examples/vhost_xen/main.c b/examples/vhost_xen/main.c
index 2e40357..f4dbaa4 100644
--- a/examples/vhost_xen/main.c
+++ b/examples/vhost_xen/main.c
@@ -1407,8 +1407,7 @@ print_stats(void)
 int init_virtio_net(struct virtio_net_device_ops const * const ops);

 /*
- * Main function, does initialisation and calls the per-lcore functions. The CUSE
- * device is also registered here to handle the IOCTLs.
+ * Main function, does initialisation and calls the per-lcore functions.
  */
 int
 main(int argc, char *argv[])
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index c53ff64..926039c 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -123,9 +123,8 @@ int rte_vhost_get_numa_node(int vid);
 uint32_t rte_vhost_get_queue_num(int vid);

 /**
- * Get the virtio net device's ifname. For vhost-cuse, ifname is the
- * path of the char device. For vhost-user, ifname is the vhost-user
- * socket file path.
+ * Get the virtio net device's ifname, which is the vhost-user socket
+ * file path.
  *
  * @param vid
  *  virtio-net device ID
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index d8116ff..31825b8 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -227,9 +227,8 @@ reset_device(struct virtio_net *dev)
 }

 /*
- * Function is called from the CUSE open function. The device structure is
- * initialised and a new entry is added to the device configuration linked
- * list.
+ * Invoked when there is a new vhost-user connection established (when
+ * there is a new virtio device being attached).
  */
 int
 vhost_new_device(void)
@@ -261,8 +260,8 @@ vhost_new_device(void)
 }

 /*
- * Function is called from the CUSE release function. This function will
- * cleanup the device and remove it from device configuration linked list.
+ * Invoked when there is the vhost-user connection is broken (when
+ * the virtio device is being detached).
  */
 void
 vhost_destroy_device(int vid)
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index acec772..22564f1 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -284,7 +284,9 @@ void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
 void vhost_enable_dequeue_zero_copy(int vid);

 /*
- * Backend-specific cleanup. Defined by vhost-cuse and vhost-user.
+ * Backend-specific cleanup.
+ *
+ * TODO: fix it; we have one backend now
  */
 void vhost_backend_cleanup(struct virtio_net *dev);

-- 
1.9.0



[dpdk-dev] [RFC]Generic flow filtering API Sample Application

2016-11-02 Thread Zhao1, Wei
Hi all,
We are planning a sample application for the generic flow filtering API
feature, and I have finished the RFC for this example app.
Adrien Mazarguil has sent v2 of the generic flow filtering API; this
sample application RFC is based on that version.

Thank you.




Generic flow filtering API Sample Application
=============================================

This application is a simple example of using the generic flow filtering API in
the DPDK. It performs flow director/filtering/classification as part of packet
processing.

Overview
--------

The application demonstrates the use of the generic flow
director/filtering/classification API in the DPDK to implement packet
forwarding. This document focuses on guidelines for writing rule configuration
files and on the usage of the prompt commands. It also defines the available
EAL options, which are useful in DPDK packet forwarding.


Compiling the Application
-------------------------

To compile the application:

#.  Go to the sample application directory:

    .. code-block:: console

        export RTE_SDK=/path/to/rte_sdk
        cd ${RTE_SDK}/examples/gen_filter

#.  Set the target (a default target is used if not specified). For example:

    .. code-block:: console

        export RTE_TARGET=x86_64-native-linuxapp-gcc

    See the *DPDK Getting Started Guide* for possible RTE_TARGET values.

#.  Build the application:

    .. code-block:: console

        make

Running the Application
-----------------------

The application has a number of EAL options::

    ./gen_filter [EAL options] -- 

EAL options:

*   -c
    Coremask: sets the hexadecimal bitmask of the cores to run on.

*   -n
    Num: sets the number of memory channels to use.

APP PARAMS:

The following are the application parameters. They must be separated
from the EAL options with a "--" separator.

*   -i
    Interactive: runs the app in interactive mode. In this mode, the app
    starts with a prompt that can be used to start and stop forwarding and to
    manage the generic filter rules configured in the application; see the
    description of interactive mode below for more details. In non-interactive
    mode, the application starts with the configuration specified on the
    command line and immediately enters forwarding mode.

*   --portmask=0xXX
    Sets the hexadecimal bitmask of the ports that can be used by the
    generic flow director test in packet forwarding.

*   --coremask=0xXX
    Sets the hexadecimal bitmask of the cores running the packet forwarding
    test. The master lcore is reserved for command line parsing only and cannot
    be included in the mask for packet forwarding.

*   --nb-ports=N
    Sets the number of forwarding ports, where 1 <= N <= the number of ports
    on the board or CONFIG_RTE_MAX_ETHPORTS from the configuration file. The
    default value is the number of ports on the board.

*   --rxq=N
    Sets the number of RX queues per port to N, where 1 <= N <= 65535. The
    default value is 1.

*   --txq=N
    Sets the number of TX queues per port to N, where 1 <= N <= 65535. The
    default value is 1.


###This part will be completed later, once it is decided which EAL command
arguments this application needs to support###
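
To illustrate how a hexadecimal mask option such as --portmask=0xXX might be interpreted, here is a small sketch in plain C. The function names are hypothetical and the RFC does not specify the parsing rules; the sketch assumes that a zero or malformed mask is rejected and that bit N of the mask enables port N.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a hexadecimal mask such as "0x3" or "f".
 * Returns 0 and stores the mask on success, -1 on malformed or zero input.
 */
static int
parse_hex_mask(const char *arg, uint64_t *mask)
{
	char *end = NULL;
	unsigned long long val;

	assert(mask != NULL);
	if (arg == NULL || *arg == '\0' || *arg == '-')
		return -1;

	errno = 0;
	val = strtoull(arg, &end, 16); /* base 16 accepts an optional 0x prefix */
	if (errno != 0 || *end != '\0' || val == 0)
		return -1;

	*mask = (uint64_t)val;
	return 0;
}

/* Return nonzero if the given port ID is enabled in the mask. */
static int
port_enabled(uint64_t mask, unsigned int port_id)
{
	return port_id < 64 && ((mask >> port_id) & 1);
}
```

With --portmask=0x5, for example, ports 0 and 2 would be enabled and port 1 would not. The same helper could serve --coremask, with the additional check that the master lcore is not set.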


Interactive mode

*   When the gen_filter application is started in interactive mode
    (-i|--interactive), it displays a prompt that can be used to start and
    stop forwarding, configure the Flow Director, display statistics, and
    perform other tasks. The application has a number of command line options:

    gen_filter>[Commands]

*   A "gen_filter> " prompt appears before the cursor. Commands are entered
    after the prompt, with a space between the command and the configuration
    file name.

These are the commands currently available in the command line interface:

*   Control Commands

    help: shows the commands currently available in this application and
    their usage.
    gen_filter>help

    quit: quits the application.
    gen_filter>quit

    start: starts the application and packet forwarding.
    gen_filter>start

    stop: stops the application and packet forwarding.
    gen_filter>stop

    showcfg: prints configuration information about the EAL parameters, for
    example the mapping of cores, RX queues, TX queues and so on.
    gen_filter>showcfg

*   General commands to add/remove/query a filter rule:
    After the user types a rule command, the app prints a message reporting
    whether the command was a SUCCESS or a FAIL.

    add: adds filter rules from a configuration file.
    gen_filter>add port_id filename.txt

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Tuesday, October 25, 2016 6:49 PM
> 
> > 
> > Hi Community,
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work
> > on implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
> 
> 
> Hi All,
> 
> I've been looking at the eventdev API from a use-case point of view, and I'm
> unclear on how the API caters for two uses. I have simplified these as much
> as possible, think of them as a theoretical unit-test for the API :)
> 
> 
> Fragmentation:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs fragmentation into two packets
> 4. Process remaining 5 packets as normal
> 
> What function calls does the application make to achieve this?
> In particular, I'm referring to how can the scheduler know that the 3rd 
> packet is the one being fragmented, and how to keep packet order valid. 
> 

OK. I will try to share my views on IP fragmentation on event _HW_
models (at least on Cavium HW); then we can see how we can converge.

First, the fragmentation-specific logic should be decoupled from the event
model, as it is specific to packets and the L3 layer (not to generic events).

Now, let us consider fragmentation handling in the non-burst case with a
single flow. The following text outlines the event flow:

a) Set up an event device with a single event queue.
b) Link multiple ports to the single event queue.
c) The event producer enqueues packets p0..p7 to the event queue with the
ORDERED type. (Let's assume packet p2 needs to be fragmented, i.e. the
application needs to create p2.0 and p2.1 from p2.)
d) Since the type is ORDERED, packets p0 to p7 are distributed to multiple
ports in parallel (each assigned to an lcore or lightweight thread).
e) Each lcore/lightweight thread gets packets from its designated event port,
processes them in parallel, and enqueues them back with the ATOMIC type to
maintain ordering.
f) One lcore dequeues packet p2 and finds that it needs to be fragmented
(due to MTU size, etc.). It calls rte_ipv4_fragment_packet() and stores the
fragments p2.0 and p2.1 in the private area of the p2 mbuf. Then, like the
other workers, it enqueues p2 to the atomic queue to maintain the order.
g) On the atomic flow, when an lcore dequeues packets, they arrive in order
p0..p7. The application sends p0 to p7 on the wire. When it checks the p2 mbuf
private area, it sees that the packet was fragmented and sends p2.0 and p2.1
on the wire.

OR

Skip the fragmentation in step (f) and, in step (g), call
rte_ipv4_fragment_packet() while processing p2 to split the packet and
transmit the fragments (in case the application doesn't want to deal with the
mbuf private area).
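
The flow in steps (e) to (g) can be sketched as a small self-contained simulation. This is not eventdev code: the packet struct, the worker and TX functions and the use of qsort() to stand in for the ordering the scheduler restores are all assumptions made for illustration, and the sketch assumes the fragments are sent in place of p2. The fragment-size arithmetic also ignores IP headers and the 8-byte fragment offset granularity that real IPv4 fragmentation has to respect.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAGS 2

/* Hypothetical packet descriptor; nb_frags/frag_len model the "private
 * area" of the mbuf mentioned in step (f). */
struct pkt {
	unsigned int seq;                 /* ingress order, p0..p7 */
	unsigned int len;                 /* payload length in bytes */
	unsigned int nb_frags;            /* 0 if the packet was not fragmented */
	unsigned int frag_len[MAX_FRAGS];
};

/* Worker stage (steps e/f): split the packet in two if it exceeds the MTU. */
static void
worker_process(struct pkt *p, unsigned int mtu)
{
	assert(p != NULL && mtu > 0);
	if (p->len > mtu) {
		assert(p->len <= MAX_FRAGS * mtu); /* sketch handles two fragments only */
		p->nb_frags = 2;
		p->frag_len[0] = mtu;
		p->frag_len[1] = p->len - mtu;
	}
}

static int
cmp_seq(const void *a, const void *b)
{
	const struct pkt *pa = a, *pb = b;

	return (pa->seq > pb->seq) - (pa->seq < pb->seq);
}

/*
 * TX stage (step g): restore ingress order, then emit each packet or its
 * fragments. Writes one label per transmitted frame ("p1", "p2.0", ...)
 * and returns the number of frames.
 */
static size_t
tx_in_order(struct pkt *pkts, size_t n, char labels[][8], size_t max_labels)
{
	size_t i, out = 0;
	unsigned int f;

	/* Stand-in for the ordering the scheduler restores on the atomic queue. */
	qsort(pkts, n, sizeof(pkts[0]), cmp_seq);

	for (i = 0; i < n; i++) {
		if (pkts[i].nb_frags == 0) {
			if (out < max_labels)
				snprintf(labels[out++], 8, "p%u", pkts[i].seq);
			continue;
		}
		for (f = 0; f < pkts[i].nb_frags && out < max_labels; f++)
			snprintf(labels[out++], 8, "p%u.%u", pkts[i].seq, f);
	}
	return out;
}
```

Feeding eight packets through worker_process() in any order, with only p2 larger than the MTU, and then calling tx_in_order() yields nine frames, with p2.0 and p2.1 sitting between p1 and p3.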

Now, when it comes to the BURST scheme: we are planning to create a SW
structure as a virtual event port and associate N
(N=rte_event_port_dequeue_depth()) physical HW event ports with the virtual
port. That way, it just comes as an extension of the non-burst API, and the
release call takes an explicit "index" that identifies the physical event port
associated with the virtual port.

/Jerin

> 
> Dropping packets:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs to be dropped
> 4. Process remaining 5 packets as normal
> 
> What function calls does the application make to achieve this?
> Again, in particular how does the scheduler know that the 3rd packet is being 
> dropped.

rte_event_release(..,..,3)??

> 
> 
> Regards, -Harry


[dpdk-dev] [PATCH] ethdev: fix statistics description

2016-11-02 Thread Dai, Wei
Hi, John & Greg

Could you please give your opinion on this patch?

I have looked through all PMDs and found that some NICs cannot support every
statistics item.
For example, rx_nombuf, q_ipackets, q_opackets, q_ibytes and q_obytes are
not supported by i40e.
But when rte_eth_stats_get(uint8_t port_id, struct rte_eth_stats *stats) is
called for the i40e PMD, the unsupported statistics items in the output stats
are zero, which is not the real value.
So far, there is no way to tell from the structure definition alone whether an
item in struct rte_eth_stats is supported.
Maybe a structure member could be added to indicate whether each statistics
item is valid.
But this means an ABI change.
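
One possible shape for such a member is a bitmask with one flag per counter, as in the sketch below. Everything here is hypothetical: the struct is a cut-down stand-in rather than the real struct rte_eth_stats, and the flag names and accessor are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags, one per counter; not part of the real rte_eth_stats. */
enum stat_flag {
	STAT_IPACKETS  = 1u << 0,
	STAT_OPACKETS  = 1u << 1,
	STAT_IMISSED   = 1u << 2,
	STAT_RX_NOMBUF = 1u << 3,
};

/* Cut-down, illustrative stand-in for struct rte_eth_stats. */
struct sketch_stats {
	uint64_t ipackets;
	uint64_t opackets;
	uint64_t imissed;
	uint64_t rx_nombuf;
	uint32_t valid; /* bitmask of STAT_* flags the driver actually fills in */
};

/*
 * Read rx_nombuf only if the driver marked it valid.
 * Returns 0 on success, -1 if the counter is unsupported; *value is left
 * untouched in that case.
 */
static int
stats_get_rx_nombuf(const struct sketch_stats *s, uint64_t *value)
{
	assert(s != NULL && value != NULL);
	if (!(s->valid & STAT_RX_NOMBUF))
		return -1;
	*value = s->rx_nombuf;
	return 0;
}
```

With this layout, a driver that cannot count rx_nombuf would leave STAT_RX_NOMBUF clear, and the caller could distinguish "unsupported" from a genuine zero.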

The following table lists the statistics supported by each PMD.
I hope it displays correctly on your screen.

Thanks
/Wei

NIC           ipackets  opackets  ibytes  obytes  imissed  ierrors  oerrors  rx_nombuf  q_ipackets  q_opackets  q_ibytes  q_obytes  q_errors
af_packet     y         y         y       y       n        n        y        n          y           y           y         y         y
bnx2x         y         y         y       y       y        y        y        y          n           n           n         n         n
bnxt          y         y         y       y       y        y        y        n          y           y           y         y         y
bonding       y         y         y       y       y        y        y        y          y           y           y         y         y
cxgbe         y         y         y       y       y        y        y        n          y           y           y         y         y
e1000(igb)    y         y         y       y       y        y        y        n          n           n           n         n         n
e1000(igbvf)  y         y         y       y       n        n        n        n          n           n           n         n         n
ena           y         y         y       y       y        y        y        y          n           n           n         n         n
enic          y         y         y       y       y        y        y        y          n           n           n         n         n
fm10k         y         y         y       y       n        n        n        n          y           y           y         y         n
i40e          y         y         y       y       y        y        y        n          n           n           n         n         n
i40evf        y         y         y       y       n        y        y        n          n           n           n         n         n
ixgbe         y         y         y       y       y        y        y        n          y           y           y         y         y
ixgbevf       y         y         y       y       n        n        n        n          n           n           n         n         n
mlx4          y         y         y       y       n        y        y        y          y           y           y         y         y
mlx5          y         y         y       y       n        y        y        y          y           y           y         y         y
mpipe         y         y         y       y       n        y        y        y          y           y           y         y         y
nfp           y         y         y       y       y        y        y        y          y           y           y         y         n
null          y         y         n       n       n        n        y        n          y           y           n         n         y
pcap          y         y         y       y       n        n        y        n          y           y           y         y         y
qede          y         y         y       y       y        y        y        y          n           n           n         n         n
ring          y         y         n       n       n        n        y        n          y           y           n         n         y
szedata2      y         y         y       y       n        n        y        n          y           y           y         y         y
thunderx      y         y         y       y       y        y        y        n          y           y           y         y         n
vhost         y         y         y       y       n        n        y        n          y           y           y         y         n
virtio        y         y         y       y       n        y        y        y          y           y           y         y         n
vmxnet3       y         y         y       y       n        y        y        y          y           y           y         y         y
xenvirt       y         y         n       n       n        n        n        n          n           n           n         n         n

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, October 4, 2016 5:35 PM
> To: Dai, 

[dpdk-dev] [PATCH 2/3] doc: update the vhost sample guide

2016-11-02 Thread Yuanhan Liu
For the vhost-switch sample, the old guide spends too many words on
vhost-cuse, mainly because vhost-cuse was invented before vhost-user.

Now that vhost-cuse is removed, most of the doc is useless.
Instead of amending a piece here and there, this patch simply removes
most of the doc and replaces it with a simple test guide.

For the tep_term sample, the change mainly removes the parts that mention
"vhost-cuse".

Signed-off-by: Yuanhan Liu 
---

The test is simplified a bit (from using two virtio-net devices to one),
meaning a few figures don't apply anymore; thus, the vhost_net_sample and
tx_dpdk_testpmd figures are removed.

And apparently, the vhost_net_arch figure also becomes useless. Since
I have no time to draw one for vhost-user, I also removed the two remaining
figures, qemu_virtio_net and virtio_linux_vhost, which show the architecture
of typical virtio-net and Linux kernel vhost-net.
---
 doc/guides/sample_app_ug/img/qemu_virtio_net.png   | Bin 31557 -> 0 bytes
 doc/guides/sample_app_ug/img/tx_dpdk_testpmd.png   | Bin 76019 -> 0 bytes
 doc/guides/sample_app_ug/img/vhost_net_arch.png| Bin 154920 -> 0 bytes
 .../sample_app_ug/img/vhost_net_sample_app.png | Bin 23800 -> 0 bytes
 .../sample_app_ug/img/virtio_linux_vhost.png   | Bin 30290 -> 0 bytes
 doc/guides/sample_app_ug/index.rst |  10 -
 doc/guides/sample_app_ug/tep_termination.rst   |  54 +-
 doc/guides/sample_app_ug/vhost.rst | 869 +++--
 8 files changed, 128 insertions(+), 805 deletions(-)
 delete mode 100644 doc/guides/sample_app_ug/img/qemu_virtio_net.png
 delete mode 100644 doc/guides/sample_app_ug/img/tx_dpdk_testpmd.png
 delete mode 100644 doc/guides/sample_app_ug/img/vhost_net_arch.png
 delete mode 100644 doc/guides/sample_app_ug/img/vhost_net_sample_app.png
 delete mode 100644 doc/guides/sample_app_ug/img/virtio_linux_vhost.png

diff --git a/doc/guides/sample_app_ug/img/qemu_virtio_net.png b/doc/guides/sample_app_ug/img/qemu_virtio_net.png
deleted file mode 100644
index 
a852c1662fe978e1dc785f4771ff285172ae5e86..
GIT binary patch
literal 0
HcmV?d1

literal 31557
zcmXtf1yoy2*EI$D;10!IQ at jMX;_gsfQz-7Ph2mP=p~VUA4lNc61h=5Yi at V#O_W$0y
z)?KWGWHNW=$UbN9iBeaU!$K!RM?gTpQjnL{L_m1u0>9Hy|AW6HlVv6h|AXu%si2Jt
zzx+`@M#5jCIm_$2As}G)|Gi(OvSO1XAkZKvNK0sY=N#txnnes|fths~T-5%P8)Cpp_Y2qtHdlY3r?RKZER-vvyX;Zwk)l
zcwc4A!eM?V;EN}XS<#j0m4^2o!aOq<4{`6Ba@@cjJ5?>KI`RNUj=xs`{=NR}k
zlDCQOLQG7YE*8t$Z){{_bkDV at +l4@!o~N$c5~IH+O6g?7h;`KwDGSop!TH`)ocL2J
zqQPg=x6?$t9rA^a0SQH?!HsB_$0X^S|=4viH;OS!84rB0?&0STnIC3M{w?w2{)i
zjm?eqFGCtY-Wi_Sf}~vG?*0^})YLCFb`Ax;r(Ar2-FXECbc8BWTF912|J85DsHCK%
z;K2v11WPL_C at EpIS5Gf53_TahPA@Kpg7-qDBtm{7 at IEAzd=F$sLcy|lXm%)z9{wSb
zioBRbs?D1bM?R%XkRSKJ$!ZZpY
z^?GU+dRjJlMMQEhP;cC;DbBQkt-43i}6trC6E^lSf~XeRml0d%-#!1YMlCX
z{b`RlT58t(Aolq^&5y6&`%qE7AS1rg)?viOvJ%X{nGhs85x-rEDHJBs?_>dNU%;9BqY|iiHMJ4)X at d!+t*c?l_b(!xMqK9pR8>h
z=}6GO+mU at Y`-}<-y32_fue2c9uHf&)TQ!Qpa7-|FkhJ4v4b>?Cga_jj{3q at -7Ob
z*k+3Kz`hKX^l`Cvbc8^TP(TU=

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 03:16:18PM +0100, Bruce Richardson wrote:
> On Fri, Oct 28, 2016 at 02:48:57PM +0100, Van Haaren, Harry wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Tuesday, October 25, 2016 6:49 PM
> > 
> > > 
> > > Hi Community,
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> > 
> > 
> > Hi All,
> > 
> > I've been looking at the eventdev API from a use-case point of view, and
> > I'm unclear on how the API caters for two uses. I have simplified these
> > as much as possible, think of them as a theoretical unit-test for the API :)
> > 
> > 
> > Fragmentation:
> > 1. Dequeue 8 packets
> > 2. Process 2 packets
> > 3. Processing 3rd, this packet needs fragmentation into two packets
> > 4. Process remaining 5 packets as normal
> > 
> > What function calls does the application make to achieve this?
> > In particular, I'm referring to how can the scheduler know that the 3rd 
> > packet is the one being fragmented, and how to keep packet order valid. 
> > 
> > 
> > Dropping packets:
> > 1. Dequeue 8 packets
> > 2. Process 2 packets
> > 3. Processing 3rd, this packet needs to be dropped
> > 4. Process remaining 5 packets as normal
> > 
> > What function calls does the application make to achieve this?
> > Again, in particular how does the scheduler know that the 3rd packet is 
> > being dropped.
> > 
> > 
> > Regards, -Harry
> 
> Hi,
> 
> these questions apply particularly to reordered which has a lot more
> complications than the other types in terms of sending packets back into
> the scheduler. However, atomic types will still suffer from problems
> with things the way they are - again if we assume a burst of 8 packets,
> then to forward those packets, we need to re-enqueue them again to the
> scheduler, and also then send 8 releases to the scheduler as well, to
> release the atomic locks for those packets.
> This means that for each packet we have to send two messages to a
> scheduler core, something that is really inefficient.
> 
> This number of messages is critical for any software implementation, as
> the cost of moving items core-to-core is going to be a big bottleneck
> (perhaps the biggest bottleneck) in the system. It's for this reason we
> need to use burst APIs - as with rte_rings.

I agree. That is the reason why we have rte_event_*_burst().

> 
> How we have solved this in our implementation, is to allow there to be
> an event operation type. The four operations we implemented are as below
> (using packet as a synonym for event here, since these would mostly
> apply to packets flowing through a system):
> 
> * NEW - just a regular enqueue of a packet, without any previous context

Makes sense. I was trying to derive it, but it makes sense for the application
to request it.

> * FORWARD - enqueue a packet, and mark the flow processing for the
> equivalent packet that was dequeued as completed, i.e.
>   release any atomic locks, or reorder this packet with
>   respect to any other outstanding packets from the event queue.

Default case

> * DROP- this is roughly equivalent to the existing "release" API call,
> except that having it as an enqueue type allows us to
>   release multiple items in a single call, and also to mix
>   releases with new packets and forwarded packets

Yes. This maps to rte_event_release(); with the index parameter, it more or
less does the job. But it makes sense as a flag, to enable bursts.
That calls for removing the index parameter, though. It looks like the index
parameter is an issue for the Intel implementation. If so, maybe we (Cavium)
can fill the index in the dequeue as implementation-specific bits, as Harry
suggested, and use it in the enqueue.
http://dpdk.org/ml/archives/dev/2016-October/049459.html

Any thoughts from NXP?

> * PARTIAL - this indicates that the packet being enqueued should be
>   treated according to the context of the current packet, but
>   that that context should not be released/completed by the
>   enqueue of this packet. This only really applies for
>   reordered events, and is needed to do fragmentation and or
>   multicast of packets with reordering.

I believe PARTIAL is something HW implementations will have trouble with.
I have outlined another way to fix this without coupling the fragmentation
logic into the scheduler.
http://dpdk.org/ml/archives/dev/2016-November/049707.html
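
To make the operation types above concrete, here is a minimal sketch in C. All
names in it (enum ev_op, struct ev and the two helpers) are hypothetical and
are not part of the proposed eventdev API. It only counts, for one mixed burst,
how many dequeued contexts the burst completes and how many events it hands to
the scheduler; whether PARTIAL should exist at all is still the open question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation types, mirroring the four discussed above. */
enum ev_op {
    EV_OP_NEW,     /* fresh event, no previous scheduling context */
    EV_OP_FORWARD, /* enqueue and complete the dequeued context */
    EV_OP_DROP,    /* complete the dequeued context, enqueue nothing */
    EV_OP_PARTIAL  /* enqueue, but keep the dequeued context open */
};

/* Hypothetical event; a real one would also carry flow and queue ids. */
struct ev {
    enum ev_op op;
    void *pkt;
};

/* Number of dequeued contexts that a burst completes (FORWARD or DROP). */
static size_t
burst_completed_contexts(const struct ev *evs, size_t n)
{
    size_t i, done = 0;

    assert(evs != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (evs[i].op == EV_OP_FORWARD || evs[i].op == EV_OP_DROP)
            done++;
    return done;
}

/* Number of events a burst hands to the scheduler (everything but DROP). */
static size_t
burst_enqueued_events(const struct ev *evs, size_t n)
{
    size_t i, sent = 0;

    assert(evs != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (evs[i].op != EV_OP_DROP)
            sent++;
    return sent;
}
```

With Harry's fragmentation case (8 dequeued, the 3rd split in two), one burst
of 9 events with a single PARTIAL completes 8 contexts. With his drop case,
one burst of 8 events with a single DROP completes 8 contexts and enqueues 7.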

If it makes sense for everyone, then maybe we can
- Introduce "event operation type" bits (NEW, DROP, FORWARD(may not required as 

[dpdk-dev] [PATCH] ethdev: fix statistics description

2016-11-02 Thread Mcnamara, John
> -Original Message-
> From: Dai, Wei
> Sent: Wednesday, November 2, 2016 8:29 AM
> To: Thomas Monjalon ; Mcnamara, John
> ; Ananyev, Konstantin
> ; Wu, Jingjing ;
> Zhang, Helin ; Dai, Wei ;
> Curran, Greg 
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] ethdev: fix statistics description
> 
> Hi, John & Greg
> 
> Would you please give any opinion for this patch ?
> 
> I have looked through all PMDs and found not all statistics items can be
> supported by some NIC.
> For example,  rx_nombuf,  q_ipackets,  q_opackets,  q_ibytes and q_obytes
> are not supported by i40e.
> But when the function rte_eth_stats_get(uint8_t port_id, struct
> rte_eth_stats *stats) is called for i40e PMD, Above un-supported
> statistics item in output stats are zero, this is not real value.
> So far, there is no way to know whether an item in struct rte_eth_stats is
> supported or not only from this structure definition.
> Maybe some structure member can be added to indicate each of statistics
> item valid or not.
> But this means ABI change.
> 
> In following list, I list statistics support details of all PMDs.
> Hope it can be displayed in your screen.

Hi,

Thanks for the analysis.

Perhaps we could add an API that returns a struct, or something similar, that
indicates which stats are returned by a PMD. An application that requires stats
could call it once to establish which stats are available. It would have to be
done in some way that wouldn't break the ABI every time a new stat is added.

Harry, Remy, how would this fit in with the existing stats scheme or the new
metrics library?

John


[dpdk-dev] [PATCH] ethdev: fix statistics description

2016-11-02 Thread Morten Brørup
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Dai, Wei
> Sent: Wednesday, November 2, 2016 9:29 AM
> To: Thomas Monjalon; Mcnamara, John; Ananyev, Konstantin; Wu, Jingjing;
> Zhang, Helin; Dai, Wei; Curran, Greg
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix statistics description
> 
> Hi, John & Greg
> 
> Would you please give any opinion for this patch ?
> 
> I have looked through all PMDs and found not all statistics items can
> be supported by some NIC.
> For example,  rx_nombuf,  q_ipackets,  q_opackets,  q_ibytes and
> q_obytes are not supported by i40e.
> But when the function rte_eth_stats_get(uint8_t port_id, struct
> rte_eth_stats *stats) is called for i40e PMD, Above un-supported
> statistics item in output stats are zero, this is not real value.
> So far, there is no way to know whether an item in struct rte_eth_stats
> is supported or not only from this structure definition.
> Maybe some structure member can be added to indicate each of statistics
> item valid or not.
> But this means ABI change.
> 
> 
> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Tuesday, October 4, 2016 5:35 PM
> > To: Dai, Wei ; Mcnamara, John
> > 
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix statistics description
> >
> > 2016-08-26 18:08, Wei Dai:
> > >  /**
> > >   * A structure used to retrieve statistics for an Ethernet port.
> > > + * Not all statistics fields in struct rte_eth_stats are supported
> > > + * by any type of network interface card (NIC). If any statistics
> > > + * field is not supported, its value is 0 .
> > >   */
> > >  struct rte_eth_stats {
> >
> > I'm missing the point of this patch.
> > Why do you think it is a fix?
> >
> > John, any opinion?
> 

I think the source code comment is an improvement. I would also like to see a 
source code comment describing the criteria for choosing which counters go in 
the eth_stats, and which counters are relegated to the eth_xstats.

It doesn't look like the Interfaces MIB (IF-MIB) or even the etherStats MIB 
drives this selection.

The ifOutQLen in the IF-MIB is deprecated; I guess it is not important for
network monitoring. But fast access to the queue lengths is obviously useful
for a fast path application with RED or other intelligent queue overflow
handling mechanisms, so in my mind they are useful here.

-Morten



[dpdk-dev] PCIe Hot Insert/Remove Support

2016-11-02 Thread Shreyansh Jain
Hello Ben,

Apologies for joining this discussion late.

On 10/24/2016 11:46 PM, Walker, Benjamin wrote:
> Hi all,
>
> My name is Ben Walker and I'm the technical lead for SPDK (it's like DPDK, but
> for storage devices). SPDK relies on DPDK only for the base functionality in
> the EAL - memory management, the rings, and the PCI scanning code. A key
> feature for storage devices is support for hot insert and remove, so we're
> currently working through how best to implement this for a user space driver.
> While doing this work, we've run into a few issues with the current DPDK
> PCI/device/driver framework that I'd like to discuss with this list. I'm not
> entirely ramped up on all of the current activity in this area or what the
> future plans are, so please educate me if something is coming that will
> address our current issues. I'm working off of the latest commit on the master
> branch as of today.

There has been some work recently ([1], [2]) which generalized the overall DPDK
EAL framework so that it does not treat devices as PCI only. As part of that,
the attach/detach routines were also generalized. But dynamically removing a
device during application initialization is still a grey area (from what I
understand).
There are some callbacks which the application can register (EAL interrupts),
but that depends entirely on the driver's ability to handle interrupts
generated by the device.

A complete restructuring of the EAL is still under discussion, and it might
include this point. There is a patch series floated by Jan (and subsequently
managed by me) here [3]. I also initiated a discussion along these lines at the
DPDK User Summit in Ireland (slide 18 of [4] might interest you).

[1] http://dpdk.org/ml/archives/dev/2016-January/031390.html
[2] http://dpdk.org/ml/archives/dev/2016-September/047099.html
[3] http://dpdk.org/ml/archives/dev/2016-October/049606.html
[4] https://dpdksummit.com/Archive/pdf/2016Userspace/Day02-Session03-ShreyanshJain-Userspace2016.pdf

>
> Today, there appears to be two lists - one of PCI devices and one of drivers.
> To update the list of PCI devices, you call rte_eal_pci_scan(), which scans
> the PCI bus. That call does not attempt to load any drivers. One scan is
> automatically performed when the eal is first initialized. To add or remove
> drivers from the driver list you call rte_eal_driver_register/unregister. To
> match drivers in the driver list to devices in the device list, you call
> rte_eal_pci_probe.

I agree with this general flow.

>
> There are a few problems with how the code works for us. First,
> rte_eal_pci_scan's algorithm will not correctly detect devices that are in its
> internal list but weren't found by the most recent PCI bus scan (i.e. they
> were hot removed). DPDK's scan doesn't seem to comprehend hot remove in any
> way.

Indeed. And this is not limited to the failure to detect hot-removed devices;
it also affects devices which are ethernet or crypto devices but are not PCI
bus compliant. Essentially, the whole DPDK scan model assumes that PCI-compliant
devices are available *before* the scan is performed and throughout the scan
process.
Hotplugging is limited to the application's ability to attach/detach devices.
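
For illustration only, a scan that does comprehend hot remove could reconcile
the known device list against the latest bus scan, roughly as in the sketch
below. The types and names (struct dev_entry, prune_hot_removed) are
hypothetical stand-ins, not the EAL's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device list entry, keyed by PCI address string. */
struct dev_entry {
    char addr[16]; /* e.g. "0000:84:00.1" */
    int in_use;    /* non-zero while the device is known to the list */
};

/* Return 1 if 'addr' is among the 'n' addresses found by the latest scan. */
static int
seen_in_scan(const char *addr, const char *const *scan, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(addr, scan[i]) == 0)
            return 1;
    return 0;
}

/* Drop known devices missing from the latest scan; return how many. */
static size_t
prune_hot_removed(struct dev_entry *list, size_t len,
                  const char *const *scan, size_t n)
{
    size_t i, removed = 0;

    assert(list != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (!list[i].in_use || seen_in_scan(list[i].addr, scan, n))
            continue;
        list[i].in_use = 0; /* a real implementation would detach here */
        removed++;
    }
    return removed;
}
```

The point of the sketch is only the direction of the comparison: the existing
list is checked against the scan result, not just the scan result against the
list.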

> Fortunately there is a public API to remove devices from the device list -
> rte_eal_pci_detach. That function will automatically unload any drivers
> associated with the device and then remove it from the list. There is a
> similar call for adding a device to the list - rte_eal_pci_probe_one, which
> will add a device to the device list and then automatically match it to
> drivers. I think if rte_eal_pci_scan is going to be a public interface (and it
> is), it needs to correctly comprehend the removal of PCI devices. Otherwise,
> make it a private

In some parallel discussions, there has been talk of making the scan non-PCI
centric, which would hide this API beneath another layer of generic scan.
Either way, clean handling of the hot-plugging case is indeed desirable.
See [4] above.

> API that is only called in response to rte_eal_init and only expose the public
> probe_one/detach calls for modifying the list of devices. My preference is for
> the former, not the latter.

The generic flow of a DPDK application is to start, rely on the DPDK framework
to find devices which have already been bound to the PCI drivers (bind script),
and then wait for probing to finish (rte_eal_init) before starting the I/O
setup (creating pools, queue setup, etc). The latter method you suggest doesn't
fit this model well. From what I make of it, applications would then be
responsible for attaching devices, which means calling extra APIs. In that
context, I too prefer the former (handling hotplug in the scan).

>
> Second, rte_eal_pci_probe will call the driver initialization functions each
> time a probe happens, even if the driver has already been successfully loaded.
> This tends to crash a lot of the PMDs. It seems to me like rte_eal_pci_probe is
> not 

[dpdk-dev] [PATCH] pci: Don't call probe callback if driver already loaded.

2016-11-02 Thread Shreyansh Jain
On 10/26/2016 3:20 AM, Ben Walker wrote:
> If the user asks to probe multiple times, the probe
> callback should only be called on devices that don't have
> a driver already loaded.
>
> This is useful if a driver is registered after the
> execution of a program has started and the list of devices
> needs to be re-scanned.
>
> Signed-off-by: Ben Walker 
> ---
>  lib/librte_eal/common/eal_common_pci.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/lib/librte_eal/common/eal_common_pci.c 
> b/lib/librte_eal/common/eal_common_pci.c
> index 638cd86..971ad20 100644
> --- a/lib/librte_eal/common/eal_common_pci.c
> +++ b/lib/librte_eal/common/eal_common_pci.c
> @@ -289,6 +289,10 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
>   if (dev == NULL)
>   return -1;
>
> + /* Check if a driver is already loaded */
> + if (dev->driver != NULL)
> + return 0;
> +

If the driver assigned to a device needs to be changed, would that mean the
application relies on a detach(dev) -> new-driver-plugged-in -> attach(dev)
sequence?
To me, the above change sounds fine, though I am not aware of any use case for
changing the driver assigned to a device. detach() -> attach() should work in
those cases, I think.

>   TAILQ_FOREACH(dr, &pci_driver_list, next) {
>   rc = rte_eal_pci_probe_one_driver(dr, dev);
>   if (rc < 0)
>
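
To spell out the sequence I have in mind, here is a minimal stand-alone sketch
in C. The structs and functions are hypothetical stand-ins (not the EAL types);
it only models the guard added by the patch and the detach -> attach path for
changing a driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver: counts how often its probe callback would run. */
struct drv {
    int probe_calls;
};

/* Hypothetical device: 'driver' is NULL until a driver is bound. */
struct dev {
    struct drv *driver;
};

/* Bind 'drv' to 'dev' unless a driver is already loaded on the device. */
static int
probe_once(struct dev *dev, struct drv *drv)
{
    if (dev == NULL || drv == NULL)
        return -1;
    if (dev->driver != NULL)
        return 0; /* already bound: do not call the probe callback again */
    drv->probe_calls++; /* stands in for the driver's probe callback */
    dev->driver = drv;
    return 0;
}

/* Unbind the current driver so that a later probe can bind a new one. */
static void
detach(struct dev *dev)
{
    assert(dev != NULL);
    dev->driver = NULL;
}
```

Probing twice leaves probe_calls at 1, and a different driver can only be bound
after detach() has cleared the pointer.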

-
Shreyansh



[dpdk-dev] [PATCH 0/3] fix Rx checksum offloads

2016-11-02 Thread Nelio Laranjeiro
Correctly fill the mbuf Rx offload flags.

Nelio Laranjeiro (3):
  net/mlx5: fix Rx checksum macros
  net/mlx5: define explicit fields for Rx offloads
  net/mlx: fix support for new Rx checksum flags

 drivers/net/mlx4/mlx4.c  | 21 --
 drivers/net/mlx5/mlx5_prm.h  | 37 +-
 drivers/net/mlx5/mlx5_rxtx.c | 93 
 3 files changed, 87 insertions(+), 64 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH 1/3] net/mlx5: fix Rx checksum macros

2016-11-02 Thread Nelio Laranjeiro
Add missing:

 - MLX5_CQE_RX_IPV4_PACKET
 - MLX5_CQE_RX_IPV6_PACKET
 - MLX5_CQE_RX_OUTER_IPV4_PACKET
 - MLX5_CQE_RX_OUTER_IPV6_PACKET
 - MLX5_CQE_RX_TUNNEL_PACKET
 - MLX5_CQE_RX_OUTER_IP_CSUM_OK
 - MLX5_CQE_RX_OUTER_TCP_UDP_CSUM_OK

Fixes: 51a50a3d9b8f ("net/mlx5: add definitions for data path without Verbs")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_prm.h  | 21 +
 drivers/net/mlx5/mlx5_rxtx.c | 16 
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 90b47f0..500f25a 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -84,6 +84,27 @@
 #define MLX5_OPCODE_TSO MLX5_OPCODE_LSO_MPW /* Compat with OFED 3.3. */
 #endif

+/* IPv4 packet. */
+#define MLX5_CQE_RX_IPV4_PACKET (1u << 2)
+
+/* IPv6 packet. */
+#define MLX5_CQE_RX_IPV6_PACKET (1u << 3)
+
+/* Outer IPv4 packet. */
+#define MLX5_CQE_RX_OUTER_IPV4_PACKET (1u << 7)
+
+/* Outer IPv6 packet. */
+#define MLX5_CQE_RX_OUTER_IPV6_PACKET (1u << 8)
+
+/* Tunnel packet bit in the CQE. */
+#define MLX5_CQE_RX_TUNNEL_PACKET (1u << 4)
+
+/* Outer IP checksum OK. */
+#define MLX5_CQE_RX_OUTER_IP_CSUM_OK (1u << 5)
+
+/* Outer UDP header and checksum OK. */
+#define MLX5_CQE_RX_OUTER_TCP_UDP_CSUM_OK (1u << 6)
+
 /* Subset of struct mlx5_wqe_eth_seg. */
 struct mlx5_wqe_eth_seg_small {
uint32_t rsvd0;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index ba8e202..7ebe557 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1096,19 +1096,19 @@ rxq_cq_to_pkt_type(volatile struct mlx5_cqe64 *cqe)
uint8_t flags = cqe->l4_hdr_type_etc;
uint8_t info = cqe->rsvd0[0];

-   if (info & IBV_EXP_CQ_RX_TUNNEL_PACKET)
+   if (info & MLX5_CQE_RX_TUNNEL_PACKET)
pkt_type =
TRANSPOSE(flags,
- IBV_EXP_CQ_RX_OUTER_IPV4_PACKET,
+ MLX5_CQE_RX_OUTER_IPV4_PACKET,
  RTE_PTYPE_L3_IPV4) |
TRANSPOSE(flags,
- IBV_EXP_CQ_RX_OUTER_IPV6_PACKET,
+ MLX5_CQE_RX_OUTER_IPV6_PACKET,
  RTE_PTYPE_L3_IPV6) |
TRANSPOSE(flags,
- IBV_EXP_CQ_RX_IPV4_PACKET,
+ MLX5_CQE_RX_IPV4_PACKET,
  RTE_PTYPE_INNER_L3_IPV4) |
TRANSPOSE(flags,
- IBV_EXP_CQ_RX_IPV6_PACKET,
+ MLX5_CQE_RX_IPV6_PACKET,
  RTE_PTYPE_INNER_L3_IPV6);
else
pkt_type =
@@ -1256,13 +1256,13 @@ rxq_cq_to_ol_flags(struct rxq *rxq, volatile struct 
mlx5_cqe64 *cqe)
 * of PKT_RX_EIP_CKSUM_BAD because the latter is not functional
 * (its value is 0).
 */
-   if ((info & IBV_EXP_CQ_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
+   if ((info & MLX5_CQE_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
ol_flags |=
TRANSPOSE(~cqe->l4_hdr_type_etc,
- IBV_EXP_CQ_RX_OUTER_IP_CSUM_OK,
+ MLX5_CQE_RX_OUTER_IP_CSUM_OK,
  PKT_RX_IP_CKSUM_BAD) |
TRANSPOSE(~cqe->l4_hdr_type_etc,
- IBV_EXP_CQ_RX_OUTER_TCP_UDP_CSUM_OK,
+ MLX5_CQE_RX_OUTER_TCP_UDP_CSUM_OK,
  PKT_RX_L4_CKSUM_BAD);
return ol_flags;
 }
-- 
2.1.4



[dpdk-dev] [PATCH 2/3] net/mlx5: define explicit fields for Rx offloads

2016-11-02 Thread Nelio Laranjeiro
This commit redefines the completion queue element structure as the
original lacks the required fields.

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_prm.h  | 16 -
 drivers/net/mlx5/mlx5_rxtx.c | 56 +---
 2 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 500f25a..7f31a2f 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -158,7 +158,21 @@ struct mlx5_cqe {
 #if (RTE_CACHE_LINE_SIZE == 128)
uint8_t padding[64];
 #endif
-   struct mlx5_cqe64 cqe64;
+   uint8_t pkt_info;
+   uint8_t rsvd0[11];
+   uint32_t rx_hash_res;
+   uint8_t rx_hash_type;
+   uint8_t rsvd1[11];
+   uint8_t hds_ip_ext;
+   uint8_t l4_hdr_type_etc;
+   uint16_t vlan_info;
+   uint8_t rsvd2[12];
+   uint32_t byte_cnt;
+   uint64_t timestamp;
+   uint8_t rsvd3[4];
+   uint16_t wqe_counter;
+   uint8_t rsvd4;
+   uint8_t op_own;
 };

 #endif /* RTE_PMD_MLX5_PRM_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 7ebe557..b6e0d65 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -83,10 +83,10 @@
  *   0 the first time.
  */
 static inline int
-check_cqe64_seen(volatile struct mlx5_cqe64 *cqe)
+check_cqe_seen(volatile struct mlx5_cqe *cqe)
 {
static const uint8_t magic[] = "seen";
-   volatile uint8_t (*buf)[sizeof(cqe->rsvd40)] = &cqe->rsvd40;
+   volatile uint8_t (*buf)[sizeof(cqe->rsvd3)] = &cqe->rsvd3;
int ret = 1;
unsigned int i;

@@ -101,9 +101,9 @@ check_cqe64_seen(volatile struct mlx5_cqe64 *cqe)
 #endif /* NDEBUG */

 static inline int
-check_cqe64(volatile struct mlx5_cqe64 *cqe,
-   unsigned int cqes_n, const uint16_t ci)
-   __attribute__((always_inline));
+check_cqe(volatile struct mlx5_cqe *cqe,
+ unsigned int cqes_n, const uint16_t ci)
+ __attribute__((always_inline));

 /**
  * Check whether CQE is valid.
@@ -119,8 +119,8 @@ check_cqe64(volatile struct mlx5_cqe64 *cqe,
  *   0 on success, 1 on failure.
  */
 static inline int
-check_cqe64(volatile struct mlx5_cqe64 *cqe,
-   unsigned int cqes_n, const uint16_t ci)
+check_cqe(volatile struct mlx5_cqe *cqe,
+ unsigned int cqes_n, const uint16_t ci)
 {
uint16_t idx = ci & cqes_n;
uint8_t op_own = cqe->op_own;
@@ -138,14 +138,14 @@ check_cqe64(volatile struct mlx5_cqe64 *cqe,
if ((syndrome == MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR) ||
(syndrome == MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR))
return 0;
-   if (!check_cqe64_seen(cqe))
+   if (!check_cqe_seen(cqe))
ERROR("unexpected CQE error %u (0x%02x)"
  " syndrome 0x%02x",
  op_code, op_code, syndrome);
return 1;
} else if ((op_code != MLX5_CQE_RESP_SEND) &&
   (op_code != MLX5_CQE_REQ)) {
-   if (!check_cqe64_seen(cqe))
+   if (!check_cqe_seen(cqe))
ERROR("unexpected CQE opcode %u (0x%02x)",
  op_code, op_code);
return 1;
@@ -174,25 +174,25 @@ txq_complete(struct txq *txq)
uint16_t elts_free = txq->elts_tail;
uint16_t elts_tail;
uint16_t cq_ci = txq->cq_ci;
-   volatile struct mlx5_cqe64 *cqe = NULL;
+   volatile struct mlx5_cqe *cqe = NULL;
volatile struct mlx5_wqe *wqe;

do {
-   volatile struct mlx5_cqe64 *tmp;
+   volatile struct mlx5_cqe *tmp;

-   tmp = &(*txq->cqes)[cq_ci & cqe_cnt].cqe64;
-   if (check_cqe64(tmp, cqe_n, cq_ci))
+   tmp = &(*txq->cqes)[cq_ci & cqe_cnt];
+   if (check_cqe(tmp, cqe_n, cq_ci))
break;
cqe = tmp;
 #ifndef NDEBUG
if (MLX5_CQE_FORMAT(cqe->op_own) == MLX5_COMPRESSED) {
-   if (!check_cqe64_seen(cqe))
+   if (!check_cqe_seen(cqe))
ERROR("unexpected compressed CQE, TX stopped");
return;
}
if ((MLX5_CQE_OPCODE(cqe->op_own) == MLX5_CQE_RESP_ERR) ||
(MLX5_CQE_OPCODE(cqe->op_own) == MLX5_CQE_REQ_ERR)) {
-   if (!check_cqe64_seen(cqe))
+   if (!check_cqe_seen(cqe))
ERROR("unexpected error CQE, TX stopped");
return;
}
@@ -1090,13 +1090,12 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct 
rte_mbuf **pkts,
  *   Packet type for struct rte_mbuf.
  */
 static inline uint32_t
-rxq_cq_to_pkt_type(volatile struct mlx5_cqe64 *cqe)
+rxq_cq_to_pkt_type(volatile struct mlx5_cqe *cqe)
 {
uint32_t pkt_type;
  

[dpdk-dev] [PATCH 3/3] net/mlx: fix support for new Rx checksum flags

2016-11-02 Thread Nelio Laranjeiro
Fixes: 5842289a546c ("mbuf: add new Rx checksum flags")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx4/mlx4.c  | 21 -
 drivers/net/mlx5/mlx5_rxtx.c | 25 ++---
 2 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index faa9acd..da61a85 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -2995,25 +2995,20 @@ rxq_cq_to_ol_flags(const struct rxq *rxq, uint32_t 
flags)

if (rxq->csum)
ol_flags |=
-   TRANSPOSE(~flags,
+   TRANSPOSE(flags,
  IBV_EXP_CQ_RX_IP_CSUM_OK,
- PKT_RX_IP_CKSUM_BAD) |
-   TRANSPOSE(~flags,
+ PKT_RX_IP_CKSUM_GOOD) |
+   TRANSPOSE(flags,
  IBV_EXP_CQ_RX_TCP_UDP_CSUM_OK,
- PKT_RX_L4_CKSUM_BAD);
-   /*
-* PKT_RX_IP_CKSUM_BAD and PKT_RX_L4_CKSUM_BAD are used in place
-* of PKT_RX_EIP_CKSUM_BAD because the latter is not functional
-* (its value is 0).
-*/
+ PKT_RX_L4_CKSUM_GOOD);
if ((flags & IBV_EXP_CQ_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
ol_flags |=
-   TRANSPOSE(~flags,
+   TRANSPOSE(flags,
  IBV_EXP_CQ_RX_OUTER_IP_CSUM_OK,
- PKT_RX_IP_CKSUM_BAD) |
-   TRANSPOSE(~flags,
+ PKT_RX_IP_CKSUM_GOOD) |
+   TRANSPOSE(flags,
  IBV_EXP_CQ_RX_OUTER_TCP_UDP_CSUM_OK,
- PKT_RX_L4_CKSUM_BAD);
+ PKT_RX_L4_CKSUM_GOOD);
return ol_flags;
 }

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b6e0d65..beff580 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1239,29 +1239,24 @@ rxq_cq_to_ol_flags(struct rxq *rxq, volatile struct 
mlx5_cqe *cqe)

if ((l3_hdr == MLX5_CQE_L3_HDR_TYPE_IPV4) ||
(l3_hdr == MLX5_CQE_L3_HDR_TYPE_IPV6))
-   ol_flags |=
-   (!(cqe->hds_ip_ext & MLX5_CQE_L3_OK) *
-PKT_RX_IP_CKSUM_BAD);
+   ol_flags |= TRANSPOSE(cqe->hds_ip_ext,
+ MLX5_CQE_L3_OK,
+ PKT_RX_IP_CKSUM_GOOD);
if ((l4_hdr == MLX5_CQE_L4_HDR_TYPE_TCP) ||
(l4_hdr == MLX5_CQE_L4_HDR_TYPE_TCP_EMP_ACK) ||
(l4_hdr == MLX5_CQE_L4_HDR_TYPE_TCP_ACK) ||
(l4_hdr == MLX5_CQE_L4_HDR_TYPE_UDP))
-   ol_flags |=
-   (!(cqe->hds_ip_ext & MLX5_CQE_L4_OK) *
-PKT_RX_L4_CKSUM_BAD);
-   /*
-* PKT_RX_IP_CKSUM_BAD and PKT_RX_L4_CKSUM_BAD are used in place
-* of PKT_RX_EIP_CKSUM_BAD because the latter is not functional
-* (its value is 0).
-*/
+   ol_flags |= TRANSPOSE(cqe->hds_ip_ext,
+ MLX5_CQE_L4_OK,
+ PKT_RX_L4_CKSUM_GOOD);
if ((cqe->pkt_info & MLX5_CQE_RX_TUNNEL_PACKET) && (rxq->csum_l2tun))
ol_flags |=
-   TRANSPOSE(~cqe->l4_hdr_type_etc,
+   TRANSPOSE(cqe->l4_hdr_type_etc,
  MLX5_CQE_RX_OUTER_IP_CSUM_OK,
- PKT_RX_IP_CKSUM_BAD) |
-   TRANSPOSE(~cqe->l4_hdr_type_etc,
+ PKT_RX_IP_CKSUM_GOOD) |
+   TRANSPOSE(cqe->l4_hdr_type_etc,
  MLX5_CQE_RX_OUTER_TCP_UDP_CSUM_OK,
- PKT_RX_L4_CKSUM_BAD);
+ PKT_RX_L4_CKSUM_GOOD);
return ol_flags;
 }

-- 
2.1.4



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of 
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work 
> > on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
>

Hi All,

Two queries,

1) In the SW implementation, is there any connection between "struct
rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth?
i.e. should it be enqueue_queue_depth >= dequeue_queue_depth?
I am thinking of adding such common checks in the common layer.

2) Any comments on the following item (the section between the --- markers) that needs improvement?
---
Abstract the differences in event QoS management with different
priority schemes available in different HW or SW implementations with portable
application workflow.

Based on the feedback, there are three different kinds of QoS support
available in three different HW or SW implementations.
1) Priority associated with the event queue
2) Priority associated with each event enqueue
(Same flow can have two different priority on two separate enqueue)
3) Priority associated with the flow(each flow has unique priority)

In v2, the differences are abstracted based on device capability
(RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes).
This scheme would call for a different application workflow for
nontrivial QoS-enabled applications.
---
After thinking about it for a while, I think RTE_EVENT_DEV_CAP_EVENT_QOS is a
superset. If so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be implemented with
RTE_EVENT_DEV_CAP_EVENT_QOS, i.e. we may not need two flags. Just one flag,
RTE_EVENT_DEV_CAP_EVENT_QOS, is enough to fix the portability issue with basic
QoS-enabled applications.

i.e. introduce RTE_EVENT_DEV_CAP_EVENT_QOS as a config option at the device
configure stage if the application needs fine granularity on QoS per event
enqueue. For trivial applications, the configured
rte_event_queue_conf->priority can be used as the priority in
rte_event_enqueue() (struct rte_event.priority).

Thoughts?

/Jerin




[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path

2016-11-02 Thread Maxime Coquelin


On 10/31/2016 11:01 AM, Wang, Zhihong wrote:
>
>
>> -Original Message-
>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>> Sent: Friday, October 28, 2016 3:42 PM
>> To: Wang, Zhihong ; Yuanhan Liu
>> 
>> Cc: stephen at networkplumber.org; Pierre Pfister (ppfister)
>> ; Xie, Huawei ; dev at 
>> dpdk.org;
>> vkaplans at redhat.com; mst at redhat.com
>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support
>> to the TX path
>>
>>
>>
>> On 10/28/2016 02:49 AM, Wang, Zhihong wrote:
>>>
> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Thursday, October 27, 2016 6:46 PM
> To: Maxime Coquelin 
> Cc: Wang, Zhihong ;
> stephen at networkplumber.org; Pierre Pfister (ppfister)
> ; Xie, Huawei ;
>> dev at dpdk.org;
> vkaplans at redhat.com; mst at redhat.com
> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>> support
> to the TX path
>
> On Thu, Oct 27, 2016 at 12:35:11PM +0200, Maxime Coquelin wrote:
>>>
>>>
>>> On 10/27/2016 12:33 PM, Yuanhan Liu wrote:
> On Thu, Oct 27, 2016 at 11:10:34AM +0200, Maxime Coquelin
>> wrote:
>>> Hi Zhihong,
>>>
>>> On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
> Hi Maxime,
>
> Seems indirect desc feature is causing serious performance
> degradation on Haswell platform, about 20% drop for both
> mrg=on and mrg=off (--txqflags=0xf00, non-vector version),
> both iofwd and macfwd.
>>> I tested PVP (with macswap on guest) and Txonly/Rxonly on an
>> Ivy
> Bridge
>>> platform, and didn't faced such a drop.
>
> I was actually wondering that may be the cause. I tested it with
> my IvyBridge server as well, I saw no drop.
>
> Maybe you should find a similar platform (Haswell) and have a try?
>>> Yes, that's why I asked Zhihong whether he could test Txonly in guest
>> to
>>> see if issue is reproducible like this.
>
> I have no Haswell box, otherwise I could do a quick test for you. IIRC,
> he tried to disable the indirect_desc feature, then the performance
> recovered. So, it's likely the indirect_desc is the culprit here.
>
>>> I will be easier for me to find an Haswell machine if it has not to be
>>> connected back to back to and HW/SW packet generator.
>>> In fact simple loopback test will also do, without pktgen.
>>>
>>> Start testpmd in both host and guest, and do "start" in one
>>> and "start tx_first 32" in another.
>>>
>>> Perf drop is about 24% in my test.
>>>
>>
>> Thanks, I never tried this test.
>> I managed to find an Haswell platform (Intel(R) Xeon(R) CPU E5-2699 v3
>> @ 2.30GHz), and can reproduce the problem with the loop test you
>> mention. I see a performance drop about 10% (8.94Mpps/8.08Mpps).
>> Out of curiosity, what are the numbers you get with your setup?
>
> Hi Maxime,
>
> Let's align our test case to RC2, mrg=on, loopback, on Haswell.
> My results below:
>  1. indirect=1: 5.26 Mpps
>  2. indirect=0: 6.54 Mpps
>
> It's about 24% drop.
OK, so on my side, same setup on Haswell:
1. indirect=1: 7.44 Mpps
2. indirect=0: 8.18 Mpps

Still 10% drop in my case with mrg=on.
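
A side note on the percentages, since the result depends on which rate is taken
as the reference. The helper names below are mine; the figures are the ones
quoted above.

```c
#include <assert.h>

/* Drop relative to the faster (indirect=0) rate, in percent. */
static double
drop_vs_baseline(double baseline_mpps, double degraded_mpps)
{
    assert(baseline_mpps > 0.0);
    return (baseline_mpps - degraded_mpps) / baseline_mpps * 100.0;
}

/* Gain of the faster rate relative to the slower (indirect=1) one. */
static double
gain_vs_degraded(double baseline_mpps, double degraded_mpps)
{
    assert(degraded_mpps > 0.0);
    return (baseline_mpps - degraded_mpps) / degraded_mpps * 100.0;
}
```

For 6.54 vs 5.26 Mpps this gives about 19.6% relative to indirect=0 and about
24% relative to indirect=1; for 8.18 vs 7.44 Mpps it gives about 9.0% and 9.9%.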

The strange thing with both of our figures is that they are below what I
obtain with my SandyBridge machine. The SB CPU frequency is 4% higher,
but that doesn't explain the gap between the measurements.

I'm continuing the investigations on my side.
Maybe we should fix a deadline, and decide to disable indirect in the
Virtio PMD if the root cause is not identified/fixed at some point?

Yuanhan, what do you think?

Regards,
Maxime


[dpdk-dev] [RFC v2] Generic flow director/filtering/classification API

2016-11-02 Thread Adrien Mazarguil
Hi Helin,

On Mon, Oct 31, 2016 at 07:19:18AM +, Zhang, Helin wrote:
> Hi Adrien
> 
> Just a double check, do you have any update on the v1 patch set, as now it is
> the end of October?
> We are extremely eager to see the v1 patch set for development.
> I don't think we need full validation on the v1 patch set for the API. It
> should come together with a PMD and an example application.
> If we can see the v1 API patch set earlier, we can help to validate it with
> our code changes. That should be more efficient and helpful.
> Any comments on my personal understanding?
> 
> Thank you very much for the hard work and kind helps!

I intend to send it shortly, likely this week. For the record, a large part
of this task was also dedicated to implementing it on the client side (I've
just read Wei's RFC for a client-side application, to which I will reply
separately) in order to validate it from a usability standpoint, which led me
to make a few necessary adjustments to the API.

My next submission will include both the updated API with several changes
discussed on this ML and testpmd code (not a separate application) that uses
it. Just hang on a bit longer!

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Friday, September 30, 2016 1:11 AM
> > To: dev at dpdk.org
> > Cc: Thomas Monjalon
> > Subject: Re: [dpdk-dev] [RFC v2] Generic flow 
> > director/filtering/classification
> > API
> > 
> > On Fri, Aug 19, 2016 at 08:50:44PM +0200, Adrien Mazarguil wrote:
> > > Hi All,
> > >
> > > Thanks to many for the positive and constructive feedback I've
> > > received so far. Here is the updated specification (v0.7) at last.
> > >
> > > I've attempted to address as many comments as possible but could not
> > > process them all just yet. A new section "Future evolutions" has been
> > > added for the remaining topics.
> > >
> > > This series adds rte_flow.h to the DPDK tree. Next time I will attempt
> > > to convert the specification as a documentation commit part of the
> > > patchset and actually implement API functions.
> > [...]
> > 
> > A quick update, we initially targeted 16.11 as the DPDK release this API 
> > would
> > be available for, turns out this goal was somewhat too optimistic as
> > September is ending and we are about to overshoot the deadline for
> > integration (basically everything took longer than expected, big surprise).
> > 
> > So instead of rushing things now to include a botched API in 16.11 with no
> > PMD support, we simply modified the target, now set to 17.02. On the plus
> > side this should leave developers more time to refine and test the API 
> > before
> > applications and PMDs start to use it.
> > 
> > I intend to send the patchset for the first non-draft version mid-October
> > worst case (ASAP in fact). I still haven't replied to several comments but 
> > did
> > take them into account, thanks for your feedback.
> > 
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > > > > -Original Message-
> > > > rte_event_queue_conf, with possible values:
> > > > * atomic
> > > > * ordered
> > > > * parallel
> > > > * mixed - allowing all 3 types. I think allowing 2 of three types might
> > > > make things too complicated.
> > > > 
> > > > An open question would then be how to behave when the queue type and
> > > > requested event type conflict. We can either throw an error, or just
> > > > ignore the event type and always treat enqueued events as being of the
> > > > queue type. I prefer the latter, because it's faster not having to
> > > > error-check, and it pushes the responsibility on the app to know what
> > > > it's doing.
> > > 
> > > How about making default as "mixed" and let application configures what
> > > is not required?. That way application responsibility is clear.
> > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT
> > > with default.
> > > 
> > I suppose it could work, but why bother doing that? If an app knows it's
> > only going to use one traffic type, why not let it just state what it
> > will do rather than try to specify what it won't do. If mixed is needed,
> 
> My thought was more inline with ethdev spec, like, ref-count is default,
> if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it is 
> OK, if
> you need other way.
> 
> > then it's easy enough to specify - and we can make it the zero/default
> > value too.
> 
> OK. Then we will make MIX as zero/default and add "allowed_event_types" in
> event queue config.
>

Bruce,

I have tried to make it "allowed_event_types" in the event queue config.
However, rte_event_queue_default_conf_get() can also take NULL for the default
configuration. So I think it makes sense to go with the negation approach,
like ethdev, to avoid confusion about the default. So
I am thinking of something like the below now:

[master][libeventdev] $ git diff
diff --git a/rte_eventdev.h b/rte_eventdev.h
index cf22b0e..cac4642 100644
--- a/rte_eventdev.h
+++ b/rte_eventdev.h
@@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
rte_event_dev_config *config);
  *
  *  \see rte_event_port_setup(), rte_event_port_link()
  */
+#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
+/**< Skip configuring atomic schedule type resources */
+#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
+/**< Skip configuring ordered schedule type resources */
+#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
+/**< Skip configuring parallel schedule type resources */

 /** Event queue configuration structure */
 struct rte_event_queue_conf {

Thoughts?


> /Jerin
> 
> > 
> > Our software implementation for now, only supports one type per queue -
> > which we suspect should meet a lot of use-cases. We'll have to see about
> > adding in mixed types in future.
> > 
> > /Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > > > > > -Original Message-
> > > > > rte_event_queue_conf, with possible values:
> > > > > * atomic
> > > > > * ordered
> > > > > * parallel
> > > > > * mixed - allowing all 3 types. I think allowing 2 of three types 
> > > > > might
> > > > > make things too complicated.
> > > > > 
> > > > > An open question would then be how to behave when the queue type and
> > > > > requested event type conflict. We can either throw an error, or just
> > > > > ignore the event type and always treat enqueued events as being of the
> > > > > queue type. I prefer the latter, because it's faster not having to
> > > > > error-check, and it pushes the responsibility on the app to know what
> > > > > it's doing.
> > > > 
> > > > How about making default as "mixed" and let application configures what
> > > > is not required?. That way application responsibility is clear.
> > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT
> > > > with default.
> > > > 
> > > I suppose it could work, but why bother doing that? If an app knows it's
> > > only going to use one traffic type, why not let it just state what it
> > > will do rather than try to specify what it won't do. If mixed is needed,
> > 
> > My thought was more inline with ethdev spec, like, ref-count is default,
> > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it is 
> > OK, if
> > you need other way.
> > 
> > > then it's easy enough to specify - and we can make it the zero/default
> > > value too.
> > 
> > OK. Then we will make MIX as zero/default and add "allowed_event_types" in
> > event queue config.
> >
> 
> Bruce,
> 
> I have tried to make it as "allowed_event_types" in event queue config.
> However, rte_event_queue_default_conf_get() can also take NULL for default
> configuration. So I think, It makes sense to go with negation approach
> like ethdev to define the default to avoid confusion on the default. So
> I am thinking like below now,
> 
> [master][libeventdev] $ git diff
> diff --git a/rte_eventdev.h b/rte_eventdev.h
> index cf22b0e..cac4642 100644
> --- a/rte_eventdev.h
> +++ b/rte_eventdev.h
> @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> rte_event_dev_config *config);
>   *
>   *  \see rte_event_port_setup(), rte_event_port_link()
>   */
> +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> +/**< Skip configuring atomic schedule type resources */
> +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> +/**< Skip configuring ordered schedule type resources */
> +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
> +/**< Skip configuring parallel schedule type resources */
> 
>  /** Event queue configuration structure */
>  struct rte_event_queue_conf {
> 
> Thoughts?
> 

I'm ok with the default as being all types, in the case where NULL is
specified for the parameter. It does make the most sense.

However, for the cases where the user does specify what they want, I
think it does make more sense, and is easier on the user for things to
be specified in a positive, rather than negative sense. For a user who
wants to just use atomic events, having to specify that as "not-reordered
and not-unordered" just isn't as clear! :-)
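
To make the comparison concrete, here is a rough sketch of the two styles. It is not the eventdev API: the SCHED_* bits and both helper names are invented for this illustration, and only the idea of NO*_TYPE flags comes from the diff quoted above. Both helpers resolve a zero value to "all types", which matches the NULL/default case.

```c
#include <stdint.h>

/* Hypothetical schedule-type bits, for illustration only. */
#define SCHED_ATOMIC   (1u << 0)
#define SCHED_ORDERED  (1u << 1)
#define SCHED_PARALLEL (1u << 2)
#define SCHED_ALL      (SCHED_ATOMIC | SCHED_ORDERED | SCHED_PARALLEL)

/* Negation style: the caller lists what it does NOT need.
 * Zero flags leaves every type enabled. */
static uint32_t allowed_from_negative(uint32_t no_flags)
{
    return SCHED_ALL & ~no_flags;
}

/* Positive style: the caller lists what it wants.
 * Zero is treated as "not specified", i.e. all types. */
static uint32_t allowed_from_positive(uint32_t allowed)
{
    return allowed == 0 ? SCHED_ALL : (allowed & SCHED_ALL);
}
```

An atomic-only queue is `allowed_from_positive(SCHED_ATOMIC)` in the positive style, versus `allowed_from_negative(SCHED_ORDERED | SCHED_PARALLEL)` in the negation style.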

/Bruce



[dpdk-dev] [PATCH v2] net/ring: remove unnecessary NULL check

2016-11-02 Thread Ferruh Yigit
Hi Mauricio,

On 11/1/2016 7:55 PM, Mauricio Vasquez B wrote:
> Coverity detected this as an issue because internals->data will never be NULL,
> then the check is not necessary.
> 
> Fixes: d082c0395bf6 ("ring: fix memory leak when detaching")
> Coverity issue: 137873
> 
> Signed-off-by: Mauricio Vasquez B 
> ---
>  drivers/net/ring/rte_eth_ring.c | 20 +---
>  1 file changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
> index 6d2a8c1..5ca00ed 100644
> --- a/drivers/net/ring/rte_eth_ring.c
> +++ b/drivers/net/ring/rte_eth_ring.c
> @@ -599,17 +599,15 @@ rte_pmd_ring_remove(const char *name)
>  
>   eth_dev_stop(eth_dev);
>  
> - if (eth_dev->data) {
> - internals = eth_dev->data->dev_private;
> - if (internals->action == DEV_CREATE) {
> - /*
> -  * it is only necessary to delete the rings in 
> rx_queues because
> -  * they are the same used in tx_queues
> -  */
> - for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
> - r = eth_dev->data->rx_queues[i];
> - rte_ring_free(r->rng);
> - }
> + internals = eth_dev->data->dev_private;
> + if (internals->action == DEV_CREATE) {
> + /*
> +  * it is only necessary to delete the rings in rx_queues because
> +  * they are the same used in tx_queues
> +  */
> + for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
> + r = eth_dev->data->rx_queues[i];
> + rte_ring_free(r->rng);
>   }
>  
>   rte_free(eth_dev->data->rx_queues);

This patch not only removes the NULL check but also changes the logic.
After the patch, rx_queues, tx_queues and dev_private are only freed if action is
DEV_CREATE, which is wrong.
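
A small model of the difference, as a sketch only. The struct and function names below are stand-ins invented for this note, not code from rte_eth_ring.c; counters replace the real rte_ring_free()/rte_free() calls so the two control flows can be compared.

```c
#include <stdbool.h>

/* Stand-in for the PMD state; names are hypothetical. */
struct model {
    bool created;     /* plays the role of action == DEV_CREATE */
    int rings_freed;  /* counts rte_ring_free() calls */
    int arrays_freed; /* counts rte_free() on rx_queues/tx_queues/dev_private */
};

/* Behaviour before the patch (minus the NULL check):
 * rings are freed only when created, the three arrays always. */
static void remove_before(struct model *m, int nb_rx_queues)
{
    if (m->created)
        m->rings_freed += nb_rx_queues;
    m->arrays_freed += 3;
}

/* Behaviour of v2: the three frees ended up inside the if block. */
static void remove_v2(struct model *m, int nb_rx_queues)
{
    if (m->created) {
        m->rings_freed += nb_rx_queues;
        m->arrays_freed += 3;
    }
}
```

With `created == false`, `remove_before()` still releases the three arrays while `remove_v2()` releases nothing, which is the leak being pointed out.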

> 



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 04:17:04PM +0530, Jerin Jacob wrote:
> On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> >
> 
> Hi All,
> 
> Two queries,
> 
> 1) In SW implementation, is there any connection between "struct
> rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth ?
> i.e it should be enqueue_queue_depth >= dequeue_queue_depth. Right ?
> Thought of adding the common checks in common layer.

I think this is probably best left to the driver layers to enforce. For
us, such a restriction doesn't really make sense, though in many cases
that would be the usual setup. For accurate load balancing, the dequeue
queue depth would be small, and the burst size would probably equal the
queue depth, meaning the enqueue depth needs to be at least as big.
However, for better throughput, or in cases where all traffic is being
coalesced to a single core e.g. for transmit out a network port, there
is no need to keep the dequeue queue shallow and so it can be many times
the burst size, while the enqueue queue can be kept to 1-2 times the
burst size.
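
As a sketch of what "left to the driver layers" could look like: the struct below is a local stand-in carrying only the two field names mentioned in this thread (the uint16_t width is an assumption), and both check functions are hypothetical, not part of any proposed eventdev header. One driver insists on enqueue depth >= dequeue depth, the other only rejects zero depths.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-in; field widths are assumed for illustration. */
struct port_conf {
    uint16_t dequeue_queue_depth;
    uint16_t enqueue_queue_depth;
};

/* Hypothetical driver that requires enqueue depth >= dequeue depth. */
static int strict_driver_check(const struct port_conf *c)
{
    if (c->dequeue_queue_depth == 0 || c->enqueue_queue_depth == 0)
        return -EINVAL;
    return c->enqueue_queue_depth >= c->dequeue_queue_depth ? 0 : -EINVAL;
}

/* Hypothetical driver with no ordering constraint between the depths. */
static int relaxed_driver_check(const struct port_conf *c)
{
    if (c->dequeue_queue_depth == 0 || c->enqueue_queue_depth == 0)
        return -EINVAL;
    return 0;
}
```

A transmit-side port with a deep dequeue queue and a shallow enqueue queue passes the relaxed check but would be rejected by a common-layer rule modelled on the strict one.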

> 
> 2)Any comments on follow item(section under ) that needs improvement.
> ---
> Abstract the differences in event QoS management with different
> priority schemes available in different HW or SW implementations with portable
> application workflow.
> 
> Based on the feedback, there three different kinds of QoS support
> available in
> three different HW or SW implementations.
> 1) Priority associated with the event queue
> 2) Priority associated with each event enqueue
> (Same flow can have two different priority on two separate enqueue)
> 3) Priority associated with the flow(each flow has unique priority)
> 
> In v2, The differences abstracted based on device capability
> (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
> RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme).
> This scheme would call for different application workflow for
> nontrivial QoS-enabled applications.
> ---
> After thinking a while, I think, RTE_EVENT_DEV_CAP_EVENT_QOS is a
> super-set.if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be
> implemented with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e We may not need two
> flags, Just one flag RTE_EVENT_DEV_CAP_EVENT_QOS is enough to fix
> portability issue with basic QoS enabled applications.
> 
> i.e Introduce RTE_EVENT_DEV_CAP_EVENT_QOS as config option in device
> configure stage if application needs fine granularity on QoS per event
> enqueue.For trivial applications, configured
> rte_event_queue_conf->priority can be used as rte_event_enqueue(struct
> rte_event.priority)
> 
So all implementations should support the concept of priority among
queues, and then there is optional support for event or flow based
prioritization. Is that a correct interpretation of what you propose?

/Bruce



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 01:36:34PM +0530, Jerin Jacob wrote:
> On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Tuesday, October 25, 2016 6:49 PM
> > 
> > > 
> > > Hi Community,
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> > 
> > 
> > Hi All,
> > 
> > I've been looking at the eventdev API from a use-case point of view, and 
> > I'm unclear on a how the API caters for two uses. I have simplified these 
> > as much as possible, think of them as a theoretical unit-test for the API :)
> > 
> > 
> > Fragmentation:
> > 1. Dequeue 8 packets
> > 2. Process 2 packets
> > 3. Processing 3rd, this packet needs fragmentation into two packets
> > 4. Process remaining 5 packets as normal
> > 
> > What function calls does the application make to achieve this?
> > In particular, I'm referring to how can the scheduler know that the 3rd 
> > packet is the one being fragmented, and how to keep packet order valid. 
> > 
> 
> OK. I will try to share my views on IP fragmentation on event _HW_
> models(at least on Cavium HW) then we can see, how we can converge.
> 
> First, The fragmentation specific logic should be decoupled from the event
> model as it specific to packet and L3 layer(Not specific to generic event)
> 
I would view fragmentation as just one example of a workload like this,
multicast and broadcast may be two other cases. Yes, they all apply to
packet, but the general feature support is just how to provide support
for one event generating multiple further events which should be linked
together for reordering. [I think this only really applies in the
reordered case - which leads to another question: in your experience
do you see other event types other than packet being handled in a
"reordered" manner?]

/Bruce



[dpdk-dev] Best Practices for PMD Verification before Upstream Requests

2016-11-02 Thread Shepard Siegel
Thomas and DPDK devs,

Almost a year into our DPDK development, we have shipped an alpha version
of our "Arkville" product. We're thankful for all the support from this
group. Most everyone has suggested "get your code upstream ASAP"; but our
team is cut from the "if it isn't tested, it doesn't work" cloth. We now
have some solid miles on our Arkville PMD driver "ark" with 16.07. Mostly
testpmd and a suite of user apps; dts not so much, only because our use
case is a little different. We expect almost all of our contribution would
land under $dpdk/drivers/net/ark . We are looking past 16.11 to possibly
jump on board when the 17.02 window opens in December. One question that
came up is "Should we do a thorough port and regression against 16.11 as a
precursor to upstreaming at 17.02?". Constructive feedback always welcome!

-Shep

Shepard Siegel, CTO
atomicrules.com

On Mon, Aug 22, 2016 at 9:07 AM, Thomas Monjalon 
wrote:

> 2016-08-17 08:34, Shepard Siegel:
> > Atomic Rules is new to the DPDK community. We attended the DPDK Summit
> last
> > week and received terrific advice and encouragement. We are developing a
> > DPDK PMD for our Arkville product which is a DPDK-aware data mover,
> capable
> > of marshaling packets between FPGA/ASIC gates with AXI interfaces on one
> > side, and the DPDK API/ABI on the other. Arkville plus a MAC looks like a
> > line-rate-agnostic bare-bones L2 NIC. We have testpmd and our first DPDK
> > applications running using our early-alpha Arkville PMD.
>
> Welcome :)
>
> Any release targeted for upstream support?
>
> 
>


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 11:45:07AM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 04:17:04PM +0530, Jerin Jacob wrote:
> > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > > 
> > > > So far, I have received constructive feedback from Intel, NXP and 
> > > > Linaro folks.
> > > > Let me know, if anyone else interested in contributing to the 
> > > > definition of eventdev?
> > > > 
> > > > If there are no major issues in proposed spec, then Cavium would like 
> > > > work on
> > > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > > in next version).
> > >
> > 
> > Hi All,
> > 
> > Two queries,
> > 
> > 1) In SW implementation, is there any connection between "struct
> > rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth ?
> > i.e it should be enqueue_queue_depth >= dequeue_queue_depth. Right ?
> > Thought of adding the common checks in common layer.
> 
> I think this is probably best left to the driver layers to enforce. For
> us, such a restriction doesn't really make sense, though in many cases
> that would be the usual setup. For accurate load balancing, the dequeue
> queue depth would be small, and the burst size would probably equal the
> queue depth, meaning the enqueue depth needs to be at least as big.
> However, for better throughput, or in cases where all traffic is being
> coalesced to a single core e.g. for transmit out a network port, there
> is no need to keep the dequeue queue shallow and so it can be many times
> the burst size, while the enqueue queue can be kept to 1-2 times the
> burst size.
> 

OK

> > 
> > 2)Any comments on follow item(section under ) that needs improvement.
> > ---
> > Abstract the differences in event QoS management with different
> > priority schemes available in different HW or SW implementations with 
> > portable
> > application workflow.
> > 
> > Based on the feedback, there three different kinds of QoS support
> > available in
> > three different HW or SW implementations.
> > 1) Priority associated with the event queue
> > 2) Priority associated with each event enqueue
> > (Same flow can have two different priority on two separate enqueue)
> > 3) Priority associated with the flow(each flow has unique priority)
> > 
> > In v2, The differences abstracted based on device capability
> > (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
> > RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme).
> > This scheme would call for different application workflow for
> > nontrivial QoS-enabled applications.
> > ---
> > After thinking a while, I think, RTE_EVENT_DEV_CAP_EVENT_QOS is a
> > super-set.if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be
> > implemented with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e We may not need two
> > flags, Just one flag RTE_EVENT_DEV_CAP_EVENT_QOS is enough to fix
> > portability issue with basic QoS enabled applications.
> > 
> > i.e Introduce RTE_EVENT_DEV_CAP_EVENT_QOS as config option in device
> > configure stage if application needs fine granularity on QoS per event
> > enqueue.For trivial applications, configured
> > rte_event_queue_conf->priority can be used as rte_event_enqueue(struct
> > rte_event.priority)
> > 
> So all implementations should support the concept of priority among
> queues, and then there is optional support for event or flow based
> prioritization. Is that a correct interpretation of what you propose?

Yes. If you _can_ implement it and if possible in the system.
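
One possible reading of that, as a sketch: the helper name and parameters below are invented for illustration and are not part of the proposed header. When the device was configured with per-event QoS, the priority carried by the event is honoured; otherwise the priority configured on the queue applies to every event enqueued to it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pick the priority a scheduler would honour. */
static uint8_t effective_priority(bool dev_has_event_qos,
                                  uint8_t queue_priority,
                                  uint8_t event_priority)
{
    /* Queue-level priority is the baseline every implementation has;
     * per-event priority overrides it only when the device supports it. */
    return dev_has_event_qos ? event_priority : queue_priority;
}
```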

> 
> /Bruce
> 


[dpdk-dev] [PATCH v2] net/ring: remove unnecessary NULL check

2016-11-02 Thread Fulvio Risso
Dear Ferruh,
Maybe I'm wrong, but I cannot see your point.
The code is absolutely the same, only the following line

if (eth_dev->data) {

is actually removed.

fulvio



On 02/11/2016 12:38, Ferruh Yigit wrote:
> Hi Mauricio,
>
> On 11/1/2016 7:55 PM, Mauricio Vasquez B wrote:
>> Coverity detected this as an issue because internals->data will never be 
>> NULL,
>> then the check is not necessary.
>>
>> Fixes: d082c0395bf6 ("ring: fix memory leak when detaching")
>> Coverity issue: 137873
>>
>> Signed-off-by: Mauricio Vasquez B 
>> ---
>>  drivers/net/ring/rte_eth_ring.c | 20 +---
>>  1 file changed, 9 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/ring/rte_eth_ring.c 
>> b/drivers/net/ring/rte_eth_ring.c
>> index 6d2a8c1..5ca00ed 100644
>> --- a/drivers/net/ring/rte_eth_ring.c
>> +++ b/drivers/net/ring/rte_eth_ring.c
>> @@ -599,17 +599,15 @@ rte_pmd_ring_remove(const char *name)
>>
>>  eth_dev_stop(eth_dev);
>>
>> -if (eth_dev->data) {
>> -internals = eth_dev->data->dev_private;
>> -if (internals->action == DEV_CREATE) {
>> -/*
>> - * it is only necessary to delete the rings in 
>> rx_queues because
>> - * they are the same used in tx_queues
>> - */
>> -for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
>> -r = eth_dev->data->rx_queues[i];
>> -rte_ring_free(r->rng);
>> -}
>> +internals = eth_dev->data->dev_private;
>> +if (internals->action == DEV_CREATE) {
>> +/*
>> + * it is only necessary to delete the rings in rx_queues because
>> + * they are the same used in tx_queues
>> + */
>> +for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
>> +r = eth_dev->data->rx_queues[i];
>> +rte_ring_free(r->rng);
>>  }
>>
>>  rte_free(eth_dev->data->rx_queues);
>
> This patch not only removes the NULL check but also changes the logic.
> After the patch, rx_queues, tx_queues and dev_private are only freed if action is
> DEV_CREATE, which is wrong.
>
>>
>


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 11:48:37AM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 01:36:34PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > > Sent: Tuesday, October 25, 2016 6:49 PM
> > > 
> > > > 
> > > > Hi Community,
> > > > 
> > > > So far, I have received constructive feedback from Intel, NXP and 
> > > > Linaro folks.
> > > > Let me know, if anyone else interested in contributing to the 
> > > > definition of eventdev?
> > > > 
> > > > If there are no major issues in proposed spec, then Cavium would like 
> > > > work on
> > > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > > in next version).
> > > 
> > > 
> > > Hi All,
> > > 
> > > I've been looking at the eventdev API from a use-case point of view, and 
> > > I'm unclear on a how the API caters for two uses. I have simplified these 
> > > as much as possible, think of them as a theoretical unit-test for the API 
> > > :)
> > > 
> > > 
> > > Fragmentation:
> > > 1. Dequeue 8 packets
> > > 2. Process 2 packets
> > > 3. Processing 3rd, this packet needs fragmentation into two packets
> > > 4. Process remaining 5 packets as normal
> > > 
> > > What function calls does the application make to achieve this?
> > > In particular, I'm referring to how can the scheduler know that the 3rd 
> > > packet is the one being fragmented, and how to keep packet order valid. 
> > > 
> > 
> > OK. I will try to share my views on IP fragmentation on event _HW_
> > models(at least on Cavium HW) then we can see, how we can converge.
> > 
> > First, The fragmentation specific logic should be decoupled from the event
> > model as it specific to packet and L3 layer(Not specific to generic event)
> > 
> I would view fragmentation as just one example of a workload like this,
> multicast and broadcast may be two other cases. Yes, they all apply to
> packet, but the general feature support is just how to provide support
> for one event generating multiple further events which should be linked
> together for reordering. [I think this only really applies in the

AFAIK, there are two different schemes to "maintain ordering". The first one
is based on "reordering buffers", i.e. a list data structure used to hold the
events first; then, when it comes to correcting the order (ORDERED->ATOMIC),
the order is corrected based on the previous "reordering buffers".
But some HW implementations use a "port" state based reordering scheme
(i.e. no external reorder buffer to keep track of the order).

So I think, to have a portable application workflow for the use case where
multiple events are generated from one event, the generated events need to be
stored in the parent event and processed downstream as required, like the
fragmentation example in

http://dpdk.org/ml/archives/dev/2016-November/049707.html

The above scheme should be OK in your implementation, right?


> reordered case - which leads to another question: in your experience
> do you see other event types other than packet being handled in a
> "reordered" manner?]

We use both timer events and crypto completion events etc. in ORDERED
type, but not with a one-event-creates-N-events scheme on those.

> 
> /Bruce
> 


[dpdk-dev] [PATCH v2] doc/guides: add more info about VT-d/iommu settings

2016-11-02 Thread Kusztal, ArkadiuszX


> -Original Message-
> From: Trahe, Fiona
> Sent: Wednesday, October 26, 2016 6:24 PM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo ; Trahe, Fiona
> ; Griffin, John 
> Subject: [PATCH v2] doc/guides: add more info about VT-d/iommu settings
> 
> Add more information about VT-d/iommu settings for QAT PMD.
> Remove limitation indicating QAT driver is not performance tuned.
> 
> Signed-off-by: Fiona Trahe 
> ---
> 
> v2:
>  clarified commit message
> 
> 
>  doc/guides/cryptodevs/qat.rst | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
> index 70bc2b1..bbe0b12 100644
> --- a/doc/guides/cryptodevs/qat.rst
> +++ b/doc/guides/cryptodevs/qat.rst
> 
> --
> 2.5.0
Acked-by: Arek Kusztal 



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > How about making default as "mixed" and let application configures 
> > > > > what
> > > > > is not required?. That way application responsibility is clear.
> > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, 
> > > > > ETH_TXQ_FLAGS_NOREFCOUNT
> > > > > with default.
> > > > > 
> > > > I suppose it could work, but why bother doing that? If an app knows it's
> > > > only going to use one traffic type, why not let it just state what it
> > > > will do rather than try to specify what it won't do. If mixed is needed,
> > > 
> > > My thought was more inline with ethdev spec, like, ref-count is default,
> > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it 
> > > is OK, if
> > > you need other way.
> > > 
> > > > then it's easy enough to specify - and we can make it the zero/default
> > > > value too.
> > > 
> > > OK. Then we will make MIX as zero/default and add "allowed_event_types" in
> > > event queue config.
> > >
> > 
> > Bruce,
> > 
> > I have tried to make it as "allowed_event_types" in event queue config.
> > However, rte_event_queue_default_conf_get() can also take NULL for default
> > configuration. So I think, It makes sense to go with negation approach
> > like ethdev to define the default to avoid confusion on the default. So
> > I am thinking like below now,
> > 
> > [master][libeventdev] $ git diff
> > diff --git a/rte_eventdev.h b/rte_eventdev.h
> > index cf22b0e..cac4642 100644
> > --- a/rte_eventdev.h
> > +++ b/rte_eventdev.h
> > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> > rte_event_dev_config *config);
> >   *
> >   *  \see rte_event_port_setup(), rte_event_port_link()
> >   */
> > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> > +/**< Skip configuring atomic schedule type resources */
> > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> > +/**< Skip configuring ordered schedule type resources */
> > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
> > +/**< Skip configuring parallel schedule type resources */
> > 
> >  /** Event queue configuration structure */
> >  struct rte_event_queue_conf {
> > 
> > Thoughts?
> > 
> 
> I'm ok with the default as being all types, in the case where NULL is
> specified for the parameter. It does make the most sense.

Yes. In that case I need to explicitly mention in the documentation what
the default is. With the RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it is quite
clear what the default is. Not adding up? :-)

> 
> However, for the cases where the user does specify what they want, I
> think it does make more sense, and is easier on the user for things to
> be specified in a positive, rather than negative sense. For a user who
> wants to just use atomic events, having to specify that as "not-reordered
> and not-unordered" just isn't as clear! :-)
> 
> /Bruce
> 


[dpdk-dev] [PATCH v2] net/ring: remove unnecessary NULL check

2016-11-02 Thread Ferruh Yigit
On 11/2/2016 12:49 PM, Fulvio Risso wrote:
> Dear Ferruh,
> Maybe I'm wrong, but I cannot see your point.
> The code is absolutely the same, only the following line
> 
> if (eth_dev->data) {
> 
> is actually removed.

Please double check the condition under which "rx_queues" is freed:

before the patch:
==
if (eth_dev->data) {
  internals = eth_dev->data->dev_private;
  if (internals->action == DEV_CREATE) {
    /*
     * it is only necessary to delete the rings in rx_queues because
     * they are the same used in tx_queues
     */
    for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
      r = eth_dev->data->rx_queues[i];
      rte_ring_free(r->rng);
    }
  }

  rte_free(eth_dev->data->rx_queues);
  rte_free(eth_dev->data->tx_queues);
  rte_free(eth_dev->data->dev_private);
}
==


After the patch:
==
internals = eth_dev->data->dev_private;
if (internals->action == DEV_CREATE) {
  /*
   * it is only necessary to delete the rings in rx_queues because
   * they are the same used in tx_queues
   */
  for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
    r = eth_dev->data->rx_queues[i];
    rte_ring_free(r->rng);
  }

  rte_free(eth_dev->data->rx_queues);
  rte_free(eth_dev->data->tx_queues);
  rte_free(eth_dev->data->dev_private);
}
==
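
To make the difference explicit, here is a minimal stand-alone model of the
two control flows. It is only a sketch: struct cleanup_result,
cleanup_before() and cleanup_v2() are hypothetical names that count which
cleanup steps run, and nothing in it calls into DPDK.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: counts which cleanup steps run; no DPDK calls. */
struct cleanup_result {
    int rings_freed;  /* stands in for the rte_ring_free() calls */
    int arrays_freed; /* stands in for rx_queues, tx_queues, dev_private */
};

/* Control flow before the patch: the three arrays are always freed. */
struct cleanup_result cleanup_before(bool dev_create, int nb_rx_queues)
{
    struct cleanup_result res = {0, 0};

    if (dev_create)
        res.rings_freed = nb_rx_queues;
    res.arrays_freed = 3;
    return res;
}

/* Control flow in v2: the frees ended up inside the DEV_CREATE branch. */
struct cleanup_result cleanup_v2(bool dev_create, int nb_rx_queues)
{
    struct cleanup_result res = {0, 0};

    if (dev_create) {
        res.rings_freed = nb_rx_queues;
        res.arrays_freed = 3;
    }
    return res;
}
```

With dev_create set to false, cleanup_before() still reports the three array
frees while cleanup_v2() reports none, so in the real driver those
allocations would be leaked.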


Thanks,
ferruh


[dpdk-dev] [PATCH v3] net/ring: remove unnecessary NULL check

2016-11-02 Thread Mauricio Vasquez B
Coverity detected this as an issue because eth_dev->data will never be NULL,
so the check is not necessary.

Fixes: d082c0395bf6 ("ring: fix memory leak when detaching")
Coverity issue: 137873

Signed-off-by: Mauricio Vasquez B 
---
 drivers/net/ring/rte_eth_ring.c | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index 6d2a8c1..c1767c4 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -599,24 +599,22 @@ rte_pmd_ring_remove(const char *name)

eth_dev_stop(eth_dev);

-   if (eth_dev->data) {
-   internals = eth_dev->data->dev_private;
-   if (internals->action == DEV_CREATE) {
-   /*
-* it is only necessary to delete the rings in rx_queues because
-* they are the same used in tx_queues
-*/
-   for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
-   r = eth_dev->data->rx_queues[i];
-   rte_ring_free(r->rng);
-   }
+   internals = eth_dev->data->dev_private;
+   if (internals->action == DEV_CREATE) {
+   /*
+* it is only necessary to delete the rings in rx_queues because
+* they are the same used in tx_queues
+*/
+   for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
+   r = eth_dev->data->rx_queues[i];
+   rte_ring_free(r->rng);
}
-
-   rte_free(eth_dev->data->rx_queues);
-   rte_free(eth_dev->data->tx_queues);
-   rte_free(eth_dev->data->dev_private);
}

+   rte_free(eth_dev->data->rx_queues);
+   rte_free(eth_dev->data->tx_queues);
+   rte_free(eth_dev->data->dev_private);
+
rte_free(eth_dev->data);

rte_eth_dev_release_port(eth_dev);
-- 
1.9.1



[dpdk-dev] [PATCH 0/2] update mlx5 release note and guide

2016-11-02 Thread Nelio Laranjeiro
Nelio Laranjeiro (2):
  doc: update mlx5 dependencies
  doc: add mlx5 release notes

 doc/guides/nics/mlx5.rst   |   8 +-
 doc/guides/rel_notes/release_16_11.rst | 136 ++---
 2 files changed, 114 insertions(+), 30 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH 1/2] doc: update mlx5 dependencies

2016-11-02 Thread Nelio Laranjeiro
Signed-off-by: Nelio Laranjeiro 
---
 doc/guides/nics/mlx5.rst | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 0d1fabb..98d1341 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -241,12 +241,12 @@ DPDK and must be installed separately:

 Currently supported by DPDK:

-- Mellanox OFED **3.3-1.0.0.0** and **3.3-2.0.0.0**.
+- Mellanox OFED **3.4-1.0.0.0**.

-- Minimum firmware version:
+- firmware version:

-  - ConnectX-4: **12.16.1006**
-  - ConnectX-4 Lx: **14.16.1006**
+  - ConnectX-4: **12.17.1010**
+  - ConnectX-4 Lx: **14.17.1010**

 Getting Mellanox OFED
 ~
-- 
2.1.4



[dpdk-dev] [PATCH 2/2] doc: add mlx5 release notes

2016-11-02 Thread Nelio Laranjeiro
Add list of tested and validated NICs too.

Signed-off-by: Nelio Laranjeiro 
---
 doc/guides/rel_notes/release_16_11.rst | 136 ++---
 1 file changed, 110 insertions(+), 26 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_11.rst b/doc/guides/rel_notes/release_16_11.rst
index aa0c09a..7447195 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -131,6 +131,13 @@ New Features
   The GCC 4.9 ``-march`` option supports the Intel processor code names.
   The config option ``RTE_MACHINE`` can be used to pass code names to the compiler as ``-march`` flag.

+* **Updated the mlx5 driver.**
+
+  The following changes were made to mlx5:
+
+  * Add support for RSS hash result
+  * Several performance improvements
+  * Several bug fixes

 Resolved Issues
 ---
@@ -265,47 +272,124 @@ The libraries prepended with a plus sign were incremented in this version.
 Tested Platforms
 

-.. This section should contain a list of platforms that were tested with this release.
+#. Intel(R) Server board S2600WTT

-   The format is:
+   - Processor: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz

-   #. Platform name.
+#. Intel(R) Server

-  * Platform details.
-  * Platform details.
+   - Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz

-   This section is a comment. Make sure to start the actual text at the margin.
+#. Intel(R) Server
+
+   - Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz

+#. IBM(R) Power8(R)
+
+   - Machine type-model: 8247-22L
+   - Firmware FW810.21 (SV810_108)
+   - Processor: POWER8E (raw), AltiVec supported

 Tested NICs
 ---

-.. This section should contain a list of NICs that were tested with this release.
+#. Mellanox(R) ConnectX(R)-4 10G MCX4111A-XCAT (1x10G)

-   The format is:
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010

-   #. NIC name.
+#. Mellanox(R) ConnectX(R)-4 10G MCX4121A-XCAT (2x10G)

-  * NIC details.
-  * NIC details.
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010

-   This section is a comment. Make sure to start the actual text at the margin.
+#. Mellanox(R) ConnectX(R)-4 25G MCX4111A-ACAT (1x25G)

+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010

-Tested OSes

+#. Mellanox(R) ConnectX(R)-4 25G MCX4121A-ACAT (2x25G)

-.. This section should contain a list of OSes that were tested with this release.
-   The format is as follows, in alphabetical order:
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010

-   * CentOS 7.0
-   * Fedora 23
-   * Fedora 24
-   * FreeBSD 10.3
-   * Red Hat Enterprise Linux 7.2
-   * SUSE Enterprise Linux 12
-   * Ubuntu 15.10
-   * Ubuntu 16.04 LTS
-   * Wind River Linux 8
+#. Mellanox(R) ConnectX(R)-4 40G MCX4131A-BCAT/MCX413A-BCAT (1x40G)

-   This section is a comment. Make sure to start the actual text at the margin.
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 40G MCX415A-BCAT (1x40G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 50G MCX4131A-GCAT/MCX413A-GCAT (1x50G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 50G MCX414A-BCAT (2x50G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 50G MCX415A-GCAT/MCX416A-BCAT/MCX416A-GCAT (2x50G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 100G MCX415A-CCAT (1x100G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 100G MCX416A-CCAT (2x100G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 12.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 Lx 10G MCX4121A-XCAT (2x10G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1015
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 14.17.1010
+
+#. Mellanox(R) ConnectX(R)-4 Lx 25G MCX4121A-ACAT (2x25G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1015
+   * MLNX_OFED: 3.4-1.0.0.0
+   * Firmware version: 14.17.1010
+
+Tested OSes
+---
+
+   * Red Hat Enterprise Linux Server release 6.7 (Santiago)
+   * 

[dpdk-dev] [PATCH v2] net/ring: remove unnecessary NULL check

2016-11-02 Thread Mauricio Vasquez
Dear Ferruh,

You are right, I messed up the brackets.

I already sent v3.

Thanks,

Mauricio.


On 11/02/2016 08:15 AM, Ferruh Yigit wrote:
> On 11/2/2016 12:49 PM, Fulvio Risso wrote:
>> Dear Ferruh,
>> Maybe I'm wrong, but I cannot see your point.
>> The code is absolutely the same, only the following line
>>
>>  if (eth_dev->data) {
>>
>> is actually removed.
> Please double check the condition "rx_queues" freed:
>
> before the patch:
> ==
> if (eth_dev->data) {
>internals = eth_dev->data->dev_private;
>if (internals->action == DEV_CREATE) {
>  /*
>   * it is only necessary to delete the rings in rx_queues because
>   * they are the same used in tx_queues
>   */
>  for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
>r = eth_dev->data->rx_queues[i];
>rte_ring_free(r->rng);
>  }
>}
>
>rte_free(eth_dev->data->rx_queues);
>rte_free(eth_dev->data->tx_queues);
>rte_free(eth_dev->data->dev_private);
> }
> ==
>
>
> After the patch:
> ==
> internals = eth_dev->data->dev_private;
> if (internals->action == DEV_CREATE) {
>/*
> * it is only necessary to delete the rings in rx_queues because
> * they are the same used in tx_queues
> */
>for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
>  r = eth_dev->data->rx_queues[i];
>  rte_ring_free(r->rng);
>}
>
>rte_free(eth_dev->data->rx_queues);
>rte_free(eth_dev->data->tx_queues);
>rte_free(eth_dev->data->dev_private);
> }
> ==
>
>
> Thanks,
> ferruh
>



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 06:39:27PM +0530, Jerin Jacob wrote:
> On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote:
> > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > > How about making default as "mixed" and let application configures 
> > > > > > what
> > > > > > is not required?. That way application responsibility is clear.
> > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, 
> > > > > > ETH_TXQ_FLAGS_NOREFCOUNT
> > > > > > with default.
> > > > > > 
> > > > > I suppose it could work, but why bother doing that? If an app knows 
> > > > > it's
> > > > > only going to use one traffic type, why not let it just state what it
> > > > > will do rather than try to specify what it won't do. If mixed is 
> > > > > needed,
> > > > 
> > > > My thought was more inline with ethdev spec, like, ref-count is default,
> > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it 
> > > > is OK, if
> > > > you need other way.
> > > > 
> > > > > then it's easy enough to specify - and we can make it the zero/default
> > > > > value too.
> > > > 
> > > > OK. Then we will make MIX as zero/default and add "allowed_event_types" 
> > > > in
> > > > event queue config.
> > > >
> > > 
> > > Bruce,
> > > 
> > > I have tried to make it as "allowed_event_types" in event queue config.
> > > However, rte_event_queue_default_conf_get() can also take NULL for default
> > > configuration. So I think, It makes sense to go with negation approach
> > > like ethdev to define the default to avoid confusion on the default. So
> > > I am thinking like below now,
> > > 
> > > ? [master][libeventdev] $ git diff
> > > diff --git a/rte_eventdev.h b/rte_eventdev.h
> > > index cf22b0e..cac4642 100644
> > > --- a/rte_eventdev.h
> > > +++ b/rte_eventdev.h
> > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> > > rte_event_dev_config *config);
> > >   *
> > >   *  \see rte_event_port_setup(), rte_event_port_link()
> > >   */
> > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> > > +/**< Skip configuring atomic schedule type resources */
> > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> > > +/**< Skip configuring ordered schedule type resources */
> > > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
> > > +/**< Skip configuring parallel schedule type resources */
> > > 
> > >  /** Event queue configuration structure */
> > >  struct rte_event_queue_conf {
> > > 
> > > Thoughts?
> > > 
> > 
> > I'm ok with the default as being all types, in the case where NULL is
> > specified for the parameter. It does make the most sense.
> 
> Yes. That case I need to explicitly mention in the documentation about what
> is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite
> understood what is default. Not adding up? :-)
> 

Would below not work? DEFAULT explicitly stated, and can be commented to
say all types allowed.

#define RTE_EVENT_QUEUE_CFG_DEFAULT 0
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES RTE_EVENT_QUEUE_CFG_DEFAULT
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY (1<<0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY (1<<1) 


/Bruce


[dpdk-dev] [PATCH v3] net/ring: remove unnecessary NULL check

2016-11-02 Thread Ferruh Yigit
On 11/2/2016 1:46 PM, Mauricio Vasquez B wrote:
> Coverity detected this as an issue because internals->data will never be NULL,
> then the check is not necessary.
> 
> Fixes: d082c0395bf6 ("ring: fix memory leak when detaching")
> Coverity issue: 137873
> 
> Signed-off-by: Mauricio Vasquez B 
> ---

Acked-by: Ferruh Yigit 



[dpdk-dev] [PATCH] app/test: fix wrong pointer values in crypto perftest

2016-11-02 Thread Trahe, Fiona


> -Original Message-
> From: Kusztal, ArkadiuszX
> Sent: Friday, October 28, 2016 12:37 PM
> To: dev at dpdk.org
> Cc: Trahe, Fiona ; De Lara Guarch, Pablo
> ; Griffin, John  intel.com>;
> Jain, Deepak K ; Kusztal, ArkadiuszX
> 
> Subject: [PATCH] app/test: fix wrong pointer values in crypto perftest
> 
> This commit fixes problem with device hanging because of
> wrong pointer values in snow3g performance test
> 
> Fixes: 97fe6461c7cb ("app/test: add SNOW 3G performance test")
> 
> Signed-off-by: Arek Kusztal 
> ---
Acked-by: Fiona Trahe 


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 01:56:27PM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 06:39:27PM +0530, Jerin Jacob wrote:
> > On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote:
> > > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> > > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > > > How about making default as "mixed" and let application 
> > > > > > > configures what
> > > > > > > is not required?. That way application responsibility is clear.
> > > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, 
> > > > > > > ETH_TXQ_FLAGS_NOREFCOUNT
> > > > > > > with default.
> > > > > > > 
> > > > > > I suppose it could work, but why bother doing that? If an app knows 
> > > > > > it's
> > > > > > only going to use one traffic type, why not let it just state what 
> > > > > > it
> > > > > > will do rather than try to specify what it won't do. If mixed is 
> > > > > > needed,
> > > > > 
> > > > > My thought was more inline with ethdev spec, like, ref-count is 
> > > > > default,
> > > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But 
> > > > > it is OK, if
> > > > > you need other way.
> > > > > 
> > > > > > then it's easy enough to specify - and we can make it the 
> > > > > > zero/default
> > > > > > value too.
> > > > > 
> > > > > OK. Then we will make MIX as zero/default and add 
> > > > > "allowed_event_types" in
> > > > > event queue config.
> > > > >
> > > > 
> > > > Bruce,
> > > > 
> > > > I have tried to make it as "allowed_event_types" in event queue config.
> > > > However, rte_event_queue_default_conf_get() can also take NULL for 
> > > > default
> > > > configuration. So I think, It makes sense to go with negation approach
> > > > like ethdev to define the default to avoid confusion on the default. So
> > > > I am thinking like below now,
> > > > 
> > > > ? [master][libeventdev] $ git diff
> > > > diff --git a/rte_eventdev.h b/rte_eventdev.h
> > > > index cf22b0e..cac4642 100644
> > > > --- a/rte_eventdev.h
> > > > +++ b/rte_eventdev.h
> > > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> > > > rte_event_dev_config *config);
> > > >   *
> > > >   *  \see rte_event_port_setup(), rte_event_port_link()
> > > >   */
> > > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> > > > +/**< Skip configuring atomic schedule type resources */
> > > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> > > > +/**< Skip configuring ordered schedule type resources */
> > > > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
> > > > +/**< Skip configuring parallel schedule type resources */
> > > > 
> > > >  /** Event queue configuration structure */
> > > >  struct rte_event_queue_conf {
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > I'm ok with the default as being all types, in the case where NULL is
> > > specified for the parameter. It does make the most sense.
> > 
> > Yes. That case I need to explicitly mention in the documentation about what
> > is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite
> > understood what is default. Not adding up? :-)
> > 
> 
> Would below not work? DEFAULT explicitly stated, and can be commented to
> say all types allowed.

All I was trying to do was avoid explicitly stating the default state. It is
not worth going back and forth over slow-path configuration, so I will keep
it as positive logic as you suggested :-), inspired by PKT_TX_L4_MASK:

#define RTE_EVENT_QUEUE_CFG_TYPE_MASK       (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES       (0ULL << 0) /**< Enable all types */
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY     (1ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY    (2ULL << 0)
#define RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY   (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER (1ULL << 2)
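
As a rough sketch of how a driver might decode such a field: the helper
names queue_type_name() and queue_is_single_consumer() are made up for
illustration, and the values are only the ones proposed in this mail, not a
final API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the proposal in this mail; not a final API. */
#define RTE_EVENT_QUEUE_CFG_TYPE_MASK       (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES       (0ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY     (1ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY    (2ULL << 0)
#define RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY   (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER (1ULL << 2)

/* Hypothetical helper: the schedule type is a 2-bit enumerated field. */
const char *queue_type_name(uint64_t cfg)
{
    switch (cfg & RTE_EVENT_QUEUE_CFG_TYPE_MASK) {
    case RTE_EVENT_QUEUE_CFG_ALL_TYPES:
        return "all types";
    case RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY:
        return "atomic only";
    case RTE_EVENT_QUEUE_CFG_ORDERED_ONLY:
        return "ordered only";
    case RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY:
        return "parallel only";
    default:
        return "invalid"; /* unreachable with a 2-bit mask */
    }
}

/* Hypothetical helper: single consumer is an independent flag bit. */
int queue_is_single_consumer(uint64_t cfg)
{
    return (cfg & RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER) != 0;
}
```

A zero-initialized configuration decodes as "all types", which keeps the
default implicit, and the single-consumer bit can be OR-ed with any type
value because it sits outside the mask.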

> 
> #define RTE_EVENT_QUEUE_CFG_DEFAULT 0
> #define RTE_EVENT_QUEUE_CFG_ALL_TYPES RTE_EVENT_QUEUE_CFG_DEFAULT
> #define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY (1<<0)
> #define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY (1<<1) 
> 
> 
> /Bruce


[dpdk-dev] [PATCH] doc: announce ABI changes in filtering support

2016-11-02 Thread Stroe, Laura
Self-Nack.
After an internal review of ABI breakage announcements we found a way of 
achieving this without an ABI change.

-Original Message-
From: Stroe, Laura 
Sent: Friday, September 23, 2016 12:23 PM
To: dev at dpdk.org
Cc: Stroe, Laura 
Subject: [PATCH] doc: announce ABI changes in filtering support

From: Laura Stroe 

This patch adds a notice that the ABI for filter types functionality will be 
enhanced in the 17.02 release with new operation available to manipulate the 
tunnel filters:
replace filter types.

Signed-off-by: Laura Stroe 
---
 doc/guides/rel_notes/deprecation.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 1a3831f..1cd1d2c 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -57,3 +57,12 @@ Deprecation Notices
 * API will change for ``rte_port_source_params`` and ``rte_port_sink_params``
   structures. The member ``file_name`` data type will be changed from
   ``char *`` to ``const char *``. This change targets release 16.11.
+
+* In 17.02 ABI changes are planned: the ``rte_filter_op `` enum will be extended
+  with a new member RTE_ETH_FILTER_REPLACE in order to facilitate
+  the new operation - replacing the tunnel filters,
+  the ``rte_eth_tunnel_filter_conf`` structure will be extended with a new field
+  ``filter_type_replace`` handling the bitmask combination of the filter types
+  defined by the values  ETH_TUNNEL_FILTER_XX,
+  define new values for Outer VLAN and Outer Ethertype filters
+  ETH_TUNNEL_FILTER_OVLAN and ETH_TUNNEL_FILTER_OETH.
--
2.5.5



[dpdk-dev] dpdk16.11 RC2 package ipv4 reassembly example can't work

2016-11-02 Thread Adrien Mazarguil
Hi all,

On Wed, Nov 02, 2016 at 08:39:31AM +, Lu, Wenzhuo wrote:
> Correct the typo of receiver.
> 
> Hi Adrien,
> The change from struct ip_frag_pkt pkt[0]  to struct ip_frag_pkt pkt[] will 
> make IP reassembly not working. I think this is not the root cause. Maybe 
> Konstantin can give us some idea.
> But I notice one thing, you change some from [0] to [], but others just add 
> '__extension__'. I believe if you add '__extension__' for struct ip_frag_pkt 
> pkt[0], we'll not hit this issue. Just curious why you use 2 ways to resolve 
> the same problem.

I've used the __extension__ method whenever the C99 syntax could not work
due to invalid usage in the code, e.g. a flexible array cannot be the only
member of a struct, you cannot make arrays out of structures that contain
such fields, while there is no such constraint with the GNU syntax.

For example see __extension__ uint8_t action_data[0] in struct
rte_pipeline_table_entry. The C99 syntax could not be used because of
test_table_acl.c:

  struct rte_pipeline_table_entry entries[5];

If replacing ip_frag_pkt pkt[] with __extension__ ip_frag_pkt pkt[0] in
rte_ip_frag.h solves the issue, either some code is breaking some constraint
somewhere or this change broke the ABI (unlikely considering a simple
recompilation should have taken care of the issue). I did not notice any
change in sizeof(struct rte_ip_frag_tbl) nor offsetof(struct
rte_ip_frag_tbl, pkt) on my setup, perhaps the compilation flags used in
your test affect them somehow.

Can you confirm whether only reverting this particular field solves the
issue?
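
For reference, a small sketch of the two declarations side by side. The
demo_* types are hypothetical stand-ins, not the real rte_ip_frag.h
definitions, and __extension__ assumes GCC or Clang. On such compilers I
would expect both layouts to report the same offset and size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-packet entry; not the DPDK structure. */
struct demo_pkt {
    uint64_t start;
    uint32_t len;
};

/* C99 syntax: flexible array member, must be the last of several members. */
struct demo_tbl_c99 {
    uint32_t nb_entries;
    struct demo_pkt pkt[];
};

/* GNU syntax: zero-length array, marked so that pedantic builds accept it. */
struct demo_tbl_gnu {
    uint32_t nb_entries;
    __extension__ struct demo_pkt pkt[0];
};

/* Accessors so the two layouts can be compared at run time. */
size_t c99_pkt_offset(void) { return offsetof(struct demo_tbl_c99, pkt); }
size_t gnu_pkt_offset(void) { return offsetof(struct demo_tbl_gnu, pkt); }
size_t c99_tbl_size(void) { return sizeof(struct demo_tbl_c99); }
size_t gnu_tbl_size(void) { return sizeof(struct demo_tbl_gnu); }
```

If the two offsets or sizes differ with your compiler and flags, that would
point at a layout change rather than at a constraint violation elsewhere in
the code.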

> From: Xu, HuilongX
> Sent: Wednesday, November 2, 2016 4:29 PM
> To: drien.mazarguil at 6wind.com
> Cc: Ananyev, Konstantin; Liu, Yu Y; Chen, WeichunX; Lu, Wenzhuo; Xu, HuilongX
> Subject: dpdk16.11 RC2 package ipv4 reassembly example can't work
> 
> Hi mazarguil,
> I find ip reassembly example can't work with dpdk16.11 rc2 package.
> But when I reset dpdk code before 347a1e037fd323e6c2af55d17f7f0dc4bfe1d479, 
> it works ok.
> Could you have time to check this issue, thanks  a lot.
> Unzip password: intel123
> 
> Test detail info:
> 
> os:4.2.3-300.fc23.x86_64
> gcc version:5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)
> NIC:03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection 
> X552/X557-AT 10GBASE-T [8086:15ad] and
> 84:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit 
> SFI/SFP+ Network Connection [8086:10fb] (rev 01)
> package: dpdk16.11.rc2.tar.gz
> test steps:
> 1. build and install dpdk
> 2. build ip_reassembly example
> 3. run ip_reassembly
> ./examples/ip_reassembly/build/ip_reassembly -c 0x2 -n 4 - -p 0x1 
> --maxflows=1024 --flowttl=10s
> 4. set tester port mtu
> ip link set mtu 9000 dev ens160f1
> 5. setup scapy on tester and send packet
> scapy
> pcap = rdpcap("file.pcap")
> sendp(pcap, iface="ens160f1")
> 6. sniff packet on tester and check packet
> test result:
> dpdk16.04 reassembly packet successful but dpdk16.11 reassembly pack failed.
> 
> comments:
> file.pcap: send packets pcap file
> tcpdump_16.04_reassembly_successful.pcap: sniff packets by tcpdump on 16.04.
> tcpdump_reset_code_reassembly_failed.pcap: sniff packets by tcpdump on 16.11
> reset_code_reassembly_successful_.jpg: reassembly a packets successful detail 
> info
> dpdk16.11_reassembly_failed.jpg: reassembly a packets failed detail info
> 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] Possible memory corruption due to incorrect DMA shutdown

2016-11-02 Thread George Prekas
I have posted the following messages on users at dpdk.org and it seems that 
it was the wrong mailing list:

http://dpdk.org/ml/archives/users/2016-March/000340.html

http://dpdk.org/ml/archives/users/2016-September/001026.html

I also include the message here for convenience:

I can consistently reproduce this behavior following these steps:

On host A do:

$ sudo modprobe uio
$ sudo insmod ./build/kmod/igb_uio.ko
$ sudo python ./tools/dpdk_nic_bind.py --bind=igb_uio :42:00.1
$ sudo ./build/app/testpmd -- --forward-mode=icmpecho

 From host B (which is on the same local network), do an arping to an 
arbitrary IP address C.C.C.C (that is in the same network and it doesn't 
belong to another host). DPDK will respond to any IP address. All you 
want is to populate host B's ARP cache. Then terminate DPDK and run the 
following Python program on host B:

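import sys
import socket

# Payload: the marker string (argv[3]) repeated argv[2] times.
payload = sys.argv[3].encode() * int(sys.argv[2])

# Send one UDP datagram to the target address (argv[1]) on port 1.
for i in range(1):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload, (sys.argv[1], 1))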

as:

$ python go.py C.C.C.C 28 prekageo_was_here_

Then go back to host A and use the physical memory grep tool that you 
can find here https://github.com/prekageo/allmemscan

$ make
$ sudo insmod allmem.ko
$ sudo ./allmemscan 28 prekageo_was_here

Optionally:

$ sudo dd if=/dev/allmem bs=4K skip=PAGE count=1 | hexdump -C | less

On 11/03/2016 19:41, George Prekas wrote:
> Hi. I've been using DPDK for a research project 
> (https://www.usenix.org/conference/osdi14/technical-sessions/presentation/belay)
>  
> for over 2 years and I'd like to report a behavior that puzzles me 
> using DPDK.
>
> The behavior leads to memory corruption and is caused by the incorrect 
> shutdown of DMA. I can reproduce it after executing the following steps:
>
> $ sudo modprobe uio
> $ sudo insmod ./build/kmod/igb_uio.ko
> $ sudo python ./tools/dpdk_nic_bind.py --bind=igb_uio :42:00.1
> $ sudo ./build/app/testpmd -- --forward-mode=icmpecho
>
> Then I terminate the DPDK program (after populating the ARP cache of 
> another host on the local network). After this, I can send UDP packets 
> to the host and can observe their payload in host memory. Clearly, 
> network packets are arriving to the network card and are written to 
> RAM after DPDK has finished executing.
>
> Am I doing something wrong? Is this behavior expected?
>
> Regards,
> George
>



[dpdk-dev] [RFC]Generic flow filtering API Sample Application

2016-11-02 Thread Adrien Mazarguil
Hi Wei,

On Wed, Nov 02, 2016 at 05:27:50AM +, Zhao1, Wei wrote:
> Hi  All,
> Now we are planning for an sample application for Generic flow 
> filtering API feature, and I have finished the RFC for this example app.
> Now  Adrien Mazarguil  has send v2 version of Generic flow 
> filtering API,  this sample application  RFC is based on that.
> 
> Thank you.

Thanks for your RFC, sorry for the late notice that I've been essentially
working on a similar implementation in testpmd in order to validate the API
before sending v1, which I concede is taking way longer than expected.

I have yet to submit my patches however this should happen soon, if you
haven't started working on your own implementation yet, please wait until my
implementation gets rejected to avoid any more duplicated effort in the
meantime.

BTW, I find a lot of similarities between our respective command-line
handling approaches, which is great! We're going in the same direction.

> Generic flow filtering API Sample Application
> 
> 
> The application is a simple example of generic flow filtering API using the 
> DPDK.
> The application performs flow director/filtering/classification in packet 
> processing.
> 
> Overview
> 
> 
> The application demonstrates the use of the generic flow
> director/filtering/classification API in the DPDK to implement packet
> forwarding. This document focuses on guidelines for writing rule
> configuration files and on prompt command usage. It also defines the
> available EAL option arguments that are useful in DPDK packet forwarding
> processing.
> 
> 
> Compiling the Application
> -
> 
> To compile the application:
> 
> #. Go to the sample application directory:
> 
>   .. code-block:: console
> 
>   export RTE_SDK=/path/to/rte_sdk
>   cd ${RTE_SDK}/examples/gen_filter
> 
> #. Set the target (a default target is used if not specified). For example:
> 
>   .. code-block:: console
> 
>   export RTE_TARGET=x86_64-native-linuxapp-gcc
> 
>   See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
> 
> #. Build the application:
> 
>   .. code-block:: console
> 
>   make
> 
> Running the Application
> ---
> The application has a number of EAL options::
> 
>   ./gen_filter [EAL options] -- 
> 
> EAL options:
> * -c
>   Coremask, set the hexadecimal bitmask of the cores to run on.
> 
> * -n
>   Num, set the number of memory channels to use.
> 
> APP PARAMS:
>   The following are the application options parameters, they must be 
> separated
>   from the EAL options with a "--" separator.
> 
> * -i
>   Interactive, run this app in interactive mode. In this mode, the app
>   starts with a prompt that can be used to start and stop forwarding and
>   to manage the generic filter rules configured in the application; see
>   the description below for more details. In non-interactive mode, the
>   application starts with the configuration specified on the command line
>   and immediately enters forwarding mode.
> 
> * --portmask=0xXX
>   Set the hexadecimal bitmask of the ports which can be used by the 
> generic flow director test in packet forwarding.
>   
> * --coremask=0xXX
>   Set the hexadecimal bitmask of the cores running the packet forwarding 
> test. The master
>   lcore is reserved for command line parsing only and cannot be masked on 
> for packet forwarding.
> 
> * --nb-ports=N 
>   Set the number of forwarding ports, where 1 <= N <= "number of ports" 
> on the board
>   or CONFIG_RTE_MAX_ETHPORTS from the configuration file. The default 
> value is the number of ports on the board.
> 
> * --rxq=N
>   Set the number of RX queues per port to N, where 1 <= N <= 65535. The 
> default value is 1.
> 
> * --txq=N
>   Set the number of TX queues per port to N, where 1 <= N <= 65535. The 
> default value is 1.
> 
> 
> ###this part need to complete later after decision of which EAL commands 
> arguments need to be support in this application###
> 
> 
> Interactive mode
> 
> *   When the gen_filter application is started in interactive mode
>   (-i|--interactive), it displays a prompt that can be used to start and
>   stop forwarding, configure the application to set the Flow Director,
>   display statistics, and perform other tasks. The application has a
>   number of command-line options:
> 
>   gen_filter>[Commands]
> 
> * There is a prompt "gen_filter> " before cursor, command can be enter 
> after that position,
>   also a space bar between configuration file name and command.
> 
> These are the commands that are currently working under the command line 
> interface:
> 
> * Control Commands
> 
>   help: show the following commands which are 
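
The --portmask/--coremask and --rxq/--txq options quoted above take a hexadecimal bitmask and a bounded count respectively. A minimal sketch of how such arguments could be validated, assuming plain standard C; the `sketch_*` names are hypothetical and are not taken from gen_filter or any DPDK library:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hexadecimal bitmask such as "0x1c".
 * Returns 0 on success, -1 on malformed input or an empty mask. */
static int
sketch_parse_hexmask(const char *arg, uint64_t *mask)
{
	char *end = NULL;
	unsigned long long v;

	assert(arg != NULL && mask != NULL);
	errno = 0;
	v = strtoull(arg, &end, 16); /* base 16 accepts an optional 0x prefix */
	if (errno != 0 || end == arg || *end != '\0' || v == 0)
		return -1;
	*mask = (uint64_t)v;
	return 0;
}

/* Hypothetical helper: check a queue count against the documented
 * range 1 <= N <= 65535. Returns 0 on success, -1 otherwise. */
static int
sketch_parse_queue_count(const char *arg, uint16_t *n)
{
	char *end = NULL;
	unsigned long v;

	assert(arg != NULL && n != NULL);
	errno = 0;
	v = strtoul(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0' || v < 1 || v > 65535)
		return -1;
	*n = (uint16_t)v;
	return 0;
}
```

Rejecting an all-zero mask is a choice made for this sketch only; the quoted text does not say how the application treats it.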

[dpdk-dev] [PATCH] net/mlx5: fix wrong use of vector instruction

2016-11-02 Thread Adrien Mazarguil
On Tue, Nov 01, 2016 at 08:13:27AM +, Elad Persiko wrote:
> The alignment constraint was not respected in Tx.
> 
> Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
> 
> Signed-off-by: Elad Persiko 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 21164ba..ba8e202 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -309,7 +309,7 @@ mlx5_tx_dbrec(struct txq *txq)
>   *txq->qp_db = htonl(txq->wqe_ci);
>   /* Ensure ordering between DB record and BF copy. */
>   rte_wmb();
> - rte_mov16(dst, (uint8_t *)data);
> + memcpy(dst, (uint8_t *)data, 16);
>   txq->bf_offset ^= (1 << txq->bf_buf_size);
>  }
>  
> @@ -449,7 +449,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
>   wqe->eseg.mss = 0;
>   wqe->eseg.rsvd2 = 0;
>   /* Start by copying the Ethernet Header. */
> - rte_mov16((uint8_t *)raw, (uint8_t *)addr);
> + memcpy((uint8_t *)raw, ((uint8_t *)addr), 16);
>   length -= MLX5_WQE_DWORD_SIZE;
>   addr += MLX5_WQE_DWORD_SIZE;
>   /* Replace the Ethernet type by the VLAN if necessary. */
> -- 
> 1.8.3.1

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND
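
For readers following the change: the patch above swaps rte_mov16() for a plain 16-byte memcpy(). A minimal sketch of that idea in standard C follows; the helper name `copy16` is hypothetical and is not part of the mlx5 driver. The point it illustrates is only that memcpy() makes no assumption about the alignment of either pointer:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the mlx5 driver: copy exactly
 * 16 bytes without assuming anything about pointer alignment. */
static inline void
copy16(uint8_t *dst, const uint8_t *src)
{
	assert(dst != NULL && src != NULL);
	/* A constant-size memcpy() is valid for any alignment; the
	 * compiler is free to expand it into suitable load/store
	 * instructions for the target. */
	memcpy(dst, src, 16);
}
```

Whether the compiler emits vector instructions for this call depends on the target and optimisation level, which this sketch does not try to control.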


[dpdk-dev] [PATCH 0/3] fix Rx checksum offloads

2016-11-02 Thread Adrien Mazarguil
On Wed, Nov 02, 2016 at 11:39:36AM +0100, Nelio Laranjeiro wrote:
> Correctly fill the mbuf Rx offloads.
> 
> Nelio Laranjeiro (3):
>   net/mlx5: fix Rx checksum macros
>   net/mlx5: define explicit fields for Rx offloads
>   net/mlx: fix support for new Rx checksum flags
> 
>  drivers/net/mlx4/mlx4.c  | 21 --
>  drivers/net/mlx5/mlx5_prm.h  | 37 +-
>  drivers/net/mlx5/mlx5_rxtx.c | 93 
> 
>  3 files changed, 87 insertions(+), 64 deletions(-)
> 
> -- 
> 2.1.4

Thanks. For the series:

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH 0/2] update mlx5 release note and guide

2016-11-02 Thread Adrien Mazarguil
On Wed, Nov 02, 2016 at 02:46:42PM +0100, Nelio Laranjeiro wrote:
> Nelio Laranjeiro (2):
>   doc: update mlx5 dependencies
>   doc: add mlx5 release notes
> 
>  doc/guides/nics/mlx5.rst   |   8 +-
>  doc/guides/rel_notes/release_16_11.rst | 136 ++---
>  2 files changed, 114 insertions(+), 30 deletions(-)
> 
> -- 
> 2.1.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH] scripts: fix quiet checkpatch

2016-11-02 Thread Thomas Monjalon
The commit e13fbc065c7f ("scripts: improve quiet checkpatch")
removed the line "total: 1 errors, 0 warnings, 7 lines checked"
from the quiet report.
Later, commit e7c38f471384 ("scripts: remove useless checkpatch notes")
removed a few lines before "total:.*lines checked", so it was not working
well for quiet reporting.

Better to keep the "total:" line in quiet mode and remove the other ones.
That's why the checkpatch.pl option --no-summary is not used anymore
by reverting the commit "improve quiet checkpatch".

Signed-off-by: Thomas Monjalon 
---
 scripts/checkpatches.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/checkpatches.sh b/scripts/checkpatches.sh
index 622a5b6..5286fe6 100755
--- a/scripts/checkpatches.sh
+++ b/scripts/checkpatches.sh
@@ -64,7 +64,7 @@ verbose=false
 while getopts hn:qv ARG ; do
case $ARG in
n ) number=$OPTARG ;;
-   q ) quiet=true && options="$options --no-summary" ;;
+   q ) quiet=true ;;
v ) verbose=true ;;
h ) print_usage ; exit 0 ;;
? ) print_usage ; exit 1 ;;
-- 
2.7.0



[dpdk-dev] [PATCH] scripts: add standard input to checkpatch

2016-11-02 Thread Thomas Monjalon
It is now possible to check a patch by providing an email
through stdin.
It is especially useful for automating a checkpatch run when
receiving an email.

Signed-off-by: Thomas Monjalon 
---
 scripts/checkpatches.sh | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatches.sh b/scripts/checkpatches.sh
index 5286fe6..0f3ed9d 100755
--- a/scripts/checkpatches.sh
+++ b/scripts/checkpatches.sh
@@ -53,7 +53,7 @@ print_usage () {
Run Linux kernel checkpatch.pl with DPDK options.
The environment variable DPDK_CHECKPATCH_PATH must be set.

-   The patches to check can be from files specified on the command line,
+   The patches to check can be from stdin, files specified on the command line,
or latest git commits limited with -n option (default limit: origin/master).
END_OF_HELP
 }
@@ -90,6 +90,8 @@ check () { #   
elif [ -n "$2" ] ; then
report=$(git format-patch --no-stat --stdout -1 $commit |
$DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
+   else
+   report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
fi
[ $? -ne 0 ] || continue
$verbose || printf '\n### %s\n\n' "$3"
@@ -97,7 +99,15 @@ check () { #   
status=$(($status + 1))
 }

-if [ -z "$1" ] ; then
+if [ ! -t 0 ] ; then # stdin
+   subject=$(while read header value ; do
+   if [ "$header" = 'Subject:' ] ; then
+   echo $value
+   break
+   fi
+   done)
+   check '' '' "$subject"
+elif [ -z "$1" ] ; then
if [ $number -eq 0 ] ; then
commits=$(git rev-list --reverse origin/master..)
else
-- 
2.7.0
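
The stdin branch in the patch above scans the incoming email for the first "Subject:" header and uses its value as the report title. The same scan can be sketched in C, purely as an illustration; `sketch_find_subject` is a hypothetical name, and unlike the shell loop (which splits on the first word) this version matches the header as a line prefix:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of the first "Subject:" header in
 * `email` into `out` (NUL-terminated, truncated to fit).
 * Returns 0 if a header was found, -1 otherwise. */
static int
sketch_find_subject(const char *email, char *out, size_t outlen)
{
	static const char key[] = "Subject:";
	const size_t keylen = sizeof(key) - 1;
	const char *line = email;

	assert(email != NULL && out != NULL && outlen > 0);
	while (*line != '\0') {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len >= keylen && strncmp(line, key, keylen) == 0) {
			const char *val = line + keylen;
			size_t vlen = len - keylen;

			/* Skip blanks between the header name and its value. */
			while (vlen > 0 && (*val == ' ' || *val == '\t')) {
				val++;
				vlen--;
			}
			if (vlen >= outlen)
				vlen = outlen - 1;
			memcpy(out, val, vlen);
			out[vlen] = '\0';
			return 0; /* stop at the first match, like the `break` */
		}
		if (eol == NULL)
			break;
		line = eol + 1;
	}
	return -1;
}
```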



[dpdk-dev] [PATCH] igb_uio: fix build with backported kernel

2016-11-02 Thread martin_curran-g...@keysight.com
Hi,

Sorry, I'm struggling to see what happened to this thread.

I managed to get DPDK 2.2.0 to build on CentOS 6.8 by sorting out the
MSIX_ENTRY_CTRL_MASKBIT issue.

But I'm trying to get 16.07 to run on 6.8, and am hitting the
vlan_tx_tag_present(_skb) problem.

I tried just putting a bare
#define  vlan_tx_tag_present(_skb) 0
line in the two kcompat.h files,
one for igb and one for ixgbe,

but I'm hitting other issues now.

/root/mcgray/dpdk-16.07/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_misc.c:441:20:
 error: macro "alloc_netdev" passed 4 arguments, but takes just 3

I had already turned off KNI in my config file, so why is dpdk-setup.sh even
trying to build this stuff?

I don't need KNI as far as I know.

I saw mention of a backported kernel?

I guess my 16.07 is a few months old now; if I go and get another download,
will this all just go away?

Thanks

Sorry, this stuff is all a bit beyond my experience so far.



Martin Curran-Gray


[dpdk-dev] [PATCH] igb_uio: fix build with backported kernel

2016-11-02 Thread Ferruh Yigit
On 11/2/2016 4:19 PM, martin_curran-gray at keysight.com wrote:
> Hi,
> 
> Sorry, I'm struggling to see what happened to this thread.
> 
> I managed to get DPDK 2.2.0 to build on CentOS 6.8 by sorting out the
> MSIX_ENTRY_CTRL_MASKBIT issue.
> 
> But I'm trying to get 16.07 to run on 6.8, and am hitting the
> vlan_tx_tag_present(_skb) problem.
> 
> I tried just putting a bare
> #define  vlan_tx_tag_present(_skb) 0
> line in the two kcompat.h files,
> one for igb and one for ixgbe,
> 
> but I'm hitting other issues now.
> 
> /root/mcgray/dpdk-16.07/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/kni_misc.c:441:20:
>  error: macro "alloc_netdev" passed 4 arguments, but takes just 3
> 
> I had already turned off KNI in my config file, so why is dpdk-setup.sh even
> trying to build this stuff?

I guess it is not disabled properly. How are you disabling KNI?

> 
> I don't need KNI as far as I know.
> 
> I saw mention of a backported kernel?
> 
> I guess my 16.07 is a few months old now; if I go and get another download,
> will this all just go away?
> 
> Thanks
> 
> Sorry, this stuff is all a bit beyond my experience so far.
> 
> 
> 
> Martin Curran-Gray
> 



[dpdk-dev] [PATCH] igb_uio: fix build with backported kernel

2016-11-02 Thread martin_curran-g...@keysight.com
Hi,

I set
CONFIG_RTE_LIBRTE_KNI=n
in common_linuxapp.

Hmmm, I didn't set
CONFIG_RTE_KNI_KMOD=n
It was still set to y.

Let's see...

Ah, success!

Thanks!

M.



[dpdk-dev] [PATCH] E1000: fix for forced speed/duplex config

2016-11-02 Thread Ferruh Yigit
Hi Ananda,

Thank you for the patch. Can you please take care of a few minor issues?

Patch tag should be: "net/e1000:", so patch title becomes:
"net/e1000: fix for forced speed/duplex config"

On 11/1/2016 10:47 PM, Ananda Sathyanarayana wrote:
> From the code, it looks like, hw->mac.autoneg, variable is used to
> switch between calling either autoneg function or forcing
> speed/duplex function. But this variable is not modified in
> eth_em_start/eth_igb_start routines (it is always set to 1)
> even while forcing the link speed.
> 
> Following discussion thread has some more information on
> this
> 
> http://dpdk.org/ml/archives/dev/2016-October/049272.html

Requires a fixes line:
http://dpdk.org/doc/guides/contributing/patches.html#commit-messages-body

> 
> Signed-off-by: Ananda Sathyanarayana 

You can keep Wenzhuo's ack for next version of the patch.
Acked-by: Wenzhuo Lu 

> ---
>  drivers/net/e1000/em_ethdev.c  | 16 ++--
>  drivers/net/e1000/igb_ethdev.c | 16 ++--
>  2 files changed, 28 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
> index 7cf5f0c..a2412f5 100644
> --- a/drivers/net/e1000/em_ethdev.c
> +++ b/drivers/net/e1000/em_ethdev.c
> @@ -639,6 +639,7 @@ eth_em_start(struct rte_eth_dev *dev)
>   speeds = &dev->data->dev_conf.link_speeds;
>   if (*speeds == ETH_LINK_SPEED_AUTONEG) {
>   hw->phy.autoneg_advertised = E1000_ALL_SPEED_DUPLEX;
> +hw->mac.autoneg = 1;

checkpatch gives many whitespace errors.
From the coding style document:
"Global whitespace rule in DPDK, use tabs for indentation, spaces for
alignment."

And how to use checkpatch:
http://dpdk.org/doc/guides/contributing/patches.html#checking-the-patches

>   } else {
>   num_speeds = 0;
>   autoneg = (*speeds & ETH_LINK_SPEED_FIXED) == 0;
> @@ -672,9 +673,20 @@ eth_em_start(struct rte_eth_dev *dev)
>   hw->phy.autoneg_advertised |= ADVERTISE_1000_FULL;
>   num_speeds++;
>   }
> - if (num_speeds == 0 || (!autoneg && (num_speeds > 1)))
> + if (num_speeds == 0 || (!autoneg && (num_speeds > 1))) {

No need to update this line; the DPDK coding style doesn't require
braces for a single-line statement:
http://dpdk.org/doc/guides/contributing/coding_style.html#control-statements-and-loops

>   goto error_invalid_config;
> - }
> +}
> +/*
> + * Set/reset the mac.autoneg based on the link speed,
> + * fixed or not
> + */
> +if (!autoneg) {
> +hw->mac.autoneg = 0;
> +hw->mac.forced_speed_duplex = hw->phy.autoneg_advertised;

This line is over the 80-character limit.

> +} else {
> +hw->mac.autoneg = 1;
> +}
> +}
>  
>   e1000_setup_link(hw);
>  
> diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c

The same comments are valid for this file too.

Thanks,
ferruh



[dpdk-dev] [PATCH] crypto: clarify how crypto operations affect buffers

2016-11-02 Thread Fiona Trahe
Updated comments on API to clarify which parts of mbufs are
copied or changed by crypto operations.

Signed-off-by: Fiona Trahe 
---
 lib/librte_cryptodev/rte_crypto_sym.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/lib/librte_cryptodev/rte_crypto_sym.h b/lib/librte_cryptodev/rte_crypto_sym.h
index 693774e..d3d38e4 100644
--- a/lib/librte_cryptodev/rte_crypto_sym.h
+++ b/lib/librte_cryptodev/rte_crypto_sym.h
@@ -366,6 +366,25 @@ struct rte_cryptodev_sym_session;
  * it must have a valid *rte_mbuf* structure attached, via m_src parameter,
  * which contains the source data which the crypto operation is to be performed
  * on.
+ * While the mbuf is in use by a crypto operation no part of the mbuf should be
+ * changed by the application as the device may read or write to any part of the
+ * mbuf. In the case of hardware crypto devices some or all of the mbuf
+ * may be DMAed in and out of the device, so writing over the original data,
+ * though only the part specified by the rte_crypto_sym_op for transformation
+ * will be changed.
+ * Out-of-place (OOP) operation, where the source mbuf is different to the
+ * destination mbuf, is a special case. Data will be copied from m_src to m_dst.
+ * The part copied includes all the parts of the source mbuf that will be
+ * operated on, based on the cipher.data.offset+cipher.data.length and
+ * auth.data.offset+auth.data.length values in the rte_crypto_sym_op. The part
+ * indicated by the cipher parameters will be transformed, any extra data around
+ * this indicated by the auth parameters will be copied unchanged from source to
+ * destination mbuf.
+ * Also in OOP operation the cipher.data.offset and auth.data.offset apply to
+ * both source and destination mbufs. As these offsets are relative to the
+ * data_off parameter in each mbuf this can result in the data written to the
+ * destination buffer being at a different alignment, relative to buffer start,
+ * to the data in the source buffer.
  */
 struct rte_crypto_sym_op {
struct rte_mbuf *m_src; /**< source mbuf */
-- 
2.5.0
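
The last paragraph of the comment added above (offsets being relative to data_off in each mbuf) can be made concrete with a toy model. This is only a sketch: `struct sketch_mbuf` is a hypothetical stand-in carrying two fields named after their rte_mbuf counterparts, and the data_off values used with it are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the two rte_mbuf fields that matter here. */
struct sketch_mbuf {
	uint8_t *buf_addr;  /* start of the underlying buffer */
	uint16_t data_off;  /* where packet data begins within the buffer */
};

/* Distance from the buffer start of the byte that an operation offset
 * (e.g. cipher.data.offset) refers to in this particular mbuf. */
static size_t
sketch_offset_from_buf_start(const struct sketch_mbuf *m, uint32_t op_offset)
{
	assert(m != NULL);
	return (size_t)m->data_off + op_offset;
}
```

With a source data_off of 128 and a destination data_off of 66, the same operation offset of 14 lands at byte 142 in the source buffer and byte 80 in the destination buffer, so the two regions have different alignment relative to their buffer starts.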



[dpdk-dev] [PATCH v2] E1000: fix for forced speed/duplex config

2016-11-02 Thread Ananda Sathyanarayana
Fixed the formatting/syntax issues reported.

From the code, it looks like the hw->mac.autoneg variable is used to
switch between calling either autoneg function or forcing
speed/duplex function. But this variable is not modified in
eth_em_start/eth_igb_start routines (it is always set to 1)
even while forcing the link speed.

The following discussion thread has more information on
this:

http://dpdk.org/ml/archives/dev/2016-October/049272.html

Signed-off-by: Ananda Sathyanarayana 
---
 drivers/net/e1000/em_ethdev.c  | 12 ++++++++++++
 drivers/net/e1000/igb_ethdev.c | 12 ++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 7cf5f0c..aee3d34 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -639,6 +639,7 @@ eth_em_start(struct rte_eth_dev *dev)
speeds = &dev->data->dev_conf.link_speeds;
if (*speeds == ETH_LINK_SPEED_AUTONEG) {
hw->phy.autoneg_advertised = E1000_ALL_SPEED_DUPLEX;
+   hw->mac.autoneg = 1;
} else {
num_speeds = 0;
autoneg = (*speeds & ETH_LINK_SPEED_FIXED) == 0;
@@ -674,6 +675,17 @@ eth_em_start(struct rte_eth_dev *dev)
}
if (num_speeds == 0 || (!autoneg && (num_speeds > 1)))
goto error_invalid_config;
+
+   /* Set/reset the mac.autoneg based on the link speed,
+* fixed or not
+*/
+   if (!autoneg) {
+   hw->mac.autoneg = 0;
+   hw->mac.forced_speed_duplex =
+   hw->phy.autoneg_advertised;
+   } else {
+   hw->mac.autoneg = 1;
+   }
}

e1000_setup_link(hw);
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 4924396..2fddf0c 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1327,6 +1327,7 @@ eth_igb_start(struct rte_eth_dev *dev)
speeds = &dev->data->dev_conf.link_speeds;
if (*speeds == ETH_LINK_SPEED_AUTONEG) {
hw->phy.autoneg_advertised = E1000_ALL_SPEED_DUPLEX;
+   hw->mac.autoneg = 1;
} else {
num_speeds = 0;
autoneg = (*speeds & ETH_LINK_SPEED_FIXED) == 0;
@@ -1362,6 +1363,17 @@ eth_igb_start(struct rte_eth_dev *dev)
}
if (num_speeds == 0 || (!autoneg && (num_speeds > 1)))
goto error_invalid_config;
+
+   /* Set/reset the mac.autoneg based on the link speed,
+* fixed or not
+*/
+   if (!autoneg) {
+   hw->mac.autoneg = 0;
+   hw->mac.forced_speed_duplex =
+   hw->phy.autoneg_advertised;
+   } else {
+   hw->mac.autoneg = 1;
+   }
}

e1000_setup_link(hw);
-- 
1.9.1