[dpdk-dev] [PATCHv7 6/6] doc: Add prog_guide section documenting pmdinfo script

2016-06-09 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> Sent: Thursday, June 9, 2016 6:47 PM
> To: dev at dpdk.org
> Cc: Neil Horman ; Richardson, Bruce
> ; Thomas Monjalon  6wind.com>;
> Stephen Hemminger ; Panu Matilainen
> 
> Subject: [dpdk-dev] [PATCHv7 6/6] doc: Add prog_guide section documenting
> pmdinfo script
> 
> Information on pmdinfogen may be useful to 3rd party driver developers.
> Include documentation on what it does
> 
> Signed-off-by: Neil Horman 
> CC: Bruce Richardson 
> CC: Thomas Monjalon 
> CC: Stephen Hemminger 
> CC: Panu Matilainen 

Acked-by: John McNamara 





[dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for packet capturing support

2016-06-09 Thread Ananyev, Konstantin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Thursday, June 09, 2016 6:24 PM
> To: Ananyev, Konstantin
> Cc: Pattan, Reshma; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for 
> packet capturing support
> 
> "Ananyev, Konstantin"  writes:
> 
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Aaron Conole
> >> Sent: Thursday, June 09, 2016 4:59 PM
> >> To: Pattan, Reshma
> >> Cc: dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new
> > library for packet capturing support
> >>
> >> Reshma Pattan  writes:
> >>
> >> > Added new library for packet capturing support.
> >> >
> >> > Added public api rte_pdump_init, applications should call
> >> > this as part of their application setup to have packet
> >> > capturing framework ready.
> >> >
> >> > Added public api rte_pdump_uninit to uninitialize the packet
> >> > capturing framework.
> >> >
> >> > Added public apis rte_pdump_enable and rte_pdump_disable to
> >> > enable and disable packet capturing on specific port and queue.
> >> >
> >> > Added public apis rte_pdump_enable_by_deviceid and
> >> > rte_pdump_disable_by_deviceid to enable and disable packet
> >> > capturing on a specific device (pci address or name) and queue.
> >> >
> >> > Signed-off-by: Reshma Pattan 
> >> > ---
> >> > +
> >> > +int
> >> > +rte_pdump_init(void)
> >>
> >> Would you be opposed to having an argument here which takes a path to
> >> the server socket?  That way the application can have some control over
> >> the server socket location rather than using the guesses from
> >> pdump_get_socket_path.
> >
> > I suppose it is better to keep IPC mechanism details internal for the
> > pdump library.
> > That way upper layer don't need to know what is that and write the
> > code to open/maintain it.
> > Again, that gives pdump library a freedom to change it (if needed) or
> > possibly introduce some alternatives.
> > Konstantin
> >
> 
> How does the application change it?  The details do matter here, as some
> applications (ex: openvswitch) have specific policies on which files
> files get opened and where those files exist.  That has impact on things
> like selinux and other access control technology.
> 
> If I missed the API that lets apps redirect the output, please correct me,
> but so far I don't think I've missed it.  pdump still can change a
> default, but it would be good to give a method for guiding the final
> choice of file to open.

No, I think, right now it is not possible to change socket path from the API.
Basically it follows the same logic as generating path for DPDK runtime config, 
etc:
/var/run for root, or $HOME dir otherwise.
I suppose if openswitch or any other dpdk app is able to start successfully,
then it should be able to open pdump socket too, correct?

Anyway, as I understand what you suggest:

rte_pdump_init(const char *dir)
{
 If (sock_name != NULL) {/*open socket at provided dir path*/}
 else { /*generate default pathname for the socket*/

and something similar for client side, might be a new function:

rte_pdump_set_dirpath(const char *dir);

Is that right?
Konstantin

> 
> -Aaron


[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Jerin Jacob
On Thu, Jun 09, 2016 at 02:18:57PM +0100, Hunt, David wrote:
> 
> 
> > > > As I mentioned earlier, My take is not to create the separate API's for
> > > > external mempool handlers.In my view, It's same,  just that sepreate
> > > > mempool handler through function pointers.
> > > > 
> > > > To keep the backward compatibility, I think we can extend the flags
> > > > in rte_mempool_create and have a single API external/internal pool
> > > > creation(makes easy for existing applications too, add a just mempool
> > > > flag command line argument to existing applications to choose the
> > > > mempool handler)
> > > May be I am interpreting it wrong, but, are you suggesting a single 
> > > mempool handler for all buffer/packet needs of an application (passed as 
> > > command line argument)?
> > > That would be inefficient especially for cases where pool is backed by a 
> > > hardware. The application wouldn't want its generic buffers to consume 
> > > hardware resource which would be better used for packets.
> > It may vary from platform to platform or particular use case. For instance,
> > the HW external pool manager for generic buffers may scale better than SW 
> > multi
> > producers/multi-consumer implementation when the number of cores > N
> > (as no locking involved in enqueue/dequeue(again it is depended on
> > specific HW implementation))
> > 
> > I thought their no harm in selecting the external pool handlers
> > in root level itself(rte_mempool_create) as by default it is
> > SW MP/MC and it just an option to override if the application wants it.
> > 
> > Jerin
> > 
> 
> 
> So, how about we go with the following, based on Shreyansh's suggestion:
> 
> 1. Add in #define MEMPOOL_F_EMM_ALLOC 0x0010  (EMM for External Mempool
> Manager)
> 
> 2. Check this bit in rte_mempool_create() just before the other bits are
> checked (by the way, the flags check has been moved to rte_mempool_create(),
> as per an earlier patchset, but was inadvertantly reverted)
> 
> /*
>  * First check to see if we use the config'd mempool hander.
>  * Then examine the combinations of SP/SC/MP/MC flags to
>  * set the correct index into the table of ops structs.
>  */
> if (flags & MEMPOOL_F_EMM_ALLOC)
> rte_mempool_set_ops_byname(mp, RTE_MBUF_DEFAULT_MEMPOOL_OPS)
> else (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
> rte_mempool_set_ops_byname(mp, "ring_sp_sc");
> else if (flags & MEMPOOL_F_SP_PUT)
> rte_mempool_set_ops_byname(mp, "ring_sp_mc");
> else if (flags & MEMPOOL_F_SC_GET)
> rte_mempool_set_ops_byname(mp, "ring_mp_sc");
> else
> rte_mempool_set_ops_byname(mp, "ring_mp_mc");
> 
> 3. Modify rte_pktmbuf_pool_create to pass the bit to rte_mempool_create
> 
> -sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> +sizeof(struct rte_pktmbuf_pool_private), socket_id,
> +MEMPOOL_F_PKT_ALLOC);
> 
> 
> This will allow legacy apps to use one external handler ( as defined
> RTE_MBUF_DEFAULT_MEMPOOL_OPS) by adding the MEMPOOL_F_EMM_ALLOC bit to their
> flags in the call to rte_mempool_create().
> 
> Of course, if an app wants to use more than one external handler, they can
> call create_empty and set_ops_byname() for each mempool they create.

+1

Since rte_pktmbuf_pool_create does not take flag, I think this the only
option left with for legacy apps.

> 
> Regards,
> Dave.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 


[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements

2016-06-09 Thread Don Provan
> -Original Message-
> From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com] 
> Sent: Thursday, June 09, 2016 8:45 AM
> To: Thomas Monjalon 
> Cc: dev at dpdk.org; Olivier Matz ; Adrien 
> Mazarguil ; Zhang, Helin  intel.com>
> Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
>...
> But as I said, if everyone are that desperate about symmetry, we can just 
> create a new public function:
>
> void
> rte_mbuf_raw_free(stuct rte_mbuf *m)
> {
>   if (rte_mbuf_refcnt_update(m, -1) == 0)
> __rte_mbuf_raw_free(m);
> }

Well, if you're going to do that, let's recognize that rte_mbuf_raw_free() was 
misnamed:
it doesn't free an mbuf, it returns an mbuf that's already free back to the 
pool. So instead
of "__rte_mbuf_raw_free", I'd call it "rte_mbuf_raw_release".

Logically I like this solution, as I'm very uncomfortable with the idea that a 
free mbuf
might sometimes have refcnt set to one. On the other hand, if mbufs can often 
be freed
without touching refcnt, that could be a big win that might persuade me to 
accept being
uncomfortable.

-don provan
dprovan at bivio.net



[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Jerin Jacob
On Thu, Jun 09, 2016 at 11:49:44AM +, Shreyansh Jain wrote:
> Hi Jerin,

Hi Shreyansh,

> 
> > > Yes, this would simplify somewhat the creation of a pktmbuf pool, in that
> > it
> > > replaces
> > > the rte_mempool_set_ops_byname with a flag bit. However, I'm not sure we
> > > want
> > > to introduce a third method of creating a mempool to the developers. If we
> > > introduced this, we would then have:
> > > 1. rte_pktmbuf_pool_create()
> > > 2. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC set (which would
> > >use the configured custom handler)
> > > 3. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC __not__ set 
> > > followed
> > >by a call to rte_mempool_set_ops_byname() (would allow several 
> > > different
> > > custom
> > >handlers to be used in one application
> > >
> > > Does anyone else have an opinion on this? Oliver, Jerin, Jan?
> > 
> > As I mentioned earlier, My take is not to create the separate API's for
> > external mempool handlers.In my view, It's same,  just that sepreate
> > mempool handler through function pointers.
> > 
> > To keep the backward compatibility, I think we can extend the flags
> > in rte_mempool_create and have a single API external/internal pool
> > creation(makes easy for existing applications too, add a just mempool
> > flag command line argument to existing applications to choose the
> > mempool handler)
> 
> May be I am interpreting it wrong, but, are you suggesting a single mempool 
> handler for all buffer/packet needs of an application (passed as command line 
> argument)?
> That would be inefficient especially for cases where pool is backed by a 
> hardware. The application wouldn't want its generic buffers to consume 
> hardware resource which would be better used for packets.

It may vary from platform to platform or particular use case. For instance,
the HW external pool manager for generic buffers may scale better than SW multi
producers/multi-consumer implementation when the number of cores > N
(as no locking involved in enqueue/dequeue(again it is depended on
specific HW implementation))

I thought their no harm in selecting the external pool handlers
in root level itself(rte_mempool_create) as by default it is
SW MP/MC and it just an option to override if the application wants it.

Jerin


> 
> I was hoping that external mempool handler would help segregate such 
> use-cases and allow applications to tap into accelerated packet processors.
> 
> Do correct me if I my understanding is wrong.
> 
> > 
> > Jerin
> > 
> > >
> > > Regards,
> > > Dave.
> > >
> > >
> 
> -
> Shreyansh


[dpdk-dev] [PATCH v7 0/8] add packet capture framework

2016-06-09 Thread Ananyev, Konstantin
> 
> This patch set include below changes
> 
> 1)Changes to librte_ether.
> 2)A new library librte_pdump added for packet capture framework.
> 3)A new app/pdump tool added for packet capturing.
> 4)Test pmd changes done to initialize packet capture framework.
> 5)Documentation update.
> 
> 1)librte_pdump
> ==
> To support packet capturing on dpdk Ethernet devices, a new library 
> librte_pdump
> is added.Users can develop their own packet capturing application using new 
> library APIs.
> 
> Operation:
> --
> Pdump library provides APIs to support packet capturing on dpdk Ethernet 
> devices.
> Library provides APIs to initialize the packet capture framework, 
> enable/disable
> the packet capture and uninitialize the packet capture framework.
> 
> Pdump library works on client/server based model.
> 
> Sever is responsible for enabling/disabling the packet captures.
> Clients are responsible for requesting enable/disable of the
> packet captures.
> 
> As part of packet capture framework initialization, pthread and
> the server socket is created. Only one server socket is allowed on the system.
> As part of enabling/disabling the packet capture, client sockets are created
> and multiple client sockets are allowed.
> Who ever calls initialization first they will succeed with the initialization,
> next subsequent calls of initialization are not allowed. So next users can 
> only
> request enabling/disabling the packet capture.
> 
> Applications using below APIs need to pass port/device_id, queue, mempool and
> ring parameters. Library uses user provided ring and mempool to mirror the 
> rx/tx
> packets of the port for users. Users need to dequeue the rings and write the 
> packets
> to vdev(pcap/tuntap) to view the packets using any standard tools.
> 
> Note:
> Mempool and Ring should be mc/mp supportable.
> Mempool mbuf size should be big enough to handle the rx/tx packets of a port.
> 
> APIs:
> -
> rte_pdump_init()
> rte_pdump_enable()
> rte_pdump_enable_by_deviceid()
> rte_pdump_disable()
> rte_pdump_disable_by_deviceid()
> rte_pdump_uninit()
> 
> 2)app/pdump tool
> 
> Tool app/pdump is designed based on librte_pdump for packet capturing in DPDK.
> This tool by default runs as secondary process, and provides the support for
> the command line options for packet capture.
> 
> ./build/app/dpdk_pdump --
>--pdump '(port= | device_id= name>),
> (queue=),
> (rx-dev= |
>  tx-dev=),
> [ring-size=],
> [mbuf-size=],
> [total-num-mbufs=]'
> 
> Parameters inside the parenthesis represents the mandatory parameters.
> Parameters inside the square brackets represents optional parameters.
> User has to pass on packet capture parameters under --pdump parameters, 
> multiples of
> --pdump can be passed to capture packets on different port and queue 
> combinations
> 
> Operation:
> --
> *Tool parse the user command line arguments,
> creates the mempool, ring and the PCAP PMD vdev with 'tx_stream' as either
> of the device passed in rx-dev|tx-dev parameters.
> 
> *Then calls the APIs of librte_pdump i.e. 
> rte_pdump_enable()/rte_pdump_enable_by_deviceid()
> to enable packet capturing on a specific port/device_id and queue by passing 
> on
> port|device_id, queue, mempool and ring info.
> 
> *Tool runs in while loop to dequeue the packets from the ring and write them 
> to pcap device.
> 
> *Tool can be stopped using SIGINT, upon which tool calls
> rte_pdump_disable()/rte_pdump_disable_by_deviceid() and free the allocated 
> resources.
> 
> Note:
> CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the 
> pdump tool.
> 
> 3)Test-pmd changes
> ==
> Changes are done to test-pmd application to initialize/uninitialize the 
> packet capture framework.
> So app/pdump tool can be run to see packets of dpdk ports that are used by 
> test-pmd.
> 
> Similarly any application which needs packet capture should call 
> initialize/uninitialize APIs of
> librte_pdump and use pdump tool to start the capture.
> 
> 4)Packet capture flow between pdump tool and librte_pdump
> =
> * Pdump tool (Secondary process) requests packet capture
> for specific port|device_id and queue combinations.
> 
> *Library in secondary process context creates client socket and communicates
> the port|device_id, queue, ring and mempool to server.
> 
> *Library initializes server in primary process 'test-pmd' context and server 
> serves
> the client request to enable Ethernet rxtx call-backs for a given 
> port|device_id and queue.
> 
> *Copy the rx/tx packets to passed mempool and enqueue the packets to ring for 
> secondary process.
> 
> *Pdump tool will dequeue the packets from ring and writes them to PCAPMD 

[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements

2016-06-09 Thread Thomas Monjalon
2016-06-09 13:21, Ananyev, Konstantin:
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> > Today:
> > 
> >   /* allowed */
> >   m = rte_pktmbuf_alloc();
> >   rte_pktmbuf_free(m);
> > 
> >   /* not allowed */
> >   m = rte_mbuf_raw_alloc();
> >   __rte_mbuf_raw_free(m);
> > 
> >   /* we should do instead (strange): */
> >   m = rte_mbuf_raw_alloc();
> >   rte_pktmbuf_free(m);
> > 
> > What I suggest to have:
> > 
> >   /* allowed, no change */
> >   m = rte_pktmbuf_alloc();
> >   rte_pktmbuf_free(m);
> > 
> >   /* allowed, these functions would be symetrical */
> >   m = rte_mbuf_raw_alloc();
> >   rte_mbuf_raw_free(m);
> > 
> >   /* not allowed, m->refcnt is uninitialized */
> >   m = rte_mbuf_raw_alloc();
> >   rte_pktmbuf_free(m);
> 
> Hmm, and what it will buy us (except of symmetry)?

API consistency is important.
It is a matter of making our users confident in DPDK.

I would not like to use a library where I need to check in the doc
for each function because I don't remember and I'm not confident
that the function fish() do some fishing or hunting.


[dpdk-dev] [PATCH 2/3] librte_eal: Import FreeBSD sys/tree.h into librte_eal/common

2016-06-09 Thread Thomas Monjalon
2016-06-09 16:54, Nikita Kozlov:
> On 06/ 9/16 02:58 AM, Stephen Hemminger wrote:
> > Please don't copy a header file which is available already on both BSD and 
> > Linux.
> >
> I was quite hesitant on how to handle it. I had the feeling that dpdk
> wanted to avoid external dependency so I copied that file. If it's not
> the case I would be happy to resend the patches without that external
> import.

Yes please let's use external libraries where possible.
Being able to run in a bare metal environment has been a requirement
in the early days of DPDK. Now we got rid of this idea and run on
Linux and FreeBSD. An attempt to run on OSv has been done but without
more follow-up.


[dpdk-dev] [PATCH v7 8/8] doc: update doc for packet capture framework

2016-06-09 Thread Reshma Pattan
Added programmers guide for librte_pdump.
Added sample application guide for app/pdump application.
Updated release note for packet capture framework changes.

Signed-off-by: Reshma Pattan 
Acked-by: John McNamara 
---
 MAINTAINERS |   3 +
 doc/guides/prog_guide/index.rst |   1 +
 doc/guides/prog_guide/pdump_library.rst | 107 
 doc/guides/rel_notes/release_16_07.rst  |  13 
 doc/guides/sample_app_ug/index.rst  |   1 +
 doc/guides/sample_app_ug/pdump.rst  | 122 
 6 files changed, 247 insertions(+)
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index a48c8de..ce7c941 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -436,6 +436,9 @@ Pdump
 M: Reshma Pattan 
 F: lib/librte_pdump/
 F: app/pdump/
+F: doc/guides/prog_guide/pdump_library.rst
+F: doc/guides/sample_app_ug/pdump.rst
+

 Hierarchical scheduler
 M: Cristian Dumitrescu 
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index b862d0c..4caf969 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
 writing_efficient_code
 profile_app
 glossary
+pdump_library


 **Figures**
diff --git a/doc/guides/prog_guide/pdump_library.rst 
b/doc/guides/prog_guide/pdump_library.rst
new file mode 100644
index 000..1809234
--- /dev/null
+++ b/doc/guides/prog_guide/pdump_library.rst
@@ -0,0 +1,107 @@
+..  BSD LICENSE
+Copyright(c) 2016 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _pdump_library:
+
+The librte_pdump Library
+
+
+The ``librte_pdump`` library provides a framework for packet capturing in DPDK.
+The library provides the following APIs to initialize the packet capture 
framework, to enable
+or disable the packet capture, and to uninitialize it:
+
+* ``rte_pdump_init()``:
+  This API initializes the packet capture framework.
+
+* ``rte_pdump_enable()``:
+  This API enables the packet capture on a given port and queue.
+  Note: The filter option in the API is a place holder for future enhancements.
+
+* ``rte_pdump_enable_by_deviceid()``:
+  This API enables the packet capture on a given device id (``vdev name or pci 
address``) and queue.
+  Note: The filter option in the API is a place holder for future enhancements.
+
+* ``rte_pdump_disable()``:
+  This API disables the packet capture on a given port and queue.
+
+* ``rte_pdump_disable_by_deviceid()``:
+  This API disables the packet capture on a given device id (``vdev name or 
pci address``) and queue.
+
+* ``rte_pdump_uninit()``:
+  This API uninitializes the packet capture framework.
+
+
+Operation
+-
+
+The ``librte_pdump`` library works on a client/server model. The server is 
responsible for enabling or
+disabling the packet capture and the clients are responsible for requesting 
the enabling or disabling of
+the packet capture.
+
+The packet capture framework, as part of its initialization, creates the 
pthread and the server socket in
+the pthread. The application that calls the framework initialization first 
will have the server socket created.
+Further calls to the framework initialization by the same application or other 
applications is not allowed i.e., only
+one server socket is allowed on the system. So 

[dpdk-dev] [PATCH v7 7/8] app/test-pmd: add pdump initialization uninitialization

2016-06-09 Thread Reshma Pattan
Call rte_pdump_init and rte_pdump_uninit for packet
capturing initialization and uninitialization.

Signed-off-by: Reshma Pattan 
---
 app/test-pmd/testpmd.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index dd6b046..f6089fa 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_XENVIRT
 #include 
 #endif
+#include 

 #include "testpmd.h"

@@ -2029,6 +2030,8 @@ signal_handler(int signum)
if (signum == SIGINT || signum == SIGTERM) {
printf("\nSignal %d received, preparing to exit...\n",
signum);
+   /* uninitialize packet capture framework */
+   rte_pdump_uninit();
force_quit();
/* exit with the expected status */
signal(signum, SIG_DFL);
@@ -2049,6 +2052,9 @@ main(int argc, char** argv)
if (diag < 0)
rte_panic("Cannot init EAL\n");

+   /* initialize packet capture framework */
+   rte_pdump_init();
+
nb_ports = (portid_t) rte_eth_dev_count();
if (nb_ports == 0)
RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
-- 
2.5.0



[dpdk-dev] [PATCH v7 6/8] app/pdump: add pdump tool for packet capturing

2016-06-09 Thread Reshma Pattan
New tool added for packet capturing on dpdk.
This tool supports command line options.
This tool runs as secondary process by default.

Command line supports various parameters to capture
the packets.

User should pass on a)port and queue (or) b)pci address
and queue (or) c)device name and queue to capture
the packets.

Users also need to pass on either pcap file name or
any linux iface, on to which packets captured from dpdk
ports will be sent on for the users to view using tcpdump.

Users have option to capture packets either a) in RX
direction, b)(or) in TX direction c)(or) from both the
directions.

User can pass on ring_size and mempool parameters using
command line, but these are optional parameters.
These are used to create ring and mempool objects for packet
mirroring from primary application to tool. If user doesn't
provide any values, default values will be used internally
for the creation of the ring and mempool.

Signed-off-by: Reshma Pattan 
---
 MAINTAINERS|   1 +
 app/Makefile   |   1 +
 app/pdump/Makefile |  45 +++
 app/pdump/main.c   | 844 +
 4 files changed, 891 insertions(+)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index cc3ffdb..a48c8de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -435,6 +435,7 @@ F: doc/guides/sample_app_ug/packet_ordering.rst
 Pdump
 M: Reshma Pattan 
 F: lib/librte_pdump/
+F: app/pdump/

 Hierarchical scheduler
 M: Cristian Dumitrescu 
diff --git a/app/Makefile b/app/Makefile
index 1151e09..c593efa 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += pdump

 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/pdump/Makefile b/app/pdump/Makefile
new file mode 100644
index 000..96bb4af
--- /dev/null
+++ b/app/pdump/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = dpdk_pdump
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+
+SRCS-y := main.c
+
+# this application needs libraries first
+DEPDIRS-y += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/pdump/main.c b/app/pdump/main.c
new file mode 100644
index 000..f8923b9
--- /dev/null
+++ b/app/pdump/main.c
@@ -0,0 +1,844 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from 

[dpdk-dev] [PATCH v7 4/8] librte_ether: make rte_eth_dev_get_port_by_name rte_eth_dev_get_name_by_port public

2016-06-09 Thread Reshma Pattan
Converted rte_eth_dev_get_port_by_name to a public API.
Converted rte_eth_dev_get_name_by_port to a public API.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c  |  4 ++--
 lib/librte_ether/rte_ethdev.h  | 29 +
 lib/librte_ether/rte_ether_version.map |  2 ++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 1f634c9..0b19569 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -406,7 +406,7 @@ rte_eth_dev_get_addr_by_port(uint8_t port_id, struct 
rte_pci_addr *addr)
return 0;
 }

-static int
+int
 rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 {
char *tmp;
@@ -425,7 +425,7 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
return 0;
 }

-static int
+int
 rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id)
 {
int i;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8ad7c01..fab281e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4284,6 +4284,35 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
  uint32_t mask,
  uint8_t en);

+/**
+* Get the port id from pci adrress or device name
+* Ex: :2:00.0 or vdev name eth_pcap0
+*
+* @param name
+*  pci address or name of the device
+* @param port_id
+*   pointer to port identifier of the device
+* @return
+*   - (0) if successful.
+*   - (-ENODEV or -EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id);
+
+/**
+* Get the device name from port id
+*
+* @param port_id
+*   pointer to port identifier of the device
+* @param name
+*  pci address or name of the device
+* @return
+*   - (0) if successful.
+*   - (-EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index d06d648..73e730d 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,5 +137,7 @@ DPDK_16.07 {
global:

rte_eth_add_first_rx_callback;
+   rte_eth_dev_get_name_by_port;
+   rte_eth_dev_get_port_by_name;
rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0



[dpdk-dev] [PATCH v7 3/8] librte_ether: add new fields to rte_eth_dev_info struct

2016-06-09 Thread Reshma Pattan
New fields nb_rx_queues and nb_tx_queues are added to
rte_eth_dev_info structure.
Changes to API rte_eth_dev_info_get() are done to update
these new fields to rte_eth_dev_info object.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c  | 2 ++
 lib/librte_ether/rte_ethdev.h  | 3 +++
 lib/librte_ether/rte_ether_version.map | 1 +
 3 files changed, 6 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 97d167e..1f634c9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1661,6 +1661,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info)
(*dev->dev_ops->dev_infos_get)(dev, dev_info);
dev_info->pci_dev = dev->pci_dev;
dev_info->driver_name = dev->data->drv_name;
+   dev_info->nb_rx_queues = dev->data->nb_rx_queues;
+   dev_info->nb_tx_queues = dev->data->nb_tx_queues;
 }

 int
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 237e6ef..8ad7c01 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -882,6 +882,9 @@ struct rte_eth_dev_info {
struct rte_eth_desc_lim rx_desc_lim;  /**< RX descriptors limits */
struct rte_eth_desc_lim tx_desc_lim;  /**< TX descriptors limits */
uint32_t speed_capa;  /**< Supported speeds bitmap (ETH_LINK_SPEED_). */
+   /** Configured number of rx/tx queues */
+   uint16_t nb_rx_queues; /**< Number of RX queues. */
+   uint16_t nb_tx_queues; /**< Number of TX queues. */
 };

 /**
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index c990b04..d06d648 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,4 +137,5 @@ DPDK_16.07 {
global:

rte_eth_add_first_rx_callback;
+   rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0



[dpdk-dev] [PATCH v7 2/8] librte_ether: add new api rte_eth_add_first_rx_callback

2016-06-09 Thread Reshma Pattan
Added new public api rte_eth_add_first_rx_callback to add given
callback as head of list.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c  | 35 ++
 lib/librte_ether/rte_ethdev.h  | 28 +++
 lib/librte_ether/rte_ether_version.map |  6 ++
 3 files changed, 69 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ce70d58..97d167e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2939,6 +2939,41 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
 }

 void *
+rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+   rte_rx_callback_fn fn, void *user_param)
+{
+#ifndef RTE_ETHDEV_RXTX_CALLBACKS
+   rte_errno = ENOTSUP;
+   return NULL;
+#endif
+   /* check input parameters */
+   if (!rte_eth_dev_is_valid_port(port_id) || fn == NULL ||
+   queue_id >= rte_eth_devices[port_id].data->nb_rx_queues) {
+   rte_errno = EINVAL;
+   return NULL;
+   }
+
+   struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
+
+   if (cb == NULL) {
+   rte_errno = ENOMEM;
+   return NULL;
+   }
+
+   cb->fn.rx = fn;
+   cb->param = user_param;
+
+   rte_spinlock_lock(_eth_rx_cb_lock);
+   /* Add the callbacks at fisrt position*/
+   cb->next = rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
+   rte_smp_wmb();
+   rte_eth_devices[port_id].post_rx_burst_cbs[queue_id] = cb;
+   rte_spinlock_unlock(_eth_rx_cb_lock);
+
+   return cb;
+}
+
+void *
 rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
rte_tx_callback_fn fn, void *user_param)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..237e6ef 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3825,6 +3825,34 @@ int rte_eth_dev_get_dcb_info(uint8_t port_id,
 void *rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
rte_rx_callback_fn fn, void *user_param);

+/*
+* Add a callback that must be called first on packet RX on a given port
+* and queue.
+*
+* This API configures a first function to be called for each burst of
+* packets received on a given NIC port queue. The return value is a pointer
+* that can be used to later remove the callback using
+* rte_eth_remove_rx_callback().
+*
+* Multiple functions are called in the order that they are added.
+*
+* @param port_id
+*   The port identifier of the Ethernet device.
+* @param queue_id
+*   The queue on the Ethernet device on which the callback is to be added.
+* @param fn
+*   The callback function
+* @param user_param
+*   A generic pointer parameter which will be passed to each invocation of the
+*   callback function on this port and queue.
+*
+* @return
+*   NULL on error.
+*   On success, a pointer value which can later be used to remove the callback.
+*/
+void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+   rte_rx_callback_fn fn, void *user_param);
+
 /**
  * Add a callback to be called on packet TX on a given port and queue.
  *
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index 214ecc7..c990b04 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,9 @@ DPDK_16.04 {
rte_eth_tx_buffer_set_err_callback;

 } DPDK_2.2;
+
+DPDK_16.07 {
+   global:
+
+   rte_eth_add_first_rx_callback;
+} DPDK_16.04;
-- 
2.5.0



[dpdk-dev] [PATCH v7 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks

2016-06-09 Thread Reshma Pattan
Added spinlocks around add/remove logic of rxtx callbacks to
avoid corruption of callback lists in multithreaded context.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c | 82 +--
 1 file changed, 40 insertions(+), 42 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..ce70d58 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -77,6 +77,12 @@ static uint8_t nb_ports;
 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;

+/* spinlock for add/remove rx callbacks */
+static rte_spinlock_t rte_eth_rx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* spinlock for add/remove tx callbacks */
+static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -1634,7 +1640,6 @@ rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id, 
uint16_t rx_queue_id,
STAT_QMAP_RX);
 }

-
 void
 rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 {
@@ -2905,7 +2910,6 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
rte_errno = EINVAL;
return NULL;
}
-
struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);

if (cb == NULL) {
@@ -2916,6 +2920,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
cb->fn.rx = fn;
cb->param = user_param;

+   rte_spinlock_lock(_eth_rx_cb_lock);
/* Add the callbacks in fifo order. */
struct rte_eth_rxtx_callback *tail =
rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
@@ -2928,6 +2933,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
tail = tail->next;
tail->next = cb;
}
+   rte_spinlock_unlock(_eth_rx_cb_lock);

return cb;
 }
@@ -2957,6 +2963,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t 
queue_id,
cb->fn.tx = fn;
cb->param = user_param;

+   rte_spinlock_lock(_eth_tx_cb_lock);
/* Add the callbacks in fifo order. */
struct rte_eth_rxtx_callback *tail =
rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id];
@@ -2969,6 +2976,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t 
queue_id,
tail = tail->next;
tail->next = cb;
}
+   rte_spinlock_unlock(_eth_tx_cb_lock);

return cb;
 }
@@ -2987,29 +2995,24 @@ rte_eth_remove_rx_callback(uint8_t port_id, uint16_t 
queue_id,
return -EINVAL;

struct rte_eth_dev *dev = _eth_devices[port_id];
-   struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
-   struct rte_eth_rxtx_callback *prev_cb;
-
-   /* Reset head pointer and remove user cb if first in the list. */
-   if (cb == user_cb) {
-   dev->post_rx_burst_cbs[queue_id] = user_cb->next;
-   return 0;
-   }
-
-   /* Remove the user cb from the callback list. */
-   do {
-   prev_cb = cb;
-   cb = cb->next;
-
+   struct rte_eth_rxtx_callback *cb;
+   struct rte_eth_rxtx_callback **prev_cb;
+   int ret = -EINVAL;
+
+   rte_spinlock_lock(_eth_rx_cb_lock);
+   prev_cb = >post_rx_burst_cbs[queue_id];
+   for (; *prev_cb != NULL; prev_cb = >next) {
+   cb = *prev_cb;
if (cb == user_cb) {
-   prev_cb->next = user_cb->next;
-   return 0;
+   /* Remove the user cb from the callback list. */
+   *prev_cb = cb->next;
+   ret = 0;
+   break;
}
+   }
+   rte_spinlock_unlock(_eth_rx_cb_lock);

-   } while (cb != NULL);
-
-   /* Callback wasn't found. */
-   return -EINVAL;
+   return ret;
 }

 int
@@ -3026,29 +3029,24 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t 
queue_id,
return -EINVAL;

struct rte_eth_dev *dev = _eth_devices[port_id];
-   struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
-   struct rte_eth_rxtx_callback *prev_cb;
-
-   /* Reset head pointer and remove user cb if first in the list. */
-   if (cb == user_cb) {
-   dev->pre_tx_burst_cbs[queue_id] = user_cb->next;
-   return 0;
-   }
-
-   /* Remove the user cb from the callback list. */
-   do {
-   prev_cb = cb;
-   cb = cb->next;
-
+   int ret = -EINVAL;
+   struct rte_eth_rxtx_callback *cb;
+   struct rte_eth_rxtx_callback **prev_cb;
+
+   rte_spinlock_lock(_eth_tx_cb_lock);
+   prev_cb = >pre_tx_burst_cbs[queue_id];
+   for (; *prev_cb != NULL; prev_cb = 

[dpdk-dev] [PATCH v7 0/8] add packet capture framework

2016-06-09 Thread Reshma Pattan
This patch set include below changes

1)Changes to librte_ether.
2)A new library librte_pdump added for packet capture framework.
3)A new app/pdump tool added for packet capturing.
4)Test pmd changes done to initialize packet capture framework.
5)Documentation update.

1)librte_pdump
==
To support packet capturing on dpdk Ethernet devices, a new library librte_pdump
is added.Users can develop their own packet capturing application using new 
library APIs.

Operation:
--
Pdump library provides APIs to support packet capturing on dpdk Ethernet 
devices.
Library provides APIs to initialize the packet capture framework, enable/disable
the packet capture and uninitialize the packet capture framework.

Pdump library works on client/server based model.

Sever is responsible for enabling/disabling the packet captures.
Clients are responsible for requesting enable/disable of the
packet captures.

As part of packet capture framework initialization, pthread and
the server socket is created. Only one server socket is allowed on the system.
As part of enabling/disabling the packet capture, client sockets are created
and multiple client sockets are allowed.
Who ever calls initialization first they will succeed with the initialization,
next subsequent calls of initialization are not allowed. So next users can only
request enabling/disabling the packet capture.

Applications using below APIs need to pass port/device_id, queue, mempool and
ring parameters. Library uses user provided ring and mempool to mirror the rx/tx
packets of the port for users. Users need to dequeue the rings and write the 
packets
to vdev(pcap/tuntap) to view the packets using any standard tools.

Note:
Mempool and Ring should be mc/mp supportable.
Mempool mbuf size should be big enough to handle the rx/tx packets of a port.

APIs:
-
rte_pdump_init()
rte_pdump_enable()
rte_pdump_enable_by_deviceid()
rte_pdump_disable()
rte_pdump_disable_by_deviceid()
rte_pdump_uninit()

2)app/pdump tool

Tool app/pdump is designed based on librte_pdump for packet capturing in DPDK.
This tool by default runs as secondary process, and provides the support for
the command line options for packet capture.

./build/app/dpdk_pdump --
   --pdump '(port= | device_id=),
(queue=),
(rx-dev= |
 tx-dev=),
[ring-size=],
[mbuf-size=],
[total-num-mbufs=]'

Parameters inside the parenthesis represents the mandatory parameters.
Parameters inside the square brackets represents optional parameters.
User has to pass on packet capture parameters under --pdump parameters, 
multiples of
--pdump can be passed to capture packets on different port and queue 
combinations

Operation:
--
*Tool parse the user command line arguments,
creates the mempool, ring and the PCAP PMD vdev with 'tx_stream' as either
of the device passed in rx-dev|tx-dev parameters.

*Then calls the APIs of librte_pdump i.e. 
rte_pdump_enable()/rte_pdump_enable_by_deviceid()
to enable packet capturing on a specific port/device_id and queue by passing on
port|device_id, queue, mempool and ring info.

*Tool runs in while loop to dequeue the packets from the ring and write them to 
pcap device.

*Tool can be stopped using SIGINT, upon which tool calls
rte_pdump_disable()/rte_pdump_disable_by_deviceid() and free the allocated 
resources.

Note:
CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the 
pdump tool.

3)Test-pmd changes
==
Changes are done to test-pmd application to initialize/uninitialize the packet 
capture framework.
So app/pdump tool can be run to see packets of dpdk ports that are used by 
test-pmd.

Similarly any application which needs packet capture should call 
initialize/uninitialize APIs of
librte_pdump and use pdump tool to start the capture.

4)Packet capture flow between pdump tool and librte_pdump
=
* Pdump tool (Secondary process) requests packet capture
for specific port|device_id and queue combinations.

*Library in secondary process context creates client socket and communicates
the port|device_id, queue, ring and mempool to server.

*Library initializes server in primary process 'test-pmd' context and server 
serves
the client request to enable Ethernet rxtx call-backs for a given 
port|device_id and queue.

*Copy the rx/tx packets to passed mempool and enqueue the packets to ring for 
secondary process.

*Pdump tool will dequeue the packets from ring and writes them to PCAPMD vdev,
so ultimately packets will be seen on the device that is passed in 
rx-dev|tx-dev.

*Once the pdump tool is terminated with SIGINT it will disable the packet 
capturing.

*Library receives the disable packet capture request, communicate the info to 
server,
server will 

[dpdk-dev] [PATCH 2/2 v3] kni: add documentation for the mempool capacity

2016-06-09 Thread Mcnamara, John
> Just to confirm, should I do anything before it gets merged?

No. Looks good to me. Now you just need to wait. :-)

John.


[dpdk-dev] [PATCH v2] enic: fix seg fault when releasing queues

2016-06-09 Thread Bruce Richardson
On Wed, May 25, 2016 at 07:45:00PM -0700, John Daley wrote:
> If device configuration failed due to a lack of resources, like if
> there were more queues requested than available, the queue release
> function is called with NULL pointers which were being dereferenced.
> 
> Skip releasing queues if they are NULL pointers. Also, if configuration
> fails due to lack of resources, be more specific about which resources
> are lacking.

The "also", and a a review of the code, indicates that the error message
changes should be a separate patch, as it's not related to the NULL check fix.
:-)


> 
> Fixes: fefed3d1e62c ("enic: new driver")
> Signed-off-by: John Daley 
> ---
> v2: Log an error for all resource deficiencies not just the first one
> found.
> 
>  drivers/net/enic/enic_main.c | 37 +++--
>  1 file changed, 23 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
> index 996f999..411d23c 100644
> --- a/drivers/net/enic/enic_main.c
> +++ b/drivers/net/enic/enic_main.c
> @@ -426,14 +426,16 @@ int enic_alloc_intr_resources(struct enic *enic)
>  
>  void enic_free_rq(void *rxq)
>  {
> - struct vnic_rq *rq = (struct vnic_rq *)rxq;
> - struct enic *enic = vnic_dev_priv(rq->vdev);
> + if (rxq != NULL) {
> + struct vnic_rq *rq = (struct vnic_rq *)rxq;
> + struct enic *enic = vnic_dev_priv(rq->vdev);
>  
> - enic_rxmbuf_queue_release(enic, rq);
> - rte_free(rq->mbuf_ring);
> - rq->mbuf_ring = NULL;
> - vnic_rq_free(rq);
> - vnic_cq_free(>cq[rq->index]);
> + enic_rxmbuf_queue_release(enic, rq);
> + rte_free(rq->mbuf_ring);
> + rq->mbuf_ring = NULL;
> + vnic_rq_free(rq);
> + vnic_cq_free(>cq[rq->index]);
> + }
>  }

Is it not just easier to put a check at the top for: 
if (rq == NULL)
return;

Rather than putting the whole body of the function inside an if statement? It
would certainly be a smaller diff.

/Bruce



[dpdk-dev] [PATCH v2] log: deprecate history dump

2016-06-09 Thread Thomas Monjalon
The log history uses rte_mempool. In order to remove the mempool
dependency in EAL (and improve the build), this feature is deprecated.
The ABI is kept but the behaviour is now voided because it seems this
function was not used. The history can be read from syslog.
When enabling the log history, a warning is logged.

Signed-off-by: Thomas Monjalon 
---
v2:
- remove more mempool and log history traces
- add a warning if enabling log history
- move not related mempool includes cleanup in another patch
---
 app/test-pmd/cmdline.c  |   3 -
 app/test/autotest_data.py   |   6 --
 app/test/autotest_test_funcs.py |   5 --
 app/test/commands.c |   4 +-
 app/test/test_logs.c|   3 -
 doc/guides/prog_guide/mempool_lib.rst   |   4 +-
 doc/guides/rel_notes/deprecation.rst|   3 +
 lib/librte_eal/bsdapp/eal/Makefile  |   1 -
 lib/librte_eal/bsdapp/eal/eal_debug.c   |   6 --
 lib/librte_eal/common/eal_common_log.c  | 143 ++--
 lib/librte_eal/common/eal_private.h |   3 -
 lib/librte_eal/common/include/rte_log.h |   8 ++
 lib/librte_eal/linuxapp/eal/Makefile|   1 -
 lib/librte_eal/linuxapp/eal/eal_debug.c |   6 --
 lib/librte_eal/linuxapp/eal/eal_log.c   |   9 +-
 lib/librte_mempool/rte_mempool.c|   4 -
 16 files changed, 20 insertions(+), 189 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 1921612..fd389ac 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -7268,8 +7268,6 @@ static void cmd_dump_parsed(void *parsed_result,
rte_dump_physmem_layout(stdout);
else if (!strcmp(res->dump, "dump_memzone"))
rte_memzone_dump(stdout);
-   else if (!strcmp(res->dump, "dump_log_history"))
-   rte_log_dump_history(stdout);
else if (!strcmp(res->dump, "dump_struct_sizes"))
dump_struct_sizes();
else if (!strcmp(res->dump, "dump_ring"))
@@ -7284,7 +7282,6 @@ cmdline_parse_token_string_t cmd_dump_dump =
TOKEN_STRING_INITIALIZER(struct cmd_dump_result, dump,
"dump_physmem#"
"dump_memzone#"
-   "dump_log_history#"
"dump_struct_sizes#"
"dump_ring#"
"dump_mempool#"
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 78d2edd..6c87809 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -94,12 +94,6 @@ parallel_test_group_list = [
 "Report" : None,
},
{
-"Name" :   "Dump log history",
-"Command" :"dump_log_history",
-"Func" :   dump_autotest,
-"Report" : None,
-   },
-   {
 "Name" :   "Dump rings",
 "Command" :"dump_ring",
 "Func" :   dump_autotest,
diff --git a/app/test/autotest_test_funcs.py b/app/test/autotest_test_funcs.py
index b60b941..14cffd0 100644
--- a/app/test/autotest_test_funcs.py
+++ b/app/test/autotest_test_funcs.py
@@ -144,16 +144,11 @@ def logs_autotest(child, test_name):
i = 0
child.sendline(test_name)

-   # logs sequence is printed twice because of history dump
log_list = [
"TESTAPP1: error message",
"TESTAPP1: critical message",
"TESTAPP2: critical message",
"TESTAPP1: error message",
-   "TESTAPP1: error message",
-   "TESTAPP1: critical message",
-   "TESTAPP2: critical message",
-   "TESTAPP1: error message",
]

for log_msg in log_list:
diff --git a/app/test/commands.c b/app/test/commands.c
index e0af8e4..2df46b0 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -150,8 +150,6 @@ static void cmd_dump_parsed(void *parsed_result,
rte_dump_physmem_layout(stdout);
else if (!strcmp(res->dump, "dump_memzone"))
rte_memzone_dump(stdout);
-   else if (!strcmp(res->dump, "dump_log_history"))
-   rte_log_dump_history(stdout);
else if (!strcmp(res->dump, "dump_struct_sizes"))
dump_struct_sizes();
else if (!strcmp(res->dump, "dump_ring"))
@@ -164,7 +162,7 @@ static void cmd_dump_parsed(void *parsed_result,

 cmdline_parse_token_string_t cmd_dump_dump =
TOKEN_STRING_INITIALIZER(struct cmd_dump_result, dump,
-"dump_physmem#dump_memzone#dump_log_history#"
+"dump_physmem#dump_memzone#"
 "dump_struct_sizes#dump_ring#dump_mempool#"
 "dump_devargs");

diff --git a/app/test/test_logs.c b/app/test/test_logs.c
index 05aa862..d0a9962 100644
--- a/app/test/test_logs.c
+++ b/app/test/test_logs.c
@@ -83,9 +83,6 @@ test_logs(void)
RTE_LOG(ERR, 

[dpdk-dev] [PATCH] doc: fix wrong supported feature table

2016-06-09 Thread Mcnamara, John
> -Original Message-
> From: De Lara Guarch, Pablo
> Sent: Thursday, June 9, 2016 4:45 PM
> To: dev at dpdk.org
> Cc: Mcnamara, John ; Doherty, Declan
> ; De Lara Guarch, Pablo
> 
> Subject: [PATCH] doc: fix wrong supported feature table
> 
> Some crypto PMDs that support symmetric crypto were not marked as
> supported in the supported feature flags table.
> 
> Fixes: 2373c0661b2f0 ("doc: add cryptodevs guide overview")
> 
> Signed-off-by: Pablo de Lara 

Acked-by: John McNamara 



[dpdk-dev] [PATCH] af_packet: add byte counters

2016-06-09 Thread Bruce Richardson
On Thu, May 26, 2016 at 11:01:57AM -0400, John W. Linville wrote:
> On Thu, May 26, 2016 at 03:47:59PM +0100, Ferruh Yigit wrote:
> > On 5/25/2016 10:03 PM, Rich Lane wrote:
> > > Signed-off-by: Rich Lane 
> > 
> > Reviewed-by: Ferruh Yigit 
> 
> Acked-by: John W. Linville 
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH] log: deprecate history dump

2016-06-09 Thread Christian Ehrhardt
Hi,
in I totally like it - thanks Thomas for picking that up.

I just wanted to mention that the Makefile still refers to mempool, but
David beat me in time and Detail a lot.

I'll certainly try to follow and help the bit I can.



Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Thu, Jun 9, 2016 at 4:45 PM, David Marchand 
wrote:

> Thomas,
>
> On Thu, Jun 9, 2016 at 4:09 PM, Thomas Monjalon
>  wrote:
> > The log history uses rte_mempool. In order to remove the mempool
> > dependency in EAL (and improve the build), this feature is deprecated.
> > The ABI is kept but the behaviour is now voided because it seems this
> > function was not used. The history can be read from syslog.
>
> It does look like it is not really used.
> I am for this change unless someone complains.
>
> Comments below.
>
> > Signed-off-by: Thomas Monjalon 
> > ---
> >  app/test-pmd/cmdline.c   |   3 -
> >  app/test/autotest_test_funcs.py  |   5 --
> >  app/test/commands.c  |   4 +-
> >  app/test/test_logs.c |   3 -
> >  doc/guides/rel_notes/deprecation.rst |   3 +
> >  lib/librte_eal/bsdapp/eal/eal_debug.c|   6 --
> >  lib/librte_eal/common/eal_common_log.c   | 128
> +--
> >  lib/librte_eal/common/include/rte_log.h  |   8 ++
> >  lib/librte_eal/linuxapp/eal/eal_debug.c  |   6 --
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c |   1 -
> >  lib/librte_eal/linuxapp/eal/eal_ivshmem.c|   1 -
> >  lib/librte_eal/linuxapp/eal/eal_log.c|   3 -
> >  lib/librte_mempool/rte_mempool.c |   4 -
> >  13 files changed, 16 insertions(+), 159 deletions(-)
>
> - You missed autotest_data.py.
>
> $ git grep dump_log_history
> app/test/autotest_data.py:   "Command" :"dump_log_history",
>
> - eal Makefile still refers to librte_mempool.
>
> - Since you are looking at this, what keeps us from removing the
> dependency on librte_ring ?
> I would say it was mainly because of mempool.
> Maybe ivshmem ?
>
>
> [snip]
>
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> > index ad05eba..f11df93 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -8,6 +8,9 @@ API and ABI deprecation notices are to be posted here.
> >  Deprecation Notices
> >  ---
> >
> > +* The log history is deprecated in release 16.07.
> > +  It is voided and will be completely removed in release 16.11.
> > +
>
> It is voided in 16.07 and will be completely removed in release 16.11.
>
>
> >  * The ethdev hotplug API is going to be moved to EAL with a notification
> >mechanism added to crypto and ethdev libraries so that hotplug is now
> >available to both of them. This API will be stripped of the device
> arguments
>
> > diff --git a/lib/librte_eal/common/eal_common_log.c
> b/lib/librte_eal/common/eal_common_log.c
> > index b5e37bb..94ecdd2 100644
> > --- a/lib/librte_eal/common/eal_common_log.c
> > +++ b/lib/librte_eal/common/eal_common_log.c
> > @@ -56,29 +56,11 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> >
> >  #include "eal_private.h"
> >
> >  #define LOG_ELT_SIZE 2048
>
> We don't need LOG_ELT_SIZE.
>
> >
> > -#define LOG_HISTORY_MP_NAME "log_history"
> > -
> > -STAILQ_HEAD(log_history_list, log_history);
> > -
> > -/**
> > - * The structure of a message log in the log history.
> > - */
> > -struct log_history {
> > -   STAILQ_ENTRY(log_history) next;
> > -   unsigned size;
> > -   char buf[0];
> > -};
> > -
> > -static struct rte_mempool *log_history_mp = NULL;
> > -static unsigned log_history_size = 0;
> > -static struct log_history_list log_history;
> > -
> >  /* global log structure */
> >  struct rte_logs rte_logs = {
> > .type = ~0,
> > @@ -86,10 +68,7 @@ struct rte_logs rte_logs = {
> > .file = NULL,
> >  };
> >
> > -static rte_spinlock_t log_dump_lock = RTE_SPINLOCK_INITIALIZER;
> > -static rte_spinlock_t log_list_lock = RTE_SPINLOCK_INITIALIZER;
> >  static FILE *default_log_stream;
> > -static int history_enabled = 1;
> >
> >  /**
> >   * This global structure stores some informations about the message
> > @@ -106,59 +85,14 @@ static RTE_DEFINE_PER_LCORE(struct log_cur_msg,
> log_cur_msg);
> >  /* default logs */
> >
> >  int
> > -rte_log_add_in_history(const char *buf, size_t size)
> > +rte_log_add_in_history(const char *buf __rte_unused, size_t size
> __rte_unused)
> >  {
> > -   struct log_history *hist_buf = NULL;
> > -   static const unsigned hist_buf_size = LOG_ELT_SIZE -
> sizeof(*hist_buf);
> > -   void *obj;
> > -
> > -   if (history_enabled == 0)
> > -   return 0;
> > -
> > -   rte_spinlock_lock(_list_lock);
> > -
> > -   /* get a buffer for adding in history */
> > -   if (log_history_size > RTE_LOG_HISTORY) {
> > -   hist_buf = STAILQ_FIRST(_history);

[dpdk-dev] [PATCH] log: deprecate history dump

2016-06-09 Thread Thomas Monjalon
2016-06-09 16:45, David Marchand:
> On Thu, Jun 9, 2016 at 4:09 PM, Thomas Monjalon
>  wrote:
> > The log history uses rte_mempool. In order to remove the mempool
> > dependency in EAL (and improve the build), this feature is deprecated.
> > The ABI is kept but the behaviour is now voided because it seems this
> > function was not used. The history can be read from syslog.
> 
> It does look like it is not really used.
> I am for this change unless someone complains.
> 
> Comments below.

All your comments will be addressed in a v2. Thanks

> - Since you are looking at this, what keeps us from removing the
> dependency on librte_ring ?

Please see this first small cleanup:
http://dpdk.org/ml/archives/dev/2016-June/040798.html

> I would say it was mainly because of mempool.
> Maybe ivshmem ?

Yes CONFIG_RTE_LIBRTE_IVSHMEM brings dependencies to rte_ring and rte_ivshmem.
This "feature" also pollute the memory allocator and makes rework harder.
That's why I would be in favor of removing CONFIG_RTE_LIBRTE_IVSHMEM.

Otherwise, as an alternative proposal, the file
lib/librte_eal/linuxapp/eal/eal_ivshmem.c
could be moved outside of EAL. Probably that lib/librte_ivshmem/
would be a good place.
The tricky operation would be to remove ivshmem init from eal:

#ifdef RTE_LIBRTE_IVSHMEM
if (rte_eal_ivshmem_init() < 0) 
 
rte_panic("Cannot init IVSHMEM\n");
#endif

if (rte_eal_memory_init() < 0)
rte_panic("Cannot init memory\n");

/* the directories are locked during eal_hugepage_info_init */
eal_hugedirs_unlock();

if (rte_eal_memzone_init() < 0)
rte_panic("Cannot init memzone\n");

if (rte_eal_tailqs_init() < 0)
rte_panic("Cannot init tail queues for objects\n");

#ifdef RTE_LIBRTE_IVSHMEM
if (rte_eal_ivshmem_obj_init() < 0)
rte_panic("Cannot init IVSHMEM objects\n");
#endif


[dpdk-dev] [PATCH] eal: remove useless includes of mempool and ring

2016-06-09 Thread Thomas Monjalon
The libraries rte_mempool and rte_ring are not used in EAL,
except rte_ring for the ivshmem part (CONFIG_RTE_LIBRTE_IVSHMEM).

Signed-off-by: Thomas Monjalon 
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 2 --
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c| 1 -
 2 files changed, 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 06b26a9..a9af396 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -57,10 +57,8 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/lib/librte_eal/linuxapp/eal/eal_ivshmem.c 
b/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
index eea0314..67b3caf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
+++ b/lib/librte_eal/linuxapp/eal/eal_ivshmem.c
@@ -49,7 +49,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.7.0



[dpdk-dev] [PATCH v8 3/3] i40e: add floating VEB extension support

2016-06-09 Thread Bruce Richardson
On Wed, May 25, 2016 at 01:28:06AM +0800, Zhe Tao wrote:
> To enable this feature, the user should pass a devargs parameter to the EAL
> like "-w 84:00.0,enable_floating=1", and the application will make sure the 
> PMD
> will use the floating VEB feature for all the VFs created by this PF device.
> 
> Also you can specifiy which VF need to connect to this floating veb using
> "floating_bitmap", every bit corresponding to one VF (e.g. bitn for VFn).
> Like "-w 84:00.0,enable_floating=1,floating_bitmap=1", means only the VF0 
> connect
> to the floating VEB, VF1 connect to the legacy VEB.

Is there only ever one floating VEB inside a PF? That perhaps needs to be called
out in the explanation of what the floating VEB is?

/Bruce


[dpdk-dev] [PATCH v8 2/3] i40e: Add floating VEB support in i40e

2016-06-09 Thread Bruce Richardson
On Wed, May 25, 2016 at 01:28:05AM +0800, Zhe Tao wrote:
> This patch add the support for floating VEB in i40e.
> All the VFs VSIs can decide whether to connect to the legacy VEB/VEPA or
> the floating VEB. When connect to the floating VEB a new floating VEB is
> created. Now all the VFs need to connect to floating VEB or legacy VEB,
> cannot connect to both of them. The PF and VMDQ,FD VSIs still connect to
> the old legacy VEB/VEPA.
> 
> All the VEB/VEPA concepts are not specific for FVL, they are defined in the
> 802.1Qbg spec.
copy reference to patch 1.

> 
> Now the floating VEB feature is only avaiable in the specific version of FW.
Please list the firmware versions rather than making the reader read through
the patch.

> 
> Signed-off-by: Zhe Tao 
> ---
>  doc/guides/nics/i40e.rst   |   7 +++
>  doc/guides/rel_notes/release_16_07.rst |   6 ++
>  drivers/net/i40e/i40e_ethdev.c | 109 
> ++---
>  drivers/net/i40e/i40e_ethdev.h |   2 +
>  drivers/net/i40e/i40e_pf.c |  11 +++-
>  5 files changed, 112 insertions(+), 23 deletions(-)
> 
> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
> index 934eb02..49a0598 100644
> --- a/doc/guides/nics/i40e.rst
> +++ b/doc/guides/nics/i40e.rst
> @@ -366,3 +366,10 @@ Delete all flow director rules on a port:
>  
> testpmd> flush_flow_director 0
>  
> +Floating VEB
> +~
> +FVL can support floating VEB feature.
> +To enable this feature, the user should pass a devargs parameter to the EAL
> +like "-w 84:00.0,enable_floating=1", and the application will make sure the 
> PMD
> +will use the floating VEB feature for all the VFs created by this PF device.
> +

I think this documentation part really need to be expanded to explain what a
floating VEB is and why you'd use it. If this is well documented elsewhere, e.g.
in the relevant datasheets for NICs using the i40e driver, by all means refer to
that, but keep a summarised account of what it is here in the DPDK docs. Perhaps
you can reuse the description from the release note below, just with an extra
sentence or two about why the PF being down makes a difference, etc.

> diff --git a/doc/guides/rel_notes/release_16_07.rst 
> b/doc/guides/rel_notes/release_16_07.rst
> index 30e78d4..8485b08 100644
> --- a/doc/guides/rel_notes/release_16_07.rst
> +++ b/doc/guides/rel_notes/release_16_07.rst
> @@ -47,6 +47,12 @@ New Features
>* Dropped specific Xen Dom0 code.
>* Dropped specific anonymous mempool code in testpmd.
>  
> +* **Added floating VEB support for i40e PF driver.**
> +
> +  Now VFs for i40e can connect to the floating VEB.
> +  With this feature, VFs can communicate with each other, but cannot access
> +  outside network. When PF is down, and VFs can still forward pkts between 
> each
> +  other.

This expanation belongs more in the i40e.rst document above. I think the release
notes only needs a mention that the feature was added and then a link to the
fuller description in the i40e doc.

Regards,
/Bruce


[dpdk-dev] [PATCH 2/3] librte_eal: Import FreeBSD sys/tree.h into librte_eal/common

2016-06-09 Thread Nikita Kozlov
On 06/ 9/16 02:58 AM, Stephen Hemminger wrote:
> On Thu,  9 Jun 2016 02:53:53 +0200
> Nikita Kozlov  wrote:
>
>> This structure is used inside the rte_lpm6 lib for storing added rules.
>> It's imported from FreeBSD-10.3 from /usr/include/sys/tree.h, another
>> solution could have been to use on Linux the version from libbsd but it
>> would create an external dependency.
>>
>> Signed-off-by: Nikita Kozlov 
> Using Red-black tree is a good idea, and we have been doing it for a while
> both on v4 and v6.
Yes, like I said in 1/3, the idea is taken from your patch.
I have tested it and it seemed to work pretty fine, maybe we could try
to split it a little bit and try to send it again ?
It's quite difficult to tell what was wrong with it since I cannot see
any public discussion about it.
>
> But this is not the way to handle it.
> Please don't copy a header file which is available already on both BSD and 
> Linux.
>
>
I was quite hesitant on how to handle it. I had the feeling that dpdk
wanted to avoid external dependency so I copied that file. If it's not
the case I would be happy to resend the patches without that external
import.


[dpdk-dev] [PATCH v8 1/3] i40e: support floating VEB config

2016-06-09 Thread Bruce Richardson
On Wed, May 25, 2016 at 01:28:04AM +0800, Zhe Tao wrote:
> Add the new floating related argument option in the devarg.
> Using this parameter, all the samples can decide whether to use legacy 
> VEB/VEPA
> or floating VEB.
> To enable this feature, the user should pass a devargs parameter to the EAL
> like "-w 84:00.0,enable_floating=1", and the application will make sure the 
> PMD
> will use the floating VEB feature for all the VFs created by this PF device.
> 
> Signed-off-by: Zhe Tao 
> ---
>  drivers/net/i40e/i40e_ethdev.c | 44 
> ++
>  drivers/net/i40e/i40e_ethdev.h |  6 ++
>  2 files changed, 50 insertions(+)

I don't think we should ever use the word "floating" in this context without the
word "VEB" on it as well, because it's really meaningless - being an adjective,
it always requires a noun will it. This applies both to cmdline arguments and to
commit messages.

Secondly, these patches really could do with an explanation of what a floating
VEB is and what it can do vs a non-floating VEB. I know that in the second patch
you refer to standards which define VEBs, but:
a) why is that reference not in patch 1?
b) when I tried looking it up the spec requires a login for the IEEE site, and
a google search for "floating VEB" only brings up links to these patches in
patchwork.

Regards,
/Bruce


[dpdk-dev] [PATCH] log: deprecate history dump

2016-06-09 Thread David Marchand
Thomas,

On Thu, Jun 9, 2016 at 4:09 PM, Thomas Monjalon
 wrote:
> The log history uses rte_mempool. In order to remove the mempool
> dependency in EAL (and improve the build), this feature is deprecated.
> The ABI is kept but the behaviour is now voided because it seems this
> function was not used. The history can be read from syslog.

It does look like it is not really used.
I am for this change unless someone complains.

Comments below.

> Signed-off-by: Thomas Monjalon 
> ---
>  app/test-pmd/cmdline.c   |   3 -
>  app/test/autotest_test_funcs.py  |   5 --
>  app/test/commands.c  |   4 +-
>  app/test/test_logs.c |   3 -
>  doc/guides/rel_notes/deprecation.rst |   3 +
>  lib/librte_eal/bsdapp/eal/eal_debug.c|   6 --
>  lib/librte_eal/common/eal_common_log.c   | 128 
> +--
>  lib/librte_eal/common/include/rte_log.h  |   8 ++
>  lib/librte_eal/linuxapp/eal/eal_debug.c  |   6 --
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c |   1 -
>  lib/librte_eal/linuxapp/eal/eal_ivshmem.c|   1 -
>  lib/librte_eal/linuxapp/eal/eal_log.c|   3 -
>  lib/librte_mempool/rte_mempool.c |   4 -
>  13 files changed, 16 insertions(+), 159 deletions(-)

- You missed autotest_data.py.

$ git grep dump_log_history
app/test/autotest_data.py:   "Command" :"dump_log_history",

- eal Makefile still refers to librte_mempool.

- Since you are looking at this, what keeps us from removing the
dependency on librte_ring ?
I would say it was mainly because of mempool.
Maybe ivshmem ?


[snip]

> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index ad05eba..f11df93 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,6 +8,9 @@ API and ABI deprecation notices are to be posted here.
>  Deprecation Notices
>  ---
>
> +* The log history is deprecated in release 16.07.
> +  It is voided and will be completely removed in release 16.11.
> +

It is voided in 16.07 and will be completely removed in release 16.11.


>  * The ethdev hotplug API is going to be moved to EAL with a notification
>mechanism added to crypto and ethdev libraries so that hotplug is now
>available to both of them. This API will be stripped of the device 
> arguments

> diff --git a/lib/librte_eal/common/eal_common_log.c 
> b/lib/librte_eal/common/eal_common_log.c
> index b5e37bb..94ecdd2 100644
> --- a/lib/librte_eal/common/eal_common_log.c
> +++ b/lib/librte_eal/common/eal_common_log.c
> @@ -56,29 +56,11 @@
>  #include 
>  #include 
>  #include 
> -#include 
>
>  #include "eal_private.h"
>
>  #define LOG_ELT_SIZE 2048

We don't need LOG_ELT_SIZE.

>
> -#define LOG_HISTORY_MP_NAME "log_history"
> -
> -STAILQ_HEAD(log_history_list, log_history);
> -
> -/**
> - * The structure of a message log in the log history.
> - */
> -struct log_history {
> -   STAILQ_ENTRY(log_history) next;
> -   unsigned size;
> -   char buf[0];
> -};
> -
> -static struct rte_mempool *log_history_mp = NULL;
> -static unsigned log_history_size = 0;
> -static struct log_history_list log_history;
> -
>  /* global log structure */
>  struct rte_logs rte_logs = {
> .type = ~0,
> @@ -86,10 +68,7 @@ struct rte_logs rte_logs = {
> .file = NULL,
>  };
>
> -static rte_spinlock_t log_dump_lock = RTE_SPINLOCK_INITIALIZER;
> -static rte_spinlock_t log_list_lock = RTE_SPINLOCK_INITIALIZER;
>  static FILE *default_log_stream;
> -static int history_enabled = 1;
>
>  /**
>   * This global structure stores some informations about the message
> @@ -106,59 +85,14 @@ static RTE_DEFINE_PER_LCORE(struct log_cur_msg, 
> log_cur_msg);
>  /* default logs */
>
>  int
> -rte_log_add_in_history(const char *buf, size_t size)
> +rte_log_add_in_history(const char *buf __rte_unused, size_t size 
> __rte_unused)
>  {
> -   struct log_history *hist_buf = NULL;
> -   static const unsigned hist_buf_size = LOG_ELT_SIZE - 
> sizeof(*hist_buf);
> -   void *obj;
> -
> -   if (history_enabled == 0)
> -   return 0;
> -
> -   rte_spinlock_lock(_list_lock);
> -
> -   /* get a buffer for adding in history */
> -   if (log_history_size > RTE_LOG_HISTORY) {
> -   hist_buf = STAILQ_FIRST(_history);
> -   if (hist_buf) {
> -   STAILQ_REMOVE_HEAD(_history, next);
> -   log_history_size--;
> -   }
> -   }
> -   else {
> -   if (rte_mempool_mc_get(log_history_mp, ) < 0)
> -   obj = NULL;
> -   hist_buf = obj;
> -   }
> -
> -   /* no buffer */
> -   if (hist_buf == NULL) {
> -   rte_spinlock_unlock(_list_lock);
> -   return -ENOBUFS;
> -   }
> -
> -   /* not enough room for msg, buffer go back in mempool */
> -   if (size >= 

[dpdk-dev] [PATCH] doc: fix wrong supported feature table

2016-06-09 Thread Pablo de Lara
Some crypto PMDs that support symmetric crypto were not marked
as supported in the supported feature flags table.

Fixes: 2373c0661b2f0 ("doc: add cryptodevs guide overview")

Signed-off-by: Pablo de Lara 
---
 doc/guides/cryptodevs/overview.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/cryptodevs/overview.rst 
b/doc/guides/cryptodevs/overview.rst
index 4a84146..5861440 100644
--- a/doc/guides/cryptodevs/overview.rst
+++ b/doc/guides/cryptodevs/overview.rst
@@ -36,7 +36,7 @@ Supported Feature Flags
:header: "Feature Flags", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g"
:stub-columns: 1

-   "RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO",x,x,,,
+   "RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO",x,x,x,x,x
"RTE_CRYPTODEV_FF_ASYMMETRIC_CRYPTO",
"RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING",x,x,x,x,x
"RTE_CRYPTODEV_FF_CPU_SSE",,,x,x,x
-- 
2.5.0



[dpdk-dev] [PATCH v1] i40e: fix olflags for vector RX

2016-06-09 Thread Bruce Richardson
On Fri, Jun 03, 2016 at 07:19:09AM +, Azarewicz, PiotrX T wrote:
> Hi,
> 
> > --- a/drivers/net/i40e/i40e_rxtx_vec.c
> > +++ b/drivers/net/i40e/i40e_rxtx_vec.c
> > @@ -149,7 +149,7 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf
> > **rx_pkts)
> >
> > /* mask everything except rss and vlan flags
> > *bit2 is for vlan tag, bits 13:12 for rss
> > */
> 
> Please, update or remove the comment above.
> Thanks, Piotr
> 
ping.
is a v2 of this patch being produced?

/Bruce


[dpdk-dev] [PATCH] doc: fix wrong supported feature table

2016-06-09 Thread Jain, Deepak K


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pablo de Lara
> Sent: Thursday, June 9, 2016 4:45 PM
> To: dev at dpdk.org
> Cc: Mcnamara, John ; Doherty, Declan
> ; De Lara Guarch, Pablo
> 
> Subject: [dpdk-dev] [PATCH] doc: fix wrong supported feature table
> 
> Some crypto PMDs that support symmetric crypto were not marked as
> supported in the supported feature flags table.
> 
> Fixes: 2373c0661b2f0 ("doc: add cryptodevs guide overview")
> 
> Signed-off-by: Pablo de Lara 
> ---
>  doc/guides/cryptodevs/overview.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --
> 2.5.0

Acked-by: Deepak Kumar JAIN 


[dpdk-dev] [PATCH v3 00/20] DPDK PMD for ThunderX NIC device

2016-06-09 Thread Jerin Jacob
On Wed, Jun 08, 2016 at 04:08:37PM +0100, Bruce Richardson wrote:
> On Wed, Jun 08, 2016 at 03:42:14PM +0200, Thomas Monjalon wrote:
> > 2016-06-08 18:13, Jerin Jacob:
> > > On Wed, Jun 08, 2016 at 01:30:28PM +0100, Ferruh Yigit wrote:
> > > > Hi Jerin,
> > > > 
> > > > In patch subject, as tag, other drivers are using only driver name, and
> > > > Intel drivers also has "driver/base", since base code has some special
> > > > case. For thunderx, what do you think about keeping subject as:
> > > >  "thunderx: "
> > > > 
> > > 
> > > Hi Ferruh,
> > > 
> > > We may add crypto or other builtin ThunderX HW accelerated block drivers
> > > in future to DPDK.
> > > So that is the reason why I thought of keeping the subject as 
> > > thunderx/nicvf.
> > > If you don't have any objection then I would like to keep it as
> > > thunderx/nicvf or just nicvf.
> > 
> > I don't like the name nicvf but I guess that's the official name?
> > 
> > Thus I agree the title should be thunderx/nicvf or thunderx_nicvf.
> 
> I think I'd prefer the underscore version.

underscore option looks bit odd when comparing to exiting other git logs.
as Ferruh suggested "net/thunderx:" looks good to me. If their no
objections I would like to go with "net/thunderx:"

Jerin



[dpdk-dev] [PATCH] i40e: unintended sign extension

2016-06-09 Thread Bruce Richardson
On Fri, May 20, 2016 at 03:03:36PM +0200, Slawomir Mrozowicz wrote:
> Suspicious implicit sign extension: pf->fdir.match_counter_index
> with type unsigned short (16 bits, unsigned) is promoted in
> pf->fdir.match_counter_index << 20 to type int (32 bits, signed),
> then sign-extended to type unsigned long (64 bits, unsigned).
> If pf->fdir.match_counter_index << 20 is greater than 0x7FFF,
> the upper bits of the result will all be 1.
> To fix the issue set explicit cast uint32_t of pf->fdir.match_counter_index.
> 
> Fixes: 05999aab4ca6 ("i40e: add or delete flow director")
> Coverity ID 13315
> 
> Signed-off-by: Slawomir Mrozowicz 

Acked-by: Bruce Richardson 

Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH] log: deprecate history dump

2016-06-09 Thread Thomas Monjalon
The log history uses rte_mempool. In order to remove the mempool
dependency in EAL (and improve the build), this feature is deprecated.
The ABI is kept but the behaviour is now voided because it seems this
function was not used. The history can be read from syslog.

Signed-off-by: Thomas Monjalon 
---
 app/test-pmd/cmdline.c   |   3 -
 app/test/autotest_test_funcs.py  |   5 --
 app/test/commands.c  |   4 +-
 app/test/test_logs.c |   3 -
 doc/guides/rel_notes/deprecation.rst |   3 +
 lib/librte_eal/bsdapp/eal/eal_debug.c|   6 --
 lib/librte_eal/common/eal_common_log.c   | 128 +--
 lib/librte_eal/common/include/rte_log.h  |   8 ++
 lib/librte_eal/linuxapp/eal/eal_debug.c  |   6 --
 lib/librte_eal/linuxapp/eal/eal_interrupts.c |   1 -
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c|   1 -
 lib/librte_eal/linuxapp/eal/eal_log.c|   3 -
 lib/librte_mempool/rte_mempool.c |   4 -
 13 files changed, 16 insertions(+), 159 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 1921612..fd389ac 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -7268,8 +7268,6 @@ static void cmd_dump_parsed(void *parsed_result,
rte_dump_physmem_layout(stdout);
else if (!strcmp(res->dump, "dump_memzone"))
rte_memzone_dump(stdout);
-   else if (!strcmp(res->dump, "dump_log_history"))
-   rte_log_dump_history(stdout);
else if (!strcmp(res->dump, "dump_struct_sizes"))
dump_struct_sizes();
else if (!strcmp(res->dump, "dump_ring"))
@@ -7284,7 +7282,6 @@ cmdline_parse_token_string_t cmd_dump_dump =
TOKEN_STRING_INITIALIZER(struct cmd_dump_result, dump,
"dump_physmem#"
"dump_memzone#"
-   "dump_log_history#"
"dump_struct_sizes#"
"dump_ring#"
"dump_mempool#"
diff --git a/app/test/autotest_test_funcs.py b/app/test/autotest_test_funcs.py
index b60b941..14cffd0 100644
--- a/app/test/autotest_test_funcs.py
+++ b/app/test/autotest_test_funcs.py
@@ -144,16 +144,11 @@ def logs_autotest(child, test_name):
i = 0
child.sendline(test_name)

-   # logs sequence is printed twice because of history dump
log_list = [
"TESTAPP1: error message",
"TESTAPP1: critical message",
"TESTAPP2: critical message",
"TESTAPP1: error message",
-   "TESTAPP1: error message",
-   "TESTAPP1: critical message",
-   "TESTAPP2: critical message",
-   "TESTAPP1: error message",
]

for log_msg in log_list:
diff --git a/app/test/commands.c b/app/test/commands.c
index e0af8e4..2df46b0 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -150,8 +150,6 @@ static void cmd_dump_parsed(void *parsed_result,
rte_dump_physmem_layout(stdout);
else if (!strcmp(res->dump, "dump_memzone"))
rte_memzone_dump(stdout);
-   else if (!strcmp(res->dump, "dump_log_history"))
-   rte_log_dump_history(stdout);
else if (!strcmp(res->dump, "dump_struct_sizes"))
dump_struct_sizes();
else if (!strcmp(res->dump, "dump_ring"))
@@ -164,7 +162,7 @@ static void cmd_dump_parsed(void *parsed_result,

 cmdline_parse_token_string_t cmd_dump_dump =
TOKEN_STRING_INITIALIZER(struct cmd_dump_result, dump,
-"dump_physmem#dump_memzone#dump_log_history#"
+"dump_physmem#dump_memzone#"
 "dump_struct_sizes#dump_ring#dump_mempool#"
 "dump_devargs");

diff --git a/app/test/test_logs.c b/app/test/test_logs.c
index 05aa862..d0a9962 100644
--- a/app/test/test_logs.c
+++ b/app/test/test_logs.c
@@ -83,9 +83,6 @@ test_logs(void)
RTE_LOG(ERR, TESTAPP1, "error message\n");
RTE_LOG(ERR, TESTAPP2, "error message (not displayed)\n");

-   /* print again the previous logs */
-   rte_log_dump_history(stdout);
-
return 0;
 }

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ad05eba..f11df93 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,6 +8,9 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 ---

+* The log history is deprecated in release 16.07.
+  It is voided and will be completely removed in release 16.11.
+
 * The ethdev hotplug API is going to be moved to EAL with a notification
   mechanism added to crypto and ethdev libraries so that hotplug is now
   available to both of them. This API will be stripped of the device arguments
diff --git a/lib/librte_eal/bsdapp/eal/eal_debug.c 

[dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for packet capturing support

2016-06-09 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Aaron Conole
> Sent: Thursday, June 09, 2016 4:59 PM
> To: Pattan, Reshma
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for 
> packet capturing support
> 
> Reshma Pattan  writes:
> 
> > Added new library for packet capturing support.
> >
> > Added public api rte_pdump_init, applications should call
> > this as part of their application setup to have packet
> > capturing framework ready.
> >
> > Added public api rte_pdump_uninit to uninitialize the packet
> > capturing framework.
> >
> > Added public apis rte_pdump_enable and rte_pdump_disable to
> > enable and disable packet capturing on specific port and queue.
> >
> > Added public apis rte_pdump_enable_by_deviceid and
> > rte_pdump_disable_by_deviceid to enable and disable packet
> > capturing on a specific device (pci address or name) and queue.
> >
> > Signed-off-by: Reshma Pattan 
> > ---
> >  MAINTAINERS|   4 +
> >  config/common_base |   5 +
> >  lib/Makefile   |   1 +
> >  lib/librte_pdump/Makefile  |  55 +++
> >  lib/librte_pdump/rte_pdump.c   | 841 
> > +
> >  lib/librte_pdump/rte_pdump.h   | 186 
> >  lib/librte_pdump/rte_pdump_version.map |  12 +
> >  mk/rte.app.mk  |   1 +
> >  8 files changed, 1105 insertions(+)
> >  create mode 100644 lib/librte_pdump/Makefile
> >  create mode 100644 lib/librte_pdump/rte_pdump.c
> >  create mode 100644 lib/librte_pdump/rte_pdump.h
> >  create mode 100644 lib/librte_pdump/rte_pdump_version.map
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3e8558f..cc3ffdb 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -432,6 +432,10 @@ F: app/test/test_reorder*
> >  F: examples/packet_ordering/
> >  F: doc/guides/sample_app_ug/packet_ordering.rst
> >
> > +Pdump
> > +M: Reshma Pattan 
> > +F: lib/librte_pdump/
> > +
> >  Hierarchical scheduler
> >  M: Cristian Dumitrescu 
> >  F: lib/librte_sched/
> > diff --git a/config/common_base b/config/common_base
> > index 47c26f6..a2d5d72 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
> >  CONFIG_RTE_LIBRTE_REORDER=y
> >
> >  #
> > +# Compile the pdump library
> > +#
> > +CONFIG_RTE_LIBRTE_PDUMP=y
> > +
> > +#
> >  # Compile librte_port
> >  #
> >  CONFIG_RTE_LIBRTE_PORT=y
> > diff --git a/lib/Makefile b/lib/Makefile
> > index f254dba..ca7c02f 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
> >  DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
> >  DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
> >  DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> > +DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
> >
> >  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> >  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> > diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
> > new file mode 100644
> > index 000..af81a28
> > --- /dev/null
> > +++ b/lib/librte_pdump/Makefile
> > @@ -0,0 +1,55 @@
> > +#   BSD LICENSE
> > +#
> > +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
> > +#   All rights reserved.
> > +#
> > +#   Redistribution and use in source and binary forms, with or without
> > +#   modification, are permitted provided that the following conditions
> > +#   are met:
> > +#
> > +# * Redistributions of source code must retain the above copyright
> > +#   notice, this list of conditions and the following disclaimer.
> > +# * Redistributions in binary form must reproduce the above copyright
> > +#   notice, this list of conditions and the following disclaimer in
> > +#   the documentation and/or other materials provided with the
> > +#   distribution.
> > +# * Neither the name of Intel Corporation nor the names of its
> > +#   contributors may be used to endorse or promote products derived
> > +#   from this software without specific prior written permission.
> > +#
> > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > +#   OF THIS SOFTWARE, EVEN 

[dpdk-dev] [PATCH v3 00/20] DPDK PMD for ThunderX NIC device

2016-06-09 Thread Thomas Monjalon
2016-06-09 16:19, Jerin Jacob:
> On Wed, Jun 08, 2016 at 04:08:37PM +0100, Bruce Richardson wrote:
> > On Wed, Jun 08, 2016 at 03:42:14PM +0200, Thomas Monjalon wrote:
> > > 2016-06-08 18:13, Jerin Jacob:
> > > > On Wed, Jun 08, 2016 at 01:30:28PM +0100, Ferruh Yigit wrote:
> > > > > Hi Jerin,
> > > > > 
> > > > > In patch subject, as tag, other drivers are using only driver name, 
> > > > > and
> > > > > Intel drivers also has "driver/base", since base code has some special
> > > > > case. For thunderx, what do you think about keeping subject as:
> > > > >  "thunderx: "
> > > > > 
> > > > 
> > > > Hi Ferruh,
> > > > 
> > > > We may add crypto or other builtin ThunderX HW accelerated block drivers
> > > > in future to DPDK.
> > > > So that is the reason why I thought of keeping the subject as 
> > > > thunderx/nicvf.
> > > > If you don't have any objection then I would like to keep it as
> > > > thunderx/nicvf or just nicvf.
> > > 
> > > I don't like the name nicvf but I guess that's the official name?
> > > 
> > > Thus I agree the title should be thunderx/nicvf or thunderx_nicvf.
> > 
> > I think I'd prefer the underscore version.
> 
> underscore option looks bit odd when comparing to exiting other git logs.
> as Ferruh suggested "net/thunderx:" looks good to me. If their no
> objections I would like to go with "net/thunderx:"

net/thunderx would be easy to parse if all other drivers were using
net/ or crypto/ prefixes.
Do we want to add these prefixes to driver commits?


[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Jerin Jacob
On Thu, Jun 09, 2016 at 10:39:46AM +0100, Hunt, David wrote:
> Hi Shreyansh,
> 
> On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
> > Hi David,
> > 
> > Thanks for explanation. I have some comments inline...
> > 
> > > -Original Message-
> > > From: Hunt, David [mailto:david.hunt at intel.com]
> > > Sent: Tuesday, June 07, 2016 2:56 PM
> > > To: Shreyansh Jain ; dev at dpdk.org
> > > Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
> > > jerin.jacob at caviumnetworks.com
> > > Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
> > > operations
> > > 
> > > Hi Shreyansh,
> > > 
> > > On 6/6/2016 3:38 PM, Shreyansh Jain wrote:
> > > > Hi,
> > > > 
> > > > (Apologies for overly-eager email sent on this thread earlier. Will be 
> > > > more
> > > careful in future).
> > > > This is more of a question/clarification than a comment. (And I have 
> > > > taken
> > > only some snippets from original mail to keep it cleaner)
> > > > 
> > > > > +MEMPOOL_REGISTER_OPS(ops_mp_mc);
> > > > > +MEMPOOL_REGISTER_OPS(ops_sp_sc);
> > > > > +MEMPOOL_REGISTER_OPS(ops_mp_sc);
> > > > > +MEMPOOL_REGISTER_OPS(ops_sp_mc);
> > > > 
> > > > 
> > > >   From the above what I understand is that multiple packet pool 
> > > > handlers can
> > > be created.
> > > > I have a use-case where application has multiple pools but only the 
> > > > packet
> > > pool is hardware backed. Using the hardware for general buffer 
> > > requirements
> > > would prove costly.
> > > >   From what I understand from the patch, selection of the pool is based 
> > > > on
> > > the flags below.
> > > 
> > > The flags are only used to select one of the default handlers for
> > > backward compatibility through
> > > the rte_mempool_create call. If you wish to use a mempool handler that
> > > is not one of the
> > > defaults, (i.e. a new hardware handler), you would use the
> > > rte_create_mempool_empty
> > > followed by the rte_mempool_set_ops_byname call.
> > > So, for the external handlers, you create and empty mempool, then set
> > > the operations (ops)
> > > for that particular mempool.
> > I am concerned about the existing applications (for example, l3fwd).
> > Explicit calls to 'rte_create_mempool_empty->rte_mempool_set_ops_byname' 
> > model would require modifications to these applications.
> > Ideally, without any modifications, these applications should be able to 
> > use packet pools (backed by hardware) and buffer pools (backed by 
> > ring/others) - transparently.
> > 
> > If I go by your suggestions, what I understand is, doing the above without 
> > modification to applications would be equivalent to:
> > 
> >struct rte_mempool_ops custom_hw_allocator = {...}
> > 
> > thereafter, in config/common_base:
> > 
> >CONFIG_RTE_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
> > 
> > calls to rte_pktmbuf_pool_create would use the new allocator.
> 
> Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to
> rte_mempool_create will continue to use the default handlers (ring based).
> > But, another problem arises here.
> > 
> > There are two distinct paths for allocations of a memory pool:
> > 1. A 'pkt' pool:
> > rte_pktmbuf_pool_create
> >   \- rte_mempool_create_empty
> >   |   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
> >   |
> >   `- rte_mempool_set_ops_byname
> > (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
> > /* Override default 'ring_mp_mc' of
> >  * rte_mempool_create */
> > 
> > 2. Through generic mempool create API
> > rte_mempool_create
> >   \- rte_mempool_create_empty
> > (passing pktmbuf and pool constructors)
> > I found various instances in example applications where 
> > rte_mempool_create() is being called directly for packet pools - bypassing 
> > the more semantically correct call to rte_pktmbuf_* for packet pools.
> > 
> > In (2) control path, RTE_MBUF_DEFAULT_MEMPOOLS_OPS wouldn't be able to 
> > replace custom handler operations for packet buffer allocations.
> > 
> >  From a performance point-of-view, Applications should be able to select 
> > between packet pools and non-packet pools.
> 
> This is intended for backward compatibility, and API consistency. Any
> applications that use
> rte_mempool_create directly will continue to use the default mempool
> handlers. If the need
> to use a custeom hander, they will need to be modified to call the newer
> API,
> rte_mempool_create_empty and rte_mempool_set_ops_byname.
> 
> 
> > > > 
> > > > > + /*
> > > > > +  * Since we have 4 combinations of the SP/SC/MP/MC examine the 
> > > > > flags to
> > > > > +  * set the correct index into the table of ops structs.
> > > > > +  */
> > > > > + if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
> > > > > + rte_mempool_set_ops_byname(mp, "ring_sp_sc");
> > > > > + else if (flags & MEMPOOL_F_SP_PUT)
> > > > > + rte_mempool_set_ops_byname(mp, "ring_sp_mc");
> > > > > + 

[dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for packet capturing support

2016-06-09 Thread Aaron Conole
"Ananyev, Konstantin"  writes:

>> -Original Message-
>> From: Aaron Conole [mailto:aconole at redhat.com]
>> Sent: Thursday, June 09, 2016 6:24 PM
>> To: Ananyev, Konstantin
>> Cc: Pattan, Reshma; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new
>> library for packet capturing support
>> 
>> "Ananyev, Konstantin"  writes:
>> 
>> >> -Original Message-
>> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Aaron Conole
>> >> Sent: Thursday, June 09, 2016 4:59 PM
>> >> To: Pattan, Reshma
>> >> Cc: dev at dpdk.org
>> >> Subject: Re: [dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new
>> > library for packet capturing support
>> >>
>> >> Reshma Pattan  writes:
>> >>
>> >> > Added new library for packet capturing support.
>> >> >
>> >> > Added public api rte_pdump_init, applications should call
>> >> > this as part of their application setup to have packet
>> >> > capturing framework ready.
>> >> >
>> >> > Added public api rte_pdump_uninit to uninitialize the packet
>> >> > capturing framework.
>> >> >
>> >> > Added public apis rte_pdump_enable and rte_pdump_disable to
>> >> > enable and disable packet capturing on specific port and queue.
>> >> >
>> >> > Added public apis rte_pdump_enable_by_deviceid and
>> >> > rte_pdump_disable_by_deviceid to enable and disable packet
>> >> > capturing on a specific device (pci address or name) and queue.
>> >> >
>> >> > Signed-off-by: Reshma Pattan 
>> >> > ---
>> >> > +
>> >> > +int
>> >> > +rte_pdump_init(void)
>> >>
>> >> Would you be opposed to having an argument here which takes a path to
>> >> the server socket?  That way the application can have some control over
>> >> the server socket location rather than using the guesses from
>> >> pdump_get_socket_path.
>> >
>> > I suppose it is better to keep IPC mechanism details internal for the
>> > pdump library.
>> > That way upper layer don't need to know what is that and write the
>> > code to open/maintain it.
>> > Again, that gives pdump library a freedom to change it (if needed) or
>> > possibly introduce some alternatives.
>> > Konstantin
>> >
>> 
>> How does the application change it?  The details do matter here, as some
>> applications (ex: openvswitch) have specific policies on which files
>> files get opened and where those files exist.  That has impact on things
>> like selinux and other access control technology.
>> 
>> If I missed the API that lets apps redirect the output, please correct me,
>> but so far I don't think I've missed it.  pdump still can change a
>> default, but it would be good to give a method for guiding the final
>> choice of file to open.
>
> No, I think, right now it is not possible to change socket path from the API.
> Basically it follows the same logic as generating path for DPDK
> runtime config, etc:
> /var/run for root, or $HOME dir otherwise.
> I suppose if openswitch or any other dpdk app is able to start successfully,
> then it should be able to open pdump socket too, correct?
>
> Anyway, as I understand what you suggest:
>
> rte_pdump_init(const char *dir)
> {
>  If (sock_name != NULL) {/*open socket at provided dir path*/}
>  else { /*generate default pathname for the socket*/
>
> and something similar for client side, might be a new function:
>
> rte_pdump_set_dirpath(const char *dir);
>
> Is that right?

Something like that, yes.

> Konstantin
>  
>> 
>> -Aaron


[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements

2016-06-09 Thread Ananyev, Konstantin

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Thursday, June 09, 2016 4:28 PM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org; Olivier Matz; Adrien Mazarguil; Zhang, Helin
> Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
> 
> 2016-06-09 13:21, Ananyev, Konstantin:
> > From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> > > Today:
> > >
> > >   /* allowed */
> > >   m = rte_pktmbuf_alloc();
> > >   rte_pktmbuf_free(m);
> > >
> > >   /* not allowed */
> > >   m = rte_mbuf_raw_alloc();
> > >   __rte_mbuf_raw_free(m);
> > >
> > >   /* we should do instead (strange): */
> > >   m = rte_mbuf_raw_alloc();
> > >   rte_pktmbuf_free(m);
> > >
> > > What I suggest to have:
> > >
> > >   /* allowed, no change */
> > >   m = rte_pktmbuf_alloc();
> > >   rte_pktmbuf_free(m);
> > >
> > >   /* allowed, these functions would be symetrical */
> > >   m = rte_mbuf_raw_alloc();
> > >   rte_mbuf_raw_free(m);
> > >
> > >   /* not allowed, m->refcnt is uninitialized */
> > >   m = rte_mbuf_raw_alloc();
> > >   rte_pktmbuf_free(m);
> >
> > Hmm, and what it will buy us (except of symmetry)?
> 
> API consistency is important.
> It is a matter of making our users confident in DPDK.
> 
> I would not like to use a library where I need to check in the doc
> for each function because I don't remember and I'm not confident
> that the function fish() do some fishing or hunting.

As you remember, the whole story started when people used
'internal mbuf function' inside their code and then
figured out that it doesn't work as they expect :)

But as I said, if everyone are that desperate about symmetry, 
we can just create a new public function:

void
rte_mbuf_raw_free(stuct rte_mbuf *m)
{
  if (rte_mbuf_refcnt_update(m, -1) == 0)
__rte_mbuf_raw_free(m);
}

That would be 'symmetric' to rte_mbuf_raw_alloc(),
and all three combinations above would be allowed.
Konstantin



[dpdk-dev] [PATCH v4] i40e: configure MTU

2016-06-09 Thread Bruce Richardson
On Mon, May 23, 2016 at 01:33:42AM +, Wu, Jingjing wrote:
> 
> 
> > -Original Message-
> > From: Xing, Beilei
> > Sent: Friday, May 20, 2016 11:17 PM
> > To: Wu, Jingjing
> > Cc: dev at dpdk.org; Xing, Beilei
> > Subject: [PATCH v4] i40e: configure MTU
> > 
> > This patch enables configuring MTU for i40e.
> > Since changing MTU needs to reconfigure queue, stop port first before
> > configuring MTU.
> > 
> > Signed-off-by: Beilei Xing 
> 
> Acked-by: Jingjing Wu 
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements

2016-06-09 Thread Bruce Richardson
On Thu, Jun 09, 2016 at 01:21:18PM +, Ananyev, Konstantin wrote:
> Hi Olivier,
> 
> > -Original Message-
> > From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> > Sent: Thursday, June 09, 2016 8:47 AM
> > To: Ananyev, Konstantin; dev at dpdk.org; Adrien Mazarguil
> > Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
> > 
> > Hi Konstantin,
> > 
> > >> Yes, it refcnt supposed to be set to 0 by 
> > >> __rte_pktmbuf_prefree_seg().
> > >> Wright now, it is a user responsibility to make sure refcnt==0 
> > >> before pushing
> > >> mbuf back to the pool.
> > >> Not sure why do you consider that wrong?
> > >
> > > I do not consider this wrong and I'm all for using assert() to catch
> > > programming errors, however in this specific case, I think they are
> > > inconsistent and misleading.
> > 
> >  Honestly, I don't understand why.
> >  Right now the rule of thumb is - when mbuf is in the pool, it's refcnt 
> >  should be equal zero.
> > >>
> > >> What is the purpose of this? Is there some code relying on this?
> > >
> > > The whole current implementation of mbuf_free code path relies on that.
> > > Straight here:
> > > if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m {
> > > m->next = NULL;
> > > __rte_mbuf_raw_free(m);
> > > }
> > >
> > > If we'll exclude indirect mbuf logic, all it does:
> > > if (rte_mbuf_refcnt_update(m, -1) == 0) {
> > >   m->next = NULL;
> > >   __rte_mbuf_raw_free(m);
> > > }
> > >
> > > I.E.:
> > > decrement mbuf->refcnt.
> > > If new value of refcnt is zero, then put it back into the pool.
> > >
> > > So having ASERT(mbuf->refcnt==0) inside
> > > __rte_mbuf_raw_free()/__rte_mbuf_raw_alloc()
> > > looks absolutely valid to me.
> > > I *has* to be zero at that point with current implementation,
> > > And if it is not then we probably have (or will have) a silent memory 
> > > corruption.
> > 
> > This explains how the refcount is used, and why it is set
> > to zero before returning the mbuf to the pool with the mbuf
> > free functions.
> 
> From my point, that shows that rte_pktmbuf_free() relies on the value of the 
> refcnt
> to make a decision is it ok to put mbuf back to the pool or not.
> Right now it puts mbuf to the pool *only* if it's refcnt==0.
> As discussed below, we probably can change it to be refcnt==1
> (if there really would be noticeable performance gain).
> But I think it still should be just one predefined value of refcnt (0 or 1).
> In theory it is possible to allow both (0 and 1),
> but that will make it hard to debug any alloc/free issues,
> plus would neglect any possible performance gain -
> as in that case raw_alloc (or it's analog) should still do
> mbuf->refcnt=1; 
> 

I think enforcing the rule of the refcnt being 1 inside the mempool is 
reasonable.
It saves setting the flag to zero and back to one again in the normal case. The
one case, that I can think of, where it does cause extra work is when two
threads simultaneously do the atomic decrement of the refcnt and one of them
gets zero and must free the buffer. However, in that case, the additional cost
of setting the count back to 1 on free is far less than the penalty paid on the
atomic decrement - and this should definitely be an edge-case anyway, since it
requires the two threads to free the same mbuf at the same time.

If we make this change, it also may open the possibility to look at moving the
refcount to cacheline 1 if we are reorganising the mbuf. It would no longer be
needed inside RX in our PMDs, and the path to free mbufs already uses the pool
pointer from the second cache-line anyway. That would free up 16-bits of space
to expand out the nb_segs and port values to 16-bit each if we wanted.

Regards,
/Bruce



[dpdk-dev] [PATCH v3 00/20] DPDK PMD for ThunderX NIC device

2016-06-09 Thread Bruce Richardson
On Thu, Jun 09, 2016 at 04:02:17PM +0200, Thomas Monjalon wrote:
> 2016-06-09 16:19, Jerin Jacob:
> > On Wed, Jun 08, 2016 at 04:08:37PM +0100, Bruce Richardson wrote:
> > > On Wed, Jun 08, 2016 at 03:42:14PM +0200, Thomas Monjalon wrote:
> > > > 2016-06-08 18:13, Jerin Jacob:
> > > > > On Wed, Jun 08, 2016 at 01:30:28PM +0100, Ferruh Yigit wrote:
> > > > > > Hi Jerin,
> > > > > > 
> > > > > > In patch subject, as tag, other drivers are using only driver name, 
> > > > > > and
> > > > > > Intel drivers also has "driver/base", since base code has some 
> > > > > > special
> > > > > > case. For thunderx, what do you think about keeping subject as:
> > > > > >  "thunderx: "
> > > > > > 
> > > > > 
> > > > > Hi Ferruh,
> > > > > 
> > > > > We may add crypto or other builtin ThunderX HW accelerated block 
> > > > > drivers
> > > > > in future to DPDK.
> > > > > So that is the reason why I thought of keeping the subject as 
> > > > > thunderx/nicvf.
> > > > > If you don't have any objection then I would like to keep it as
> > > > > thunderx/nicvf or just nicvf.
> > > > 
> > > > I don't like the name nicvf but I guess that's the official name?
> > > > 
> > > > Thus I agree the title should be thunderx/nicvf or thunderx_nicvf.
> > > 
> > > I think I'd prefer the underscore version.
> > 
> > underscore option looks bit odd when comparing to exiting other git logs.
> > as Ferruh suggested "net/thunderx:" looks good to me. If their no
> > objections I would like to go with "net/thunderx:"
> 
> net/thunderx would be easy to parse if all other drivers were using
> net/ or crypto/ prefixes.
> Do we want to add these prefixes to driver commits?
I would be in favour of that going forward.
It would avoid problems between ring library and ring pmd, and vhost library
and vhost pmd also.

/Bruce


[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Jan Viktorin
On Thu, 9 Jun 2016 10:39:46 +0100
"Hunt, David"  wrote:

> Hi Shreyansh,
> 
> On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
> > Hi David,
> >
> > Thanks for explanation. I have some comments inline...
> >  
> >> -Original Message-
> >> From: Hunt, David [mailto:david.hunt at intel.com]
> >> Sent: Tuesday, June 07, 2016 2:56 PM
> >> To: Shreyansh Jain ; dev at dpdk.org
> >> Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
> >> jerin.jacob at caviumnetworks.com
> >> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
> >> operations
> >>
> >> Hi Shreyansh,
> >>
> >> On 6/6/2016 3:38 PM, Shreyansh Jain wrote:  
> >>> Hi,
> >>>
> >>> (Apologies for overly-eager email sent on this thread earlier. Will be 
> >>> more  
> >> careful in future).  
> >>> This is more of a question/clarification than a comment. (And I have 
> >>> taken  
> >> only some snippets from original mail to keep it cleaner)  
> >>>   
>  +MEMPOOL_REGISTER_OPS(ops_mp_mc);
>  +MEMPOOL_REGISTER_OPS(ops_sp_sc);
>  +MEMPOOL_REGISTER_OPS(ops_mp_sc);
>  +MEMPOOL_REGISTER_OPS(ops_sp_mc);  
> >>> 
> >>>
> >>>   From the above what I understand is that multiple packet pool handlers 
> >>> can  
> >> be created.  
> >>> I have a use-case where application has multiple pools but only the 
> >>> packet  
> >> pool is hardware backed. Using the hardware for general buffer requirements
> >> would prove costly.  
> >>>   From what I understand from the patch, selection of the pool is based 
> >>> on  
> >> the flags below.
> >>
> >> The flags are only used to select one of the default handlers for
> >> backward compatibility through
> >> the rte_mempool_create call. If you wish to use a mempool handler that
> >> is not one of the
> >> defaults, (i.e. a new hardware handler), you would use the
> >> rte_create_mempool_empty
> >> followed by the rte_mempool_set_ops_byname call.
> >> So, for the external handlers, you create and empty mempool, then set
> >> the operations (ops)
> >> for that particular mempool.  
> > I am concerned about the existing applications (for example, l3fwd).
> > Explicit calls to 'rte_create_mempool_empty->rte_mempool_set_ops_byname' 
> > model would require modifications to these applications.
> > Ideally, without any modifications, these applications should be able to 
> > use packet pools (backed by hardware) and buffer pools (backed by 
> > ring/others) - transparently.
> >
> > If I go by your suggestions, what I understand is, doing the above without 
> > modification to applications would be equivalent to:
> >
> >struct rte_mempool_ops custom_hw_allocator = {...}
> >
> > thereafter, in config/common_base:
> >
> >CONFIG_RTE_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
> >
> > calls to rte_pktmbuf_pool_create would use the new allocator.  
> 
> Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to 
> rte_mempool_create will continue to use the default handlers (ring based).
> > But, another problem arises here.
> >
> > There are two distinct paths for allocations of a memory pool:
> > 1. A 'pkt' pool:
> > rte_pktmbuf_pool_create
> >   \- rte_mempool_create_empty
> >   |   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
> >   |
> >   `- rte_mempool_set_ops_byname
> > (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
> > /* Override default 'ring_mp_mc' of
> >  * rte_mempool_create */
> >
> > 2. Through generic mempool create API
> > rte_mempool_create
> >   \- rte_mempool_create_empty
> > (passing pktmbuf and pool constructors)
> >
> > I found various instances in example applications where 
> > rte_mempool_create() is being called directly for packet pools - bypassing 
> > the more semantically correct call to rte_pktmbuf_* for packet pools.
> >
> > In (2) control path, RTE_MBUF_DEFAULT_MEMPOOLS_OPS wouldn't be able to 
> > replace custom handler operations for packet buffer allocations.
> >
> >  From a performance point-of-view, Applications should be able to select 
> > between packet pools and non-packet pools.  
> 
> This is intended for backward compatibility, and API consistency. Any 
> applications that use
> rte_mempool_create directly will continue to use the default mempool 
> handlers. If the need
> to use a custeom hander, they will need to be modified to call the newer 
> API,
> rte_mempool_create_empty and rte_mempool_set_ops_byname.
> 
> 
> >>>   
>  +/*
>  + * Since we have 4 combinations of the SP/SC/MP/MC examine the 
>  flags to
>  + * set the correct index into the table of ops structs.
>  + */
>  +if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
>  +rte_mempool_set_ops_byname(mp, "ring_sp_sc");
>  +else if (flags & MEMPOOL_F_SP_PUT)
>  +rte_mempool_set_ops_byname(mp, "ring_sp_mc");
>  +else if (flags & MEMPOOL_F_SC_GET)
>  +

[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Hunt, David


On 9/6/2016 1:30 PM, Jerin Jacob wrote:
> On Thu, Jun 09, 2016 at 11:49:44AM +, Shreyansh Jain wrote:
>> Hi Jerin,
> Hi Shreyansh,
>
 Yes, this would simplify somewhat the creation of a pktmbuf pool, in that
>>> it
 replaces
 the rte_mempool_set_ops_byname with a flag bit. However, I'm not sure we
 want
 to introduce a third method of creating a mempool to the developers. If we
 introduced this, we would then have:
 1. rte_pktmbuf_pool_create()
 2. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC set (which would
 use the configured custom handler)
 3. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC __not__ set followed
 by a call to rte_mempool_set_ops_byname() (would allow several 
 different
 custom
 handlers to be used in one application

 Does anyone else have an opinion on this? Oliver, Jerin, Jan?
>>> As I mentioned earlier, My take is not to create the separate API's for
>>> external mempool handlers.In my view, It's same,  just that sepreate
>>> mempool handler through function pointers.
>>>
>>> To keep the backward compatibility, I think we can extend the flags
>>> in rte_mempool_create and have a single API external/internal pool
>>> creation(makes easy for existing applications too, add a just mempool
>>> flag command line argument to existing applications to choose the
>>> mempool handler)
>> May be I am interpreting it wrong, but, are you suggesting a single mempool 
>> handler for all buffer/packet needs of an application (passed as command 
>> line argument)?
>> That would be inefficient especially for cases where pool is backed by a 
>> hardware. The application wouldn't want its generic buffers to consume 
>> hardware resource which would be better used for packets.
> It may vary from platform to platform or particular use case. For instance,
> the HW external pool manager for generic buffers may scale better than SW 
> multi
> producers/multi-consumer implementation when the number of cores > N
> (as no locking involved in enqueue/dequeue(again it is depended on
> specific HW implementation))
>
> I thought their no harm in selecting the external pool handlers
> in root level itself(rte_mempool_create) as by default it is
> SW MP/MC and it just an option to override if the application wants it.
>
> Jerin
>


So, how about we go with the following, based on Shreyansh's suggestion:

1. Add in #define MEMPOOL_F_EMM_ALLOC 0x0010  (EMM for External Mempool 
Manager)

2. Check this bit in rte_mempool_create() just before the other bits are 
checked (by the way, the flags check has been moved to 
rte_mempool_create(), as per an earlier patchset, but was inadvertantly 
reverted)

 /*
  * First check to see if we use the config'd mempool hander.
  * Then examine the combinations of SP/SC/MP/MC flags to
  * set the correct index into the table of ops structs.
  */
 if (flags & MEMPOOL_F_EMM_ALLOC)
 rte_mempool_set_ops_byname(mp, RTE_MBUF_DEFAULT_MEMPOOL_OPS)
 else (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
 rte_mempool_set_ops_byname(mp, "ring_sp_sc");
 else if (flags & MEMPOOL_F_SP_PUT)
 rte_mempool_set_ops_byname(mp, "ring_sp_mc");
 else if (flags & MEMPOOL_F_SC_GET)
 rte_mempool_set_ops_byname(mp, "ring_mp_sc");
 else
 rte_mempool_set_ops_byname(mp, "ring_mp_mc");

3. Modify rte_pktmbuf_pool_create to pass the bit to rte_mempool_create

-sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+sizeof(struct rte_pktmbuf_pool_private), socket_id,
+MEMPOOL_F_PKT_ALLOC);


This will allow legacy apps to use one external handler ( as defined 
RTE_MBUF_DEFAULT_MEMPOOL_OPS) by adding the MEMPOOL_F_EMM_ALLOC bit to 
their flags in the call to rte_mempool_create().

Of course, if an app wants to use more than one external handler, they 
can call create_empty and set_ops_byname() for each mempool they create.

Regards,
Dave.












[dpdk-dev] [PATCH 1/2] mk: prevent overlinking in applications

2016-06-09 Thread Ferruh Yigit
On 6/9/2016 11:10 AM, Thomas Monjalon wrote:
> Hi Ferruh,
> 
> 2016-05-27 17:48, Ferruh Yigit:
>> Replace --no-as-needed linker flag with --as-needed flag, which will
>> only link libraries directly called by application. This requires inter
>> library dependencies resolved correctly.
>>
>> Not linking all libraries cause a compile error for lpcap and possible
>> to have other similar compiler errors, so increasing the scope of
>> --start-group argument.
> 
> What is the error?
> 
This is with -as-needed flag, and static library compilation:

.../dpdk/build/lib/librte_pmd_pcap.a(rte_eth_pcap.o): In function
`eth_dev_stop':
rte_eth_pcap.c:(.text+0xcc1): undefined reference to `pcap_dump_close'
rte_eth_pcap.c:(.text+0xcd6): undefined reference to `pcap_close'
rte_eth_pcap.c:(.text+0xd19): undefined reference to `pcap_close'
rte_eth_pcap.c:(.text+0xd58): undefined reference to `pcap_close'
/root/development/dpdk/build/lib/librte_pmd_pcap.a(rte_eth_pcap.o): In
function `open_tx_pcap':
rte_eth_pcap.c:(.text+0x190b): undefined reference to `pcap_open_dead'
rte_eth_pcap.c:(.text+0x191b): undefined reference to `pcap_dump_open'
...

pcap pmd has libpcap calls, but while linking final application
(testpmd), linker is not able to find objects that has these.

The command line for compilation is:
gcc ... -o test ... -Wl,--whole-archive  -Wl,-lpcap
-Wl,--start-group  -Wl,-lrte_pmd_pcap 
-Wl,--end-group -Wl,--no-whole-archive

-lpcap is provided, but since before -lrte_pmd_pcap, references not
resolved.

Previously, with "-no-as-needed" flag, all shared libraries linked
always, which was preventing compile error.


>> cmdline_test application causes compile error because of cyclic
>> dependency between librte_eal <-> librte_mempool. A workaround added to
>> cmdline_test for compile error.
>>
>> Signed-off-by: Ferruh Yigit 
> 
>> --- a/app/cmdline_test/Makefile
>> +++ b/app/cmdline_test/Makefile
>> @@ -46,6 +46,7 @@ SRCS-y += commands.c
>>  
>>  CFLAGS += -O3
>>  CFLAGS += $(WERROR_FLAGS)
> 
> A comment is required here to explain the workaround.
> 

Sure, I will add a comment and send a new version of the patch.

>> +LDFLAGS += -no-as-needed
> 
> The option should be --no-as-needed.
Right

> 
>> --- a/mk/exec-env/linuxapp/rte.vars.mk
>> +++ b/mk/exec-env/linuxapp/rte.vars.mk
>> @@ -46,7 +46,7 @@ EXECENV_CFLAGS  = -pthread
>>  endif
>>  
>>  # Workaround lack of DT_NEEDED entry
> 
> This comment is obsolete now.
Will remove.

> 
>> -EXECENV_LDFLAGS = --no-as-needed
>> +EXECENV_LDFLAGS = --as-needed
> 
> Why put this option for Linux only?
When this option not defined at all, different compilers behave
different, some has --as-needed as default and cause compilation issues,
I guess that is why for Linux --no-as-needed is forces. I assume BSD
compilers by default use --no-as-needed, so we don't have any
compilation problem.
And it looks like there is no specific reason to not force BSD to
--as-needed, I will test it.

> Shouldn't we restrict this option to app.mk only?
Having --no-needed flag only for app should be OK. Let me test first,
and as a result should we move this to app.mk?

> 
>> --- a/mk/rte.app.mk
>> +++ b/mk/rte.app.mk
>> @@ -58,6 +58,7 @@ _LDLIBS-y += -L$(RTE_SDK_BIN)/lib
>>  #
>>  
>>  _LDLIBS-y += --whole-archive
>> +_LDLIBS-y += --start-group
> 
> --start-group must be used only to solve circular dependencies.
> Ideally we should not use it. I think it's needed only because
> of EAL logs using mempool (must be removed).
No, this is not the reason, please check above lpcap compile error, that
is the reason of this update.

> Why extending it?
> I'm afraid we are masking some issues.
But if we have a convention to add external libraries after dpdk
libraries, that also should solve this issue, but that is more update
than just updating --start-group.

> 
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)+= -lrte_distributor
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)+= -lrte_reorder
>> @@ -111,8 +112,6 @@ _LDLIBS-y   += -lcrypto
>>  endif
>>  endif # !CONFIG_RTE_BUILD_SHARED_LIBS
>>  
>> -_LDLIBS-y += --start-group
>> -
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_KVARGS) += -lrte_kvargs
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MBUF)   += -lrte_mbuf
>>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)+= -lrte_ip_frag
> 



[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Hunt, David


On 9/6/2016 12:41 PM, Shreyansh Jain wrote:
> Hi David,
>
>> -Original Message-
>> From: Hunt, David [mailto:david.hunt at intel.com]
>> Sent: Thursday, June 09, 2016 3:10 PM
>> To: Shreyansh Jain ; dev at dpdk.org
>> Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
>> jerin.jacob at caviumnetworks.com
>> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
>> operations
>>
>> Hi Shreyansh,
>>
>> On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
>>> Hi David,
>>>
>>> Thanks for explanation. I have some comments inline...
>>>
 -Original Message-
 From: Hunt, David [mailto:david.hunt at intel.com]
 Sent: Tuesday, June 07, 2016 2:56 PM
 To: Shreyansh Jain ; dev at dpdk.org
 Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
 jerin.jacob at caviumnetworks.com
 Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
 operations

 Hi Shreyansh,

 On 6/6/2016 3:38 PM, Shreyansh Jain wrote:
> Hi,
>
> (Apologies for overly-eager email sent on this thread earlier. Will be
>> more
 careful in future).
> This is more of a question/clarification than a comment. (And I have
>> taken
 only some snippets from original mail to keep it cleaner)
> 
>> +MEMPOOL_REGISTER_OPS(ops_mp_mc);
>> +MEMPOOL_REGISTER_OPS(ops_sp_sc);
>> +MEMPOOL_REGISTER_OPS(ops_mp_sc);
>> +MEMPOOL_REGISTER_OPS(ops_sp_mc);
> 
>
>From the above what I understand is that multiple packet pool handlers
>> can
 be created.
> I have a use-case where application has multiple pools but only the
>> packet
 pool is hardware backed. Using the hardware for general buffer
>> requirements
 would prove costly.
>From what I understand from the patch, selection of the pool is based
>> on
 the flags below.

 The flags are only used to select one of the default handlers for
 backward compatibility through
 the rte_mempool_create call. If you wish to use a mempool handler that
 is not one of the
 defaults, (i.e. a new hardware handler), you would use the
 rte_create_mempool_empty
 followed by the rte_mempool_set_ops_byname call.
 So, for the external handlers, you create and empty mempool, then set
 the operations (ops)
 for that particular mempool.
>>> I am concerned about the existing applications (for example, l3fwd).
>>> Explicit calls to 'rte_create_mempool_empty->rte_mempool_set_ops_byname'
>> model would require modifications to these applications.
>>> Ideally, without any modifications, these applications should be able to 
>>> use packet pools (backed by hardware) and buffer pools (backed by
>>> ring/others) - transparently.
>>> If I go by your suggestions, what I understand is, doing the above without 
>>> modification to applications would be equivalent to:
>>> struct rte_mempool_ops custom_hw_allocator = {...}
>>>
>>> thereafter, in config/common_base:
>>>
>>> CONFIG_RTE_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
>>>
>>> calls to rte_pktmbuf_pool_create would use the new allocator.
>> Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to
>> rte_mempool_create will continue to use the default handlers (ring based).
> Agree with you.
> But, some applications continue to use rte_mempool_create for allocating 
> packet pools. Thus, even with a custom handler available (which, most 
> probably, would be a hardware packet buffer handler), application would 
> unintentionally end up not using it.
> Probably, such applications should be changed? (e.g. pipeline).

Yes, agreed.  If those applications need to use external mempool 
handlers, then they should be changed. I would see that as outside the 
scope of this patchset.


>>> But, another problem arises here.
>>>
>>> There are two distinct paths for allocations of a memory pool:
>>> 1. A 'pkt' pool:
>>>  rte_pktmbuf_pool_create
>>>\- rte_mempool_create_empty
>>>|   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
>>>|
>>>`- rte_mempool_set_ops_byname
>>>  (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
>>>  /* Override default 'ring_mp_mc' of
>>>   * rte_mempool_create */
>>>
>>> 2. Through generic mempool create API
>>>  rte_mempool_create
>>>\- rte_mempool_create_empty
>>>  (passing pktmbuf and pool constructors)
>>>
>>> I found various instances in example applications where
>> rte_mempool_create() is being called directly for packet pools - bypassing
>> the more semantically correct call to rte_pktmbuf_* for packet pools.
>>> In (2) control path, RTE_MBUF_DEFAULT_MEMPOOLS_OPS wouldn't be able to
>> replace custom handler operations for packet buffer allocations.
>>>   From a performance point-of-view, Applications should be able to select
>> between packet pools and non-packet pools.
>>
>> This is intended for backward compatibility, and API consistency. Any
>> 

[dpdk-dev] [PATCHv7 6/6] doc: Add prog_guide section documenting pmdinfo script

2016-06-09 Thread Neil Horman
Information on pmdinfogen may be useful to 3rd party driver developers.
Include documentation on what it does

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 doc/guides/prog_guide/dev_kit_build_system.rst | 43 --
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/doc/guides/prog_guide/dev_kit_build_system.rst 
b/doc/guides/prog_guide/dev_kit_build_system.rst
index 3e89eae..1dc1388 100644
--- a/doc/guides/prog_guide/dev_kit_build_system.rst
+++ b/doc/guides/prog_guide/dev_kit_build_system.rst
@@ -70,7 +70,7 @@ Each build directory contains include files, libraries, and 
applications:
 ...
 ~/DEV/DPDK$ ls i686-native-linuxapp-gcc

-app build hostapp include kmod lib Makefile
+app build buildtools include kmod lib Makefile


 ~/DEV/DPDK$ ls i686-native-linuxapp-gcc/app/
@@ -264,7 +264,7 @@ These Makefiles generate a binary application.

 *   rte.extapp.mk: External application

-*   rte.hostapp.mk: Host application in the development kit framework
+*   rte.hostapp.mk: prerequisite tool to build dpdk

 Library
 ^^^
@@ -304,6 +304,45 @@ Misc

 *   rte.subdir.mk: Build several directories in the development kit framework.

+.. _Internally_Generated_Build_Tools:
+
+Internally Generated Build Tools
+
+
+``app/pmdinfogen``
+
+
+``pmdinfogen`` scans an object (.o) file for various well known symbol names.  
These
+well known symbol names are defined by various macros and used to export
+important information about hardware support and usage for pmd files.  For
+instance the macro:
+
+.. code-block:: c
+
+PMD_REGISTER_DRIVER(drv, name)
+
+
+Creates the following symbol:
+
+.. code-block:: c
+
+   static char this_pmd_name0[] __attribute__((used)) = "";
+
+
+Which pmdinfogen scans for.  Using this information other relevant bits of data
+can be exported from the object file and used to produce a hardware support
+description, that pmdinfogen then encodes into a json formatted string in the
+following format:
+
+.. code-block:: C
+
+   static char ="PMD_INFO_STRING=\"{'name' : '', 
...}\"";
+
+
+These strings can then be searched for by external tools to determine the
+hardware support of a given library or application.
+
+
 .. _Useful_Variables_Provided_by_the_Build_System:

 Useful Variables Provided by the Build System
-- 
2.5.5



[dpdk-dev] [PATCHv7 5/6] pmdinfo.py: Add tool to query binaries for hw and other support information

2016-06-09 Thread Neil Horman
This tool searches for the primer sting PMD_DRIVER_INFO= in any ELF binary,
and, if found parses the remainder of the string as a json encoded string,
outputting the results in either a human readable or raw, script parseable
format

Note that, in the case of dynamically linked applications, pmdinfo.py will
scan for implicitly linked PMDs by searching the specified binaries
.dynamic section for DT_NEEDED entries that contain the substring
librte_pmd.  The DT_RUNPATH, LD_LIBRARY_PATH, /usr/lib and /lib are
searched for these libraries, in that order

If a file is specified with no path, it is assumed to be a PMD DSO, and the
LD_LIBRARY_PATH, /usr/lib[64]/ and /lib[64] is searched for it

Currently the tool can output data in 3 formats:

a) raw, suitable for scripting, where the raw JSON strings are dumped out
b) table format (default) where hex pci ids are dumped in a table format
c) pretty, where a user supplied pci.ids file is used to print out vendor
and device strings

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 mk/rte.sdkinstall.mk |   2 +
 tools/pmdinfo.py | 629 +++
 2 files changed, 631 insertions(+)
 create mode 100755 tools/pmdinfo.py

diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk
index 68e56b6..dc36df5 100644
--- a/mk/rte.sdkinstall.mk
+++ b/mk/rte.sdkinstall.mk
@@ -126,6 +126,8 @@ install-runtime:
$(Q)$(call rte_mkdir,  $(DESTDIR)$(sbindir))
$(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/tools/dpdk_nic_bind.py, 
\
   $(DESTDIR)$(sbindir)/dpdk_nic_bind)
+   $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/tools/pmdinfo.py, \
+  $(DESTDIR)$(bindir)/dpdk-pmdinfo)

 install-kmod:
 ifneq ($(wildcard $O/kmod/*),)
diff --git a/tools/pmdinfo.py b/tools/pmdinfo.py
new file mode 100755
index 000..e531154
--- /dev/null
+++ b/tools/pmdinfo.py
@@ -0,0 +1,629 @@
+#!/usr/bin/python
+# -
+# scripts/pmdinfo.py
+#
+# Utility to dump PMD_INFO_STRING support from an object file
+#
+# -
+import os
+import sys
+from optparse import OptionParser
+import string
+import json
+
+# For running from development directory. It should take precedence over the
+# installed pyelftools.
+sys.path.insert(0, '.')
+
+
+from elftools import __version__
+from elftools.common.exceptions import ELFError
+from elftools.common.py3compat import (
+ifilter, byte2int, bytes2str, itervalues, str2bytes)
+from elftools.elf.elffile import ELFFile
+from elftools.elf.dynamic import DynamicSection, DynamicSegment
+from elftools.elf.enums import ENUM_D_TAG
+from elftools.elf.segments import InterpSegment
+from elftools.elf.sections import SymbolTableSection
+from elftools.elf.gnuversions import (
+GNUVerSymSection, GNUVerDefSection,
+GNUVerNeedSection,
+)
+from elftools.elf.relocation import RelocationSection
+from elftools.elf.descriptions import (
+describe_ei_class, describe_ei_data, describe_ei_version,
+describe_ei_osabi, describe_e_type, describe_e_machine,
+describe_e_version_numeric, describe_p_type, describe_p_flags,
+describe_sh_type, describe_sh_flags,
+describe_symbol_type, describe_symbol_bind, describe_symbol_visibility,
+describe_symbol_shndx, describe_reloc_type, describe_dyn_tag,
+describe_ver_flags,
+)
+from elftools.elf.constants import E_FLAGS
+from elftools.dwarf.dwarfinfo import DWARFInfo
+from elftools.dwarf.descriptions import (
+describe_reg_name, describe_attr_value, set_global_machine_arch,
+describe_CFI_instructions, describe_CFI_register_rule,
+describe_CFI_CFA_rule,
+)
+from elftools.dwarf.constants import (
+DW_LNS_copy, DW_LNS_set_file, DW_LNE_define_file)
+from elftools.dwarf.callframe import CIE, FDE
+
+raw_output = False
+pcidb = None
+
+# ===
+
+
+class Vendor:
+"""
+Class for vendors. This is the top level class
+for the devices belong to a specific vendor.
+self.devices is the device dictionary
+subdevices are in each device.
+"""
+
+def __init__(self, vendorStr):
+"""
+Class initializes with the raw line from pci.ids
+Parsing takes place inside __init__
+"""
+self.ID = vendorStr.split()[0]
+self.name = vendorStr.replace("%s " % self.ID, "").rstrip()
+self.devices = {}
+
+def addDevice(self, deviceStr):
+"""
+Adds a device to self.devices
+takes the raw line from pci.ids
+"""
+s = deviceStr.strip()
+devID = s.split()[0]
+if devID in self.devices:
+pass
+else:
+self.devices[devID] = Device(deviceStr)
+
+def report(self):
+print self.ID, self.name
+for id, 

[dpdk-dev] [PATCHv7 4/6] Makefile: Do post processing on objects that register a driver

2016-06-09 Thread Neil Horman
Modify the compilation makefile to identify C files that export PMD
information, and use that to trigger execution of the pmdinfo binary.  If
the execution of pmdinfo is successful, compile the output C file to an
object, and use the linker to do relocatable linking on the resultant
object file into the parent object that it came from.  This effectively
just adds the json string into the string table of the object that defines
the PMD to the outside world.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 mk/internal/rte.compile-pre.mk | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/mk/internal/rte.compile-pre.mk b/mk/internal/rte.compile-pre.mk
index b9bff4a..5632d6e 100644
--- a/mk/internal/rte.compile-pre.mk
+++ b/mk/internal/rte.compile-pre.mk
@@ -88,10 +88,24 @@ C_TO_O_CMD = 'cmd_$@ = $(C_TO_O_STR)'
 C_TO_O_DO = @set -e; \
echo $(C_TO_O_DISP); \
$(C_TO_O) && \
+   sh -c "grep -q \"PMD_REGISTER_DRIVER(.*)\" $<; \
+   if [ \$$? -eq 0 ]; then \
+   echo \"  PMDINFOGEN\" $@; \
+   OBJF=`readlink -f $@`; \
+   ${RTE_OUTPUT}/app/pmdinfogen \$$OBJF \$$OBJF.pmd.c; \
+   if [ \$$? -eq 0 ]; \
+   then \
+   echo \"  PMDINFOBUILD\" $@; \
+   $(CC) $(CFLAGS) -c -o \$$OBJF.pmd.o \$$OBJF.pmd.c; \
+   $(CROSS)ld $(LDFLAGS) -r -o \$$OBJF.o \$$OBJF.pmd.o 
\$$OBJF; \
+   mv -f \$$OBJF.o \$$OBJF; \
+   fi; \
+   fi;" && \
echo $(C_TO_O_CMD) > $(call obj2cmd,$(@)) && \
sed 's,'$@':,dep_'$@' =,' $(call obj2dep,$(@)).tmp > $(call 
obj2dep,$(@)) && \
rm -f $(call obj2dep,$(@)).tmp

+
 # return an empty string if string are equal
 compare = $(strip $(subst $(1),,$(2)) $(subst $(2),,$(1)))

-- 
2.5.5



[dpdk-dev] [PATCHv7 3/6] eal: Add an export symbol to expose the autoload path to external tools

2016-06-09 Thread Neil Horman
Export a symbol containing the string:
DPDK_PLUGIN_PATH="$(CONFIG_RTE_EAL_PMD_PATH)"

Where the latter half of the string is set at build time to a location from
which autoloaded DSO's will be found.  This string is used by pmdinfo in
'plugin' mode, whereby a user can specify a dpdk installation directory (or
static binary), and scan the associated path (if found) for pmd DSO's and
report on their hardware support.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 lib/librte_eal/common/eal_common_options.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 3efc90f..0a594d7 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -115,6 +115,15 @@ TAILQ_HEAD_INITIALIZER(solib_list);
 /* Default path of external loadable drivers */
 static const char *default_solib_dir = RTE_EAL_PMD_PATH;

+/*
+ * Stringified version of solib path used by pmdinfo.py
+ * Note: PLEASE DO NOT ALTER THIS without making a corresponding
+ * change to tools/pmdinfo.py
+ */
+static const char dpdk_solib_path[] __attribute__((used)) =
+"DPDK_PLUGIN_PATH=" RTE_EAL_PMD_PATH;
+
+
 static int master_lcore_parsed;
 static int mem_parsed;

-- 
2.5.5



[dpdk-dev] [PATCHv7 2/6] drivers: Update driver registration macro usage

2016-06-09 Thread Neil Horman
Modify the PMD_REGISTER_DRIVER macro, adding a name argument to it.  The
addition of a name argument creates a token that can be used for subsequent
macros in the creation of unique symbol names to export additional bits of
information for use by the pmdinfogen tool.  For example:

PMD_REGISTER_DRIVER(ena_driver, ena);

registers the ena_driver struct as it always did, and creates a symbol
const char this_pmd_name0[] __attribute__((used)) = "ena";

which pmdinfogen can search for and extract.  The subsequent macro

DRIVER_REGISTER_PCI_TABLE(ena, ena_pci_id_map);

creates a symbol const char ena_pci_tbl_export[] __attribute__((used)) =
"ena_pci_id_map";

Which allows pmdinfogen to find the pci table of this driver

Using this pattern, we can export arbitrary bits of information.

pmdinfo uses this information to extract hardware support from an object
file and create a json string to make hardware support info discoverable
later.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 drivers/Makefile   |  2 ++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  4 +++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  4 +++-
 drivers/crypto/null/null_crypto_pmd.c  |  4 +++-
 drivers/crypto/qat/rte_qat_cryptodev.c |  4 +++-
 drivers/crypto/snow3g/rte_snow3g_pmd.c |  4 +++-
 drivers/net/af_packet/rte_eth_af_packet.c  |  4 +++-
 drivers/net/bnx2x/bnx2x_ethdev.c   |  6 --
 drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
 drivers/net/cxgbe/cxgbe_ethdev.c   |  4 +++-
 drivers/net/e1000/em_ethdev.c  |  3 ++-
 drivers/net/e1000/igb_ethdev.c |  6 --
 drivers/net/ena/ena_ethdev.c   |  3 ++-
 drivers/net/enic/enic_ethdev.c |  3 ++-
 drivers/net/fm10k/fm10k_ethdev.c   |  3 ++-
 drivers/net/i40e/i40e_ethdev.c |  3 ++-
 drivers/net/i40e/i40e_ethdev_vf.c  |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c   |  6 --
 drivers/net/mlx4/mlx4.c|  3 ++-
 drivers/net/mlx5/mlx5.c|  3 ++-
 drivers/net/mpipe/mpipe_tilegx.c   |  4 ++--
 drivers/net/nfp/nfp_net.c  |  3 ++-
 drivers/net/null/rte_eth_null.c|  3 ++-
 drivers/net/pcap/rte_eth_pcap.c|  4 +++-
 drivers/net/qede/qede_ethdev.c |  6 --
 drivers/net/ring/rte_eth_ring.c|  3 ++-
 drivers/net/szedata2/rte_eth_szedata2.c|  3 ++-
 drivers/net/vhost/rte_eth_vhost.c  |  3 ++-
 drivers/net/virtio/virtio_ethdev.c |  3 ++-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   |  3 ++-
 drivers/net/xenvirt/rte_eth_xenvirt.c  |  2 +-
 lib/librte_eal/common/include/rte_dev.h| 30 --
 32 files changed, 105 insertions(+), 41 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 81c03a8..75a3168 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -34,4 +34,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 DIRS-y += net
 DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += crypto

+DEPDIRS-y += buildtools/pmdinfo
+
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c 
b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
index 2987ef6..f43e407 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
@@ -521,4 +521,6 @@ static struct rte_driver aesni_gcm_pmd_drv = {
.uninit = aesni_gcm_uninit
 };

-PMD_REGISTER_DRIVER(aesni_gcm_pmd_drv);
+PMD_REGISTER_DRIVER(aesni_gcm_pmd_drv, aesni_gcm);
+DRIVER_REGISTER_PARAM_STRING(aesni_gcm, "max_nb_queue_pairs= "
+"max_nb_sessions= socket_id=");
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c 
b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index 6554fc4..db3e562 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -721,4 +721,6 @@ static struct rte_driver cryptodev_aesni_mb_pmd_drv = {
.uninit = cryptodev_aesni_mb_uninit
 };

-PMD_REGISTER_DRIVER(cryptodev_aesni_mb_pmd_drv);
+PMD_REGISTER_DRIVER(cryptodev_aesni_mb_pmd_drvi, aesni_mb);
+DRIVER_REGISTER_PARAM_STRING(aesni_gcm, "max_nb_queue_pairs= "
+"max_nb_sessions= socket_id=");
diff --git a/drivers/crypto/null/null_crypto_pmd.c 
b/drivers/crypto/null/null_crypto_pmd.c
index bdaf13c..0a195ed 100644
--- a/drivers/crypto/null/null_crypto_pmd.c
+++ b/drivers/crypto/null/null_crypto_pmd.c
@@ -275,4 +275,6 @@ static struct rte_driver cryptodev_null_pmd_drv = {
.uninit = cryptodev_null_uninit
 };

-PMD_REGISTER_DRIVER(cryptodev_null_pmd_drv);
+PMD_REGISTER_DRIVER(cryptodev_null_pmd_drv, cryptodev_null_pmd);
+DRIVER_REGISTER_PARAM_STRING(aesni_gcm, "max_nb_queue_pairs= "
+"max_nb_sessions= socket_id=");
diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..d2de6a6 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ 

[dpdk-dev] [PATCHv7 1/6] pmdinfogen: Add buildtools and pmdinfogen utility

2016-06-09 Thread Neil Horman
pmdinfogen is a tool used to parse object files and build json strings for
use in later determining hardware support in a dso or application binary.
pmdinfo looks for the non-exported symbol names this_pmd_name and
this_pmd_tbl (where n is a integer counter).  It records the name of
each of these tuples, using the later to find the symbolic name of the
pci_table for physical devices that the object supports.  With this
information, it outputs a C file with a single line of the form:

static char *_driver_info[] __attribute__((used)) = " \
PMD_DRIVER_INFO=";

Where  is the arbitrary name of the pmd, and  is the
json encoded string that hold relevant pmd information, including the pmd
name, type and optional array of pci device/vendor ids that the driver
supports.

This c file is suitable for compiling to object code, then relocatably
linking into the parent file from which the C was generated.  This creates
an entry in the string table of the object that can inform a later tool
about hardware support.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 GNUmakefile|   2 +-
 buildtools/Makefile|  36 +++
 buildtools/pmdinfogen/Makefile |  49 +
 buildtools/pmdinfogen/pmdinfogen.c | 439 +
 buildtools/pmdinfogen/pmdinfogen.h | 120 ++
 mk/rte.hostapp.mk  |   8 +-
 mk/rte.sdkbuild.mk |   3 +-
 7 files changed, 651 insertions(+), 6 deletions(-)
 create mode 100644 buildtools/Makefile
 create mode 100644 buildtools/pmdinfogen/Makefile
 create mode 100644 buildtools/pmdinfogen/pmdinfogen.c
 create mode 100644 buildtools/pmdinfogen/pmdinfogen.h

diff --git a/GNUmakefile b/GNUmakefile
index b59e4b6..00fe0db 100644
--- a/GNUmakefile
+++ b/GNUmakefile
@@ -40,6 +40,6 @@ export RTE_SDK
 # directory list
 #

-ROOTDIRS-y := lib drivers app
+ROOTDIRS-y := buildtools lib drivers app

 include $(RTE_SDK)/mk/rte.sdkroot.mk
diff --git a/buildtools/Makefile b/buildtools/Makefile
new file mode 100644
index 000..35a42ff
--- /dev/null
+++ b/buildtools/Makefile
@@ -0,0 +1,36 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Neil Horman. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+DIRS-y += pmdinfogen
+
+include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/buildtools/pmdinfogen/Makefile b/buildtools/pmdinfogen/Makefile
new file mode 100644
index 000..125901b
--- /dev/null
+++ b/buildtools/pmdinfogen/Makefile
@@ -0,0 +1,49 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Neil Horman. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior 

[dpdk-dev] [PATCHv7 0/6] Implement pmd hardware support exports

2016-06-09 Thread Neil Horman
Hey all-
So heres attempt number 2 at a method for exporting PMD hardware support
information.  As we discussed previously, the consensus seems to be that pmd
information should be:

1) Able to be interrogated on any ELF binary (application binary or individual
DSO)
2) Equally functional on statically linked applications or on DSO's
3) Resilient to symbol stripping
4) Script friendly
5) Show kernel dependencies
6) List driver options
7) Show driver name
8) Offer human readable output
9) Show DPDK version
10) Show driver version
11) Allow for expansion
12) Not place additional build environment dependencies on an application

I added item 12 myself, because I don't think its reasonable to require
applications to use a specific linker script to get hardware support information
(which precludes the use of a special .modinfo section like the kernel uses)

However, we still can use some tricks from the kernel to make this work.  In
this approach, what I've done is the following:

A) Modified the driver registration macro to also define a variable:
this_pmd_name= "name"

Based on the unique name string pointed to by the above variable, we can
query for an arbitrary number of other symbols following the pattern:
_

Where tag is some well known identifier for information we wish to export

B) Added a utility called pmdinfogen.  This utility is not meant for general 
use,
but rather is used by the dpdk build environment itself when compiling pmds.
for each object that uses the PMD_REGISTER_DRIVER macro, this utiity is run.  It
searches for the symbols in (A), and using those, extracts the hardware support
info, and module name from the object, using that to produce a new c file
containing a single variable in the following format:

static const char [] __attribute__((used)) = "PMD_DRIVER_INFO=";

The  is arbitrary, as its static and not referenced.  The relevant bit is
the string value assigned to it.  The  is a json encoded string of the
extracted hardware support information pmdinfo found for the corresponding
object.  This C file is suitable for compilation and relocatable linking back
into the parent object file.  The result of this operation is that the object
string table now contains a string that will not e removed by stripping, whos
leading text (PMD_DRIVER_INFO) can be easily searched for at any time weather
the symbol referring to it is stripped or not.

C) Added a utilty called pmdinfo.py.  This python script, searches the
string table of the .rodata section of any provided ELF file looking for the
PMD_DRIVER_INFO prefix on a string.  When found, it will interpret the remainder
of the string as json, and output the hardware support for that ELF file (if
any).


This approach ticks most of the above boxes:
1) Impervious to stripping
2) Works on static and dynamic binaries
3) Human and script friendly
4) Allows for expansion

Because of (4) the other items should be pretty easy to implement, as its just a
matter of modifying the macros to export the info, pmdinfo to encode it to json,
and pmd_hw_support.py to read it.  I'm not adding them now, as theres alot of
rote work to do to get some of them in place (e.g. DPDK has no current global
VERSION macro, drivers don't have a consistent version scheme, command line
strings have to be built for each driver, etc).  But once this is accepted,
those items can be done as time allows and should be fairly easy to implement.

Change Notes:
v2)
 * Made the export macros a bit easier to expand
 * Added a macro to optionally export command line strings
 * Renamed utilties to be more appropriate
   (pmdinfo -> pmdinfogen, pmd_hw_support.py -> pmdinfo.py)
 * Added search capabilities to pmdinfo.py so that we search for libraries
   linked using DT_NEEDINFO entries.  We search DT_RUNPATH, LD_LIBRARY_PATH,
   /usr/lib and /lib
 * Added serch abilities if the specified binary to pmdinfo isn't found, we
   search LD_LIBRARY_PATH, /usr/lib and /lib
 * Added an option to pmdinfo.py to pretty-print hardware support info using the
   pci.ids database

v3)
 * Made table mode printing the default mode
 * Added a default path to the pci.ids file
 * Modifed pmdinfo to use python rather than python3
 * Cleaned up some changelog entries
 * Added an export for RTE_EAL_PMD_PATH
 * Added a plugin option to pmdinfo to scan for autoloaded DSO's

v4)
 * Modified the operation of the -p option. As much as I don't like implying
that autoloaded pmds are guaranteed to be there at run time, I'm having a hard
time seeing how we can avoid specifying the application file to scan for the
autoload directory.  Without it we can't determine which library the user means
in a multiversion installation
 * Cleaned up the help text
 * Added a rule for an install target for pmdinfo
 * Guarded against some tracebacks in pmdinfo
 * Use DT_NEEDED entries to get versioned libraries in -p mode
 * Fixed traceback that occurs on lack of input arguments
 * Fixed some erroneous macro usage in drivers that 

[dpdk-dev] [PATCH] examples/ip_pipeline: fix build error for gcc 4.8

2016-06-09 Thread Daniel Mrzyglod
This patch fixes a maybe-uninitialized warning when compiling DPDK with GCC 4.8

examples/ip_pipeline/pipeline/pipeline_common_fe.c: In function 
'app_pipeline_track_pktq_out_to_link':
examples/ip_pipeline/pipeline/pipeline_common_fe.c:66:31: error:
'reader' may be used uninitialized in this function 
[-Werror=maybe-uninitialized]

   struct app_pktq_out_params *pktq_out =

Fixes: 760064838ec0 ("examples/ip_pipeline: link routing output ports to 
devices")

Signed-off-by: Daniel Mrzyglod 
---
 examples/ip_pipeline/app.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 848244a..592c17c 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -667,7 +667,7 @@ app_swq_get_reader(struct app_params *app,
struct app_pktq_swq_params *swq,
uint32_t *pktq_in_id)
 {
-   struct app_pipeline_params *reader;
+   struct app_pipeline_params *reader = NULL;
uint32_t pos = swq - app->swq_params;
uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
RTE_DIM(app->pipeline_params));
@@ -727,7 +727,7 @@ app_tm_get_reader(struct app_params *app,
struct app_pktq_tm_params *tm,
uint32_t *pktq_in_id)
 {
-   struct app_pipeline_params *reader;
+   struct app_pipeline_params *reader = NULL;
uint32_t pos = tm - app->tm_params;
uint32_t n_pipelines = RTE_MIN(app->n_pipelines,
RTE_DIM(app->pipeline_params));
-- 
1.7.9.5



[dpdk-dev] [PATCHv6 7/7] doc: Add prog_guide section documenting pmdinfo script

2016-06-09 Thread Neil Horman
On Wed, Jun 08, 2016 at 05:14:05PM +, Mcnamara, John wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Tuesday, May 31, 2016 2:58 PM
> > To: dev at dpdk.org
> > Cc: Neil Horman ; Richardson, Bruce
> > ; Thomas Monjalon  > 6wind.com>;
> > Stephen Hemminger ; Panu Matilainen
> > 
> > Subject: [dpdk-dev] [PATCHv6 7/7] doc: Add prog_guide section documenting
> > pmdinfo script
> > 
> > Information on pmdinfogen may be useful to 3rd party driver developers.
> > Include documentation on what it does
> 
> Hi,
> 
> There are some trailing whitespace warnings on merge. Some other, minor,
> comments below.
> 
> 
> > 
> > +.. _Internally_Generated_Build_Tools
> 
> The target needs a colon at the end to make it valid:
> 
> .. _Internally_Generated_Build_Tools:
> 
> 
> > +
> > +buildtools/pmdinfogen
> 
> This might be better with some distinguishing highlighting, either
> ``buildtools/pmdinfogen`` or **buildtools/pmdinfogen**.
> 
> 
> > +pmdinfogen scans an object (.o) file for various well known symbol
> 
> Instances of pmdinfogen would be better as a fixed width ``pmdinfogen``.
> 
> 
> > +names.  These well know symbol names are defined by various macros and
> > +used to export important information about hardware support and usage
> > +for pmd files.  for instance the macro:
> 
> s/know/known
> s/for/For
> 
> 
> > +These strings can then be searched for by external tools to determine
> > +the hardware support of a given library or application
> 
> Missing full stop.
> 
> Apart from these minor changes:
> 
> 
> Acked-by: John McNamara 
Thanks, I'm incorporating this with the updates that Thomas and I have been
arguing about.
Neil

> 
> 
> 


[dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for packet capturing support

2016-06-09 Thread Aaron Conole
"Ananyev, Konstantin"  writes:

>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Aaron Conole
>> Sent: Thursday, June 09, 2016 4:59 PM
>> To: Pattan, Reshma
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new
> library for packet capturing support
>> 
>> Reshma Pattan  writes:
>> 
>> > Added new library for packet capturing support.
>> >
>> > Added public api rte_pdump_init, applications should call
>> > this as part of their application setup to have packet
>> > capturing framework ready.
>> >
>> > Added public api rte_pdump_uninit to uninitialize the packet
>> > capturing framework.
>> >
>> > Added public apis rte_pdump_enable and rte_pdump_disable to
>> > enable and disable packet capturing on specific port and queue.
>> >
>> > Added public apis rte_pdump_enable_by_deviceid and
>> > rte_pdump_disable_by_deviceid to enable and disable packet
>> > capturing on a specific device (pci address or name) and queue.
>> >
>> > Signed-off-by: Reshma Pattan 
>> > ---
>> > +
>> > +int
>> > +rte_pdump_init(void)
>> 
>> Would you be opposed to having an argument here which takes a path to
>> the server socket?  That way the application can have some control over
>> the server socket location rather than using the guesses from
>> pdump_get_socket_path.
>
> I suppose it is better to keep IPC mechanism details internal for the
> pdump library.
> That way upper layer don't need to know what is that and write the
> code to open/maintain it.
> Again, that gives pdump library a freedom to change it (if needed) or
> possibly introduce some alternatives.
> Konstantin
>

How does the application change it?  The details do matter here, as some
applications (ex: openvswitch) have specific policies on which files
files get opened and where those files exist.  That has impact on things
like selinux and other access control technology.

If I missed the API that lets apps redirect the output, please correct me,
but so far I don't think I've missed it.  pdump still can change a
default, but it would be good to give a method for guiding the final
choice of file to open.

-Aaron


[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements

2016-06-09 Thread Ananyev, Konstantin
Hi Olivier,

> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Thursday, June 09, 2016 8:47 AM
> To: Ananyev, Konstantin; dev at dpdk.org; Adrien Mazarguil
> Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
> 
> Hi Konstantin,
> 
> >> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg().
> >> Wright now, it is a user responsibility to make sure refcnt==0 before 
> >> pushing
> >> mbuf back to the pool.
> >> Not sure why do you consider that wrong?
> >
> > I do not consider this wrong and I'm all for using assert() to catch
> > programming errors, however in this specific case, I think they are
> > inconsistent and misleading.
> 
>  Honestly, I don't understand why.
>  Right now the rule of thumb is - when mbuf is in the pool, it's refcnt 
>  should be equal zero.
> >>
> >> What is the purpose of this? Is there some code relying on this?
> >
> > The whole current implementation of mbuf_free code path relies on that.
> > Straight here:
> > if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m {
> > m->next = NULL;
> > __rte_mbuf_raw_free(m);
> > }
> >
> > If we'll exclude indirect mbuf logic, all it does:
> > if (rte_mbuf_refcnt_update(m, -1) == 0) {
> > m->next = NULL;
> > __rte_mbuf_raw_free(m);
> > }
> >
> > I.E.:
> > decrement mbuf->refcnt.
> > If new value of refcnt is zero, then put it back into the pool.
> >
> > So having ASERT(mbuf->refcnt==0) inside
> > __rte_mbuf_raw_free()/__rte_mbuf_raw_alloc()
> > looks absolutely valid to me.
> > I *has* to be zero at that point with current implementation,
> > And if it is not then we probably have (or will have) a silent memory 
> > corruption.
> 
> This explains how the refcount is used, and why it is set
> to zero before returning the mbuf to the pool with the mbuf
> free functions.

>From my point, that shows that rte_pktmbuf_free() relies on the value of the 
>refcnt
to make a decision is it ok to put mbuf back to the pool or not.
Right now it puts mbuf to the pool *only* if it's refcnt==0.
As discussed below, we probably can change it to be refcnt==1
(if there really would be noticeable performance gain).
But I think it still should be just one predefined value of refcnt (0 or 1).
In theory it is possible to allow both (0 and 1),
but that will make it hard to debug any alloc/free issues,
plus would neglect any possible performance gain -
as in that case raw_alloc (or it's analog) should still do
mbuf->refcnt=1; 

> 
> It does not explain which code relies on the refcnt beeing 0
> while the mbuf is in the pool.
> 
> 
> >> But since [1], it is allowed to call rte_mbuf_raw_alloc() in PMDs (all
> >> PMDs were calling an internal function before).
> >
> > Yes, raw_alloc is public, NP with that.
> >
> >> We could argue that
> >> rte_mbuf_raw_free() should also be made public for PMDs.
> >
> > We could, but right now it is not.
> > Again, as I said, user could use it on his own but he obviously has to
> > obey the rules and do manually what __rte_pktmbuf_prefree_seg() does.
> > At least:
> >
> > rte_mbuf_refcnt_set(m, 0);
> > __rte_mbuf_raw_free(m);
> >
> >>
> >> As you said below, no-one promised that the free() reverts the malloc(),
> >> but given the function names, one can legitimately expect that the
> >> following code is valid:
> >>
> >> m = __rte_mbuf_raw_alloc();
> >> /* do nothing here */
> >> __rte_mbuf_raw_free(m);
> >
> > Surely people could (and would) expect various things...
> > But the reality right now is: __rte_mbuf_raw_free() is an internal
> > function and not counterpart of __rte_mbuf_raw_alloc().
> > If the people don't bother to read API docs or/and the actual code,
> > I can't see how we can help them :)
> 
> Yes, of course, people should read the doc.
> This does not prevent to have a nice API that behaves in a
> natural way :)
> 
> By the way, the fact that today the mbuf refcnt should be 0 while
> in a pool is not in the api doc, but in the code.

Ok, I admit there is a bug in the docs, let's add this line to the PG and fix 
it :)

> 
> >> If no code relies on having the refcnt set to 0 when a mbuf is in
> >> the pool, I suggest to relax this constraint as Adrien proposed.
> >
> > Why not just rename it to __rte_mbuf_raw_free_dont_use_after_raw_alloc()?
> > To avoid any further confusion :)
> > Seriously speaking I would prefer to leave it as it is.
> > If you feel we have to introduce a counterpart of  rte_mbuf_raw_alloc(),
> > we can make a new public one:
> >
> > rte_mbuf_raw_free(stuct rte_mbuf *m)
> > {
> >   if (rte_mbuf_refcnt_update(m, -1) == 0)
> > __rte_mbuf_raw_free(m);
> > }
> 
> This is an option, but I think it's not efficient to touch
> the mbuf structure when allocating/freeing. See below.

I don't think it is totally avoidable for generic case anyway.

> 
> >> Then, my opinion is that the refcount should be set 

[dpdk-dev] [PATCH] app/test: fix array overflow warning with gcc 4.5

2016-06-09 Thread Tomasz Kulasek
DPDK/app/test/test_cryptodev.c: In function ?create_snow3g_cipher_operation
_oop.clone.15?: DPDK/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:796:14
error: array subscript is above array bounds.

In test_cryptodev.c:
2429rte_memcpy(sym_op->cipher.iv.data, iv, iv_len);

When iv_len is declared as 'unsigned int', rte_memcpy evaluates code for
buffer size bigger than 255, but while 'iv' array is 64 bytes long, it
causes 'above array bounds' warning in gcc 4.5 and breaks compilation.

Using uint8_t as a size of copied block prevents to evaluate in rte_memcpy
code for length bigger than 255, causing the problem.

The root of this issue and solution is the same as for commit 2c007ea10616
("app/test: fix array overflow warning with gcc 4.5")

Fixes: 9727af14b032 ("app/test: add out-of-place symmetric crypto
   operations")

Signed-off-by: Tomasz Kulasek 
---
 app/test/test_cryptodev.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 45e6daa..6621573 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -2392,7 +2392,7 @@ create_snow3g_cipher_operation(const uint8_t *iv, const 
unsigned iv_len,
 }

 static int
-create_snow3g_cipher_operation_oop(const uint8_t *iv, const unsigned iv_len,
+create_snow3g_cipher_operation_oop(const uint8_t *iv, const uint8_t iv_len,
const unsigned cipher_len,
const unsigned cipher_offset)
 {
-- 
1.7.9.5



[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Shreyansh Jain
Hi Jerin,

> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Thursday, June 09, 2016 6:01 PM
> To: Shreyansh Jain 
> Cc: Hunt, David ; dev at dpdk.org; olivier.matz at 
> 6wind.com;
> viktorin at rehivetech.com
> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
> operations
> 
> On Thu, Jun 09, 2016 at 11:49:44AM +, Shreyansh Jain wrote:
> > Hi Jerin,
> 
> Hi Shreyansh,
> 
> >
> > > > Yes, this would simplify somewhat the creation of a pktmbuf pool, in
> that
> > > it
> > > > replaces
> > > > the rte_mempool_set_ops_byname with a flag bit. However, I'm not sure
> we
> > > > want
> > > > to introduce a third method of creating a mempool to the developers. If
> we
> > > > introduced this, we would then have:
> > > > 1. rte_pktmbuf_pool_create()
> > > > 2. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC set (which would
> > > >use the configured custom handler)
> > > > 3. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC __not__ set
> followed
> > > >by a call to rte_mempool_set_ops_byname() (would allow several
> different
> > > > custom
> > > >handlers to be used in one application
> > > >
> > > > Does anyone else have an opinion on this? Oliver, Jerin, Jan?
> > >
> > > As I mentioned earlier, My take is not to create the separate API's for
> > > external mempool handlers.In my view, It's same,  just that sepreate
> > > mempool handler through function pointers.
> > >
> > > To keep the backward compatibility, I think we can extend the flags
> > > in rte_mempool_create and have a single API external/internal pool
> > > creation(makes easy for existing applications too, add a just mempool
> > > flag command line argument to existing applications to choose the
> > > mempool handler)
> >
> > May be I am interpreting it wrong, but, are you suggesting a single mempool
> handler for all buffer/packet needs of an application (passed as command line
> argument)?
> > That would be inefficient especially for cases where pool is backed by a
> hardware. The application wouldn't want its generic buffers to consume
> hardware resource which would be better used for packets.
> 
> It may vary from platform to platform or particular use case. For instance,
> the HW external pool manager for generic buffers may scale better than SW
> multi
> producers/multi-consumer implementation when the number of cores > N
> (as no locking involved in enqueue/dequeue(again it is depended on
> specific HW implementation))

I agree with you that above cases would exist.

But, even in these cases I think it would be application's prerogative to 
decide whether it would like its buffers to be managed by a hardware allocator 
or SW [SM]p/[SM]c implementations. Probably, in this case the application would 
call the rte_mempool_*(PKT_POOL) for generic buffers as well (or maybe a 
dedicated buffer pool flag) - just as an example.

> 
> I thought their no harm in selecting the external pool handlers
> in root level itself(rte_mempool_create) as by default it is
> SW MP/MC and it just an option to override if the application wants it.

It sounds fine if calls to rte_mempool_* can select an external handler 
*optionally* - but, if we pass it as command line, it would be binding (at 
least, semantically) for rte_pktmbuf_* calls as well. Isn't it?

[Probably, I am still unclear how it would remain 'optional' in command line 
case you suggested.]

> 
> Jerin
> 
> 
[...]

-
Shreyansh


[dpdk-dev] [PATCH] doc: virtio pmd versions

2016-06-09 Thread Mcnamara, John
> -Original Message-
> From: Richardson, Bruce
> Sent: Thursday, June 9, 2016 1:53 PM
> To: Mcnamara, John ; Wang, Zhihong
> ; dev at dpdk.org
> Cc: Wang, Zhihong 
> Subject: RE: [dpdk-dev] [PATCH] doc: virtio pmd versions
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mcnamara, John
> 
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhihong Wang
> 
> > > +
> > > +Virtio PMD Versions
> > > +---
> > > +
> > > +Virtio driver has 3 versions of rx functions and 2 versions of tx
> > > functions.
> >
> > In some places RX/TX is used and in some rx/tx. I would suggest the
> > uppercase versions throughout.
> >
> 
> In the commit logs, the only valid contractions allowed by the check-git-
> log.sh script are Rx and Tx
> 
> bad=$(echo "$headlines" | grep -E --color=always \
> -e '\<(rx|tx|RX|TX)\>' \
>
> 
> I would therefore suggest we follow the same rules for the docs for
> consistency.

Hi,

I don't mind what it is once we have consistency, so Rx/Tx is fine. Zhihong, 
please note.

John





[dpdk-dev] [PATCH] doc: virtio pmd versions

2016-06-09 Thread Richardson, Bruce
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mcnamara, John

> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhihong Wang
 
> > +
> > +Virtio PMD Versions
> > +---
> > +
> > +Virtio driver has 3 versions of rx functions and 2 versions of tx
> > functions.
> 
> In some places RX/TX is used and in some rx/tx. I would suggest the
> uppercase versions throughout.
> 

In the commit logs, the only valid contractions allowed by the check-git-log.sh 
script are Rx and Tx

bad=$(echo "$headlines" | grep -E --color=always \
-e '\<(rx|tx|RX|TX)\>' \
 

I would therefore suggest we follow the same rules for the docs for consistency.

/Bruce


[dpdk-dev] [PATCH v3 08/20] vhost: introduce new API to export numa node

2016-06-09 Thread Yuanhan Liu
On Wed, Jun 08, 2016 at 02:51:33PM -0700, Rich Lane wrote:
> On Mon, Jun 6, 2016 at 8:51 PM, Yuanhan Liu 
> wrote:
> 
> @@ -248,14 +248,9 @@ new_device(struct virtio_net *dev)
> ? ? ? ? internal = eth_dev->data->dev_private;
> 
> ?#ifdef RTE_LIBRTE_VHOST_NUMA
> -? ? ? ?ret? = get_mempolicy(, NULL, 0, dev,
> -? ? ? ? ? ? ? ? ? ? ? ?MPOL_F_NODE | MPOL_F_ADDR);
> -? ? ? ?if (ret < 0) {
> -? ? ? ? ? ? ? ?RTE_LOG(ERR, PMD, "Unknown numa node\n");
> -? ? ? ? ? ? ? ?return -1;
> -? ? ? ?}
> -
> -? ? ? ?eth_dev->data->numa_node = newnode;
> +? ? ? ?newnode = rte_vhost_get_numa_node(dev->vid);
> +? ? ? ?if (newnode > 0)
> 
> 
> Should be "newnode >= 0".?

Nice catch! Thanks.

--yliu


[dpdk-dev] supported packet types

2016-06-09 Thread Adrien Mazarguil
On Thu, Jun 09, 2016 at 09:57:28AM +0200, Olivier Matz wrote:
> Hi Konstantin,
> 
> On 04/29/2016 06:03 PM, Ananyev, Konstantin wrote:
> >> The following commit introduces a function to list the supported
> >> packet types of a device:
> >>
> >>   http://dpdk.org/browse/dpdk/commit/?id=78a38edf66
> >>
> >> I would like to know what does "supported" precisely mean.
> >> Is it:
> >>
> >> 1/ - if a ptype is marked as supported, the driver MUST set
> >>  this ptype if the packet matches the format described in rte_mbuf.h
> >>
> >>-> if the ptype is not recognized, the application is sure
> >>   that the packet is not one of the supported ptype
> >>-> but this is difficult to take advantage of this inside an
> >>   application that supports several different ports model
> >>   that do not support the same packet types
> >>
> >> 2/ - if a ptype is marked as supported, the driver CAN set
> >>  the ptype if the packet matches the format described in rte_mbuf.h
> >>
> >>-> if a ptype is not recognized, the application does a software
> >>   fallback
> >>-> in this case, it would useless to have the get_supported_ptype()
> >>
> >> Can you confirm if the PMDs and l3fwd (the only user) expect 1/
> >> or 2/ ?
> > 
> > 1) 
> > 
> >>
> >> Can you elaborate on the advantages of having this API?
> > 
> > Application can rely on information provided by PMD avoid parsing packet 
> > manually to recognise it's pytpe.
> > 
> >>
> >> And a supplementary question: if a ptype is not marked as supported,
> >> is it forbidden for a driver to set this ptype anyway?
> > 
> > I suppose it is not forbidden, but there is no guarantee from PMD that it
> > would be able to recognise that ptype.
> > 
> > Konstantin
> > 
> >> Because we can
> >> imagine a hardware that can only recognize packets in some conditions
> >> (ex: can recognize IPv4 if there is no vlan).
> >>
> >> I think properly defining the meaning of "supported" is mandatory
> >> to have an application beeing able to use this feature, and avoid
> >> PMDs to behave differently because the API is unclear (like we've
> >> already seen for other features).
> 
> Back on this. I've made some tests with ixgbe, and I'm afraid it
> will be difficult to ensure that when a ptype is advertised, it will
> always be set in the mbuf, whatever the layers below. Here are some
> examples:
> 
> - double vlans
> 
> Ether(type=0x88A8)/Dot1Q(vlan=0x666)/Dot1Q(vlan=0x666)/IP()/UDP()/Raw("x"*32)
>   ixgbe advertises RTE_PTYPE_ETHER in supported ptypes
>   returned ptype: RTE_PTYPE_UNKNOWN
>   should be: L2_ETHER
>   (this works with any unknown ethtype)
> 
> - ip6 in ip6 tunnel
>   ixgbe advertises RTE_PTYPE_TUNNEL_IP in supported ptypes
>   Ether()/IPv6()/IPv6()/UDP()/Raw("x"*32)
>   returned ptype: L2_ETHER L3_IPV6
>   should be: L2_ETHER L3_IPV6 TUNNEL_IP INNER_L3_IPV6 INNER_L4_UDP
> 
> - ip options
>   Ether()/IP(options=IPOption('\x83\x03\x10'))/UDP()/Raw("x"*32)
>   returned ptype: RTE_PTYPE_UNKNOWN
>   should be: L2_ETHER L3_IPV4_EXT L4_UDP
> 
> - ip inside ip with options
>   Ether()/IP(options=IPOption('\x83\x03\x10'))/IP()/UDP()/Raw("x"*32)
>   returned ptype: L2_ETHER L3_IPV4_EXT
>   should be: L2_ETHER L3_IPV4_EXT TUNNEL_IP INNER_L3_IPV4 INNER_L4_UDP
> 
> I'm sure we can find more examples that do not return the expected
> result, knowing that ixgbe is probably one of the most complete
> driver in dpdk. I'm afraid of the behavior for other PMDs :)

AFAIK, mlx4/mlx5 have similar issues, while they have no problem identifying
Dot1Q payloads, QinQ is another matter. For supported tunnel types such as
VXLAN, non-standard UDP ports prevent them from being identified as such
unless HW is configured properly (and then packets that use the standard UDP
port are not recognized anymore).

So yes, this API identifies what can be at best reported by PMDs in the
right context assuming all conditions are met, and PMDs do not implement
software fallbacks for obvious reasons.

> That's why I think the get_supported_ptypes() function, as of today,
> is useless for an application. I suggest instead to set the ptype
> in an opportunistic fashion instead:
> - if the driver/hw knows the ptype, set it
> - else, set it to unknown
> 
> What do you think?

Makes sense to me, supported encapsulations cannot be described easily with
the ptype API, applications cannot rely on it more than to get a rough idea
of what flags may potentially be set sometimes in received mbufs, which is
fine but not very useful.

That suggestion is also in line with the recent RX checksum discussion [1],
where basically HW not reporting something about a packet does not mean it
is the opposite.

[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html

> By the way, I'm working on a software implementation that return a
> packet_type from a mbuf. I may need it for virtio offload, but before
> that, I think it could be useful for debug purposes. I'll submit a
> patchset for this in the coming days.

[dpdk-dev] [PATCH] doc: virtio pmd versions

2016-06-09 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhihong Wang
> Sent: Thursday, April 21, 2016 4:55 AM
> To: dev at dpdk.org
> Cc: Wang, Zhihong 
> Subject: [dpdk-dev] [PATCH] doc: virtio pmd versions
> 
> This patch explains all the versions of current virtio pmd implementation,
> what's the difference, and how to choose the right version.

Hi,

Thanks for the clarification docs. That should hopefully answer a frequently
asked question.

Some minor comments below.


> +
> +Virtio PMD Versions
> +---
> +
> +Virtio driver has 3 versions of rx functions and 2 versions of tx
> functions.

In some places RX/TX is used and in some rx/tx. I would suggest the
uppercase versions throughout.



> +
> +RX functions:
> +
> +*   ``virtio_recv_pkts``:
> +
> +Regular version without mergeable rx buffers support


Since these could be a numbered list I would suggest using "#." (numbered
list) instead of "*" (bullet list). Also it would be better without a blank
line between the name and the definition so that they appear on the same
line in the generated docs.


> +
> +Simple version without mergeable rx buffers support, also fixes the
> + avail ring and uses vector instructions to optimize performance

s/avail/available in 2 places.

Also should that be "available ring size" or "number of available rings"
or something else?


> +Example to use the simple version of virtio poll mode driver in testpmd:
> +
> +.. code-block:: console
> +
> +./x86_64-native-linuxapp-gcc/app/testpmd -c 0x7 -n 4
> +--  -i --txqflags=0xf01 --rxq=1 --txq=1 --nb-cores=1


The code/commandline needs to be indented 3-4 spaces to be rendered
correctly. I'd also suggest omitting "./x86_64-native-linuxapp-gcc/app/" and
just using "testpmd" since that commandline is target dependent. The command
would also fit on one line like that.

Thanks,

John.
-- 



[dpdk-dev] [PATCH 1/2] mk: prevent overlinking in applications

2016-06-09 Thread Thomas Monjalon
Hi Ferruh,

2016-05-27 17:48, Ferruh Yigit:
> Replace --no-as-needed linker flag with --as-needed flag, which will
> only link libraries directly called by application. This requires inter
> library dependencies resolved correctly.
> 
> Not linking all libraries cause a compile error for lpcap and possible
> to have other similar compiler errors, so increasing the scope of
> --start-group argument.

What is the error?

> cmdline_test application causes compile error because of cyclic
> dependency between librte_eal <-> librte_mempool. A workaround added to
> cmdline_test for compile error.
> 
> Signed-off-by: Ferruh Yigit 

> --- a/app/cmdline_test/Makefile
> +++ b/app/cmdline_test/Makefile
> @@ -46,6 +46,7 @@ SRCS-y += commands.c
>  
>  CFLAGS += -O3
>  CFLAGS += $(WERROR_FLAGS)

A comment is required here to explain the workaround.

> +LDFLAGS += -no-as-needed

The option should be --no-as-needed.

> --- a/mk/exec-env/linuxapp/rte.vars.mk
> +++ b/mk/exec-env/linuxapp/rte.vars.mk
> @@ -46,7 +46,7 @@ EXECENV_CFLAGS  = -pthread
>  endif
>  
>  # Workaround lack of DT_NEEDED entry

This comment is obsolete now.

> -EXECENV_LDFLAGS = --no-as-needed
> +EXECENV_LDFLAGS = --as-needed

Why put this option for Linux only?
Shouldn't we restrict this option to app.mk only?

> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -58,6 +58,7 @@ _LDLIBS-y += -L$(RTE_SDK_BIN)/lib
>  #
>  
>  _LDLIBS-y += --whole-archive
> +_LDLIBS-y += --start-group

--start-group must be used only to solve circular dependencies.
Ideally we should not use it. I think it's needed only because
of EAL logs using mempool (must be removed).
Why extending it?
I'm afraid we are masking some issues.

>  _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)+= -lrte_distributor
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)+= -lrte_reorder
> @@ -111,8 +112,6 @@ _LDLIBS-y   += -lcrypto
>  endif
>  endif # !CONFIG_RTE_BUILD_SHARED_LIBS
>  
> -_LDLIBS-y += --start-group
> -
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_KVARGS) += -lrte_kvargs
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MBUF)   += -lrte_mbuf
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)+= -lrte_ip_frag



[dpdk-dev] [PATCH 2/8] doc: update build instructions for libsso_snow3g

2016-06-09 Thread Mcnamara, John

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pablo de Lara
> Sent: Friday, May 6, 2016 3:04 PM
> To: dev at dpdk.org
> Cc: Doherty, Declan ; De Lara Guarch, Pablo
> 
> Subject: [dpdk-dev] [PATCH 2/8] doc: update build instructions for
> libsso_snow3g
> 
> With the library update, the way to compile the library has changed, so
> documentation reflects this change.
> Also, the patch to fix the compilation issues present with gcc > 5.0 has
> been removed, as the issues have been fixed in the library.
> 
> Signed-off-by: Pablo de Lara 
> ...
> +   make snow3G

Nice change to make usage easier.

Acked-by: John McNamara 




[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Hunt, David


On 9/6/2016 11:31 AM, Jerin Jacob wrote:
> On Thu, Jun 09, 2016 at 10:39:46AM +0100, Hunt, David wrote:
>> Hi Shreyansh,
>>
>> On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
>>> Hi David,
>>>
>>> Thanks for explanation. I have some comments inline...
>>>
 -Original Message-
 From: Hunt, David [mailto:david.hunt at intel.com]
 Sent: Tuesday, June 07, 2016 2:56 PM
 To: Shreyansh Jain ; dev at dpdk.org
 Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
 jerin.jacob at caviumnetworks.com
 Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
 operations

 Hi Shreyansh,

 On 6/6/2016 3:38 PM, Shreyansh Jain wrote:
> Hi,
>
> (Apologies for overly-eager email sent on this thread earlier. Will be 
> more
 careful in future).
> This is more of a question/clarification than a comment. (And I have taken
 only some snippets from original mail to keep it cleaner)
> 
>> +MEMPOOL_REGISTER_OPS(ops_mp_mc);
>> +MEMPOOL_REGISTER_OPS(ops_sp_sc);
>> +MEMPOOL_REGISTER_OPS(ops_mp_sc);
>> +MEMPOOL_REGISTER_OPS(ops_sp_mc);
> 
>
>From the above what I understand is that multiple packet pool handlers 
> can
 be created.
> I have a use-case where application has multiple pools but only the packet
 pool is hardware backed. Using the hardware for general buffer requirements
 would prove costly.
>From what I understand from the patch, selection of the pool is based 
> on
 the flags below.

 The flags are only used to select one of the default handlers for
 backward compatibility through
 the rte_mempool_create call. If you wish to use a mempool handler that
 is not one of the
 defaults, (i.e. a new hardware handler), you would use the
 rte_create_mempool_empty
 followed by the rte_mempool_set_ops_byname call.
 So, for the external handlers, you create and empty mempool, then set
 the operations (ops)
 for that particular mempool.
>>> I am concerned about the existing applications (for example, l3fwd).
>>> Explicit calls to 'rte_create_mempool_empty->rte_mempool_set_ops_byname' 
>>> model would require modifications to these applications.
>>> Ideally, without any modifications, these applications should be able to 
>>> use packet pools (backed by hardware) and buffer pools (backed by 
>>> ring/others) - transparently.
>>>
>>> If I go by your suggestions, what I understand is, doing the above without 
>>> modification to applications would be equivalent to:
>>>
>>> struct rte_mempool_ops custom_hw_allocator = {...}
>>>
>>> thereafter, in config/common_base:
>>>
>>> CONFIG_RTE_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
>>>
>>> calls to rte_pktmbuf_pool_create would use the new allocator.
>> Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to
>> rte_mempool_create will continue to use the default handlers (ring based).
>>> But, another problem arises here.
>>>
>>> There are two distinct paths for allocations of a memory pool:
>>> 1. A 'pkt' pool:
>>>  rte_pktmbuf_pool_create
>>>\- rte_mempool_create_empty
>>>|   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
>>>|
>>>`- rte_mempool_set_ops_byname
>>>  (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
>>>  /* Override default 'ring_mp_mc' of
>>>   * rte_mempool_create */
>>>
>>> 2. Through generic mempool create API
>>>  rte_mempool_create
>>>\- rte_mempool_create_empty
>>>  (passing pktmbuf and pool constructors)
>>> I found various instances in example applications where 
>>> rte_mempool_create() is being called directly for packet pools - bypassing 
>>> the more semantically correct call to rte_pktmbuf_* for packet pools.
>>>
>>> In (2) control path, RTE_MBUF_DEFAULT_MEMPOOLS_OPS wouldn't be able to 
>>> replace custom handler operations for packet buffer allocations.
>>>
>>>   From a performance point-of-view, Applications should be able to select 
>>> between packet pools and non-packet pools.
>> This is intended for backward compatibility, and API consistency. Any
>> applications that use
>> rte_mempool_create directly will continue to use the default mempool
>> handlers. If the need
>> to use a custeom hander, they will need to be modified to call the newer
>> API,
>> rte_mempool_create_empty and rte_mempool_set_ops_byname.
>>
>>
> 
>> +/*
>> + * Since we have 4 combinations of the SP/SC/MP/MC examine the 
>> flags to
>> + * set the correct index into the table of ops structs.
>> + */
>> +if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
>> +rte_mempool_set_ops_byname(mp, "ring_sp_sc");
>> +else if (flags & MEMPOOL_F_SP_PUT)
>> +rte_mempool_set_ops_byname(mp, "ring_sp_mc");
>> +else if (flags & MEMPOOL_F_SC_GET)
>> +   

[dpdk-dev] [PATCH 2/2 v3] kni: add documentation for the mempool capacity

2016-06-09 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Wang
> Sent: Saturday, May 21, 2016 8:59 AM
> To: dev at dpdk.org
> Cc: Yigit, Ferruh ; Alex Wang
> 
> Subject: [dpdk-dev] [PATCH 2/2 v3] kni: add documentation for the mempool
> capacity
> 
> From: Alex Wang 
> 
> Function like 'rte_kni_rx_burst()' keeps allocating 'MAX_MBUF_BURST_NUM'
> mbufs to kni fifo queue unless the queue's capacity
> ('KNI_FIFO_COUNT_MAX') is reached.  So, if the mempool is under-
> provisioned, user may run into "Out of Memory" logs from KNI code.
> This commit documents the need to provision mempool capacity of more than
> "2 x KNI_FIFO_COUNT_MAX" for each KNI interface.
> 
> Signed-off-by: Alex Wang 
> Acked-by: Ferruh Yigit 

Acked-by: John McNamara 



[dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for packet capturing support

2016-06-09 Thread Aaron Conole
Reshma Pattan  writes:

> Added new library for packet capturing support.
>
> Added public api rte_pdump_init, applications should call
> this as part of their application setup to have packet
> capturing framework ready.
>
> Added public api rte_pdump_uninit to uninitialize the packet
> capturing framework.
>
> Added public apis rte_pdump_enable and rte_pdump_disable to
> enable and disable packet capturing on specific port and queue.
>
> Added public apis rte_pdump_enable_by_deviceid and
> rte_pdump_disable_by_deviceid to enable and disable packet
> capturing on a specific device (pci address or name) and queue.
>
> Signed-off-by: Reshma Pattan 
> ---
>  MAINTAINERS|   4 +
>  config/common_base |   5 +
>  lib/Makefile   |   1 +
>  lib/librte_pdump/Makefile  |  55 +++
>  lib/librte_pdump/rte_pdump.c   | 841 
> +
>  lib/librte_pdump/rte_pdump.h   | 186 
>  lib/librte_pdump/rte_pdump_version.map |  12 +
>  mk/rte.app.mk  |   1 +
>  8 files changed, 1105 insertions(+)
>  create mode 100644 lib/librte_pdump/Makefile
>  create mode 100644 lib/librte_pdump/rte_pdump.c
>  create mode 100644 lib/librte_pdump/rte_pdump.h
>  create mode 100644 lib/librte_pdump/rte_pdump_version.map
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3e8558f..cc3ffdb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -432,6 +432,10 @@ F: app/test/test_reorder*
>  F: examples/packet_ordering/
>  F: doc/guides/sample_app_ug/packet_ordering.rst
>  
> +Pdump
> +M: Reshma Pattan 
> +F: lib/librte_pdump/
> +
>  Hierarchical scheduler
>  M: Cristian Dumitrescu 
>  F: lib/librte_sched/
> diff --git a/config/common_base b/config/common_base
> index 47c26f6..a2d5d72 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
>  CONFIG_RTE_LIBRTE_REORDER=y
>  
>  #
> +# Compile the pdump library
> +#
> +CONFIG_RTE_LIBRTE_PDUMP=y
> +
> +#
>  # Compile librte_port
>  #
>  CONFIG_RTE_LIBRTE_PORT=y
> diff --git a/lib/Makefile b/lib/Makefile
> index f254dba..ca7c02f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
>  DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
>  DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
>  DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> +DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>  
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
> new file mode 100644
> index 000..af81a28
> --- /dev/null
> +++ b/lib/librte_pdump/Makefile
> @@ -0,0 +1,55 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_pdump.a
> +
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
> +CFLAGS += -D_GNU_SOURCE
> +
> +EXPORT_MAP := rte_pdump_version.map
> +
> +LIBABIVER := 1
> +
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_PDUMP) := rte_pdump.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_PDUMP)-include := rte_pdump.h
> +
> +# this lib depends upon:
> 

[dpdk-dev] [PATCH v3 2/2] enic: more specific out of resources error messages

2016-06-09 Thread John Daley
If configuration fails due to lack of resources, be more specific
about which resources are lacking - work queues, read queues or
completion queues.

Fixes: fefed3d1e62c ("enic: new driver")
Signed-off-by: John Daley 
---
v3: Log messages fix in separate patch. Log errors
for all lacking resources, not just 1st encountered.

 drivers/net/enic/enic_main.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 5939b9d..70776c2 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -819,22 +819,29 @@ static void enic_dev_deinit(struct enic *enic)
 int enic_set_vnic_res(struct enic *enic)
 {
struct rte_eth_dev *eth_dev = enic->rte_dev;
+   int rc = 0;

-   if ((enic->rq_count < eth_dev->data->nb_rx_queues) ||
-   (enic->wq_count < eth_dev->data->nb_tx_queues)) {
-   dev_err(dev, "Not enough resources configured, aborting\n");
-   return -1;
+   if (enic->rq_count < eth_dev->data->nb_rx_queues) {
+   dev_err(dev, "Not enough Receive queues. Requested:%u, 
Configured:%u\n",
+   eth_dev->data->nb_rx_queues, enic->rq_count);
+   rc = -1;
+   }
+   if (enic->wq_count < eth_dev->data->nb_tx_queues) {
+   dev_err(dev, "Not enough Transmit queues. Requested:%u, 
Configured:%u\n",
+   eth_dev->data->nb_tx_queues, enic->wq_count);
+   rc = -1;
}

enic->rq_count = eth_dev->data->nb_rx_queues;
enic->wq_count = eth_dev->data->nb_tx_queues;
if (enic->cq_count < (enic->rq_count + enic->wq_count)) {
-   dev_err(dev, "Not enough resources configured, aborting\n");
-   return -1;
+   dev_err(dev, "Not enough Completion queues. Required:%u, 
Configured:%u\n",
+   enic->rq_count + enic->wq_count, enic->cq_count);
+   rc = -1;
}

enic->cq_count = enic->rq_count + enic->wq_count;
-   return 0;
+   return rc;
 }

 static int enic_dev_init(struct enic *enic)
-- 
2.7.0



[dpdk-dev] [PATCH v3 1/2] enic: fix seg fault when releasing queues

2016-06-09 Thread John Daley
If device configuration failed due to a lack of resources, such as
if more queues are requested than are available, the queue release
function is called with NULL pointers which were being dereferenced.

Skip releasing queues if they are NULL pointers.

Fixes: fefed3d1e62c ("enic: new driver")
Signed-off-by: John Daley 
---

v3: bail out of free rq function if rq is null instead of if
around != NULL.

 drivers/net/enic/enic_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 996f999..5939b9d 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -429,6 +429,9 @@ void enic_free_rq(void *rxq)
struct vnic_rq *rq = (struct vnic_rq *)rxq;
struct enic *enic = vnic_dev_priv(rq->vdev);

+   if (rxq == NULL)
+   return;
+
enic_rxmbuf_queue_release(enic, rq);
rte_free(rq->mbuf_ring);
rq->mbuf_ring = NULL;
-- 
2.7.0



[dpdk-dev] [PATCH v3 0/2] enic: fix seg fault when releasing queues

2016-06-09 Thread John Daley
Patch broken into separate patches for fixing seg fault and improving
log messages.

John Daley (2):
  enic: fix seg fault when releasing queues
  enic: more specific out of resources error messages

 drivers/net/enic/enic_main.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

-- 
2.7.0



[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Shreyansh Jain
Hi David,

> -Original Message-
> From: Hunt, David [mailto:david.hunt at intel.com]
> Sent: Thursday, June 09, 2016 3:10 PM
> To: Shreyansh Jain ; dev at dpdk.org
> Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
> jerin.jacob at caviumnetworks.com
> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
> operations
> 
> Hi Shreyansh,
> 
> On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
> > Hi David,
> >
> > Thanks for explanation. I have some comments inline...
> >
> >> -Original Message-
> >> From: Hunt, David [mailto:david.hunt at intel.com]
> >> Sent: Tuesday, June 07, 2016 2:56 PM
> >> To: Shreyansh Jain ; dev at dpdk.org
> >> Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
> >> jerin.jacob at caviumnetworks.com
> >> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
> >> operations
> >>
> >> Hi Shreyansh,
> >>
> >> On 6/6/2016 3:38 PM, Shreyansh Jain wrote:
> >>> Hi,
> >>>
> >>> (Apologies for overly-eager email sent on this thread earlier. Will be
> more
> >> careful in future).
> >>> This is more of a question/clarification than a comment. (And I have
> taken
> >> only some snippets from original mail to keep it cleaner)
> >>> 
>  +MEMPOOL_REGISTER_OPS(ops_mp_mc);
>  +MEMPOOL_REGISTER_OPS(ops_sp_sc);
>  +MEMPOOL_REGISTER_OPS(ops_mp_sc);
>  +MEMPOOL_REGISTER_OPS(ops_sp_mc);
> >>> 
> >>>
> >>>   From the above what I understand is that multiple packet pool handlers
> can
> >> be created.
> >>> I have a use-case where application has multiple pools but only the
> packet
> >> pool is hardware backed. Using the hardware for general buffer
> requirements
> >> would prove costly.
> >>>   From what I understand from the patch, selection of the pool is based
> on
> >> the flags below.
> >>
> >> The flags are only used to select one of the default handlers for
> >> backward compatibility through
> >> the rte_mempool_create call. If you wish to use a mempool handler that
> >> is not one of the
> >> defaults, (i.e. a new hardware handler), you would use the
> >> rte_create_mempool_empty
> >> followed by the rte_mempool_set_ops_byname call.
> >> So, for the external handlers, you create and empty mempool, then set
> >> the operations (ops)
> >> for that particular mempool.
> > I am concerned about the existing applications (for example, l3fwd).
> > Explicit calls to 'rte_create_mempool_empty->rte_mempool_set_ops_byname'
> model would require modifications to these applications.
> > Ideally, without any modifications, these applications should be able to
> use packet pools (backed by hardware) and buffer pools (backed by
> ring/others) - transparently.
> >
> > If I go by your suggestions, what I understand is, doing the above without
> modification to applications would be equivalent to:
> >
> >struct rte_mempool_ops custom_hw_allocator = {...}
> >
> > thereafter, in config/common_base:
> >
> >CONFIG_RTE_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
> >
> > calls to rte_pktmbuf_pool_create would use the new allocator.
> 
> Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to
> rte_mempool_create will continue to use the default handlers (ring based).

Agree with you.
But, some applications continue to use rte_mempool_create for allocating packet 
pools. Thus, even with a custom handler available (which, most probably, would 
be a hardware packet buffer handler), application would unintentionally end up 
not using it.
Probably, such applications should be changed? (e.g. pipeline). 

> > But, another problem arises here.
> >
> > There are two distinct paths for allocations of a memory pool:
> > 1. A 'pkt' pool:
> > rte_pktmbuf_pool_create
> >   \- rte_mempool_create_empty
> >   |   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
> >   |
> >   `- rte_mempool_set_ops_byname
> > (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
> > /* Override default 'ring_mp_mc' of
> >  * rte_mempool_create */
> >
> > 2. Through generic mempool create API
> > rte_mempool_create
> >   \- rte_mempool_create_empty
> > (passing pktmbuf and pool constructors)
> >
> > I found various instances in example applications where
> rte_mempool_create() is being called directly for packet pools - bypassing
> the more semantically correct call to rte_pktmbuf_* for packet pools.
> >
> > In (2) control path, RTE_MBUF_DEFAULT_MEMPOOLS_OPS wouldn't be able to
> replace custom handler operations for packet buffer allocations.
> >
> >  From a performance point-of-view, Applications should be able to select
> between packet pools and non-packet pools.
> 
> This is intended for backward compatibility, and API consistency. Any
> applications that use
> rte_mempool_create directly will continue to use the default mempool
> handlers. If the need
> to use a custeom hander, they will need to be modified to call the newer
> API,
> rte_mempool_create_empty and rte_mempool_set_ops_byname.


[dpdk-dev] [PATCH] mk: generate internal library dependencies from DEPDIRS-y automatically

2016-06-09 Thread Thomas Monjalon
> > Up to now dependencies between DPDK internal libraries have been
> > untracked at shared library level, requiring applications to know
> > about library internal dependencies and often consequently overlinking.
> >
> > Since the dependencies are already recorded for build ordering in the
> > makefiles, we can use that information to generate LDLIBS entries for
> > internal libraries automatically.
> >
> > Also revert commit 8180554d82b3 ("vhost: fix linkage of driver with
> > library") which is made redundant by this change.
> >
> > Signed-off-by: Panu Matilainen 
> Acked-by: Christian Ehrhardt 

Applied, thanks


[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Hunt, David
Hi Olivier,

On 8/6/2016 1:13 PM, Olivier Matz wrote:
> Hi David,
>
> Please find some comments below.
>
> On 06/03/2016 04:58 PM, David Hunt wrote:
>
>> --- a/lib/librte_mempool/rte_mempool.h
>> +++ b/lib/librte_mempool/rte_mempool.h
>> +/**
>> + * Prototype for implementation specific data provisioning function.
>> + *
>> + * The function should provide the implementation specific memory for
>> + * for use by the other mempool ops functions in a given mempool ops struct.
>> + * E.g. the default ops provides an instance of the rte_ring for this 
>> purpose.
>> + * it will mostlikely point to a different type of data structure, and
>> + * will be transparent to the application programmer.
>> + */
>> +typedef void *(*rte_mempool_alloc_t)(struct rte_mempool *mp);
> In http://dpdk.org/ml/archives/dev/2016-June/040233.html, I suggested
> to change the prototype to return an int (-errno) and directly set
> the set mp->pool_data (void *) or mp->pool_if (uint64_t). No cast
> would be required in this latter case.

Done.

> By the way, there is a typo in the comment:
> "mostlikely" -> "most likely"

Fixed.

>> --- /dev/null
>> +++ b/lib/librte_mempool/rte_mempool_default.c
>> +static void
>> +common_ring_free(struct rte_mempool *mp)
>> +{
>> +rte_ring_free((struct rte_ring *)mp->pool_data);
>> +}
> I don't think the cast is needed here.
> (same in other functions)

Removed.

>
>> --- /dev/null
>> +++ b/lib/librte_mempool/rte_mempool_ops.c
>> +/* add a new ops struct in rte_mempool_ops_table, return its index */
>> +int
>> +rte_mempool_ops_register(const struct rte_mempool_ops *h)
>> +{
>> +struct rte_mempool_ops *ops;
>> +int16_t ops_index;
>> +
>> +rte_spinlock_lock(_mempool_ops_table.sl);
>> +
>> +if (rte_mempool_ops_table.num_ops >=
>> +RTE_MEMPOOL_MAX_OPS_IDX) {
>> +rte_spinlock_unlock(_mempool_ops_table.sl);
>> +RTE_LOG(ERR, MEMPOOL,
>> +"Maximum number of mempool ops structs exceeded\n");
>> +return -ENOSPC;
>> +}
>> +
>> +if (h->put == NULL || h->get == NULL || h->get_count == NULL) {
>> +rte_spinlock_unlock(_mempool_ops_table.sl);
>> +RTE_LOG(ERR, MEMPOOL,
>> +"Missing callback while registering mempool ops\n");
>> +return -EINVAL;
>> +}
>> +
>> +if (strlen(h->name) >= sizeof(ops->name) - 1) {
>> +RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
>> +__func__, h->name);
>> +rte_errno = EEXIST;
>> +return NULL;
>> +}
> This has already been noticed by Shreyansh, but in case of:
>
> rte_mempool_ops.c:75:10: error: return makes integer from pointer
> without a cast [-Werror=int-conversion]
> return NULL;
>^

Changed to return -EEXIST

>
>> +/* sets mempool ops previously registered by rte_mempool_ops_register */
>> +int
>> +rte_mempool_set_ops_byname(struct rte_mempool *mp, const char *name)
>
> When I compile with shared libraries enabled, I get the following error:
>
> librte_reorder.so: undefined reference to `rte_mempool_ops_table'
> librte_mbuf.so: undefined reference to `rte_mempool_set_ops_byname'
> ...
>
> The new functions and global variables must be in
> rte_mempool_version.map. This was in v5
> ( http://dpdk.org/ml/archives/dev/2016-May/039365.html ) but
> was dropped in v6.

OK, Added.

>
> Regards,
> Olivier

Thanks,
David.



[dpdk-dev] [PATCH v2] app/test: reduced duration of red_autotest

2016-06-09 Thread Tomasz Kantecki
These changes don't break the tests on my systems and reduce
execution time to ~2[s]. I tried "faking" CPU clock frequency but
in some cases it leads to intermittent test fails.
Tomasz

'red_autotest' changed to run only functional tests without test #4 which was
taking ~53 seconds. 'red_autotest' takes ~2[s] now.
'red_perf' has been added to run performance tests only).
'red_all' has been added to run all functional tests (including #4) and
perfromance tests. This reflects current 'red_autotest' behavior.

Other changes:
- machine TSC clock frequency detection takes place only once now.
- timeouts and number of iterations in functional tests have been reduced
  in order to shorten test duration.
---
 app/test/test_red.c | 89 -
 1 file changed, 74 insertions(+), 15 deletions(-)

diff --git a/app/test/test_red.c b/app/test/test_red.c
index 81c9d67..2384c55 100644
--- a/app/test/test_red.c
+++ b/app/test/test_red.c
@@ -57,6 +57,7 @@
 #define MSEC_PER_SEC   1000  /**< Milli-seconds per second */
 #define USEC_PER_MSEC  1000  /**< Micro-seconds per milli-second */
 #define USEC_PER_SEC   100   /**< Micro-seconds per second */
+#define NSEC_PER_SEC   (USEC_PER_SEC * 1000) /**< Nano-seconds per 
second */

 /**< structures for testing rte_red performance and function */
 struct test_rte_red_config {/**< Test structure for RTE_RED config */
@@ -280,12 +281,15 @@ static uint64_t get_machclk_freq(void)
uint64_t start = 0;
uint64_t end = 0;
uint64_t diff = 0;
-   uint64_t clk_freq_hz = 0;
+   static uint64_t clk_freq_hz;
struct timespec tv_start = {0, 0}, tv_end = {0, 0};
struct timespec req = {0, 0};

-   req.tv_sec = 1;
-   req.tv_nsec = 0;
+   if (clk_freq_hz != 0)
+   return clk_freq_hz;
+
+   req.tv_sec = 0;
+   req.tv_nsec = NSEC_PER_SEC / 4;

clock_gettime(CLOCK_REALTIME, _start);
start = rte_rdtsc();
@@ -435,8 +439,8 @@ static struct test_queue ft_tqueue = {
 };

 static struct test_var ft_tvar = {
-   .wait_usec = 25,
-   .num_iterations = 20,
+   .wait_usec = 1,
+   .num_iterations = 5,
.num_ops = 1,
.clk_freq = 0,
.dropped = ft_dropped,
@@ -1747,6 +1751,16 @@ struct tests func_tests[] = {
{ _test1_config, ovfl_test1 },
 };

+struct tests func_tests_quick[] = {
+   { _test1_config, func_test1 },
+   { _test2_config, func_test2 },
+   { _test3_config, func_test3 },
+   /* no test 4 as it takes a lot of time */
+   { _test5_config, func_test5 },
+   { _test6_config, func_test6 },
+   { _test1_config, ovfl_test1 },
+};
+
 struct tests perf_tests[] = {
{ _test1_config, perf1_test },
{ _test2_config, perf1_test },
@@ -1850,27 +1864,60 @@ test_invalid_parameters(void)
return 0;
 }

+static void
+show_stats(const uint32_t num_tests, const uint32_t num_pass)
+{
+   if (num_pass == num_tests)
+   printf("[total: %u, pass: %u]\n", num_tests, num_pass);
+   else
+   printf("[total: %u, pass: %u, fail: %u]\n", num_tests, num_pass,
+  num_tests - num_pass);
+}
+
+static int
+tell_the_result(const uint32_t num_tests, const uint32_t num_pass)
+{
+   return (num_pass == num_tests) ? 0 : 1;
+}
+
 static int
 test_red(void)
 {
uint32_t num_tests = 0;
uint32_t num_pass = 0;
-   int ret = 0;

if (test_invalid_parameters() < 0)
return -1;
+   run_tests(func_tests_quick, RTE_DIM(func_tests_quick),
+ _tests, _pass);
+   show_stats(num_tests, num_pass);
+   return tell_the_result(num_tests, num_pass);
+}
+
+static int
+test_red_perf(void)
+{
+   uint32_t num_tests = 0;
+   uint32_t num_pass = 0;

-   run_tests(func_tests, RTE_DIM(func_tests), _tests, _pass);
run_tests(perf_tests, RTE_DIM(perf_tests), _tests, _pass);
+   show_stats(num_tests, num_pass);
+   return tell_the_result(num_tests, num_pass);
+}

-   if (num_pass == num_tests) {
-   printf("[total: %u, pass: %u]\n", num_tests, num_pass);
-   ret = 0;
-   } else {
-   printf("[total: %u, pass: %u, fail: %u]\n", num_tests, 
num_pass, num_tests - num_pass);
-   ret = -1;
-   }
-   return ret;
+static int
+test_red_all(void)
+{
+   uint32_t num_tests = 0;
+   uint32_t num_pass = 0;
+
+   if (test_invalid_parameters() < 0)
+   return -1;
+
+   run_tests(func_tests, RTE_DIM(func_tests), _tests, _pass);
+   run_tests(perf_tests, RTE_DIM(perf_tests), _tests, _pass);
+   show_stats(num_tests, num_pass);
+   return tell_the_result(num_tests, num_pass);
 }

 static struct test_command red_cmd = {
@@ -1878,3 +1925,15 @@ static struct test_command red_cmd = {
.callback = test_red,
 };
 

[dpdk-dev] [PATCH v3 9/9] doc: update ipsec sample guide

2016-06-09 Thread Mcnamara, John
> -Original Message-
> From: Gonzalez Monroy, Sergio
> Sent: Thursday, June 9, 2016 9:43 AM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo ; Mcnamara, John
> 
> Subject: [PATCH v3 9/9] doc: update ipsec sample guide
> 
> Signed-off-by: Sergio Gonzalez Monroy 
>
> ...
>
>  Configurations
>  --
> 
>  The following sections provide some details on the default values used to
>  initialize the SP, SA and Routing tables.
> -Currently all the configuration is hard coded into the application.
> +Currently all configuration information is hard coded into the
> application.
> +
> +The following image illustrate a few of the concepts regarding IPSec,
> such
> +as protected/unprotected and inbound/outbound traffic, from the point of
> +view of two back-to-back endpoints:
> +
> +.. _figure_ipsec_endpoints:
> +
> +.. figure:: img/ipsec_endpoints.svg
> +
> +   IPSec Inbound/Outbound traffic

Hi,

This file throws an error with make doc-guides-pdf.

The image needs to be specified as "img/ipsec_endpoints.*" (now .svg) to 
allow it to be converted from svg to pdf.

With this fix the build works.

John



[dpdk-dev] [PATCH] examples/ip_pipeline: fix build error for gcc 4.8

2016-06-09 Thread Dumitrescu, Cristian


> -Original Message-
> From: Mrzyglod, DanielX T
> Sent: Thursday, June 9, 2016 12:39 PM
> To: Singh, Jasvinder ; Dumitrescu, Cristian
> 
> Cc: dev at dpdk.org; Mrzyglod, DanielX T 
> Subject: [PATCH] examples/ip_pipeline: fix build error for gcc 4.8
> 
> This patch fixes a maybe-uninitialized warning when compiling DPDK with
> GCC 4.8
> 

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH] i40e: fix flexible payload selection

2016-06-09 Thread Bruce Richardson
On Thu, Jun 02, 2016 at 03:30:57AM +0800, Zhe Tao wrote:
> On Thu, May 12, 2016 at 04:11:40PM +0800, Jingjing Wu wrote:
> > When setting up flexible payload selection rules, it is allowed
> > that setting value to 63 to disable the rule (NONUSE_FLX_PIT_DEST_OFF).
> > However, MK_FLX_PIT macro is always adding an offset value 50
> > (I40E_FLX_OFFSET_IN_FIELD_VECTOR), it will be set to "63 + 50" and
> > when setting NONUSE_FLX_PIT_DEST_OFF to disable it. It breaks
> > the functionality.
> > This patch fixes this issue.
> > 
> > Fixes: d8b90c4eabe9 ("i40e: take flow director flexible payload
> >   configuration")
> > 
> > Reported-by: Michael Habibi 
> > Signed-off-by: Jingjing Wu 
> > ---
> >  drivers/net/i40e/i40e_fdir.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
> > index 8aa41e5..efbcd18 100644
> > --- a/drivers/net/i40e/i40e_fdir.c
> > +++ b/drivers/net/i40e/i40e_fdir.c
> > @@ -94,7 +94,9 @@
> > I40E_PRTQF_FLX_PIT_SOURCE_OFF_MASK) | \
> > (((fsize) << I40E_PRTQF_FLX_PIT_FSIZE_SHIFT) & \
> > I40E_PRTQF_FLX_PIT_FSIZE_MASK) | \
> > -   dst_offset) + I40E_FLX_OFFSET_IN_FIELD_VECTOR) << \
> > +   dst_offset) == NONUSE_FLX_PIT_DEST_OFF ? \
> > +   NONUSE_FLX_PIT_DEST_OFF : \
> > +   ((dst_offset) + I40E_FLX_OFFSET_IN_FIELD_VECTOR)) << \
> > I40E_PRTQF_FLX_PIT_DEST_OFF_SHIFT) & \
> > I40E_PRTQF_FLX_PIT_DEST_OFF_MASK))
> >  
> > -- 
> > 2.4.0
> Acked-by: Zhe Tao 

Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-09 Thread Hunt, David
Hi Shreyansh,

On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
> Hi David,
>
> Thanks for explanation. I have some comments inline...
>
>> -Original Message-
>> From: Hunt, David [mailto:david.hunt at intel.com]
>> Sent: Tuesday, June 07, 2016 2:56 PM
>> To: Shreyansh Jain ; dev at dpdk.org
>> Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
>> jerin.jacob at caviumnetworks.com
>> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
>> operations
>>
>> Hi Shreyansh,
>>
>> On 6/6/2016 3:38 PM, Shreyansh Jain wrote:
>>> Hi,
>>>
>>> (Apologies for overly-eager email sent on this thread earlier. Will be more
>> careful in future).
>>> This is more of a question/clarification than a comment. (And I have taken
>> only some snippets from original mail to keep it cleaner)
>>> 
 +MEMPOOL_REGISTER_OPS(ops_mp_mc);
 +MEMPOOL_REGISTER_OPS(ops_sp_sc);
 +MEMPOOL_REGISTER_OPS(ops_mp_sc);
 +MEMPOOL_REGISTER_OPS(ops_sp_mc);
>>> 
>>>
>>>   From the above what I understand is that multiple packet pool handlers can
>> be created.
>>> I have a use-case where application has multiple pools but only the packet
>> pool is hardware backed. Using the hardware for general buffer requirements
>> would prove costly.
>>>   From what I understand from the patch, selection of the pool is based on
>> the flags below.
>>
>> The flags are only used to select one of the default handlers for
>> backward compatibility through
>> the rte_mempool_create call. If you wish to use a mempool handler that
>> is not one of the
>> defaults, (i.e. a new hardware handler), you would use the
>> rte_create_mempool_empty
>> followed by the rte_mempool_set_ops_byname call.
>> So, for the external handlers, you create and empty mempool, then set
>> the operations (ops)
>> for that particular mempool.
> I am concerned about the existing applications (for example, l3fwd).
> Explicit calls to 'rte_create_mempool_empty->rte_mempool_set_ops_byname' 
> model would require modifications to these applications.
> Ideally, without any modifications, these applications should be able to use 
> packet pools (backed by hardware) and buffer pools (backed by ring/others) - 
> transparently.
>
> If I go by your suggestions, what I understand is, doing the above without 
> modification to applications would be equivalent to:
>
>struct rte_mempool_ops custom_hw_allocator = {...}
>
> thereafter, in config/common_base:
>
>CONFIG_RTE_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
>
> calls to rte_pktmbuf_pool_create would use the new allocator.

Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to 
rte_mempool_create will continue to use the default handlers (ring based).
> But, another problem arises here.
>
> There are two distinct paths for allocations of a memory pool:
> 1. A 'pkt' pool:
> rte_pktmbuf_pool_create
>   \- rte_mempool_create_empty
>   |   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
>   |
>   `- rte_mempool_set_ops_byname
> (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
> /* Override default 'ring_mp_mc' of
>  * rte_mempool_create */
>
> 2. Through generic mempool create API
> rte_mempool_create
>   \- rte_mempool_create_empty
> (passing pktmbuf and pool constructors)
>
> I found various instances in example applications where rte_mempool_create() 
> is being called directly for packet pools - bypassing the more semantically 
> correct call to rte_pktmbuf_* for packet pools.
>
> In (2) control path, RTE_MBUF_DEFAULT_MEMPOOLS_OPS wouldn't be able to 
> replace custom handler operations for packet buffer allocations.
>
>  From a performance point-of-view, Applications should be able to select 
> between packet pools and non-packet pools.

This is intended for backward compatibility, and API consistency. Any 
applications that use
rte_mempool_create directly will continue to use the default mempool 
handlers. If the need
to use a custeom hander, they will need to be modified to call the newer 
API,
rte_mempool_create_empty and rte_mempool_set_ops_byname.


>>> 
 +  /*
 +   * Since we have 4 combinations of the SP/SC/MP/MC examine the flags to
 +   * set the correct index into the table of ops structs.
 +   */
 +  if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
 +  rte_mempool_set_ops_byname(mp, "ring_sp_sc");
 +  else if (flags & MEMPOOL_F_SP_PUT)
 +  rte_mempool_set_ops_byname(mp, "ring_sp_mc");
 +  else if (flags & MEMPOOL_F_SC_GET)
 +  rte_mempool_set_ops_byname(mp, "ring_mp_sc");
 +  else
 +  rte_mempool_set_ops_byname(mp, "ring_mp_mc");
 +
> My suggestion is to have an additional flag, 'MEMPOOL_F_PKT_ALLOC', which, if 
> specified, would:
>
> ...
> #define MEMPOOL_F_SC_GET0x0008
> #define MEMPOOL_F_PKT_ALLOC 0x0010
> ...
>
> in rte_mempool_create_empty:
> ... after checking the other MEMPOOL_F_* flags...
>
>  if (flags & 

[dpdk-dev] [PATCH v2] mempool: fix local cache initialization

2016-06-09 Thread Olivier Matz


On 06/09/2016 10:19 AM, Sergio Gonzalez Monroy wrote:
> The mempool local cache was not initialized properly leading to
> undefined behavior in cases where the allocated memory was used
> previously and left with data.
> 
> Fixes: 213af31e0960 ("mempool: reduce structure size if no cache needed")
> 
> Signed-off-by: Sergio Gonzalez Monroy 

Acked-by: Olivier Matz 

Thanks


[dpdk-dev] [PATCH 2/2 v3] kni: add documentation for the mempool capacity

2016-06-09 Thread Alex Wang
Just to confirm, should I do anything before it gets merged?

On Thu, Jun 9, 2016 at 5:03 AM, Mcnamara, John 
wrote:

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Wang
> > Sent: Saturday, May 21, 2016 8:59 AM
> > To: dev at dpdk.org
> > Cc: Yigit, Ferruh ; Alex Wang
> > 
> > Subject: [dpdk-dev] [PATCH 2/2 v3] kni: add documentation for the mempool
> > capacity
> >
> > From: Alex Wang 
> >
> > Function like 'rte_kni_rx_burst()' keeps allocating 'MAX_MBUF_BURST_NUM'
> > mbufs to kni fifo queue unless the queue's capacity
> > ('KNI_FIFO_COUNT_MAX') is reached.  So, if the mempool is under-
> > provisioned, user may run into "Out of Memory" logs from KNI code.
> > This commit documents the need to provision mempool capacity of more than
> > "2 x KNI_FIFO_COUNT_MAX" for each KNI interface.
> >
> > Signed-off-by: Alex Wang 
> > Acked-by: Ferruh Yigit 
>
> Acked-by: John McNamara 
>
>


[dpdk-dev] [PATCH] mempool: fix local cache initialization

2016-06-09 Thread Olivier Matz
Hi Sergio,

On 06/09/2016 09:57 AM, Sergio Gonzalez Monroy wrote:
> Hi Olivier,
> 
> On 08/06/2016 20:14, Olivier Matz wrote:
>> Hi Sergio,
>>
>> Good catch, thanks. The patch looks ok, just few comments
>> on the commit log:
>>
>> On 06/08/2016 05:10 PM, Sergio Gonzalez Monroy wrote:
>>> The mempool local cache is not being initialize properly leading to
>> 'initialize' -> 'initialized' ?
>> and maybe 'is not being' -> 'was not' ?
>>
>>> undefined behavior in cases where the allocated memory was used and left
>>> with data.
>>>
>>> Fixes: af75078fece3 ("first public release")
>> I think it fixes this one instead:
>>
>> 213af31e0960 ("mempool: reduce structure size if no cache needed")
> 
> Fair enough, I thought the issue was there as we never
> initialized/zeroed the local cache
> on mempool creation. Usually we would have allocated all mempools on
> init (or close)
> and that would be it (initially all memory would be zeroed), but I think
> you could still
> manage to reproduce the problem if somehow you where to do something like:
> rte_malloc(), rte_free(), rte_mempool_create() and the memory was the
> one we got
> with malloc and never gets zeroed again.

Before Keith's commit (213af31e0960), the local cache was initialized
when doing the memset() because it was included in the mempool
structure. So I think the problem did not exist before this patch.
Or did I miss something in your explanation?

Regards,
Olivier


[dpdk-dev] supported packet types

2016-06-09 Thread Olivier Matz
Hi Konstantin,

On 04/29/2016 06:03 PM, Ananyev, Konstantin wrote:
>> The following commit introduces a function to list the supported
>> packet types of a device:
>>
>>   http://dpdk.org/browse/dpdk/commit/?id=78a38edf66
>>
>> I would like to know what does "supported" precisely mean.
>> Is it:
>>
>> 1/ - if a ptype is marked as supported, the driver MUST set
>>  this ptype if the packet matches the format described in rte_mbuf.h
>>
>>-> if the ptype is not recognized, the application is sure
>>   that the packet is not one of the supported ptype
>>-> but this is difficult to take advantage of this inside an
>>   application that supports several different ports model
>>   that do not support the same packet types
>>
>> 2/ - if a ptype is marked as supported, the driver CAN set
>>  the ptype if the packet matches the format described in rte_mbuf.h
>>
>>-> if a ptype is not recognized, the application does a software
>>   fallback
>>-> in this case, it would useless to have the get_supported_ptype()
>>
>> Can you confirm if the PMDs and l3fwd (the only user) expect 1/
>> or 2/ ?
> 
> 1) 
> 
>>
>> Can you elaborate on the advantages of having this API?
> 
> Application can rely on information provided by PMD avoid parsing packet 
> manually to recognise it's pytpe.
> 
>>
>> And a supplementary question: if a ptype is not marked as supported,
>> is it forbidden for a driver to set this ptype anyway?
> 
> I suppose it is not forbidden, but there is no guarantee from PMD that it
> would be able to recognise that ptype.
> 
> Konstantin
> 
>> Because we can
>> imagine a hardware that can only recognize packets in some conditions
>> (ex: can recognize IPv4 if there is no vlan).
>>
>> I think properly defining the meaning of "supported" is mandatory
>> to have an application beeing able to use this feature, and avoid
>> PMDs to behave differently because the API is unclear (like we've
>> already seen for other features).

Back on this. I've made some tests with ixgbe, and I'm afraid it
will be difficult to ensure that when a ptype is advertised, it will
always be set in the mbuf, whatever the layers below. Here are some
examples:

- double vlans

Ether(type=0x88A8)/Dot1Q(vlan=0x666)/Dot1Q(vlan=0x666)/IP()/UDP()/Raw("x"*32)
  ixgbe advertises RTE_PTYPE_ETHER in supported ptypes
  returned ptype: RTE_PTYPE_UNKNOWN
  should be: L2_ETHER
  (this works with any unknown ethtype)

- ip6 in ip6 tunnel
  ixgbe advertises RTE_PTYPE_TUNNEL_IP in supported ptypes
  Ether()/IPv6()/IPv6()/UDP()/Raw("x"*32)
  returned ptype: L2_ETHER L3_IPV6
  should be: L2_ETHER L3_IPV6 TUNNEL_IP INNER_L3_IPV6 INNER_L4_UDP

- ip options
  Ether()/IP(options=IPOption('\x83\x03\x10'))/UDP()/Raw("x"*32)
  returned ptype: RTE_PTYPE_UNKNOWN
  should be: L2_ETHER L3_IPV4_EXT L4_UDP

- ip inside ip with options
  Ether()/IP(options=IPOption('\x83\x03\x10'))/IP()/UDP()/Raw("x"*32)
  returned ptype: L2_ETHER L3_IPV4_EXT
  should be: L2_ETHER L3_IPV4_EXT TUNNEL_IP INNER_L3_IPV4 INNER_L4_UDP

I'm sure we can find more examples that do not return the expected
result, knowing that ixgbe is probably one of the most complete
driver in dpdk. I'm afraid of the behavior for other PMDs :)

That's why I think the get_supported_ptypes() function, as of today,
is useless for an application. I suggest instead to set the ptype
in an opportunistic fashion instead:
- if the driver/hw knows the ptype, set it
- else, set it to unknown

What do you think?

By the way, I'm working on a software implementation that return a
packet_type from a mbuf. I may need it for virtio offload, but before
that, I think it could be useful for debug purposes. I'll submit a
patchset for this in the coming days.

Regards,
Olivier

PS: sorry, many questions to you these days... answer when you have
 the time ;)


[dpdk-dev] [PATCH 2/8] lib/librte_ether: defind RX/TX lock mode

2016-06-09 Thread Olivier Matz
Hi,

On 06/08/2016 09:34 AM, Lu, Wenzhuo wrote:
> Hi Stephen,
> 
> 
>> -Original Message-
>> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
>> Sent: Wednesday, June 8, 2016 10:16 AM
>> To: Lu, Wenzhuo
>> Cc: dev at dpdk.org; Tao, Zhe
>> Subject: Re: [dpdk-dev] [PATCH 2/8] lib/librte_ether: defind RX/TX lock mode
>>
>> On Mon,  6 Jun 2016 13:40:47 +0800
>> Wenzhuo Lu  wrote:
>>
>>> Define lock mode for RX/TX queue. Because when resetting the device we
>>> want the resetting thread to get the lock of the RX/TX queue to make
>>> sure the RX/TX is stopped.
>>>
>>> Using next ABI macro for this ABI change as it has too much impact. 7
>>> APIs and 1 global variable are impacted.
>>>
>>> Signed-off-by: Wenzhuo Lu 
>>> Signed-off-by: Zhe Tao 
>>
>> Why does this patch set make a different assumption the rest of the DPDK?
>>
>> The rest of the DPDK operates on the principle that the application is smart
>> enough to stop the device before making changes. There is no equivalent to 
>> the
>> Linux kernel RTNL mutex. The API assumes application threads are well behaved
>> and will not try and sabotage each other.
>>
>> If you restrict the reset operation to only being available when RX/TX is 
>> stopped,
>> then no lock is needed.
>>
>> The fact that it requires lots more locking inside each device driver 
>> implies to me
>> this is not correct way to architect this.

+1

I'm not sure adding locks is the proper way to do.
This is the application responsibility to ensure that:
- control functions are not called concurrently on the same port
- rx/tx functions are not called when the device is stopped/reset/...

However, I do think the usage paradigms of the ethdev api should be
better documented in rte_ethdev.h (ex: which functions can be called
concurrently). This would be a first step.

If we really want a helper API to do that in DPDK, the _next_ step
could be to add them in the ethdev api to achieve this. Maybe
something like (the function names could be better):

- to be called on one control thread:

  rte_eth_stop_rxtx(port)
  rte_eth_start_rxtx(port)

  rte_eth_get_rxtx_state(port)
 -> return "running" if at least one core is inside the rx/tx code
 -> return "stopped" if all cores are outside the rx/tx code

- to be called on dataplane cores:

  /* same than rte_eth_rx_burst(), but checks if rx/tx is allowed
   * first, else do nothing */
  rte_eth_rx_burst_interruptible()
  rte_eth_tx_burst_interruptible()


The code of control thread could be:

  rte_eth_stop_rxtx(port);
  /* wait that all dataplane cores finished their processing */
  while (rte_eth_get_rxtx_state(port) != stopped)
  ;
  rte_eth_some_control_operation(port);
  rte_eth_start_rxtx(port);


I think this could be done without any lock, just with the proper
memory barriers and a per-core status.

But this API may impose a paradigm to the application, and I'm not
sure the DPDK should do that.

Regards,
Olivier


[dpdk-dev] [PATCH v6 8/8] doc: update doc for packet capture framework

2016-06-09 Thread Reshma Pattan
Added programmers guide for librte_pdump.
Added sample application guide for app/pdump application.
Updated release note for packet capture framework changes.

Signed-off-by: Reshma Pattan 
Acked-by: John McNamara 
---
 MAINTAINERS |   3 +
 doc/guides/prog_guide/index.rst |   1 +
 doc/guides/prog_guide/pdump_library.rst | 107 
 doc/guides/rel_notes/release_16_07.rst  |  13 
 doc/guides/sample_app_ug/index.rst  |   1 +
 doc/guides/sample_app_ug/pdump.rst  | 122 
 6 files changed, 247 insertions(+)
 create mode 100644 doc/guides/prog_guide/pdump_library.rst
 create mode 100644 doc/guides/sample_app_ug/pdump.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index a48c8de..ce7c941 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -436,6 +436,9 @@ Pdump
 M: Reshma Pattan 
 F: lib/librte_pdump/
 F: app/pdump/
+F: doc/guides/prog_guide/pdump_library.rst
+F: doc/guides/sample_app_ug/pdump.rst
+

 Hierarchical scheduler
 M: Cristian Dumitrescu 
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index b862d0c..4caf969 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
 writing_efficient_code
 profile_app
 glossary
+pdump_library


 **Figures**
diff --git a/doc/guides/prog_guide/pdump_library.rst 
b/doc/guides/prog_guide/pdump_library.rst
new file mode 100644
index 000..1809234
--- /dev/null
+++ b/doc/guides/prog_guide/pdump_library.rst
@@ -0,0 +1,107 @@
+..  BSD LICENSE
+Copyright(c) 2016 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _pdump_library:
+
+The librte_pdump Library
+
+
+The ``librte_pdump`` library provides a framework for packet capturing in DPDK.
+The library provides the following APIs to initialize the packet capture 
framework, to enable
+or disable the packet capture, and to uninitialize it:
+
+* ``rte_pdump_init()``:
+  This API initializes the packet capture framework.
+
+* ``rte_pdump_enable()``:
+  This API enables the packet capture on a given port and queue.
+  Note: The filter option in the API is a place holder for future enhancements.
+
+* ``rte_pdump_enable_by_deviceid()``:
+  This API enables the packet capture on a given device id (``vdev name or pci 
address``) and queue.
+  Note: The filter option in the API is a place holder for future enhancements.
+
+* ``rte_pdump_disable()``:
+  This API disables the packet capture on a given port and queue.
+
+* ``rte_pdump_disable_by_deviceid()``:
+  This API disables the packet capture on a given device id (``vdev name or 
pci address``) and queue.
+
+* ``rte_pdump_uninit()``:
+  This API uninitializes the packet capture framework.
+
+
+Operation
+-
+
+The ``librte_pdump`` library works on a client/server model. The server is 
responsible for enabling or
+disabling the packet capture and the clients are responsible for requesting 
the enabling or disabling of
+the packet capture.
+
+The packet capture framework, as part of its initialization, creates the 
pthread and the server socket in
+the pthread. The application that calls the framework initialization first 
will have the server socket created.
+Further calls to the framework initialization by the same application or other 
applications is not allowed i.e., only
+one server socket is allowed on the system. So 

[dpdk-dev] [PATCH v6 7/8] app/test-pmd: add pdump initialization uninitialization

2016-06-09 Thread Reshma Pattan
Call rte_pdump_init and rte_pdump_uninit for packet
capturing initialization and uninitialization.

Signed-off-by: Reshma Pattan 
---
 app/test-pmd/testpmd.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index dd6b046..f6089fa 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_XENVIRT
 #include 
 #endif
+#include 

 #include "testpmd.h"

@@ -2029,6 +2030,8 @@ signal_handler(int signum)
if (signum == SIGINT || signum == SIGTERM) {
printf("\nSignal %d received, preparing to exit...\n",
signum);
+   /* uninitialize packet capture framework */
+   rte_pdump_uninit();
force_quit();
/* exit with the expected status */
signal(signum, SIG_DFL);
@@ -2049,6 +2052,9 @@ main(int argc, char** argv)
if (diag < 0)
rte_panic("Cannot init EAL\n");

+   /* initialize packet capture framework */
+   rte_pdump_init();
+
nb_ports = (portid_t) rte_eth_dev_count();
if (nb_ports == 0)
RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
-- 
2.5.0



[dpdk-dev] [PATCH v6 6/8] app/pdump: add pdump tool for packet capturing

2016-06-09 Thread Reshma Pattan
New tool added for packet capturing on dpdk.
This tool supports command line options.
This tool runs as secondary process by default.

Command line supports various parameters to capture
the packets.

User should pass on a)port and queue (or) b)pci address
and queue (or) c)device name and queue to capture
the packets.

Users also need to pass on either pcap file name or
any linux iface, on to which packets captured from dpdk
ports will be sent on for the users to view using tcpdump.

Users have option to capture packets either a) in RX
direction, b)(or) in TX direction c)(or) from both the
directions.

User can pass on ring_size and mempool parameters using
command line, but these are optional parameters.
These are used to create ring and mempool objects for packet
mirroring from primary application to tool. If user doesn't
provide any values, default values will be used internally
for the creation of the ring and mempool.

Signed-off-by: Reshma Pattan 
---
 MAINTAINERS|   1 +
 app/Makefile   |   1 +
 app/pdump/Makefile |  45 +++
 app/pdump/main.c   | 814 +
 4 files changed, 861 insertions(+)
 create mode 100644 app/pdump/Makefile
 create mode 100644 app/pdump/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index cc3ffdb..a48c8de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -435,6 +435,7 @@ F: doc/guides/sample_app_ug/packet_ordering.rst
 Pdump
 M: Reshma Pattan 
 F: lib/librte_pdump/
+F: app/pdump/

 Hierarchical scheduler
 M: Cristian Dumitrescu 
diff --git a/app/Makefile b/app/Makefile
index 1151e09..c593efa 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += pdump

 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/pdump/Makefile b/app/pdump/Makefile
new file mode 100644
index 000..96bb4af
--- /dev/null
+++ b/app/pdump/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = dpdk_pdump
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+
+SRCS-y := main.c
+
+# this application needs libraries first
+DEPDIRS-y += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/pdump/main.c b/app/pdump/main.c
new file mode 100644
index 000..a4a5ca2
--- /dev/null
+++ b/app/pdump/main.c
@@ -0,0 +1,814 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from 

[dpdk-dev] [PATCH v6 5/8] lib/librte_pdump: add new library for packet capturing support

2016-06-09 Thread Reshma Pattan
Added new library for packet capturing support.

Added public api rte_pdump_init, applications should call
this as part of their application setup to have packet
capturing framework ready.

Added public api rte_pdump_uninit to uninitialize the packet
capturing framework.

Added public apis rte_pdump_enable and rte_pdump_disable to
enable and disable packet capturing on specific port and queue.

Added public apis rte_pdump_enable_by_deviceid and
rte_pdump_disable_by_deviceid to enable and disable packet
capturing on a specific device (pci address or name) and queue.

Signed-off-by: Reshma Pattan 
---
 MAINTAINERS|   4 +
 config/common_base |   5 +
 lib/Makefile   |   1 +
 lib/librte_pdump/Makefile  |  55 +++
 lib/librte_pdump/rte_pdump.c   | 841 +
 lib/librte_pdump/rte_pdump.h   | 186 
 lib/librte_pdump/rte_pdump_version.map |  12 +
 mk/rte.app.mk  |   1 +
 8 files changed, 1105 insertions(+)
 create mode 100644 lib/librte_pdump/Makefile
 create mode 100644 lib/librte_pdump/rte_pdump.c
 create mode 100644 lib/librte_pdump/rte_pdump.h
 create mode 100644 lib/librte_pdump/rte_pdump_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e8558f..cc3ffdb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -432,6 +432,10 @@ F: app/test/test_reorder*
 F: examples/packet_ordering/
 F: doc/guides/sample_app_ug/packet_ordering.rst

+Pdump
+M: Reshma Pattan 
+F: lib/librte_pdump/
+
 Hierarchical scheduler
 M: Cristian Dumitrescu 
 F: lib/librte_sched/
diff --git a/config/common_base b/config/common_base
index 47c26f6..a2d5d72 100644
--- a/config/common_base
+++ b/config/common_base
@@ -484,6 +484,11 @@ CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
 CONFIG_RTE_LIBRTE_REORDER=y

 #
+# Compile the pdump library
+#
+CONFIG_RTE_LIBRTE_PDUMP=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/lib/Makefile b/lib/Makefile
index f254dba..ca7c02f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump

 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
new file mode 100644
index 000..af81a28
--- /dev/null
+++ b/lib/librte_pdump/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pdump.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
+CFLAGS += -D_GNU_SOURCE
+
+EXPORT_MAP := rte_pdump_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_PDUMP) := rte_pdump.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_PDUMP)-include := rte_pdump.h
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
new file mode 100644
index 

[dpdk-dev] [PATCH v6 4/8] librte_ether: make rte_eth_dev_get_port_by_name rte_eth_dev_get_name_by_port public

2016-06-09 Thread Reshma Pattan
Converted rte_eth_dev_get_port_by_name to a public API.
Converted rte_eth_dev_get_name_by_port to a public API.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c  |  4 ++--
 lib/librte_ether/rte_ethdev.h  | 29 +
 lib/librte_ether/rte_ether_version.map |  2 ++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 4d732e8..d8aac99 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -406,7 +406,7 @@ rte_eth_dev_get_addr_by_port(uint8_t port_id, struct 
rte_pci_addr *addr)
return 0;
 }

-static int
+int
 rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
 {
char *tmp;
@@ -425,7 +425,7 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
return 0;
 }

-static int
+int
 rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id)
 {
int i;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 106318f..6f5a61a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4283,6 +4283,35 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
  uint32_t mask,
  uint8_t en);

+/**
+* Get the port id from pci adrress or device name
+* Ex: :2:00.0 or vdev name eth_pcap0
+*
+* @param name
+*  pci address or name of the device
+* @param port_id
+*   pointer to port identifier of the device
+* @return
+*   - (0) if successful.
+*   - (-ENODEV or -EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id);
+
+/**
+* Get the device name from port id
+*
+* @param port_id
+*   pointer to port identifier of the device
+* @param name
+*  pci address or name of the device
+* @return
+*   - (0) if successful.
+*   - (-EINVAL) on failure.
+*/
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index d06d648..73e730d 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,5 +137,7 @@ DPDK_16.07 {
global:

rte_eth_add_first_rx_callback;
+   rte_eth_dev_get_name_by_port;
+   rte_eth_dev_get_port_by_name;
rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0



[dpdk-dev] [PATCH v6 3/8] librte_ether: add new fields to rte_eth_dev_info struct

2016-06-09 Thread Reshma Pattan
New fields nb_rx_queues and nb_tx_queues are added to
rte_eth_dev_info structure.
Changes to API rte_eth_dev_info_get() are done to update
these new fields to rte_eth_dev_info object.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c  | 2 ++
 lib/librte_ether/rte_ethdev.h  | 3 +++
 lib/librte_ether/rte_ether_version.map | 1 +
 3 files changed, 6 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f13c70a..4d732e8 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1661,6 +1661,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info)
(*dev->dev_ops->dev_infos_get)(dev, dev_info);
dev_info->pci_dev = dev->pci_dev;
dev_info->driver_name = dev->data->drv_name;
+   dev_info->nb_rx_queues = dev->data->nb_rx_queues;
+   dev_info->nb_tx_queues = dev->data->nb_tx_queues;
 }

 int
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 92b07a9..106318f 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -882,6 +882,9 @@ struct rte_eth_dev_info {
struct rte_eth_desc_lim rx_desc_lim;  /**< RX descriptors limits */
struct rte_eth_desc_lim tx_desc_lim;  /**< TX descriptors limits */
uint32_t speed_capa;  /**< Supported speeds bitmap (ETH_LINK_SPEED_). */
+   /** Configured number of rx/tx queues */
+   uint16_t nb_rx_queues; /**< Number of RX queues. */
+   uint16_t nb_tx_queues; /**< Number of TX queues. */
 };

 /**
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index c990b04..d06d648 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -137,4 +137,5 @@ DPDK_16.07 {
global:

rte_eth_add_first_rx_callback;
+   rte_eth_dev_info_get;
 } DPDK_16.04;
-- 
2.5.0



[dpdk-dev] [PATCH v6 2/8] librte_ether: add new api rte_eth_add_first_rx_callback

2016-06-09 Thread Reshma Pattan
Added new public api rte_eth_add_first_rx_callback to add given
callback as head of list.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c  | 35 ++
 lib/librte_ether/rte_ethdev.h  | 27 ++
 lib/librte_ether/rte_ether_version.map |  6 ++
 3 files changed, 68 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ce70d58..f13c70a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2939,6 +2939,41 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
 }

 void *
+rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+   rte_rx_callback_fn fn, void *user_param)
+{
+#ifndef RTE_ETHDEV_RXTX_CALLBACKS
+   rte_errno = ENOTSUP;
+   return NULL;
+#endif
+   /* check input parameters */
+   if (!rte_eth_dev_is_valid_port(port_id) || fn == NULL ||
+   queue_id >= 
rte_eth_devices[port_id].data->nb_rx_queues) {
+   rte_errno = EINVAL;
+   return NULL;
+   }
+
+   struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
+
+   if (cb == NULL) {
+   rte_errno = ENOMEM;
+   return NULL;
+   }
+
+   cb->fn.rx = fn;
+   cb->param = user_param;
+
+   rte_spinlock_lock(_eth_rx_cb_lock);
+   /* Add the callbacks at fisrt position*/
+   cb->next = rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
+   rte_smp_wmb();
+   rte_eth_devices[port_id].post_rx_burst_cbs[queue_id] = cb;
+   rte_spinlock_unlock(_eth_rx_cb_lock);
+
+   return cb;
+}
+
+void *
 rte_eth_add_tx_callback(uint8_t port_id, uint16_t queue_id,
rte_tx_callback_fn fn, void *user_param)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..92b07a9 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3825,6 +3825,33 @@ int rte_eth_dev_get_dcb_info(uint8_t port_id,
 void *rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
rte_rx_callback_fn fn, void *user_param);

+/*
+* Add a callback that must be called first on packet RX on a given port and 
queue.
+*
+* This API configures a first function to be called for each burst of
+* packets received on a given NIC port queue. The return value is a pointer
+* that can be used to later remove the callback using
+* rte_eth_remove_rx_callback().
+*
+* Multiple functions are called in the order that they are added.
+*
+* @param port_id
+*   The port identifier of the Ethernet device.
+* @param queue_id
+*   The queue on the Ethernet device on which the callback is to be added.
+* @param fn
+*   The callback function
+* @param user_param
+*   A generic pointer parameter which will be passed to each invocation of the
+*   callback function on this port and queue.
+*
+* @return
+*   NULL on error.
+*   On success, a pointer value which can later be used to remove the callback.
+*/
+void *rte_eth_add_first_rx_callback(uint8_t port_id, uint16_t queue_id,
+   rte_rx_callback_fn fn, void *user_param);
+
 /**
  * Add a callback to be called on packet TX on a given port and queue.
  *
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index 214ecc7..c990b04 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,9 @@ DPDK_16.04 {
rte_eth_tx_buffer_set_err_callback;

 } DPDK_2.2;
+
+DPDK_16.07 {
+   global:
+
+   rte_eth_add_first_rx_callback;
+} DPDK_16.04;
-- 
2.5.0



[dpdk-dev] [PATCH v6 1/8] librte_ether: protect add/remove of rxtx callbacks with spinlocks

2016-06-09 Thread Reshma Pattan
Added spinlocks around add/remove logic of rxtx callbacks to
avoid corruption of callback lists in multithreaded context.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c | 82 +--
 1 file changed, 40 insertions(+), 42 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..ce70d58 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -77,6 +77,12 @@ static uint8_t nb_ports;
 /* spinlock for eth device callbacks */
 static rte_spinlock_t rte_eth_dev_cb_lock = RTE_SPINLOCK_INITIALIZER;

+/* spinlock for add/remove rx callbacks */
+static rte_spinlock_t rte_eth_rx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* spinlock for add/remove tx callbacks */
+static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -1634,7 +1640,6 @@ rte_eth_dev_set_rx_queue_stats_mapping(uint8_t port_id, 
uint16_t rx_queue_id,
STAT_QMAP_RX);
 }

-
 void
 rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 {
@@ -2905,7 +2910,6 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
rte_errno = EINVAL;
return NULL;
}
-
struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);

if (cb == NULL) {
@@ -2916,6 +2920,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
cb->fn.rx = fn;
cb->param = user_param;

+   rte_spinlock_lock(_eth_rx_cb_lock);
/* Add the callbacks in fifo order. */
struct rte_eth_rxtx_callback *tail =
rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
@@ -2928,6 +2933,7 @@ rte_eth_add_rx_callback(uint8_t port_id, uint16_t 
queue_id,
tail = tail->next;
tail->next = cb;
}
+   rte_spinlock_unlock(_eth_rx_cb_lock);

return cb;
 }
@@ -2957,6 +2963,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t 
queue_id,
cb->fn.tx = fn;
cb->param = user_param;

+   rte_spinlock_lock(_eth_tx_cb_lock);
/* Add the callbacks in fifo order. */
struct rte_eth_rxtx_callback *tail =
rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id];
@@ -2969,6 +2976,7 @@ rte_eth_add_tx_callback(uint8_t port_id, uint16_t 
queue_id,
tail = tail->next;
tail->next = cb;
}
+   rte_spinlock_unlock(_eth_tx_cb_lock);

return cb;
 }
@@ -2987,29 +2995,24 @@ rte_eth_remove_rx_callback(uint8_t port_id, uint16_t 
queue_id,
return -EINVAL;

struct rte_eth_dev *dev = _eth_devices[port_id];
-   struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
-   struct rte_eth_rxtx_callback *prev_cb;
-
-   /* Reset head pointer and remove user cb if first in the list. */
-   if (cb == user_cb) {
-   dev->post_rx_burst_cbs[queue_id] = user_cb->next;
-   return 0;
-   }
-
-   /* Remove the user cb from the callback list. */
-   do {
-   prev_cb = cb;
-   cb = cb->next;
-
+   struct rte_eth_rxtx_callback *cb;
+   struct rte_eth_rxtx_callback **prev_cb;
+   int ret = -EINVAL;
+
+   rte_spinlock_lock(_eth_rx_cb_lock);
+   prev_cb = >post_rx_burst_cbs[queue_id];
+   for (; *prev_cb != NULL; prev_cb = >next) {
+   cb = *prev_cb;
if (cb == user_cb) {
-   prev_cb->next = user_cb->next;
-   return 0;
+   /* Remove the user cb from the callback list. */
+   *prev_cb = cb->next;
+   ret = 0;
+   break;
}
+   }
+   rte_spinlock_unlock(_eth_rx_cb_lock);

-   } while (cb != NULL);
-
-   /* Callback wasn't found. */
-   return -EINVAL;
+   return ret;
 }

 int
@@ -3026,29 +3029,24 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t 
queue_id,
return -EINVAL;

struct rte_eth_dev *dev = _eth_devices[port_id];
-   struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
-   struct rte_eth_rxtx_callback *prev_cb;
-
-   /* Reset head pointer and remove user cb if first in the list. */
-   if (cb == user_cb) {
-   dev->pre_tx_burst_cbs[queue_id] = user_cb->next;
-   return 0;
-   }
-
-   /* Remove the user cb from the callback list. */
-   do {
-   prev_cb = cb;
-   cb = cb->next;
-
+   int ret = -EINVAL;
+   struct rte_eth_rxtx_callback *cb;
+   struct rte_eth_rxtx_callback **prev_cb;
+
+   rte_spinlock_lock(_eth_tx_cb_lock);
+   prev_cb = >pre_tx_burst_cbs[queue_id];
+   for (; *prev_cb != NULL; prev_cb = 

[dpdk-dev] [PATCH v6 0/8] add packet capture framework

2016-06-09 Thread Reshma Pattan
This patch set include below changes

1)Changes to librte_ether.
2)A new library librte_pdump added for packet capture framework.
3)A new app/pdump tool added for packet capturing.
4)Test pmd changes done to initialize packet capture framework.
5)Documentation update.

1)librte_pdump
==
To support packet capturing on dpdk Ethernet devices, a new library librte_pdump
is added.Users can develop their own packet capturing application using new 
library APIs.

Operation:
--
Pdump library provides APIs to support packet capturing on dpdk Ethernet 
devices.
Library provides APIs to initialize the packet capture framework, enable/disable
the packet capture and uninitialize the packet capture framework.

Pdump library works on client/server based model.

Sever is responsible for enabling/disabling the packet captures.
Clients are responsible for requesting enable/disable of the
packet captures.

As part of packet capture framework initialization, pthread and
the server socket is created. Only one server socket is allowed on the system.
As part of enabling/disabling the packet capture, client sockets are created
and multiple client sockets are allowed.
Who ever calls initialization first they will succeed with the initialization,
next subsequent calls of initialization are not allowed. So next users can only
request enabling/disabling the packet capture.

Applications using below APIs need to pass port/device_id, queue, mempool and
ring parameters. Library uses user provided ring and mempool to mirror the rx/tx
packets of the port for users. Users need to dequeue the rings and write the 
packets
to vdev(pcap/tuntap) to view the packets using any standard tools.

Note:
Mempool and Ring should be mc/mp supportable.
Mempool mbuf size should be big enough to handle the rx/tx packets of a port.

APIs:
-
rte_pdump_init()
rte_pdump_enable()
rte_pdump_enable_by_deviceid()
rte_pdump_disable()
rte_pdump_disable_by_deviceid()
rte_pdump_uninit()

2)app/pdump tool

Tool app/pdump is designed based on librte_pdump for packet capturing in DPDK.
This tool by default runs as secondary process, and provides the support for
the command line options for packet capture.

./build/app/dpdk_pdump --
   --pdump '(port= | device_id=),
(queue=),
(rx-dev= |
 tx-dev=),
[ring-size=],
[mbuf-size=],
[total-num-mbufs=]'

Parameters inside the parenthesis represents the mandatory parameters.
Parameters inside the square brackets represents optional parameters.
User has to pass on packet capture parameters under --pdump parameters, 
multiples of
--pdump can be passed to capture packets on different port and queue 
combinations

Operation:
--
*Tool parse the user command line arguments,
creates the mempool, ring and the PCAP PMD vdev with 'tx_stream' as either
of the device passed in rx-dev|tx-dev parameters.

*Then calls the APIs of librte_pdump i.e. 
rte_pdump_enable()/rte_pdump_enable_by_deviceid()
to enable packet capturing on a specific port/device_id and queue by passing on
port|device_id, queue, mempool and ring info.

*Tool runs in while loop to dequeue the packets from the ring and write them to 
pcap device.

*Tool can be stopped using SIGINT, upon which tool calls
rte_pdump_disable()/rte_pdump_disable_by_deviceid() and free the allocated 
resources.

Note:
CONFIG_RTE_LIBRTE_PMD_PCAP flag should be set to yes to compile and run the 
pdump tool.

3)Test-pmd changes
==
Changes are done to test-pmd application to initialize/uninitialize the packet 
capture framework.
So app/pdump tool can be run to see packets of dpdk ports that are used by 
test-pmd.

Similarly any application which needs packet capture should call 
initialize/uninitialize APIs of
librte_pdump and use pdump tool to start the capture.

4)Packet capture flow between pdump tool and librte_pdump
=
* Pdump tool (Secondary process) requests packet capture
for specific port|device_id and queue combinations.

*Library in secondary process context creates client socket and communicates
the port|device_id, queue, ring and mempool to server.

*Library initializes server in primary process 'test-pmd' context and server 
serves
the client request to enable Ethernet rxtx call-backs for a given 
port|device_id and queue.

*Copy the rx/tx packets to passed mempool and enqueue the packets to ring for 
secondary process.

*Pdump tool will dequeue the packets from ring and writes them to PCAPMD vdev,
so ultimately packets will be seen on the device that is passed in 
rx-dev|tx-dev.

*Once the pdump tool is terminated with SIGINT it will disable the packet 
capturing.

*Library receives the disable packet capture request, communicate the info to 
server,
server will 

[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements

2016-06-09 Thread Olivier Matz
Hi Konstantin,

>> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg().
>> Wright now, it is a user responsibility to make sure refcnt==0 before 
>> pushing
>> mbuf back to the pool.
>> Not sure why do you consider that wrong?
>
> I do not consider this wrong and I'm all for using assert() to catch
> programming errors, however in this specific case, I think they are
> inconsistent and misleading.

 Honestly, I don't understand why.
 Right now the rule of thumb is - when mbuf is in the pool, it's refcnt 
 should be equal zero.
>>
>> What is the purpose of this? Is there some code relying on this?
> 
> The whole current implementation of mbuf_free code path relies on that.
> Straight here: 
> if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m {
> m->next = NULL;
> __rte_mbuf_raw_free(m);
> }
> 
> If we'll exclude indirect mbuf logic, all it does:
> if (rte_mbuf_refcnt_update(m, -1) == 0) {
>   m->next = NULL;
>   __rte_mbuf_raw_free(m);
> }
> 
> I.E.:
> decrement mbuf->refcnt.
> If new value of refcnt is zero, then put it back into the pool.
> 
> So having ASERT(mbuf->refcnt==0) inside
> __rte_mbuf_raw_free()/__rte_mbuf_raw_alloc()
> looks absolutely valid to me.
> I *has* to be zero at that point with current implementation,
> And if it is not then we probably have (or will have) a silent memory 
> corruption.

This explains how the refcount is used, and why it is set
to zero before returning the mbuf to the pool with the mbuf
free functions.

It does not explain which code relies on the refcnt beeing 0
while the mbuf is in the pool.


>> But since [1], it is allowed to call rte_mbuf_raw_alloc() in PMDs (all
>> PMDs were calling an internal function before).
> 
> Yes, raw_alloc is public, NP with that.
> 
>> We could argue that
>> rte_mbuf_raw_free() should also be made public for PMDs.
> 
> We could, but right now it is not.
> Again, as I said, user could use it on his own but he obviously has to
> obey the rules and do manually what __rte_pktmbuf_prefree_seg() does.
> At least:
> 
> rte_mbuf_refcnt_set(m, 0);
> __rte_mbuf_raw_free(m);
> 
>>
>> As you said below, no-one promised that the free() reverts the malloc(),
>> but given the function names, one can legitimately expect that the
>> following code is valid:
>>
>> m = __rte_mbuf_raw_alloc();
>> /* do nothing here */
>> __rte_mbuf_raw_free(m);
> 
> Surely people could (and would) expect various things...
> But the reality right now is: __rte_mbuf_raw_free() is an internal
> function and not counterpart of __rte_mbuf_raw_alloc().
> If the people don't bother to read API docs or/and the actual code,
> I can't see how we can help them :)

Yes, of course, people should read the doc.
This does not prevent to have a nice API that behaves in a
natural way :)

By the way, the fact that today the mbuf refcnt should be 0 while
in a pool is not in the api doc, but in the code.

>> If no code relies on having the refcnt set to 0 when a mbuf is in
>> the pool, I suggest to relax this constraint as Adrien proposed.
> 
> Why not just rename it to __rte_mbuf_raw_free_dont_use_after_raw_alloc()?
> To avoid any further confusion :)
> Seriously speaking I would prefer to leave it as it is.
> If you feel we have to introduce a counterpart of  rte_mbuf_raw_alloc(),
> we can make a new public one:
> 
> rte_mbuf_raw_free(stuct rte_mbuf *m)
> {
>   if (rte_mbuf_refcnt_update(m, -1) == 0)
> __rte_mbuf_raw_free(m);
> }

This is an option, but I think it's not efficient to touch
the mbuf structure when allocating/freeing. See below.

>> Then, my opinion is that the refcount should be set to 1 in
>> rte_pktmbuf_reset().
> 
> I don't think we need to update rte_pktmbuf_reset(),
> it doesn't touch refcnt at all and probably better to keep it that way.

Why would it be better?
All mbuf struct initializations are done in that function, why would
it be different for the refcnt?

> To achieve what you suggesting, we probably need to:
> 1) update  _rte_pktmbuf_prefree_seg and rte_pktmbuf_detach() to
> set refcnt back to 1, i.e:
> 
> static inline struct rte_mbuf* __attribute__((always_inline))
> __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> {
> __rte_mbuf_sanity_check(m, 0);
> 
> if (likely(rte_mbuf_refcnt_update(m, -1) == 0)) {
> /* if this is an indirect mbuf, it is detached. */
> if (RTE_MBUF_INDIRECT(m))
> rte_pktmbuf_detach(m);
> + rte_mbuf_refcnt_set(m, 1);
> return m;
> }
> return NULL;
> }
> 
> 2) either:
>a) update mbuf constructor function, so it sets refcnt=1.
> I suppose that is easy for rte_pktmbuf_init(), but it means that all 
> custom
> constructors should do the same.
> Which means possible changes in existing user code and all ABI change 
> related hassle.
>   b) keep rte_mbuf_raw_alloc() 

[dpdk-dev] [PATCH v3 9/9] doc: update ipsec sample guide

2016-06-09 Thread Sergio Gonzalez Monroy
Signed-off-by: Sergio Gonzalez Monroy 
---
 doc/guides/sample_app_ug/img/ipsec_endpoints.svg | 850 +
 doc/guides/sample_app_ug/ipsec_secgw.rst | 910 ++-
 2 files changed, 1400 insertions(+), 360 deletions(-)
 create mode 100644 doc/guides/sample_app_ug/img/ipsec_endpoints.svg

diff --git a/doc/guides/sample_app_ug/img/ipsec_endpoints.svg 
b/doc/guides/sample_app_ug/img/ipsec_endpoints.svg
new file mode 100644
index 000..e4aba4c
--- /dev/null
+++ b/doc/guides/sample_app_ug/img/ipsec_endpoints.svg
@@ -0,0 +1,850 @@
+
+
+
+http://www.openswatchbook.org/uri/2009/osb;
+   xmlns:dc="http://purl.org/dc/elements/1.1/;
+   xmlns:cc="http://creativecommons.org/ns#;
+   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#;
+   xmlns:svg="http://www.w3.org/2000/svg;
+   xmlns="http://www.w3.org/2000/svg;
+   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd;
+   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape;
+   width="155.68507mm"
+   height="76.061203mm"
+   viewBox="0 0 551.64003 269.50821"
+   id="svg2"
+   version="1.1"
+   inkscape:version="0.91 r13725"
+   sodipodi:docname="endpoints.svg">
+  
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+
+
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+
+  
+
+  
+  
+  
+
+  
+image/svg+xml
+http://purl.org/dc/dcmitype/StillImage; />
+
+  
+
+  
+  
+
+
+
+
+ep0
+ep1
+traffic gen
+traffic gen
+
+
+
+
+
+2
+3
+2
+3
+0
+1
+
+0
+1
+
+UNPROTECTEDcipher-text
+PROTECTEDclear-text
+
+
+outbound
+inbound
+
+
+
+
+
+
+  
+
diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst 
b/doc/guides/sample_app_ug/ipsec_secgw.rst
index c11c7e7..66dd326 100644
--- a/doc/guides/sample_app_ug/ipsec_secgw.rst
+++ b/doc/guides/sample_app_ug/ipsec_secgw.rst
@@ -38,165 +38,171 @@ Overview
 

 The application demonstrates the implementation of a Security Gateway
-(not IPsec compliant, see Constraints bellow) using DPDK based on RFC4301,
+(not IPsec compliant, see the Constraints section below) using DPDK based on 
RFC4301,
 RFC4303, RFC3602 and RFC2404.

 Internet Key Exchange (IKE) is not implemented, so only manual setting of
 Security Policies and Security Associations is supported.

 The Security Policies (SP) are implemented as ACL rules, the Security
-Associations (SA) are stored in a table and the Routing is implemented
+Associations (SA) are stored in a table and the routing is implemented
 using LPM.

-The application classify the ports between Protected and Unprotected.
-Thus, traffic received in an Unprotected or Protected port is consider
+The application classifies the ports as *Protected* and *Unprotected*.
+Thus, traffic received on an Unprotected or Protected port is consider
 Inbound or Outbound respectively.

-Path for IPsec Inbound traffic:
+The Path for IPsec Inbound traffic is:

-*  Read packets from the port
+*  Read packets from the port.
 *  Classify packets between IPv4 and ESP.
-*  Inbound SA lookup for ESP packets based on their SPI
-*  Verification/Decryption
-*  Removal of ESP and outer IP header
-*  Inbound SP check using ACL of decrypted packets and any other IPv4 packet
-   we read.
-*  Routing
-*  Write packet to port
-
-Path for IPsec Outbound traffic:
-
-*  Read packets from the port
-*  Outbound SP check using ACL of all IPv4 traffic
-*  Outbound SA lookup for packets that need IPsec protection
-*  Add ESP and outer IP header
-*  Encryption/Digest
-*  Routing
-*  Write packet to port
+*  Perform Inbound SA lookup for ESP packets based on their SPI.
+*  Perform Verification/Decryption.
+*  Remove ESP and outer IP header
+*  Inbound SP check using ACL of decrypted packets and any other IPv4 packets.
+*  Routing.
+*  Write packet to port.
+
+The Path for the IPsec Outbound traffic is:
+
+*  Read packets from the port.
+*  Perform Outbound SP check using ACL of all IPv4 traffic.
+*  Perform Outbound SA lookup for packets that need IPsec protection.
+*  Add ESP and outer IP header.
+*  Perform Encryption/Digest.
+*  Routing.
+*  Write packet to port.
+

 Constraints
 ---
-*  IPv4 traffic
-*  ESP tunnel mode
-*  EAS-CBC, HMAC-SHA1 and NULL
-*  Each SA must be handle by a unique lcore (1 RX queue per port)
-*  No chained mbufs
+
+*  No IPv6 options headers.
+*  No AH mode.

[dpdk-dev] [PATCH v3 8/9] examples/ipsec-secgw: transport mode support

2016-06-09 Thread Sergio Gonzalez Monroy
IPSec transport mode support.

Signed-off-by: Sergio Gonzalez Monroy 
---
 examples/ipsec-secgw/esp.c   | 124 ++-
 examples/ipsec-secgw/ipsec.h |   1 +
 examples/ipsec-secgw/rt.c|  32 +++
 examples/ipsec-secgw/sa.c|  39 ++
 examples/ipsec-secgw/sp4.c   |  40 ++
 examples/ipsec-secgw/sp6.c   |  48 +
 6 files changed, 248 insertions(+), 36 deletions(-)

diff --git a/examples/ipsec-secgw/esp.c b/examples/ipsec-secgw/esp.c
index e1fde36..05caa77 100644
--- a/examples/ipsec-secgw/esp.c
+++ b/examples/ipsec-secgw/esp.c
@@ -42,7 +42,6 @@
 #include 

 #include 
-#include 
 #include 
 #include 
 #include 
@@ -70,21 +69,24 @@ int
 esp_inbound(struct rte_mbuf *m, struct ipsec_sa *sa,
struct rte_crypto_op *cop)
 {
-   int32_t payload_len, ip_hdr_len;
+   struct ip *ip4;
struct rte_crypto_sym_op *sym_cop;
+   int32_t payload_len, ip_hdr_len;

RTE_ASSERT(m != NULL);
RTE_ASSERT(sa != NULL);
RTE_ASSERT(cop != NULL);

-   ip_hdr_len = 0;
-   switch (sa->flags) {
-   case IP4_TUNNEL:
-   ip_hdr_len = sizeof(struct ip);
-   break;
-   case IP6_TUNNEL:
+   ip4 = rte_pktmbuf_mtod(m, struct ip *);
+   if (likely(ip4->ip_v == IPVERSION))
+   ip_hdr_len = ip4->ip_hl * 4;
+   else if (ip4->ip_v == IP6_VERSION)
+   /* XXX No option headers supported */
ip_hdr_len = sizeof(struct ip6_hdr);
-   break;
+   else {
+   RTE_LOG(ERR, IPSEC_ESP, "invalid IP packet type %d\n",
+   ip4->ip_v);
+   return -EINVAL;
}

payload_len = rte_pktmbuf_pkt_len(m) - ip_hdr_len -
@@ -126,6 +128,8 @@ int
 esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa,
struct rte_crypto_op *cop)
 {
+   struct ip *ip4, *ip;
+   struct ip6_hdr *ip6;
uint8_t *nexthdr, *pad_len;
uint8_t *padding;
uint16_t i;
@@ -135,7 +139,7 @@ esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa,
RTE_ASSERT(cop != NULL);

if (cop->status != RTE_CRYPTO_OP_STATUS_SUCCESS) {
-   RTE_LOG(ERR, IPSEC_ESP, "Failed crypto op\n");
+   RTE_LOG(ERR, IPSEC_ESP, "failed crypto op\n");
return -1;
}

@@ -146,7 +150,7 @@ esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa,
padding = pad_len - *pad_len;
for (i = 0; i < *pad_len; i++) {
if (padding[i] != i + 1) {
-   RTE_LOG(ERR, IPSEC_ESP, "invalid pad_len field\n");
+   RTE_LOG(ERR, IPSEC_ESP, "invalid padding\n");
return -EINVAL;
}
}
@@ -157,7 +161,23 @@ esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa,
return -EINVAL;
}

-   ipip_inbound(m, sizeof(struct esp_hdr) + sa->iv_len);
+   if (unlikely(sa->flags == TRANSPORT)) {
+   ip = rte_pktmbuf_mtod(m, struct ip *);
+   ip4 = (struct ip *)rte_pktmbuf_adj(m,
+   sizeof(struct esp_hdr) + sa->iv_len);
+   if (likely(ip->ip_v == IPVERSION)) {
+   memmove(ip4, ip, ip->ip_hl * 4);
+   ip4->ip_p = *nexthdr;
+   ip4->ip_len = htons(rte_pktmbuf_data_len(m));
+   } else {
+   ip6 = (struct ip6_hdr *)ip4;
+   /* XXX No option headers supported */
+   memmove(ip6, ip, sizeof(struct ip6_hdr));
+   ip6->ip6_nxt = *nexthdr;
+   ip6->ip6_plen = htons(rte_pktmbuf_data_len(m));
+   }
+   } else
+   ipip_inbound(m, sizeof(struct esp_hdr) + sa->iv_len);

return 0;
 }
@@ -166,31 +186,57 @@ int
 esp_outbound(struct rte_mbuf *m, struct ipsec_sa *sa,
struct rte_crypto_op *cop)
 {
-   uint16_t pad_payload_len, pad_len, ip_hdr_len;
struct ip *ip4;
struct ip6_hdr *ip6;
-   struct esp_hdr *esp;
-   int32_t i;
-   char *padding;
+   struct esp_hdr *esp = NULL;
+   uint8_t *padding, *new_ip, nlp;
struct rte_crypto_sym_op *sym_cop;
+   int32_t i;
+   uint16_t pad_payload_len, pad_len, ip_hdr_len;

RTE_ASSERT(m != NULL);
RTE_ASSERT(sa != NULL);
RTE_ASSERT(cop != NULL);

-   /* Payload length */
-   pad_payload_len = RTE_ALIGN_CEIL(rte_pktmbuf_pkt_len(m) + 2,
-   sa->block_size);
-   pad_len = pad_payload_len - rte_pktmbuf_pkt_len(m);
-
ip_hdr_len = 0;
-   switch (sa->flags) {
-   case IP4_TUNNEL:
+
+   ip4 = rte_pktmbuf_mtod(m, struct ip *);
+   if (likely(ip4->ip_v == IPVERSION)) {
+   if (unlikely(sa->flags == TRANSPORT)) {
+   ip_hdr_len = ip4->ip_hl * 4;
+  

[dpdk-dev] [PATCH v3 6/9] examples/ipsec-secgw: consistent config variable names

2016-06-09 Thread Sergio Gonzalez Monroy
Modify the default SP config variables names to be consistent with SA.

The resulting naming convention is that variables with suffixes _out/_in
are the default for ep0 and the reverse for ep1.

Signed-off-by: Sergio Gonzalez Monroy 
---
 examples/ipsec-secgw/sp.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/examples/ipsec-secgw/sp.c b/examples/ipsec-secgw/sp.c
index 4f16730..6aa377d 100644
--- a/examples/ipsec-secgw/sp.c
+++ b/examples/ipsec-secgw/sp.c
@@ -112,7 +112,7 @@ struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {

 RTE_ACL_RULE_DEF(acl4_rules, RTE_DIM(ipv4_defs));

-const struct acl4_rules acl4_rules_in[] = {
+const struct acl4_rules acl4_rules_out[] = {
{
.data = {.userdata = PROTECT(5), .category_mask = 1, .priority = 1},
/* destination IPv4 */
@@ -175,7 +175,7 @@ const struct acl4_rules acl4_rules_in[] = {
}
 };

-const struct acl4_rules acl4_rules_out[] = {
+const struct acl4_rules acl4_rules_in[] = {
{
.data = {.userdata = PROTECT(5), .category_mask = 1, .priority = 1},
/* destination IPv4 */
@@ -343,15 +343,15 @@ sp_init(struct socket_ctx *ctx, int socket_id, unsigned 
ep)
"initialized\n", socket_id);

if (ep == 0) {
-   rules_out = acl4_rules_in;
-   nb_out_rules = RTE_DIM(acl4_rules_in);
-   rules_in = acl4_rules_out;
-   nb_in_rules = RTE_DIM(acl4_rules_out);
-   } else if (ep == 1) {
rules_out = acl4_rules_out;
nb_out_rules = RTE_DIM(acl4_rules_out);
rules_in = acl4_rules_in;
nb_in_rules = RTE_DIM(acl4_rules_in);
+   } else if (ep == 1) {
+   rules_out = acl4_rules_in;
+   nb_out_rules = RTE_DIM(acl4_rules_in);
+   rules_in = acl4_rules_out;
+   nb_in_rules = RTE_DIM(acl4_rules_out);
} else
rte_exit(EXIT_FAILURE, "Invalid EP value %u. "
"Only 0 or 1 supported.\n", ep);
-- 
2.5.5



[dpdk-dev] [PATCH v3 5/9] examples/ipsec-secgw: fix no sa found case

2016-06-09 Thread Sergio Gonzalez Monroy
The application only ASSERTS that an SA is not NULL (only when debugging
is enabled) without properly dealing with the case of not having an SA
for the processed packet.

Behavior should be such as if no SA is found, drop the packet.

Fixes: d299106e8e31 ("examples/ipsec-secgw: add IPsec sample application")

Signed-off-by: Sergio Gonzalez Monroy 
---
 examples/ipsec-secgw/ipsec.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index 90a9a86..ccc840f 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -110,6 +110,11 @@ ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx 
*ipsec_ctx,
struct ipsec_sa *sa;

for (i = 0; i < nb_pkts; i++) {
+   if (unlikely(sas[i] == NULL)) {
+   rte_pktmbuf_free(pkts[i]);
+   continue;
+   }
+
rte_prefetch0(sas[i]);
rte_prefetch0(pkts[i]);

@@ -117,8 +122,6 @@ ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx 
*ipsec_ctx,
sa = sas[i];
priv->sa = sa;

-   RTE_ASSERT(sa != NULL);
-
priv->cop.type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;

rte_prefetch0(>sym_cop);
-- 
2.5.5



  1   2   >