[dpdk-dev] [PATCH 0/2] fix missing dependencies

2015-12-02 Thread Thomas Monjalon
2015-12-02 17:09, Declan Doherty:
> On 02/12/15 04:05, Stephen Hemminger wrote:
> > Fix some issues found when doing parallel builds
> >
> > Stephen Hemminger (2):
> >cmdline_test: add missing dependencies
> >bonding: add depencency on cmdline library
> >
> >   app/cmdline_test/Makefile| 3 +++
> >   drivers/net/bonding/Makefile | 1 +
> >   2 files changed, 4 insertions(+)
> >
> Series Acked-by: Declan Doherty

Applied, thanks


[dpdk-dev] [PATCH v4 0/2] disable CONFIG_RTE_SCHED_VECTOR for arm

2015-12-02 Thread Thomas Monjalon
2015-11-30 22:50, Jerin Jacob:
> 
> v1..v2
> created common arm64 configs under common_arm64 file.
> let  each armv8 machine targets  capture only the differences
> between the common arm64 config.
> 
> v2..v3
> Fix whitespace issue with git am
> 
> v3..v4
> removed common_arm64 file and used defconfig_arm64-armv8a-linuxapp-gcc
> as base
> 
> Jerin Jacob (2):
>   config: use defconfig_arm64-armv8a-linuxapp-gcc as base for arm64
> targets
>   config: disable CONFIG_RTE_SCHED_VECTOR for arm

Applied, thanks
Even if the compilation is already fixed, it is a good idea to explicitly
mark SCHED_VECTOR as disabled instead of relying on internal disabling.


[dpdk-dev] [PATCH] mk: disable SCHED_VECTOR in the default config

2015-12-02 Thread Thomas Monjalon
Hi,

2015-12-01 16:13, Christian Ehrhardt:
> As it causes issues when building with RTE_MACHINE=default due to SSE4.x
> requirements and in other discussions was so far rated "lightly tested and
> doesn't provide really significant performance improvement" let us disable
> that in the default config.
> (=> http://dpdk.org/ml/archives/dev/2015-November/029067.html)

Is your issue fixed with the following patch?
http://dpdk.org/browse/dpdk/commit/?id=1985903e4454



[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 05:57:10PM +0100, Thomas Monjalon wrote:
> 2015-12-02 22:23, Jerin Jacob:
> > On Wed, Dec 02, 2015 at 05:40:13PM +0100, Thomas Monjalon wrote:
> > > 2015-12-02 20:04, Jerin Jacob:
> > > > On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> > > > > On 2 December 2015 at 18:39, Jerin Jacob  > > > > caviumnetworks.com> wrote:
> > > > > > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > > > > > that lead to multiple definition and its not good.
> > > > > >
> > > > > But you will have similar issue since "typedef int32x4_t __m128i"
> > > > > appears in both your patch and this header file.
> > > > 
> > > > I just tested it, it won't break, back to back "typedef int32x4_t 
> > > > __m128i"
> > > > is fine(unlike inline function).
> > > > 
> > > > my intention to keep __m128i "as is"  because changing the __m128i to 
> > > > rte_???
> > > > something would break the ABI.
> > > 
> > > Isn't it already broken in 2.2?
> > 
> > Does it mean, You would like to have rte_128i(or similar) kind of
> > abstraction to represent 128bit SIMD variable in DPDK?
> 
> If you are convinced that it is the best way to write a generic code, yes.
> I think the most important question is to know what is the best solution
> for performance and maintainability. The API/ABI questions will be considered

IMO, a true portable platform-independent library may need rte_128i kind
of abstracttion to represent a 128bit SIMD variable. I can send an RFC
patch to see the changes required across the DPDK.


> after.
> 
> Thanks for your involvement guys.


[dpdk-dev] [PATCH v8 00/11] Add installation rules for dpdk files.

2015-12-02 Thread Arevalo, Mario Alfredo C
Thank you, I'm going to take note about it for a version number 9 :)

Thanks.
Mario.

From: Panu Matilainen [pmati...@redhat.com]
Sent: Wednesday, December 02, 2015 1:33 AM
To: Arevalo, Mario Alfredo C; dev at dpdk.org
Cc: Venegas Munoz, Jos C
Subject: Re: [dpdk-dev] [PATCH v8 00/11] Add installation rules for dpdk files.

On 12/01/2015 09:39 PM, Mario Carrillo wrote:
> DPDK package lacks of a mechanism to install libraries, headers
> applications, kernel modules and sdk files to a file system tree.
> This patch set allows to install files based on the next
> proposal:
> http://www.freedesktop.org/software/systemd/man/file-hierarchy.html
>
> v8:
>
> When "make install" is invoked if "T" variable is defined,
> the installation process will have the current
> behaviour, else "install-fhs" rule will be called.
>
> Using rules support is possible to do the next steps:
>
> make config T=
> make
> make 
>
> Modify the makefile target to specify the files
> that will be installed using a rule:
>
> * make install-bin (install app files)(dafault path 
> bindir=$(exec_prefix)/bin).
>
> * make install-headers (install headers)(dafault path 
> includedir=$(prefix)/include/dpdk).
>
> * make install-lib (install libraries)(dafault path 
> libdir=$(exec_prefix)/lib).
>
> * make install-doc (install documentation)(dafault path 
> docdir=$(datarootdir)/doc/dpdk).
>
> * make install-mod (install modules)(dafault path if RTE_EXEC_ENV=linuxapp 
> then
>  kerneldir=/lib/modules/$(uname -r)/extra/drivers/dpdk else 
> kerneldir=/boot/modules).
>
> * make install-sdk (install headers, makefiles, scripts,examples and
>  config files) (default path sdkdir=$(datadir)/share/dpdk).
>
> * make install-fhs (install  libraries, modules, app files, tools and 
> documentation).
>
> * make install (if T is defined current behaviour, else it will call 
> install-fhs rule).
>
> The following defaults apply:
>
> prefix=/usr/local
> exec_prefix=$(prefix)
> datarootdir=$(prefix)/share
>
> All path variables can be overridden and all targets can use the "DESTDIR"
> variable.
>
> Furthermore this information is added to documentation.

Overall, does what it promises.

One point I just realized from comparing with Thomas' variant is that
this by default installs documentation sources, ie the raw .rst files
and does not include any "compiled" formats even if they exist.

It might be better to leave docs out by default as Thomas' version does.
One way of achieving that is only install docs if $(RTE_OUTPUT)/doc, and
only install anything in that directory. That way you have to request
doc generation specifically with "make doc" first (which has quite some
build-dependencies so you might not always wnat it), and only the
compiled docs get installed. Or something like that.

- Panu -


[dpdk-dev] [PATCH 1/5] vhost: refactor rte_vhost_dequeue_burst

2015-12-02 Thread Stephen Hemminger
On Thu,  3 Dec 2015 14:06:09 +0800
Yuanhan Liu  wrote:

> + rte_prefetch0((void *)(uintptr_t)desc_addr);

Another unnecessary set of casts.


[dpdk-dev] [PATCH 1/5] vhost: refactor rte_vhost_dequeue_burst

2015-12-02 Thread Stephen Hemminger
On Thu,  3 Dec 2015 14:06:09 +0800
Yuanhan Liu  wrote:

> +#define COPY(dst, src) do {  \
> + cpy_len = RTE_MIN(desc_avail, mbuf_avail);  \
> + rte_memcpy((void *)(uintptr_t)(dst),\
> +(const void *)(uintptr_t)(src), cpy_len);\
> + \
> + mbuf_avail  -= cpy_len; \
> + mbuf_offset += cpy_len; \
> + desc_avail  -= cpy_len; \
> + desc_offset += cpy_len; \
> +} while(0)
> +

I see lots of issues here.

All those void * casts are unnecessary, C casts arguements already.
rte_memcpy is slower for constant size values than memcpy()

This macro violates the rule that ther should be no hidden variables
in a macro. I.e you are assuming cpy_len, desc_avail, and mbuf_avail
are defined in all code using the macro.

Why use an un-typed macro when an inline function would be just
as fast and give type safety?


[dpdk-dev] Aligning net/ethernet.h and rte_ether.h

2015-12-02 Thread Thomas Monjalon
2015-12-02 11:45, Stephen Hemminger:
> I would like to just have rte_ether.h include netinet/ether.h
> to get rid of the useless duplication, and fix all the code in DPDK.
> But this will break out-of-tree source compatibility so best to
> wait for DPDK 2.3. Is there a good place to put this in 2.2 release notes?

deprecation.rst?


[dpdk-dev] [PATCH 2/4] vhost: introduce vhost_log_write

2015-12-02 Thread Yuanhan Liu
On Wed, Dec 02, 2015 at 03:53:01PM +0200, Victor Kaplansky wrote:
> On Wed, Dec 02, 2015 at 11:43:11AM +0800, Yuanhan Liu wrote:
> > Introduce vhost_log_write() helper function to log the dirty pages we
> > touched. Page size is harded code to 4096 (VHOST_LOG_PAGE), and each
> > log is presented by 1 bit.
> > 
> > Therefore, vhost_log_write() simply finds the right bit for related
> > page we are gonna change, and set it to 1. dev->log_base denotes the
> > start of the dirty page bitmap.
> > 
> > The page address is biased by log_guest_addr, which is derived from
> > SET_VRING_ADDR request as part of the vring related addresses.
> > 
> > Signed-off-by: Yuanhan Liu 
> > ---
> >  lib/librte_vhost/rte_virtio_net.h | 34 ++
> >  lib/librte_vhost/virtio-net.c |  4 
> >  2 files changed, 38 insertions(+)
> > 
> > diff --git a/lib/librte_vhost/rte_virtio_net.h 
> > b/lib/librte_vhost/rte_virtio_net.h
> > index 416dac2..191c1be 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -40,6 +40,7 @@
> >   */
> >  
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -59,6 +60,8 @@ struct rte_mbuf;
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> >  
> > +#define VHOST_LOG_PAGE 4096
> > +
> >  
> >  /* Enum for virtqueue management. */
> >  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> > @@ -82,6 +85,7 @@ struct vhost_virtqueue {
> > struct vring_desc   *desc;  /**< Virtqueue 
> > descriptor ring. */
> > struct vring_avail  *avail; /**< Virtqueue 
> > available ring. */
> > struct vring_used   *used;  /**< Virtqueue used 
> > ring. */
> > +   uint64_tlog_guest_addr; /**< Physical address 
> > of used ring, for logging */
> > uint32_tsize;   /**< Size of descriptor 
> > ring. */
> > uint32_tbackend;/**< Backend value to 
> > determine if device should started/stopped. */
> > uint16_tvhost_hlen; /**< Vhost header 
> > length (varies depending on RX merge buffers. */
> > @@ -203,6 +207,36 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
> > return vhost_va;
> >  }
> >  
> > +static inline void __attribute__((always_inline))
> > +vhost_log_page(uint8_t *log_base, uint64_t page)
> > +{
> > +   /* TODO: to make it atomic? */
> > +   log_base[page / 8] |= 1 << (page % 8);
> 
> I think the atomic OR operation is necessary only if there can be
> more than one vhost-user back-end updating the guest's memory
> simultaneously. However probably it is pretty safe to perform
> regular OR operation, since rings are not shared between
> back-end. What about buffers pointed by descriptors?  To be on
> the safe side, I would use a GCC built-in function
> __sync_fetch_and_or(). 

The build has to be passed not only for gcc, but for icc and clang as
well.

> 
> > +}
> > +
> > +static inline void __attribute__((always_inline))
> > +vhost_log_write(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > +   uint64_t offset, uint64_t len)
> > +{
> > +   uint64_t addr = vq->log_guest_addr;
> > +   uint64_t page;
> > +
> > +   if (unlikely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> > +!dev->log_base || !len))
> > +   return;
> 
> Isn't "likely" more appropriate in above, since the whole
> expression is expected to be true most of the time?

Sorry, it's a typo, and thanks for the catching.

--yliu


[dpdk-dev] [PATCH 3/4] vhost: log vring changes

2015-12-02 Thread Yuanhan Liu
On Wed, Dec 02, 2015 at 04:07:02PM +0200, Victor Kaplansky wrote:
> On Wed, Dec 02, 2015 at 11:43:12AM +0800, Yuanhan Liu wrote:
> > Invoking vhost_log_write() to mark corresponding page as dirty while
> > updating used vring.
> 
> Looks good, thanks!
> 
> I didn't find where you log the dirty pages in result of data
> written to the buffers pointed by the descriptors in RX vring.
> AFAIU, the buffers of RX queue reside in guest's memory and have
> to be marked as dirty if they are written. What do you say?

Yeah, we should. I got a question then: why log_guest_addr is set
to the physical address of used vring in guest? I mean, apparently,
we need log more changes other than used vring only.

--yliu


[dpdk-dev] [PATCH 0/4 for 2.3] vhost-user live migration support

2015-12-02 Thread Yuanhan Liu
On Wed, Dec 02, 2015 at 04:10:56PM +0200, Victor Kaplansky wrote:
...
> > Note: this patch set has mostly been based on Victor Kaplansk's demo
> > work (vhost-user-bridge) at QEMU project. I was thinking to add Victor
> > as the co-author. Victor, what do you think of that? :)
> 
> Thanks for adding me to credits list!

Great, I will add your signed-off-by since v2. Will that be okay to you?

--yliu


[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Yuanhan Liu
On Wed, Dec 02, 2015 at 03:53:45PM +0200, Panu Matilainen wrote:
> On 12/02/2015 05:43 AM, Yuanhan Liu wrote:
> >VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk
> >vhost-user) where we should log dirty pages, and how big the log
> >buffer is.
> >
> >This request introduces a new payload:
> >
> > typedef struct VhostUserLog {
> > uint64_t mmap_size;
> > uint64_t mmap_offset;
> > } VhostUserLog;
> >
> >Also, a fd is delivered from QEMU by ancillary data.
> >
> >With those info given, an area of memory is mmaped, assigned
> >to dev->log_base, for logging dirty pages.
> >
> >Signed-off-by: Yuanhan Liu 
> >---
> >  lib/librte_vhost/rte_virtio_net.h |  2 ++
> >  lib/librte_vhost/vhost_user/vhost-net-user.c  |  7 -
> >  lib/librte_vhost/vhost_user/vhost-net-user.h  |  6 
> >  lib/librte_vhost/vhost_user/virtio-net-user.c | 44 
> > +++
> >  lib/librte_vhost/vhost_user/virtio-net-user.h |  1 +
> >  5 files changed, 59 insertions(+), 1 deletion(-)
> >
> >diff --git a/lib/librte_vhost/rte_virtio_net.h 
> >b/lib/librte_vhost/rte_virtio_net.h
> >index 5687452..416dac2 100644
> >--- a/lib/librte_vhost/rte_virtio_net.h
> >+++ b/lib/librte_vhost/rte_virtio_net.h
> >@@ -127,6 +127,8 @@ struct virtio_net {
> >  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> > charifname[IF_NAME_SZ]; /**< Name of the tap 
> > device or socket path. */
> > uint32_tvirt_qp_nb; /**< number of queue pair we 
> > have allocated */
> >+uint64_tlog_size;   /**< Size of log area */
> >+uint8_t *log_base;  /**< Where dirty pages are 
> >logged */
> > void*priv;  /**< private context */
> > struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  /**< 
> > Contains all virtqueue information. */
> >  } __rte_cache_aligned;
> 
> This (and other changes in patch 2 breaks the librte_vhost ABI
> again, so you'd need to at least add a deprecation note to 2.2 to be
> able to do it in 2.3 at all according to the ABI policy.

I was thinking that adding a new field (instead of renaming it or
removing it) isn't an ABI break. So, I was wrong?

> 
> Perhaps a better option would be adding some padding to the structs
> now for 2.2 since the vhost ABI is broken there anyway. That would
> at least give a chance to keep it compatible from 2.2 to 2.3.

It will not be compatible, unless we add exact same fields (not
something like uint8_t pad[xx]). Otherwise, the pad field renaming
is also an ABI break, right?

Thomas, should I write an ABI deprecation note? Can I make it for
v2.2 release If I make one tomorrow? (Sorry that I'm not awared
of that it would be an ABI break).

--yliu


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 05:40:13PM +0100, Thomas Monjalon wrote:
> 2015-12-02 20:04, Jerin Jacob:
> > On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> > > On 2 December 2015 at 18:39, Jerin Jacob  > > caviumnetworks.com> wrote:
> > > > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > > > that lead to multiple definition and its not good.
> > > >
> > > But you will have similar issue since "typedef int32x4_t __m128i"
> > > appears in both your patch and this header file.
> > 
> > I just tested it, it won't break, back to back "typedef int32x4_t __m128i"
> > is fine(unlike inline function).
> > 
> > my intention to keep __m128i "as is"  because changing the __m128i to 
> > rte_???
> > something would break the ABI.
> 
> Isn't it already broken in 2.2?

Does it mean, You would like to have rte_128i(or similar) kind of
abstraction to represent 128bit SIMD variable in DPDK?


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jianbo Liu
On 2 December 2015 at 18:39, Jerin Jacob  
wrote:
> On Wed, Dec 02, 2015 at 05:49:41PM +0800, Jianbo Liu wrote:
>> On 2 December 2015 at 16:03, Jerin Jacob  
>> wrote:
>> > On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
>> >> On 2 December 2015 at 00:41, Jerin Jacob > >> caviumnetworks.com> wrote:
>> >> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
>> >> >> Adds ARM NEON support for lpm.
>> >> >> And enables table/pipeline libraries which depend on lpm.
>> >> >
>> >> > I already sent the patch on the same yesterday.
>> >> > We can converge the patches after the discussion.
>> >> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
>> >> >
>> >> Yes, I have read your patch. But there are many differences, so I sent
>> >> mine for your reviewing :)
>> >>
>> >> >
>> >> >>
>> >> >> Signed-off-by: Jianbo Liu 
>> >> >> ---
>> >> >>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
>> >> >>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
>> >> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
>> >> >>  lib/librte_lpm/rte_lpm.h  | 68 
>> >> >> ---
>> >> >>  4 files changed, 77 insertions(+), 25 deletions(-)
>> >> >>
>> >> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
>> >> >> b/config/defconfig_arm-armv7a-linuxapp-gcc
>> >> >> index cbebd64..efffa1f 100644
>> >> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
>> >> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
>> >> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
>> >> >>  CONFIG_RTE_EAL_IGB_UIO=n
>> >> >>
>> >> >>  # fails to compile on ARM
>> >> >> -CONFIG_RTE_LIBRTE_LPM=n
>> >> >> -CONFIG_RTE_LIBRTE_TABLE=n
>> >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
>> >> >>  CONFIG_RTE_SCHED_VECTOR=n
>> >> >>
>> >> >>  # cannot use those on ARM
>> >> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
>> >> >> b/config/defconfig_arm64-armv8a-linuxapp-gcc
>> >> >> index 504f3ed..57f7941 100644
>> >> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
>> >> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
>> >> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
>> >> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
>> >> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
>> >> >>
>> >> >> -CONFIG_RTE_LIBRTE_LPM=n
>> >> >> -CONFIG_RTE_LIBRTE_TABLE=n
>> >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
>> >> >>  CONFIG_RTE_SCHED_VECTOR=n
>> >> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
>> >> >> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> >> >> index a33c054..7437711 100644
>> >> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> >> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> >> >> @@ -41,6 +41,8 @@ extern "C" {
>> >> >>
>> >> >>  typedef int32x4_t xmm_t;
>> >> >>
>> >> >> +typedef int32x4_t __m128i;
>> >> >> +
>> >> >>  #define  XMM_SIZE(sizeof(xmm_t))
>> >> >>  #define  XMM_MASK(XMM_SIZE - 1)
>> >> >>
>> >> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
>> >> >>   double   pd[XMM_SIZE / sizeof(double)];
>> >> >>  } __attribute__((aligned(16))) rte_xmm_t;
>> >> >>
>> >> >> +static __inline __m128i
>> >> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
>> >> >> +{
>> >> >> + int32_t r[4] = {i0, i1, i2, i3};
>> >> >> +
>> >> >> + return vld1q_s32(r);
>> >> >> +}
>> >> >> +
>> >> >> +static __inline __m128i
>> >> >> +_mm_loadu_si128(__m128i *p)
>> >> >> +{
>> >> >> + return vld1q_s32((int32_t *)p);
>> >> >> +}
>> >> >> +
>> >> >> +static __inline __m128i
>> >> >> +_mm_set1_epi32(int i)
>> >> >> +{
>> >> >> + return vdupq_n_s32(i);
>> >> >> +}
>> >> >> +
>> >> >> +static __inline __m128i
>> >> >> +_mm_and_si128(__m128i a, __m128i b)
>> >> >> +{
>> >> >> + return vandq_s32(a, b);
>> >> >> +}
>> >> >> +
>> >
>> > IMO, it's not always good to emulate GCC defined intrinsics of
>> > other architecture. What if a legacy DPDK application has such mappings
>> > then BOOM, multiple definition, which one is correct? which one
>> > to comment it out? Integration pain starts for DPDK library consumer:-(
>> >
>> They can include rte_vect.h in build/include directly, which is linked 
>> correctly
>> to the one for that ARCH, so there is no need to worry about.
>
> I think you missed the point,I was trying to say that
> legacy DPDK application and third party stacks uses SSE2NEON kind of
> libraries
> for quick integration, for example, something like this
> https://github.com/jratcliff63367/sse2neon/blob/master/SSE2NEON.h
>
> AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> that lead to multiple definition and its not good.
>
But you will have similar issue since "typedef int32x4_t __m128i"
appears in both your patch and this header file.

>>
>>
>> >> >
>> >> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
>> >> > Let's create the rte_vect_* as required. look at the existing patch.
>> >> >
>> >> I thought of creating a layer of SIMD over all the 

[dpdk-dev] [PATCH 2/3] lpm: add support for NEON

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 02:43:40PM +0100, Jan Viktorin wrote:
> On Mon, 30 Nov 2015 22:54:12 +0530
> Jerin Jacob  wrote:
> 
> > enabled CONFIG_RTE_LIBRTE_LPM, CONFIG_RTE_LIBRTE_TABLE,
> > CONFIG_RTE_LIBRTE_PIPELINE libraries for arm64.
> > 
> > TABLE, PIPELINE libraries were disabled due to LPM library dependency.
> > 
> > Signed-off-by: Jerin Jacob 
> > ---
> >  app/test/test_lpm.c|  10 +-
> >  config/defconfig_arm64-armv8a-linuxapp-gcc |   3 -
> >  lib/librte_lpm/Makefile|   3 +
> >  lib/librte_lpm/rte_lpm.h   |   5 +
> >  lib/librte_lpm/rte_lpm_neon.h  | 172 
> > +
> >  5 files changed, 185 insertions(+), 8 deletions(-)
> >  create mode 100644 lib/librte_lpm/rte_lpm_neon.h
> > 
> > [snip]
> >  
> >  # this lib needs eal
> >  DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
> > diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> > index c299ce2..12b75ce 100644
> > --- a/lib/librte_lpm/rte_lpm.h
> > +++ b/lib/librte_lpm/rte_lpm.h
> > @@ -361,6 +361,9 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, 
> > const uint32_t * ips,
> >  /* Mask four results. */
> >  #define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
> >  
> > +#if defined(RTE_ARCH_ARM64)
> > +#include "rte_lpm_neon.h"
> > +#else
> >  /**
> >   * Lookup four IP addresses in an LPM table.
> >   *
> > @@ -473,6 +476,8 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, 
> > uint16_t hop[4],
> > hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
> >  }
> >  
> > +#endif
> > +
> 
> I would separate the SSE implementation into its own file as well.

make sense. planning to make it as  lib/librte_lpm/rte_lpm_sse.h
and lib/librte_lpm/rte_lpm_neon.h.  OK ?

I can fix it in next revision.

> 
> Otherwise, I like this patch. I hope to be able to test it soon.
> 
> >  [snip]
> 
> 
> -- 
>Jan Viktorin  E-mail: Viktorin at RehiveTech.com
>System Architect  Web:www.RehiveTech.com
>RehiveTech
>Brno, Czech Republic


[dpdk-dev] [PATCH 1/3] eal: introduce rte_vect_* abstractions

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 02:43:34PM +0100, Jan Viktorin wrote:
> On Mon, 30 Nov 2015 22:54:11 +0530
> Jerin Jacob  wrote:
> 
> > introduce rte_vect_* abstractions to remove SSE/AVX specific
> > code in the common code(i.e the test applications)
> > 
> > The patch does not provide any functional change for IA, the goal is to
> 
> Does IA mean Intel Architecture?

Yes.

> 
> > have infrastructure to reuse the common vector-based test code across
> > all the architectures.
> > 
> > Signed-off-by: Jerin Jacob 
> > ---
> >  lib/librte_eal/common/include/arch/arm/rte_vect.h | 17 -
> >  lib/librte_eal/common/include/arch/x86/rte_vect.h |  8 
> >  2 files changed, 24 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> > b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > index 21cdb4d..d300951 100644
> > --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > @@ -33,13 +33,14 @@
> >  #ifndef _RTE_VECT_ARM_H_
> >  #define _RTE_VECT_ARM_H_
> >  
> > -#include "arm_neon.h"
> > +#include 
> >  
> >  #ifdef __cplusplus
> >  extern "C" {
> >  #endif
> >  
> >  typedef int32x4_t xmm_t;
> > +typedef int32x4_t __m128i;
> 
> As Jianbo pointed out recently, the __m128i type should be refactored in
> a general rte_vect API too. If we do something like
> 
> #if SSE
> typedef __m128i rte_128i;
> #elif NEON
> typedef int32x4_y rte_128i;
> #endif
> 
> does it make somebody angry? I am afraid that it will influence a lot of
> code. However, from the ABI point of view, it is OK, isn't it?
> 
> >  
> >  #defineXMM_SIZE(sizeof(xmm_t))
> >  #defineXMM_MASK(XMM_SIZE - 1)
> > @@ -53,6 +54,20 @@ typedef union rte_xmm {
> > double   pd[XMM_SIZE / sizeof(double)];
> >  } __attribute__((aligned(16))) rte_xmm_t;
> >  
> > +/* rte_vect_* abstraction implementation using NEON */
> > +
> > +/* loads the __m128i value from address p(does not need to be 16-byte 
> > aligned)*/
> > +#define rte_vect_loadu_sil128(p) vld1q_s32((const int32_t *)p)
> > +
> > +/* sets the 4 signed 32-bit integer values and returns the __m128i 
> > variable */
> > +static inline __m128i  __attribute__((always_inline))
> > +rte_vect_set_epi32(int i3, int i2, int i1, int i0)
> > +{
> > +   int32_t data[4] = {i0, i1, i2, i3};
> > +
> > +   return vld1q_s32(data);
> > +}
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/librte_eal/common/include/arch/x86/rte_vect.h 
> > b/lib/librte_eal/common/include/arch/x86/rte_vect.h
> > index b698797..91c6523 100644
> > --- a/lib/librte_eal/common/include/arch/x86/rte_vect.h
> > +++ b/lib/librte_eal/common/include/arch/x86/rte_vect.h
> > @@ -125,6 +125,14 @@ typedef union rte_ymm {
> >  })
> >  #endif /* (defined(__ICC) && __ICC < 1210) */
> >  
> > +/* rte_vect_* abstraction implementation using SSE */
> > +
> > +/* loads the __m128i value from address p(does not need to be 16-byte 
> > aligned)*/
> > +#define rte_vect_loadu_sil128(p) _mm_loadu_si128(p)
> > +
> > +/* sets the 4 signed 32-bit integer values and returns the __m128i 
> > variable */
> > +#define rte_vect_set_epi32(i3, i2, i1, i0) _mm_set_epi32(i3, i2, i1, i0)
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> 
> I like this approach. It is a question whether to inherit names from
> SSE. However, why to reinvent the wheel...
> 
> We probably need other people to give their ideas about such
> generalization of the API.

Yes, I would like get the feedback from other people.

ret_vect_* abstraction only for the common code (i.e test code) which
typically used to call the SIMD DPDK API's across the architecture.

> 
> I think, there should be an autotest of the rte_vect API. Is it
> possible to create one?

Yes

> 
> Regards
> Jan
> 
> -- 
>Jan Viktorin  E-mail: Viktorin at RehiveTech.com
>System Architect  Web:www.RehiveTech.com
>RehiveTech
>Brno, Czech Republic


[dpdk-dev] [PATCH 0/3] add lpm support for NEON

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 02:43:12PM +0100, Jan Viktorin wrote:
> Hello Jerin,
> 
> thank you for this patch series. Please CC me next time when doing an
> ARM-related changes. It took me a while to find the related e-mails on
> the mail server.

It's was my mistake. Sorry about that.


> 
> On Mon, 30 Nov 2015 22:54:10 +0530
> Jerin Jacob  wrote:
> 
> > - Introduce new rte_vect_* abstractions in eal
> > - This patch set has the changes required for optimised pm library usage in 
> > arm64 perspective
> > - Tested on Juno and Thunder boards
> > - Tested and verified the changes with following DPDK unit test cases
> > --lpm_autotest
> > --lpm6_autotest
> > - This patch set has dependency on [dpdk-dev] [PATCH v4 0/2] disable 
> > CONFIG_RTE_SCHED_VECTOR for arm
> 
> What kind of dependency is it? Functional?

Not functional, Just "git am" dependency on config file change due to recent 
config file
re structuring.


> 
> > - With these changes, arm64 platform supports all DPDK libraries(in feature 
> > wise)
> 
> Is there some ARMv8 specific NEON instruction?

NO. I just said as covering note as ACL on armv7 was not supported at
that time.


> 
> > 
> > Jerin Jacob (3):
> >   eal: introduce rte_vect_* abstractions
> >   lpm: add support for NEON
> >   maintainers: claim responsibility for arm64 specific files of hash and
> > lpm
> > 
> >  MAINTAINERS   |   3 +
> >  app/test/test_lpm.c   |  10 +-
> >  config/defconfig_arm64-armv8a-linuxapp-gcc|   3 -
> >  lib/librte_eal/common/include/arch/arm/rte_vect.h |  17 ++-
> >  lib/librte_eal/common/include/arch/x86/rte_vect.h |   8 +
> >  lib/librte_lpm/Makefile   |   3 +
> >  lib/librte_lpm/rte_lpm.h  |   5 +
> >  lib/librte_lpm/rte_lpm_neon.h | 172 
> > ++
> >  8 files changed, 212 insertions(+), 9 deletions(-)
> >  create mode 100644 lib/librte_lpm/rte_lpm_neon.h
> > 
> > --
> > 2.1.0
> > 
> 
> 
> 
> -- 
>Jan Viktorin  E-mail: Viktorin at RehiveTech.com
>System Architect  Web:www.RehiveTech.com
>RehiveTech
>Brno, Czech Republic


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> On 2 December 2015 at 18:39, Jerin Jacob  
> wrote:
> > On Wed, Dec 02, 2015 at 05:49:41PM +0800, Jianbo Liu wrote:
> >> On 2 December 2015 at 16:03, Jerin Jacob  >> caviumnetworks.com> wrote:
> >> > On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
> >> >> On 2 December 2015 at 00:41, Jerin Jacob  >> >> caviumnetworks.com> wrote:
> >> >> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
> >> >> >> Adds ARM NEON support for lpm.
> >> >> >> And enables table/pipeline libraries which depend on lpm.
> >> >> >
> >> >> > I already sent the patch on the same yesterday.
> >> >> > We can converge the patches after the discussion.
> >> >> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
> >> >> >
> >> >> Yes, I have read your patch. But there are many differences, so I sent
> >> >> mine for your reviewing :)
> >> >>
> >> >> >
> >> >> >>
> >> >> >> Signed-off-by: Jianbo Liu 
> >> >> >> ---
> >> >> >>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
> >> >> >>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
> >> >> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
> >> >> >>  lib/librte_lpm/rte_lpm.h  | 68 
> >> >> >> ---
> >> >> >>  4 files changed, 77 insertions(+), 25 deletions(-)
> >> >> >>
> >> >> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
> >> >> >> b/config/defconfig_arm-armv7a-linuxapp-gcc
> >> >> >> index cbebd64..efffa1f 100644
> >> >> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
> >> >> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
> >> >> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
> >> >> >>  CONFIG_RTE_EAL_IGB_UIO=n
> >> >> >>
> >> >> >>  # fails to compile on ARM
> >> >> >> -CONFIG_RTE_LIBRTE_LPM=n
> >> >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> >> >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> >> >> >>  CONFIG_RTE_SCHED_VECTOR=n
> >> >> >>
> >> >> >>  # cannot use those on ARM
> >> >> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
> >> >> >> b/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> >> >> index 504f3ed..57f7941 100644
> >> >> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> >> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> >> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
> >> >> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
> >> >> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
> >> >> >>
> >> >> >> -CONFIG_RTE_LIBRTE_LPM=n
> >> >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> >> >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> >> >> >>  CONFIG_RTE_SCHED_VECTOR=n
> >> >> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> >> >> >> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> >> >> index a33c054..7437711 100644
> >> >> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> >> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> >> >> @@ -41,6 +41,8 @@ extern "C" {
> >> >> >>
> >> >> >>  typedef int32x4_t xmm_t;
> >> >> >>
> >> >> >> +typedef int32x4_t __m128i;
> >> >> >> +
> >> >> >>  #define  XMM_SIZE(sizeof(xmm_t))
> >> >> >>  #define  XMM_MASK(XMM_SIZE - 1)
> >> >> >>
> >> >> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
> >> >> >>   double   pd[XMM_SIZE / sizeof(double)];
> >> >> >>  } __attribute__((aligned(16))) rte_xmm_t;
> >> >> >>
> >> >> >> +static __inline __m128i
> >> >> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
> >> >> >> +{
> >> >> >> + int32_t r[4] = {i0, i1, i2, i3};
> >> >> >> +
> >> >> >> + return vld1q_s32(r);
> >> >> >> +}
> >> >> >> +
> >> >> >> +static __inline __m128i
> >> >> >> +_mm_loadu_si128(__m128i *p)
> >> >> >> +{
> >> >> >> + return vld1q_s32((int32_t *)p);
> >> >> >> +}
> >> >> >> +
> >> >> >> +static __inline __m128i
> >> >> >> +_mm_set1_epi32(int i)
> >> >> >> +{
> >> >> >> + return vdupq_n_s32(i);
> >> >> >> +}
> >> >> >> +
> >> >> >> +static __inline __m128i
> >> >> >> +_mm_and_si128(__m128i a, __m128i b)
> >> >> >> +{
> >> >> >> + return vandq_s32(a, b);
> >> >> >> +}
> >> >> >> +
> >> >
> >> > IMO, it's not always good to emulate GCC defined intrinsics of
> >> > other architecture. What if a legacy DPDK application has such mappings
> >> > then BOOM, multiple definition, which one is correct? which one
> >> > to comment it out? Integration pain starts for DPDK library consumer:-(
> >> >
> >> They can include rte_vect.h in build/include directly, which is linked 
> >> correctly
> >> to the one for that ARCH, so there is no need to worry about.
> >
> > I think you missed the point,I was trying to say that
> > legacy DPDK application and third party stacks uses SSE2NEON kind of
> > libraries
> > for quick integration, for example, something like this
> > https://github.com/jratcliff63367/sse2neon/blob/master/SSE2NEON.h
> >
> > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > that lead to multiple definition and its not good.
> >
> But you will have similar 

[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Michael S. Tsirkin
On Wed, Dec 02, 2015 at 06:58:03PM +0200, Panu Matilainen wrote:
> On 12/02/2015 05:09 PM, Yuanhan Liu wrote:
> >On Wed, Dec 02, 2015 at 04:48:14PM +0200, Panu Matilainen wrote:
> >...
> >diff --git a/lib/librte_vhost/rte_virtio_net.h 
> >b/lib/librte_vhost/rte_virtio_net.h
> >index 5687452..416dac2 100644
> >--- a/lib/librte_vhost/rte_virtio_net.h
> >+++ b/lib/librte_vhost/rte_virtio_net.h
> >@@ -127,6 +127,8 @@ struct virtio_net {
> >  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> > charifname[IF_NAME_SZ]; /**< Name of 
> > the tap device or socket path. */
> > uint32_tvirt_qp_nb; /**< number of queue 
> > pair we have allocated */
> >+uint64_tlog_size;   /**< Size of log area */
> >+uint8_t *log_base;  /**< Where dirty pages 
> >are logged */
> > void*priv;  /**< private context */
> > struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  
> > /**< Contains all virtqueue information. */
> >  } __rte_cache_aligned;
> 
> This (and other changes in patch 2 breaks the librte_vhost ABI
> again, so you'd need to at least add a deprecation note to 2.2 to be
> able to do it in 2.3 at all according to the ABI policy.
> >>>
> >>>I was thinking that adding a new field (instead of renaming it or
> >>>removing it) isn't an ABI break. So, I was wrong?
> >>
> >>Adding or removing a field in the middle of a public struct is
> >>always an ABI break. Adding to the end often is too, but not always.
> >>Renaming a field is an API break but not an ABI break - the compiler
> >>cares but the cpu does not.
> >
> >Good to know. Thanks.
> >
> >>
> 
> Perhaps a better option would be adding some padding to the structs
> now for 2.2 since the vhost ABI is broken there anyway. That would
> at least give a chance to keep it compatible from 2.2 to 2.3.
> >>>
> >>>It will not be compatible, unless we add exact same fields (not
> >>>something like uint8_t pad[xx]). Otherwise, the pad field renaming
> >>>is also an ABI break, right?
> >>
> >>There's no ABI (or API) break in changing reserved unused fields to
> >>something else, as long as care is taken with sizes and alignment.
> >
> >as long as we don't reference the reserved unused fields?
> 
> That would be the definition of an unused field I think :)
> Call it "reserved" if you want, it doesn't really matter as long as its
> clear its something you shouldn't be using.
> 
> >
> >>In any case padding is best added to the end of a struct to minimize
> >>risks and keep things simple.
> >
> >The thing is that isn't it a bit aweful to (always) add pads to
> >the end of a struct, especially when you don't know how many
> >need to be padded?
> 
> Then you pad for what you think you need, plus a bit extra, and maybe some
> more for others who might want to extend it. What is a reasonable amount
> needs deciding case by case - if a struct is alloced in the millions then be
> (very) conservative, but if there are one or 50 such structs within an app
> lifetime then who cares if its bit larger?
> 
> And yeah padding may be annoying, but that's pretty much the only option in
> a project where most of the structs are out in the open.
> 
>   - Panu -

Functions versioning is another option.
For a sufficiently widely used struct, it's a lot of work, padding
is easier.  But it might be better than breaking ABI
if e.g. you didn't pad enough.

> >
> > --yliu
> >


[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Panu Matilainen
On 12/02/2015 05:09 PM, Yuanhan Liu wrote:
> On Wed, Dec 02, 2015 at 04:48:14PM +0200, Panu Matilainen wrote:
> ...
> diff --git a/lib/librte_vhost/rte_virtio_net.h 
> b/lib/librte_vhost/rte_virtio_net.h
> index 5687452..416dac2 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -127,6 +127,8 @@ struct virtio_net {
>   #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>   charifname[IF_NAME_SZ]; /**< Name of 
> the tap device or socket path. */
>   uint32_tvirt_qp_nb; /**< number of queue 
> pair we have allocated */
> + uint64_tlog_size;   /**< Size of log area */
> + uint8_t *log_base;  /**< Where dirty pages are 
> logged */
>   void*priv;  /**< private context */
>   struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  
> /**< Contains all virtqueue information. */
>   } __rte_cache_aligned;

 This (and other changes in patch 2 breaks the librte_vhost ABI
 again, so you'd need to at least add a deprecation note to 2.2 to be
 able to do it in 2.3 at all according to the ABI policy.
>>>
>>> I was thinking that adding a new field (instead of renaming it or
>>> removing it) isn't an ABI break. So, I was wrong?
>>
>> Adding or removing a field in the middle of a public struct is
>> always an ABI break. Adding to the end often is too, but not always.
>> Renaming a field is an API break but not an ABI break - the compiler
>> cares but the cpu does not.
>
> Good to know. Thanks.
>
>>

 Perhaps a better option would be adding some padding to the structs
 now for 2.2 since the vhost ABI is broken there anyway. That would
 at least give a chance to keep it compatible from 2.2 to 2.3.
>>>
>>> It will not be compatible, unless we add exact same fields (not
>>> something like uint8_t pad[xx]). Otherwise, the pad field renaming
>>> is also an ABI break, right?
>>
>> There's no ABI (or API) break in changing reserved unused fields to
>> something else, as long as care is taken with sizes and alignment.
>
> as long as we don't reference the reserved unused fields?

That would be the definition of an unused field I think :)
Call it "reserved" if you want, it doesn't really matter as long as its 
clear its something you shouldn't be using.

>
>> In any case padding is best added to the end of a struct to minimize
>> risks and keep things simple.
>
> The thing is that isn't it a bit aweful to (always) add pads to
> the end of a struct, especially when you don't know how many
> need to be padded?

Then you pad for what you think you need, plus a bit extra, and maybe 
some more for others who might want to extend it. What is a reasonable 
amount needs deciding case by case - if a struct is alloced in the 
millions then be (very) conservative, but if there are one or 50 such 
structs within an app lifetime then who cares if its bit larger?

And yeah padding may be annoying, but that's pretty much the only option 
in a project where most of the structs are out in the open.

- Panu -

>
>   --yliu
>



[dpdk-dev] [PATCH] scripts: support any legal git revisions as abi validation range

2015-12-02 Thread Panu Matilainen
In addition to git tags, support validating abi between any legal
gitrevisions(7) syntaxes, such as "validate-abi.sh . -1 "
"validate-abi.sh master mybrach " etc in addition to
validating between tags. Makes it easier to run the validator
for in-development work.

Signed-off-by: Panu Matilainen 
---
 scripts/validate-abi.sh | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh
index 4476433..0e3ccd7 100755
--- a/scripts/validate-abi.sh
+++ b/scripts/validate-abi.sh
@@ -43,16 +43,15 @@ log() {
 }

 validate_tags() {
-   git tag -l | grep -q "$TAG1"
-   if [ $? -ne 0 ]
+
+   if [ -z "$HASH1" ]
then
-   echo "$TAG1 is invalid"
+   echo "invalid revision: $TAG1"
return
fi
-   git tag -l | grep -q "$TAG2"
-   if [ $? -ne 0 ]
+   if [ -z "$HASH2" ]
then
-   echo "$TAG2 is invalid"
+   echo "invalid revision: $TAG2"
return
fi
 }
@@ -112,6 +111,9 @@ then
cleanup_and_exit 1
 fi

+HASH1=$(git show -s --format=%H "$TAG1" -- 2> /dev/null)
+HASH2=$(git show -s --format=%H "$TAG2" -- 2> /dev/null)
+
 # Make sure our tags exist
 res=$(validate_tags)
 if [ -n "$res" ]
@@ -120,6 +122,10 @@ then
cleanup_and_exit 1
 fi

+# Make hashes available in output for non-local reference
+TAG1="$TAG1 ($HASH1)"
+TAG2="$TAG2 ($HASH2)"
+
 ABICHECK=`which abi-compliance-checker 2>/dev/null`
 if [ $? -ne 0 ]
 then
@@ -152,7 +158,7 @@ cd $(dirname $0)/..

 log "INFO" "Checking out version $TAG1 of the dpdk"
 # Move to the old version of the tree
-git checkout $TAG1
+git checkout $HASH1

 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
@@ -185,7 +191,7 @@ cd $TARGET/lib
 log "INFO" "COLLECTING ABI INFORMATION FOR $TAG1"
 for i in `ls *.so`
 do
-   $ABIDUMP $i -o $ABI_DIR/$i-ABI-0.dump -lver $TAG1
+   $ABIDUMP $i -o $ABI_DIR/$i-ABI-0.dump -lver $HASH1
 done
 cd ../..

@@ -194,7 +200,7 @@ git clean -f -d
 git reset --hard
 # Move to the new version of the tree
 log "INFO" "Checking out version $TAG2 of the dpdk"
-git checkout $TAG2
+git checkout $HASH2

 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
@@ -220,7 +226,7 @@ cd $TARGET/lib
 log "INFO" "COLLECTING ABI INFORMATION FOR $TAG2"
 for i in `ls *.so`
 do
-   $ABIDUMP $i -o $ABI_DIR/$i-ABI-1.dump -lver $TAG2
+   $ABIDUMP $i -o $ABI_DIR/$i-ABI-1.dump -lver $HASH2
 done
 cd ../..

-- 
2.5.0



[dpdk-dev] [PATCH] vfio: Include No-IOMMU mode

2015-12-02 Thread Michael S. Tsirkin
On Wed, Dec 02, 2015 at 05:19:18PM +0100, Thomas Monjalon wrote:
> Hi,
> 
> 2015-12-02 08:28, Alex Williamson:
> > On Mon, 2015-11-16 at 19:12 +0200, Avi Kivity wrote:
> > > On 11/16/2015 07:06 PM, Alex Williamson wrote:
> > > > FYI, this is now in v4.4-rc1 (the slightly modified v2 version).  I want
> > > > to give fair warning though that while we seem to agree on this idea, it
> > > > hasn't been proven with a userspace driver port.  I've opted to include
> > > > it in this merge window rather than delaying it until v4.5, but I really
> > > > need to see a user for this before the end of the v4.4 cycle or I think
> > > > we'll need to revert and revisit for v4.5 anyway.  I don't really have
> > > > any interest in adding and maintaining code that has no users.  Please
> > > > keep me informed of progress with a dpdk port.  Thanks,
> > > 
> > > Thanks Alex.  Copying the dpdk mailing list, where the users live.
> > > 
> > > dpdk-ers: vfio-noiommu is a replacement for uio_pci_generic and 
> > > uio_igb.  It supports MSI-X and so can be used on SR/IOV VF devices.  
> > > The intent is that you can use dpdk without an external module, using 
> > > vfio, whether you are on bare metal with an iommu, bare metal without an 
> > > iommu, or virtualized.  However, dpdk needs modification to support this.
> > 
> > Still no users for this that I'm aware of.  I'm going to revert this in
> > rc5 unless something changes.  Thanks,
> 
> Removing needs for out-of-tree modules is a really nice achievement.
> Yes, we (in the DPDK project) should check how to use this no-iommu VFIO
> and to replace igb_uio.
> 
> I'm sorry we failed to catch your email and follow up.
> Is it really too late? What is the risk of keeping it in Linux 4.4?
> Advertising a new feature and removing it would be frustrating.
> 
> Have you tried this VFIO mode with DPDK?
> How complex would be the patch to support it?
> 
> Thanks

These things need to be developed together, one can't be sure it meets
userspace needs if no one tried.  And then where would we be?
Supporting a broken interface forever.  If someone writes the userspace
code, then this feature can come back for 4.5.

-- 
MST


[dpdk-dev] [PATCH 3/4] vhost: log vring changes

2015-12-02 Thread Michael S. Tsirkin
On Wed, Dec 02, 2015 at 05:58:24PM +0200, Victor Kaplansky wrote:
> On Wed, Dec 02, 2015 at 10:38:02PM +0800, Yuanhan Liu wrote:
> > On Wed, Dec 02, 2015 at 04:07:02PM +0200, Victor Kaplansky wrote:
> > > On Wed, Dec 02, 2015 at 11:43:12AM +0800, Yuanhan Liu wrote:
> > > > Invoking vhost_log_write() to mark corresponding page as dirty while
> > > > updating used vring.
> > > 
> > > Looks good, thanks!
> > > 
> > > I didn't find where you log the dirty pages in result of data
> > > written to the buffers pointed by the descriptors in RX vring.
> > > AFAIU, the buffers of RX queue reside in guest's memory and have
> > > to be marked as dirty if they are written. What do you say?
> > 
> > Yeah, we should. I got a question then: why log_guest_addr is set
> > to the physical address of used vring in guest? I mean, apparently,
> > we need log more changes other than used vring only.
> 
> The physical address of used vring sent to the back-end, since
> otherwise back-end has to perform virtual to physical
> translation, and we want to avoid this. The dirty buffers has to
> be marked as well, but their guest's physical address is known
> directly from the descriptors.

Yes, people wanted to be able to do multiple physical
addresses to one virtual so you do not want to translate
virt to phys.

> > 
> > --yliu


[dpdk-dev] [PATCH v4 0/2] Add support for driver directories

2015-12-02 Thread Stephen Hemminger
On Thu, 12 Nov 2015 16:52:32 +0100
Thomas Monjalon  wrote:

> > > This mini-series adds support for driver directory concept
> > > based on idea by Thomas Monjalon back in February:
> > > http://dpdk.org/ml/archives/dev/2015-February/013285.html
> > >
> > > In the process FreeBSD also gains plugin support (but untested).
> > >
> > > v4: - introduce error-early behavior for invalid plugin paths
> > > - support directories via the existing -d option instead of adding new
> > >
> > > v3: - merge the first commits
> > >
> > > v2: - move code to eal/common
> > > - add bsd support
> > >
> > > Panu Matilainen (2):
> > >   eal: move plugin loading to eal/common
> > >   eal: add support for driver directory concept
> > 
> > 
> > checkpatch complains for some indent problem (Thomas, can you fix this ?),
> > but the rest looks good to me.
> > 
> > Acked-by: David Marchand 
> > 
> > Thanks Panu.
> 
> Applied, thanks

This patch introduces a new issue reported by Coverity.

The root cause of the problem is that you are checking that it s a directory 
first with stat
then calling dlopen(). I malicious entity could get between the stat and the 
dlopen.

In this case the desire to handle both file name and directory is getting in 
the way.
It really should just only take a directory now, or have two different config 
options
in a method similar to other subsystems (look at /etc/xxx vs /etc/xxx.d as 
standard practice).


*** CID 120151:  Security best practices violations  (TOCTOU)
/lib/librte_eal/common/eal_common_options.c: 232 in eal_plugins_init()
226 solib->name);
227 return -1;
228 }
229 } else {
230 RTE_LOG(DEBUG, EAL, "open shared lib %s\n",
231 solib->name);
>>> CID 120151:  Security best practices violations  (TOCTOU)
>>> Calling function "dlopen" that uses "solib->name" after a check 
>>> function. This can cause a time-of-check, time-of-use race condition.  
232 solib->lib_handle = dlopen(solib->name, 
RTLD_NOW);
233 if (solib->lib_handle == NULL) {
234 RTE_LOG(ERR, EAL, "%s\n", dlerror());
235 return -1;
236 }
237 }


[dpdk-dev] [PATCH 3/4] vhost: log vring changes

2015-12-02 Thread Victor Kaplansky
On Wed, Dec 02, 2015 at 10:38:02PM +0800, Yuanhan Liu wrote:
> On Wed, Dec 02, 2015 at 04:07:02PM +0200, Victor Kaplansky wrote:
> > On Wed, Dec 02, 2015 at 11:43:12AM +0800, Yuanhan Liu wrote:
> > > Invoking vhost_log_write() to mark corresponding page as dirty while
> > > updating used vring.
> > 
> > Looks good, thanks!
> > 
> > I didn't find where you log the dirty pages in result of data
> > written to the buffers pointed by the descriptors in RX vring.
> > AFAIU, the buffers of RX queue reside in guest's memory and have
> > to be marked as dirty if they are written. What do you say?
> 
> Yeah, we should. I got a question then: why log_guest_addr is set
> to the physical address of used vring in guest? I mean, apparently,
> we need log more changes other than used vring only.

The physical address of used vring sent to the back-end, since
otherwise back-end has to perform virtual to physical
translation, and we want to avoid this. The dirty buffers has to
be marked as well, but their guest's physical address is known
directly from the descriptors.

> 
>   --yliu


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Thomas Monjalon
2015-12-02 22:23, Jerin Jacob:
> On Wed, Dec 02, 2015 at 05:40:13PM +0100, Thomas Monjalon wrote:
> > 2015-12-02 20:04, Jerin Jacob:
> > > On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> > > > On 2 December 2015 at 18:39, Jerin Jacob  > > > caviumnetworks.com> wrote:
> > > > > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > > > > that lead to multiple definition and its not good.
> > > > >
> > > > But you will have similar issue since "typedef int32x4_t __m128i"
> > > > appears in both your patch and this header file.
> > > 
> > > I just tested it, it won't break, back to back "typedef int32x4_t __m128i"
> > > is fine(unlike inline function).
> > > 
> > > my intention to keep __m128i "as is"  because changing the __m128i to 
> > > rte_???
> > > something would break the ABI.
> > 
> > Isn't it already broken in 2.2?
> 
> Does it mean, You would like to have rte_128i(or similar) kind of
> abstraction to represent 128bit SIMD variable in DPDK?

If you are convinced that it is the best way to write a generic code, yes.
I think the most important question is to know what is the best solution
for performance and maintainability. The API/ABI questions will be considered
after.

Thanks for your involvement guys.


[dpdk-dev] [PATCH 2/3] rte_sched: introduce reciprocal divide

2015-12-02 Thread Hannes Frederic Sowa
Hello,

On Wed, Dec 2, 2015, at 17:45, Dumitrescu, Cristian wrote:
> > diff --git a/lib/librte_sched/rte_reciprocal.h
> > b/lib/librte_sched/rte_reciprocal.h
> > new file mode 100644
> > index 000..abd1525
> > --- /dev/null
> > +++ b/lib/librte_sched/rte_reciprocal.h
> > @@ -0,0 +1,39 @@
> > +/*
> > + * Reciprocal divide
> > + *
> > + * Used with permission from original authors
> > + *  Hannes Frederic Sowa and Daniel Borkmann
> > + *
> > + * This algorithm is based on the paper "Division by Invariant
> > + * Integers Using Multiplication" by Torbj??rn Granlund and Peter
> > + * L. Montgomery.
> 
> Stephen, can you please provide a link to this paper?



> > + *
> > + * The assembler implementation from Agner Fog, which this code is
> > + * based on, can be found here:
> > + * http://www.agner.org/optimize/asmlib.zip
> > + *
> > + * This optimization for A/B is helpful if the divisor B is mostly
> > + * runtime invariant. The reciprocal of B is calculated in the
> > + * slow-path with reciprocal_value(). The fast-path can then just use
> > + * a much faster multiplication operation with a variable dividend A
> > + * to calculate the division A/B.
> > + */
> > +
> > +#ifndef _RTE_RECIPROCAL_H_
> > +#define _RTE_RECIPROCAL_H_
> > +
> > +struct rte_reciprocal {
> > +   uint32_t m;
> > +   uint8_t sh1, sh2;
> > +};
> 
> The size of this structure is not a multiple of 32 bits. You seem to
> transfer this structure by value rather than by reference (the function
> rte_reciprocal_value() below returns an instance of this structure), I
> don't feel comfortable with the last 16 bits of the structure being left
> uninitialized, we should probably add some explicit pad field and
> initialize this structure explicitly to zero at init time?

Note, it is used by static inline functions in fast path which most
probably expands the code in question, thus no real argument passing
happens (at least this is what I saw in the linux kernel assembly). I
don't think you need to worry about padding. It happens very often
without noticing. ;)

> > +
> > +static inline uint32_t rte_reciprocal_divide(uint32_t a, struct 
> > rte_reciprocal
> > R)
> > +{
> > +   uint32_t t = (uint32_t)(((uint64_t)a * R.m) >> 32);
> > +
> > +   return (t + ((a - t) >> R.sh1)) >> R.sh2;
> > +}
> > +
> > +struct rte_reciprocal rte_reciprocal_value(uint32_t d);
> 
> Why 32-bit arithmetic? We had a lot of bugs in librte_sched library due
> to 32-bit arithmetic that were particularly difficult to track. Can we
> have this function rte_reciprocal_divide() return a 64-bit integer and
> replace any 32-bit arithmetic/conversion with 64-bit operations?

There was no use case at this time and I am actually not sure how easy
the move to 64 bit is, as it would require one multiplication operation
in an integer domain twice as large.

> > +
> > +#endif /* _RTE_RECIPROCAL_H_ */
> > --
> > 2.1.4
> 
> As previously discussed, a simpler/faster alternative to floating point
> division is 64-bit multiplication followed by right shift, any particular
> reason why this approach was not considered?

This is exact division. It depends on what you want. I am not sure if
you want to do array accesses with floating point division or simple 64
bit multiplication and shifting.

Bye,
Hannes


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jianbo Liu
On 2 December 2015 at 16:03, Jerin Jacob  
wrote:
> On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
>> On 2 December 2015 at 00:41, Jerin Jacob  
>> wrote:
>> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
>> >> Adds ARM NEON support for lpm.
>> >> And enables table/pipeline libraries which depend on lpm.
>> >
>> > I already sent the patch on the same yesterday.
>> > We can converge the patches after the discussion.
>> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
>> >
>> Yes, I have read your patch. But there are many differences, so I sent
>> mine for your reviewing :)
>>
>> >
>> >>
>> >> Signed-off-by: Jianbo Liu 
>> >> ---
>> >>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
>> >>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
>> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
>> >>  lib/librte_lpm/rte_lpm.h  | 68 
>> >> ---
>> >>  4 files changed, 77 insertions(+), 25 deletions(-)
>> >>
>> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
>> >> b/config/defconfig_arm-armv7a-linuxapp-gcc
>> >> index cbebd64..efffa1f 100644
>> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
>> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
>> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
>> >>  CONFIG_RTE_EAL_IGB_UIO=n
>> >>
>> >>  # fails to compile on ARM
>> >> -CONFIG_RTE_LIBRTE_LPM=n
>> >> -CONFIG_RTE_LIBRTE_TABLE=n
>> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
>> >>  CONFIG_RTE_SCHED_VECTOR=n
>> >>
>> >>  # cannot use those on ARM
>> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
>> >> b/config/defconfig_arm64-armv8a-linuxapp-gcc
>> >> index 504f3ed..57f7941 100644
>> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
>> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
>> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
>> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
>> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
>> >>
>> >> -CONFIG_RTE_LIBRTE_LPM=n
>> >> -CONFIG_RTE_LIBRTE_TABLE=n
>> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
>> >>  CONFIG_RTE_SCHED_VECTOR=n
>> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
>> >> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> >> index a33c054..7437711 100644
>> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> >> @@ -41,6 +41,8 @@ extern "C" {
>> >>
>> >>  typedef int32x4_t xmm_t;
>> >>
>> >> +typedef int32x4_t __m128i;
>> >> +
>> >>  #define  XMM_SIZE(sizeof(xmm_t))
>> >>  #define  XMM_MASK(XMM_SIZE - 1)
>> >>
>> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
>> >>   double   pd[XMM_SIZE / sizeof(double)];
>> >>  } __attribute__((aligned(16))) rte_xmm_t;
>> >>
>> >> +static __inline __m128i
>> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
>> >> +{
>> >> + int32_t r[4] = {i0, i1, i2, i3};
>> >> +
>> >> + return vld1q_s32(r);
>> >> +}
>> >> +
>> >> +static __inline __m128i
>> >> +_mm_loadu_si128(__m128i *p)
>> >> +{
>> >> + return vld1q_s32((int32_t *)p);
>> >> +}
>> >> +
>> >> +static __inline __m128i
>> >> +_mm_set1_epi32(int i)
>> >> +{
>> >> + return vdupq_n_s32(i);
>> >> +}
>> >> +
>> >> +static __inline __m128i
>> >> +_mm_and_si128(__m128i a, __m128i b)
>> >> +{
>> >> + return vandq_s32(a, b);
>> >> +}
>> >> +
>
> IMO, it's not always good to emulate GCC defined intrinsics of
> other architecture. What if a legacy DPDK application has such mappings
> then BOOM, multiple definition, which one is correct? which one
> to comment it out? Integration pain starts for DPDK library consumer:-(
>
They can include rte_vect.h in build/include directly, which is linked correctly
to the one for that ARCH, so there is no need to worry about.


>> >
>> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
>> > Let's create the rte_vect_* as required. look at the existing patch.
>> >
>> I thought of creating a layer of SIMD over all the platforms before.
>> But can't you see it make things complicated, considering there are
>> only few simple intrinsic to implement?
>
> Not true, There were, a lot of SSE intrinsics needs be to emulated for ACL 
> NEON
> implementation if I were to take this approach and emulation comes with
> the cost.
>
No, I will not re-implement all the intrinsic like that .
I only do with the simple intrinsic, such as load/store, as you said below.

> So my take is,
> lets the each architecture implementation for specific SIMD version of DPDK
> API in the library should have the freedom to implement the API in
> NATIVE.
>
> And let's create only rte_vect_* abstraction only for using
> that API/library. Which boils down to have very minimal rte_vect_*
> abstraction to load, store, set not beyond that.
>
> This makes clear "contract" between DPDK library and the applications.
> and make easy for remaning new architecture  porting effort in DPDK.
>
Agree.
But I reuse existing intrinsic 

[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Thomas Monjalon
2015-12-02 20:04, Jerin Jacob:
> On Wed, Dec 02, 2015 at 09:13:51PM +0800, Jianbo Liu wrote:
> > On 2 December 2015 at 18:39, Jerin Jacob  > caviumnetworks.com> wrote:
> > > AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> > > that lead to multiple definition and its not good.
> > >
> > But you will have similar issue since "typedef int32x4_t __m128i"
> > appears in both your patch and this header file.
> 
> I just tested it, it won't break, back to back "typedef int32x4_t __m128i"
> is fine(unlike inline function).
> 
> my intention to keep __m128i "as is"  because changing the __m128i to rte_???
> something would break the ABI.

Isn't it already broken in 2.2?


[dpdk-dev] [PATCH] l2fwd-crypto: fix behaviour of -t option

2015-12-02 Thread Declan Doherty
On 02/12/15 17:16, Declan Doherty wrote:
> passing -t 0 as a command line argument causes the application
> to exit with an "invalid refresh period specified" error  which is
> contrary to applications help text.
>
> This patch removes the unnecessary option "--no-stats" and fixes the
> behaviour of the -t parameter.
>
> Reported-by: Min Cao 
> Signed-off-by: Declan Doherty 
> ---
>   examples/l2fwd-crypto/main.c | 29 ++---
>   1 file changed, 10 insertions(+), 19 deletions(-)
>
> diff --git a/examples/l2fwd-crypto/main.c b/examples/l2fwd-crypto/main.c
> index 0b4414b..d70fc9a 100644
> --- a/examples/l2fwd-crypto/main.c
> +++ b/examples/l2fwd-crypto/main.c
> @@ -118,7 +118,6 @@ struct l2fwd_crypto_options {
>   unsigned nb_ports_per_lcore;
>   unsigned refresh_period;
>   unsigned single_lcore:1;
> - unsigned no_stats_printing:1;
>
>   enum rte_cryptodev_type cdev_type;
>   unsigned sessionless:1;
> @@ -575,10 +574,9 @@ l2fwd_main_loop(struct l2fwd_crypto_options *options)
>   (uint64_t)timer_period)) {
>
>   /* do this only on master core */
> - if (lcore_id == rte_get_master_lcore() 
> &&
> - 
> !options->no_stats_printing) {
> + if (lcore_id == rte_get_master_lcore()
> + && options->refresh_period) {
>   print_stats();
> - /* reset the timer */
>   timer_tsc = 0;
>   }
>   }
> @@ -802,11 +800,6 @@ static int
>   l2fwd_crypto_parse_args_long_options(struct l2fwd_crypto_options *options,
>   struct option *lgopts, int option_index)
>   {
> - if (strcmp(lgopts[option_index].name, "no_stats") == 0) {
> - options->no_stats_printing = 1;
> - return 0;
> - }
> -
>   if (strcmp(lgopts[option_index].name, "cdev_type") == 0)
>   return parse_cryptodev_type(>cdev_type, optarg);
>
> @@ -903,21 +896,21 @@ l2fwd_crypto_parse_timer_period(struct 
> l2fwd_crypto_options *options,
>   const char *q_arg)
>   {
>   char *end = NULL;
> - int n;
> + long int n;
>
>   /* parse number string */
>   n = strtol(q_arg, , 10);
>   if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
>   n = 0;
>
> - if (n >= MAX_TIMER_PERIOD)
> - n = 0;
> + if (n >= MAX_TIMER_PERIOD) {
> + printf("Warning refresh period specified %ld is greater than "
> + "max value %d! using max value",
> + n, MAX_TIMER_PERIOD);
> + n = MAX_TIMER_PERIOD;
> + }
>
>   options->refresh_period = n * 1000 * TIMER_MILLISECOND;
> - if (options->refresh_period == 0) {
> - printf("invalid refresh period specified\n");
> - return -1;
> - }
>
>   return 0;
>   }
> @@ -932,7 +925,6 @@ l2fwd_crypto_default_options(struct l2fwd_crypto_options 
> *options)
>   options->nb_ports_per_lcore = 1;
>   options->refresh_period = 1;
>   options->single_lcore = 0;
> - options->no_stats_printing = 0;
>
>   options->cdev_type = RTE_CRYPTODEV_AESNI_MB_PMD;
>   options->sessionless = 0;
> @@ -979,7 +971,7 @@ l2fwd_crypto_options_print(struct l2fwd_crypto_options 
> *options)
>   printf("single lcore mode: %s\n",
>   options->single_lcore ? "enabled" : "disabled");
>   printf("stats_printing: %s\n",
> - options->no_stats_printing ? "disabled" : "enabled");
> + options->refresh_period == 0 ? "disabled" : "enabled");
>
>   switch (options->cdev_type) {
>   case RTE_CRYPTODEV_AESNI_MB_PMD:
> @@ -1036,7 +1028,6 @@ l2fwd_crypto_parse_args(struct l2fwd_crypto_options 
> *options,
>   char **argvopt = argv, *prgname = argv[0];
>
>   static struct option lgopts[] = {
> - { "no_stats", no_argument, 0, 0 },
>   { "sessionless", no_argument, 0, 0 },
>
>   { "cdev_type", required_argument, 0, 0 },
>

I forgot to specify the commit this patch fixes in the commit message.

"fixes: 387259bd6c6733ec0ff8dfead0b555dc57402aa1"


[dpdk-dev] [PATCH] eal: don't crash if one pci device fails

2015-12-02 Thread Stephen Hemminger
If there is a failure to setup one pci device, there maybe other
devices that can be initialized. Don't call rte_exit which
is a forced crash, pass the error back to the
application to decide what it wants to do.

Might be good idea to return a positive value for the
number of devices found, but that would break ABI.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_eal/common/eal_common_pci.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index dcfe947..594ef9c 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -391,12 +391,13 @@ rte_eal_pci_probe(void)
struct rte_pci_device *dev = NULL;
struct rte_devargs *devargs;
int probe_all = 0;
-   int ret = 0;
+   int failed = 0;

if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
probe_all = 1;

TAILQ_FOREACH(dev, _device_list, next) {
+   int ret = 0;

/* set devargs in PCI structure */
devargs = pci_devargs_lookup(dev);
@@ -409,13 +410,17 @@ rte_eal_pci_probe(void)
else if (devargs != NULL &&
devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
ret = pci_probe_all_drivers(dev);
-   if (ret < 0)
-   rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
-" cannot be used\n", dev->addr.domain, 
dev->addr.bus,
-dev->addr.devid, dev->addr.function);
+
+   if (ret < 0) {
+   RTE_LOG(ERR, EAL,
+   "Requested device " PCI_PRI_FMT " cannot be 
used\n",
+   dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   failed = ret;
+   }
}

-   return 0;
+   return failed;
 }

 /* dump one device */
-- 
2.1.4



[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Thomas Monjalon
2015-12-02 22:31, Yuanhan Liu:
> Thomas, should I write an ABI deprecation note? Can I make it for
> v2.2 release If I make one tomorrow? (Sorry that I'm not awared
> of that it would be an ABI break).

As Panu suggested, it would be better to reserve some room now
in 2.2 which already breaks vhost ABI.
Maybe we have a chance to keep the same vhost ABI in 2.3.

The 2.2 release will probably be closed in less than 2 weeks.


[dpdk-dev] [PATCH] vfio: Include No-IOMMU mode

2015-12-02 Thread Thomas Monjalon
Hi,

2015-12-02 08:28, Alex Williamson:
> On Mon, 2015-11-16 at 19:12 +0200, Avi Kivity wrote:
> > On 11/16/2015 07:06 PM, Alex Williamson wrote:
> > > FYI, this is now in v4.4-rc1 (the slightly modified v2 version).  I want
> > > to give fair warning though that while we seem to agree on this idea, it
> > > hasn't been proven with a userspace driver port.  I've opted to include
> > > it in this merge window rather than delaying it until v4.5, but I really
> > > need to see a user for this before the end of the v4.4 cycle or I think
> > > we'll need to revert and revisit for v4.5 anyway.  I don't really have
> > > any interest in adding and maintaining code that has no users.  Please
> > > keep me informed of progress with a dpdk port.  Thanks,
> > 
> > Thanks Alex.  Copying the dpdk mailing list, where the users live.
> > 
> > dpdk-ers: vfio-noiommu is a replacement for uio_pci_generic and 
> > uio_igb.  It supports MSI-X and so can be used on SR/IOV VF devices.  
> > The intent is that you can use dpdk without an external module, using 
> > vfio, whether you are on bare metal with an iommu, bare metal without an 
> > iommu, or virtualized.  However, dpdk needs modification to support this.
> 
> Still no users for this that I'm aware of.  I'm going to revert this in
> rc5 unless something changes.  Thanks,

Removing needs for out-of-tree modules is a really nice achievement.
Yes, we (in the DPDK project) should check how to use this no-iommu VFIO
and to replace igb_uio.

I'm sorry we failed to catch your email and follow up.
Is it really too late? What is the risk of keeping it in Linux 4.4?
Advertising a new feature and removing it would be frustrating.

Have you tried this VFIO mode with DPDK?
How complex would be the patch to support it?

Thanks


[dpdk-dev] [PATCH] l2fwd-crypto: fix behaviour of -t option

2015-12-02 Thread Declan Doherty
passing -t 0 as a command line argument causes the application
to exit with an "invalid refresh period specified" error  which is
contrary to applications help text.

This patch removes the unnecessary option "--no-stats" and fixes the
behaviour of the -t parameter.

Reported-by: Min Cao 
Signed-off-by: Declan Doherty 
---
 examples/l2fwd-crypto/main.c | 29 ++---
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/examples/l2fwd-crypto/main.c b/examples/l2fwd-crypto/main.c
index 0b4414b..d70fc9a 100644
--- a/examples/l2fwd-crypto/main.c
+++ b/examples/l2fwd-crypto/main.c
@@ -118,7 +118,6 @@ struct l2fwd_crypto_options {
unsigned nb_ports_per_lcore;
unsigned refresh_period;
unsigned single_lcore:1;
-   unsigned no_stats_printing:1;

enum rte_cryptodev_type cdev_type;
unsigned sessionless:1;
@@ -575,10 +574,9 @@ l2fwd_main_loop(struct l2fwd_crypto_options *options)
(uint64_t)timer_period)) {

/* do this only on master core */
-   if (lcore_id == rte_get_master_lcore() 
&&
-   
!options->no_stats_printing) {
+   if (lcore_id == rte_get_master_lcore()
+   && options->refresh_period) {
print_stats();
-   /* reset the timer */
timer_tsc = 0;
}
}
@@ -802,11 +800,6 @@ static int
 l2fwd_crypto_parse_args_long_options(struct l2fwd_crypto_options *options,
struct option *lgopts, int option_index)
 {
-   if (strcmp(lgopts[option_index].name, "no_stats") == 0) {
-   options->no_stats_printing = 1;
-   return 0;
-   }
-
if (strcmp(lgopts[option_index].name, "cdev_type") == 0)
return parse_cryptodev_type(>cdev_type, optarg);

@@ -903,21 +896,21 @@ l2fwd_crypto_parse_timer_period(struct 
l2fwd_crypto_options *options,
const char *q_arg)
 {
char *end = NULL;
-   int n;
+   long int n;

/* parse number string */
n = strtol(q_arg, , 10);
if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
n = 0;

-   if (n >= MAX_TIMER_PERIOD)
-   n = 0;
+   if (n >= MAX_TIMER_PERIOD) {
+   printf("Warning refresh period specified %ld is greater than "
+   "max value %d! using max value",
+   n, MAX_TIMER_PERIOD);
+   n = MAX_TIMER_PERIOD;
+   }

options->refresh_period = n * 1000 * TIMER_MILLISECOND;
-   if (options->refresh_period == 0) {
-   printf("invalid refresh period specified\n");
-   return -1;
-   }

return 0;
 }
@@ -932,7 +925,6 @@ l2fwd_crypto_default_options(struct l2fwd_crypto_options 
*options)
options->nb_ports_per_lcore = 1;
options->refresh_period = 1;
options->single_lcore = 0;
-   options->no_stats_printing = 0;

options->cdev_type = RTE_CRYPTODEV_AESNI_MB_PMD;
options->sessionless = 0;
@@ -979,7 +971,7 @@ l2fwd_crypto_options_print(struct l2fwd_crypto_options 
*options)
printf("single lcore mode: %s\n",
options->single_lcore ? "enabled" : "disabled");
printf("stats_printing: %s\n",
-   options->no_stats_printing ? "disabled" : "enabled");
+   options->refresh_period == 0 ? "disabled" : "enabled");

switch (options->cdev_type) {
case RTE_CRYPTODEV_AESNI_MB_PMD:
@@ -1036,7 +1028,6 @@ l2fwd_crypto_parse_args(struct l2fwd_crypto_options 
*options,
char **argvopt = argv, *prgname = argv[0];

static struct option lgopts[] = {
-   { "no_stats", no_argument, 0, 0 },
{ "sessionless", no_argument, 0, 0 },

{ "cdev_type", required_argument, 0, 0 },
-- 
2.5.0



[dpdk-dev] [PATCH 0/2] fix missing dependencies

2015-12-02 Thread Declan Doherty
On 02/12/15 04:05, Stephen Hemminger wrote:
> Fix some issues found when doing parallel builds
>
> Stephen Hemminger (2):
>cmdline_test: add missing dependencies
>bonding: add depencency on cmdline library
>
>   app/cmdline_test/Makefile| 3 +++
>   drivers/net/bonding/Makefile | 1 +
>   2 files changed, 4 insertions(+)
>
Series Acked-by: Declan Doherty


[dpdk-dev] Does anybody know OpenDataPlane

2015-12-02 Thread Polehn, Mike A
A hint of the fundamental difference:
One originated somewhat more from the embedded orientation and one originated 
somewhat more from the server orientation. Both efforts are driving each 
towards the other and have overlap.

Mike

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Polehn, Mike A
Sent: Wednesday, December 2, 2015 8:32 AM
To: Kury Nicolas; dev at dpdk.org
Subject: Re: [dpdk-dev] Does anybody know OpenDataPlane

I don't think you have researched this enough. 
Asking this questions shows that you are just beginning your research or do not 
understand how this fits into current telco NFV/SDN efforts.

Why does this exist: "OpenDataPlane using DPDK for Intel NIC", listed below? 
Why would competing technologies use the competition technology to solve a 
problem?

Maybe you can change your thesis to "Current Open Source Dataplane Methods": 
and do a comparison between the two.  However if you just look at the sales 
documentation then you may not understand the real difference.

Mike

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Kury Nicolas
Sent: Wednesday, December 2, 2015 6:22 AM
To: dev at dpdk.org
Subject: [dpdk-dev] Does anybody know OpenDataPlane

Hi!


Does anybody know OpenDataPlane ?  http://www.opendataplane.org/ It is a 
framework designed to enable software portability between networking SoCs, 
regardless of the underlying instruction set architecture. There are several 
implementations.

  *   OpenDataPlane using DPDK for Intel NIC
  *   OpenDataPlane using DPAA for Freescale platforms (QorIQ)
  *   OpenDataPlane using MCSDK for Texas Insturments platforms (KeyStone II)
  *   etc.

When a developer wants to port his application, he just needs to recompile it 
with the implementation of OpenDataPlane related to the new platform.


I'm doing my Master's Thesis on OpenDataPlane  and I have some questions.

- Now that OpenDataPlane (ODP) exists, schould every developpers start a new 
project with ODP or are there some reasons to still use DPDK ? What do you 
think ?


Thank you very much

Nicolas




[dpdk-dev] [PATCH 3/3] rte_sched: eliminate floating point in calculating byte clock

2015-12-02 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Sunday, November 29, 2015 8:47 PM
> To: Dumitrescu, Cristian 
> Cc: dev at dpdk.org; Stephen Hemminger 
> Subject: [PATCH 3/3] rte_sched: eliminate floating point in calculating byte
> clock
> 
> The old code was doing a floating point divide for each rte_dequeue()
> which is very expensive. Change to using fixed point scaled inverse
> multiply. To maintain equivalent precision, scaled math is used.
> The application ABI is the same.
> 
> This improved performance from 5Gbit/sec to 10 Gbit/sec when configured
> for 10 Gbit/sec rate.
> 
> There was some feedback from Cristian that he wanted a better
> solution and was going to give one, but none was provided.
> For 2.2 this is a better solution than existing code, if someone
> has a better version I would love to see it.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_sched/rte_sched.c | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 16acd6b..cfae136 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -47,6 +47,7 @@
>  #include "rte_bitmap.h"
>  #include "rte_sched_common.h"
>  #include "rte_approx.h"
> +#include "rte_reciprocal.h"
> 
>  #ifdef __INTEL_COMPILER
>  #pragma warning(disable:2259) /* conversion may lose significant bits */
> @@ -62,6 +63,11 @@
>  #define RTE_SCHED_PIPE_INVALIDUINT32_MAX
>  #define RTE_SCHED_BMP_POS_INVALID UINT32_MAX
> 
> +/* Scaling for cycles_per_byte calculation
> + * Chosen so that minimum rate is 480 bit/sec
> + */
> +#define RTE_SCHED_TIME_SHIFT   8

Stephen, can you please elaborate why we need to shift the dividend at all and 
why the shift value was picked as 8? Is 8 a hard constraint? How does this 
affect the scheduling precision/accuracy?

> +
>  struct rte_sched_subport {
>   /* Token bucket (TB) */
>   uint64_t tb_time; /* time of last update */
> @@ -215,7 +221,7 @@ struct rte_sched_port {
>   uint64_t time_cpu_cycles; /* Current CPU time measured in CPU
> cyles */
>   uint64_t time_cpu_bytes;  /* Current CPU time measured in bytes
> */
>   uint64_t time;/* Current NIC TX time measured in bytes 
> */
> - double cycles_per_byte;   /* CPU cycles per byte */
> + struct rte_reciprocal inv_cycles_per_byte; /* CPU cycles per byte */
> 
>   /* Scheduling loop detection */
>   uint32_t pipe_loop;
> @@ -610,7 +616,7 @@ struct rte_sched_port *
>  rte_sched_port_config(struct rte_sched_port_params *params)
>  {
>   struct rte_sched_port *port = NULL;
> - uint32_t mem_size, bmp_mem_size, n_queues_per_port, i;
> + uint32_t mem_size, bmp_mem_size, n_queues_per_port, i,
> cycles_per_byte;
> 
>   /* Check user parameters. Determine the amount of memory to
> allocate */
>   mem_size = rte_sched_port_get_memory_footprint(params);
> @@ -661,7 +667,10 @@ rte_sched_port_config(struct
> rte_sched_port_params *params)
>   port->time_cpu_cycles = rte_get_tsc_cycles();
>   port->time_cpu_bytes = 0;
>   port->time = 0;
> - port->cycles_per_byte = ((double) rte_get_tsc_hz()) / ((double)
> params->rate);
> +
> + cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT)
> + / params->rate;
> + port->inv_cycles_per_byte = rte_reciprocal_value(cycles_per_byte);
> 
>   /* Scheduling loop detection */
>   port->pipe_loop = RTE_SCHED_PIPE_INVALID;
> @@ -2088,11 +2097,15 @@ rte_sched_port_time_resync(struct
> rte_sched_port *port)
>  {
>   uint64_t cycles = rte_get_tsc_cycles();
>   uint64_t cycles_diff = cycles - port->time_cpu_cycles;
> - double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
> + uint64_t bytes_diff;
> +
> + /* Compute elapsed time in bytes */
> + bytes_diff = rte_reciprocal_divide(cycles_diff <<
> RTE_SCHED_TIME_SHIFT,
> +port->inv_cycles_per_byte);
> 
>   /* Advance port time */
>   port->time_cpu_cycles = cycles;
> - port->time_cpu_bytes += (uint64_t) bytes_diff;
> + port->time_cpu_bytes += bytes_diff;
>   if (port->time < port->time_cpu_bytes)
>   port->time = port->time_cpu_bytes;
> 
> --
> 2.1.4

Can you provide some insight into how you tested this code and the test vectors 
you used?



[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Panu Matilainen
On 12/02/2015 04:31 PM, Yuanhan Liu wrote:
> On Wed, Dec 02, 2015 at 03:53:45PM +0200, Panu Matilainen wrote:
>> On 12/02/2015 05:43 AM, Yuanhan Liu wrote:
>>> VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk
>>> vhost-user) where we should log dirty pages, and how big the log
>>> buffer is.
>>>
>>> This request introduces a new payload:
>>>
>>> typedef struct VhostUserLog {
>>> uint64_t mmap_size;
>>> uint64_t mmap_offset;
>>> } VhostUserLog;
>>>
>>> Also, a fd is delivered from QEMU by ancillary data.
>>>
>>> With those info given, an area of memory is mmaped, assigned
>>> to dev->log_base, for logging dirty pages.
>>>
>>> Signed-off-by: Yuanhan Liu 
>>> ---
>>>   lib/librte_vhost/rte_virtio_net.h |  2 ++
>>>   lib/librte_vhost/vhost_user/vhost-net-user.c  |  7 -
>>>   lib/librte_vhost/vhost_user/vhost-net-user.h  |  6 
>>>   lib/librte_vhost/vhost_user/virtio-net-user.c | 44 
>>> +++
>>>   lib/librte_vhost/vhost_user/virtio-net-user.h |  1 +
>>>   5 files changed, 59 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_vhost/rte_virtio_net.h 
>>> b/lib/librte_vhost/rte_virtio_net.h
>>> index 5687452..416dac2 100644
>>> --- a/lib/librte_vhost/rte_virtio_net.h
>>> +++ b/lib/librte_vhost/rte_virtio_net.h
>>> @@ -127,6 +127,8 @@ struct virtio_net {
>>>   #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>>> charifname[IF_NAME_SZ]; /**< Name of the tap 
>>> device or socket path. */
>>> uint32_tvirt_qp_nb; /**< number of queue pair we 
>>> have allocated */
>>> +   uint64_tlog_size;   /**< Size of log area */
>>> +   uint8_t *log_base;  /**< Where dirty pages are 
>>> logged */
>>> void*priv;  /**< private context */
>>> struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  /**< 
>>> Contains all virtqueue information. */
>>>   } __rte_cache_aligned;
>>
>> This (and other changes in patch 2 breaks the librte_vhost ABI
>> again, so you'd need to at least add a deprecation note to 2.2 to be
>> able to do it in 2.3 at all according to the ABI policy.
>
> I was thinking that adding a new field (instead of renaming it or
> removing it) isn't an ABI break. So, I was wrong?

Adding or removing a field in the middle of a public struct is always an 
ABI break. Adding to the end often is too, but not always.
Renaming a field is an API break but not an ABI break - the compiler 
cares but the cpu does not.

>>
>> Perhaps a better option would be adding some padding to the structs
>> now for 2.2 since the vhost ABI is broken there anyway. That would
>> at least give a chance to keep it compatible from 2.2 to 2.3.
>
> It will not be compatible, unless we add exact same fields (not
> something like uint8_t pad[xx]). Otherwise, the pad field renaming
> is also an ABI break, right?

There's no ABI (or API) break in changing reserved unused fields to 
something else, as long as care is taken with sizes and alignment. In 
any case padding is best added to the end of a struct to minimize risks 
and keep things simple.

- Panu -

>
> Thomas, should I write an ABI deprecation note? Can I make it for
> v2.2 release If I make one tomorrow? (Sorry that I'm not awared
> of that it would be an ABI break).
>
>   --yliu
>



[dpdk-dev] [PATCH 2/3] rte_sched: introduce reciprocal divide

2015-12-02 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Sunday, November 29, 2015 8:47 PM
> To: Dumitrescu, Cristian 
> Cc: dev at dpdk.org; Stephen Hemminger ;
> Hannes Frederic Sowa 
> Subject: [PATCH 2/3] rte_sched: introduce reciprocal divide
> 
> This adds (with permission of the original author)
> reciprocal divide based on algorithm in Linux.
> 
> Signed-off-by: Stephen Hemminger 
> Signed-off-by: Hannes Frederic Sowa 
> ---
>  lib/librte_sched/Makefile |  6 ++--
>  lib/librte_sched/rte_reciprocal.c | 72
> +++
>  lib/librte_sched/rte_reciprocal.h | 39 +
>  3 files changed, 115 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_sched/rte_reciprocal.c
>  create mode 100644 lib/librte_sched/rte_reciprocal.h
> 
> diff --git a/lib/librte_sched/Makefile b/lib/librte_sched/Makefile
> index b1cb285..e0a2c6d 100644
> --- a/lib/librte_sched/Makefile
> +++ b/lib/librte_sched/Makefile
> @@ -48,10 +48,12 @@ LIBABIVER := 1
>  #
>  # all source are stored in SRCS-y
>  #
> -SRCS-$(CONFIG_RTE_LIBRTE_SCHED) += rte_sched.c rte_red.c rte_approx.c
> +SRCS-$(CONFIG_RTE_LIBRTE_SCHED) += rte_sched.c rte_red.c
> rte_approx.c \
> + rte_reciprocal.c
> 
>  # install includes
> -SYMLINK-$(CONFIG_RTE_LIBRTE_SCHED)-include := rte_sched.h
> rte_bitmap.h rte_sched_common.h rte_red.h rte_approx.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_SCHED)-include := rte_sched.h
> rte_bitmap.h \
> + rte_sched_common.h rte_red.h rte_approx.h rte_reciprocal.h
> 
>  # this lib depends upon:
>  DEPDIRS-$(CONFIG_RTE_LIBRTE_SCHED) += lib/librte_mempool
> lib/librte_mbuf
> diff --git a/lib/librte_sched/rte_reciprocal.c
> b/lib/librte_sched/rte_reciprocal.c
> new file mode 100644
> index 000..652f023
> --- /dev/null
> +++ b/lib/librte_sched/rte_reciprocal.c
> @@ -0,0 +1,72 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) Hannes Frederic Sowa
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its

Why is Intel mentioned here, as according to this license header Intel is not 
the copyright holder?

> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> + */
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +#include "rte_reciprocal.h"
> +
> +/* find largest set bit.
> + * portable and slow but does not matter for this usage.
> + */
> +static inline int fls(uint32_t x)
> +{
> + int b;
> +
> + for (b = 31; b >= 0; --b) {
> + if (x & (1u << b))
> + return b + 1;
> + }
> +
> + return 0;
> +}
> +
> +struct rte_reciprocal rte_reciprocal_value(uint32_t d)
> +{
> + struct rte_reciprocal R;
> + uint64_t m;
> + int l;
> +
> + l = fls(d - 1);
> + m = ((1ULL << 32) * ((1ULL << l) - d));
> + m /= d;
> +
> + ++m;
> + R.m = m;
> + R.sh1 = RTE_MIN(l, 1);
> + R.sh2 = RTE_MAX(l - 1, 0);
> +
> + return R;
> +}
> diff --git a/lib/librte_sched/rte_reciprocal.h
> b/lib/librte_sched/rte_reciprocal.h
> new file mode 100644
> index 000..abd1525
> --- /dev/null
> +++ b/lib/librte_sched/rte_reciprocal.h
> @@ -0,0 +1,39 @@
> +/*
> + * Reciprocal divide
> + *
> + * Used with permission from original authors
> + *  Hannes Frederic Sowa and Daniel Borkmann
> + *
> + * This algorithm is based on the paper "Division by Invariant
> + * Integers Using Multiplication" by Torbj??rn Granlund and Peter

[dpdk-dev] Does anybody know OpenDataPlane

2015-12-02 Thread Polehn, Mike A
I don't think you have researched this enough. 
Asking this questions shows that you are just beginning your research or do not 
understand how this fits into current telco NFV/SDN efforts.

Why does this exist: "OpenDataPlane using DPDK for Intel NIC", listed below? 
Why would competing technologies use the competition technology to solve a 
problem?

Maybe you can change your thesis to "Current Open Source Dataplane Methods": 
and do a comparison between the two.  However if you just look at the sales 
documentation then you may not understand the real difference.

Mike

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Kury Nicolas
Sent: Wednesday, December 2, 2015 6:22 AM
To: dev at dpdk.org
Subject: [dpdk-dev] Does anybody know OpenDataPlane

Hi!


Does anybody know OpenDataPlane ?  http://www.opendataplane.org/ It is a 
framework designed to enable software portability between networking SoCs, 
regardless of the underlying instruction set architecture. There are several 
implementations.

  *   OpenDataPlane using DPDK for Intel NIC
  *   OpenDataPlane using DPAA for Freescale platforms (QorIQ)
  *   OpenDataPlane using MCSDK for Texas Insturments platforms (KeyStone II)
  *   etc.

When a developer wants to port his application, he just needs to recompile it 
with the implementation of OpenDataPlane related to the new platform.


I'm doing my Master's Thesis on OpenDataPlane  and I have some questions.

- Now that OpenDataPlane (ODP) exists, schould every developpers start a new 
project with ODP or are there some reasons to still use DPDK ? What do you 
think ?


Thank you very much

Nicolas




[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 10:33:44AM +, Ananyev, Konstantin wrote:
> Hi everyone,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jianbo Liu
> > Sent: Wednesday, December 02, 2015 9:50 AM
> > To: Jerin Jacob
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs
> > 
> > On 2 December 2015 at 16:03, Jerin Jacob  > caviumnetworks.com> wrote:
> > > On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
> > >> On 2 December 2015 at 00:41, Jerin Jacob  > >> caviumnetworks.com> wrote:
> > >> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
> > >> >> Adds ARM NEON support for lpm.
> > >> >> And enables table/pipeline libraries which depend on lpm.
> > >> >
> > >> > I already sent the patch on the same yesterday.
> > >> > We can converge the patches after the discussion.
> > >> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
> > >> >
> > >> Yes, I have read your patch. But there are many differences, so I sent
> > >> mine for your reviewing :)
> > >>
> > >> >
> > >> >>
> > >> >> Signed-off-by: Jianbo Liu 
> > >> >> ---
> > >> >>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
> > >> >>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
> > >> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
> > >> >>  lib/librte_lpm/rte_lpm.h  | 68 
> > >> >> ---
> > >> >>  4 files changed, 77 insertions(+), 25 deletions(-)
> > >> >>
> > >> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
> > >> >> b/config/defconfig_arm-armv7a-linuxapp-gcc
> > >> >> index cbebd64..efffa1f 100644
> > >> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
> > >> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
> > >> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
> > >> >>  CONFIG_RTE_EAL_IGB_UIO=n
> > >> >>
> > >> >>  # fails to compile on ARM
> > >> >> -CONFIG_RTE_LIBRTE_LPM=n
> > >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> > >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> > >> >>  CONFIG_RTE_SCHED_VECTOR=n
> > >> >>
> > >> >>  # cannot use those on ARM
> > >> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
> > >> >> b/config/defconfig_arm64-armv8a-linuxapp-gcc
> > >> >> index 504f3ed..57f7941 100644
> > >> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
> > >> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> > >> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
> > >> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
> > >> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
> > >> >>
> > >> >> -CONFIG_RTE_LIBRTE_LPM=n
> > >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> > >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> > >> >>  CONFIG_RTE_SCHED_VECTOR=n
> > >> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> > >> >> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > >> >> index a33c054..7437711 100644
> > >> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > >> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > >> >> @@ -41,6 +41,8 @@ extern "C" {
> > >> >>
> > >> >>  typedef int32x4_t xmm_t;
> > >> >>
> > >> >> +typedef int32x4_t __m128i;
> > >> >> +
> > >> >>  #define  XMM_SIZE(sizeof(xmm_t))
> > >> >>  #define  XMM_MASK(XMM_SIZE - 1)
> > >> >>
> > >> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
> > >> >>   double   pd[XMM_SIZE / sizeof(double)];
> > >> >>  } __attribute__((aligned(16))) rte_xmm_t;
> > >> >>
> > >> >> +static __inline __m128i
> > >> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
> > >> >> +{
> > >> >> + int32_t r[4] = {i0, i1, i2, i3};
> > >> >> +
> > >> >> + return vld1q_s32(r);
> > >> >> +}
> > >> >> +
> > >> >> +static __inline __m128i
> > >> >> +_mm_loadu_si128(__m128i *p)
> > >> >> +{
> > >> >> + return vld1q_s32((int32_t *)p);
> > >> >> +}
> > >> >> +
> > >> >> +static __inline __m128i
> > >> >> +_mm_set1_epi32(int i)
> > >> >> +{
> > >> >> + return vdupq_n_s32(i);
> > >> >> +}
> > >> >> +
> > >> >> +static __inline __m128i
> > >> >> +_mm_and_si128(__m128i a, __m128i b)
> > >> >> +{
> > >> >> + return vandq_s32(a, b);
> > >> >> +}
> > >> >> +
> > >
> > > IMO, it's not always good to emulate GCC defined intrinsics of
> > > other architecture. What if a legacy DPDK application has such mappings
> > > then BOOM, multiple definition, which one is correct? which one
> > > to comment it out? Integration pain starts for DPDK library consumer:-(
> > >
> > They can include rte_vect.h in build/include directly, which is linked 
> > correctly
> > to the one for that ARCH, so there is no need to worry about.
> > 
> > 
> > >> >
> > >> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
> > >> > Let's create the rte_vect_* as required. look at the existing patch.
> > >> >
> > >> I thought of creating a layer of SIMD over all the platforms before.
> > >> But can't you see it make things complicated, considering there are
> > >> only few simple intrinsic 

[dpdk-dev] Bond port with multiple queues

2015-12-02 Thread Sergey Balabanov
Hello,

I configured a bond port with 2 rx queues on it and added 2 slaves into the 
bond port. When I run traffic I get all packets on queue #0. This is quite 
expected when RSS turned off. When I turn on RSS all packets are distributed 
between two rx queues. There is no guarantee that I will get all packets from 
the slave port #0 on some queue and all packets from the slave port #1 on 
another queue.
Does anybody know is there a way to configure bond port in a way when all 
traffic 
on port #0 goes to queue #0 and traffic on port #1 goes to queue #1?

Thanks,
Sergey Balabanov


[dpdk-dev] [PATCH 0/4 for 2.3] vhost-user live migration support

2015-12-02 Thread Victor Kaplansky
On Wed, Dec 02, 2015 at 11:43:09AM +0800, Yuanhan Liu wrote:
> This patch set adds the initial vhost-user live migration support.
> 
> The major task behind that is to log pages we touched during
> live migration. So, this patch is basically about adding vhost
> log support, and using it.
> 
> Patchset
> 
> - Patch 1 handles VHOST_USER_SET_LOG_BASE, which tells us where
>   the dirty memory bitmap is.
> 
> - Patch 2 introduces a vhost_log_write() helper function to log
>   pages we are gonna change.
> 
> - Patch 3 logs changes we made to used vring.
> 
> - Patch 4 sets log_fhmfd protocol feature bit, which actually
>   enables the vhost-user live migration support.
> 
> A simple test guide (on same host)
> ==
> 
> The following test is based on OVS + DPDK. And here is guide
> to setup OVS + DPDK:
> 
> http://wiki.qemu.org/Features/vhost-user-ovs-dpdk
> 
> 1. start ovs-vswitchd
> 
> 2. Add two ovs vhost-user port, say vhost0 and vhost1
> 
> 3. Start a VM1 to connect to vhost0. Here is my example:
> 
>$QEMU -enable-kvm -m 1024 -smp 4 \
>-chardev socket,id=char0,path=/var/run/openvswitch/vhost0  \
>-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
>-device virtio-net-pci,netdev=mynet1,mac=52:54:00:12:34:58 \
>-object 
> memory-backend-file,id=mem,size=1024M,mem-path=$HOME/hugetlbfs,share=on \
>-numa node,memdev=mem -mem-prealloc \
>-kernel $HOME/iso/vmlinuz -append "root=/dev/sda1" \
>-hda fc-19-i386.img \
>-monitor telnet::,server,nowait -curses
> 
> 4. run "ping $host" inside VM1
> 
> 5. Start VM2 to connect to vhost0, and marking it as the target
>of live migration (by adding -incoming tcp:0: option)
> 
>$QEMU -enable-kvm -m 1024 -smp 4 \
>-chardev socket,id=char0,path=/var/run/openvswitch/vhost1  \
>-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
>-device virtio-net-pci,netdev=mynet1,mac=52:54:00:12:34:58 \
>-object 
> memory-backend-file,id=mem,size=1024M,mem-path=$HOME/hugetlbfs,share=on \
>-numa node,memdev=mem -mem-prealloc \
>-kernel $HOME/iso/vmlinuz -append "root=/dev/sda1" \
>-hda fc-19-i386.img \
>-monitor telnet::3334,server,nowait -curses \
>-incoming tcp:0: 
> 
> 6. connect to VM1 monitor, and start migration:
> 
>> migrate tcp:0:
> 
> 7. After a while, you will find that VM1 has been migrated to VM2,
>and the "ping" command continues running, perfectly.
> 
> 
> Note: this patch set has mostly been based on Victor Kaplansk's demo
> work (vhost-user-bridge) at QEMU project. I was thinking to add Victor
> as the co-author. Victor, what do you think of that? :)

Thanks for adding me to credits list!
-- Victor

> 
> Comments are welcome!
> 
> ---
> Yuanhan Liu (4):
>   vhost: handle VHOST_USER_SET_LOG_BASE request
>   vhost: introduce vhost_log_write
>   vhost: log vring changes
>   vhost: enable log_shmfd protocol feature
> 
>  lib/librte_vhost/rte_virtio_net.h | 35 ++
>  lib/librte_vhost/vhost_rxtx.c | 70 
> ++-
>  lib/librte_vhost/vhost_user/vhost-net-user.c  |  7 ++-
>  lib/librte_vhost/vhost_user/vhost-net-user.h  |  6 +++
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 44 +
>  lib/librte_vhost/vhost_user/virtio-net-user.h |  5 +-
>  lib/librte_vhost/virtio-net.c |  4 ++
>  7 files changed, 145 insertions(+), 26 deletions(-)
> 
> -- 
> 1.9.0


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 05:49:41PM +0800, Jianbo Liu wrote:
> On 2 December 2015 at 16:03, Jerin Jacob  
> wrote:
> > On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
> >> On 2 December 2015 at 00:41, Jerin Jacob  >> caviumnetworks.com> wrote:
> >> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
> >> >> Adds ARM NEON support for lpm.
> >> >> And enables table/pipeline libraries which depend on lpm.
> >> >
> >> > I already sent the patch on the same yesterday.
> >> > We can converge the patches after the discussion.
> >> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
> >> >
> >> Yes, I have read your patch. But there are many differences, so I sent
> >> mine for your reviewing :)
> >>
> >> >
> >> >>
> >> >> Signed-off-by: Jianbo Liu 
> >> >> ---
> >> >>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
> >> >>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
> >> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
> >> >>  lib/librte_lpm/rte_lpm.h  | 68 
> >> >> ---
> >> >>  4 files changed, 77 insertions(+), 25 deletions(-)
> >> >>
> >> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
> >> >> b/config/defconfig_arm-armv7a-linuxapp-gcc
> >> >> index cbebd64..efffa1f 100644
> >> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
> >> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
> >> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
> >> >>  CONFIG_RTE_EAL_IGB_UIO=n
> >> >>
> >> >>  # fails to compile on ARM
> >> >> -CONFIG_RTE_LIBRTE_LPM=n
> >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> >> >>  CONFIG_RTE_SCHED_VECTOR=n
> >> >>
> >> >>  # cannot use those on ARM
> >> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
> >> >> b/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> >> index 504f3ed..57f7941 100644
> >> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
> >> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
> >> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
> >> >>
> >> >> -CONFIG_RTE_LIBRTE_LPM=n
> >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> >> >>  CONFIG_RTE_SCHED_VECTOR=n
> >> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> >> >> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> >> index a33c054..7437711 100644
> >> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> >> @@ -41,6 +41,8 @@ extern "C" {
> >> >>
> >> >>  typedef int32x4_t xmm_t;
> >> >>
> >> >> +typedef int32x4_t __m128i;
> >> >> +
> >> >>  #define  XMM_SIZE(sizeof(xmm_t))
> >> >>  #define  XMM_MASK(XMM_SIZE - 1)
> >> >>
> >> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
> >> >>   double   pd[XMM_SIZE / sizeof(double)];
> >> >>  } __attribute__((aligned(16))) rte_xmm_t;
> >> >>
> >> >> +static __inline __m128i
> >> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
> >> >> +{
> >> >> + int32_t r[4] = {i0, i1, i2, i3};
> >> >> +
> >> >> + return vld1q_s32(r);
> >> >> +}
> >> >> +
> >> >> +static __inline __m128i
> >> >> +_mm_loadu_si128(__m128i *p)
> >> >> +{
> >> >> + return vld1q_s32((int32_t *)p);
> >> >> +}
> >> >> +
> >> >> +static __inline __m128i
> >> >> +_mm_set1_epi32(int i)
> >> >> +{
> >> >> + return vdupq_n_s32(i);
> >> >> +}
> >> >> +
> >> >> +static __inline __m128i
> >> >> +_mm_and_si128(__m128i a, __m128i b)
> >> >> +{
> >> >> + return vandq_s32(a, b);
> >> >> +}
> >> >> +
> >
> > IMO, it's not always good to emulate GCC defined intrinsics of
> > other architecture. What if a legacy DPDK application has such mappings
> > then BOOM, multiple definition, which one is correct? which one
> > to comment it out? Integration pain starts for DPDK library consumer:-(
> >
> They can include rte_vect.h in build/include directly, which is linked 
> correctly
> to the one for that ARCH, so there is no need to worry about.

I think you missed the point,I was trying to say that
legacy DPDK application and third party stacks uses SSE2NEON kind of
libraries
for quick integration, for example, something like this
https://github.com/jratcliff63367/sse2neon/blob/master/SSE2NEON.h

AND they include "rte_lpm.h"(it internally includes rte_vect.h)
that lead to multiple definition and its not good.

>
>
> >> >
> >> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
> >> > Let's create the rte_vect_* as required. look at the existing patch.
> >> >
> >> I thought of creating a layer of SIMD over all the platforms before.
> >> But can't you see it make things complicated, considering there are
> >> only few simple intrinsic to implement?
> >
> > Not true, There were, a lot of SSE intrinsics needs be to emulated for ACL 
> > NEON
> > implementation if I were to take this approach and emulation comes with
> 

[dpdk-dev] [PATCH 3/4] vhost: log vring changes

2015-12-02 Thread Victor Kaplansky
On Wed, Dec 02, 2015 at 11:43:12AM +0800, Yuanhan Liu wrote:
> Invoking vhost_log_write() to mark corresponding page as dirty while
> updating used vring.

Looks good, thanks!

I didn't find where you log the dirty pages in result of data
written to the buffers pointed by the descriptors in RX vring.
AFAIU, the buffers of RX queue reside in guest's memory and have
to be marked as dirty if they are written. What do you say?

-- Victor

> 
> Signed-off-by: Yuanhan Liu 
> ---
>  lib/librte_vhost/vhost_rxtx.c | 74 
> +--
>  1 file changed, 50 insertions(+), 24 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 9322ce6..d4805d8 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -129,6 +129,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>   uint32_t offset = 0, vb_offset = 0;
>   uint32_t pkt_len, len_to_cpy, data_len, total_copied = 0;
>   uint8_t hdr = 0, uncompleted_pkt = 0;
> + uint16_t idx;
>  
>   /* Get descriptor from available ring */
>   desc = >desc[head[packet_success]];
> @@ -200,16 +201,22 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>   }
>  
>   /* Update used ring with desc information */
> - vq->used->ring[res_cur_idx & (vq->size - 1)].id =
> - head[packet_success];
> + idx = res_cur_idx & (vq->size - 1);
> + vq->used->ring[idx].id = head[packet_success];
>  
>   /* Drop the packet if it is uncompleted */
>   if (unlikely(uncompleted_pkt == 1))
> - vq->used->ring[res_cur_idx & (vq->size - 1)].len =
> - vq->vhost_hlen;
> + vq->used->ring[idx].len = vq->vhost_hlen;
>   else
> - vq->used->ring[res_cur_idx & (vq->size - 1)].len =
> - pkt_len + 
> vq->vhost_hlen;
> + vq->used->ring[idx].len = pkt_len + vq->vhost_hlen;
> +
> + /*
> +  * to defer the update to when updating used->idx,
> +  * and batch them?
> +  */
> + vhost_log_write(dev, vq,
> + offsetof(struct vring_used, ring[idx]),
> + sizeof(vq->used->ring[idx]));
>  
>   res_cur_idx++;
>   packet_success++;
> @@ -236,6 +243,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>   *(volatile uint16_t *)>used->idx += count;
>   vq->last_used_idx = res_end_idx;
> + vhost_log_write(dev, vq,
> + offsetof(struct vring_used, idx),
> + sizeof(vq->used->idx));
>  
>   /* flush used->idx update before we read avail->flags. */
>   rte_mb();
> @@ -265,6 +275,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t 
> queue_id,
>   uint32_t seg_avail;
>   uint32_t vb_avail;
>   uint32_t cpy_len, entry_len;
> + uint16_t idx;
>  
>   if (pkt == NULL)
>   return 0;
> @@ -302,16 +313,18 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, 
> uint32_t queue_id,
>   entry_len = vq->vhost_hlen;
>  
>   if (vb_avail == 0) {
> - uint32_t desc_idx =
> - vq->buf_vec[vec_idx].desc_idx;
> + uint32_t desc_idx = vq->buf_vec[vec_idx].desc_idx;
> +
> + if ((vq->desc[desc_idx].flags & VRING_DESC_F_NEXT) == 0) {
> + idx = cur_idx & (vq->size - 1);
>  
> - if ((vq->desc[desc_idx].flags
> - & VRING_DESC_F_NEXT) == 0) {
>   /* Update used ring with desc information */
> - vq->used->ring[cur_idx & (vq->size - 1)].id
> - = vq->buf_vec[vec_idx].desc_idx;
> - vq->used->ring[cur_idx & (vq->size - 1)].len
> - = entry_len;
> + vq->used->ring[idx].id = vq->buf_vec[vec_idx].desc_idx;
> + vq->used->ring[idx].len = entry_len;
> +
> + vhost_log_write(dev, vq,
> + offsetof(struct vring_used, ring[idx]),
> + sizeof(vq->used->ring[idx]));
>  
>   entry_len = 0;
>   cur_idx++;
> @@ -354,10 +367,13 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, 
> uint32_t queue_id,
>   if ((vq->desc[vq->buf_vec[vec_idx].desc_idx].flags &
>   VRING_DESC_F_NEXT) == 0) {
>   /* Update used ring with desc information */
> - vq->used->ring[cur_idx & (vq->size - 1)].id
> + idx = cur_idx & (vq->size - 1);
> +  

[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Panu Matilainen
On 12/02/2015 05:43 AM, Yuanhan Liu wrote:
> VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk
> vhost-user) where we should log dirty pages, and how big the log
> buffer is.
>
> This request introduces a new payload:
>
>   typedef struct VhostUserLog {
>   uint64_t mmap_size;
>   uint64_t mmap_offset;
>   } VhostUserLog;
>
> Also, a fd is delivered from QEMU by ancillary data.
>
> With those info given, an area of memory is mmaped, assigned
> to dev->log_base, for logging dirty pages.
>
> Signed-off-by: Yuanhan Liu 
> ---
>   lib/librte_vhost/rte_virtio_net.h |  2 ++
>   lib/librte_vhost/vhost_user/vhost-net-user.c  |  7 -
>   lib/librte_vhost/vhost_user/vhost-net-user.h  |  6 
>   lib/librte_vhost/vhost_user/virtio-net-user.c | 44 
> +++
>   lib/librte_vhost/vhost_user/virtio-net-user.h |  1 +
>   5 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/rte_virtio_net.h 
> b/lib/librte_vhost/rte_virtio_net.h
> index 5687452..416dac2 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -127,6 +127,8 @@ struct virtio_net {
>   #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>   charifname[IF_NAME_SZ]; /**< Name of the tap 
> device or socket path. */
>   uint32_tvirt_qp_nb; /**< number of queue pair we 
> have allocated */
> + uint64_tlog_size;   /**< Size of log area */
> + uint8_t *log_base;  /**< Where dirty pages are 
> logged */
>   void*priv;  /**< private context */
>   struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  /**< 
> Contains all virtqueue information. */
>   } __rte_cache_aligned;

This (and other changes in patch 2 breaks the librte_vhost ABI again, so 
you'd need to at least add a deprecation note to 2.2 to be able to do it 
in 2.3 at all according to the ABI policy.

Perhaps a better option would be adding some padding to the structs now 
for 2.2 since the vhost ABI is broken there anyway. That would at least 
give a chance to keep it compatible from 2.2 to 2.3.

- Panu -




[dpdk-dev] [PATCH 2/4] vhost: introduce vhost_log_write

2015-12-02 Thread Victor Kaplansky
On Wed, Dec 02, 2015 at 11:43:11AM +0800, Yuanhan Liu wrote:
> Introduce vhost_log_write() helper function to log the dirty pages we
> touched. Page size is harded code to 4096 (VHOST_LOG_PAGE), and each
> log is presented by 1 bit.
> 
> Therefore, vhost_log_write() simply finds the right bit for related
> page we are gonna change, and set it to 1. dev->log_base denotes the
> start of the dirty page bitmap.
> 
> The page address is biased by log_guest_addr, which is derived from
> SET_VRING_ADDR request as part of the vring related addresses.
> 
> Signed-off-by: Yuanhan Liu 
> ---
>  lib/librte_vhost/rte_virtio_net.h | 34 ++
>  lib/librte_vhost/virtio-net.c |  4 
>  2 files changed, 38 insertions(+)
> 
> diff --git a/lib/librte_vhost/rte_virtio_net.h 
> b/lib/librte_vhost/rte_virtio_net.h
> index 416dac2..191c1be 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -40,6 +40,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -59,6 +60,8 @@ struct rte_mbuf;
>  /* Backend value set by guest. */
>  #define VIRTIO_DEV_STOPPED -1
>  
> +#define VHOST_LOG_PAGE   4096
> +
>  
>  /* Enum for virtqueue management. */
>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> @@ -82,6 +85,7 @@ struct vhost_virtqueue {
>   struct vring_desc   *desc;  /**< Virtqueue 
> descriptor ring. */
>   struct vring_avail  *avail; /**< Virtqueue 
> available ring. */
>   struct vring_used   *used;  /**< Virtqueue used 
> ring. */
> + uint64_tlog_guest_addr; /**< Physical address 
> of used ring, for logging */
>   uint32_tsize;   /**< Size of descriptor 
> ring. */
>   uint32_tbackend;/**< Backend value to 
> determine if device should started/stopped. */
>   uint16_tvhost_hlen; /**< Vhost header 
> length (varies depending on RX merge buffers. */
> @@ -203,6 +207,36 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
>   return vhost_va;
>  }
>  
> +static inline void __attribute__((always_inline))
> +vhost_log_page(uint8_t *log_base, uint64_t page)
> +{
> + /* TODO: to make it atomic? */
> + log_base[page / 8] |= 1 << (page % 8);

I think the atomic OR operation is necessary only if there can be
more than one vhost-user back-end updating the guest's memory
simultaneously. However probably it is pretty safe to perform
regular OR operation, since rings are not shared between
back-end. What about buffers pointed by descriptors?  To be on
the safe side, I would use a GCC built-in function
__sync_fetch_and_or(). 

> +}
> +
> +static inline void __attribute__((always_inline))
> +vhost_log_write(struct virtio_net *dev, struct vhost_virtqueue *vq,
> + uint64_t offset, uint64_t len)
> +{
> + uint64_t addr = vq->log_guest_addr;
> + uint64_t page;
> +
> + if (unlikely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
> +  !dev->log_base || !len))
> + return;

Isn't "likely" more appropriate in above, since the whole
expression is expected to be true most of the time?

> +
> + addr += offset;
> + if (dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8))
> + return;
> +
> + page = addr / VHOST_LOG_PAGE;
> + while (page * VHOST_LOG_PAGE < addr + len) {
> + vhost_log_page(dev->log_base, page);
> + page += VHOST_LOG_PAGE;
> + }
> +}
> +
> +
>  /**
>   *  Disable features in feature_mask. Returns 0 on success.
>   */
> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
> index 8364938..4481827 100644
> --- a/lib/librte_vhost/virtio-net.c
> +++ b/lib/librte_vhost/virtio-net.c
> @@ -666,12 +666,16 @@ set_vring_addr(struct vhost_device_ctx ctx, struct 
> vhost_vring_addr *addr)
>   return -1;
>   }
>  
> + vq->log_guest_addr = addr->log_guest_addr;
> +
>   LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") mapped address desc: %p\n",
>   dev->device_fh, vq->desc);
>   LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") mapped address avail: %p\n",
>   dev->device_fh, vq->avail);
>   LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") mapped address used: %p\n",
>   dev->device_fh, vq->used);
> + LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") log_guest_addr: %p\n",
> + dev->device_fh, (void *)(uintptr_t)vq->log_guest_addr);
>  
>   return 0;
>  }
> -- 
> 1.9.0


[dpdk-dev] Bond port with multiple queues

2015-12-02 Thread Declan Doherty
On 02/12/15 13:11, Sergey Balabanov wrote:
> Hello,
>
> I configured a bond port with 2 rx queues on it and added 2 slaves into the
> bond port. When I run traffic I get all packets on queue #0. This is quite
> expected when RSS turned off. When I turn on RSS all packets are distributed
> between two rx queues. There is no guarantee that I will get all packets from
> the slave port #0 on some queue and all packets from the slave port #1 on
> another queue.
> Does anybody know is there a way to configure bond port in a way when all 
> traffic
> on port #0 goes to queue #0 and traffic on port #1 goes to queue #1?
>
> Thanks,
> Sergey Balabanov
>


Hey Sergey,

this is the behavior I would expect. The way we've implemented it the 
same RSS configuration will get applied to each slave, so in the case of 
say a fail over event in active backup mode as rx traffic is moved onto 
a new master the traffic flows will be directed to the same queues as 
before. If I understand what you are asking for I'm not sure why you 
need a bonded port, why not just used the 2 ports with a single queue 
each rather than a single bonded port with 2 queues?

Declan


[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree

2015-12-02 Thread Panu Matilainen
On 12/02/2015 03:05 PM, Thomas Monjalon wrote:
> 2015-12-02 14:54, Panu Matilainen:
>> On 12/02/2015 01:25 PM, Thomas Monjalon wrote:
>>> 2015-12-02 12:27, Panu Matilainen:
 $(prefix)/share is supposed to be shareable across different
 architectures. Most of the content here is, but at least the lib symlink
 and .config file are not.
>>>
>>> The case you want to address is multilib 32/x32/64, right?
>>
>> That, plus modern Debian/Ubuntu supports multiarch, not just -lib.
>
> We do not support completely different platforms (e.g. ARM and x86)
> with only one include directory. At the moment, only variants (32/64)
> live together.

Actually even the variants will run into problems because eg 
rte_config.h will differ between 32- and 64-bit. But that's a problem 
for another day, this is hardly the most pressing of issues :)

>
 One option is to install .config and the symlinks within $(sdkdir)/$(T)
 directories, then it can be shared across architectures because each
 lives in their own directory. Another possibility is moving the whole
 sdk directory into a subdir in $(libdir), but that misses the
 opportunity to share across architectures (whether anybody actually
 cares is a whole other question :)
>>>
>>> Yes, I tried to remove the use of RTE_TARGET when building an example.
>>> But we can keep it with a subdirectory in $(sdkdir).
>>
>> Just realized my suggestion $(sdkdir)/$(T) would not cut it because if
>> T= is specified then this installation method wont be invoked at all :D
>
> I don't understand what you mean.
> In my patchset, the installation is the same (except some default values)
> with and without T=.

Hmm, must've misuderstood/mixed up with something Marios patches do. 
Never mind, I was just mumbling out loud anyhow.

>
>> So yeah, RTE_TARGET. Or perhaps just RTE_ARCH. Dunno if there's actual
>> added value to having the whole target string there, but I wont mind either.
>
> RTE_TARGET is a safe choice for future.
>

Nod.

- Panu -


[dpdk-dev] building LIBRTE_PMD_XENVIRT in 32bit triggers some errors

2015-12-02 Thread Christian Ehrhardt
Hi,
just FYI - building LIBRTE_PMD_XENVIRT in 32bit triggers some errors.

I don't know if that part of the tree is actively maintained - It is
default off, in the config template config/common_linuxapp.

I'm not even entirely sure if  LIBRTE_PMD_XENVIRT is still required.
I guess in the Dom0 you can go with uio-pci-generic these days, not
sure about para-virtual guests.

Anyway I thought it might be worth to at least report.

== Build drivers/net/xenvirt
gcc -Wp,-MD,./.rte_eth_xenvirt.o.d.tmp -m32 -pthread -fPIC
-march=native -DRTE_MACHINE_CPUFLAG_SSE -DRTE_MACHINE_CPUFLAG_SSE2
-DRTE_MACHINE_CPUFLAG_SSE3 -DRTE_MACHINE_CPUFLAG_SSSE3
-DRTE_MACHINE_CPUFLAG_SSE4_1 -DRTE_MACHINE_CPUFLAG_SSE4_2
-DRTE_MACHINE_CPUFLAG_AES -DRTE_MACHINE_CPUFLAG_PCLMULQDQ
-DRTE_MACHINE_CPUFLAG_AVX
-DRTE_COMPILE_TIME_CPUFLAGS=RTE_CPUFLAG_SSE,RTE_CPUFLAG_SSE2,RTE_CPUFLAG_SSE3,RTE_CPUFLAG_SSSE3,RTE_CPUFLAG_SSE4_1,RTE_CPUFLAG_SSE4_2,RTE_CPUFLAG_AES,RTE_CPUFLAG_PCLMULQDQ,RTE_CPUFLAG_AVX
 -I/home/ubuntu/dpdk-2.2.0-rc2/build/include -include
/home/ubuntu/dpdk-2.2.0-rc2/build/include/rte_config.h -O3 -W -Wall
-Werror -Wstrict-prototypes -Wmissing-prototypes
-Wmissing-declarations -Wold-style-definition -Wpointer-arith
-Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral
-Wformat-security -Wundef -Wwrite-strings   -o rte_eth_xenvirt.o -c
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/rte_eth_xenvirt.c
In file included from
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/rte_eth_xenvirt.c:61:0:
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/virtqueue.h: In
function ?virtqueue_enqueue_recv_refill?:
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/virtqueue.h:201:15:
error: cast from pointer to integer of different size
[-Werror=pointer-to-int-cast]
   (uint64_t) ((uint64_t)cookie->buf_addr + RTE_PKTMBUF_HEADROOM -
sizeof(struct virtio_net_hdr));
   ^
In file included from
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/rte_eth_xenvirt.c:51:0:
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/virtqueue.h: In
function ?virtqueue_enqueue_xmit?:
/home/ubuntu/dpdk-2.2.0-rc2/build/include/rte_mbuf.h:1617:3: error:
cast from pointer to integer of different size
[-Werror=pointer-to-int-cast]
  ((t)((char *)(m)->buf_addr + (m)->data_off + (o)))
   ^
/home/ubuntu/dpdk-2.2.0-rc2/build/include/rte_mbuf.h:1631:32: note: in
expansion of macro ?rte_pktmbuf_mtod_offset?
 #define rte_pktmbuf_mtod(m, t) rte_pktmbuf_mtod_offset(m, t, 0)
^
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/virtqueue.h:58:2:
note: in expansion of macro ?rte_pktmbuf_mtod?
  rte_pktmbuf_mtod(mb, uint64_t)
  ^
/home/ubuntu/dpdk-2.2.0-rc2/drivers/net/xenvirt/virtqueue.h:241:24:
note: in expansion of macro ?RTE_MBUF_DATA_DMA_ADDR?
  start_dp[idx].addr  = RTE_MBUF_DATA_DMA_ADDR(cookie);
^
cc1: all warnings being treated as errors
/home/ubuntu/dpdk-2.2.0-rc2/mk/internal/rte.compile-pre.mk:126: recipe
for target 'rte_eth_xenvirt.o' failed
make[4]: *** [rte_eth_xenvirt.o] Error 1
/home/ubuntu/dpdk-2.2.0-rc2/mk/rte.subdir.mk:61: recipe for target
'xenvirt' failed
make[3]: *** [xenvirt] Error 2
/home/ubuntu/dpdk-2.2.0-rc2/mk/rte.subdir.mk:61: recipe for target 'net' failed
make[2]: *** [net] Error 2
/home/ubuntu/dpdk-2.2.0-rc2/mk/rte.sdkbuild.mk:93: recipe for target
'drivers' failed
make[1]: *** [drivers] Error 2
/home/ubuntu/dpdk-2.2.0-rc2/mk/rte.sdkroot.mk:124: recipe for target
'all' failed
make: *** [all] Error 2

Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd


[dpdk-dev] [dpdk-dev, v2] igb_uio: fix igb_uio's access to pci_dev->msi_list for kernels >= 4.3

2015-12-02 Thread De Lara Guarch, Pablo
Hi,

Sorry about the spam, I was testing my mail server and sent it accidentally to 
the mailing list.

Pablo

> -Original Message-
> From: De Lara Guarch, Pablo
> Sent: Wednesday, December 02, 2015 3:03 PM
> To: De Lara Guarch, Pablo; dev at dpdk.org
> Cc: David Hunfdsfst; Davidfsdf Hunt
> Subject: [dpdk-dev, v2] igb_uio: fix igb_uio's access to pci_dev->msi_list for
> kernels >= 4.3
> 
> From: David Hunfdsfst 
> 
> Fix to take this change into account: https://lkml.org/lkml/2015/7/9/101
> Has been applied to Kernel 4.3.0-rc6
> 
> Linux: 4a7cc831 ("genirq/MSI: Move msi_list from struct pci_dev to
> struct device")
> 
> Signed-off-by: Davidfsdf Hunt 
> Acked-by: Pablo de Lara 
> 
> ---
> lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> index 3173e93..918861a 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> @@ -248,8 +248,13 @@ igbuio_pci_irqcontrol(struct uio_info *info, s32
> irq_state)
>   else if (udev->mode == RTE_INTR_MODE_MSIX) {
>   struct msi_desc *desc;
> 
> +#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 3, 0))
>   list_for_each_entry(desc, >msi_list, list)
>   igbuio_msix_mask_irq(desc, irq_state);
> +#else
> + list_for_each_entry(desc, >dev.msi_list, list)
> + igbuio_msix_mask_irq(desc, irq_state);
> +#endif
>   }
>   pci_cfg_access_unlock(pdev);
> 


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jianbo Liu
On 2 December 2015 at 00:41, Jerin Jacob  
wrote:
> On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
>> Adds ARM NEON support for lpm.
>> And enables table/pipeline libraries which depend on lpm.
>
> I already sent the patch on the same yesterday.
> We can converge the patches after the discussion.
> Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
>
Yes, I have read your patch. But there are many differences, so I sent
mine for your reviewing :)

>
>>
>> Signed-off-by: Jianbo Liu 
>> ---
>>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
>>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
>>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
>>  lib/librte_lpm/rte_lpm.h  | 68 
>> ---
>>  4 files changed, 77 insertions(+), 25 deletions(-)
>>
>> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
>> b/config/defconfig_arm-armv7a-linuxapp-gcc
>> index cbebd64..efffa1f 100644
>> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
>> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
>> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
>>  CONFIG_RTE_EAL_IGB_UIO=n
>>
>>  # fails to compile on ARM
>> -CONFIG_RTE_LIBRTE_LPM=n
>> -CONFIG_RTE_LIBRTE_TABLE=n
>> -CONFIG_RTE_LIBRTE_PIPELINE=n
>>  CONFIG_RTE_SCHED_VECTOR=n
>>
>>  # cannot use those on ARM
>> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
>> b/config/defconfig_arm64-armv8a-linuxapp-gcc
>> index 504f3ed..57f7941 100644
>> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
>> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
>> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
>>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
>>  CONFIG_RTE_LIBRTE_I40E_PMD=n
>>
>> -CONFIG_RTE_LIBRTE_LPM=n
>> -CONFIG_RTE_LIBRTE_TABLE=n
>> -CONFIG_RTE_LIBRTE_PIPELINE=n
>>  CONFIG_RTE_SCHED_VECTOR=n
>> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
>> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> index a33c054..7437711 100644
>> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
>> @@ -41,6 +41,8 @@ extern "C" {
>>
>>  typedef int32x4_t xmm_t;
>>
>> +typedef int32x4_t __m128i;
>> +
>>  #define  XMM_SIZE(sizeof(xmm_t))
>>  #define  XMM_MASK(XMM_SIZE - 1)
>>
>> @@ -53,6 +55,32 @@ typedef union rte_xmm {
>>   double   pd[XMM_SIZE / sizeof(double)];
>>  } __attribute__((aligned(16))) rte_xmm_t;
>>
>> +static __inline __m128i
>> +_mm_set_epi32(int i3, int i2, int i1, int i0)
>> +{
>> + int32_t r[4] = {i0, i1, i2, i3};
>> +
>> + return vld1q_s32(r);
>> +}
>> +
>> +static __inline __m128i
>> +_mm_loadu_si128(__m128i *p)
>> +{
>> + return vld1q_s32((int32_t *)p);
>> +}
>> +
>> +static __inline __m128i
>> +_mm_set1_epi32(int i)
>> +{
>> + return vdupq_n_s32(i);
>> +}
>> +
>> +static __inline __m128i
>> +_mm_and_si128(__m128i a, __m128i b)
>> +{
>> + return vandq_s32(a, b);
>> +}
>> +
>
> IMO, it makes sense to not emulate the SSE intrinsics with NEON
> Let's create the rte_vect_* as required. look at the existing patch.
>
I thought of creating a layer of SIMD over all the platforms before.
But can't you see it make things complicated, considering there are
only few simple intrinsic to implement?
If do so, we also need to explain to others how to use these interfaces.
Besides, this patch did the smallest changes to the original code, and
more likely to be accepted by others.

>
>>  #ifdef RTE_ARCH_ARM
>>  /* NEON intrinsic vqtbl1q_u8() is not supported in ARMv7-A(AArch32) */
>>  static __inline uint8x16_t
>> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
>> index c299ce2..c76c07d 100644
>> --- a/lib/librte_lpm/rte_lpm.h
>> +++ b/lib/librte_lpm/rte_lpm.h
>> @@ -361,6 +361,47 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, 
>> const uint32_t * ips,
>>  /* Mask four results. */
>>  #define   RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
>>
>> +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
>
> Separate out arm implementation to the different header file.
> Too many ifdef looks odd in the header file and difficult to manage.
>
But there are many ifdefs already.
And It seems unreasonable to add a new file only for one small function.

>
>> +static inline void
>> +rte_lpm_tbl24_val4(const struct rte_lpm *lpm, int32x4_t ip, uint16_t tbl[4])
>> +{
>> + uint32x4_t i24;
>> + uint32_t idx[4];
>> +
>> + /* get 4 indexes for tbl24[]. */
>> + i24 = vshrq_n_u32(vreinterpretq_u32_s32(ip), CHAR_BIT);
>> + vst1q_u32(idx, i24);
>> +
>> + /* extract values from tbl24[] */
>> + tbl[0] = *(const uint16_t *)>tbl24[idx[0]];
>> + tbl[1] = *(const uint16_t *)>tbl24[idx[1]];
>> + tbl[2] = *(const uint16_t *)>tbl24[idx[2]];
>> + tbl[3] = *(const uint16_t *)>tbl24[idx[3]];
>> +}
>
> Nice. There is an improvement in this portion code wrt my patch. This is
> a candidate for convergence.
>
>
>> +#else
>> +static 

[dpdk-dev] [dpdk-dev, v2] igb_uio: fix igb_uio's access to pci_dev->msi_list for kernels >= 4.3

2015-12-02 Thread Pablo de Lara
From: David Hunfdsfst 

Fix to take this change into account: https://lkml.org/lkml/2015/7/9/101
Has been applied to Kernel 4.3.0-rc6

Linux: 4a7cc831 ("genirq/MSI: Move msi_list from struct pci_dev to
struct device")

Signed-off-by: Davidfsdf Hunt 
Acked-by: Pablo de Lara 

---
lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c 
b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index 3173e93..918861a 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -248,8 +248,13 @@ igbuio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
else if (udev->mode == RTE_INTR_MODE_MSIX) {
struct msi_desc *desc;

+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 3, 0))
list_for_each_entry(desc, >msi_list, list)
igbuio_msix_mask_irq(desc, irq_state);
+#else
+   list_for_each_entry(desc, >dev.msi_list, list)
+   igbuio_msix_mask_irq(desc, irq_state);
+#endif
}
pci_cfg_access_unlock(pdev);



[dpdk-dev] [PATCH v4 2/2] examples: add pthread-shim in performance-thread sample app

2015-12-02 Thread ibetts
From: Ian Betts 

This commit adds a simple pthread_shim example for the
cooperative scheduler included with this patchset.

The shim demonstrates a way in which legacy code writtem for
pthreads could be adapted to lighweight threads.

Signed-off-by: Ian Betts 
---
 doc/guides/sample_app_ug/performance_thread.rst| 114 
 examples/performance-thread/Makefile   |   2 +
 examples/performance-thread/pthread_shim/Makefile  |  60 ++
 examples/performance-thread/pthread_shim/main.c| 284 
 .../performance-thread/pthread_shim/pthread_shim.c | 714 +
 .../performance-thread/pthread_shim/pthread_shim.h | 113 
 6 files changed, 1287 insertions(+)
 create mode 100644 examples/performance-thread/pthread_shim/Makefile
 create mode 100644 examples/performance-thread/pthread_shim/main.c
 create mode 100644 examples/performance-thread/pthread_shim/pthread_shim.c
 create mode 100644 examples/performance-thread/pthread_shim/pthread_shim.h

diff --git a/doc/guides/sample_app_ug/performance_thread.rst 
b/doc/guides/sample_app_ug/performance_thread.rst
index 6ea83cc..d71bb84 100644
--- a/doc/guides/sample_app_ug/performance_thread.rst
+++ b/doc/guides/sample_app_ug/performance_thread.rst
@@ -1102,6 +1102,120 @@ it the local data it needs, and pick up the new logical 
core specific values
 from pthread local storage at its new home.


+.. _pthread_shim:
+
+Pthread shim
+
+
+A convenient way to get something working with legacy code can be to use a
+shim that adapts pthread API calls to the corresponding L-thread ones.
+This approach will not mitigate any of the porting considerations mentioned
+in the previous sections, but it will reduce the amount of code churn that
+would otherwise been involved. It is a reasonable approach to evaluate
+L-threads, before investing effort in porting to the native L-thread APIs.
+
+
+Overview
+
+The L-thread subsystem includes an example pthread shim. This is a partial
+implementation but does contain the API stubs needed to get basic applications
+running. There is a simple "hello world" application that demonstrates the
+use of the pthread shim.
+
+A subtlety of working with a shim is that the application will still need
+to make use of the genuine pthread library functions, at the very least in
+order to create the EAL threads in which the L-thread schedulers will run.
+This is the case with DPDK initialization, and exit.
+
+To deal with the initialization and shutdown scenarios, the shim is capable of
+switching on or off its adaptor functionality, an application can control this
+behavior by the calling the function ``pt_override_set()``. The default state
+is disabled.
+
+The pthread shim uses the dynamic linker loader and saves the loaded addresses
+of the genuine pthread API functions in an internal table, when the shim
+functionality is enabled it performs the adaptor function, when disabled it
+invokes the genuine pthread function.
+
+The function ``pthread_exit()`` has additional special handling. The standard
+system header file pthread.h declares ``pthread_exit()`` with
+``__attribute__((noreturn))`` this is an optimization that is possible because
+the pthread is terminating and this enables the compiler to omit the normal
+handling of stack and protection of registers since the function is not
+expected to return, and in fact the thread is being destroyed. These
+optimizations are applied in both the callee and the caller of the
+``pthread_exit()`` function.
+
+In our cooperative scheduling environment this behavior is inadmissible. The
+pthread is the L-thread scheduler thread, and, although an L-thread is
+terminating, there must be a return to the scheduler in order that the system
+can continue to run. Further, returning from a function with attribute
+``noreturn`` is invalid and may result in undefined behavior.
+
+The solution is to redefine the ``pthread_exit`` function with a macro,
+causing it to be mapped to a stub function in the shim that does not have the
+``noreturn`` attribute. This macro is defined in the file
+``pthread_shim.h``. The stub function is otherwise no different than any of
+the other stub functions in the shim, and will switch between the real
+``pthread_exit()`` function or the ``lthread_exit()`` function as
+required. The only difference is that the mapping to the stub by macro
+substitution.
+
+A consequence of this is that the file ``pthread_shim.h`` must be included in
+legacy code wishing to make use of the shim. It also means that dynamic
+linkage of a pre-compiled binary that did not include pthread_shim.h is not be
+supported.
+
+Given the requirements for porting legacy code outlined in
+:ref:`porting_legacy_code_to_run_on_lthreads` most applications will require at
+least some minimal adjustment and recompilation to run on L-threads so
+pre-compiled binaries are unlikely to be met in practice.
+
+In summary the shim approach adds some overhead but 

[dpdk-dev] [PATCH v4 1/2] examples: add performance thread sample application

2015-12-02 Thread ibetts
From: Ian Betts 

This example comprises a layer 3 forwarding derivative intended to
facilitate characterization of performance with different
threading models, specifically:-

1. EAL threads running on different physical cores
2. EAL threads running on the same physical core
3. Lightweight threads running in an EAL thread

Purpose and justification

Since dpdk 2.0 it has been possible to assign multiple EAL threads to
a physical core ( case 2 above ).
Currently no example application has focused on demonstrating the
performance constraints of differing threading models.

Whilst purpose built applications that fully comprehend the DPDK
single threaded programming model will always yield superior
performance, the desire to preserve ROI in legacy code written for
multithreaded operating environments  makes lightweight threads
(case 3 above) worthy of consideration.

As well as aiding with legacy code reuse, it is anticipated that
lightweight threads will make it possible to scale a multithreaded
application with fine granularity allowing an application  to more
easily take advantage of headroom on EAL cores, or conversely occupy
more cores, as dictated by system load.

To explore performance with lightweight threads a simple cooperative
scheduler subsystem is being included in this example application.
If the expected benefits and use cases prove to be of value, it is
anticipated that this lightweight thread subsystem would become a
library in some future DPDK release.

Changes in this version:-
  * Copyright updated for 2015
  * fix TLS destructor handling

Signed-off-by: Ian Betts 
---
 config/common_linuxapp |1 +
 config/defconfig_i686-native-linuxapp-gcc  |1 +
 config/defconfig_i686-native-linuxapp-icc  |1 +
 config/defconfig_x86_64-native-linuxapp-gcc|3 +
 config/defconfig_x86_64-native-linuxapp-icc|3 +
 doc/guides/sample_app_ug/performance_thread.rst| 1149 ++
 examples/Makefile  |2 +
 examples/performance-thread/Makefile   |   45 +
 .../performance-thread/common/arch/x86/atomic.h|   59 +
 examples/performance-thread/common/arch/x86/ctx.c  |   93 +
 examples/performance-thread/common/arch/x86/ctx.h  |   57 +
 examples/performance-thread/common/common.mk   |   40 +
 examples/performance-thread/common/lthread.c   |  530 +++
 examples/performance-thread/common/lthread.h   |   99 +
 examples/performance-thread/common/lthread_api.h   |  829 +
 examples/performance-thread/common/lthread_cond.c  |  241 ++
 examples/performance-thread/common/lthread_cond.h  |   77 +
 examples/performance-thread/common/lthread_diag.c  |  321 ++
 examples/performance-thread/common/lthread_diag.h  |  129 +
 .../performance-thread/common/lthread_diag_api.h   |  319 ++
 examples/performance-thread/common/lthread_int.h   |  212 ++
 examples/performance-thread/common/lthread_mutex.c |  256 ++
 examples/performance-thread/common/lthread_mutex.h |   52 +
 .../performance-thread/common/lthread_objcache.h   |  160 +
 examples/performance-thread/common/lthread_pool.h  |  333 ++
 examples/performance-thread/common/lthread_queue.h |  303 ++
 examples/performance-thread/common/lthread_sched.c |  600 
 examples/performance-thread/common/lthread_sched.h |  152 +
 examples/performance-thread/common/lthread_timer.h |   47 +
 examples/performance-thread/common/lthread_tls.c   |  254 ++
 examples/performance-thread/common/lthread_tls.h   |   57 +
 examples/performance-thread/l3fwd-thread/Makefile  |   57 +
 examples/performance-thread/l3fwd-thread/main.c| 3641 
 33 files changed, 10123 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/performance_thread.rst
 create mode 100644 examples/performance-thread/Makefile
 create mode 100644 examples/performance-thread/common/arch/x86/atomic.h
 create mode 100644 examples/performance-thread/common/arch/x86/ctx.c
 create mode 100644 examples/performance-thread/common/arch/x86/ctx.h
 create mode 100644 examples/performance-thread/common/common.mk
 create mode 100644 examples/performance-thread/common/lthread.c
 create mode 100644 examples/performance-thread/common/lthread.h
 create mode 100644 examples/performance-thread/common/lthread_api.h
 create mode 100644 examples/performance-thread/common/lthread_cond.c
 create mode 100644 examples/performance-thread/common/lthread_cond.h
 create mode 100644 examples/performance-thread/common/lthread_diag.c
 create mode 100644 examples/performance-thread/common/lthread_diag.h
 create mode 100644 examples/performance-thread/common/lthread_diag_api.h
 create mode 100644 examples/performance-thread/common/lthread_int.h
 create mode 100644 examples/performance-thread/common/lthread_mutex.c
 create mode 100644 examples/performance-thread/common/lthread_mutex.h
 create mode 100644 examples/performance-thread/common/lthread_objcache.h
 create mode 100644 

[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree

2015-12-02 Thread Panu Matilainen
On 12/02/2015 01:25 PM, Thomas Monjalon wrote:
> 2015-12-02 12:27, Panu Matilainen:
>> On 12/02/2015 05:57 AM, Thomas Monjalon wrote:
>>> The old installed tree was static and always had .config, includes and
>>> libs in a RTE_TARGET subdirectory. There is no such directory anymore in
>>> an installed SDK. So the top directory is checked.
>>> But RTE_TARGET can still be used, especially to build an app with a
>>> compiled but not installed SDK.
>>> That's why both cases are looked for RTE_SDK_BIN.
> [...]
>>> The old usage of an installed SDK is:
>>>   make -C examples/helloworld RTE_SDK=$(readlink -m $DESTDIR) \
>>>RTE_TARGET=x86_64-native-linuxapp-gcc
>>> RTE_TARGET can be specified but is useless now with an installed SDK.
>>> The RTE_SDK directory must now point to a different path depending of
>>> the installation.
> [...]
>>> +   $(Q)$(call rte_mkdir,$(DESTDIR)$(sdkdir))
>>> +   $(Q)cp -a   $(BUILD_DIR)/.config $(DESTDIR)$(sdkdir)
>>> +   $(Q)cp -a   $(RTE_SDK)/{mk,scripts}  $(DESTDIR)$(sdkdir)
>>> +   $(Q)$(call rte_symlink, $(DESTDIR)$(includedir), 
>>> $(DESTDIR)$(sdkdir)/include)
>>> +   $(Q)$(call rte_symlink, $(DESTDIR)$(libdir), 
>>> $(DESTDIR)$(sdkdir)/lib)
>>
>> $(prefix)/share is supposed to be shareable across different
>> architectures. Most of the content here is, but at least the lib symlink
>> and .config file are not.
>
> The case you want to address is multilib 32/x32/64, right?

That, plus modern Debian/Ubuntu supports multiarch, not just -lib.

And then there's the pedantic side, ie to be in line with the FHS 
definition: 
http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA

>
>> One option is to install .config and the symlinks within $(sdkdir)/$(T)
>> directories, then it can be shared across architectures because each
>> lives in their own directory. Another possibility is moving the whole
>> sdk directory into a subdir in $(libdir), but that misses the
>> opportunity to share across architectures (whether anybody actually
>> cares is a whole other question :)
>
> Yes, I tried to remove the use of RTE_TARGET when building an example.
> But we can keep it with a subdirectory in $(sdkdir).

Just realized my suggestion $(sdkdir)/$(T) would not cut it because if 
T= is specified then this installation method wont be invoked at all :D

So yeah, RTE_TARGET. Or perhaps just RTE_ARCH. Dunno if there's actual 
added value to having the whole target string there, but I wont mind either.

>
>> $(sdkdir)/lib -> $(libdir) symlink seems reasonable when installing to
>> an empty staging root, but on a real-world installation it'd point to
>> /usr/lib(something) which has hundreds or thousands of other unrelated
>> libraries. My memory is hazy on details but I think this caused an
>> actual problem with something because I ended up $(sdkdir)/lib an actual
>> directory populated with symlinks to the individual DPDK libraries.
>
> I don't see the problem.
> I suggest to keep it and see how to fix it if an issue is raised.

The problem probably had to do with something external, like compiling 
OVS or pktgen, but ... this is too hand-wavy to worry about right now. 
Just wanted to mention it because I dont think I added the extra 
complexity in packaging just for fun.

- Panu -


[dpdk-dev] DPDK OVS on Ubuntu 14.04

2015-12-02 Thread Gray, Mark D
+ discuss at openvswitch.org

one comment below: 

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Polehn, Mike A
> Sent: Tuesday, December 1, 2015 2:46 PM
> To: Abhijeet Karve; dev at dpdk.org
> Cc: bhavya.addep at gmail.com
> Subject: Re: [dpdk-dev] DPDK OVS on Ubuntu 14.04
> 
> May need to setup huge pages on kernel boot line (this is example, you may
> need to adjust):
> 
> The huge page configuration can be added to the default configuration file
> /etc/default/grub by adding to the GRUB_CMDLINE_LINUX and the grub
> configuration file regenerated to get an updated configuration file for Linux
> boot.
> # vim /etc/default/grub// edit file
> 
> . . .
> GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1GB
> hugepagesz=1GB hugepages=4 hugepagesize=2m hugepages=2048 ..."
> . . .
> 
> 
> This example sets up huge pages for both 1 GB pages for 4 GB of 1 GB
> hugepage memory and 2 MB pages for 4 GB of 2 MB hugepage memory.
> After boot the number of 1 GB pages cannot be changed, but the number of
> 2 MB pages can be changed.
> 
> After editing configuration file /etc/default/grub , the new grub.cfg boot 
> file
> needs to be regenerated:
> # update-grub
> 
> And reboot. After reboot memory managers need to be setup:
> 
> If /dev/hugepages does not exist:# mkdir /dev/hugepages
> 
> # mount -t hugetlbfs nodev   /dev/hugepages
> 
> # mkdir /dev/hugepages_2mb
> # mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB
> 
> Mike
> 
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Abhijeet Karve
> Sent: Monday, November 30, 2015 10:14 PM
> To: dev at dpdk.org
> Cc: bhavya.addep at gmail.com
> Subject: [dpdk-dev] DPDK OVS on Ubuntu 14.04
> 
> Dear All,
> 
> 
> We are trying to install DPDK OVS on top of the openstack juno in Ubuntu
> 14.04 single server. We are referring following steps for executing the same.
> 
> https://software.intel.com/en-us/blogs/2015/06/09/building-vhost-user-for-
> ovs-today-using-dpdk-200
> 
> During execution we are getting some issues with ovs-vswitchd service as its
> getting hang during starting.
> __
> ___
> 
> nfv-dpdk at nfv-dpdk:~$ tail -f /var/log/openvswitch/ovs-vswitchd.log
> 2015-11-
> 24T10:54:34.036Z|6|reconnect|INFO|unix:/var/run/openvswitch/db.so
> ck:
> connecting...
> 2015-11-
> 24T10:54:34.036Z|7|reconnect|INFO|unix:/var/run/openvswitch/db.so
> ck:
> connected
> 2015-11-24T10:54:34.064Z|8|bridge|INFO|ovs-vswitchd (Open vSwitch)
> 2.4.90
> 2015-11-24T11:03:42.957Z|2|vlog|INFO|opened log file
> /var/log/openvswitch/ov
>  
> s-vswitchd.log 2015-11-
> 24T11:03:42.958Z|3|ovs_numa|INFO|Discovered 24 CPU cores on
> NUMA
> nod
> e 0
> 2015-11-24T11:03:42.958Z|4|ovs_numa|INFO|Discovered 24 CPU cores
> on NUMA
> nod
> e 1
> 2015-11-24T11:03:42.958Z|5|ovs_numa|INFO|Discovered 2 NUMA
> nodes and
> 48 CPU
>  cores
> 2015-11-
> 24T11:03:42.958Z|6|reconnect|INFO|unix:/var/run/openvswitch/db.so
> ck:
> connecting...
> 2015-11-
> 24T11:03:42.958Z|7|reconnect|INFO|unix:/var/run/openvswitch/db.so
> ck:
> connected
> 2015-11-24T11:03:42.961Z|8|bridge|INFO|ovs-vswitchd (Open vSwitch)
> 2.4.90
> __
> ___
> 
> Also, attaching output(Hugepage.txt) of  ? ./ovs-vswitchd --dpdk -c 0x0FF8 -n
> 4 --socket-mem 1024,0 -- --log-file=/var/log/openvswitch/ovs-vswitchd.log
> --pidfile=/var/run/oppenvswitch/ovs-vswitchd.pid?
> 
> -  We tried seting up echo 0 >
> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages, but couldn?t
> succeeded.
>   Can anyone please help us in getting the things if we are missing any 
> and
> causing ovs-vswitchd to stuck while starting?
> 
> Also, when we create vm in openstack with DPDK OVS, dpdkvhost-user type
> interfaces are getting created automatically. If this interfaces are getting
> mapped with regular br-int bridge rather than DPDK bridge br0 then is this

You can still have a bridge named br-int  that is backed with a userspace 
datapath. You can't add
a dpdkvhostuser port to a kernel space datapath. So in this case, I think you 
are ok and are
using DPDK.

> mean that we have successfully enabled DPDK with netdev datapath?
> 
> 
> 
> We really appreciate for all the advice if you have.
> 
> Thanks,
> Abhijeet
> Thanks & Regards
> Abhijeet Karve
> 
> =-=-=
> Notice: The information contained in this e-mail message and/or
> attachments to it may contain confidential or privileged information. If you
> are not the intended recipient, any dissemination, use, review, distribution,
> printing or copying of the information contained in this e-mail message
> and/or attachments to it are strictly prohibited. If you have received this
> communication in error, please 

[dpdk-dev] [PATCH v4 2/2] examples: add pthread-shim in performance-thread sample app

2015-12-02 Thread ibetts
From: Ian Betts 

This commit adds a simple pthread_shim example for the
cooperative scheduler included with this patchset.

The shim demonstrates a way in which legacy code writtem for
pthreads could be adapted to lighweight threads.

Signed-off-by: Ian Betts 
---
 doc/guides/sample_app_ug/performance_thread.rst| 114 
 examples/performance-thread/Makefile   |   2 +
 examples/performance-thread/pthread_shim/Makefile  |  60 ++
 examples/performance-thread/pthread_shim/main.c| 284 
 .../performance-thread/pthread_shim/pthread_shim.c | 714 +
 .../performance-thread/pthread_shim/pthread_shim.h | 113 
 6 files changed, 1287 insertions(+)
 create mode 100644 examples/performance-thread/pthread_shim/Makefile
 create mode 100644 examples/performance-thread/pthread_shim/main.c
 create mode 100644 examples/performance-thread/pthread_shim/pthread_shim.c
 create mode 100644 examples/performance-thread/pthread_shim/pthread_shim.h

diff --git a/doc/guides/sample_app_ug/performance_thread.rst 
b/doc/guides/sample_app_ug/performance_thread.rst
index 6ea83cc..d71bb84 100644
--- a/doc/guides/sample_app_ug/performance_thread.rst
+++ b/doc/guides/sample_app_ug/performance_thread.rst
@@ -1102,6 +1102,120 @@ it the local data it needs, and pick up the new logical 
core specific values
 from pthread local storage at its new home.


+.. _pthread_shim:
+
+Pthread shim
+
+
+A convenient way to get something working with legacy code can be to use a
+shim that adapts pthread API calls to the corresponding L-thread ones.
+This approach will not mitigate any of the porting considerations mentioned
+in the previous sections, but it will reduce the amount of code churn that
+would otherwise been involved. It is a reasonable approach to evaluate
+L-threads, before investing effort in porting to the native L-thread APIs.
+
+
+Overview
+
+The L-thread subsystem includes an example pthread shim. This is a partial
+implementation but does contain the API stubs needed to get basic applications
+running. There is a simple "hello world" application that demonstrates the
+use of the pthread shim.
+
+A subtlety of working with a shim is that the application will still need
+to make use of the genuine pthread library functions, at the very least in
+order to create the EAL threads in which the L-thread schedulers will run.
+This is the case with DPDK initialization, and exit.
+
+To deal with the initialization and shutdown scenarios, the shim is capable of
+switching on or off its adaptor functionality, an application can control this
+behavior by the calling the function ``pt_override_set()``. The default state
+is disabled.
+
+The pthread shim uses the dynamic linker loader and saves the loaded addresses
+of the genuine pthread API functions in an internal table, when the shim
+functionality is enabled it performs the adaptor function, when disabled it
+invokes the genuine pthread function.
+
+The function ``pthread_exit()`` has additional special handling. The standard
+system header file pthread.h declares ``pthread_exit()`` with
+``__attribute__((noreturn))`` this is an optimization that is possible because
+the pthread is terminating and this enables the compiler to omit the normal
+handling of stack and protection of registers since the function is not
+expected to return, and in fact the thread is being destroyed. These
+optimizations are applied in both the callee and the caller of the
+``pthread_exit()`` function.
+
+In our cooperative scheduling environment this behavior is inadmissible. The
+pthread is the L-thread scheduler thread, and, although an L-thread is
+terminating, there must be a return to the scheduler in order that the system
+can continue to run. Further, returning from a function with attribute
+``noreturn`` is invalid and may result in undefined behavior.
+
+The solution is to redefine the ``pthread_exit`` function with a macro,
+causing it to be mapped to a stub function in the shim that does not have the
+``noreturn`` attribute. This macro is defined in the file
+``pthread_shim.h``. The stub function is otherwise no different than any of
+the other stub functions in the shim, and will switch between the real
+``pthread_exit()`` function or the ``lthread_exit()`` function as
+required. The only difference is that the mapping to the stub by macro
+substitution.
+
+A consequence of this is that the file ``pthread_shim.h`` must be included in
+legacy code wishing to make use of the shim. It also means that dynamic
+linkage of a pre-compiled binary that did not include pthread_shim.h is not be
+supported.
+
+Given the requirements for porting legacy code outlined in
+:ref:`porting_legacy_code_to_run_on_lthreads` most applications will require at
+least some minimal adjustment and recompilation to run on L-threads so
+pre-compiled binaries are unlikely to be met in practice.
+
+In summary the shim approach adds some overhead but 

[dpdk-dev] [PATCH 3/3] maintainers: claim responsibility for arm64 specific files of hash and lpm

2015-12-02 Thread Jan Viktorin
On Mon, 30 Nov 2015 22:54:13 +0530
Jerin Jacob  wrote:

> Signed-off-by: Jerin Jacob 
> ---
>  MAINTAINERS | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4478862..dc8f80a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -130,6 +130,9 @@ ARM v8
>  M: Jerin Jacob 
>  F: lib/librte_eal/common/include/arch/arm/*_64.h
>  F: lib/librte_acl/acl_run_neon.*
> +F: lib/librte_lpm/rte_lpm_neon.h
> +F: lib/librte_hash/rte_crc_arm64.h
> +F: lib/librte_hash/rte_cmp_arm64.h

I can't see the librte_hash/* files in the patch set. Is it by mistake?

>  
>  EZchip TILE-Gx
>  M: Zhigang Lu 



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCH 2/3] lpm: add support for NEON

2015-12-02 Thread Jan Viktorin
On Mon, 30 Nov 2015 22:54:12 +0530
Jerin Jacob  wrote:

> enabled CONFIG_RTE_LIBRTE_LPM, CONFIG_RTE_LIBRTE_TABLE,
> CONFIG_RTE_LIBRTE_PIPELINE libraries for arm64.
> 
> TABLE, PIPELINE libraries were disabled due to LPM library dependency.
> 
> Signed-off-by: Jerin Jacob 
> ---
>  app/test/test_lpm.c|  10 +-
>  config/defconfig_arm64-armv8a-linuxapp-gcc |   3 -
>  lib/librte_lpm/Makefile|   3 +
>  lib/librte_lpm/rte_lpm.h   |   5 +
>  lib/librte_lpm/rte_lpm_neon.h  | 172 
> +
>  5 files changed, 185 insertions(+), 8 deletions(-)
>  create mode 100644 lib/librte_lpm/rte_lpm_neon.h
> 
> [snip]
>  
>  # this lib needs eal
>  DEPDIRS-$(CONFIG_RTE_LIBRTE_LPM) += lib/librte_eal
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> index c299ce2..12b75ce 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -361,6 +361,9 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const 
> uint32_t * ips,
>  /* Mask four results. */
>  #define   RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
>  
> +#if defined(RTE_ARCH_ARM64)
> +#include "rte_lpm_neon.h"
> +#else
>  /**
>   * Lookup four IP addresses in an LPM table.
>   *
> @@ -473,6 +476,8 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, 
> uint16_t hop[4],
>   hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
>  }
>  
> +#endif
> +

I would separate the SSE implementation into its own file as well.

Otherwise, I like this patch. I hope to be able to test it soon.

>  [snip]


-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCH 1/3] eal: introduce rte_vect_* abstractions

2015-12-02 Thread Jan Viktorin
On Mon, 30 Nov 2015 22:54:11 +0530
Jerin Jacob  wrote:

> introduce rte_vect_* abstractions to remove SSE/AVX specific
> code in the common code(i.e the test applications)
> 
> The patch does not provide any functional change for IA, the goal is to

Does IA mean Intel Architecture?

> have infrastructure to reuse the common vector-based test code across
> all the architectures.
> 
> Signed-off-by: Jerin Jacob 
> ---
>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 17 -
>  lib/librte_eal/common/include/arch/x86/rte_vect.h |  8 
>  2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> index 21cdb4d..d300951 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> @@ -33,13 +33,14 @@
>  #ifndef _RTE_VECT_ARM_H_
>  #define _RTE_VECT_ARM_H_
>  
> -#include "arm_neon.h"
> +#include 
>  
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
>  
>  typedef int32x4_t xmm_t;
> +typedef int32x4_t __m128i;

As Jianbo pointed out recently, the __m128i type should be refactored in
a general rte_vect API too. If we do something like

#if SSE
typedef __m128i rte_128i;
#elif NEON
typedef int32x4_y rte_128i;
#endif

does it make somebody angry? I am afraid that it will influence a lot of
code. However, from the ABI point of view, it is OK, isn't it?

>  
>  #define  XMM_SIZE(sizeof(xmm_t))
>  #define  XMM_MASK(XMM_SIZE - 1)
> @@ -53,6 +54,20 @@ typedef union rte_xmm {
>   double   pd[XMM_SIZE / sizeof(double)];
>  } __attribute__((aligned(16))) rte_xmm_t;
>  
> +/* rte_vect_* abstraction implementation using NEON */
> +
> +/* loads the __m128i value from address p(does not need to be 16-byte 
> aligned)*/
> +#define rte_vect_loadu_sil128(p) vld1q_s32((const int32_t *)p)
> +
> +/* sets the 4 signed 32-bit integer values and returns the __m128i variable 
> */
> +static inline __m128i  __attribute__((always_inline))
> +rte_vect_set_epi32(int i3, int i2, int i1, int i0)
> +{
> + int32_t data[4] = {i0, i1, i2, i3};
> +
> + return vld1q_s32(data);
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/common/include/arch/x86/rte_vect.h 
> b/lib/librte_eal/common/include/arch/x86/rte_vect.h
> index b698797..91c6523 100644
> --- a/lib/librte_eal/common/include/arch/x86/rte_vect.h
> +++ b/lib/librte_eal/common/include/arch/x86/rte_vect.h
> @@ -125,6 +125,14 @@ typedef union rte_ymm {
>  })
>  #endif /* (defined(__ICC) && __ICC < 1210) */
>  
> +/* rte_vect_* abstraction implementation using SSE */
> +
> +/* loads the __m128i value from address p(does not need to be 16-byte 
> aligned)*/
> +#define rte_vect_loadu_sil128(p) _mm_loadu_si128(p)
> +
> +/* sets the 4 signed 32-bit integer values and returns the __m128i variable 
> */
> +#define rte_vect_set_epi32(i3, i2, i1, i0) _mm_set_epi32(i3, i2, i1, i0)
> +
>  #ifdef __cplusplus
>  }
>  #endif

I like this approach. It is a question whether to inherit names from
SSE. However, why to reinvent the wheel...

We probably need other people to give their ideas about such
generalization of the API.

I think, there should be an autotest of the rte_vect API. Is it
possible to create one?

Regards
Jan

-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCH 0/3] add lpm support for NEON

2015-12-02 Thread Jan Viktorin
Hello Jerin,

thank you for this patch series. Please CC me next time when doing an
ARM-related changes. It took me a while to find the related e-mails on
the mail server.

On Mon, 30 Nov 2015 22:54:10 +0530
Jerin Jacob  wrote:

> - Introduce new rte_vect_* abstractions in eal
> - This patch set has the changes required for optimised pm library usage in 
> arm64 perspective
> - Tested on Juno and Thunder boards
> - Tested and verified the changes with following DPDK unit test cases
>   --lpm_autotest
>   --lpm6_autotest
> - This patch set has dependency on [dpdk-dev] [PATCH v4 0/2] disable 
> CONFIG_RTE_SCHED_VECTOR for arm

What kind of dependency is it? Functional?

> - With these changes, arm64 platform supports all DPDK libraries(in feature 
> wise)

Is there some ARMv8 specific NEON instruction?

> 
> Jerin Jacob (3):
>   eal: introduce rte_vect_* abstractions
>   lpm: add support for NEON
>   maintainers: claim responsibility for arm64 specific files of hash and
> lpm
> 
>  MAINTAINERS   |   3 +
>  app/test/test_lpm.c   |  10 +-
>  config/defconfig_arm64-armv8a-linuxapp-gcc|   3 -
>  lib/librte_eal/common/include/arch/arm/rte_vect.h |  17 ++-
>  lib/librte_eal/common/include/arch/x86/rte_vect.h |   8 +
>  lib/librte_lpm/Makefile   |   3 +
>  lib/librte_lpm/rte_lpm.h  |   5 +
>  lib/librte_lpm/rte_lpm_neon.h | 172 
> ++
>  8 files changed, 212 insertions(+), 9 deletions(-)
>  create mode 100644 lib/librte_lpm/rte_lpm_neon.h
> 
> --
> 2.1.0
> 



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCH] maintainers: claim responsability

2015-12-02 Thread Bruce Richardson
On Wed, Dec 02, 2015 at 01:06:20PM +, Sergio Gonzalez Monroy wrote:
> Claim responsability for:
> - Secondary Process as maintainer.
> - FreeBSD EAL, FreeBSD contigmem and FreeBSD UIO as co-maintainer.
> 
> Signed-off-by: Sergio Gonzalez Monroy 

Acked-by: Bruce Richardson 



[dpdk-dev] Does anybody know OpenDataPlane

2015-12-02 Thread Kury Nicolas
Hi!


Does anybody know OpenDataPlane ?  http://www.opendataplane.org/ It is a 
framework designed to enable software portability between networking SoCs, 
regardless of the underlying instruction set architecture. There are several 
implementations.

  *   OpenDataPlane using DPDK for Intel NIC
  *   OpenDataPlane using DPAA for Freescale platforms (QorIQ)
  *   OpenDataPlane using MCSDK for Texas Insturments platforms (KeyStone II)
  *   etc.

When a developer wants to port his application, he just needs to recompile it 
with the implementation of OpenDataPlane related to the new platform.


I'm doing my Master's Thesis on OpenDataPlane  and I have some questions.

- Now that OpenDataPlane (ODP) exists, schould every developpers start a new 
project with ODP or are there some reasons to still use DPDK ? What do you 
think ?


Thank you very much

Nicolas




[dpdk-dev] [PATCH] cxgbe: explictly mark this as pci_driver

2015-12-02 Thread Stephen Hemminger
The upcoming Hyper-V driver converts the pci_drv element
in struct eth_driver to a union.  When vmbus is added the
pci_drv needs to be explicit. Easier to fix the issue
ahead of time.

This is backwards compatiable with previous code.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/cxgbe/cxgbe_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index ec5d22b..97ef152 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -847,7 +847,7 @@ out_free_adapter:
 }

 static struct eth_driver rte_cxgbe_pmd = {
-   {
+   .pci_drv = {
.name = "rte_cxgbe_pmd",
.id_table = cxgb4_pci_tbl,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
-- 
2.1.4



[dpdk-dev] [PATCH 2/4] eal/acl: enable acl for armv7-a

2015-12-02 Thread Jianbo Liu
On 1 December 2015 at 22:46, Jan Viktorin  wrote:
> On Tue, 1 Dec 2015 20:13:49 +0530
> Jerin Jacob  wrote:
>
>> > enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;
>> >
>> > -#ifdef RTE_ARCH_ARM64
>> > +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
>> > alg =  RTE_ACL_CLASSIFY_NEON;
>>
>> I believe SIMD is optional in armv7. If true, select alg as
>> RTE_ACL_CLASSIFY_NEON only when cpufeature NEON enabled.
>
> Yes. Or, probably, we can be happy with
>
> #if defined(__ARM_NEON_FP)
> ...
> #endif
>
> as it is currently done in rte_memcpy_32.h.
>
> Regards
> Jan

Athough optional for armv7, I believe there is NEON in most of the
popular armv7a chips.
Anyway, I will add the checking...

Thanks!


[dpdk-dev] [PATCH v3] lib/librte_sched: Fix compile with gcc 4.3.4

2015-12-02 Thread Thomas Monjalon
2015-12-02 10:39, Michael Qiu:
> gcc 4.3.4 does not include "immintrin.h", and will post below error:
> lib/librte_sched/rte_sched.c:56:23: error:
> immintrin.h: No such file or directory
> 
> This compiler issue is fixed with rte_vect.h
> 
> There is another issue, need SSE2 support
> 
> Fixes: 42ec27a0178a ("sched: enable SSE optimizations in config")
> 
> Signed-off-by: Michael Qiu 

Applied, thanks


[dpdk-dev] [PATCH 3/3] rte_sched: eliminate floating point in calculating byte clock

2015-12-02 Thread Stephen Hemminger
On Wed, 2 Dec 2015 16:48:17 +
"Dumitrescu, Cristian"  wrote:

> 
> 
> > -Original Message-
> > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > Sent: Sunday, November 29, 2015 8:47 PM
> > To: Dumitrescu, Cristian 
> > Cc: dev at dpdk.org; Stephen Hemminger 
> > Subject: [PATCH 3/3] rte_sched: eliminate floating point in calculating byte
> > clock
> > 
> > The old code was doing a floating point divide for each rte_dequeue()
> > which is very expensive. Change to using fixed point scaled inverse
> > multiply. To maintain equivalent precision, scaled math is used.
> > The application ABI is the same.
> > 
> > This improved performance from 5Gbit/sec to 10 Gbit/sec when configured
> > for 10 Gbit/sec rate.
> > 
> > There was some feedback from Cristian that he wanted a better
> > solution and was going to give one, but none was provided.
> > For 2.2 this is a better solution than existing code, if someone
> > has a better version I would love to see it.
> > 
> > Signed-off-by: Stephen Hemminger 
> > ---
> >  lib/librte_sched/rte_sched.c | 23 ++-
> >  1 file changed, 18 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> > index 16acd6b..cfae136 100644
> > --- a/lib/librte_sched/rte_sched.c
> > +++ b/lib/librte_sched/rte_sched.c
> > @@ -47,6 +47,7 @@
> >  #include "rte_bitmap.h"
> >  #include "rte_sched_common.h"
> >  #include "rte_approx.h"
> > +#include "rte_reciprocal.h"
> > 
> >  #ifdef __INTEL_COMPILER
> >  #pragma warning(disable:2259) /* conversion may lose significant bits */
> > @@ -62,6 +63,11 @@
> >  #define RTE_SCHED_PIPE_INVALIDUINT32_MAX
> >  #define RTE_SCHED_BMP_POS_INVALID UINT32_MAX
> > 
> > +/* Scaling for cycles_per_byte calculation
> > + * Chosen so that minimum rate is 480 bit/sec
> > + */
> > +#define RTE_SCHED_TIME_SHIFT 8
> 
> Stephen, can you please elaborate why we need to shift the dividend at all 
> and why the shift value was picked as 8? Is 8 a hard constraint? How does 
> this affect the scheduling precision/accuracy?

The shift value is a tradeoff for scaled math. The bigger the shift
the finer the resolution, but at the risk of overflow in the cycles_per_byte.
The value was chosen as a tradeoff based on current CPU clock rate (TSC)
and minimum rate.

> > +
> >  struct rte_sched_subport {
> > /* Token bucket (TB) */
> > uint64_t tb_time; /* time of last update */
> > @@ -215,7 +221,7 @@ struct rte_sched_port {
> > uint64_t time_cpu_cycles; /* Current CPU time measured in CPU
> > cyles */
> > uint64_t time_cpu_bytes;  /* Current CPU time measured in bytes
> > */
> > uint64_t time;/* Current NIC TX time measured in bytes 
> > */
> > -   double cycles_per_byte;   /* CPU cycles per byte */
> > +   struct rte_reciprocal inv_cycles_per_byte; /* CPU cycles per byte */
> > 
> > /* Scheduling loop detection */
> > uint32_t pipe_loop;
> > @@ -610,7 +616,7 @@ struct rte_sched_port *
> >  rte_sched_port_config(struct rte_sched_port_params *params)
> >  {
> > struct rte_sched_port *port = NULL;
> > -   uint32_t mem_size, bmp_mem_size, n_queues_per_port, i;
> > +   uint32_t mem_size, bmp_mem_size, n_queues_per_port, i,
> > cycles_per_byte;
> > 
> > /* Check user parameters. Determine the amount of memory to
> > allocate */
> > mem_size = rte_sched_port_get_memory_footprint(params);
> > @@ -661,7 +667,10 @@ rte_sched_port_config(struct
> > rte_sched_port_params *params)
> > port->time_cpu_cycles = rte_get_tsc_cycles();
> > port->time_cpu_bytes = 0;
> > port->time = 0;
> > -   port->cycles_per_byte = ((double) rte_get_tsc_hz()) / ((double)
> > params->rate);
> > +
> > +   cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT)
> > +   / params->rate;
> > +   port->inv_cycles_per_byte = rte_reciprocal_value(cycles_per_byte);
> > 
> > /* Scheduling loop detection */
> > port->pipe_loop = RTE_SCHED_PIPE_INVALID;
> > @@ -2088,11 +2097,15 @@ rte_sched_port_time_resync(struct
> > rte_sched_port *port)
> >  {
> > uint64_t cycles = rte_get_tsc_cycles();
> > uint64_t cycles_diff = cycles - port->time_cpu_cycles;
> > -   double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
> > +   uint64_t bytes_diff;
> > +
> > +   /* Compute elapsed time in bytes */
> > +   bytes_diff = rte_reciprocal_divide(cycles_diff <<
> > RTE_SCHED_TIME_SHIFT,
> > +  port->inv_cycles_per_byte);
> > 
> > /* Advance port time */
> > port->time_cpu_cycles = cycles;
> > -   port->time_cpu_bytes += (uint64_t) bytes_diff;
> > +   port->time_cpu_bytes += bytes_diff;
> > if (port->time < port->time_cpu_bytes)
> > port->time = port->time_cpu_bytes;
> > 
> > --
> > 2.1.4
> 
> Can you provide some insight into how you tested this code and the test 
> vectors you used?

We tested with 10 gbit link and 

[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jan Viktorin
On Wed, 2 Dec 2015 16:18:13 +0530
Jerin Jacob  wrote:

> > > [snip]
> > 
> > My preference would also be to put architecture dependent implementation
> > into different files. 
> > Might be create lib/librte_lpm/arch/(arm|x86)/... here?
> > Konstantin
> 
> +1
> 
> my existing patch creates lib/librte_lpm/rte_lpm_neon.h instead
> of lib/librte_lpm/arch/arm/rte_lpm_neon.h like
> lib/librte_hash/rte_cmp_x86.h
> 
> I am OK for changing the directory structure as proposed in my next revision
> of patch.
> Let me know if anyone has any objections/concerns.
> 
> Jerin

I don't like the idea to have arch/... directory structure inside
libraries. I would delay such decision until there are really a big
number of different optimized implementations.

However, the rte_lpm_neon.h approach is OK from my point of view.

Jan

> > [snip]


[dpdk-dev] [PATCH 2/3] rte_sched: introduce reciprocal divide

2015-12-02 Thread Stephen Hemminger
On Wed, 2 Dec 2015 16:45:01 +
"Dumitrescu, Cristian"  wrote:
> + * * Neither the name of Intel Corporation nor the names of its
> 
> Why is Intel mentioned here, as according to this license header Intel is not 
> the copyright holder?

Copy/paste from other code.


> > +#ifndef _RTE_RECIPROCAL_H_
> > +#define _RTE_RECIPROCAL_H_
> > +
> > +struct rte_reciprocal {
> > +   uint32_t m;
> > +   uint8_t sh1, sh2;
> > +};
> 
> The size of this structure is not a multiple of 32 bits. You seem to transfer 
> this structure by value rather than by reference (the function 
> rte_reciprocal_value() below returns an instance of this structure), I don't 
> feel comfortable with the last 16 bits of the structure being left 
> uninitialized, we should probably add some explicit pad field and initialize 
> this structure explicitly to zero at init time?

Shouldn't matter for inline at all.

> 
> > +
> > +static inline uint32_t rte_reciprocal_divide(uint32_t a, struct 
> > rte_reciprocal
> > R)
> > +{
> > +   uint32_t t = (uint32_t)(((uint64_t)a * R.m) >> 32);
> > +
> > +   return (t + ((a - t) >> R.sh1)) >> R.sh2;
> > +}
> > +
> > +struct rte_reciprocal rte_reciprocal_value(uint32_t d);
> 
> Why 32-bit arithmetic? We had a lot of bugs in librte_sched library due to 
> 32-bit arithmetic that were particularly difficult to track. Can we have this 
> function rte_reciprocal_divide() return a 64-bit integer and replace any 
> 32-bit arithmetic/conversion with 64-bit operations?

Doing reciprocal divide by multiply requires a 2x temporary. So if it
used 64 bit math, it would require a 128 bit multiply. 


> > +
> > +#endif /* _RTE_RECIPROCAL_H_ */
> > --
> > 2.1.4
> 
> As previously discussed, a simpler/faster alternative to floating point 
> division is 64-bit multiplication followed by right shift, any particular 
> reason why this approach was not considered?

That is what this is. It is a 64 bit multiply (a * R.m) followed by a right 
shift.
The only other stuff is related to round off and scaling.

I chose to use known working algorithm rather than writing and having to
do mathematical validation of any new code.



[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree

2015-12-02 Thread Thomas Monjalon
2015-12-02 14:54, Panu Matilainen:
> On 12/02/2015 01:25 PM, Thomas Monjalon wrote:
> > 2015-12-02 12:27, Panu Matilainen:
> >> $(prefix)/share is supposed to be shareable across different
> >> architectures. Most of the content here is, but at least the lib symlink
> >> and .config file are not.
> >
> > The case you want to address is multilib 32/x32/64, right?
> 
> That, plus modern Debian/Ubuntu supports multiarch, not just -lib.

We do not support completely different platforms (e.g. ARM and x86)
with only one include directory. At the moment, only variants (32/64)
live together.

> >> One option is to install .config and the symlinks within $(sdkdir)/$(T)
> >> directories, then it can be shared across architectures because each
> >> lives in their own directory. Another possibility is moving the whole
> >> sdk directory into a subdir in $(libdir), but that misses the
> >> opportunity to share across architectures (whether anybody actually
> >> cares is a whole other question :)
> >
> > Yes, I tried to remove the use of RTE_TARGET when building an example.
> > But we can keep it with a subdirectory in $(sdkdir).
> 
> Just realized my suggestion $(sdkdir)/$(T) would not cut it because if 
> T= is specified then this installation method wont be invoked at all :D

I don't understand what you mean.
In my patchset, the installation is the same (except some default values)
with and without T=.

> So yeah, RTE_TARGET. Or perhaps just RTE_ARCH. Dunno if there's actual 
> added value to having the whole target string there, but I wont mind either.

RTE_TARGET is a safe choice for future.



[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jan Viktorin
On Wed, 2 Dec 2015 16:09:06 +0530
Jerin Jacob  wrote:

> > > [snip]
> > > IMO, it's not always good to emulate GCC defined intrinsics of
> > > other architecture. What if a legacy DPDK application has such mappings
> > > then BOOM, multiple definition, which one is correct? which one
> > > to comment it out? Integration pain starts for DPDK library consumer:-(
> > >  
> > They can include rte_vect.h in build/include directly, which is linked 
> > correctly
> > to the one for that ARCH, so there is no need to worry about.  
> 
> I think you missed the point,I was trying to say that
> legacy DPDK application and third party stacks uses SSE2NEON kind of
> libraries
> for quick integration, for example, something like this
> https://github.com/jratcliff63367/sse2neon/blob/master/SSE2NEON.h
> 
> AND they include "rte_lpm.h"(it internally includes rte_vect.h)
> that lead to multiple definition and its not good.
> 
> >
> >  
> > >> >
> > >> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
> > >> > Let's create the rte_vect_* as required. look at the existing patch.
> > >> >  
> > >> I thought of creating a layer of SIMD over all the platforms before.
> > >> But can't you see it make things complicated, considering there are
> > >> only few simple intrinsic to implement?  
> > >
> > > Not true, There were, a lot of SSE intrinsics needs be to emulated for 
> > > ACL NEON
> > > implementation if I were to take this approach and emulation comes with
> > > the cost.
> > >  
> > No, I will not re-implement all the intrinsic like that .
> > I only do with the simple intrinsic, such as load/store, as you said below. 
> >  
> 
> but you forced to add _mm_and_si128 also to the list and emulated
> _mm_and_si128 intrinsic. Am just saying no emulation.
> 

Guys, do we want emulate x86 on ARM? I hope we don't ;). I think, as
more platforms might come into DPDK, there will be a need for a proper
abstract vector operations API. Yes, we have to describe this API to
people. However, otherwise, the ARM guys must learn SSE and write for
ARM platform something that looks quite odd. And if there are some "neon
emulations" as shown above, it's definitely an argue to have the API
that can hide those approachs.

Regards
Jan


[dpdk-dev] [PATCH] examples/vhost: add rate statistics for rx/tx and core

2015-12-02 Thread Yuanhan Liu
On Wed, Dec 02, 2015 at 06:32:54AM +0800, Jianfeng Tan wrote:
> Currently, we only have aggregated statistics. This seems not
> obvious to show how fast rx/tx and how busy of each core.
> 
> This patch adds rx/tx rate of each period of option --stat.
> And also a simple core busy rate is added to show how many
> rounds are really processing packets in all rounds of
> circulation.
> 
> Besides, this fix the problem of statistics error under the
> case of software vm2vm fowarding.

Please, do not mix the fix in this patch. One patch should only
do one thing.

--yliu


[dpdk-dev] [PATCH] mk: Make XEN_PMD build in combined library mode

2015-12-02 Thread Christian Ehrhardt
Building RTE_LIBRTE_PMD_XENVIRT was broken when RTE_BUILD_COMBINE_LIBS was
enabled (http://dpdk.org/ml/archives/dev/2015-November/028660.html).
Now the underlying issue is rather simple, the xen code needs libxenstore.
But rte.app.mk so far only considered that when RTE_BUILD_COMBINE_LIBS was
disabled.
While it is correct to create the DPDK sublib linking only in the
RTE_BUILD_COMBINE_LIBS=n case, the libxenstore should be added to the linked
libs in any case if RTE_LIBRTE_PMD_XENVIRT is enabled.

Signed-off-by: Christian Ehrhardt 
---

[diffstat]
 rte.app.mk |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

[diff]
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 85a680d..f003187 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -113,6 +113,7 @@ endif # ! CONFIG_RTE_BUILD_SHARED_LIBS
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)  += -lz

 _LDLIBS-y += --start-group
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lxenstore

 ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),n)

@@ -130,7 +131,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)+= -lrte_cfgfile
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)   += -lrte_pmd_bond

 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lrte_pmd_xenvirt
-_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lxenstore

 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)


[dpdk-dev] Query on Filtering Support in DPDK

2015-12-02 Thread Rahul Lakkireddy
Hi Thomas,

On Monday, November 11/30/15, 2015 at 05:43:18 -0800, Thomas Monjalon wrote:
> Hi,
> 
> 2015-11-30 18:19, Rahul Lakkireddy:
> > 1. Add a new action 'switch' that will:
> >* Allow re-direction to different ports in hardware.
> > 
> >Also, for such a rule, additionally support below:
> > 
> >* Allow source mac/destination mac and vlan header re-writing to be
> >  done by the hardware.
> > 
> >* Allow re-write of TCP/IP headers to perform NAT in hardware.
> > 
> > 2. Add ability to mask individual fields at a particular layer for each
> >filter in flow_director. For example, mask all ip packets coming from
> >a particular subnet mask and particular range of l4 ports for each
> >filter rule.
> > 
> > We would like to get some suggestions on how to proceed with adding the
> > above features.
> 
> You need to identify which API must change and what will be the ABI changes.
> Then please send a deprecation notice before December 11 in order to be part
> of the 2.2 release notes.

I am currently identifying the various API changes to support this and
also the ABI changes if any.

> 
> If you have some RFC patches to send (at least the API changes), it would be
> a good discussion start.

I will try to post some RFC patches in 3-4 days time to get more
inputs/reviews on the approach.

Thanks,
Rahul.


[dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs

2015-12-02 Thread Jerin Jacob
On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
> On 2 December 2015 at 00:41, Jerin Jacob  
> wrote:
> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
> >> Adds ARM NEON support for lpm.
> >> And enables table/pipeline libraries which depend on lpm.
> >
> > I already sent the patch on the same yesterday.
> > We can converge the patches after the discussion.
> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
> >
> Yes, I have read your patch. But there are many differences, so I sent
> mine for your reviewing :)
> 
> >
> >>
> >> Signed-off-by: Jianbo Liu 
> >> ---
> >>  config/defconfig_arm-armv7a-linuxapp-gcc  |  3 -
> >>  config/defconfig_arm64-armv8a-linuxapp-gcc|  3 -
> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++
> >>  lib/librte_lpm/rte_lpm.h  | 68 
> >> ---
> >>  4 files changed, 77 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc 
> >> b/config/defconfig_arm-armv7a-linuxapp-gcc
> >> index cbebd64..efffa1f 100644
> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
> >>  CONFIG_RTE_EAL_IGB_UIO=n
> >>
> >>  # fails to compile on ARM
> >> -CONFIG_RTE_LIBRTE_LPM=n
> >> -CONFIG_RTE_LIBRTE_TABLE=n
> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> >>  CONFIG_RTE_SCHED_VECTOR=n
> >>
> >>  # cannot use those on ARM
> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
> >> b/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> index 504f3ed..57f7941 100644
> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
> >>
> >> -CONFIG_RTE_LIBRTE_LPM=n
> >> -CONFIG_RTE_LIBRTE_TABLE=n
> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> >>  CONFIG_RTE_SCHED_VECTOR=n
> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> >> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> index a33c054..7437711 100644
> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> >> @@ -41,6 +41,8 @@ extern "C" {
> >>
> >>  typedef int32x4_t xmm_t;
> >>
> >> +typedef int32x4_t __m128i;
> >> +
> >>  #define  XMM_SIZE(sizeof(xmm_t))
> >>  #define  XMM_MASK(XMM_SIZE - 1)
> >>
> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
> >>   double   pd[XMM_SIZE / sizeof(double)];
> >>  } __attribute__((aligned(16))) rte_xmm_t;
> >>
> >> +static __inline __m128i
> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
> >> +{
> >> + int32_t r[4] = {i0, i1, i2, i3};
> >> +
> >> + return vld1q_s32(r);
> >> +}
> >> +
> >> +static __inline __m128i
> >> +_mm_loadu_si128(__m128i *p)
> >> +{
> >> + return vld1q_s32((int32_t *)p);
> >> +}
> >> +
> >> +static __inline __m128i
> >> +_mm_set1_epi32(int i)
> >> +{
> >> + return vdupq_n_s32(i);
> >> +}
> >> +
> >> +static __inline __m128i
> >> +_mm_and_si128(__m128i a, __m128i b)
> >> +{
> >> + return vandq_s32(a, b);
> >> +}
> >> +

IMO, it's not always good to emulate GCC defined intrinsics of
other architecture. What if a legacy DPDK application has such mappings
then BOOM, multiple definition, which one is correct? which one
to comment it out? Integration pain starts for DPDK library consumer:-(

> >
> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
> > Let's create the rte_vect_* as required. look at the existing patch.
> >
> I thought of creating a layer of SIMD over all the platforms before.
> But can't you see it make things complicated, considering there are
> only few simple intrinsic to implement?

Not true, There were, a lot of SSE intrinsics needs be to emulated for ACL NEON
implementation if I were to take this approach and emulation comes with
the cost.

So my take is,
lets the each architecture implementation for specific SIMD version of DPDK
API in the library should have the freedom to implement the API in
NATIVE.

And let's create only rte_vect_* abstraction only for using
that API/library. Which boils down to have very minimal rte_vect_*
abstraction to load, store, set not beyond that.

This makes clear "contract" between DPDK library and the applications.
and make easy for remaning new architecture  porting effort in DPDK.

Imagine how your proposed function will look like if new architecture
wants to implement "optimized" version of rte_lpm_lookupx4


> If do so, we also need to explain to others how to use these interfaces.
> Besides, this patch did the smallest changes to the original code, and
> more likely to be accepted by others.

other patch makes no changes to IA version of rte_lpm_lookupx4.I thought
that make reviewer easy to review the changes in architecture
perspective.

> 
> >
> >>  #ifdef RTE_ARCH_ARM
> >>  /* 

[dpdk-dev] [PATCH] scripts: support any legal git revisions as abi validation range

2015-12-02 Thread Neil Horman
On Wed, Dec 02, 2015 at 06:50:47PM +0200, Panu Matilainen wrote:
> In addition to git tags, support validating abi between any legal
> gitrevisions(7) syntaxes, such as "validate-abi.sh . -1 "
> "validate-abi.sh master mybrach " etc in addition to
> validating between tags. Makes it easier to run the validator
> for in-development work.
> 
> Signed-off-by: Panu Matilainen 
Acked-by: Neil Horman 

> 


[dpdk-dev] [PATCH] maintainers: claim responsability

2015-12-02 Thread Sergio Gonzalez Monroy
Claim responsability for:
- Secondary Process as maintainer.
- FreeBSD EAL, FreeBSD contigmem and FreeBSD UIO as co-maintainer.

Signed-off-by: Sergio Gonzalez Monroy 
---
 MAINTAINERS | 4 
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4478862..51da877 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -116,6 +116,7 @@ F: examples/l2fwd-keepalive/
 F: doc/guides/sample_app_ug/keep_alive.rst

 Secondary process
+M: Sergio Gonzalez Monroy 
 K: RTE_PROC_
 F: doc/guides/prog_guide/multi_proc_support.rst
 F: app/test/test_mp_secondary.c
@@ -171,16 +172,19 @@ F: examples/vhost_xen/

 FreeBSD EAL (with overlaps)
 M: Bruce Richardson 
+M: Sergio Gonzalez Monroy 
 F: lib/librte_eal/bsdapp/Makefile
 F: lib/librte_eal/bsdapp/eal/
 F: doc/guides/freebsd_gsg/

 FreeBSD contigmem
 M: Bruce Richardson 
+M: Sergio Gonzalez Monroy 
 F: lib/librte_eal/bsdapp/contigmem/

 FreeBSD UIO
 M: Bruce Richardson 
+M: Sergio Gonzalez Monroy 
 F: lib/librte_eal/bsdapp/nic_uio/


-- 
2.4.3



[dpdk-dev] [PATCH] remove double semicolons

2015-12-02 Thread Stephen Hemminger
Trivial cleanup

Signed-off-by: Stephen Hemminger 
---
 drivers/net/e1000/igb_pf.c| 2 +-
 drivers/net/xenvirt/rte_xen_lib.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/e1000/igb_pf.c b/drivers/net/e1000/igb_pf.c
index 26c2960..1d00dda 100644
--- a/drivers/net/e1000/igb_pf.c
+++ b/drivers/net/e1000/igb_pf.c
@@ -229,7 +229,7 @@ set_rx_mode(struct rte_eth_dev *dev)

/* set all bits that we expect to always be set */
fctrl &= ~E1000_RCTL_SBP; /* disable store-bad-packets */
-   fctrl |= E1000_RCTL_BAM;;
+   fctrl |= E1000_RCTL_BAM;

/* clear the bits we are changing the status of */
fctrl &= ~(E1000_RCTL_UPE | E1000_RCTL_MPE);
diff --git a/drivers/net/xenvirt/rte_xen_lib.c 
b/drivers/net/xenvirt/rte_xen_lib.c
index 5900b53..3e97c1a 100644
--- a/drivers/net/xenvirt/rte_xen_lib.c
+++ b/drivers/net/xenvirt/rte_xen_lib.c
@@ -362,7 +362,7 @@ grant_node_create(uint32_t pg_num, uint32_t *gref_arr, 
phys_addr_t *pa_arr, char
uint32_t pg_shift;
void *ptr = NULL;
uint32_t count, entries_per_pg;
-   uint32_t i, j = 0, k = 0;;
+   uint32_t i, j = 0, k = 0;
uint32_t *gref_tmp;
int first = 1;
char tmp_str[PATH_MAX] = {0};
-- 
2.1.4



[dpdk-dev] 2.3 Roadmap

2015-12-02 Thread Bruce Richardson
On Tue, Dec 01, 2015 at 02:49:46PM -0500, Matthew Hall wrote:
> On Tue, Dec 01, 2015 at 01:57:39PM +, Bruce Richardson wrote:
> > Hi Matthew,
> > 
> > Couple of follow-up questions on this:
> > * do you need the exact same number of bits in both implementations? If we 
> > support
> > 21 bits of data in IPv6 and 24 in IPv4 is that an issue compared to 
> > supporting
> > 21 bits just in both for compatibility.
> > * related to this - how much data are you looking to store in the tables?
> > 
> > Thanks,
> > /Bruce
> 
> Let me provide some more detailed high level examples of some security use 
> cases so we could consider what makes sense.
> 
> 1) Spamhaus provides a list of approximately 800 CIDR blocks which are so 
> bad that they recommend null-routing them as widely as possible:
> 
> https://www.spamhaus.org/drop/
> https://www.spamhaus.org/drop/drop.txt
> https://www.spamhaus.org/drop/edrop.txt
> 
> In the old implementation I couldn't even fit all of those, and doing 
> something like this seems to be a must-have feature for security.
> 
> 2) Team Cymru provides lists of Bogons for IPv4 and IPv6. In IPv4, there are 
> 3600 bogon CIDR blocks because many things are in-use. But the IPv6 table has 
> 65000 CIDR blocks, because it is larger, newer, and more sparse.
> 
> http://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt
> http://www.team-cymru.org/Services/Bogons/fullbogons-ipv6.txt
> 
> Being able to monitor these would be another must-have for security and is 
> quite popular for core routing from what I have heard.
> 
> 3) At any given time, through various methods, I am aware of around 350,000 
> to 
> 2.5 million recent bad IP addresses. Technically single entries could be 
> matched using rte_hash. But it is quite common in the security world, to look 
> at the number of bad IPs in a class C, and then flag the entire subnet as 
> suspect if more than a few bad IPs are present there.
> 
> Some support for some level of this is a must-have for security and firewall 
> use cases.
> 
> 4) Of course, it goes without saying that fitting the contents of the entire 
> Internet BGP prefix list for IPv4 and IPv6 is a must-have for core routing 
> although less needed for security. I am not an expert in this. Some very 
> basic 
> statistics I located with a quick search suggest one needs about 600,000 
> prefixes (presumably for IPv4). It would help if some router experts could 
> clarify it and help me know what the story is for IPv6.
> 
> http://www.cidr-report.org/as2.0/#General_Status
> 
> 5) Considering all of the above, it seems like 22 or 23 unsigned lookup bits 
> are required (4194304 or 8388608 entries) if you want comprehensive bad IP 
> detection. And probably 21 unsigned bits for basic security support. But that 
> would not necessarily leave a whole lot of headroom depending on the details.
> 
> Matthew.

Hi Matthew,

thanks for the info, but I'm not sure I understand it correctly. It seems to
me that you are mostly referring to the depths/sizes of the tables being used,
rather than to the "data-size" being stored in each entry, which was actually
what I was asking about. Is that correct? If so, it seems that - looking 
initially
at IPv4 LPM only - you are more looking for an increase in the number of tbl8's
for lookup, rather than necessarily an increase the 8-bit user data being stored
with each entry. [And assuming similar interest for v6] Am I right in 
thinking this?

Thanks,
/Bruce


[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree

2015-12-02 Thread Panu Matilainen
On 12/02/2015 05:57 AM, Thomas Monjalon wrote:
> The rule "install" follows these conventions:
> https://www.gnu.org/prep/standards/html_node/Directory-Variables.html
> https://www.gnu.org/prep/standards/html_node/DESTDIR.html
>
> The variable sdkdir has been added to the more standards ones,
> to configure the directory used with RTE_SDK when using the DPDK makefiles
> to build an application.
>
> The old installed tree was static and always had .config, includes and
> libs in a RTE_TARGET subdirectory. There is no such directory anymore in
> an installed SDK. So the top directory is checked.
> But RTE_TARGET can still be used, especially to build an app with a
> compiled but not installed SDK.
> That's why both cases are looked for RTE_SDK_BIN.
>
> The default prefix /usr/local is empty in the T= case which is
> used only for a local install.
> It is still possible to build DPDK with the "install T=" rule without
> specifying any DESTDIR. In such case there is no install, as before.
>
> The old usage of an installed SDK is:
>  make -C examples/helloworld RTE_SDK=$(readlink -m $DESTDIR) \
>   RTE_TARGET=x86_64-native-linuxapp-gcc
> RTE_TARGET can be specified but is useless now with an installed SDK.
> The RTE_SDK directory must now point to a different path depending of
> the installation.
>
> Signed-off-by: Thomas Monjalon 
> ---
[...]
> @@ -32,10 +33,30 @@
>   # Build directory is given with O=
>   O ?= .
>
> +prefix  ?= /usr/local
> +exec_prefix ?=  $(prefix)
> +bindir  ?= $(exec_prefix)/bin
> +libdir  ?= $(exec_prefix)/lib
> +includedir  ?=  $(prefix)/include/dpdk
> +datarootdir ?=  $(prefix)/share
> +datadir ?=   $(datarootdir)/dpdk
> +sdkdir  ?= $(datadir)
> +
> +# The install directories may be staged in DESTDIR
[...]
> + @echo == Installing $(DESTDIR)$(prefix)/
> + $(Q)$(call rte_mkdir, $(DESTDIR)$(libdir))
> + $(Q)cp -a $(BUILD_DIR)/lib/* $(DESTDIR)$(libdir)
> + $(Q)$(call rte_mkdir, $(DESTDIR)$(bindir))
> + $(Q)tar -cf -  -C $(BUILD_DIR) app  --exclude 'app/*.map' \
> + --exclude 'app/cmdline*' --exclude app/test \
> + --exclude app/testacl --exclude app/testpipeline | \
> + tar -xf -  -C $(DESTDIR)$(bindir) --strip-components=1 \
> + --keep-newer-files --warning=no-ignore-newer
> + $(Q)$(call rte_mkdir,  $(DESTDIR)$(datadir))
> + $(Q)cp -a $(RTE_SDK)/tools $(DESTDIR)$(datadir)
> + $(Q)$(call rte_mkdir, $(DESTDIR)$(includedir))
> + $(Q)tar -chf - -C $(BUILD_DIR) include | \
> + tar -xf -  -C $(DESTDIR)$(includedir) --strip-components=1 \
> + --keep-newer-files --warning=no-ignore-newer
> + $(Q)$(call rte_mkdir,$(DESTDIR)$(sdkdir))
> + $(Q)cp -a   $(BUILD_DIR)/.config $(DESTDIR)$(sdkdir)
> + $(Q)cp -a   $(RTE_SDK)/{mk,scripts}  $(DESTDIR)$(sdkdir)
> + $(Q)$(call rte_symlink, $(DESTDIR)$(includedir), 
> $(DESTDIR)$(sdkdir)/include)
> + $(Q)$(call rte_symlink, $(DESTDIR)$(libdir), 
> $(DESTDIR)$(sdkdir)/lib)
> + @echo Installation in $(DESTDIR)$(prefix)/ complete
> +endif

$(prefix)/share is supposed to be shareable across different 
architectures. Most of the content here is, but at least the lib symlink 
and .config file are not.

One option is to install .config and the symlinks within $(sdkdir)/$(T) 
directories, then it can be shared across architectures because each 
lives in their own directory. Another possibility is moving the whole 
sdk directory into a subdir in $(libdir), but that misses the 
opportunity to share across architectures (whether anybody actually 
cares is a whole other question :)

$(sdkdir)/lib -> $(libdir) symlink seems reasonable when installing to 
an empty staging root, but on a real-world installation it'd point to 
/usr/lib(something) which has hundreds or thousands of other unrelated 
libraries. My memory is hazy on details but I think this caused an 
actual problem with something because I ended up $(sdkdir)/lib an actual 
directory populated with symlinks to the individual DPDK libraries.

- Panu -


[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree

2015-12-02 Thread Thomas Monjalon
2015-12-02 12:27, Panu Matilainen:
> On 12/02/2015 05:57 AM, Thomas Monjalon wrote:
> > The old installed tree was static and always had .config, includes and
> > libs in a RTE_TARGET subdirectory. There is no such directory anymore in
> > an installed SDK. So the top directory is checked.
> > But RTE_TARGET can still be used, especially to build an app with a
> > compiled but not installed SDK.
> > That's why both cases are looked for RTE_SDK_BIN.
[...]
> > The old usage of an installed SDK is:
> >  make -C examples/helloworld RTE_SDK=$(readlink -m $DESTDIR) \
> >   RTE_TARGET=x86_64-native-linuxapp-gcc
> > RTE_TARGET can be specified but is useless now with an installed SDK.
> > The RTE_SDK directory must now point to a different path depending of
> > the installation.
[...]
> > +   $(Q)$(call rte_mkdir,$(DESTDIR)$(sdkdir))
> > +   $(Q)cp -a   $(BUILD_DIR)/.config $(DESTDIR)$(sdkdir)
> > +   $(Q)cp -a   $(RTE_SDK)/{mk,scripts}  $(DESTDIR)$(sdkdir)
> > +   $(Q)$(call rte_symlink, $(DESTDIR)$(includedir), 
> > $(DESTDIR)$(sdkdir)/include)
> > +   $(Q)$(call rte_symlink, $(DESTDIR)$(libdir), 
> > $(DESTDIR)$(sdkdir)/lib)
> 
> $(prefix)/share is supposed to be shareable across different 
> architectures. Most of the content here is, but at least the lib symlink 
> and .config file are not.

The case you want to address is multilib 32/x32/64, right?

> One option is to install .config and the symlinks within $(sdkdir)/$(T) 
> directories, then it can be shared across architectures because each 
> lives in their own directory. Another possibility is moving the whole 
> sdk directory into a subdir in $(libdir), but that misses the 
> opportunity to share across architectures (whether anybody actually 
> cares is a whole other question :)

Yes, I tried to remove the use of RTE_TARGET when building an example.
But we can keep it with a subdirectory in $(sdkdir).

> $(sdkdir)/lib -> $(libdir) symlink seems reasonable when installing to 
> an empty staging root, but on a real-world installation it'd point to 
> /usr/lib(something) which has hundreds or thousands of other unrelated 
> libraries. My memory is hazy on details but I think this caused an 
> actual problem with something because I ended up $(sdkdir)/lib an actual 
> directory populated with symlinks to the individual DPDK libraries.

I don't see the problem.
I suggest to keep it and see how to fix it if an issue is raised.



[dpdk-dev] [PATCH] remove blank lines at end-of-file

2015-12-02 Thread Stephen Hemminger
This is one of those trivial things git and other tools complain
about.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/i40e/i40e_rxtx.c| 1 -
 drivers/net/vmxnet3/base/includeCheck.h | 1 -
 lib/librte_pipeline/rte_pipeline.c  | 1 -
 3 files changed, 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 13abd67..c9eca42 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -3270,4 +3270,3 @@ i40e_xmit_pkts_vec(void __rte_unused *tx_queue,
 {
return 0;
 }
-
diff --git a/drivers/net/vmxnet3/base/includeCheck.h 
b/drivers/net/vmxnet3/base/includeCheck.h
index 16308d2..310cebe 100644
--- a/drivers/net/vmxnet3/base/includeCheck.h
+++ b/drivers/net/vmxnet3/base/includeCheck.h
@@ -37,4 +37,3 @@
 #include "vmxnet3_osdep.h"

 #endif /* _INCLUDECHECK_H */
-
diff --git a/lib/librte_pipeline/rte_pipeline.c 
b/lib/librte_pipeline/rte_pipeline.c
index 56022f4..d625fd2 100644
--- a/lib/librte_pipeline/rte_pipeline.c
+++ b/lib/librte_pipeline/rte_pipeline.c
@@ -1636,4 +1636,3 @@ int rte_pipeline_table_stats_read(struct rte_pipeline *p, 
uint32_t table_id,

return 0;
 }
-
-- 
2.1.4



[dpdk-dev] [PATCH 07/10] mk: install binding tool in sbin directory

2015-12-02 Thread Panu Matilainen
On 12/02/2015 05:57 AM, Thomas Monjalon wrote:
> sbin/dpdk_nic_bind is a symbolic link to tools/dpdk_nic_bind.py
> where some python objects may be generated.
>
> Signed-off-by: Thomas Monjalon 
> ---
>   mk/rte.sdkinstall.mk | 4 
>   1 file changed, 4 insertions(+)
>
> diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk
> index 46253ff..d6df30c 100644
> --- a/mk/rte.sdkinstall.mk
> +++ b/mk/rte.sdkinstall.mk
> @@ -38,6 +38,7 @@ prefix  ?= /usr/local
>   exec_prefix ?=  $(prefix)
>   kerneldir   ?= $(exec_prefix)/kmod
>   bindir  ?= $(exec_prefix)/bin
> +sbindir ?= $(exec_prefix)/sbin
>   libdir  ?= $(exec_prefix)/lib
>   includedir  ?=  $(prefix)/include/dpdk
>   datarootdir ?=  $(prefix)/share
> @@ -106,6 +107,9 @@ install-runtime:
>   --keep-newer-files --warning=no-ignore-newer
>   $(Q)$(call rte_mkdir,  $(DESTDIR)$(datadir))
>   $(Q)cp -a $(RTE_SDK)/tools $(DESTDIR)$(datadir)
> + $(Q)$(call rte_mkdir,  $(DESTDIR)$(sbindir))
> + $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/dpdk_nic_bind.py, \
> +$(DESTDIR)$(sbindir)/dpdk_nic_bind)
>
>   install-kmod:
>   ifneq '$(wildcard $O/kmod/*)' ''
>

This symlink is broken, it expects dpdk_nic_bind.py to reside in 
$(datadir) root when it actually is in $(datadir)/tools/

Other than that, getting rid of the .py suffix is a nice touch.

- Panu -


[dpdk-dev] [PATCH 06/10] mk: install kernel modules

2015-12-02 Thread Panu Matilainen
On 12/02/2015 05:57 AM, Thomas Monjalon wrote:
> Add kernel modules to "make install".
> Nothing is done if there is no kernel module compiled.
>
> On native Linux, this path is suggested:
>   kerneldir=/lib/modules/$(uname -r)/extra/dpdk
>
> Suggested-by: Mario Carrillo 
> Signed-off-by: Thomas Monjalon 
> ---
>   mk/rte.sdkinstall.mk | 8 
>   1 file changed, 8 insertions(+)
>
> diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk
> index 5585974..46253ff 100644
> --- a/mk/rte.sdkinstall.mk
> +++ b/mk/rte.sdkinstall.mk
> @@ -36,6 +36,7 @@ BUILD_DIR := $O
>
>   prefix  ?= /usr/local
>   exec_prefix ?=  $(prefix)
> +kerneldir   ?= $(exec_prefix)/kmod
>   bindir  ?= $(exec_prefix)/bin
>   libdir  ?= $(exec_prefix)/lib
>   includedir  ?=  $(prefix)/include/dpdk
> @@ -89,6 +90,7 @@ ifeq '$(DESTDIR)$(if $T,,+)' ''
>   else
>   @echo == Installing $(DESTDIR)$(prefix)/
>   $(Q)$(MAKE) O=$(BUILD_DIR) install-runtime
> + $(Q)$(MAKE) O=$(BUILD_DIR) install-kmod
>   $(Q)$(MAKE) O=$(BUILD_DIR) install-sdk
>   @echo Installation in $(DESTDIR)$(prefix)/ complete
>   endif
> @@ -105,6 +107,12 @@ install-runtime:
>   $(Q)$(call rte_mkdir,  $(DESTDIR)$(datadir))
>   $(Q)cp -a $(RTE_SDK)/tools $(DESTDIR)$(datadir)
>
> +install-kmod:
> +ifneq '$(wildcard $O/kmod/*)' ''
> + $(Q)$(call rte_mkdir, $(DESTDIR)$(kerneldir))
> + $(Q)cp -a   $O/kmod/* $(DESTDIR)$(kerneldir)
> +endif
> +
>   install-sdk:
>   $(Q)$(call rte_mkdir, $(DESTDIR)$(includedir))
>   $(Q)tar -chf - -C $O include | \
>

This by default installs the modules to /usr/local/kmod/ with no kernel 
version etc. That's so broken that it'd be better not to install them at 
all.

So either get the kerneldir right (the correct path is known on Linux 
and surely BSD too) or dont install them at all unless kerneldir is 
manually specified. For Linux, it should default to 
/lib/modules//extra/dpdk on Linux, where  is the 
version those modules were built against (which might or might not have 
anything to do with uname -r output).

- Panu -


[dpdk-dev] [PATCH] app/test: fix memory_autotest integer overflow/wraparound

2015-12-02 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez
> Monroy
> Sent: Tuesday, November 17, 2015 3:39 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] app/test: fix memory_autotest integer
> overflow/wraparound
> 
> memory_autotest loops infinitely when at least one the memsegs
> is bigger than 4GB.
> 
> The issue is the result of an integer overflow/wraparound of
> the offset variable.
> 
> Fix it by using the correct type (size_t).
> 
> Signed-off-by: Sergio Gonzalez Monroy
> 

Acked-by: Pablo de Lara 


[dpdk-dev] [PATCH 00/10] standard make install

2015-12-02 Thread Panu Matilainen
On 12/02/2015 11:25 AM, Thomas Monjalon wrote:
> 2015-12-02 09:44, Panu Matilainen:
>> That aside, a bigger problem is that it doesn't seem to work.
>>
>> make clean
>> make config T=x86_64-native-linuxapp-gcc
>> make
>> make install DESTDIR=/tmp/dpdk-root
>
> Oh, I forgot to test the simple case where O= is not specified!
>
> It should be fixed with this change:
>

Okay, that helped a bunch :)

Now that I can actually test it, seems mostly ok to me. As for the rest, 
I'll comment on the specific patches.

- Panu -


[dpdk-dev] Aligning net/ethernet.h and rte_ether.h

2015-12-02 Thread Stephen Hemminger
The two header files net/ethenet.h and rte_ether.h are source
incompatiable right now. They both define a bunch of constants
and struct ether_addr; the effective values are the same but
the structure element name is different.

/usr/include/net/ether.h
/* This is a name for the 48 bit ethernet address available on many
   systems.  */
struct ether_addr
{
  u_int8_t ether_addr_octet[ETH_ALEN];
} __attribute__ ((__packed__));

lib/librte_ether/rte_ether.h
struct ether_addr {
uint8_t addr_bytes[ETHER_ADDR_LEN]; /**< Address bytes in transmission 
order */
} __attribute__((__packed__));


I would like to just have rte_ether.h include netinet/ether.h
to get rid of the useless duplication, and fix all the code in DPDK.
But this will break out-of-tree source compatibility so best to
wait for DPDK 2.3. Is there a good place to put this in 2.2 release notes?


[dpdk-dev] [PATCH] bnx2x: tx_start_bd->vlan_or_ethertype is le16

2015-12-02 Thread Thomas Monjalon
2015-12-02 05:18, Charles  Williams:
> On Wed, 2015-12-02 at 02:04 +0100, Thomas Monjalon wrote:
> > 2015-12-01 18:58, Charles  Williams:
> > > On Wed, 2015-12-02 at 00:34 +0100, Thomas Monjalon wrote:
> > > > 2015-12-01 14:37, Stephen Hemminger:
> > > > > Harish Patil  wrote:
> > > > > > >2015-11-03 12:26, Chas Williams:  
> > > > > > >> --- a/drivers/net/bnx2x/bnx2x.c
> > > > > > >> +++ b/drivers/net/bnx2x/bnx2x.c
> > > > > > >> -tx_start_bd->vlan_or_ethertype = 
> > > > > > >> eh->ether_type;
> > > > > > >> +tx_start_bd->vlan_or_ethertype
> > > > > > >> += 
> > > > > > >> rte_cpu_to_le_16(rte_be_to_cpu_16(eh->ether_type));
> > > > > > 
> > > > > > Minor question - any specific reason to use rte_be_to_cpu_16() on
> > > > > > ether_type alone before converting from native order to le16?
> > > > > 
> > > > > ether_type is in network byte order (big endian)
> > > > > and hardware wants little endian. On x86 the second step is a nop.
> > > > 
> > > > Doesn't it deserve a macro rte_ntole16()?
> > > > It may be in lib/librte_eal/common/include/generic/rte_byteorder.h.
> > > 
> > > I looked I didn't see anything.  This value, according to the linux
> > > driver, wants to be little endian regardless of the host endian.
> > 
> > Yes, that's why I suggest to create some macros to do this kind of 
> > conversion.
> > Example: rte_ntole16 means "network to little endian 16-bit".
> > Do you think it would be clearer to use?
> 
> This is the only example of this kind of conversion in the source code
> so it would be a macro for one user.  If you create rte_ntole16() you
> might feel obligated to create the various permutations for which there
> are no consumers.

Yes, that's why I was not sure of the interest.



[dpdk-dev] [PATCH 4/4] vhost: enable log_shmfd protocol feature

2015-12-02 Thread Yuanhan Liu
To claim that we support vhost-user live migration support:
SET_LOG_BASE request will be send only when this feature flag
is set.

Besides this flag, we actually need another feature flag set
to make vhost-user live migration work: VHOST_F_LOG_ALL.
Which, however, has been enabled long time ago.

Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_user/virtio-net-user.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h 
b/lib/librte_vhost/vhost_user/virtio-net-user.h
index 013cf38..a3a889d 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -38,8 +38,10 @@
 #include "vhost-net-user.h"

 #define VHOST_USER_PROTOCOL_F_MQ   0
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD1

-#define VHOST_USER_PROTOCOL_FEATURES   (1ULL << VHOST_USER_PROTOCOL_F_MQ)
+#define VHOST_USER_PROTOCOL_FEATURES   ((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
+(1ULL << 
VHOST_USER_PROTOCOL_F_LOG_SHMFD))

 int user_set_mem_table(struct vhost_device_ctx, struct VhostUserMsg *);

-- 
1.9.0



[dpdk-dev] [PATCH 3/4] vhost: log vring changes

2015-12-02 Thread Yuanhan Liu
Invoking vhost_log_write() to mark corresponding page as dirty while
updating used vring.

Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_rxtx.c | 74 +--
 1 file changed, 50 insertions(+), 24 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 9322ce6..d4805d8 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -129,6 +129,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
uint32_t offset = 0, vb_offset = 0;
uint32_t pkt_len, len_to_cpy, data_len, total_copied = 0;
uint8_t hdr = 0, uncompleted_pkt = 0;
+   uint16_t idx;

/* Get descriptor from available ring */
desc = >desc[head[packet_success]];
@@ -200,16 +201,22 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
}

/* Update used ring with desc information */
-   vq->used->ring[res_cur_idx & (vq->size - 1)].id =
-   head[packet_success];
+   idx = res_cur_idx & (vq->size - 1);
+   vq->used->ring[idx].id = head[packet_success];

/* Drop the packet if it is uncompleted */
if (unlikely(uncompleted_pkt == 1))
-   vq->used->ring[res_cur_idx & (vq->size - 1)].len =
-   vq->vhost_hlen;
+   vq->used->ring[idx].len = vq->vhost_hlen;
else
-   vq->used->ring[res_cur_idx & (vq->size - 1)].len =
-   pkt_len + 
vq->vhost_hlen;
+   vq->used->ring[idx].len = pkt_len + vq->vhost_hlen;
+
+   /*
+* to defer the update to when updating used->idx,
+* and batch them?
+*/
+   vhost_log_write(dev, vq,
+   offsetof(struct vring_used, ring[idx]),
+   sizeof(vq->used->ring[idx]));

res_cur_idx++;
packet_success++;
@@ -236,6 +243,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,

*(volatile uint16_t *)>used->idx += count;
vq->last_used_idx = res_end_idx;
+   vhost_log_write(dev, vq,
+   offsetof(struct vring_used, idx),
+   sizeof(vq->used->idx));

/* flush used->idx update before we read avail->flags. */
rte_mb();
@@ -265,6 +275,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t 
queue_id,
uint32_t seg_avail;
uint32_t vb_avail;
uint32_t cpy_len, entry_len;
+   uint16_t idx;

if (pkt == NULL)
return 0;
@@ -302,16 +313,18 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t 
queue_id,
entry_len = vq->vhost_hlen;

if (vb_avail == 0) {
-   uint32_t desc_idx =
-   vq->buf_vec[vec_idx].desc_idx;
+   uint32_t desc_idx = vq->buf_vec[vec_idx].desc_idx;
+
+   if ((vq->desc[desc_idx].flags & VRING_DESC_F_NEXT) == 0) {
+   idx = cur_idx & (vq->size - 1);

-   if ((vq->desc[desc_idx].flags
-   & VRING_DESC_F_NEXT) == 0) {
/* Update used ring with desc information */
-   vq->used->ring[cur_idx & (vq->size - 1)].id
-   = vq->buf_vec[vec_idx].desc_idx;
-   vq->used->ring[cur_idx & (vq->size - 1)].len
-   = entry_len;
+   vq->used->ring[idx].id = vq->buf_vec[vec_idx].desc_idx;
+   vq->used->ring[idx].len = entry_len;
+
+   vhost_log_write(dev, vq,
+   offsetof(struct vring_used, ring[idx]),
+   sizeof(vq->used->ring[idx]));

entry_len = 0;
cur_idx++;
@@ -354,10 +367,13 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t 
queue_id,
if ((vq->desc[vq->buf_vec[vec_idx].desc_idx].flags &
VRING_DESC_F_NEXT) == 0) {
/* Update used ring with desc information */
-   vq->used->ring[cur_idx & (vq->size - 1)].id
+   idx = cur_idx & (vq->size - 1);
+   vq->used->ring[idx].id
= vq->buf_vec[vec_idx].desc_idx;
-   vq->used->ring[cur_idx & (vq->size - 1)].len
-   = entry_len;
+   vq->used->ring[idx].len = entry_len;
+   vhost_log_write(dev, vq,
+   offsetof(struct 

[dpdk-dev] [PATCH 2/4] vhost: introduce vhost_log_write

2015-12-02 Thread Yuanhan Liu
Introduce vhost_log_write() helper function to log the dirty pages we
touched. Page size is harded code to 4096 (VHOST_LOG_PAGE), and each
log is presented by 1 bit.

Therefore, vhost_log_write() simply finds the right bit for related
page we are gonna change, and set it to 1. dev->log_base denotes the
start of the dirty page bitmap.

The page address is biased by log_guest_addr, which is derived from
SET_VRING_ADDR request as part of the vring related addresses.

Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/rte_virtio_net.h | 34 ++
 lib/librte_vhost/virtio-net.c |  4 
 2 files changed, 38 insertions(+)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 416dac2..191c1be 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -40,6 +40,7 @@
  */

 #include 
+#include 
 #include 
 #include 
 #include 
@@ -59,6 +60,8 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1

+#define VHOST_LOG_PAGE 4096
+

 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
@@ -82,6 +85,7 @@ struct vhost_virtqueue {
struct vring_desc   *desc;  /**< Virtqueue 
descriptor ring. */
struct vring_avail  *avail; /**< Virtqueue 
available ring. */
struct vring_used   *used;  /**< Virtqueue used 
ring. */
+   uint64_tlog_guest_addr; /**< Physical address 
of used ring, for logging */
uint32_tsize;   /**< Size of descriptor 
ring. */
uint32_tbackend;/**< Backend value to 
determine if device should started/stopped. */
uint16_tvhost_hlen; /**< Vhost header 
length (varies depending on RX merge buffers. */
@@ -203,6 +207,36 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
return vhost_va;
 }

+static inline void __attribute__((always_inline))
+vhost_log_page(uint8_t *log_base, uint64_t page)
+{
+   /* TODO: to make it atomic? */
+   log_base[page / 8] |= 1 << (page % 8);
+}
+
+static inline void __attribute__((always_inline))
+vhost_log_write(struct virtio_net *dev, struct vhost_virtqueue *vq,
+   uint64_t offset, uint64_t len)
+{
+   uint64_t addr = vq->log_guest_addr;
+   uint64_t page;
+
+   if (unlikely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
+!dev->log_base || !len))
+   return;
+
+   addr += offset;
+   if (dev->log_size < ((addr + len - 1) / VHOST_LOG_PAGE / 8))
+   return;
+
+   page = addr / VHOST_LOG_PAGE;
+   while (page * VHOST_LOG_PAGE < addr + len) {
+   vhost_log_page(dev->log_base, page);
+   page += VHOST_LOG_PAGE;
+   }
+}
+
+
 /**
  *  Disable features in feature_mask. Returns 0 on success.
  */
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 8364938..4481827 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -666,12 +666,16 @@ set_vring_addr(struct vhost_device_ctx ctx, struct 
vhost_vring_addr *addr)
return -1;
}

+   vq->log_guest_addr = addr->log_guest_addr;
+
LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") mapped address desc: %p\n",
dev->device_fh, vq->desc);
LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") mapped address avail: %p\n",
dev->device_fh, vq->avail);
LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") mapped address used: %p\n",
dev->device_fh, vq->used);
+   LOG_DEBUG(VHOST_CONFIG, "(%"PRIu64") log_guest_addr: %p\n",
+   dev->device_fh, (void *)(uintptr_t)vq->log_guest_addr);

return 0;
 }
-- 
1.9.0



[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request

2015-12-02 Thread Yuanhan Liu
VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk
vhost-user) where we should log dirty pages, and how big the log
buffer is.

This request introduces a new payload:

typedef struct VhostUserLog {
uint64_t mmap_size;
uint64_t mmap_offset;
} VhostUserLog;

Also, a fd is delivered from QEMU by ancillary data.

With those info given, an area of memory is mmaped, assigned
to dev->log_base, for logging dirty pages.

Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/rte_virtio_net.h |  2 ++
 lib/librte_vhost/vhost_user/vhost-net-user.c  |  7 -
 lib/librte_vhost/vhost_user/vhost-net-user.h  |  6 
 lib/librte_vhost/vhost_user/virtio-net-user.c | 44 +++
 lib/librte_vhost/vhost_user/virtio-net-user.h |  1 +
 5 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 5687452..416dac2 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -127,6 +127,8 @@ struct virtio_net {
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
charifname[IF_NAME_SZ]; /**< Name of the tap 
device or socket path. */
uint32_tvirt_qp_nb; /**< number of queue pair we 
have allocated */
+   uint64_tlog_size;   /**< Size of log area */
+   uint8_t *log_base;  /**< Where dirty pages are 
logged */
void*priv;  /**< private context */
struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  /**< 
Contains all virtqueue information. */
 } __rte_cache_aligned;
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 2dc0547..76bcac2 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -388,7 +388,12 @@ vserver_message_handler(int connfd, void *dat, int *remove)
break;

case VHOST_USER_SET_LOG_BASE:
-   RTE_LOG(INFO, VHOST_CONFIG, "not implemented.\n");
+   user_set_log_base(ctx, );
+
+   /* it needs a reply */
+   msg.size = sizeof(msg.payload.u64);
+   send_vhost_message(connfd, );
+   break;
case VHOST_USER_SET_LOG_FD:
close(msg.fds[0]);
RTE_LOG(INFO, VHOST_CONFIG, "not implemented.\n");
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h 
b/lib/librte_vhost/vhost_user/vhost-net-user.h
index 38637cc..6d252a3 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.h
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.h
@@ -83,6 +83,11 @@ typedef struct VhostUserMemory {
VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
 } VhostUserMemory;

+typedef struct VhostUserLog {
+   uint64_t mmap_size;
+   uint64_t mmap_offset;
+} VhostUserLog;
+
 typedef struct VhostUserMsg {
VhostUserRequest request;

@@ -97,6 +102,7 @@ typedef struct VhostUserMsg {
struct vhost_vring_state state;
struct vhost_vring_addr addr;
VhostUserMemory memory;
+   VhostUserLoglog;
} payload;
int fds[VHOST_MEMORY_MAX_NREGIONS];
 } __attribute((packed)) VhostUserMsg;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index 2934d1c..1d705fd 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -365,3 +365,47 @@ user_set_protocol_features(struct vhost_device_ctx ctx,

dev->protocol_features = protocol_features;
 }
+
+int
+user_set_log_base(struct vhost_device_ctx ctx,
+struct VhostUserMsg *msg)
+{
+   struct virtio_net *dev;
+   int fd = msg->fds[0];
+   uint64_t size, off;
+   void *addr;
+
+   dev = get_device(ctx);
+   if (!dev)
+   return -1;
+
+   if (fd < 0) {
+   RTE_LOG(ERR, VHOST_CONFIG, "invalid log fd: %d\n", fd);
+   return -1;
+   }
+
+   if (msg->size != sizeof(VhostUserLog)) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "invalid log base msg size: %"PRId32" != %d\n",
+   msg->size, (int)sizeof(VhostUserLog));
+   return -1;
+   }
+
+   size = msg->payload.log.mmap_size;
+   off  = msg->payload.log.mmap_offset;
+   RTE_LOG(INFO, VHOST_CONFIG,
+   "log mmap size: %"PRId64", offset: %"PRId64"\n",
+   size, off);
+
+   addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
+   if (addr == MAP_FAILED) {
+   RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
+   return -1;
+   }
+
+   /* TODO: unmap on stop */
+   dev->log_base = addr;
+   dev->log_size = size;
+

[dpdk-dev] [PATCH 0/4 for 2.3] vhost-user live migration support

2015-12-02 Thread Yuanhan Liu
This patch set adds the initial vhost-user live migration support.

The major task behind that is to log pages we touched during
live migration. So, this patch is basically about adding vhost
log support, and using it.

Patchset

- Patch 1 handles VHOST_USER_SET_LOG_BASE, which tells us where
  the dirty memory bitmap is.

- Patch 2 introduces a vhost_log_write() helper function to log
  pages we are gonna change.

- Patch 3 logs changes we made to used vring.

- Patch 4 sets log_fhmfd protocol feature bit, which actually
  enables the vhost-user live migration support.

A simple test guide (on same host)
==

The following test is based on OVS + DPDK. And here is guide
to setup OVS + DPDK:

http://wiki.qemu.org/Features/vhost-user-ovs-dpdk

1. start ovs-vswitchd

2. Add two ovs vhost-user port, say vhost0 and vhost1

3. Start a VM1 to connect to vhost0. Here is my example:

   $QEMU -enable-kvm -m 1024 -smp 4 \
   -chardev socket,id=char0,path=/var/run/openvswitch/vhost0  \
   -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
   -device virtio-net-pci,netdev=mynet1,mac=52:54:00:12:34:58 \
   -object 
memory-backend-file,id=mem,size=1024M,mem-path=$HOME/hugetlbfs,share=on \
   -numa node,memdev=mem -mem-prealloc \
   -kernel $HOME/iso/vmlinuz -append "root=/dev/sda1" \
   -hda fc-19-i386.img \
   -monitor telnet::,server,nowait -curses

4. run "ping $host" inside VM1

5. Start VM2 to connect to vhost0, and marking it as the target
   of live migration (by adding -incoming tcp:0: option)

   $QEMU -enable-kvm -m 1024 -smp 4 \
   -chardev socket,id=char0,path=/var/run/openvswitch/vhost1  \
   -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
   -device virtio-net-pci,netdev=mynet1,mac=52:54:00:12:34:58 \
   -object 
memory-backend-file,id=mem,size=1024M,mem-path=$HOME/hugetlbfs,share=on \
   -numa node,memdev=mem -mem-prealloc \
   -kernel $HOME/iso/vmlinuz -append "root=/dev/sda1" \
   -hda fc-19-i386.img \
   -monitor telnet::3334,server,nowait -curses \
   -incoming tcp:0: 

6. connect to VM1 monitor, and start migration:

   > migrate tcp:0:

7. After a while, you will find that VM1 has been migrated to VM2,
   and the "ping" command continues running, perfectly.


Note: this patch set has mostly been based on Victor Kaplansk's demo
work (vhost-user-bridge) at QEMU project. I was thinking to add Victor
as the co-author. Victor, what do you think of that? :)

Comments are welcome!

---
Yuanhan Liu (4):
  vhost: handle VHOST_USER_SET_LOG_BASE request
  vhost: introduce vhost_log_write
  vhost: log vring changes
  vhost: enable log_shmfd protocol feature

 lib/librte_vhost/rte_virtio_net.h | 35 ++
 lib/librte_vhost/vhost_rxtx.c | 70 ++-
 lib/librte_vhost/vhost_user/vhost-net-user.c  |  7 ++-
 lib/librte_vhost/vhost_user/vhost-net-user.h  |  6 +++
 lib/librte_vhost/vhost_user/virtio-net-user.c | 44 +
 lib/librte_vhost/vhost_user/virtio-net-user.h |  5 +-
 lib/librte_vhost/virtio-net.c |  4 ++
 7 files changed, 145 insertions(+), 26 deletions(-)

-- 
1.9.0



[dpdk-dev] [PATCH 06/10] mk: install kernel modules

2015-12-02 Thread Thomas Monjalon
2015-12-02 11:53, Panu Matilainen:
> On 12/02/2015 05:57 AM, Thomas Monjalon wrote:
> > Add kernel modules to "make install".
> > Nothing is done if there is no kernel module compiled.
> >
> > On native Linux, this path is suggested:
> > kerneldir=/lib/modules/$(uname -r)/extra/dpdk
[...]
> > +kerneldir   ?= $(exec_prefix)/kmod
> 
> This by default installs the modules to /usr/local/kmod/ with no kernel 
> version etc. That's so broken that it'd be better not to install them at 
> all.
> 
> So either get the kerneldir right (the correct path is known on Linux 
> and surely BSD too) or dont install them at all unless kerneldir is 
> manually specified. For Linux, it should default to 
> /lib/modules//extra/dpdk on Linux, where  is the 
> version those modules were built against (which might or might not have 
> anything to do with uname -r output).

Yes. That's what Mario did.
I wanted to keep the same default as with the old T= command.
But both are do-able by using "ifdef T".


[dpdk-dev] [PATCH 00/10] standard make install

2015-12-02 Thread Thomas Monjalon
2015-12-02 11:47, Panu Matilainen:
> On 12/02/2015 11:25 AM, Thomas Monjalon wrote:
> > 2015-12-02 09:44, Panu Matilainen:
> >> That aside, a bigger problem is that it doesn't seem to work.
> >>
> >> make clean
> >> make config T=x86_64-native-linuxapp-gcc
> >> make
> >> make install DESTDIR=/tmp/dpdk-root
> >
> > Oh, I forgot to test the simple case where O= is not specified!
> >
> > It should be fixed with this change:
> >
> 
> Okay, that helped a bunch :)
> 
> Now that I can actually test it, seems mostly ok to me. As for the rest, 
> I'll comment on the specific patches.

OK thanks :)


[dpdk-dev] [PATCH v2] examples/bond: fix bsd compile error

2015-12-02 Thread Yigit, Ferruh
On Thu, Nov 26, 2015 at 09:52:15AM +, Mrzyglod, DanielX T wrote:
> >-Original Message-
> >From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ferruh Yigit
> >Sent: Wednesday, November 25, 2015 6:41 PM
> >To: dev at dpdk.org
> >Subject: [dpdk-dev] [PATCH v2] examples/bond: fix bsd compile error
> >
> >Error:
> >== bond
> >  CC main.o
> >/.../examples/bond/main.c:431:24: error: use of undeclared identifier 
> >'AF_INET'
> >if (res->ip.family == AF_INET)
> >  ^
> >1 error generated.
> >/.../mk/internal/rte.compile-pre.mk:126: recipe for target 'main.o' failed
> >
> >AF_INET defined in sys/socket.h
> >
> >This header included for Linux:
> >. //include/rte_ip.h
> >.. /usr/include/netinet/in.h
> >... /usr/include/sys/socket.h
> >
> >But not for FreeBSD:
> >. //include/rte_ip.h
> >.. /usr/include/netinet/in.h
> >... /usr/include/machine/endian.h
> >... /usr/include/netinet6/in6.h
> >. //include/rte_tcp.h
> >
> >Signed-off-by: Ferruh Yigit 
> >---
> > examples/bond/main.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> >diff --git a/examples/bond/main.c b/examples/bond/main.c
> >index 4622283..19f4f05 100644
> >--- a/examples/bond/main.c
> >+++ b/examples/bond/main.c
> >@@ -45,6 +45,7 @@
> > #include 
> > #include 
> > #include 
> >+#include 
> >
> > #include 
> > #include 
> >--
> >2.5.0
> 
> Acked-by: Daniel Mrzyglod 

Self NACK, in favour of http://dpdk.org/dev/patchwork/patch/9130/
I will update pathwork.

Thanks,
ferruh




[dpdk-dev] [PATCH] examples/bond: add header to support freebsd compilation

2015-12-02 Thread Ferruh Yigit
On Thu, Nov 26, 2015 at 09:55:17AM +, Mrzyglod, DanielX T wrote:
> 
> 
> >-Original Message-
> >From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> >Sent: Wednesday, November 25, 2015 7:08 PM
> >
> >2015-11-25 19:03, Daniel Mrzyglod:
> >> definition of 'AF_INET' enum was missing - is available in 
> >>
> >> Signed-off-by: Daniel Mrzyglod 
> >
> >It is definitely the right fix as Ferruh submitted the same one
> >less than one hour ago.
> >Should we understand it is an ack?
> >
> >It seems Ferruh was hesitating about the line where inserting the
> >include. I would say I prefer your choice :)
> 
> I acked Ferruh patch. I didn't notice he send patch first.
> You can combine both patches :> He has much better description :>

Thomas said he prefer this one, let's use this one.

Acked-by: Ferruh Yigit 


[dpdk-dev] 2.3 Roadmap

2015-12-02 Thread Matthew Hall
On Wed, Dec 02, 2015 at 12:35:16PM +, Bruce Richardson wrote:
> Hi Matthew,
> 
> thanks for the info, but I'm not sure I understand it correctly. It seems to
> me that you are mostly referring to the depths/sizes of the tables being used,
> rather than to the "data-size" being stored in each entry, which was actually
> what I was asking about. Is that correct? If so, it seems that - looking 
> initially
> at IPv4 LPM only - you are more looking for an increase in the number of 
> tbl8's
> for lookup, rather than necessarily an increase the 8-bit user data being 
> stored
> with each entry. [And assuming similar interest for v6] Am I right in 
> thinking this?
> 
> Thanks,
> /Bruce

This question is a result of a different way of looking at things between 
routing / networking and security. I actually need to increase the size of 
user data as I did in my patches.

1. There is an assumption, when LPM is used for routing, that many millions of 
inputs might map to a smaller number of outputs.

2. This assumption is not true in the security ecosystem. If I have several 
million CIDR blocks and bad IPs, I need a separate user data value output for 
each value input.

This is because, every time I have a bad IP, CIDR, Domain, URL, or Email, I 
create a security indicator tracking struct for each one of these. In the IP 
and CIDR case I find the struct using rte_hash (possibly for single IPs) and 
rte_lpm.

For Domain, URL, and Email, rte_hash cannot be used, because it mis-assumes 
all inputs are equal-length. So I had to use a different hash table.

4. The struct contains things such as a unique 64-bit unsigned integer for 
each separate IP or CIDR triggered, to allow looking up contextual data about 
the threat it represents. These IDs are defined by upstream threat databases, 
so I can't crunch them down to fit inside rte_lpm. They also include stats 
regarding how many times an indicator is seen, what kind of security threat it 
represents, etc. Without which you can't do any valuable security enrichment 
needed to respond to any events generated.

5. This means, if I want to support X million security indicators, regardless 
if they are IP, CIDR, Domain, URL, or Email, then I need X million distinct 
user data values to look up all the context that goes with them.

Matthew.


[dpdk-dev] [PATCH v3] lib/librte_sched: Fix compile with gcc 4.3.4

2015-12-02 Thread Michael Qiu
gcc 4.3.4 does not include "immintrin.h", and will post below error:
lib/librte_sched/rte_sched.c:56:23: error:
immintrin.h: No such file or directory

This compiler issue is fixed with rte_vect.h

There is another issue, need SSE2 support

Fixes: 42ec27a0178a ("sched: enable SSE optimizations in config")

Signed-off-by: Michael Qiu 
---
v3 --> v2:
reformat commit log
move rte_vect.h inside RTE_SCHED_VECTOR

v2 --> v1:
include header file rte_vect.h instead of gcc version check
change __AVX__ to __SSE2__ since all vector function are SSE2 related

 lib/librte_sched/rte_sched.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index d47cfc2..21ebf25 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -53,7 +53,12 @@
 #endif

 #ifdef RTE_SCHED_VECTOR
-#include 
+#include 
+
+#if defined(__SSE2__)
+#define SCHED_VECTOR_SSE2
+#endif
+
 #endif

 #define RTE_SCHED_TB_RATE_CONFIG_ERR  (1e-7)
@@ -1667,7 +1672,7 @@ grinder_schedule(struct rte_sched_port *port, uint32_t 
pos)
return 1;
 }

-#ifdef RTE_SCHED_VECTOR
+#ifdef SCHED_VECTOR_SSE2

 static inline int
 grinder_pipe_exists(struct rte_sched_port *port, uint32_t base_pipe)
-- 
1.9.3



  1   2   >