date:20160210

[dpdk-dev] [PATCH] Correcting upstream kernel version in driver

2016-02-10 Thread John Griffin

Fixing the version of the kernel required in the QAT documentation.

Signed-off-by: John Griffin 
---
 doc/guides/cryptodevs/qat.rst | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index 1901842..23402b4 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -33,7 +33,7 @@ Quick Assist Crypto Poll Mode Driver
 The QAT PMD provides poll mode crypto driver support for **Intel QuickAssist
 Technology DH895xxC** hardware accelerator.

-The QAT PMD has been tested on Fedora 21 64-bit with gcc and on the 4.3
+The QAT PMD has been tested on Fedora 21 64-bit with gcc and on the 4.4
 kernel.org Linux kernel.


@@ -73,9 +73,9 @@ Installation
 To use the DPDK QAT PMD an SRIOV-enabled QAT kernel driver is required. The
 VF devices exposed by this driver will be used by QAT PMD.

-If you are running on kernel 4.3 or greater, see instructions for
+If you are running on kernel 4.4 or greater, see instructions for
 `Installation using kernel.org driver`_ below. If you are on a kernel earlier
-than 4.3, see `Installation using 01.org QAT driver`_.
+than 4.4, see `Installation using 01.org QAT driver`_.


 Installation using 01.org QAT driver
@@ -157,13 +157,13 @@ If the build or install fails due to mismatching kernel 
sources you may need to
 Installation using kernel.org driver
 

-Assuming you are running on at least a 4.3 kernel, you can use the stock 
kernel.org QAT
+Assuming you are running on at least a 4.4 kernel, you can use the stock 
kernel.org QAT
 driver to start the QAT hardware.

 The steps below assume you are:

 * Running DPDK on a platform with one ``DH895xCC`` device.
-* On a kernel at least version 4.3.
+* On a kernel at least version 4.4.

 In BIOS ensure that SRIOV is enabled and VT-d is disabled.

@@ -190,7 +190,7 @@ Using the sysfs, enable the VFs::

 echo 32 > /sys/bus/pci/drivers/dh895xcc/\:03\:00.0/sriov_numvfs

-If you get an error, it's likely you're using a QAT kernel driver earlier than 
kernel 4.3.
+If you get an error, it's likely you're using a QAT kernel driver earlier than 
kernel 4.4.

 To verify that the VFs are available for use - use ``lspci -d:443`` to confirm
 the bdf of the 32 VF devices are available per ``DH895xCC`` device.
-- 
2.1.0

[dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind

2016-02-10 Thread Luca Boccassi

Hello all,

I created a set of patches for Valgrind that add support for the
rte_*alloc family of functions. We use it for memcheck (I added support
for other all the other Valgrind tools like cachegrind as well, but it's
less tested), and find it extremely useful, since the vanilla version
cannot intercept and report leaks cause by rte_*alloc functions from
librte_malloc.

While at FOSDEM last week I mentioned this to Mark Gray and Kevin
Traynor after their presentations (sorry it took a while to remember to
push this up :-) ), and they thought it would be useful for other
DPDK-based applications developers, so I've uploaded it to Github [1].
I've also sent the patches upstream a while ago [2].

It works with both with a statically and dynamically linked DPDK
library.

To use it with a statically linked DPDK, pass the following parameter to
Valgrind: --soname-synonyms=somalloc=NONE

To use it with a dynamically linked DPDK no additional parameter is
needed, but make sure that the SONAME matches either "lib*dpdk.so*" if
building a single .so or "librte_malloc.so*" if building each library
individually. If it doesn't match either of these regexp, then you can
manually patch the file include/pub_tool_redir.h and rebuild.

Please feel free to provide comments, feedback, or patches! All bug
reports will be promptly redirected to /dev/null :-)

-- 
Kind regards,
Luca Boccassi
Brocade Communications Systems

[1] https://github.com/bluca/valgrind-dpdk
[2] https://bugs.kde.org/show_bug.cgi?id=350405

[dpdk-dev] [PATCH] config: remove useless explicit includes of generated header

2016-02-10 Thread Thomas Monjalon

2016-02-08 15:38, Thomas Monjalon:
> The file rte_config.h is automatically generated and included.
> No need to #include it.
> 
> The example performance-thread needs a makefile fix to avoid
> overwriting the default cflags.
> 
> Signed-off-by: Thomas Monjalon 

Applied

[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers

2016-02-10 Thread Thomas Monjalon

2016-02-10 17:02, John McNamara:
> From: "Richardson, Bruce" 
> 
> As discussed on list, switch numbering scheme to be based on year/month.
> Release 2.3 then becomes 16.04.
> 
> Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html
> 
> Also, added zero padding to the month so that it appear as 16.04 and
> not 16.4 in "make showversion" and rte_version().
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: John McNamara 

Acked-by: Thomas Monjalon 

Series applied, thanks

[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size

2016-02-10 Thread Olivier MATZ

Hi Keith,

On 02/10/2016 07:35 PM, Wiles, Keith wrote:
>>> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp)
>>>  static unsigned
>>>  rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
>>>  {
>>> +#ifdef RTE_NEXT_ABI
>>> +   unsigned lcore_id;
>>> +   unsigned count = 0;
>>> +   unsigned cache_count;
>>> +
>>> +   fprintf(f, "  cache infos:\n");
>>> +   fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size);
>>> +   if (mp->cache_size == 0)
>>> +   return count;
>>> +
>>> +   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>>> +   cache_count = mp->local_cache[lcore_id].len;
>>> +   fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count);
>>> +   count += cache_count;
>>> +   }
>>> +   fprintf(f, "total_cache_count=%u\n", count);
>>> +   return count;
>>> +#else
>>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>> unsigned lcore_id;
>>> unsigned count = 0;
>>
>> I think in this case we could avoid to duplicate the code without
>> beeing unclear by using the proper #ifdefs:
>>
>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)
>>  /* common code */
>> #ifdef RTE_NEXT_ABI
>>  if (mp->cache_size == 0)
>>  return count;
>> #endif
>>  /* common code */
>> #else
>> ...
>> #endif
> 
> Started looking at this change and the problem is I want to remove the ifdef 
> RTE_MEMPOOL.. As well as the #else/#endif code. If I do as you suggest it 
> does not appear to be clearer when someone goes back to remove the code, they 
> may think the #ifdef RTE_MEMPOOL/#else/#endif are still required.

OK, makes sense.

>>> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline))
>>>  __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>>> unsigned n, int is_mp)
>>>  {
>>> +#ifndef RTE_NEXT_ABI   /* Note: ifndef */
>>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>> +#endif /* RTE_NEXT_ABI */
>>> struct rte_mempool_cache *cache;
>>> uint32_t index;
>>> void **cache_objs;
>>> unsigned lcore_id = rte_lcore_id();
>>> uint32_t cache_size = mp->cache_size;
>>> uint32_t flushthresh = mp->cache_flushthresh;
>>> +#ifndef RTE_NEXT_ABI   /* Note: ifndef */
>>>  #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
>>> +#endif /* RTE_NEXT_ABI */
>>
>> this looks strange... I think it does not work properly.
>> Why not
>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)
> 
> Yes I agree the ifndef looks strange, but it should work as we want to remove 
> the #ifdef RTE_MEMPOOL/#endif lines. This was the reason for the comment that 
> it was a ifndef.

It's not only strange, it's also probably not what you want to do:

#ifndef RTE_NEXT_ABI/* Note: ifndef */
#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
#endif /* RTE_NEXT_ABI */
...

Here, the #endif corresponds to the second #if, not the first #ifdef.

Regards,
Olivier

[dpdk-dev] Cannot run DPDK applications in more than one Container simultaneously

2016-02-10 Thread Pushpasis Sarkar

Hi,

I have tried the stops mentioned in the mail thread below..

http://dpdk.org/ml/archives/dev/2014-July/004031.html

But I could not run the same applcation in two different containers. It
seems once I start the application on the first container, the /mnt/huge
filesystem cannot be locked from the second container.. until I stop the
app on the first one, and 'umount /mnt/dev' on it..  Kind of like seen in
the mail thread above..

Can anyone point to me any material on how to achieve the same?

Thanks in advance
-Pushpasis

[dpdk-dev] thoughts on DPDK after a few days of reading sources

2016-02-10 Thread Matthew Hall

On Wed, Feb 10, 2016 at 07:05:40PM -0800, Seth Arnold wrote:
> - ixgbe driver in the package is very different from the driver in the
>   Linux kernel -- when bugs in one are found, who is in charge of copying
>   the fixes between the two code bases?

It's not the Linux driver. It's from BSD because DPDK uses mbufs. The 
MAINTAINERS file / Intel team refreshes it w/ shared code from their upstream 
BSD support.

Matthew.

[dpdk-dev] thoughts on DPDK after a few days of reading sources

2016-02-10 Thread Seth Arnold

Hello,

The Ubuntu distribution is looking at supporting DPDK in the 'main'
component of the archive. As part of this process I spent a few days
reading the DPDK sources to gauge if we can support it or not.

I've taken some notes while reading the sources; I'm sharing them in the
hopes that it's useful: on the one hand my fresh eyes may spot things that
you've overlooked, on the other hand your familiarity with the code means
that you're better suited to judge what I've found.

Most of the code was very good; I am however concerned about the frequent
memory allocations that use simple integer arithmetic when deciding how
much to allocate without checking for integer overflows.

Here's the bug tracking the Main Inclusion Request:
https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1492186

Here's the portions of the notes that I think may be most interesting
to DPDK devs. It's in rough priority order so feel free to stop reading
when you're bored:

===

Affects main codebase and should be investigated quickly:

- shellcheck reports extensive cases of forgotten quotes to prevent word
  splitting or globbing, potentially unused variables, error-prone printf
  formatting. The scripts that are going to be used at runtime should be
  fixed:
  - ./debian/dpdk-init
  - ./debian/dpdk.init
  - ./tools/setup.sh ? (Hard to tell)
- ./drivers/net/cxgbe/cxgbe_ethdev.c eth_cxgbe_dev_init() memory leak in
  out_free_adapter: that doesn't free adapter
- ./drivers/net/virtio/virtio_ethdev.c virtio_set_multiple_queues() calls
  virtio_send_command(), which performs:
  memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl));
  This copies a potentially huge amount of uninitialized data into ->addr
  because the struct virtio_pmd_ctrl ctrl was not zeroed before being
  passed. How much of this data leaves the system? Does this require a
  CVE?
- ./lib/librte_eal/common/rte_malloc.c memory allocation routines don't
  check for integer overflow errors:
  - rte_calloc_socket()
  - rte_calloc()
- ./lib/librte_vhost/vhost_user/virtio-net-user.c user_set_mem_table()
  doesn't perform integer overflow checks before calling calloc()
- ./lib/librte_vhost/vhost_cuse/virtio-net-cdev.c cuse_set_mem_table()
  doesn't perform integer overflow checks before calling calloc()
- ./lib/librte_eal/linuxapp/eal/eal_xen_memory.c rte_xen_dom0_memory_init()
  If vma_addr = xen_get_virtual_area(_len, RTE_PGSIZE_2M); fails,
  vma_len is reset to RTE_PGSIZE_2M instead of seginfo[memseg_idx].size --
  is this a bug?
- ./lib/librte_eal/linuxapp/eal/eal_memory.c create_shared_memory()
  creates the hugetlb file with mode 0666 rather than 0600 -- is this a
  bug? Does this require a CVE?
- ./lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
  rte_cpu_get_features() leaks auxv_fd
- ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_32.h
  rte_cpu_get_features() leaks auxv_fd
- ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
  rte_cpu_get_features() leaks auxv_fd

Affects main codebase:

- Assorted false-positives and style issues reported by cppcheck
- Slightly dangerous convention of memcpy(dest, source, sizeof(source))
  is used extensively; while all the instances I investigated were
  correct, it's still more prone to mistakes under maintenance than
  memcpy(dest, source, sizeof(dest)).
- extensive use of *malloc() wrappers that perform multiplication to
  determine the size to allocate; all the cases I've seen used values that
  should be constrained by system configurations but the habit is
  dangerous compared to use of *calloc() wrappers that handle integer
  overflow safely.
- ./app/test/test_malloc.c test_reordered_free_per_lcore() has incorrect
  calls to is_memory_overlap() -- p2 is 16000 bytes long, not 1000 bytes
  long, and this size difference is not reflected in the calls.
- ./app/test/test_malloc.c test_realloc() leaks ptr1, may leak ptr9 via
  error path
- ./app/test/test_malloc.c test_realloc() the test with error "Unexpected
  - ptr4 != ptr3" doesn't feel like it tests an actual promise from the API
- ./lib/librte_pipeline/rte_pipeline.c rte_pipeline_table_create(),
  rte_pipeline_port_in_create(), rte_pipeline_port_out_create(),
  duplicate the array of function pointers via memcpy() rather than
  copying a pointer to static tables -- this may represent an easy way to
  save memory and improve cache hit ratios as well as potentially allow
  storing the tables in static memory rather than on the heap, reducing
  the value of these structs in potential exploits.
- ixgbe driver in the package is very different from the driver in the
  Linux kernel -- when bugs in one are found, who is in charge of copying
  the fixes between the two code bases?
- ./lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_ethtool.c
  ixgbe_get_strings() takes a buffer to write into but not the buffer
  length; sprintf() calls may overflow the buffer if it isn't large
  enough. It looks like ethtool_get_strings() may use

[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size

2016-02-10 Thread Thomas Monjalon

2016-02-10 18:01, Wiles, Keith:
> >>> --- a/config/defconfig_x86_64-native-linuxapp-gcc
> >>> +++ b/config/defconfig_x86_64-native-linuxapp-gcc
> >>> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y
> >>>  
> >>>  CONFIG_RTE_TOOLCHAIN="gcc"
> >>>  CONFIG_RTE_TOOLCHAIN_GCC=y
> >>> +CONFIG_RTE_BUILD_SHARED_LIB=y
> >>> +CONFIG_RTE_NEXT_ABI=n
> >>> +CONFIG_RTE_EAL_IGB_UIO=n
> >>> +CONFIG_RTE_LIBRTE_KNI=n
> >>> +CONFIG_RTE_KNI_KMOD=n
> >
> >Hmm, not sure where this came from, but will remove it.
> 
> I think this was from the ABI-Checker I ran and the tool should leave the 
> repo in its original state.

Yes you're right. The ABI checker modify the defconfig instead of modifying
the generated .config file.
Anyone for a patch?

[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size

2016-02-10 Thread Wiles, Keith

>Hi Keith,
>
>Thank you for adding the RTE_NEXT_ABI. I think this is the way
>described in the process. Your changes will be available in next
>version (16.4) for people compiling with RTE_NEXT_ABI=y, and in
>16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in
>default configs...).
>
>I think a deprecation notice should also be added in this commit
>in doc/guides/rel_notes/deprecation.rst.
>
>Please also find comments below.
>
>On 02/09/2016 06:30 PM, Keith Wiles wrote:
>
>> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc 
>> b/config/defconfig_x86_64-native-linuxapp-gcc
>> index 60baf5b..02e9ace 100644
>> --- a/config/defconfig_x86_64-native-linuxapp-gcc
>> +++ b/config/defconfig_x86_64-native-linuxapp-gcc
>> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y
>>  
>>  CONFIG_RTE_TOOLCHAIN="gcc"
>>  CONFIG_RTE_TOOLCHAIN_GCC=y
>> +CONFIG_RTE_BUILD_SHARED_LIB=y
>> +CONFIG_RTE_NEXT_ABI=n
>> +CONFIG_RTE_EAL_IGB_UIO=n
>> +CONFIG_RTE_LIBRTE_KNI=n
>> +CONFIG_RTE_KNI_KMOD=n
>
>I think this should not be part of the patch.
>
>> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp)
>>  static unsigned
>>  rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
>>  {
>> +#ifdef RTE_NEXT_ABI
>> +unsigned lcore_id;
>> +unsigned count = 0;
>> +unsigned cache_count;
>> +
>> +fprintf(f, "  cache infos:\n");
>> +fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size);
>> +if (mp->cache_size == 0)
>> +return count;
>> +
>> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>> +cache_count = mp->local_cache[lcore_id].len;
>> +fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count);
>> +count += cache_count;
>> +}
>> +fprintf(f, "total_cache_count=%u\n", count);
>> +return count;
>> +#else
>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>  unsigned lcore_id;
>>  unsigned count = 0;
>
>I think in this case we could avoid to duplicate the code without
>beeing unclear by using the proper #ifdefs:
>
>#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)
>   /* common code */
>#ifdef RTE_NEXT_ABI
>   if (mp->cache_size == 0)
>   return count;
>#endif
>   /* common code */
>#else
>...
>#endif

Started looking at this change and the problem is I want to remove the ifdef 
RTE_MEMPOOL.. As well as the #else/#endif code. If I do as you suggest it does 
not appear to be clearer when someone goes back to remove the code, they may 
think the #ifdef RTE_MEMPOOL/#else/#endif are still required.

>
>
>> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp)
>>  #define mempool_audit_cookies(mp) do {} while(0)
>>  #endif
>>  
>> +#ifdef RTE_NEXT_ABI
>> +/* check cookies before and after objects */
>> +static void
>> +mempool_audit_cache(const struct rte_mempool *mp)
>> +{
>> +/* check cache size consistency */
>> +unsigned lcore_id;
>> +
>> +if (mp->cache_size == 0)
>> +return;
>> +
>> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>> +if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) {
>> +RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n",
>> +lcore_id);
>> +rte_panic("MEMPOOL: invalid cache len\n");
>> +}
>> +}
>> +}
>> +#else
>
>same here

The same comment here.
>
>> diff --git a/lib/librte_mempool/rte_mempool.h 
>> b/lib/librte_mempool/rte_mempool.h
>> index 6e2390a..fc9b595 100644
>> --- a/lib/librte_mempool/rte_mempool.h
>> +++ b/lib/librte_mempool/rte_mempool.h
>> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats {
>>  } __rte_cache_aligned;
>>  #endif
>>  
>> +#ifdef RTE_NEXT_ABI
>> +/**
>> + * A structure that stores a per-core object cache.
>> + */
>> +struct rte_mempool_cache {
>> +unsigned len; /**< Cache len */
>> +/*
>> + * Cache is allocated to this size to allow it to overflow in certain
>> + * cases to avoid needless emptying of cache.
>> + */
>> +void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>> +} __rte_cache_aligned;
>> +#else
>
>same here

Same for this one.
>
>
>
>> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline))
>>  __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>>  unsigned n, int is_mp)
>>  {
>> +#ifndef RTE_NEXT_ABI/* Note: ifndef */
>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>> +#endif /* RTE_NEXT_ABI */
>>  struct rte_mempool_cache *cache;
>>  uint32_t index;
>>  void **cache_objs;
>>  unsigned lcore_id = rte_lcore_id();
>>  uint32_t cache_size = mp->cache_size;
>>  uint32_t flushthresh = mp->cache_flushthresh;
>> +#ifndef RTE_NEXT_ABI/* Note: ifndef */
>>  #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
>> +#endif /* RTE_NEXT_ABI */
>
>this looks strange... I think it does not work properly.
>Why not
>#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 ||

[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers

2016-02-10 Thread Thomas Monjalon

2016-02-10 17:11, Mcnamara, John:
> Hi Alejandro,
> 
> Please note the above change to the nfp_net.c code due to the name change
> in the version macro names.
> 
> 
> I was originally going to change it to this:
> 
> #if RTE_VER_YEAR == 2 && RTE_VER_MONTH >= 1
> dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
> #endif
> 
> Or better:
> 
> #if RTE_VERSION >= 0x0201
> dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
> #endif

Or better:
#if RTE_VERSION >= RTE_VERSION_NUM(2,1,0,0)

> However, since the code was added in DPDK 2.2 the version check doesn't seem
> necessary, so I removed it. If you need it back in for some reason let me 
> know.

A version check is needed for external applications or external drivers.
It should not exist in the git tree.

[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size

2016-02-10 Thread Wiles, Keith

>>Hi Keith,
>>
>>Thank you for adding the RTE_NEXT_ABI. I think this is the way
>>described in the process. Your changes will be available in next
>>version (16.4) for people compiling with RTE_NEXT_ABI=y, and in
>>16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in
>>default configs...).
>>
>>I think a deprecation notice should also be added in this commit
>>in doc/guides/rel_notes/deprecation.rst.
>
>Will add the text.
>>
>>Please also find comments below.
>>
>>On 02/09/2016 06:30 PM, Keith Wiles wrote:
>>
>>> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc 
>>> b/config/defconfig_x86_64-native-linuxapp-gcc
>>> index 60baf5b..02e9ace 100644
>>> --- a/config/defconfig_x86_64-native-linuxapp-gcc
>>> +++ b/config/defconfig_x86_64-native-linuxapp-gcc
>>> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y
>>>  
>>>  CONFIG_RTE_TOOLCHAIN="gcc"
>>>  CONFIG_RTE_TOOLCHAIN_GCC=y
>>> +CONFIG_RTE_BUILD_SHARED_LIB=y
>>> +CONFIG_RTE_NEXT_ABI=n
>>> +CONFIG_RTE_EAL_IGB_UIO=n
>>> +CONFIG_RTE_LIBRTE_KNI=n
>>> +CONFIG_RTE_KNI_KMOD=n
>
>Hmm, not sure where this came from, but will remove it.

I think this was from the ABI-Checker I ran and the tool should leave the repo 
in its original state.

>>
>>I think this should not be part of the patch.
>>
>>> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp)
>>>  static unsigned
>>>  rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
>>>  {
>>> +#ifdef RTE_NEXT_ABI
>>> +   unsigned lcore_id;
>>> +   unsigned count = 0;
>>> +   unsigned cache_count;
>>> +
>>> +   fprintf(f, "  cache infos:\n");
>>> +   fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size);
>>> +   if (mp->cache_size == 0)
>>> +   return count;
>>> +
>>> +   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>>> +   cache_count = mp->local_cache[lcore_id].len;
>>> +   fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count);
>>> +   count += cache_count;
>>> +   }
>>> +   fprintf(f, "total_cache_count=%u\n", count);
>>> +   return count;
>>> +#else
>>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>> unsigned lcore_id;
>>> unsigned count = 0;
>>
>>I think in this case we could avoid to duplicate the code without
>>beeing unclear by using the proper #ifdefs:
>
>I was struggling with how it should be done. I like to see clear ifdefs and be 
>able to see the complete code for a given case. In these cases I wanted to 
>make it simple to remove the code quickly by just deleting lines instead of 
>editing lines. I will follow your suggestion.
>>
>>#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)
>>  /* common code */
>>#ifdef RTE_NEXT_ABI
>>  if (mp->cache_size == 0)
>>  return count;
>>#endif
>>  /* common code */
>>#else
>>...
>>#endif
>>
>>
>>> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp)
>>>  #define mempool_audit_cookies(mp) do {} while(0)
>>>  #endif
>>>  
>>> +#ifdef RTE_NEXT_ABI
>>> +/* check cookies before and after objects */
>>> +static void
>>> +mempool_audit_cache(const struct rte_mempool *mp)
>>> +{
>>> +   /* check cache size consistency */
>>> +   unsigned lcore_id;
>>> +
>>> +   if (mp->cache_size == 0)
>>> +   return;
>>> +
>>> +   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>>> +   if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) {
>>> +   RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n",
>>> +   lcore_id);
>>> +   rte_panic("MEMPOOL: invalid cache len\n");
>>> +   }
>>> +   }
>>> +}
>>> +#else
>>
>>same here
>>
>>> diff --git a/lib/librte_mempool/rte_mempool.h 
>>> b/lib/librte_mempool/rte_mempool.h
>>> index 6e2390a..fc9b595 100644
>>> --- a/lib/librte_mempool/rte_mempool.h
>>> +++ b/lib/librte_mempool/rte_mempool.h
>>> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats {
>>>  } __rte_cache_aligned;
>>>  #endif
>>>  
>>> +#ifdef RTE_NEXT_ABI
>>> +/**
>>> + * A structure that stores a per-core object cache.
>>> + */
>>> +struct rte_mempool_cache {
>>> +   unsigned len; /**< Cache len */
>>> +   /*
>>> +* Cache is allocated to this size to allow it to overflow in certain
>>> +* cases to avoid needless emptying of cache.
>>> +*/
>>> +   void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>>> +} __rte_cache_aligned;
>>> +#else
>>
>>same here
>>
>>
>>
>>> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline))
>>>  __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>>> unsigned n, int is_mp)
>>>  {
>>> +#ifndef RTE_NEXT_ABI   /* Note: ifndef */
>>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>> +#endif /* RTE_NEXT_ABI */
>>> struct rte_mempool_cache *cache;
>>> uint32_t index;
>>> void **cache_objs;
>>> unsigned lcore_id = rte_lcore_id();
>>> uint32_t cache_size = mp->cache_size;
>>> uint32_t flushthresh = mp->cache_flushthresh;
>>> +#ifndef RTE_NEXT_ABI

[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size

2016-02-10 Thread Olivier MATZ

Hi Keith,

Thank you for adding the RTE_NEXT_ABI. I think this is the way
described in the process. Your changes will be available in next
version (16.4) for people compiling with RTE_NEXT_ABI=y, and in
16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in
default configs...).

I think a deprecation notice should also be added in this commit
in doc/guides/rel_notes/deprecation.rst.

Please also find comments below.

On 02/09/2016 06:30 PM, Keith Wiles wrote:

> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc 
> b/config/defconfig_x86_64-native-linuxapp-gcc
> index 60baf5b..02e9ace 100644
> --- a/config/defconfig_x86_64-native-linuxapp-gcc
> +++ b/config/defconfig_x86_64-native-linuxapp-gcc
> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y
>  
>  CONFIG_RTE_TOOLCHAIN="gcc"
>  CONFIG_RTE_TOOLCHAIN_GCC=y
> +CONFIG_RTE_BUILD_SHARED_LIB=y
> +CONFIG_RTE_NEXT_ABI=n
> +CONFIG_RTE_EAL_IGB_UIO=n
> +CONFIG_RTE_LIBRTE_KNI=n
> +CONFIG_RTE_KNI_KMOD=n

I think this should not be part of the patch.

> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp)
>  static unsigned
>  rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
>  {
> +#ifdef RTE_NEXT_ABI
> + unsigned lcore_id;
> + unsigned count = 0;
> + unsigned cache_count;
> +
> + fprintf(f, "  cache infos:\n");
> + fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size);
> + if (mp->cache_size == 0)
> + return count;
> +
> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> + cache_count = mp->local_cache[lcore_id].len;
> + fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count);
> + count += cache_count;
> + }
> + fprintf(f, "total_cache_count=%u\n", count);
> + return count;
> +#else
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>   unsigned lcore_id;
>   unsigned count = 0;

I think in this case we could avoid to duplicate the code without
beeing unclear by using the proper #ifdefs:

#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)
/* common code */
#ifdef RTE_NEXT_ABI
if (mp->cache_size == 0)
return count;
#endif
/* common code */
#else
...
#endif


> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp)
>  #define mempool_audit_cookies(mp) do {} while(0)
>  #endif
>  
> +#ifdef RTE_NEXT_ABI
> +/* check cookies before and after objects */
> +static void
> +mempool_audit_cache(const struct rte_mempool *mp)
> +{
> + /* check cache size consistency */
> + unsigned lcore_id;
> +
> + if (mp->cache_size == 0)
> + return;
> +
> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> + if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) {
> + RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n",
> + lcore_id);
> + rte_panic("MEMPOOL: invalid cache len\n");
> + }
> + }
> +}
> +#else

same here

> diff --git a/lib/librte_mempool/rte_mempool.h 
> b/lib/librte_mempool/rte_mempool.h
> index 6e2390a..fc9b595 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats {
>  } __rte_cache_aligned;
>  #endif
>  
> +#ifdef RTE_NEXT_ABI
> +/**
> + * A structure that stores a per-core object cache.
> + */
> +struct rte_mempool_cache {
> + unsigned len; /**< Cache len */
> + /*
> +  * Cache is allocated to this size to allow it to overflow in certain
> +  * cases to avoid needless emptying of cache.
> +  */
> + void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
> +} __rte_cache_aligned;
> +#else

same here



> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline))
>  __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>   unsigned n, int is_mp)
>  {
> +#ifndef RTE_NEXT_ABI /* Note: ifndef */
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> +#endif /* RTE_NEXT_ABI */
>   struct rte_mempool_cache *cache;
>   uint32_t index;
>   void **cache_objs;
>   unsigned lcore_id = rte_lcore_id();
>   uint32_t cache_size = mp->cache_size;
>   uint32_t flushthresh = mp->cache_flushthresh;
> +#ifndef RTE_NEXT_ABI /* Note: ifndef */
>  #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
> +#endif /* RTE_NEXT_ABI */

this looks strange... I think it does not work properly.
Why not
#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)

>   /* increment stat now, adding in mempool always success */
>   __MEMPOOL_STAT_ADD(mp, put, n);
>  
> +#ifndef RTE_NEXT_ABI /* Note: ifndef */
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> +#endif /* RTE_NEXT_ABI */
>   /* cache is not enabled or single producer or non-EAL thread */
>   if (unlikely(cache_size == 0 || is_mp == 0 ||
>lcore_id >= RTE_MAX_LCORE))
> @@ -802,7 +846,9 @@ __mempool_put_bulk(struct rte_mempool

[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size

2016-02-10 Thread Wiles, Keith

>Hi Keith,
>
>Thank you for adding the RTE_NEXT_ABI. I think this is the way
>described in the process. Your changes will be available in next
>version (16.4) for people compiling with RTE_NEXT_ABI=y, and in
>16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in
>default configs...).
>
>I think a deprecation notice should also be added in this commit
>in doc/guides/rel_notes/deprecation.rst.

Will add the text.
>
>Please also find comments below.
>
>On 02/09/2016 06:30 PM, Keith Wiles wrote:
>
>> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc 
>> b/config/defconfig_x86_64-native-linuxapp-gcc
>> index 60baf5b..02e9ace 100644
>> --- a/config/defconfig_x86_64-native-linuxapp-gcc
>> +++ b/config/defconfig_x86_64-native-linuxapp-gcc
>> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y
>>  
>>  CONFIG_RTE_TOOLCHAIN="gcc"
>>  CONFIG_RTE_TOOLCHAIN_GCC=y
>> +CONFIG_RTE_BUILD_SHARED_LIB=y
>> +CONFIG_RTE_NEXT_ABI=n
>> +CONFIG_RTE_EAL_IGB_UIO=n
>> +CONFIG_RTE_LIBRTE_KNI=n
>> +CONFIG_RTE_KNI_KMOD=n

Hmm, not sure where this came from, but will remove it.
>
>I think this should not be part of the patch.
>
>> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp)
>>  static unsigned
>>  rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
>>  {
>> +#ifdef RTE_NEXT_ABI
>> +unsigned lcore_id;
>> +unsigned count = 0;
>> +unsigned cache_count;
>> +
>> +fprintf(f, "  cache infos:\n");
>> +fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size);
>> +if (mp->cache_size == 0)
>> +return count;
>> +
>> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>> +cache_count = mp->local_cache[lcore_id].len;
>> +fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count);
>> +count += cache_count;
>> +}
>> +fprintf(f, "total_cache_count=%u\n", count);
>> +return count;
>> +#else
>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>>  unsigned lcore_id;
>>  unsigned count = 0;
>
>I think in this case we could avoid to duplicate the code without
>beeing unclear by using the proper #ifdefs:

I was struggling with how it should be done. I like to see clear ifdefs and be 
able to see the complete code for a given case. In these cases I wanted to make 
it simple to remove the code quickly by just deleting lines instead of editing 
lines. I will follow your suggestion.
>
>#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI)
>   /* common code */
>#ifdef RTE_NEXT_ABI
>   if (mp->cache_size == 0)
>   return count;
>#endif
>   /* common code */
>#else
>...
>#endif
>
>
>> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp)
>>  #define mempool_audit_cookies(mp) do {} while(0)
>>  #endif
>>  
>> +#ifdef RTE_NEXT_ABI
>> +/* check cookies before and after objects */
>> +static void
>> +mempool_audit_cache(const struct rte_mempool *mp)
>> +{
>> +/* check cache size consistency */
>> +unsigned lcore_id;
>> +
>> +if (mp->cache_size == 0)
>> +return;
>> +
>> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
>> +if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) {
>> +RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n",
>> +lcore_id);
>> +rte_panic("MEMPOOL: invalid cache len\n");
>> +}
>> +}
>> +}
>> +#else
>
>same here
>
>> diff --git a/lib/librte_mempool/rte_mempool.h 
>> b/lib/librte_mempool/rte_mempool.h
>> index 6e2390a..fc9b595 100644
>> --- a/lib/librte_mempool/rte_mempool.h
>> +++ b/lib/librte_mempool/rte_mempool.h
>> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats {
>>  } __rte_cache_aligned;
>>  #endif
>>  
>> +#ifdef RTE_NEXT_ABI
>> +/**
>> + * A structure that stores a per-core object cache.
>> + */
>> +struct rte_mempool_cache {
>> +unsigned len; /**< Cache len */
>> +/*
>> + * Cache is allocated to this size to allow it to overflow in certain
>> + * cases to avoid needless emptying of cache.
>> + */
>> +void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>> +} __rte_cache_aligned;
>> +#else
>
>same here
>
>
>
>> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline))
>>  __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>>  unsigned n, int is_mp)
>>  {
>> +#ifndef RTE_NEXT_ABI/* Note: ifndef */
>>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>> +#endif /* RTE_NEXT_ABI */
>>  struct rte_mempool_cache *cache;
>>  uint32_t index;
>>  void **cache_objs;
>>  unsigned lcore_id = rte_lcore_id();
>>  uint32_t cache_size = mp->cache_size;
>>  uint32_t flushthresh = mp->cache_flushthresh;
>> +#ifndef RTE_NEXT_ABI/* Note: ifndef */
>>  #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
>> +#endif /* RTE_NEXT_ABI */
>
>this looks strange... I think it does not work properly.
>Why not
>#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 ||

[dpdk-dev] [PATCH] virtio: prettify log messages

2016-02-10 Thread Stephen Hemminger

On Thu, 11 Feb 2016 00:45:34 +0100
Vincent JARDIN  wrote:

> PMD_TX_LOG() looks better with a \n
> 
> Signed-off-by: Vincent JARDIN 

NAK. Yes the messages, are messed up, but this is not
the right way to fix it.

The logging wrappers are inconsistent in virtio.
PMD_INIT_LOG adds newline, but RX/TX/DRV do not.

I would rather the macros were aligned with ixgbe which always
adds newline for all the PMD_XXX_LOG() macros. And then remove
all extra newlines in virtio code.

[dpdk-dev] [PATCH] ixgbe: fix whitespace

2016-02-10 Thread Bruce Richardson

On Wed, Jan 13, 2016 at 05:23:22AM +, Zhang, Helin wrote:
> 
> 
> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org] 
> Sent: Wednesday, January 13, 2016 12:54 PM
> To: Zhang, Helin ; Ananyev, Konstantin 
> 
> Cc: dev at dpdk.org; Stephen Hemminger 
> Subject: [PATCH] ixgbe: fix whitespace
> 
> Normally, I ignore random minor whitespace issues, but this driver seems to 
> have some authors that don't understand the conventions of putting whitespace 
> after keywords.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Helin Zhang 
> 

Pushed to dpdk-next-net/rel_16_04, including an additional 11 other whitespace
fixes called out by checkpatch.

/Bruce

[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers

2016-02-10 Thread Mcnamara, John

> -Original Message-
> From: Mcnamara, John
> Sent: Wednesday, February 10, 2016 5:02 PM
> To: dev at dpdk.org
> Cc: thomas.monjalon at 6wind.com; Richardson, Bruce
> ; Mcnamara, John 
> Subject: [PATCH v3 1/2] version: switch to year/month version numbers


> diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c index
> ffa5730..fd4dd39 100644
> --- a/drivers/net/nfp/nfp_net.c
> +++ b/drivers/net/nfp/nfp_net.c
> @@ -1070,9 +1070,7 @@ nfp_net_infos_get(struct rte_eth_dev *dev, struct
> rte_eth_dev_info *dev_info)
>   };
> 
>   dev_info->reta_size = NFP_NET_CFG_RSS_ITBL_SZ; 
>-#if RTE_VER_MAJOR == 2 && RTE_VER_MINOR >= 1
>   dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
>-#endif  }
> 

Hi Alejandro,

Please note the above change to the nfp_net.c code due to the name change
in the version macro names.


I was originally going to change it to this:

#if RTE_VER_YEAR == 2 && RTE_VER_MONTH >= 1
dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
#endif

Or better:

#if RTE_VERSION >= 0x0201
dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ;
#endif

However, since the code was added in DPDK 2.2 the version check doesn't seem
necessary, so I removed it. If you need it back in for some reason let me know.

John

[dpdk-dev] [PATCH v3 2/2] doc: rename release notes 2.3 to 16.04

2016-02-10 Thread John McNamara

From: "Richardson, Bruce" 

Updated release documentation to reflect new numbering scheme.

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 doc/guides/rel_notes/index.rst |   2 +-
 doc/guides/rel_notes/release_16_04.rst | 138 +
 doc/guides/rel_notes/release_2_3.rst   | 138 -
 3 files changed, 139 insertions(+), 139 deletions(-)
 create mode 100644 doc/guides/rel_notes/release_16_04.rst
 delete mode 100644 doc/guides/rel_notes/release_2_3.rst

diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index 29013cf..84317b8 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -36,7 +36,7 @@ Release Notes
 :numbered:

 rel_description
-release_2_3
+release_16_04
 release_2_2
 release_2_1
 release_2_0
diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
new file mode 100644
index 000..27fc624
--- /dev/null
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -0,0 +1,138 @@
+DPDK Release 16.04
+==
+
+
+**Read this first**
+
+The text below explains how to update the release notes.
+
+Use proper spelling, capitalization and punctuation in all sections.
+
+Variable and config names should be quoted as fixed width text: ``LIKE_THIS``.
+
+Build the docs and view the output file to ensure the changes are correct::
+
+   make doc-guides-html
+
+   firefox build/doc/html/guides/rel_notes/release_16_04.html
+
+
+New Features
+
+
+This section should contain new features added in this release. Sample format:
+
+* **Add a title in the past tense with a full stop.**
+
+  Add a short 1-2 sentence description in the past tense. The description
+  should be enough to allow someone scanning the release notes to understand
+  the new feature.
+
+  If the feature adds a lot of sub-features you can use a bullet list like 
this.
+
+  * Added feature foo to do something.
+  * Enhanced feature bar to do something else.
+
+  Refer to the previous release notes for examples.
+
+* **Virtio 1.0.**
+
+  Enabled virtio 1.0 support for virtio pmd driver.
+
+
+Resolved Issues
+---
+
+This section should contain bug fixes added to the relevant sections. Sample 
format:
+
+* **code/section Fixed issue in the past tense with a full stop.**
+
+  Add a short 1-2 sentence description of the resolved issue in the past tense.
+  The title should contain the code/lib section like a commit message.
+  Add the entries in alphabetic order in the relevant sections below.
+
+
+EAL
+~~~
+
+
+Drivers
+~~~
+
+
+Libraries
+~
+
+
+Examples
+
+
+
+Other
+~
+
+
+Known Issues
+
+
+This section should contain new known issues in this release. Sample format:
+
+* **Add title in present tense with full stop.**
+
+  Add a short 1-2 sentence description of the known issue in the present
+  tense. Add information on any known workarounds.
+
+
+API Changes
+---
+
+This section should contain API changes. Sample format:
+
+* Add a short 1-2 sentence description of the API change. Use fixed width
+  quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past 
tense.
+
+
+ABI Changes
+---
+
+* Add a short 1-2 sentence description of the ABI change that was announced in
+  the previous releases and made in this release. Use fixed width quotes for
+  ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
+
+
+Shared Library Versions
+---
+
+Update any library version updated in this release and prepend with a ``+`` 
sign.
+
+The libraries prepended with a plus sign were incremented in this version.
+
+.. code-block:: diff
+
+ libethdev.so.2
+ librte_acl.so.2
+ librte_cfgfile.so.2
+ librte_cmdline.so.1
+ librte_distributor.so.1
+ librte_eal.so.2
+ librte_hash.so.2
+ librte_ip_frag.so.1
+ librte_ivshmem.so.1
+ librte_jobstats.so.1
+ librte_kni.so.2
+ librte_kvargs.so.1
+ librte_lpm.so.2
+ librte_mbuf.so.2
+ librte_mempool.so.1
+ librte_meter.so.1
+ librte_pipeline.so.2
+ librte_pmd_bond.so.1
+ librte_pmd_ring.so.2
+ librte_port.so.2
+ librte_power.so.1
+ librte_reorder.so.1
+ librte_ring.so.1
+ librte_sched.so.1
+ librte_table.so.2
+ librte_timer.so.1
+ librte_vhost.so.2
diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
deleted file mode 100644
index 7945694..000
--- a/doc/guides/rel_notes/release_2_3.rst
+++ /dev/null
@@ -1,138 +0,0 @@
-DPDK Release 2.3
-
-
-
-**Read this first**
-
-The text below explains how to update the release notes.
-
-Use proper spelling, capitalization and punctuation in all sections.
-
-Variable and config names should be quoted as fixed width text: ``LIKE_THIS``.
-
-Build the docs and view the output file to

[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers

2016-02-10 Thread Mcnamara, John

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 10, 2016 3:12 PM
> To: Mcnamara, John 
> Cc: dev at dpdk.org; Richardson, Bruce 
> Subject: Re: [PATCH v2 1/2] version: switch to year/month version numbers
> 
> ...
>
> I liked the idea of Bruce of renaming these constants to *_YEAR and
> *_MONTH.
> I also think RTE_VER_PATCH_LEVEL should be RTE_VER_MINOR.
> And maybe that RTE_VER_PATCH_RELEASE should be RTE_VER_RELEASE.
> 
> It shouldn't hurt applications as they must use RTE_VERSION and
> rte_version().
> 
> Opinions?


That seems like a good change. I'll resubmit.

John.
--

[dpdk-dev] [PATCH v2] af_packet: make the device detachable

2016-02-10 Thread Wojciech Żmuda

Hi Bruce,

>The use of "deinitialization" sounds awkward

Thank you for your interest. I called it deinitialization in
opposition to an initialization of a device. As I'm not a native
English speaker, I trust your opinion and I'll try to rephrase this.

Hi Bernard,

>What parameters do you use with --vdev option in testpmd

I launch testpmd like this:
# ./testpmd -c f -n 4 --vdev 'eth_af_packet0,iface=eth2' -- -i
--port-topology=chained

Then I can see that af_packet starts:

PMD: Initializing pmd_af_packet for eth_af_packet0
PMD: eth_af_packet0: AF_PACKET MMAP parameters:
PMD: eth_af_packet0:block size 4096
PMD: eth_af_packet0:block count 256
PMD: eth_af_packet0:frame size 2048
PMD: eth_af_packet0:frame count 512
PMD: eth_af_packet0: creating AF_PACKET-backed ethdev on numa socket

When I get to the CLI, I do as follows:

Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Done
testpmd> port stop 0
Stopping ports...
Checking link statuses...
Port 0 Link Down
Done
testpmd> port close 0
Closing ports...
Done
testpmd> port detach 0
Detaching a port...
PMD: Closing AF_PACKET ethdev on numa socket 0
Port 'eth_af_packet0' is detached. Now total ports is 0
Done
testpmd>

In opposition, without the patch detach is impossible:

testpmd> port detach 0
Detaching a port...
EAL: Driver, cannot detach the device
testpmd>

Bernard, Bruce, I have a question, if I may. Do you know what is the
reason that rte_pmd_af_packet_devinit() is the only non-static device
initialization function among all the dpdk drivers? There's even a
comment in the rte_eth_af_packet.h:

/**
 * For use by the EAL only. Called as part of EAL init to set up any dummy NICs
 * configured on command line.
 */
int rte_pmd_af_packet_devinit(const char *name, const char *params);

Despite the comment above, I cannot see this function being called
directly anywhere. Is there any reason it is implemented this way? Or
should I change the definition to static, as it should be called via
proper API functions?

Thank you for your time,
Wojtek

2016-02-09 17:37 GMT+01:00 Bruce Richardson :
> On Tue, Feb 09, 2016 at 05:09:06PM +0100, Wojciech Zmuda wrote:
>> Implement rte_pmd_af_packet_devuninit() exposed through struct
>> rte_driver.uninit() and set dev_flags to RTE_ETH_DEV_DETACHABLE,
>> to allow af_packet device deinitialization with API function
>> rte_eth_dev_detach(). This fixes memory leak by freeing memory
>> allocated during initialization.
>> During device initialization copy device name to ethdev->data to make
>> it compatible with rte_eth_dev_allocated().
>>
>> Signed-off-by: Wojciech Zmuda 
>> ---
>> v2:
>> * Fixed typo and a comment.
>> * Added feature to the 2.3 release notes.
>> * Free memory allocated for rx and tx queues.
>>
>>  doc/guides/rel_notes/release_2_3.rst  |  4 
>>  drivers/net/af_packet/rte_eth_af_packet.c | 37 
>> ++-
>>  2 files changed, 40 insertions(+), 1 deletion(-)
>>
>> diff --git a/doc/guides/rel_notes/release_2_3.rst 
>> b/doc/guides/rel_notes/release_2_3.rst
>> index 7945694..4694646 100644
>> --- a/doc/guides/rel_notes/release_2_3.rst
>> +++ b/doc/guides/rel_notes/release_2_3.rst
>> @@ -39,6 +39,10 @@ This section should contain new features added in this 
>> release. Sample format:
>>
>>Enabled virtio 1.0 support for virtio pmd driver.
>>
>> +* **Added af_packet driver deinitialization function.**
>> +
>> +  Implemented rte_pmd_af_packet_devuninit() exposed through struct
>> +  rte_driver.uninit() to allow af_packet device deinitialization with API 
>> function.
>>
>
> The use of "deinitialization" sounds awkward, and the overall text maybe 
> could be
> made less technical. Maybe talk about "allowing dynamic removal" of af_packet
> devices [or even hotplug of them]?
>
> /Bruce

[dpdk-dev] [PATCH] remove redundant func in PMD_INIT_LOG and PMD_RX_LOG

2016-02-10 Thread Bruce Richardson

On Mon, Dec 28, 2015 at 09:39:44AM -0800, Stephen Hemminger wrote:
> On Mon, 28 Dec 2015 01:28:28 +0800
> Huawei Xie  wrote:
> 
> > 
> > Signed-off-by: Huawei Xie 
> > ---
> >  drivers/net/virtio/virtio_ethdev.c   | 12 +---
> >  drivers/net/vmxnet3/vmxnet3_ethdev.c |  6 +++---
> >  drivers/net/vmxnet3/vmxnet3_rxtx.c   |  2 +-
> >  3 files changed, 9 insertions(+), 11 deletions(-)
> > 
> 
> Looks good, probably worth auditing the code for extra newlines as well at 
> some point.
> 
> Acked-by: Stephen Hemminger 

Patch split into 2 - one for virtio and one for vmxnet3 - and applied to
dpdk-next-net/rel_16_04

/Bruce

[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers

2016-02-10 Thread Thomas Monjalon

2016-02-10 14:33, John McNamara:
>  /**
>   * Major version number i.e. the x in x.y.z
>   */
> -#define RTE_VER_MAJOR 2
> +#define RTE_VER_MAJOR 16
>  
>  /**
>   * Minor version number i.e. the y in x.y.z
>   */
> -#define RTE_VER_MINOR 3
> +#define RTE_VER_MINOR 4
>

I liked the idea of Bruce of renaming these constants to
*_YEAR and *_MONTH.
I also think RTE_VER_PATCH_LEVEL should be RTE_VER_MINOR.
And maybe that RTE_VER_PATCH_RELEASE should be RTE_VER_RELEASE.

It shouldn't hurt applications as they must use RTE_VERSION
and rte_version().

Opinions?

[dpdk-dev] [PATCH] i40e: fix inverted check for ETH_TXQ_FLAGS_NOREFCOUNT

2016-02-10 Thread Bruce Richardson

On Wed, Dec 23, 2015 at 08:42:26AM +, Zhang, Helin wrote:
> 
> 
> > -Original Message-
> > From: Rich Lane [mailto:rich.lane at bigswitch.com]
> > Sent: Wednesday, December 23, 2015 4:08 PM
> > To: dev at dpdk.org
> > Cc: Zhang, Helin
> > Subject: [PATCH] i40e: fix inverted check for ETH_TXQ_FLAGS_NOREFCOUNT
> > 
> > The no-refcount path was being taken without the application opting in to 
> > it.
> > 
> > Reported-by: Mike Stolarchuk 
> > Signed-off-by: Rich Lane 
> Acked-by: Helin Zhang 
> 
> Thanks for the good catch!

Applied to dpdk-next-net/rel_16_04

Thanks,
/Bruce

[dpdk-dev] [dpdk-users] DPDK i40evf problem in receving packet

2016-02-10 Thread Muhammad Zain-ul-Abideen

Hi Saurabh,
Can you tell me what card you are uing


*Regards:*
Zain ul Abideen


*Disclaimer**: The information contained in this e-mail and any attachments
is confidential; it is intended only for use of the individual or entity
named above. If the reader of this message is not the intended recipient,
you are notified that any dissemination, distribution or use of this
information is strictly prohibited. If you have received this communication
in error, please delete it and email confirmation to the sender. Thank You.*


On Wed, Feb 10, 2016 at 6:30 AM, Saurabh Mishra 
wrote:

> Hi Qian --
>
> Any suggestions? This is bit urgent.
>
> /Saurabh
>
> On Sat, Feb 6, 2016 at 9:22 AM, Saurabh Mishra 
> wrote:
>
> > Hi Qian --
> >
> >
> > Here's the data from Host:
> >
> > [root at oscompute3 ~]# ethtool -i p3p1
> >
> > driver: i40e
> >
> > version: 1.0.11-k
> >
> > firmware-version: f4.40 a1.4 n04.53 e80001dc0
> >
> > bus-info: :04:00.0
> >
> > supports-statistics: yes
> >
> > supports-test: yes
> >
> > supports-eeprom-access: yes
> >
> > supports-register-dump: yes
> >
> > supports-priv-flags: no
> >
> > [root at oscompute3 ~]# ethtool -i p3p2
> >
> > driver: i40e
> >
> > version: 1.0.11-k
> >
> > firmware-version: f4.40 a1.4 n04.53 e80001dc0
> >
> > bus-info: :04:00.1
> >
> > supports-statistics: yes
> >
> > supports-test: yes
> >
> > supports-eeprom-access: yes
> >
> > supports-register-dump: yes
> >
> > supports-priv-flags: no
> >
> > [root at oscompute3 ~]#
> >
> > EthApp> drvinfo
> >
> > Port 0 driver: rte_i40evf_pmd (ver: RTE 2.2.0)
> >
> > Port 1 driver: rte_i40evf_pmd (ver: RTE 2.2.0)
> >
> > EthApp>
> >
> > On Fri, Feb 5, 2016 at 4:59 PM, Xu, Qian Q  wrote:
> >
> >> What's your current firmware info, can u run ethtool -i port_interface
> to
> >> check?
> >>
> >> Thanks
> >> Qian
> >>
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Saurabh Mishra
> >> Sent: Saturday, February 06, 2016 6:33 AM
> >> To: dev at dpdk.org; users at dpdk.org
> >> Subject: [dpdk-dev] DPDK i40evf problem in receving packet
> >>
> >> Hi,
> >>
> >> I'm seeing two problems:
> >>
> >> 1)  when use our kernel '3.10.88-8.0.0.0.6', we only receive first
> >> packet but not subsequent ones at all after that. However, when I use
> >> centos7.0, then l2fwd is able to receive all the packets.
> >>
> >> 2) I've also seen that on centos7.0, symmetric_mp itself is not working.
> >> dev start fails with 280 error.
> >>
> >> i40evf is giving us lot of headache. The i40evf kernel driver works
> >> without any problem. Host is a centos7 KVM. I've already upgraded
> firmware
> >> to latest.
> >>
> >> [root at localhost ~]# uname -a
> >>
> >> Linux localhost.localdomain 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon Jan 25
> >> 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> [root at localhost ~]#
> >>
> >> [root at localhost ~]# ./symmetric_mp fakeelf --file-prefix=virtaddr -c2
> >> -m2048 -n1 --base-virtaddr=0x7fa0 --proc-type=primary -- -p 3
> >> --num-procs=1 --proc-id=0
> >>
> >> EAL: Detected lcore 0 as core 0 on socket 0
> >>
> >> EAL: Detected lcore 1 as core 0 on socket 0
> >>
> >> EAL: Support maximum 128 logical core(s) by configuration.
> >>
> >> EAL: Detected 2 lcore(s)
> >>
> >> EAL: No free hugepages reported in hugepages-1048576kB
> >>
> >> EAL: VFIO modules not all loaded, skip VFIO support...
> >>
> >> EAL: Setting up physically contiguous memory...
> >>
> >> EAL: Ask a virtual area of 0x2a80 bytes
> >>
> >> EAL: Virtual area found at 0x7fa0 (size = 0x2a80)
> >>
> >> EAL: Ask a virtual area of 0x20 bytes
> >>
> >> EAL: Virtual area found at 0x7fa02a80 (size = 0x20)
> >>
> >> EAL: Ask a virtual area of 0x5400 bytes
> >>
> >> EAL: Virtual area found at 0x7fa02aa0 (size = 0x5400)
> >>
> >> EAL: Ask a virtual area of 0x40 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07ea0 (size = 0x40)
> >>
> >> EAL: Ask a virtual area of 0x20 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07ee0 (size = 0x20)
> >>
> >> EAL: Ask a virtual area of 0x20 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07f00 (size = 0x20)
> >>
> >> EAL: Ask a virtual area of 0x60 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07f20 (size = 0x60)
> >>
> >> EAL: Ask a virtual area of 0x20 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07f80 (size = 0x20)
> >>
> >> EAL: Ask a virtual area of 0x20 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07fa0 (size = 0x20)
> >>
> >> EAL: Ask a virtual area of 0x40 bytes
> >>
> >> EAL: Virtual area found at 0x7fa07fc0 (size = 0x40)
> >>
> >> EAL: Requesting 1024 pages of size 2MB from socket 0
> >>
> >> EAL: TSC frequency is ~260 KHz
> >>
> >> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using
> >> unreliable clock cycles !
> >>
> >> EAL: Master lcore 1 is ready (tid=3e55900;cpuset=[1])
> >>
> >> EAL: PCI device :00:09.0 on

[dpdk-dev] [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers.

2016-02-10 Thread Bruce Richardson

On Thu, Dec 17, 2015 at 10:30:42AM +, Tom Kiely wrote:
> Sorry for the delay in replying to this thread. I was on vacation for the
> last 3 days. Please see inline for my comments.
> 
> On 12/15/2015 02:37 PM, Ananyev, Konstantin wrote:
> >
> >>-Original Message-
> >>From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> >>Sent: Monday, December 14, 2015 9:35 PM
> >>To: Ananyev, Konstantin
> >>Cc: Zhang, Helin; dev at dpdk.org; Tom Kiely
> >>Subject: Re: [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers.
> >>
> >>On Mon, 14 Dec 2015 19:57:10 +
> >>"Ananyev, Konstantin"  wrote:
> >>
> >>>
> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Monday, December 14, 2015 7:25 PM
> To: Ananyev, Konstantin
> Cc: Zhang, Helin; dev at dpdk.org; Tom Kiely
> Subject: Re: [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers.
> 
> On Mon, 14 Dec 2015 19:12:26 +
> "Ananyev, Konstantin"  wrote:
> 
> >
> >>From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> >>Sent: Friday, December 11, 2015 4:59 PM
> >>To: Zhang, Helin; Ananyev, Konstantin
> >>Cc: dev at dpdk.org; Tom Kiely; Stephen Hemminger
> >>Subject: [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers.
> >>
> >>From: Tom Kiely 
> >>
> >>SRIOV VFs support "transparent" vlans. Traffic from/to a VM
> >>associated with a VF is tagged/untagged with the specified
> >>vlan in a manner intended to be totally transparent to the VM.
> >>
> >>The vlan is specified by "ip link set  vf  vlan ".
> >>The VM is not configured for any vlan on the VF and the VM
> >>should never see these transparent vlan headers for that reason.
> >>
> >>However, in practice these vlan headers are being received by
> >>the VM which discards the packets as that vlan is unknown to it.
> >>The Linux kernel explicitly discards such vlan headers but DPDK
> >>does not.
> >>This patch mirrors the kernel behaviour for SRIOV VFs only
> >
> >I have few concerns about that approach:
> >
> >1. I don't think vlan_tci info should *always* be stripped by vf RX 
> >routine.
> >There could be configurations when that information might be needed by 
> >upper layer.
> >Let say VF can be member of 2 or more VLANs and upper layer would like 
> >to have that information
> >for further processing.
> >Or special mirror VF, that does traffic snnoping, or something else.
> >2. Proposed implementation would introduce a slowdown for all VF RX 
> >routines.
> >3. From the description it seems like the aim is to clear VLAN 
> >information for the RX packet.
> >Though the patch actually clears VLAN info only for the RX packet whose 
> >VLAN tag is not present inside SW copy of VFTA table.
> >Which makes no much point to me:
> >If VLAN is not present in HW VFTA table, then packet with that VLAN tag 
> >will be discarded by HW anyway.
> >If it is present inside VFTA table (both SW & HW), then VLAN information 
> >would be preserved with and without the patch.
> >
> >If you need to clear VLAN information, why not to do it on the upper 
> >layer - inside your application itself?
> >Either create some sort of wrapper around rx_burst(), or setup an RX 
> >call-back for your VF device.
> >
> >Konstantin
> 
> The aim is to get SRIOV to work when the transparent VLAN tag feature is 
> used.
> Please talk to the Linux driver team. Similar code exists there in 
> ixgbevf_process_skb_fields.
> >>>
> >>>Ah ok, I realised what you are trying to achieve now:
> >>>You setup HW VFTA[] from the PF, so from VF point of view SW copy of the 
> >>>VFTA[] remains unset.
> >>>So HW will pass VLAN packet in, but then SW will clear VLAN tag.
> >>>Ok, that clears #3 above, but I think #1,2 still remain.
> >>On the host, what configured is a vlan tag per VF per guest
> >>
> >>Tom had more info in the original mail.
> >>
> >>http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/28932
> >>
> The other option is have a copy of all the receive logic which is only
> used by VF code.
> >>>Why that's the only option?
> >>>Why can't you clear that VLAN information above the PMD layer?
> >>>Keep/obtain a copy of VFTA[] somewhere on the upper layer,
> >>>and do actual clear after rx_burst() returns?
> >>>Konstantin
> >>The problem is that the guest is supposed to not see the VLAN tags (it has 
> >>no reason to),
> >>but the hardware leaves a VLAN tag on there.
> >Yes, I understand what you are trying to achieve.
> >  What I am trying to say:
> >1. VLAN tag removing shouldn't be forced for all VFs.
> >I think there are scenarios where existing behaviour (keeping vlan_tci and 
> >ol_flags intact) are what people need.
> >One example would be mirror VF doing other VFs traffic

[dpdk-dev] [PATCH] doc: drop old naming of the project

2016-02-10 Thread Thomas Monjalon

2016-02-09 17:20, Bruce Richardson:
> On Mon, Feb 08, 2016 at 02:39:02PM +0100, Thomas Monjalon wrote:
> > It was requested by Intel, more than one year ago, to replace the name
> > "Intel DPDK" by "DPDK".
> > Some references to the old name were still in some docs and code comments,
> > leading to confusion.
> > 
> > Fixes: ac8ada004c12 ("doc: remove Intel references from release notes")
> > 
> > Signed-off-by: Thomas Monjalon 
> 
> Acked-by: Bruce Richardson 

Applied

[dpdk-dev] [PATCH v3] remove extra parentheses in return statement

2016-02-10 Thread Thomas Monjalon

2016-01-27 21:58, Huawei Xie:
> v3 changes:
>  remove other extra parentheses in 'return (logical expressions)'
> which checkpatch doesn't report as error
>  remove extra parentheses in return statement which crosses
> multiple line
>  fix the document
> 
> v2 changes:
>  add missed commit message in v1
> 
> fix the error reported by checkpatch:
>   "ERROR: return is not a function, parentheses are not required"
> 
> remove parentheses in return like:
>   "return (logical expressions)"
> 
> remove parentheses in return a function like:
>   "return (rte_mempool_lookup(...))"
> 
> Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return")
> 
> Signed-off-by: Huawei Xie 
> ---
>  95 files changed, 436 insertions(+), 436 deletions(-)

Wow 95 files changed! good work
Applied, thanks

[dpdk-dev] [PATCH] virtio: fix virtio_net_hdr desc pointing to the same buffer

2016-02-10 Thread Bruce Richardson

On Mon, Dec 14, 2015 at 01:58:34PM +, Xie, Huawei wrote:
> 
> 
> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Monday, December 14, 2015 9:45 PM
> > To: Xie, Huawei
> > Cc: Yuanhan Liu; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] virtio: fix virtio_net_hdr desc
> > pointing to the same buffer
> > 

> > 2015-12-14 13:38, Xie, Huawei:
> Np. There is no issue either apply this patch or delay it to 2.3.
> 

Patch now applied to dpdk-next-net/rel_16_04.

/Bruce

[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers

2016-02-10 Thread Bruce Richardson

On Wed, Feb 10, 2016 at 04:11:40PM +0100, Thomas Monjalon wrote:
> 2016-02-10 14:33, John McNamara:
> >  /**
> >   * Major version number i.e. the x in x.y.z
> >   */
> > -#define RTE_VER_MAJOR 2
> > +#define RTE_VER_MAJOR 16
> >  
> >  /**
> >   * Minor version number i.e. the y in x.y.z
> >   */
> > -#define RTE_VER_MINOR 3
> > +#define RTE_VER_MINOR 4
> >
> 
> I liked the idea of Bruce of renaming these constants to
> *_YEAR and *_MONTH.
> I also think RTE_VER_PATCH_LEVEL should be RTE_VER_MINOR.
> And maybe that RTE_VER_PATCH_RELEASE should be RTE_VER_RELEASE.
> 
> It shouldn't hurt applications as they must use RTE_VERSION
> and rte_version().
> 
> Opinions?

Well, I always tend to like Bruce's ideas too, so +1 from me :-)

/Bruce

[dpdk-dev] i40e: cannot change mtu to enable jumbo frame

2016-02-10 Thread Zhang, Helin



> -Original Message-
> From: Julien Meunier [mailto:julien.meunier at 6wind.com]
> Sent: Wednesday, February 10, 2016 12:36 AM
> To: Zhang, Helin ; dev at dpdk.org
> Subject: i40e: cannot change mtu to enable jumbo frame
> 
> Hello Helin,
> 
> I tried to send jumbo frames to a i40e card. However, I observed that all 
> frames
> are dropped. Moreover, set_mtu function is not implemented on i40e PMD.
> 
>  > testpmd --log-level 8 --huge-dir=/mnt/huge -n 4 -l 2,18 --socket-mem
> 1024,1024 -w :02:00.0 -w :02:00.2 -- -i --nb-cores=1
> --nb-ports=2 --total-num-mbufs=65536
> 
> =
> Configuration
> =
> 
> +---+  +-+
> |   |  | |
> | tgen  |  | |
> |   +--+ port 0  |
> |   |  | |
> |   |  | |
> |   |  | |
> |   |  | |
> |   +--+ port 1  |
> |   |  | |
> +---+  +-+
> 
> DPDK: DPDK-v2.2
> 
> ==
> MTU = 1500
> ==
> Packet sent from a tgen
>  > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP))  >
> len(p) = 1514
> 
> testpmd> start
> PMD: i40e_rxd_to_vlan_tci(): Mbuf vlan_tci: 0, vlan_tci_outer: 0
> testpmd> stop
> Telling cores to stop...
> Waiting for lcores to finish...
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats start ***
> PMD: i40e_update_vsi_stats(): rx_bytes:1518
> PMD: i40e_update_vsi_stats(): rx_unicast:  1
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): rx_bytes:1514
> PMD: i40e_dev_stats_get(): rx_unicast:  1
> PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1
> PMD: i40e_dev_stats_get(): rx_size_1522: 1
> PMD: i40e_dev_stats_get(): *** PF stats end ***
> 
>-- Forward statistics for port 0  --
>RX-packets: 1  RX-dropped: 0 RX-total: 1
>TX-packets: 0  TX-dropped: 0 TX-total: 0
> 
> 
> PMD: i40e_update_vsi_stats(): *** VSI[14] stats start ***
> PMD: i40e_update_vsi_stats(): tx_bytes:1514
> PMD: i40e_update_vsi_stats(): tx_unicast:  1
> PMD: i40e_update_vsi_stats(): *** VSI[14] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): tx_bytes:1514
> PMD: i40e_dev_stats_get(): tx_unicast:  1
> PMD: i40e_dev_stats_get(): tx_size_1522: 1
> PMD: i40e_dev_stats_get(): *** PF stats end ***
> 
>-- Forward statistics for port 1 --
>RX-packets: 0  RX-dropped: 0 RX-total: 0
>TX-packets: 1  TX-dropped: 0 TX-total: 1
> 
> 
> 
>+ Accumulated forward statistics for all ports+
>RX-packets: 1  RX-dropped: 0 RX-total: 1
>TX-packets: 1  TX-dropped: 0 TX-total: 1
> 
> 
> 
> 
> => OK
> 
> ==
> MTU = 1600
> ==
> Packet sent
>  > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP))  >
> len(p) = 1614
> 
> testpmd> port config mtu 0 1600
> rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95
> testpmd> port config mtu 1 1600
> rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95
> testpmd> start
> testpmd> stop
> Telling cores to stop...
> Waiting for lcores to finish...
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats start ***
> PMD: i40e_update_vsi_stats(): rx_bytes:1618
> PMD: i40e_update_vsi_stats(): rx_unicast:  1
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): rx_bytes:1614
> PMD: i40e_dev_stats_get(): rx_unicast:  1
> PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1
> PMD: i40e_dev_stats_get(): rx_size_big:  1
> PMD: i40e_dev_stats_get(): *** PF stats end ***
> 
>-- Forward statistics for port 0  --
>RX-packets: 1  RX-dropped: 0 RX-total: 1
>TX-packets: 0  TX-dropped: 0 TX-total: 0
> 
> 
> PMD: i40e_update_vsi_stats(): *** VSI[14] stats start ***
> PMD: i40e_update_vsi_stats(): tx_bytes:0
> PMD: i40e_update_vsi_stats(): tx_unicast:  0
> PMD: i40e_update_vsi_stats(): *** VSI[14] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): tx_bytes:0
>

[dpdk-dev] [PATCH v2 ] ixgbe: speed up transmit

2016-02-10 Thread Bruce Richardson

On Fri, Dec 11, 2015 at 06:52:36PM +, Ananyev, Konstantin wrote:
> > -Original Message-
> > From: Stephen Hemminger [mailto:shemming at brocade.com]
> > Sent: Friday, December 11, 2015 4:48 PM
> > To: dev at dpdk.org; Zhang, Helin; Ananyev, Konstantin
> > Subject: Re: [PATCH v2 ] ixgbe: speed up transmit
> > 
> > On Fri, 13 Nov 2015 08:10:13 -0800
> > Stephen Hemminger  wrote:
> > 
> > > From: Stephen Hemminger 
> > >
> > > The freeing of mbuf's in ixgbe is one of the observable hot spots
> > > under load. Optimize it by doing bulk free of mbufs using code similar
> > > to i40e and fm10k.
> > >
> > > Drop the no longer needed micro-optimization for the no refcount flag.
> > >
> > > Signed-off-by: Stephen Hemminger 
> > 
> > How come this patch got no review or comments?
> > It gets a visible performance gain of up to 10% in some cases.
> > 
> > I understand maintainers are busy with internal work, but they need
> > to read mailing list as well.
> 
> Yep, I missed it somehow.
> BTW, by some reason I couldn't find it patchwork
> Was it submitted too late for 2.2 timeframe?
> Anyway, about the patch itself: it looks good to me and indeed it provides 
> 10%+
> performance improvement.
> So here is my ACK for it.
> Konstantin

Applied to dpdk-next-net/rel_16_04, with edited title to clarify that it's the 
non-vectorized tx code that is being sped up, and not all ixgbe transmit paths.

Thanks,
/Bruce

[dpdk-dev] [PATCH v3] mempool: reduce rte_mempool structure size

2016-02-10 Thread Keith Wiles

The rte_mempool structure is changed, which will cause an ABI change
for this structure. Providing backward compat is not reasonable
here as this structure is used in multiple defines/inlines.

Allow mempool cache support to be dynamic depending on if the
mempool being created needs cache support. Saves about 1.5M of
memory used by the rte_mempool structure.

Allocating small mempools which do not require cache can consume
larges amounts of memory if you have a number of these mempools.

Signed-off-by: Keith Wiles 
---
* Patch v3 fix up the ifdefs to correct some problems in removing ifdef
  lines. Added the ABI deprecation notice to the document file.
* Patch v2 to add some comments and setup for RTE_NEXT_ABI changes.

 app/test/test_mempool.c  |  5 +++
 doc/guides/rel_notes/deprecation.rst |  7 +++
 lib/librte_mempool/rte_mempool.c | 82 +---
 lib/librte_mempool/rte_mempool.h | 46 
 4 files changed, 127 insertions(+), 13 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index f0f823b..f3fba50 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -122,8 +122,13 @@ test_mempool_basic(void)
return -1;

printf("get private data\n");
+#ifdef RTE_NEXT_ABI
+   if (rte_mempool_get_priv(mp) != (char *)mp +
+   MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
+#else
if (rte_mempool_get_priv(mp) !=
(char*) mp + MEMPOOL_HEADER_SIZE(mp, mp->pg_num))
+#endif
return -1;

printf("get physical address of an object\n");
diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index e94d4a2..1b9d25e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -49,3 +49,10 @@ Deprecation Notices
   commands (such as RETA update in testpmd).  This should impact
   CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE.
   It should be integrated in release 2.3.
+
+* ABI change is planned for the rte_mempool structure to allow mempool
+  cache support to be dynamic depending on the mempool being created
+  needing cache support. Saves about 1.5M of memory per rte_mempool structure
+  by removing the per lcore cache memory. Change will occur after DPDK 16.04
+  release.
+
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index aff5f6d..5f21eaa 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -452,12 +452,17 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
unsigned elt_size,
/* compilation-time checks */
RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
  RTE_CACHE_LINE_MASK) != 0);
+#ifdef RTE_NEXT_ABI
+   RTE_BUILD_BUG_ON((sizeof(struct rte_mempool_cache) &
+ RTE_CACHE_LINE_MASK) != 0);
+#else
 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
RTE_BUILD_BUG_ON((sizeof(struct rte_mempool_cache) &
  RTE_CACHE_LINE_MASK) != 0);
RTE_BUILD_BUG_ON((offsetof(struct rte_mempool, local_cache) &
  RTE_CACHE_LINE_MASK) != 0);
 #endif
+#endif /* RTE_NEXT_ABI */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
RTE_BUILD_BUG_ON((sizeof(struct rte_mempool_debug_stats) &
  RTE_CACHE_LINE_MASK) != 0);
@@ -527,9 +532,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
unsigned elt_size,
 */
int head = sizeof(struct rte_mempool);
int new_size = (private_data_size + head) % page_size;
-   if (new_size) {
+   if (new_size)
private_data_size += page_size - new_size;
-   }
}

/* try to allocate tailq entry */
@@ -544,7 +548,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
unsigned elt_size,
 * store mempool objects. Otherwise reserve a memzone that is large
 * enough to hold mempool header and metadata plus mempool objects.
 */
+#ifdef RTE_NEXT_ABI
+   mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size);
+   mempool_size += private_data_size;
+#else
mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num) + private_data_size;
+#endif /* RTE_NEXT_ABI */
mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
if (vaddr == NULL)
mempool_size += (size_t)objsz.total_size * n;
@@ -598,9 +607,22 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
unsigned elt_size,
mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
mp->private_data_size = private_data_size;

+#ifdef RTE_NEXT_ABI
+   /*
+* local_cache pointer is set even if cache_size is zero.
+* The local_cache points to just past the elt_pa[] array.
+*/
+   mp->local_cache = (struct rte_mempool_cache *)
+

[dpdk-dev] [PATCH] virtio: fix rx ring descriptor starvation

2016-02-10 Thread Bruce Richardson

On Tue, Jan 05, 2016 at 07:13:04AM +, Xie, Huawei wrote:
> On 12/17/2015 7:18 PM, Tom Kiely wrote:
> >
> >
> > On 11/25/2015 05:32 PM, Xie, Huawei wrote:
> >> On 11/13/2015 5:33 PM, Tom Kiely wrote:
> >>> If all rx descriptors are processed while transient
> >>> mbuf exhaustion is present, the rx ring ends up with
> >>> no available descriptors. Thus no packets are received
> >>> on that ring. Since descriptor refill is performed post
> >>> rx descriptor processing, in this case no refill is
> >>> ever subsequently performed resulting in permanent rx
> >>> traffic drop.
> >>>
> >>> Signed-off-by: Tom Kiely 
> >>> ---
> >>>   drivers/net/virtio/virtio_rxtx.c |6 --
> >>>   1 file changed, 4 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/virtio/virtio_rxtx.c
> >>> b/drivers/net/virtio/virtio_rxtx.c
> >>> index 5770fa2..a95e234 100644
> >>> --- a/drivers/net/virtio/virtio_rxtx.c
> >>> +++ b/drivers/net/virtio/virtio_rxtx.c
> >>> @@ -586,7 +586,8 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf
> >>> **rx_pkts, uint16_t nb_pkts)
> >>>   if (likely(num > DESC_PER_CACHELINE))
> >>>   num = num - ((rxvq->vq_used_cons_idx + num) %
> >>> DESC_PER_CACHELINE);
> >>>   -if (num == 0)
> >>> +/* Refill free descriptors even if no pkts recvd */
> >>> +if (num == 0 && virtqueue_full(rxvq))
> >> Should the return condition be that no used buffers and we have avail
> >> descs in avail ring, i.e,
> >>  num == 0 && rxvq->vq_free_cnt != rxvq->vq_nentries
> >>
> >> rather than
> >>  num == 0 && rxvq->vq_free_cnt == 0
> > Yes we could do that but I don't see a good reason to wait until the
> > vq_free_cnt == vq_nentries
> > before attempting the refill. The existing code will attempt refill
> > even if only 1 packet was received
> > and the free count is small. To me it seems safer to extend that to
> > try refill even if no packet was received
> > but the free count is non-zero.
> The existing code attempt to refill only if 1 packet was received.
> 
> If we want to refill even no packet was received, then the strict
> condition should be
> num == 0 && rxvq->vq_free_cnt != rxvq->vq_nentries
> 
> The safer condition, what you want to use,  should be
> num == 0 && !virtqueue_full(...)
> rather than
> num == 0 && virtqueue_full(...)
> 
> We could simplify things a bit, just remove this check, if the following
> receiving code already takes care of the "num == 0" condition.
> 
> I find virtqueue_full is confusing, maybe we could change it to some
> other meaningful name.
> 
> >
> >Tom
> >
Ping.

Tom and Huawei, what is the status of this patch? Will there be a V2?

/Bruce

[dpdk-dev] [PATCH v2 1/2] driver/net/mpipe: add rte_vect.h and enable CONFIG_RTE_LIBRTE_LPM

2016-02-10 Thread Liming Sun

Thanks Bruce. I split the changes as suggested and resubmitted as below.

http://dpdk.org/dev/patchwork/patch/10454/
http://dpdk.org/dev/patchwork/patch/10455/

- Liming

-Original Message-
From: Bruce Richardson [mailto:bruce.richard...@intel.com] 
Sent: Tuesday, February 09, 2016 10:56 AM
To: Liming Sun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 1/2] driver/net/mpipe: add rte_vect.h and 
enable CONFIG_RTE_LIBRTE_LPM

On Fri, Jan 08, 2016 at 09:39:07AM -0500, Liming Sun wrote:
> rte_vect.h was missing earlier thus LPM was disabled and l3fwd is not 
> able to compile. This commit implements the vector api and enable LPM 
> in the tilegx configuration by default. It also includes a minor 
> optimization to use __insn_fetchadd4() instead of
> rte_atomic32_xxx() in mpipe_dp_enter/mpipe_dp_exit to avoid the 
> unnecessary memory fence.

This looks like it should be two patches to me. One patch to add the missing 
dependency and get lpm to work. The second patch should then contain the driver 
optimization. Do you agree?

/Bruce

PS: the commit title prefix for the first patch should probably be "eal/tile"
rather than mpipe, since it's not directly affecting the mpipe driver.

> 
> Signed-off-by: Liming Sun 
> Acked-by: Zhigang Lu 
> ---
>  config/defconfig_tile-tilegx-linuxapp-gcc  |  2 +-
>  drivers/net/mpipe/mpipe_tilegx.c   | 18 +++--
>  lib/librte_eal/common/include/arch/tile/rte_vect.h | 93 
> ++
>  3 files changed, 107 insertions(+), 6 deletions(-)  create mode 
> 100644 lib/librte_eal/common/include/arch/tile/rte_vect.h
> 
[snip]
>+  mbuf->next = NULL;
>  
>   PMD_DEBUG_RX("%s: RX mbuf %p, buffer %p, buf_addr %p, size %d\n",
>mpipe_name(priv), mbuf, va, mbuf->buf_addr, size); diff 
> --git 
> a/lib/librte_eal/common/include/arch/tile/rte_vect.h 
> b/lib/librte_eal/common/include/arch/tile/rte_vect.h
> new file mode 100644
> index 000..32d768a
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/tile/rte_vect.h
> @@ -0,0 +1,93 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) EZchip Semiconductor Ltd. 2015.

Maybe update the copyright year?

> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
[snip]

[dpdk-dev] [PATCH v2 2/2] driver/net/mpipe: fix the crash/hung issue when testpmd quits

2016-02-10 Thread Liming Sun

I split the change and resubmitted it as v3 
http://dpdk.org/dev/patchwork/patch/10456/
It includes one of the two fixes. The other one is optional for now and could 
be submitted separately later.

Thanks,
Liming

-Original Message-
From: Bruce Richardson [mailto:bruce.richard...@intel.com] 
Sent: Tuesday, February 09, 2016 10:59 AM
To: Liming Sun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 2/2] driver/net/mpipe: fix the crash/hung 
issue when testpmd quits

On Fri, Jan 08, 2016 at 09:39:08AM -0500, Liming Sun wrote:
> 1. Fixed the compiling issue of the ethtool example on tilegx
>platform.
> 2. Fixed the hung/crash issue when quitting testpmd under high
>traffic rate. The buffer error bit needs to be checked before
>processing the idesc and releasing the buffer. Code logic is
>also simplified.
> 

Again, with two issues being solved, this looks like two patches. Can you also 
describe exactly the causes of the individual issues in each patch and how the 
patch fixes them. 
Please also include a fixes line for each patch as described here:

http://dpdk.org/doc/guides/contributing/patches.html#commit-messages-body

Thanks,
/Bruce

[dpdk-dev] [PATCH] hash: fix CRC32c computation

2016-02-10 Thread Pattan, Reshma

Hi,

Small typo.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Didier Pallard
> Sent: Tuesday, December 22, 2015 9:35 AM
> To: dev at dpdk.org; Richardson, Bruce ; De 
> Lara
> Guarch, Pablo 
> Subject: [dpdk-dev] [PATCH] hash: fix CRC32c computation
> diff --git a/lib/librte_hash/rte_hash_crc.h b/lib/librte_hash/rte_hash_crc.h 
> index
> 78a34b7..485f8a2 100644
> --- a/lib/librte_hash/rte_hash_crc.h
> +++ b/lib/librte_hash/rte_hash_crc.h
> + * Use single crc32 instruction to perform a hash on a byte value.

"on a byte value" should be ===> "on 2 bytes".

> + * Fall back to software crc32 implementation in case SSE4.2 is
> + * not supported
> + *
> + * @param data
> + *   Data to perform hash on.
> + * @param init_val
> + *   Value to initialise hash generator.
> + * @return
> + *   32bit calculated hash value.
> + */
> +static inline uint32_t
> +rte_hash_crc_2byte(uint16_t data, uint32_t init_val) { #if defined
> +RTE_ARCH_I686 || defined RTE_ARCH_X86_64
> + if (likely(crc32_alg & CRC32_SSE42))
> + return crc32c_sse42_u16(data, init_val); #endif
> +
> + return crc32c_2bytes(data, init_val);
> +}
> +

Thanks,
Reshma

[dpdk-dev] [PATCH v2 2/2] doc: rename release notes 2.3 to 16.04

2016-02-10 Thread John McNamara

From: "Richardson, Bruce" 

Updated release documentation to reflect new numbering scheme.

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 doc/guides/rel_notes/index.rst |   2 +-
 doc/guides/rel_notes/release_16_04.rst | 138 +
 doc/guides/rel_notes/release_2_3.rst   | 138 -
 3 files changed, 139 insertions(+), 139 deletions(-)
 create mode 100644 doc/guides/rel_notes/release_16_04.rst
 delete mode 100644 doc/guides/rel_notes/release_2_3.rst

diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index 29013cf..84317b8 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -36,7 +36,7 @@ Release Notes
 :numbered:

 rel_description
-release_2_3
+release_16_04
 release_2_2
 release_2_1
 release_2_0
diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
new file mode 100644
index 000..27fc624
--- /dev/null
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -0,0 +1,138 @@
+DPDK Release 16.04
+==
+
+
+**Read this first**
+
+The text below explains how to update the release notes.
+
+Use proper spelling, capitalization and punctuation in all sections.
+
+Variable and config names should be quoted as fixed width text: ``LIKE_THIS``.
+
+Build the docs and view the output file to ensure the changes are correct::
+
+   make doc-guides-html
+
+   firefox build/doc/html/guides/rel_notes/release_16_04.html
+
+
+New Features
+
+
+This section should contain new features added in this release. Sample format:
+
+* **Add a title in the past tense with a full stop.**
+
+  Add a short 1-2 sentence description in the past tense. The description
+  should be enough to allow someone scanning the release notes to understand
+  the new feature.
+
+  If the feature adds a lot of sub-features you can use a bullet list like 
this.
+
+  * Added feature foo to do something.
+  * Enhanced feature bar to do something else.
+
+  Refer to the previous release notes for examples.
+
+* **Virtio 1.0.**
+
+  Enabled virtio 1.0 support for virtio pmd driver.
+
+
+Resolved Issues
+---
+
+This section should contain bug fixes added to the relevant sections. Sample 
format:
+
+* **code/section Fixed issue in the past tense with a full stop.**
+
+  Add a short 1-2 sentence description of the resolved issue in the past tense.
+  The title should contain the code/lib section like a commit message.
+  Add the entries in alphabetic order in the relevant sections below.
+
+
+EAL
+~~~
+
+
+Drivers
+~~~
+
+
+Libraries
+~
+
+
+Examples
+
+
+
+Other
+~
+
+
+Known Issues
+
+
+This section should contain new known issues in this release. Sample format:
+
+* **Add title in present tense with full stop.**
+
+  Add a short 1-2 sentence description of the known issue in the present
+  tense. Add information on any known workarounds.
+
+
+API Changes
+---
+
+This section should contain API changes. Sample format:
+
+* Add a short 1-2 sentence description of the API change. Use fixed width
+  quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past 
tense.
+
+
+ABI Changes
+---
+
+* Add a short 1-2 sentence description of the ABI change that was announced in
+  the previous releases and made in this release. Use fixed width quotes for
+  ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
+
+
+Shared Library Versions
+---
+
+Update any library version updated in this release and prepend with a ``+`` 
sign.
+
+The libraries prepended with a plus sign were incremented in this version.
+
+.. code-block:: diff
+
+ libethdev.so.2
+ librte_acl.so.2
+ librte_cfgfile.so.2
+ librte_cmdline.so.1
+ librte_distributor.so.1
+ librte_eal.so.2
+ librte_hash.so.2
+ librte_ip_frag.so.1
+ librte_ivshmem.so.1
+ librte_jobstats.so.1
+ librte_kni.so.2
+ librte_kvargs.so.1
+ librte_lpm.so.2
+ librte_mbuf.so.2
+ librte_mempool.so.1
+ librte_meter.so.1
+ librte_pipeline.so.2
+ librte_pmd_bond.so.1
+ librte_pmd_ring.so.2
+ librte_port.so.2
+ librte_power.so.1
+ librte_reorder.so.1
+ librte_ring.so.1
+ librte_sched.so.1
+ librte_table.so.2
+ librte_timer.so.1
+ librte_vhost.so.2
diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
deleted file mode 100644
index 7945694..000
--- a/doc/guides/rel_notes/release_2_3.rst
+++ /dev/null
@@ -1,138 +0,0 @@
-DPDK Release 2.3
-
-
-
-**Read this first**
-
-The text below explains how to update the release notes.
-
-Use proper spelling, capitalization and punctuation in all sections.
-
-Variable and config names should be quoted as fixed width text: ``LIKE_THIS``.
-
-Build the docs and view the output file to

[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers

2016-02-10 Thread John McNamara

From: "Richardson, Bruce" 

As discussed on list, switch numbering scheme to be based on year/month.
Release 2.3 then becomes 16.04.

Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html

Also, added zero padding to the month so that it appear as 16.04 and
not 16.4 in "make showversion" and rte_version().

Signed-off-by: Bruce Richardson 
Signed-off-by: John McNamara 
---
 lib/librte_eal/common/include/rte_version.h | 10 +-
 mk/rte.sdkconfig.mk |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_version.h 
b/lib/librte_eal/common/include/rte_version.h
index 6b1890e..4784e67 100644
--- a/lib/librte_eal/common/include/rte_version.h
+++ b/lib/librte_eal/common/include/rte_version.h
@@ -50,17 +50,17 @@ extern "C" {
 /**
  * String that appears before the version number
  */
-#define RTE_VER_PREFIX "RTE"
+#define RTE_VER_PREFIX "DPDK"

 /**
  * Major version number i.e. the x in x.y.z
  */
-#define RTE_VER_MAJOR 2
+#define RTE_VER_MAJOR 16

 /**
  * Minor version number i.e. the y in x.y.z
  */
-#define RTE_VER_MINOR 3
+#define RTE_VER_MINOR 4

 /**
  * Patch level number i.e. the z in x.y.z
@@ -105,13 +105,13 @@ rte_version(void)
if (version[0] != 0)
return version;
if (strlen(RTE_VER_SUFFIX) == 0)
-   snprintf(version, sizeof(version), "%s %d.%d.%d",
+   snprintf(version, sizeof(version), "%s %d.%02d.%d",
RTE_VER_PREFIX,
RTE_VER_MAJOR,
RTE_VER_MINOR,
RTE_VER_PATCH_LEVEL);
else
-   snprintf(version, sizeof(version), "%s %d.%d.%d%s%d",
+   snprintf(version, sizeof(version), "%s %d.%02d.%d%s%d",
RTE_VER_PREFIX,
RTE_VER_MAJOR,
RTE_VER_MINOR,
diff --git a/mk/rte.sdkconfig.mk b/mk/rte.sdkconfig.mk
index f8d95b1..564a5c3 100644
--- a/mk/rte.sdkconfig.mk
+++ b/mk/rte.sdkconfig.mk
@@ -35,7 +35,7 @@ showversion:
$$(sed -rne 's,^#define 
RTE_VER_[A-Z_]*[[:space:]]+([0-9]+).*,\1,p' \
-e 's,^#define RTE_VER_SUFFIX[[:space:]]+"(.*)",\1,p' \

$(RTE_SRCDIR)/lib/librte_eal/common/include/rte_version.h) ;\
-   printf '%d.%d.%d' "$$1" "$$2" "$$3"; \
+   printf '%d.%02d.%d' "$$1" "$$2" "$$3"; \
if [ -z "$$5" ]; then echo; \
else printf '%s' "$$4"; \
if [ $$5 -lt 16 ] ; then echo $$5; \
-- 
2.5.0

[dpdk-dev] [PATCH] virtio: fix crashes in virtio stats functions

2016-02-10 Thread Bruce Richardson

On Thu, Dec 24, 2015 at 11:48:41AM +0800, Yuanhan Liu wrote:
> On Wed, Dec 23, 2015 at 09:45:19AM +, Bernard Iremonger wrote:
> > This initialisation of nb_rx_queues and nb_tx_queues has been removed
> > from eth_virtio_dev_init.
> > 
> > The nb_rx_queues and nb_tx_queues were being initialised in 
> > eth_virtio_dev_init
> > before the tx_queues and rx_queues arrays were allocated.
> > 
> > The arrays are allocated when the ethdev port is configured and the
> > nb_tx_queues and nb_rx_queues are initialised.
> > 
> > If any of the following functions were called before the ethdev
> > port was configured there was a segmentation fault because
> > rx_queues and tx_queues were NULL:
> > 
> > rte_eth_stats_get
> > rte_eth_stats_reset
> > rte_eth_xstats_get
> > rte_eth_xstats_reset
> > 
> > Fixes: 823ad647950a ("virtio: support multiple queues")
> > Signed-off-by: Bernard Iremonger 
> 
> Acked-by: Yuanhan Liu 
> 
> Thanks.
> 
>   --yliu

Applied to dpdk-next-net/rel_16_04

/Bruce

[dpdk-dev] i40e: cannot change mtu to enable jumbo frame

2016-02-10 Thread Ananyev, Konstantin

> 
> On 02/09/2016 08:05 PM, Zhu, Heqing wrote:
> > Helin is still in Chinese New Year Vacation. Will the below command option 
> > help ?
> >
> > 4.5.9. port config - max-pkt-len
> > Set the maximum packet length:
> >
> > testpmd> port config all max-pkt-len (value)
> > This is equivalent to the --max-pkt-len command-line option.
> >
> >
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Julien Meunier
> > Sent: Tuesday, February 9, 2016 9:36 AM
> > To: Zhang, Helin ; dev at dpdk.org
> > Subject: [dpdk-dev] i40e: cannot change mtu to enable jumbo frame
> >
> > Hello Helin,
> >
> > I tried to send jumbo frames to a i40e card. However, I observed that all 
> > frames are dropped. Moreover, set_mtu function is not
> implemented on i40e PMD.
> >
> >   > testpmd --log-level 8 --huge-dir=/mnt/huge -n 4 -l 2,18 --socket-mem
> > 1024,1024 -w :02:00.0 -w :02:00.2 -- -i --nb-cores=1
> > --nb-ports=2 --total-num-mbufs=65536
> >
> > =
> > Configuration
> > =
> >
> > +---+  +-+
> > |   |  | |
> > | tgen  |  | |
> > |   +--+ port 0  |
> > |   |  | |
> > |   |  | |
> > |   |  | |
> > |   |  | |
> > |   +--+ port 1  |
> > |   |  | |
> > +---+  +-+
> >
> > DPDK: DPDK-v2.2
> >
> > ==
> > MTU = 1500
> > ==
> > Packet sent from a tgen
> >   > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP))  > 
> > len(p) = 1514
> >
> > testpmd> start
> > PMD: i40e_rxd_to_vlan_tci(): Mbuf vlan_tci: 0, vlan_tci_outer: 0
> > testpmd> stop
> > Telling cores to stop...
> > Waiting for lcores to finish...
> > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start ***
> > PMD: i40e_update_vsi_stats(): rx_bytes:1518
> > PMD: i40e_update_vsi_stats(): rx_unicast:  1
> > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end ***
> > PMD: i40e_dev_stats_get(): *** PF stats start ***
> > PMD: i40e_dev_stats_get(): rx_bytes:1514
> > PMD: i40e_dev_stats_get(): rx_unicast:  1
> > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1
> > PMD: i40e_dev_stats_get(): rx_size_1522: 1
> > PMD: i40e_dev_stats_get(): *** PF stats end ***
> >
> > -- Forward statistics for port 0  --
> > RX-packets: 1  RX-dropped: 0 RX-total: 1
> > TX-packets: 0  TX-dropped: 0 TX-total: 0
> >
> > 
> > PMD: i40e_update_vsi_stats(): *** VSI[14] stats start ***
> > PMD: i40e_update_vsi_stats(): tx_bytes:1514
> > PMD: i40e_update_vsi_stats(): tx_unicast:  1
> > PMD: i40e_update_vsi_stats(): *** VSI[14] stats end ***
> > PMD: i40e_dev_stats_get(): *** PF stats start ***
> > PMD: i40e_dev_stats_get(): tx_bytes:1514
> > PMD: i40e_dev_stats_get(): tx_unicast:  1
> > PMD: i40e_dev_stats_get(): tx_size_1522: 1
> > PMD: i40e_dev_stats_get(): *** PF stats end ***
> >
> > -- Forward statistics for port 1 --
> > RX-packets: 0  RX-dropped: 0 RX-total: 0
> > TX-packets: 1  TX-dropped: 0 TX-total: 1
> >
> > 
> >
> > + Accumulated forward statistics for all ports+
> > RX-packets: 1  RX-dropped: 0 RX-total: 1
> > TX-packets: 1  TX-dropped: 0 TX-total: 1
> >
> > 
> >
> > => OK
> >
> > ==
> > MTU = 1600
> > ==
> > Packet sent
> >   > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP))  > 
> > len(p) = 1614
> >
> > testpmd> port config mtu 0 1600
> > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95
> > testpmd> port config mtu 1 1600
> > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95
> > testpmd> start
> > testpmd> stop
> > Telling cores to stop...
> > Waiting for lcores to finish...
> > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start ***
> > PMD: i40e_update_vsi_stats(): rx_bytes:1618
> > PMD: i40e_update_vsi_stats(): rx_unicast:  1
> > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end ***
> > PMD: i40e_dev_stats_get(): *** PF stats start ***
> > PMD: i40e_dev_stats_get(): rx_bytes:1614
> > PMD: i40e_dev_stats_get(): rx_unicast:  1
> > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1
> > PMD: i40e_dev_stats_get(): rx_size_big:  1
> > PMD: i40e_dev_stats_get(): *** PF stats end ***
> >
> > -- Forward statistics for

[dpdk-dev] [dpdk-users] DPDK i40evf problem in receving packet

2016-02-10 Thread Saurabh Mishra

Hi,


*i40e card is :*


00:09.0 *Ether*net controller: Intel Corporation XL710/X710 Virtual
Function (rev 01)

00:0a.0 *Ether*net controller: Intel Corporation XL710/X710 Virtual
Function (rev 01)

firmware-version: f4.40 a1.4 n04.53 e80001dc0

Is there any extra verbose/debug in DPDK which we can enable to find out
why RX does not receive any packet after first one?

Thanks,
/Saurabh

On Wed, Feb 10, 2016 at 3:15 AM, Thomas Monjalon 
wrote:

> Formatting note:
> Please avoid top posting (inline replies are preferred),
> and do not include such disclaimer, not relevant on a mailing list.
> Thanks
>
> 2016-02-10 16:00, Muhammad Zain-ul-Abideen:
> > Hi Saurabh,
> > Can you tell me what card you are uing
> >
> >
> > *Regards:*
> > Zain ul Abideen
> >
> >
> > *Disclaimer**: The information contained in this e-mail and any
> attachments
> > is confidential; it is intended only for use of the individual or entity
> > named above. If the reader of this message is not the intended recipient,
> > you are notified that any dissemination, distribution or use of this
> > information is strictly prohibited. If you have received this
> communication
> > in error, please delete it and email confirmation to the sender. Thank
> You.*
> >
> >
> > On Wed, Feb 10, 2016 at 6:30 AM, Saurabh Mishra  >
> > wrote:
> >
> > > Hi Qian --
> > >
> > > Any suggestions? This is bit urgent.
> > >
> > > /Saurabh
> > >
> > > On Sat, Feb 6, 2016 at 9:22 AM, Saurabh Mishra <
> saurabh.globe at gmail.com>
> > > wrote:
> > >
> > > > Hi Qian --
> > > >
> > > >
> > > > Here's the data from Host:
> > > >
> > > > [root at oscompute3 ~]# ethtool -i p3p1
> > > >
> > > > driver: i40e
> > > >
> > > > version: 1.0.11-k
> > > >
> > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0
> > > >
> > > > bus-info: :04:00.0
> > > >
> > > > supports-statistics: yes
> > > >
> > > > supports-test: yes
> > > >
> > > > supports-eeprom-access: yes
> > > >
> > > > supports-register-dump: yes
> > > >
> > > > supports-priv-flags: no
> > > >
> > > > [root at oscompute3 ~]# ethtool -i p3p2
> > > >
> > > > driver: i40e
> > > >
> > > > version: 1.0.11-k
> > > >
> > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0
> > > >
> > > > bus-info: :04:00.1
> > > >
> > > > supports-statistics: yes
> > > >
> > > > supports-test: yes
> > > >
> > > > supports-eeprom-access: yes
> > > >
> > > > supports-register-dump: yes
> > > >
> > > > supports-priv-flags: no
> > > >
> > > > [root at oscompute3 ~]#
> > > >
> > > > EthApp> drvinfo
> > > >
> > > > Port 0 driver: rte_i40evf_pmd (ver: RTE 2.2.0)
> > > >
> > > > Port 1 driver: rte_i40evf_pmd (ver: RTE 2.2.0)
> > > >
> > > > EthApp>
> > > >
> > > > On Fri, Feb 5, 2016 at 4:59 PM, Xu, Qian Q 
> wrote:
> > > >
> > > >> What's your current firmware info, can u run ethtool -i
> port_interface
> > > to
> > > >> check?
> > > >>
> > > >> Thanks
> > > >> Qian
> > > >>
> > > >> -Original Message-
> > > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Saurabh Mishra
> > > >> Sent: Saturday, February 06, 2016 6:33 AM
> > > >> To: dev at dpdk.org; users at dpdk.org
> > > >> Subject: [dpdk-dev] DPDK i40evf problem in receving packet
> > > >>
> > > >> Hi,
> > > >>
> > > >> I'm seeing two problems:
> > > >>
> > > >> 1)  when use our kernel '3.10.88-8.0.0.0.6', we only receive first
> > > >> packet but not subsequent ones at all after that. However, when I
> use
> > > >> centos7.0, then l2fwd is able to receive all the packets.
> > > >>
> > > >> 2) I've also seen that on centos7.0, symmetric_mp itself is not
> working.
> > > >> dev start fails with 280 error.
> > > >>
> > > >> i40evf is giving us lot of headache. The i40evf kernel driver works
> > > >> without any problem. Host is a centos7 KVM. I've already upgraded
> > > firmware
> > > >> to latest.
> > > >>
> > > >> [root at localhost ~]# uname -a
> > > >>
> > > >> Linux localhost.localdomain 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon
> Jan 25
> > > >> 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> > > >>
> > > >> [root at localhost ~]#
> > > >>
> > > >> [root at localhost ~]# ./symmetric_mp fakeelf --file-prefix=virtaddr
> -c2
> > > >> -m2048 -n1 --base-virtaddr=0x7fa0 --proc-type=primary -- -p
> 3
> > > >> --num-procs=1 --proc-id=0
> > > >>
> > > >> EAL: Detected lcore 0 as core 0 on socket 0
> > > >>
> > > >> EAL: Detected lcore 1 as core 0 on socket 0
> > > >>
> > > >> EAL: Support maximum 128 logical core(s) by configuration.
> > > >>
> > > >> EAL: Detected 2 lcore(s)
> > > >>
> > > >> EAL: No free hugepages reported in hugepages-1048576kB
> > > >>
> > > >> EAL: VFIO modules not all loaded, skip VFIO support...
> > > >>
> > > >> EAL: Setting up physically contiguous memory...
> > > >>
> > > >> EAL: Ask a virtual area of 0x2a80 bytes
> > > >>
> > > >> EAL: Virtual area found at 0x7fa0 (size = 0x2a80)
> > > >>
> > > >> EAL: Ask a virtual area of 0x20 bytes
> > > >>
> > > >> EAL: Virtual area found at 0x7fa02a80 (size = 0x20)
> > > >>
> > >

[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers

2016-02-10 Thread Jan Viktorin

On Wed, 10 Feb 2016 12:38:20 +0100
David Marchand  wrote:

> On Wed, Feb 10, 2016 at 11:20 AM, Jan Viktorin  
> wrote:
> > On Wed, 10 Feb 2016 10:27:14 +0100
> > David Marchand  wrote:  
> >> #define RTE_EAL_PCI_REGISTER(name, d)\
> >> void pciinitfn_ ##name(void);\
> >> void __attribute__((constructor, used)) pciinitfn_ ##name(void)\
> >> {\
> >> rte_eal_pci_register(d);\  
> >
> > I meant
> > rte_eal_pci_register(&(d)->pci_drv);\
> >
> > Perhaps, my assumption that a PCI driver is always referred as pci_drv is 
> > wrong...  
> 
> Well, I suppose we will always have a pci driver embedded in some
> other internal pmd structure.
> So we can always expect it to be called pci_drv ...
> 
> Btw, for drivers like mlx or virtio that need to do some more stuff in
> their constructor (the iopl stuff for virtio is the most interesting
> case, since the pci register happens only if iopl succeeded), we might
> need some RTE_MODULE_INIT for those.
> 
> But in such a case, I think having RTE_MODULE_INIT in all pmds would
> make more sense, and RTE_EAL_PCI_REGISTER looks unneeded ?
> 

The Linux Kernel solves this by allowing both ways. You can do it by
hand and for most cases you can use just a simple oneliner.

So, we can have the RTE_(EAL_)MODULE_INIT for the general case. And then
eg. RTE_EAL_PCI_MODULE_INIT that makes the most common glue code to
register a single driver and who finally calls the RTE_EAL_MODULE_INIT.

Regards
Jan

[dpdk-dev] [PATCH v2] af_packet: make the device detachable

2016-02-10 Thread John W. Linville

On Wed, Feb 10, 2016 at 04:42:53PM +0100, Wojciech ?muda wrote:

> Bernard, Bruce, I have a question, if I may. Do you know what is the
> reason that rte_pmd_af_packet_devinit() is the only non-static device
> initialization function among all the dpdk drivers? There's even a
> comment in the rte_eth_af_packet.h:
> 
> /**
>  * For use by the EAL only. Called as part of EAL init to set up any dummy 
> NICs
>  * configured on command line.
>  */
> int rte_pmd_af_packet_devinit(const char *name, const char *params);
> 
> Despite the comment above, I cannot see this function being called
> directly anywhere. Is there any reason it is implemented this way? Or
> should I change the definition to static, as it should be called via
> proper API functions?

The af_packet driver structure was essentially copied from the pcap
driver.  Way in the past there was the EAL initialization of the
"virtual" drivers, and the original af_packet code hooked into that.
Somewhere along the way the driver initialization code changed and
I guess the EAL initialization bits disappeared.  The comment above
rte_pmd_af_packet_devinit was overlooked.

Long story, short -- I think you can remove the comment and add the
static modifier and everything will be fine.

John
-- 
John W. LinvilleSomeday the world will need a hero, and you
linville at tuxdriver.com   might be all we have.  Be ready.

[dpdk-dev] [PATCH 8/9] pci: add a helper to refresh a device

2016-02-10 Thread Jan Viktorin

On Wed, 10 Feb 2016 13:00:50 +0100
David Marchand  wrote:

> On Wed, Feb 10, 2016 at 12:23 PM, Jan Viktorin  
> wrote:
> > On Fri, 29 Jan 2016 15:08:35 +0100
> > David Marchand  wrote:
> >  
> >> It will be used mainly for hotplug code.
> >>
> >> Signed-off-by: David Marchand 
> >> ---
> >>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 49 
> >> +++
> >>  lib/librte_eal/common/eal_private.h   | 13 ++
> >>  lib/librte_eal/linuxapp/eal/eal_pci.c | 13 ++
> >>  3 files changed, 75 insertions(+)
> >>
> >> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
> >> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> index 4584522..5dd89e3 100644
> >> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> >> @@ -396,6 +396,55 @@ error:
> >>   return -1;
> >>  }
> >>
> >> +int
> >> +pci_refresh_device(const struct rte_pci_addr *addr)  
> >
> > What about pci_reload_device or pci_reload_device_info? I don't mind
> > too much, only the word 'refresh' reminds me other associations.  
> 
> Or maybe pci_update_device ?
> I added pci_add_device in my other pci patchset, so update sounds better to 
> me.
> 

OK.

> 
> >> +{
> >> + int fd;
> >> + struct pci_conf matches[2];
> >> + struct pci_match_conf match = {
> >> + .pc_sel = {
> >> + .pc_domain = addr->domain,
> >> + .pc_bus = addr->bus,
> >> + .pc_dev = addr->devid,
> >> + .pc_func = addr->function,
> >> + },
> >> + };
> >> + struct pci_conf_io conf_io = {
> >> + .pat_buf_len = 0,
> >> + .num_patterns = 1,
> >> + .patterns = {  },
> >> + .match_buf_len = sizeof(matches),
> >> + .matches = [0],
> >> + };
> >> +
> >> + fd = open("/dev/pci", O_RDONLY);  
> >
> > Just courious who provides this special file... is a DPDK-specific
> > thing? I haven't noticed it anywhere in Linux.  
> 
> I don't know, just took the bsd pci code and plugged myself in it.
> So for me this is a special bsd device.
> 
> This is mainly copy/paste.
> Look at rte_eal_pci_scan() from lib/librte_eal/bsdapp/eal/eal_pci.c.

BSD... I didn't notice. That's the answer.

> 
> >  
> >> + if (fd < 0) {
> >> + RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", 
> >> __func__);
> >> + goto error;  
> >
> > If you write:
> > return -1;
> >
> > then you can...
> >  
> >> + }
> >> +
> >> + if (ioctl(fd, PCIOCGETCONF, _io) < 0) {
> >> + RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n",
> >> + __func__, strerror(errno));
> >> + goto error;
> >> + }
> >> +
> >> + if (conf_io.num_matches != 1)
> >> + goto error;
> >> +
> >> + if (pci_scan_one(fd, [0]) < 0)
> >> + goto error;
> >> +
> >> + close(fd);
> >> +
> >> + return 0;
> >> +
> >> +error:  
> >
> > ...remove this if:
> >  
> >> + if (fd >= 0)
> >> + close(fd);  
> >
> > Or, do you consider it more stable in the orignal way?  
> 
> Well, as said above, this is copy/paste code.
> But, anyway, when I write functions with goto statements, I prefer
> having a minimal number of return statements, matter of taste.
> Another way is to add two label error_close: error: but this is a bit
> overkill here.

All of them are OK. As for me, I prefer to not hide simple returns.

> 
> 
> >> + return -1;
> >> +}
> >> +
> >>  /* Read PCI config space. */
> >>  int rte_eal_pci_read_config(const struct rte_pci_device *dev,
> >>   void *buf, size_t len, off_t offset)
> >> diff --git a/lib/librte_eal/common/eal_private.h 
> >> b/lib/librte_eal/common/eal_private.h
> >> index 072e672..ed1903f 100644
> >> --- a/lib/librte_eal/common/eal_private.h
> >> +++ b/lib/librte_eal/common/eal_private.h
> >> @@ -155,6 +155,19 @@ struct rte_pci_driver;
> >>  struct rte_pci_device;
> >>
> >>  /**
> >> + * Refresh a pci device object by asking the kernel for the latest 
> >> information.
> >> + *
> >> + * This function is private to EAL.
> >> + *
> >> + * @param addr
> >> + *   The PCI Bus-Device-Function address to look for
> >> + * @return
> >> + *   - 0 on success.
> >> + *   - negative on error.  
> >
> > I don't know whether this is a convention in DPDK, anyway, I don't
> > like to restrict errors to just negatives. You cannot write
> >
> > if ((err = pci_refresh_device(...)) /* < 0 */) {
> > handle_error(err);
> > }
> >
> > as the check for < 0 is required (easy to be avoided).  
> 
> It is a remnant of a lot of code in eal that tries to have 0 for
> success, < 0 for errors, > 0 for special cases.
> 

OK, makes sense.

> 



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH 7/9] pci: add a helper for device name

2016-02-10 Thread David Marchand

On Wed, Feb 10, 2016 at 12:10 PM, Jan Viktorin  
wrote:
> On Fri, 29 Jan 2016 15:08:34 +0100
> David Marchand  wrote:
>
>> eal is a better place than ethdev for naming resources.
>> Add a helper here, and make use of it in ethdev hotplug code.
>>
>> Signed-off-by: David Marchand 
>> ---
>>  lib/librte_eal/common/include/rte_pci.h | 28 
>>  lib/librte_ether/rte_ethdev.c   | 22 ++
>>  2 files changed, 30 insertions(+), 20 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/include/rte_pci.h 
>> b/lib/librte_eal/common/include/rte_pci.h
>> index 334c12e..9edd5f5 100644
>> --- a/lib/librte_eal/common/include/rte_pci.h
>> +++ b/lib/librte_eal/common/include/rte_pci.h
>> @@ -309,6 +309,34 @@ eal_parse_pci_DomBDF(const char *input, struct 
>> rte_pci_addr *dev_addr)
>>  }
>>  #undef GET_PCIADDR_FIELD
>>
>> +/**
>> + * Utility function to write a pci device name, this device name can later 
>> be
>> + * used to retrieve the corresponding rte_pci_addr using above functions.
>> + *
>> + * @param addr
>> + *   The PCI Bus-Device-Function address
>> + * @param output
>> + *   The output buffer string
>> + * @param size
>> + *   The output buffer size
>> + * @return
>> + *  0 on success, negative on error.
>> + */
>> +static inline int
>> +eal_pci_device_name(const struct rte_pci_addr *addr,
>> + char *output, int size)
>
> "size_t size" (or unsigned int) seems to be better to me.

Yes.

>
>> +{
>> + int ret;
>> +
>> + ret = snprintf(output, size, PCI_PRI_FMT,
>> +addr->domain, addr->bus,
>> +addr->devid, addr->function);
>> + if (ret < 0 || ret >= size)
>> + return -1;
>
> The return value is not checked (below). I think, such functions
> are usually missing error checking as nobody expects them to fail.
> Wouldn't it be better to panic here?

assert or panic, yes.


-- 
David Marchand

[dpdk-dev] [PATCH 8/9] pci: add a helper to refresh a device

2016-02-10 Thread David Marchand

On Wed, Feb 10, 2016 at 12:23 PM, Jan Viktorin  
wrote:
> On Fri, 29 Jan 2016 15:08:35 +0100
> David Marchand  wrote:
>
>> It will be used mainly for hotplug code.
>>
>> Signed-off-by: David Marchand 
>> ---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 49 
>> +++
>>  lib/librte_eal/common/eal_private.h   | 13 ++
>>  lib/librte_eal/linuxapp/eal/eal_pci.c | 13 ++
>>  3 files changed, 75 insertions(+)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> index 4584522..5dd89e3 100644
>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> @@ -396,6 +396,55 @@ error:
>>   return -1;
>>  }
>>
>> +int
>> +pci_refresh_device(const struct rte_pci_addr *addr)
>
> What about pci_reload_device or pci_reload_device_info? I don't mind
> too much, only the word 'refresh' reminds me other associations.

Or maybe pci_update_device ?
I added pci_add_device in my other pci patchset, so update sounds better to me.


>> +{
>> + int fd;
>> + struct pci_conf matches[2];
>> + struct pci_match_conf match = {
>> + .pc_sel = {
>> + .pc_domain = addr->domain,
>> + .pc_bus = addr->bus,
>> + .pc_dev = addr->devid,
>> + .pc_func = addr->function,
>> + },
>> + };
>> + struct pci_conf_io conf_io = {
>> + .pat_buf_len = 0,
>> + .num_patterns = 1,
>> + .patterns = {  },
>> + .match_buf_len = sizeof(matches),
>> + .matches = [0],
>> + };
>> +
>> + fd = open("/dev/pci", O_RDONLY);
>
> Just courious who provides this special file... is a DPDK-specific
> thing? I haven't noticed it anywhere in Linux.

I don't know, just took the bsd pci code and plugged myself in it.
So for me this is a special bsd device.

This is mainly copy/paste.
Look at rte_eal_pci_scan() from lib/librte_eal/bsdapp/eal/eal_pci.c.

>
>> + if (fd < 0) {
>> + RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", __func__);
>> + goto error;
>
> If you write:
> return -1;
>
> then you can...
>
>> + }
>> +
>> + if (ioctl(fd, PCIOCGETCONF, _io) < 0) {
>> + RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n",
>> + __func__, strerror(errno));
>> + goto error;
>> + }
>> +
>> + if (conf_io.num_matches != 1)
>> + goto error;
>> +
>> + if (pci_scan_one(fd, [0]) < 0)
>> + goto error;
>> +
>> + close(fd);
>> +
>> + return 0;
>> +
>> +error:
>
> ...remove this if:
>
>> + if (fd >= 0)
>> + close(fd);
>
> Or, do you consider it more stable in the orignal way?

Well, as said above, this is copy/paste code.
But, anyway, when I write functions with goto statements, I prefer
having a minimal number of return statements, matter of taste.
Another way is to add two label error_close: error: but this is a bit
overkill here.


>> + return -1;
>> +}
>> +
>>  /* Read PCI config space. */
>>  int rte_eal_pci_read_config(const struct rte_pci_device *dev,
>>   void *buf, size_t len, off_t offset)
>> diff --git a/lib/librte_eal/common/eal_private.h 
>> b/lib/librte_eal/common/eal_private.h
>> index 072e672..ed1903f 100644
>> --- a/lib/librte_eal/common/eal_private.h
>> +++ b/lib/librte_eal/common/eal_private.h
>> @@ -155,6 +155,19 @@ struct rte_pci_driver;
>>  struct rte_pci_device;
>>
>>  /**
>> + * Refresh a pci device object by asking the kernel for the latest 
>> information.
>> + *
>> + * This function is private to EAL.
>> + *
>> + * @param addr
>> + *   The PCI Bus-Device-Function address to look for
>> + * @return
>> + *   - 0 on success.
>> + *   - negative on error.
>
> I don't know whether this is a convention in DPDK, anyway, I don't
> like to restrict errors to just negatives. You cannot write
>
> if ((err = pci_refresh_device(...)) /* < 0 */) {
> handle_error(err);
> }
>
> as the check for < 0 is required (easy to be avoided).

It is a remnant of a lot of code in eal that tries to have 0 for
success, < 0 for errors, > 0 for special cases.


-- 
David Marchand

[dpdk-dev] Regarding mbuf allocation/free in secondary process

2016-02-10 Thread Lawrence MacIntyre

Sara:

The use of polling is to provide high throughput. At high bandwidths, 
interrupt processing is  a great contributor to latency and can greatly 
decrease available bandwidth. Polling eliminates the overhead of 
interrupts but it will consume an entire CPU. That is why the DPDK 
library makes it easy to put each thread on a separate core. The 
downside is that for low bandwidth applications, you end up using a lot 
of CPU resources spinning in loops waiting for input.

Using the techniques Bruce mentioned will allow you to code a 
low-bandwidth, low CPU utilization app, but it will not be able to 
handle high bandwidth. You will have to choose one or the other. 
Unfortunately, computer hardware has a limited number of interrupts that 
it can handle per second, and those limits make it impractical to handle 
bandwidths much greater than 1 Gb/s (i.e 10, 40, or 100 Gb/s) using 
interrupts to handle I/O.

Lawrence

This one time (02/10/2016 05:01 AM), at band camp, Bruce Richardson wrote:
> On Tue, Feb 09, 2016 at 11:43:19PM -0800, Saravana Kumar wrote:
>> Hi DPDK community,
>>
>>
>>
>> I'd like to have DPDK NIC IO operations in (primary) process and
>> execution logic in (secondary) processes.
>> Primary process pushes NIC Rx mbufs to Secondary process through S/W ring
>>
>> Seconary process allocates mbuf for Tx path and pushes down to Primary
>> process for NIC Tx
>>
>>
>> I have few doubts here:
>>
>> 1. If Secondary process dies because of SIGKILL then how can the mbufs
>> allocated in Secondary process can be freed.
>> If it is normal signals like SIGINT/SIGTERM then we can be catch
>> those and free in those respective signal handlers
> If a process terminates abnormally then the buffers being used by that process
> may well be leaked. The solution you propose of catching signals will 
> certainly
> help as you want to try and ensure that a process always frees all its buffers
> properly on termination.
>
>> 2. Secondary process needs to poll on the S/W ring. This can consume 100% 
>> cpu.
>> Is there a way to avoid polling in secondary process for Rx path
> Not using DPDK software rings, no. You'd have to use some kernel constructs 
> such as
> fifo's/named pipes to do blocking reads on those. However, the overhead of 
> using
> such structures can be severe making them unusable for many packet processing
> applications. An alternative might be to use some small sleep calls i.e. 
> nanosleep
> between polls of the SW ring in cases where traffic rates are low. That will
> reduce your cpu usage.
>
> /Bruce
>
>

-- 
Lawrence MacIntyre  macintyrelp at ornl.gov  Oak Ridge National Laboratory
  865.574.7401  Cyber Space and Information Intelligence Research Group

[dpdk-dev] [PATCH 6/9] eal: initialize vdevs right next to pci devices

2016-02-10 Thread David Marchand

On Wed, Feb 10, 2016 at 12:05 PM, Jan Viktorin  
wrote:
> On Fri, 29 Jan 2016 15:08:33 +0100
> David Marchand  wrote:
>> --- a/lib/librte_eal/common/include/rte_dev.h
>> +++ b/lib/librte_eal/common/include/rte_dev.h
>> @@ -156,7 +156,7 @@ void rte_eal_driver_register(struct rte_driver *driver);
>>  void rte_eal_driver_unregister(struct rte_driver *driver);
>>
>>  /**
>> - * Initalize all the registered drivers in this process
>> + * Scan all devargs and attach to drivers if available
>>   */
>>  int rte_eal_dev_init(void);
>
> Move this to a separate commit? Is it just a forgotten doc comment?

Should be in previous commit, yes.

>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
>> b/lib/librte_eal/linuxapp/eal/eal.c
>> index 62241ee..95313af 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -834,8 +834,10 @@ rte_eal_init(int argc, char **argv)
>>   rte_config.master_lcore, (int)thread_id, cpuset,
>>   ret == 0 ? "" : "...");
>>
>> +#ifndef RTE_NEXT_ABI
>>   if (rte_eal_dev_init() < 0)
>>   rte_panic("Cannot init pmd devices\n");
>> +#endif
>>
>>   RTE_LCORE_FOREACH_SLAVE(i) {
>>
>> @@ -873,6 +875,11 @@ rte_eal_init(int argc, char **argv)
>>   rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER);
>>   rte_eal_mp_wait_lcore();
>>
>> +#ifdef RTE_NEXT_ABI
>> + if (rte_eal_dev_init() < 0)
>> + rte_panic("Cannot probe vdev devices\n");
>> +#endif
>> +
>>   /* Probe & Initialize PCI devices */
>>   if (rte_eal_pci_probe())
>>   rte_panic("Cannot probe PCI\n");
>
> I cannot see the point why it is enclosed in the RTE_NEXT_ABI. Is it
> such a serious breakage?

No.
We can make this move unconditional.

Thanks.

-- 
David Marchand

[dpdk-dev] [PATCH v2 5/5] docs: add release note for qtest virtio container support

2016-02-10 Thread Tetsuya Mukawa

Signed-off-by: Tetsuya Mukawa 
---
 doc/guides/rel_notes/release_2_3.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 1e7d51d..ab3baf9 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -43,6 +43,9 @@ This section should contain new features added in this 
release. Sample format:

   Add a new virtual device, named eth_cvio, to support virtio for containers.

+* **Virtio support for containers using QEMU qtest mode.**
+  Add a new virtual device, named eth_qtest_virtio, to support virtio for 
containers
+  using QEMU qtest mode.

 Resolved Issues
 ---
-- 
2.1.4

[dpdk-dev] [PATCH v2 4/5] virtio: Add support for qtest virtio-net PMD

2016-02-10 Thread Tetsuya Mukawa

The patch adds a new virtio-net PMD configuration that allows the PMD to
work on host as if the PMD is in VM.
Here is new configuration for virtio-net PMD.
 - CONFIG_RTE_VIRTIO_VDEV_QTEST
To use this mode, EAL needs map all hugepages as one file. Also the file
should be mapped between (1 << 31) and (1 << 44).

To allocate like above, add "--single-file" and "--qtest-virtio" option
to application command line.

To prepare virtio-net device on host, the users need to invoke QEMU
process in special qtest mode. This mode is mainly used for testing QEMU
devices from outer process. In this mode, no guest runs.
Here is QEMU command line.

 $ qemu-system-x86_64 \
 -machine pc-i440fx-1.4,accel=qtest \
 -display none -qtest-log /dev/null \
 -qtest unix:/tmp/socket,server \
 -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
 -device virtio-net-pci,netdev=net0,mq=on,disable-modern=false,addr=3 \
 -chardev socket,id=chr1,path=/tmp/ivshmem,server \
 -device ivshmem,size=1G,chardev=chr1,vectors=1,addr=4

 * Should use qemu-2.5.1, or above.
 * QEMU process is needed per port.
 * virtio-1.0 device are only supported.
 * The vhost backends like vhost-net and vhost-user can be specified.
 * In most cases, just using above command is enough, but you can also
   specify other QEMU virtio-net options.
 * Only checked "pc-i440fx-1.4" machine, but may work with other
   machines. It depends on a machine has piix3 south bridge.
   If the machine doesn't have, virtio-net PMD cannot receive status
   changed interrupts.
 * Should not add "--enable-kvm" to QEMU command line.

After invoking QEMU, the PMD can connect to QEMU process using unix
domain sockets. Over these sockets, virtio-net, ivshmem and piix3
device in QEMU are probed by the PMD.
Here is example of command line.

 $ testpmd -c f -n 1 -m 1024 --no-pci --single-file --qtest-virtio \
  --vdev="eth_qtest_virtio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem"\
  -- --disable-hw-vlan --txqflags=0xf00 -i

Please specify same unix domain sockets and memory size in both QEMU
and DPDK command lines like above.
The share memory size should be power of 2, because ivshmem only
accepts such memory size.

Signed-off-by: Tetsuya Mukawa 
---
 config/common_linuxapp |1 +
 drivers/net/virtio/Makefile|4 +
 drivers/net/virtio/qtest.c | 1342 
 drivers/net/virtio/qtest.h |   65 ++
 drivers/net/virtio/virtio_ethdev.c |  383 +-
 drivers/net/virtio/virtio_pci.c|  364 +-
 drivers/net/virtio/virtio_pci.h|5 +-
 7 files changed, 2122 insertions(+), 42 deletions(-)
 create mode 100644 drivers/net/virtio/qtest.c
 create mode 100644 drivers/net/virtio/qtest.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index f76e162..7cbf50d 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -539,3 +539,4 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
 # Enable virtio support for container
 #
 CONFIG_RTE_VIRTIO_VDEV=y
+CONFIG_RTE_VIRTIO_VDEV_QTEST=y
diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
index ef920f9..6c11378 100644
--- a/drivers/net/virtio/Makefile
+++ b/drivers/net/virtio/Makefile
@@ -56,6 +56,10 @@ ifeq ($(CONFIG_RTE_VIRTIO_VDEV),y)
SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += vhost_embedded.c
 endif

+ifeq ($(CONFIG_RTE_VIRTIO_VDEV_QTEST),y)
+   SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += qtest.c
+endif
+
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_mempool lib/librte_mbuf
diff --git a/drivers/net/virtio/qtest.c b/drivers/net/virtio/qtest.c
new file mode 100644
index 000..418214f
--- /dev/null
+++ b/drivers/net/virtio/qtest.c
@@ -0,0 +1,1342 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 IGEL Co., Ltd. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of IGEL Co., Ltd. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT

[dpdk-dev] [PATCH v2 3/5] vhost: Add a function to check virtio device type

2016-02-10 Thread Tetsuya Mukawa

The patch adds below function to cleanup virtio code.
 - virtio_dev_check()

Signed-off-by: Tetsuya Mukawa 
---
 drivers/net/virtio/virtio_ethdev.c | 52 ++
 drivers/net/virtio/virtio_ethdev.h | 32 +++
 2 files changed, 57 insertions(+), 27 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 1c8c955..c3e877a 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -371,7 +371,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
vq->mz = mz;
vq->vq_ring_virt_mem = mz->addr;

-   if (dev->dev_type == RTE_ETH_DEV_PCI) {
+   if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0)) {
vq->vq_ring_mem = mz->phys_addr;

/* Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit,
@@ -429,7 +429,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
vq->virtio_net_hdr_vaddr = mz->addr;
memset(vq->virtio_net_hdr_vaddr, 0, hdr_size);

-   if (dev->dev_type == RTE_ETH_DEV_PCI)
+   if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0))
vq->virtio_net_hdr_mem = mz->phys_addr;
 #ifdef RTE_VIRTIO_VDEV
else
@@ -439,7 +439,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,

hw->vtpci_ops->setup_queue(hw, vq);

-   if (dev->dev_type == RTE_ETH_DEV_PCI)
+   if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0))
vq->offset = offsetof(struct rte_mbuf, buf_physaddr);
 #ifdef RTE_VIRTIO_VDEV
else
@@ -490,15 +490,13 @@ static void
 virtio_dev_close(struct rte_eth_dev *dev)
 {
struct virtio_hw *hw = dev->data->dev_private;
-   struct rte_pci_device *pci_dev = dev->pci_dev;

PMD_INIT_LOG(DEBUG, "virtio_dev_close");

/* reset the NIC */
-   if (dev->dev_type == RTE_ETH_DEV_PCI) {
-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
-   vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
-   }
+   if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, RTE_PCI_DRV_INTR_LSC))
+   vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
+
vtpci_reset(hw);
hw->started = 0;
virtio_dev_free_mbufs(dev);
@@ -1001,7 +999,7 @@ virtio_interrupt_handler(__rte_unused struct 
rte_intr_handle *handle,
isr = vtpci_isr(hw);
PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);

-   if (dev->dev_type == RTE_ETH_DEV_PCI)
+   if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0))
if (rte_intr_enable(>pci_dev->intr_handle) < 0)
PMD_DRV_LOG(ERR, "interrupt enable failed");

@@ -1056,9 +1054,10 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

pci_dev = eth_dev->pci_dev;

-   if (eth_dev->dev_type == RTE_ETH_DEV_PCI)
+   if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, NULL, 0)) {
if (vtpci_init(pci_dev, hw) < 0)
return -1;
+   }

/* Reset the device although not necessary at startup */
vtpci_reset(hw);
@@ -1072,7 +1071,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
return -1;

/* If host does not support status then disable LSC */
-   if (eth_dev->dev_type == RTE_ETH_DEV_PCI) {
+   if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, NULL, 0)) {
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;

@@ -1154,13 +1153,14 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

PMD_INIT_LOG(DEBUG, "hw->max_rx_queues=%d   hw->max_tx_queues=%d",
hw->max_rx_queues, hw->max_tx_queues);
-   if (eth_dev->dev_type == RTE_ETH_DEV_PCI) {
+   if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, NULL, 0)) {
PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
 eth_dev->data->port_id, pci_dev->id.vendor_id,
 pci_dev->id.device_id);

/* Setup interrupt callback  */
-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
+   if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI,
+   NULL, RTE_PCI_DRV_INTR_LSC))
rte_intr_callback_register(_dev->intr_handle,
   virtio_interrupt_handler,
   eth_dev);
@@ -1197,11 +1197,11 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
eth_dev->data->mac_addrs = NULL;

/* reset interrupt callback  */
-   if (eth_dev->dev_type == RTE_ETH_DEV_PCI)
-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
-   rte_intr_callback_unregister(_dev->intr_handle,
-

[dpdk-dev] [PATCH v2 2/5] EAL: Add new EAL "--qtest-virtio" option

2016-02-10 Thread Tetsuya Mukawa

To work with qtest virtio-net PMD, virtual address that maps hugepages
should be between (1 << 31) to (1 << 44). This patch adds one more option
to map like this. Also all hugepages should consists of one file.
Because of this, the option will work only when '--single-file' option is
specified.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_options.c | 10 
 lib/librte_eal/common/eal_internal_cfg.h   |  1 +
 lib/librte_eal/common/eal_options.h|  2 +
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 81 +-
 4 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 65bccbd..34c8bd1 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -96,6 +96,7 @@ eal_long_options[] = {
{OPT_VMWARE_TSC_MAP,0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
{OPT_XEN_DOM0,  0, NULL, OPT_XEN_DOM0_NUM },
{OPT_SINGLE_FILE,   0, NULL, OPT_SINGLE_FILE_NUM  },
+   {OPT_QTEST_VIRTIO,  0, NULL, OPT_QTEST_VIRTIO_NUM },
{0, 0, NULL, 0}
 };

@@ -902,6 +903,10 @@ eal_parse_common_option(int opt, const char *optarg,
conf->single_file = 1;
break;

+   case OPT_QTEST_VIRTIO_NUM:
+   conf->qtest_virtio = 1;
+   break;
+
/* don't know what to do, leave this to caller */
default:
return 1;
@@ -971,6 +976,11 @@ eal_check_common_options(struct internal_config 
*internal_cfg)
"be specified together with --"OPT_SINGLE_FILE"\n");
return -1;
}
+   if (internal_cfg->qtest_virtio && !internal_cfg->single_file) {
+   RTE_LOG(ERR, EAL, "Option --"OPT_QTEST_VIRTIO" cannot "
+   "be specified without --"OPT_SINGLE_FILE"\n");
+   return -1;
+   }

if (internal_cfg->no_hugetlbfs && internal_cfg->hugepage_unlink) {
RTE_LOG(ERR, EAL, "Option --"OPT_HUGE_UNLINK" cannot "
diff --git a/lib/librte_eal/common/eal_internal_cfg.h 
b/lib/librte_eal/common/eal_internal_cfg.h
index 9117ed9..7f3df39 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -71,6 +71,7 @@ struct internal_config {
volatile unsigned no_hpet;/**< true to disable HPET */
volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping

* instead of native TSC */
+   volatile unsigned qtest_virtio;/**< mmap hugepages to fit qtest 
virtio PMD */
volatile unsigned no_shconf;  /**< true if there is no shared 
config */
volatile unsigned create_uio_dev; /**< true to create /dev/uioX devices 
*/
volatile enum rte_proc_type_t process_type; /**< multi-process proc 
type */
diff --git a/lib/librte_eal/common/eal_options.h 
b/lib/librte_eal/common/eal_options.h
index e5da14a..b33a3c3 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -85,6 +85,8 @@ enum {
OPT_XEN_DOM0_NUM,
 #define OPT_SINGLE_FILE   "single-file"
OPT_SINGLE_FILE_NUM,
+#define OPT_QTEST_VIRTIO  "qtest-virtio"
+   OPT_QTEST_VIRTIO_NUM,
OPT_LONG_MAX_NUM
 };

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index a6b3616..677d6a7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1092,6 +1092,73 @@ calc_num_pages_per_socket(uint64_t * memory,
 }

 /*
+ * Find memory space that fits qtest virtio-net PMD.
+ */
+static void *
+rte_eal_get_free_region(uint64_t alloc_size, uint64_t pagesz)
+{
+   uint64_t start, end, next_start;
+   uint64_t high_limit, low_limit;
+   char buf[1024], *p;
+   FILE *fp;
+   void *addr = NULL;
+
+   /* all hugepages should be mapped between below values */
+   low_limit = 1UL << 31;
+   high_limit = 1UL << 44;
+
+   /* allocation size should be aligned by page size */
+   if (alloc_size != RTE_ALIGN_CEIL(alloc_size, pagesz)) {
+   rte_panic("Invalid allocation size 0x%lx\n", alloc_size);
+   return NULL;
+   }
+
+   /*
+* address should be aligned by allocation size because
+* BAR register requiers such an address
+*/
+   low_limit = RTE_ALIGN_CEIL(low_limit, alloc_size);
+   high_limit = RTE_ALIGN_FLOOR(high_limit, alloc_size);
+
+   fp = fopen("/proc/self/maps", "r");
+   if (fp == NULL) {
+   rte_panic("Cannot open /proc/self/maps\n");
+   return NULL;
+   }
+
+   next_start = 0;
+   do {
+   start = next_start;
+
+   if ((p = fgets(buf, sizeof(buf), fp))

[dpdk-dev] [PATCH v2 1/5] virtio: Retrieve driver name from eth_dev

2016-02-10 Thread Tetsuya Mukawa

Currently, virtio_dev_info_get() retrieves driver name from pci_drv.
If the driver is virtual PMD, pci_drv will be invalid.
So retrieves the name from eth_dev.

Signed-off-by: Tetsuya Mukawa 
---
 drivers/net/virtio/virtio_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index b790fd0..1c8c955 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1438,7 +1438,7 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 {
struct virtio_hw *hw = dev->data->dev_private;

-   dev_info->driver_name = dev->driver->pci_drv.name;
+   dev_info->driver_name = dev->data->drv_name;
dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
dev_info->min_rx_bufsize = VIRTIO_MIN_RX_BUFSIZE;
-- 
2.1.4

[dpdk-dev] [PATCH v2 0/5] Virtio-net PMD: QEMU QTest extension for container

2016-02-10 Thread Tetsuya Mukawa

The patches will work on below patch series.
 - [PATCH v2 0/5] virtio support for container
 - [PATCH 0/4] rework ioport access for virtio

[Changes]
v2 changes:
 - Rebase on above patch seiries.
 - Rebase on master
 - Add "--qtest-virtio" EAL option.
 - Fixes in qtest.c
  - Fix error handling for the case qtest connection is closed.
  - Use eventfd for interrupt messaging.
  - Use linux header for PCI register definitions.
  - Fix qtest_raw_send/recv to handle error correctly.
  - Fix bit mask of PCI_CONFIG_ADDR.
  - Describe memory and ioport usage of qtest guest in qtest.c
  - Remove loop that is for finding PCI devices.


[Abstraction]

Normally, virtio-net PMD only works on VM, because there is no virtio-net 
device on host.
This patches extend  virtio-net PMD to be able to work on host as virtual PMD.
But we didn't implement virtio-net device as a part of virtio-net PMD.
To prepare virtio-net device for the PMD, start QEMU process with special QTest 
mode, then connect it from virtio-net PMD through unix domain socket.

The PMD can connect to anywhere QEMU virtio-net device can.
For example, the PMD can connects to vhost-net kernel module and vhost-user 
backend application.
Similar to virtio-net PMD on QEMU, application memory that uses virtio-net PMD 
will be shared between vhost backend application.
But vhost backend application memory will not be shared.

Main target of this PMD is container like docker, rkt, lxc and etc.
We can isolate related processes(virtio-net PMD process, QEMU and vhost-user 
backend process) by container.
But, to communicate through unix domain socket, shared directory will be needed.


[How to use]

 Please use QEMU-2.5.1, or above.
 (So far, QEMU-2.5.1 hasn't been released yet, so please checkout master from 
QEMU repository)

 - Compile
 Set "CONFIG_RTE_VIRTIO_VDEV_QTEST=y" in config/common_linux.
 Then compile it.

 - Start QEMU like below.
 $ qemu-system-x86_64 \
  -machine pc-i440fx-1.4,accel=qtest \
  -display none -qtest-log /dev/null \
  -qtest unix:/tmp/socket,server \
  -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
  -device 
virtio-net-pci,netdev=net0,mq=on,disable-modern=false,addr=3 \
  -chardev socket,id=chr1,path=/tmp/ivshmem,server \
  -device ivshmem,size=1G,chardev=chr1,vectors=1,addr=4

 - Start DPDK application like below
 $ testpmd -c f -n 1 -m 1024 --no-pci --single-file --qtest-virtio \
 --vdev="eth_qtest_virtio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem"\
 -- --disable-hw-vlan --txqflags=0xf00 -i

(*1) Please Specify same memory size in QEMU and DPDK command line.
(*2) Should use qemu-2.5.1, or above.
(*3) QEMU process is needed per port.
(*4) virtio-1.0 device are only supported.
(*5) The vhost backends like vhost-net and vhost-user can be specified.
(*6) In most cases, just using above command is enough, but you can also
 specify other QEMU virtio-net options.
(*7) Only checked "pc-i440fx-1.4" machine, but may work with other
 machines. It depends on a machine has piix3 south bridge.
 If the machine doesn't have, virtio-net PMD cannot receive status
 changed interrupts.
(*8) Should not add "--enable-kvm" to QEMU command line.


[Detailed Description]

 - virtio-net device implementation
The PMD uses QEMU virtio-net device. To do that, QEMU QTest functionality is 
used.
QTest is a test framework of QEMU devices. It allows us to implement a device 
driver outside of QEMU.
With QTest, we can implement DPDK application and virtio-net PMD as standalone 
process on host.
When QEMU is invoked as QTest mode, any guest code will not run.
To know more about QTest, see below.
http://wiki.qemu.org/Features/QTest

 - probing devices
QTest provides a unix domain socket. Through this socket, driver process can 
access to I/O port and memory of QEMU virtual machine.
The PMD will send I/O port accesses to probe pci devices.
If we can find virtio-net and ivshmem device, initialize the devices.
Also, I/O port accesses of virtio-net PMD will be sent through socket, and 
virtio-net PMD can initialize vitio-net device on QEMU correctly.

 - ivshmem device to share memory
To share memory that virtio-net PMD process uses, ivshmem device will be used.
Because ivshmem device can only handle one file descriptor, shared memory 
should be consist of one file.
To allocate such a memory, EAL has new option called "--single-file".
Also, the hugepages should be mapped between "1 << 31" to "1 << 44".
To map like above, EAL has one more new option called "-qtest-virtio".
While initializing ivshmem device, we can set BAR(Base Address Register).
It represents which memory QEMU vcpu can access to this shared memory.
We will specify host virtual address of shared memory as this address.
It is very useful because we don't need to apply patch to QEMU to calculate 
address offset.
(For example, if virtio-net PMD process will allocate memory from shared

[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers

2016-02-10 Thread David Marchand

On Wed, Feb 10, 2016 at 11:20 AM, Jan Viktorin  
wrote:
> On Wed, 10 Feb 2016 10:27:14 +0100
> David Marchand  wrote:
>> #define RTE_EAL_PCI_REGISTER(name, d)\
>> void pciinitfn_ ##name(void);\
>> void __attribute__((constructor, used)) pciinitfn_ ##name(void)\
>> {\
>> rte_eal_pci_register(d);\
>
> I meant
> rte_eal_pci_register(&(d)->pci_drv);\
>
> Perhaps, my assumption that a PCI driver is always referred as pci_drv is 
> wrong...

Well, I suppose we will always have a pci driver embedded in some
other internal pmd structure.
So we can always expect it to be called pci_drv ...

Btw, for drivers like mlx or virtio that need to do some more stuff in
their constructor (the iopl stuff for virtio is the most interesting
case, since the pci register happens only if iopl succeeded), we might
need some RTE_MODULE_INIT for those.

But in such a case, I think having RTE_MODULE_INIT in all pmds would
make more sense, and RTE_EAL_PCI_REGISTER looks unneeded ?

-- 
David Marchand

[dpdk-dev] [PATCH v4 0/6] vmxnet3 TSO, tx cksum offload and cleanups

2016-02-10 Thread Bruce Richardson

On Tue, Jan 12, 2016 at 08:56:34PM -0800, Stephen Hemminger wrote:
> On Tue, 12 Jan 2016 18:08:31 -0800
> Yong Wang  wrote:
> 
> > v4:
> > * moved cleanups to separate patches
> > * correctly handled multi-seg pkts with data ring used
> > 
> > v3:
> > * fixed comments from Stephen
> > * added performance number for tx data ring
> > 
> > v2:
> > * fixed some logging issues when debug option turned on
> > * updated the txq_flags check in vmxnet3_dev_tx_queue_setup()
> > 
> > This patchset adds TCP/UDP checksum offload and TSO to vmxnet3 PMD.
> > One of the use cases is to support STT.  It also restores the tx
> > data ring feature that was removed from a previous patch.
> > 
> > Yong Wang (6):
> >   vmxnet3: fix typos and remove unused struct
> >   vmxnet3: restore tx data ring support
> >   vmxnet3: cleanup txNumDeferred usage
> >   vmxnet3: add tx l4 cksum offload
> >   vmxnet3: add TSO support
> >   vmxnet3: announce device offload capability
> > 
> >  doc/guides/rel_notes/release_2_3.rst|  11 +++
> >  drivers/net/vmxnet3/base/includeCheck.h |  39 
> >  drivers/net/vmxnet3/base/vmxnet3_defs.h |   9 +-
> >  drivers/net/vmxnet3/vmxnet3_ethdev.c|  16 +++-
> >  drivers/net/vmxnet3/vmxnet3_ring.h  |  13 ---
> >  drivers/net/vmxnet3/vmxnet3_rxtx.c  | 160 
> > +---
> >  6 files changed, 151 insertions(+), 97 deletions(-)
> >  delete mode 100644 drivers/net/vmxnet3/base/includeCheck.h
> > 
> 
> Looks good. The only thing maybe worth adding would be some more checks
> int the vmxnet3_dev_configure for unsupported offload bits, etc.
> 
> Acked-by: Stephen Hemminger 

Applied to dpdk-next-net/rel_16_04

Thanks,
Bruce

[dpdk-dev] [PATCH 8/9] pci: add a helper to refresh a device

2016-02-10 Thread Jan Viktorin

On Fri, 29 Jan 2016 15:08:35 +0100
David Marchand  wrote:

> It will be used mainly for hotplug code.
> 
> Signed-off-by: David Marchand 
> ---
>  lib/librte_eal/bsdapp/eal/eal_pci.c   | 49 
> +++
>  lib/librte_eal/common/eal_private.h   | 13 ++
>  lib/librte_eal/linuxapp/eal/eal_pci.c | 13 ++
>  3 files changed, 75 insertions(+)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
> b/lib/librte_eal/bsdapp/eal/eal_pci.c
> index 4584522..5dd89e3 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
> @@ -396,6 +396,55 @@ error:
>   return -1;
>  }
>  
> +int
> +pci_refresh_device(const struct rte_pci_addr *addr)

What about pci_reload_device or pci_reload_device_info? I don't mind
too much, only the word 'refresh' reminds me other associations.

> +{
> + int fd;
> + struct pci_conf matches[2];
> + struct pci_match_conf match = {
> + .pc_sel = {
> + .pc_domain = addr->domain,
> + .pc_bus = addr->bus,
> + .pc_dev = addr->devid,
> + .pc_func = addr->function,
> + },
> + };
> + struct pci_conf_io conf_io = {
> + .pat_buf_len = 0,
> + .num_patterns = 1,
> + .patterns = {  },
> + .match_buf_len = sizeof(matches),
> + .matches = [0],
> + };
> +
> + fd = open("/dev/pci", O_RDONLY);

Just courious who provides this special file... is a DPDK-specific
thing? I haven't noticed it anywhere in Linux.

> + if (fd < 0) {
> + RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", __func__);
> + goto error;

If you write:
return -1;

then you can...

> + }
> +
> + if (ioctl(fd, PCIOCGETCONF, _io) < 0) {
> + RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n",
> + __func__, strerror(errno));
> + goto error;
> + }
> +
> + if (conf_io.num_matches != 1)
> + goto error;
> +
> + if (pci_scan_one(fd, [0]) < 0)
> + goto error;
> +
> + close(fd);
> +
> + return 0;
> +
> +error:

...remove this if:

> + if (fd >= 0)
> + close(fd);

Or, do you consider it more stable in the orignal way?

> + return -1;
> +}
> +
>  /* Read PCI config space. */
>  int rte_eal_pci_read_config(const struct rte_pci_device *dev,
>   void *buf, size_t len, off_t offset)
> diff --git a/lib/librte_eal/common/eal_private.h 
> b/lib/librte_eal/common/eal_private.h
> index 072e672..ed1903f 100644
> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> @@ -155,6 +155,19 @@ struct rte_pci_driver;
>  struct rte_pci_device;
>  
>  /**
> + * Refresh a pci device object by asking the kernel for the latest 
> information.
> + *
> + * This function is private to EAL.
> + *
> + * @param addr
> + *   The PCI Bus-Device-Function address to look for
> + * @return
> + *   - 0 on success.
> + *   - negative on error.

I don't know whether this is a convention in DPDK, anyway, I don't
like to restrict errors to just negatives. You cannot write

if ((err = pci_refresh_device(...)) /* < 0 */) {
handle_error(err);
}

as the check for < 0 is required (easy to be avoided).

> + */
> +int pci_refresh_device(const struct rte_pci_addr *addr);
> +
> +/**
>   * Unbind kernel driver for this device
>   *
>   * This function is private to EAL.
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
> b/lib/librte_eal/linuxapp/eal/eal_pci.c
> index a354f76..4fe8b60 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> @@ -393,6 +393,19 @@ pci_scan_one(const char *dirname, uint16_t domain, 
> uint8_t bus,
>   return 0;
>  }
>  
> +int
> +pci_refresh_device(const struct rte_pci_addr *addr)
> +{
> + char filename[PATH_MAX];
> +
> + snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT,
> +  SYSFS_PCI_DEVICES, addr->domain, addr->bus, addr->devid,
> +  addr->function);
> +
> + return pci_scan_one(filename, addr->domain, addr->bus, addr->devid,
> + addr->function);
> +}
> +
>  /*
>   * split up a pci address into its constituent parts.
>   */



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH v2] hash: fix CRC32c computation

2016-02-10 Thread De Lara Guarch, Pablo

Hi Didier,

> -Original Message-
> From: Didier Pallard [mailto:didier.pallard at 6wind.com]
> Sent: Tuesday, February 09, 2016 9:34 AM
> To: dev at dpdk.org; Richardson, Bruce; De Lara Guarch, Pablo
> Cc: jean-mickael.guerin at 6wind.com; thomas.monjalon at 6wind.com
> Subject: [PATCH v2] hash: fix CRC32c computation
> 
> As demonstrated by the following code, CRC32c computation is not valid
> when buffer length is not a multiple of 4 bytes:
> (Output obtained by code below)
> 
> CRC of 1 NULL bytes expected: 0x527d5351
> soft: 527d5351
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> CRC of 2 NULL bytes expected: 0xf16177d2
> soft: f16177d2
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> CRC of 2x1 NULL bytes expected: 0xf16177d2
> soft: f16177d2
> rte accelerated: 8c28b28a
> rte soft: 8c28b28a
> CRC of 3 NULL bytes expected: 0x6064a37a
> soft: 6064a37a
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> CRC of 4 NULL bytes expected: 0x48674bc7
> soft: 48674bc7
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> 
> Values returned by rte_hash_crc functions does not match the one
> computed by a trivial crc32c implementation.
> 
> ARM code is not tested.
> 
> code showing the problem:
> 
> uint8_t null_test[32] = {0};
> 
> static uint32_t crc32c_trivial(uint8_t *buffer, uint32_t length, uint32_t crc)
> {
> uint32_t i, j;
> for (i = 0; i < length; ++i)
> {
> crc = crc ^ buffer[i];
> for (j = 0; j < 8; j++)
> crc = (crc >> 1) ^ 0x8000 ^ ((~crc & 1) * 0x82f63b78);
> }
> return crc;
> }
> 
> void hash_test(void);
> void hash_test(void)
> {
>   printf("CRC of 1 nul byte expected: 0x527d5351\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 1, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 1,
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 1, 0x));
> 
>   printf("CRC of 2 nul bytes expected: 0xf16177d2\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 2, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 2,
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 2, 0x));
> 
>   printf("CRC of 2x1 nul bytes expected: 0xf16177d2\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 1,
> crc32c_trivial(null_test, 1, 0)));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 1,
> rte_hash_crc(null_test, 1, 0x)));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 1,
> rte_hash_crc(null_test, 1, 0x)));
> 
>   printf("CRC of 3 nul bytes expected: 0x6064a37a\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 3, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 3,
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 3, 0x));
> 
>   printf("CRC of 4 nul bytes expected: 0x48674bc7\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 4, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 4,
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 4, 0x));
> }
> 
> Signed-off-by: Didier Pallard 
> Acked-by: David Marchand 

It compiles fine now, thanks!
Could you add also tests for not multiple of 4 bytes keys in 
test_hash_functions.c,
so we make sure from now on that it works (and you can demonstrate that your 
fix works)?
You could send a patchset with those new tests first and then the fix.

Also, a note in release notes would be welcome :)

Thanks!
Pablo

[dpdk-dev] [dpdk-users] DPDK i40evf problem in receving packet

2016-02-10 Thread Thomas Monjalon

Formatting note:
Please avoid top posting (inline replies are preferred),
and do not include such disclaimer, not relevant on a mailing list.
Thanks

2016-02-10 16:00, Muhammad Zain-ul-Abideen:
> Hi Saurabh,
> Can you tell me what card you are uing
> 
> 
> *Regards:*
> Zain ul Abideen
> 
> 
> *Disclaimer**: The information contained in this e-mail and any attachments
> is confidential; it is intended only for use of the individual or entity
> named above. If the reader of this message is not the intended recipient,
> you are notified that any dissemination, distribution or use of this
> information is strictly prohibited. If you have received this communication
> in error, please delete it and email confirmation to the sender. Thank You.*
> 
> 
> On Wed, Feb 10, 2016 at 6:30 AM, Saurabh Mishra 
> wrote:
> 
> > Hi Qian --
> >
> > Any suggestions? This is bit urgent.
> >
> > /Saurabh
> >
> > On Sat, Feb 6, 2016 at 9:22 AM, Saurabh Mishra 
> > wrote:
> >
> > > Hi Qian --
> > >
> > >
> > > Here's the data from Host:
> > >
> > > [root at oscompute3 ~]# ethtool -i p3p1
> > >
> > > driver: i40e
> > >
> > > version: 1.0.11-k
> > >
> > > firmware-version: f4.40 a1.4 n04.53 e80001dc0
> > >
> > > bus-info: :04:00.0
> > >
> > > supports-statistics: yes
> > >
> > > supports-test: yes
> > >
> > > supports-eeprom-access: yes
> > >
> > > supports-register-dump: yes
> > >
> > > supports-priv-flags: no
> > >
> > > [root at oscompute3 ~]# ethtool -i p3p2
> > >
> > > driver: i40e
> > >
> > > version: 1.0.11-k
> > >
> > > firmware-version: f4.40 a1.4 n04.53 e80001dc0
> > >
> > > bus-info: :04:00.1
> > >
> > > supports-statistics: yes
> > >
> > > supports-test: yes
> > >
> > > supports-eeprom-access: yes
> > >
> > > supports-register-dump: yes
> > >
> > > supports-priv-flags: no
> > >
> > > [root at oscompute3 ~]#
> > >
> > > EthApp> drvinfo
> > >
> > > Port 0 driver: rte_i40evf_pmd (ver: RTE 2.2.0)
> > >
> > > Port 1 driver: rte_i40evf_pmd (ver: RTE 2.2.0)
> > >
> > > EthApp>
> > >
> > > On Fri, Feb 5, 2016 at 4:59 PM, Xu, Qian Q  wrote:
> > >
> > >> What's your current firmware info, can u run ethtool -i port_interface
> > to
> > >> check?
> > >>
> > >> Thanks
> > >> Qian
> > >>
> > >> -Original Message-
> > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Saurabh Mishra
> > >> Sent: Saturday, February 06, 2016 6:33 AM
> > >> To: dev at dpdk.org; users at dpdk.org
> > >> Subject: [dpdk-dev] DPDK i40evf problem in receving packet
> > >>
> > >> Hi,
> > >>
> > >> I'm seeing two problems:
> > >>
> > >> 1)  when use our kernel '3.10.88-8.0.0.0.6', we only receive first
> > >> packet but not subsequent ones at all after that. However, when I use
> > >> centos7.0, then l2fwd is able to receive all the packets.
> > >>
> > >> 2) I've also seen that on centos7.0, symmetric_mp itself is not working.
> > >> dev start fails with 280 error.
> > >>
> > >> i40evf is giving us lot of headache. The i40evf kernel driver works
> > >> without any problem. Host is a centos7 KVM. I've already upgraded
> > firmware
> > >> to latest.
> > >>
> > >> [root at localhost ~]# uname -a
> > >>
> > >> Linux localhost.localdomain 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon Jan 25
> > >> 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> > >>
> > >> [root at localhost ~]#
> > >>
> > >> [root at localhost ~]# ./symmetric_mp fakeelf --file-prefix=virtaddr -c2
> > >> -m2048 -n1 --base-virtaddr=0x7fa0 --proc-type=primary -- -p 3
> > >> --num-procs=1 --proc-id=0
> > >>
> > >> EAL: Detected lcore 0 as core 0 on socket 0
> > >>
> > >> EAL: Detected lcore 1 as core 0 on socket 0
> > >>
> > >> EAL: Support maximum 128 logical core(s) by configuration.
> > >>
> > >> EAL: Detected 2 lcore(s)
> > >>
> > >> EAL: No free hugepages reported in hugepages-1048576kB
> > >>
> > >> EAL: VFIO modules not all loaded, skip VFIO support...
> > >>
> > >> EAL: Setting up physically contiguous memory...
> > >>
> > >> EAL: Ask a virtual area of 0x2a80 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa0 (size = 0x2a80)
> > >>
> > >> EAL: Ask a virtual area of 0x20 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa02a80 (size = 0x20)
> > >>
> > >> EAL: Ask a virtual area of 0x5400 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa02aa0 (size = 0x5400)
> > >>
> > >> EAL: Ask a virtual area of 0x40 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa07ea0 (size = 0x40)
> > >>
> > >> EAL: Ask a virtual area of 0x20 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa07ee0 (size = 0x20)
> > >>
> > >> EAL: Ask a virtual area of 0x20 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa07f00 (size = 0x20)
> > >>
> > >> EAL: Ask a virtual area of 0x60 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa07f20 (size = 0x60)
> > >>
> > >> EAL: Ask a virtual area of 0x20 bytes
> > >>
> > >> EAL: Virtual area found at 0x7fa07f80 (size = 0x20)
> > >>
> > >> EAL: Ask a

[dpdk-dev] [PATCH 7/9] pci: add a helper for device name

2016-02-10 Thread Jan Viktorin

On Fri, 29 Jan 2016 15:08:34 +0100
David Marchand  wrote:

> eal is a better place than ethdev for naming resources.
> Add a helper here, and make use of it in ethdev hotplug code.
> 
> Signed-off-by: David Marchand 
> ---
>  lib/librte_eal/common/include/rte_pci.h | 28 
>  lib/librte_ether/rte_ethdev.c   | 22 ++
>  2 files changed, 30 insertions(+), 20 deletions(-)
> 
> diff --git a/lib/librte_eal/common/include/rte_pci.h 
> b/lib/librte_eal/common/include/rte_pci.h
> index 334c12e..9edd5f5 100644
> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -309,6 +309,34 @@ eal_parse_pci_DomBDF(const char *input, struct 
> rte_pci_addr *dev_addr)
>  }
>  #undef GET_PCIADDR_FIELD
>  
> +/**
> + * Utility function to write a pci device name, this device name can later be
> + * used to retrieve the corresponding rte_pci_addr using above functions.
> + *
> + * @param addr
> + *   The PCI Bus-Device-Function address
> + * @param output
> + *   The output buffer string
> + * @param size
> + *   The output buffer size
> + * @return
> + *  0 on success, negative on error.
> + */
> +static inline int
> +eal_pci_device_name(const struct rte_pci_addr *addr,
> + char *output, int size)

"size_t size" (or unsigned int) seems to be better to me.

> +{
> + int ret;
> +
> + ret = snprintf(output, size, PCI_PRI_FMT,
> +addr->domain, addr->bus,
> +addr->devid, addr->function);
> + if (ret < 0 || ret >= size)
> + return -1;

The return value is not checked (below). I think, such functions
are usually missing error checking as nobody expects them to fail.
Wouldn't it be better to panic here?

> +
> + return 0;
> +}
> +
>  /* Compare two PCI device addresses. */
>  /**
>   * Utility function to compare two PCI device addresses.
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 17e4f4d..5ba7479 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -214,20 +214,6 @@ rte_eth_dev_allocate(const char *name, enum 
> rte_eth_dev_type type)
>   return eth_dev;
>  }
>  
> -static int
> -rte_eth_dev_create_unique_device_name(char *name, size_t size,
> - struct rte_pci_device *pci_dev)
> -{
> - int ret;
> -
> - ret = snprintf(name, size, "%d:%d.%d",
> - pci_dev->addr.bus, pci_dev->addr.devid,
> - pci_dev->addr.function);
> - if (ret < 0)
> - return ret;
> - return 0;
> -}
> -
>  int
>  rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>  {
> @@ -251,9 +237,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>  
>   eth_drv = (struct eth_driver *)pci_drv;
>  
> - /* Create unique Ethernet device name using PCI address */
> - rte_eth_dev_create_unique_device_name(ethdev_name,
> - sizeof(ethdev_name), pci_dev);
> + eal_pci_device_name(_dev->addr, ethdev_name, sizeof(ethdev_name));
>  
>   eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
>   if (eth_dev == NULL)
> @@ -304,9 +288,7 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>   if (pci_dev == NULL)
>   return -EINVAL;
>  
> - /* Create unique Ethernet device name using PCI address */
> - rte_eth_dev_create_unique_device_name(ethdev_name,
> - sizeof(ethdev_name), pci_dev);
> + eal_pci_device_name(_dev->addr, ethdev_name, sizeof(ethdev_name));
>  
>   eth_dev = rte_eth_dev_allocated(ethdev_name);
>   if (eth_dev == NULL)



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH 6/9] eal: initialize vdevs right next to pci devices

2016-02-10 Thread Jan Viktorin

On Fri, 29 Jan 2016 15:08:33 +0100
David Marchand  wrote:

> This way, the resources probing happens in a common place.
> 
> Signed-off-by: David Marchand 
> ---
>  lib/librte_eal/bsdapp/eal/eal.c | 7 +++
>  lib/librte_eal/common/include/rte_dev.h | 2 +-
>  lib/librte_eal/linuxapp/eal/eal.c   | 7 +++
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index a34e61d..b557a9f 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -582,8 +582,10 @@ rte_eal_init(int argc, char **argv)
>   rte_config.master_lcore, thread_id, cpuset,
>   ret == 0 ? "" : "...");
>  
> +#ifndef RTE_NEXT_ABI
>   if (rte_eal_dev_init() < 0)
>   rte_panic("Cannot init pmd devices\n");
> +#endif
>  
>   RTE_LCORE_FOREACH_SLAVE(i) {
>  
> @@ -617,6 +619,11 @@ rte_eal_init(int argc, char **argv)
>   rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER);
>   rte_eal_mp_wait_lcore();
>  
> +#ifdef RTE_NEXT_ABI
> + if (rte_eal_dev_init() < 0)
> + rte_panic("Cannot probe vdev devices\n");
> +#endif
> +
>   /* Probe & Initialize PCI devices */
>   if (rte_eal_pci_probe())
>   rte_panic("Cannot probe PCI\n");
> diff --git a/lib/librte_eal/common/include/rte_dev.h 
> b/lib/librte_eal/common/include/rte_dev.h
> index 88c1a19..df69e28 100644
> --- a/lib/librte_eal/common/include/rte_dev.h
> +++ b/lib/librte_eal/common/include/rte_dev.h
> @@ -156,7 +156,7 @@ void rte_eal_driver_register(struct rte_driver *driver);
>  void rte_eal_driver_unregister(struct rte_driver *driver);
>  
>  /**
> - * Initalize all the registered drivers in this process
> + * Scan all devargs and attach to drivers if available
>   */
>  int rte_eal_dev_init(void);

Move this to a separate commit? Is it just a forgotten doc comment?

>  
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
> b/lib/librte_eal/linuxapp/eal/eal.c
> index 62241ee..95313af 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -834,8 +834,10 @@ rte_eal_init(int argc, char **argv)
>   rte_config.master_lcore, (int)thread_id, cpuset,
>   ret == 0 ? "" : "...");
>  
> +#ifndef RTE_NEXT_ABI
>   if (rte_eal_dev_init() < 0)
>   rte_panic("Cannot init pmd devices\n");
> +#endif
>  
>   RTE_LCORE_FOREACH_SLAVE(i) {
>  
> @@ -873,6 +875,11 @@ rte_eal_init(int argc, char **argv)
>   rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER);
>   rte_eal_mp_wait_lcore();
>  
> +#ifdef RTE_NEXT_ABI
> + if (rte_eal_dev_init() < 0)
> + rte_panic("Cannot probe vdev devices\n");
> +#endif
> +
>   /* Probe & Initialize PCI devices */
>   if (rte_eal_pci_probe())
>   rte_panic("Cannot probe PCI\n");

I cannot see the point why it is enclosed in the RTE_NEXT_ABI. Is it
such a serious breakage?

Regards
Jan

[dpdk-dev] i40e: cannot change mtu to enable jumbo frame

2016-02-10 Thread Julien Meunier

On 02/09/2016 08:05 PM, Zhu, Heqing wrote:
> Helin is still in Chinese New Year Vacation. Will the below command option 
> help ?
>
> 4.5.9. port config - max-pkt-len
> Set the maximum packet length:
>
> testpmd> port config all max-pkt-len (value)
> This is equivalent to the --max-pkt-len command-line option.
>
>
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Julien Meunier
> Sent: Tuesday, February 9, 2016 9:36 AM
> To: Zhang, Helin ; dev at dpdk.org
> Subject: [dpdk-dev] i40e: cannot change mtu to enable jumbo frame
>
> Hello Helin,
>
> I tried to send jumbo frames to a i40e card. However, I observed that all 
> frames are dropped. Moreover, set_mtu function is not implemented on i40e PMD.
>
>   > testpmd --log-level 8 --huge-dir=/mnt/huge -n 4 -l 2,18 --socket-mem
> 1024,1024 -w :02:00.0 -w :02:00.2 -- -i --nb-cores=1
> --nb-ports=2 --total-num-mbufs=65536
>
> =
> Configuration
> =
>
> +---+  +-+
> |   |  | |
> | tgen  |  | |
> |   +--+ port 0  |
> |   |  | |
> |   |  | |
> |   |  | |
> |   |  | |
> |   +--+ port 1  |
> |   |  | |
> +---+  +-+
>
> DPDK: DPDK-v2.2
>
> ==
> MTU = 1500
> ==
> Packet sent from a tgen
>   > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP))  > 
> len(p) = 1514
>
> testpmd> start
> PMD: i40e_rxd_to_vlan_tci(): Mbuf vlan_tci: 0, vlan_tci_outer: 0
> testpmd> stop
> Telling cores to stop...
> Waiting for lcores to finish...
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats start ***
> PMD: i40e_update_vsi_stats(): rx_bytes:1518
> PMD: i40e_update_vsi_stats(): rx_unicast:  1
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): rx_bytes:1514
> PMD: i40e_dev_stats_get(): rx_unicast:  1
> PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1
> PMD: i40e_dev_stats_get(): rx_size_1522: 1
> PMD: i40e_dev_stats_get(): *** PF stats end ***
>
> -- Forward statistics for port 0  --
> RX-packets: 1  RX-dropped: 0 RX-total: 1
> TX-packets: 0  TX-dropped: 0 TX-total: 0
>
> 
> PMD: i40e_update_vsi_stats(): *** VSI[14] stats start ***
> PMD: i40e_update_vsi_stats(): tx_bytes:1514
> PMD: i40e_update_vsi_stats(): tx_unicast:  1
> PMD: i40e_update_vsi_stats(): *** VSI[14] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): tx_bytes:1514
> PMD: i40e_dev_stats_get(): tx_unicast:  1
> PMD: i40e_dev_stats_get(): tx_size_1522: 1
> PMD: i40e_dev_stats_get(): *** PF stats end ***
>
> -- Forward statistics for port 1 --
> RX-packets: 0  RX-dropped: 0 RX-total: 0
> TX-packets: 1  TX-dropped: 0 TX-total: 1
>
> 
>
> + Accumulated forward statistics for all ports+
> RX-packets: 1  RX-dropped: 0 RX-total: 1
> TX-packets: 1  TX-dropped: 0 TX-total: 1
>
> 
>
> => OK
>
> ==
> MTU = 1600
> ==
> Packet sent
>   > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP))  > 
> len(p) = 1614
>
> testpmd> port config mtu 0 1600
> rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95
> testpmd> port config mtu 1 1600
> rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95
> testpmd> start
> testpmd> stop
> Telling cores to stop...
> Waiting for lcores to finish...
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats start ***
> PMD: i40e_update_vsi_stats(): rx_bytes:1618
> PMD: i40e_update_vsi_stats(): rx_unicast:  1
> PMD: i40e_update_vsi_stats(): *** VSI[13] stats end ***
> PMD: i40e_dev_stats_get(): *** PF stats start ***
> PMD: i40e_dev_stats_get(): rx_bytes:1614
> PMD: i40e_dev_stats_get(): rx_unicast:  1
> PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1
> PMD: i40e_dev_stats_get(): rx_size_big:  1
> PMD: i40e_dev_stats_get(): *** PF stats end ***
>
> -- Forward statistics for port 0  --
> RX-packets: 1  RX-dropped: 0 RX-total: 1
> TX-packets: 0  TX-dropped: 0 TX-total: 0
>
> 
> PMD:

[dpdk-dev] [PATCH] vhost: remove vhost_net_device_ops

2016-02-10 Thread Rich Lane

The indirection is unnecessary because there is only one implementation of the
vhost common code. Removing it makes the code more readable.

Signed-off-by: Rich Lane 
---
 examples/vhost_xen/virtio-net.h   |  2 -
 lib/librte_vhost/vhost-net.h  | 40 +---
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c  | 27 
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  4 +-
 lib/librte_vhost/vhost_user/vhost-net-user.c  | 23 +++
 lib/librte_vhost/vhost_user/virtio-net-user.c |  6 +-
 lib/librte_vhost/virtio-net.c | 92 ---
 7 files changed, 70 insertions(+), 124 deletions(-)

diff --git a/examples/vhost_xen/virtio-net.h b/examples/vhost_xen/virtio-net.h
index c8c5a7a..ab69726 100644
--- a/examples/vhost_xen/virtio-net.h
+++ b/examples/vhost_xen/virtio-net.h
@@ -110,6 +110,4 @@ struct virtio_net_device_ops {
void (* destroy_device) (volatile struct virtio_net *); /* Remove 
device. */
 };

-struct vhost_net_device_ops const * get_virtio_net_callbacks(void);
-
 #endif
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..81274df 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -43,8 +43,6 @@

 #include "rte_virtio_net.h"

-extern struct vhost_net_device_ops const *ops;
-
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
 #define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
@@ -85,34 +83,26 @@ struct vhost_device_ctx {
uint64_tfh; /* Populated with fi->fh to track the device 
index. */
 };

-/*
- * Structure contains function pointers to be defined in virtio-net.c. These
- * functions are called in CUSE context and are used to configure devices.
- */
-struct vhost_net_device_ops {
-   int (*new_device)(struct vhost_device_ctx);
-   void (*destroy_device)(struct vhost_device_ctx);
+int vhost_new_device(struct vhost_device_ctx);
+void vhost_destroy_device(struct vhost_device_ctx);

-   void (*set_ifname)(struct vhost_device_ctx,
-   const char *if_name, unsigned int if_len);
+void vhost_set_ifname(struct vhost_device_ctx,
+   const char *if_name, unsigned int if_len);

-   int (*get_features)(struct vhost_device_ctx, uint64_t *);
-   int (*set_features)(struct vhost_device_ctx, uint64_t *);
+int vhost_get_features(struct vhost_device_ctx, uint64_t *);
+int vhost_set_features(struct vhost_device_ctx, uint64_t *);

-   int (*set_vring_num)(struct vhost_device_ctx, struct vhost_vring_state 
*);
-   int (*set_vring_addr)(struct vhost_device_ctx, struct vhost_vring_addr 
*);
-   int (*set_vring_base)(struct vhost_device_ctx, struct vhost_vring_state 
*);
-   int (*get_vring_base)(struct vhost_device_ctx, uint32_t, struct 
vhost_vring_state *);
+int vhost_set_vring_num(struct vhost_device_ctx, struct vhost_vring_state *);
+int vhost_set_vring_addr(struct vhost_device_ctx, struct vhost_vring_addr *);
+int vhost_set_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
+int vhost_get_vring_base(struct vhost_device_ctx, uint32_t, struct 
vhost_vring_state *);

-   int (*set_vring_kick)(struct vhost_device_ctx, struct vhost_vring_file 
*);
-   int (*set_vring_call)(struct vhost_device_ctx, struct vhost_vring_file 
*);
+int vhost_set_vring_kick(struct vhost_device_ctx, struct vhost_vring_file *);
+int vhost_set_vring_call(struct vhost_device_ctx, struct vhost_vring_file *);

-   int (*set_backend)(struct vhost_device_ctx, struct vhost_vring_file *);
-
-   int (*set_owner)(struct vhost_device_ctx);
-   int (*reset_owner)(struct vhost_device_ctx);
-};
+int vhost_set_backend(struct vhost_device_ctx, struct vhost_vring_file *);

+int vhost_set_owner(struct vhost_device_ctx);
+int vhost_reset_owner(struct vhost_device_ctx);

-struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c 
b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
index ae7ad8d..c613e68 100644
--- a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
@@ -58,7 +58,6 @@ static const char cuse_device_name[] = "/dev/cuse";
 static const char default_cdev[] = "vhost-net";

 static struct fuse_session *session;
-struct vhost_net_device_ops const *ops;

 /*
  * Returns vhost_device_ctx from given fuse_req_t. The index is populated later
@@ -86,7 +85,7 @@ vhost_net_open(fuse_req_t req, struct fuse_file_info *fi)
struct vhost_device_ctx ctx = fuse_req_to_vhost_ctx(req, fi);
int err = 0;

-   err = ops->new_device(ctx);
+   err = vhost_new_device(ctx);
if (err == -1) {
fuse_reply_err(req, EPERM);
return;
@@ -108,7 +107,7 @@ vhost_net_release(fuse_req_t req, struct fuse_file_info *fi)
int err = 0;
struct vhost_device_ctx ctx = fuse_req_to_vhost_ctx(req, fi);

-

[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers

2016-02-10 Thread Jan Viktorin

On Wed, 10 Feb 2016 10:27:14 +0100
David Marchand  wrote:

> On Wed, Feb 10, 2016 at 9:51 AM, David Marchand
>  wrote:
> > On Tue, Feb 9, 2016 at 6:05 PM, Jan Viktorin  
> > wrote:  
> >> What about introducing a macro for this?
> >>
> >> RTE_REGISTER_PCI_DRIVER(rte_qad_pmd);  
> >
> > Yes.  
> 
> The only problem here, is that rte_qad_pmd is a crypto structure (same
> problem with ethdev), so I can't just pass it to eal.
> I can't just pass the pci driver, for the cases where multiple drivers
> are registered in a single file (look at ixgbe driver).
> 
> So, how about :
> 
> In rte_pci.h :
> 
> #define RTE_EAL_PCI_REGISTER(name, d)\
> void pciinitfn_ ##name(void);\
> void __attribute__((constructor, used)) pciinitfn_ ##name(void)\
> {\
> rte_eal_pci_register(d);\

I meant
rte_eal_pci_register(&(d)->pci_drv);\

Perhaps, my assumption that a PCI driver is always referred as pci_drv is 
wrong...

> }
> 
> Then, in qat driver :
> RTE_EAL_PCI_REGISTER(qat, _qat_pmd.pci_drv);
> 

However, I think that this is a detail. Passing rte_pci_driver is good.

Regards
Jan

-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH v2] cfgfile: support looking up sections by index

2016-02-10 Thread Rich Lane

On Tue, Feb 2, 2016 at 7:10 AM, Dumitrescu, Cristian <
cristian.dumitrescu at intel.com> wrote:

>
>
> > -Original Message-
> > From: Rich Lane [mailto:rich.lane at bigswitch.com]
> > Sent: Tuesday, January 19, 2016 4:42 AM
> > To: dev at dpdk.org
> > Cc: Dumitrescu, Cristian ; Panu
> Matilainen
> > 
> > Subject: [PATCH v2] cfgfile: support looking up sections by index
> >
> > This is useful when sections have duplicate names.
>
> Hi Rich,
>
> You are right, I can see this is needed to allow multiple sections with
> identical name in the same configuration file. When sections with the same
> name are not allowed, then this is not needed, as the current API is
> sufficient.
>
> To me, having several sections with the same name does not look like a
> good idea, in fact it might look like an application design flaw, as
> differentiating between sections with the same name can only done based on
> the position of the section in the configuration file, which is an error
> prone option. Some examples:
> 1. While maintaining a large config file, keeping a specific section at a
> fixed position quickly becomes a pain, and shifting the position of the
> section up or down can completely change the application behavior;
> 2. Using basic pre-processors (like CPP or M4) or scripts, the master
> configuration file can include other configuration files, with the
> inclusion of each decided at run-time based on application command line
> parameters, so the position of certain sections is actually not known until
> run-time.
>
> Can you provide some examples when having multiple sections with the same
> name is a key requirement?
>

My application uses a config file to assign RX/TX queues to cores. Ports
and cores have names (e.g. "core.fwd0"), but there's no reason to name
queues. The order of the queue sections is unimportant.


> A straight forward workaround to having multiple sections with the same
> name is to add a number to the name of each such section. Using the current
> API, all the sections with the same prefix name can be read gracefully. As
> an example, the ip_pipeline application allows multiple sections with the
> same name prefix but a different number prefix:
> PIPELINE0, PIPELINE1, ...
> LINK0, LINK1, ...
> MEMPOOL0, MEMPOOL1, ...
> RXQ0.0, RXQ0.1, RXQ1.0, ...
> TXQ0.0, TXQ0.1, TXQ1.0, ...
>
> Is there a reason why this approach is not acceptable for your application?
>

It is less usable. The person writing the config file must generate those
suffixes which are irrelevant to what they're trying to express. The sole
effect is to add another source of errors.

Additionally, this API allows reading a config file in linear instead of
quadratic time (wrt number of sections). While my config files aren't long
enough to make this a requirement it certainly seems desirable.

[dpdk-dev] [PATCH v3] cfgfile: support looking up sections by index

2016-02-10 Thread Rich Lane

This is useful when sections have duplicate names.

Signed-off-by: Rich Lane 
---
v2->v3
- Added check for index < 0.
v1->v2:
- Added new symbol to version script.

 lib/librte_cfgfile/rte_cfgfile.c   | 16 
 lib/librte_cfgfile/rte_cfgfile.h   | 23 +++
 lib/librte_cfgfile/rte_cfgfile_version.map |  6 ++
 3 files changed, 45 insertions(+)

diff --git a/lib/librte_cfgfile/rte_cfgfile.c b/lib/librte_cfgfile/rte_cfgfile.c
index a677dad..eba0713 100644
--- a/lib/librte_cfgfile/rte_cfgfile.c
+++ b/lib/librte_cfgfile/rte_cfgfile.c
@@ -333,6 +333,22 @@ rte_cfgfile_section_entries(struct rte_cfgfile *cfg, const 
char *sectionname,
return i;
 }

+int
+rte_cfgfile_section_entries_by_index(struct rte_cfgfile *cfg, int index,
+   struct rte_cfgfile_entry *entries, int max_entries)
+{
+   int i;
+   const struct rte_cfgfile_section *sect;
+
+   if (index < 0 || index >= cfg->num_sections)
+   return -1;
+
+   sect = cfg->sections[index];
+   for (i = 0; i < max_entries && i < sect->num_entries; i++)
+   entries[i] = *sect->entries[i];
+   return i;
+}
+
 const char *
 rte_cfgfile_get_entry(struct rte_cfgfile *cfg, const char *sectionname,
const char *entryname)
diff --git a/lib/librte_cfgfile/rte_cfgfile.h b/lib/librte_cfgfile/rte_cfgfile.h
index d443782..8e69971 100644
--- a/lib/librte_cfgfile/rte_cfgfile.h
+++ b/lib/librte_cfgfile/rte_cfgfile.h
@@ -155,6 +155,29 @@ int rte_cfgfile_section_entries(struct rte_cfgfile *cfg,
struct rte_cfgfile_entry *entries,
int max_entries);

+/** Get section entries as key-value pairs
+*
+* The index of a section is the same as the index of its name in the
+* result of rte_cfgfile_sections. This API can be used when there are
+* multiple sections with the same name.
+*
+* @param cfg
+*   Config file
+* @param index
+*   Section index
+* @param entries
+*   Pre-allocated array of at least max_entries entries where the section
+*   entries are stored as key-value pair after successful invocation
+* @param max_entries
+*   Maximum number of section entries to be stored in entries array
+* @return
+*   Number of entries populated on success, negative error code otherwise
+*/
+int rte_cfgfile_section_entries_by_index(struct rte_cfgfile *cfg,
+   int index,
+   struct rte_cfgfile_entry *entries,
+   int max_entries);
+
 /** Get value of the named entry in named config file section
 *
 * @param cfg
diff --git a/lib/librte_cfgfile/rte_cfgfile_version.map 
b/lib/librte_cfgfile/rte_cfgfile_version.map
index bf6c6fd..f6a27a9 100644
--- a/lib/librte_cfgfile/rte_cfgfile_version.map
+++ b/lib/librte_cfgfile/rte_cfgfile_version.map
@@ -13,3 +13,9 @@ DPDK_2.0 {

local: *;
 };
+
+DPDK_2.3 {
+   global:
+
+   rte_cfgfile_section_entries_by_index;
+} DPDK_2.0;
-- 
1.9.1

[dpdk-dev] [PATCH v2 1/3] driver/net/mpipe: support native build on tilegx platform.

2016-02-10 Thread Thomas Monjalon

2016-02-10 09:49, Bruce Richardson:
> On Tue, Feb 09, 2016 at 11:47:55PM +0100, Thomas Monjalon wrote:
> > 2016-02-09 21:15, Liming Sun:
> > > Looks like this patch serie has been merged into dpdk-next-net/rel_16_04.
> > > What would be the usual way to submit changes for new comments? Would it 
> > > be incremental changes (new commit) based on previous one? Thanks.
> > 
> > Good question.
> > I think it's better if Bruce drops or reverts the commits from dpdk-next-net
> > to let you re-submit a better new version.
> > Bruce, do you agree?
> 
> Unless there is something actually broken - that was previously working - by 
> this patchset I'd rather not revert it. This patch was sitting acked for a 
> month
> which is a reasonable time for comments before applying it. Allowing people to
> step up post-apply and look for patches being reverted is not something we 
> want
> to encourage IMHO. There are already too many reviews being done at the last
> minute, and allowing reverts may make that situation worse, while applying 
> acked
> patches within a reasonable time - irrespective of whether people subsequently
> find issues with them - should encourage earlier reviews, and makes it easier 
> on
> contributors.

Yes you are right.

> Therefore I'd rather see any additional enhancements or changes
> done as incremental patches on top of this set.
> 
> Regards,
> /Bruce
>

[dpdk-dev] [PATCH v3] vhost: fix leak of fds and mmaps

2016-02-10 Thread Rich Lane

The common vhost code only supported a single mmap per device. vhost-user
worked around this by saving the address/length/fd of each mmap after the end
of the rte_virtio_memory struct. This only works if the vhost-user code frees
dev->mem, since the common code is unaware of the extra info. The
VHOST_USER_RESET_OWNER message is one situation where the common code frees
dev->mem and leaks the fds and mappings. This happens every time I shut down a
VM.

The new code calls back into the implementation (vhost-user or vhost-cuse) to
clean up these resources.

The vhost-cuse changes are only compile tested.

Signed-off-by: Rich Lane 
Acked-by: Yuanhan Liu 
---
v2->v3:
- Rename "impl" to "backend".

v1->v2:
- Call into vhost-user/vhost-cuse to free mmaps.

 lib/librte_vhost/vhost-net.h  |  6 ++
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 12 
 lib/librte_vhost/vhost_user/vhost-net-user.c  |  1 -
 lib/librte_vhost/vhost_user/virtio-net-user.c | 25 ++---
 lib/librte_vhost/vhost_user/virtio-net-user.h |  1 -
 lib/librte_vhost/virtio-net.c |  8 +---
 6 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..affbd1a 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -115,4 +115,10 @@ struct vhost_net_device_ops {


 struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
+
+/*
+ * Backend-specific cleanup. Defined by vhost-cuse and vhost-user.
+ */
+void vhost_backend_cleanup(struct virtio_net *dev);
+
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c 
b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
index ae2c3fa..374c884 100644
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -421,3 +421,15 @@ int cuse_set_backend(struct vhost_device_ctx ctx, struct 
vhost_vring_file *file)

return ops->set_backend(ctx, file);
 }
+
+void
+vhost_backend_cleanup(struct virtio_net *dev)
+{
+   /* Unmap QEMU memory file if mapped. */
+   if (dev->mem) {
+   munmap((void *)(uintptr_t)dev->mem->mapped_address,
+   (size_t)dev->mem->mapped_size);
+   free(dev->mem);
+   dev->mem = NULL;
+   }
+}
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 8b7a448..336efba 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -347,7 +347,6 @@ vserver_message_handler(int connfd, void *dat, int *remove)
close(connfd);
*remove = 1;
free(cfd_ctx);
-   user_destroy_device(ctx);
ops->destroy_device(ctx);

return;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index 2934d1c..993ed71 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -339,21 +339,6 @@ user_set_vring_enable(struct vhost_device_ctx ctx,
 }

 void
-user_destroy_device(struct vhost_device_ctx ctx)
-{
-   struct virtio_net *dev = get_device(ctx);
-
-   if (dev && (dev->flags & VIRTIO_DEV_RUNNING))
-   notify_ops->destroy_device(dev);
-
-   if (dev && dev->mem) {
-   free_mem_region(dev);
-   free(dev->mem);
-   dev->mem = NULL;
-   }
-}
-
-void
 user_set_protocol_features(struct vhost_device_ctx ctx,
   uint64_t protocol_features)
 {
@@ -365,3 +350,13 @@ user_set_protocol_features(struct vhost_device_ctx ctx,

dev->protocol_features = protocol_features;
 }
+
+void
+vhost_backend_cleanup(struct virtio_net *dev)
+{
+   if (dev->mem) {
+   free_mem_region(dev);
+   free(dev->mem);
+   dev->mem = NULL;
+   }
+}
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h 
b/lib/librte_vhost/vhost_user/virtio-net-user.h
index b82108d..1140ee1 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -55,5 +55,4 @@ int user_get_vring_base(struct vhost_device_ctx, struct 
vhost_vring_state *);
 int user_set_vring_enable(struct vhost_device_ctx ctx,
  struct vhost_vring_state *state);

-void user_destroy_device(struct vhost_device_ctx);
 #endif
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index de78a0f..cf2560e 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -199,13 +199,7 @@ cleanup_device(struct virtio_net *dev, int destroy)
 {
uint32_t i;

-   /* Unmap QEMU memory file if mapped. */
-   if (dev->mem) {
-   munmap((void *)(uintptr_t)dev->mem->mapped_address,
-

[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers

2016-02-10 Thread David Marchand

On Wed, Feb 10, 2016 at 9:51 AM, David Marchand
 wrote:
> On Tue, Feb 9, 2016 at 6:05 PM, Jan Viktorin  
> wrote:
>> What about introducing a macro for this?
>>
>> RTE_REGISTER_PCI_DRIVER(rte_qad_pmd);
>
> Yes.

The only problem here, is that rte_qad_pmd is a crypto structure (same
problem with ethdev), so I can't just pass it to eal.
I can't just pass the pci driver, for the cases where multiple drivers
are registered in a single file (look at ixgbe driver).

So, how about :

In rte_pci.h :

#define RTE_EAL_PCI_REGISTER(name, d)\
void pciinitfn_ ##name(void);\
void __attribute__((constructor, used)) pciinitfn_ ##name(void)\
{\
rte_eal_pci_register(d);\
}

Then, in qat driver :
RTE_EAL_PCI_REGISTER(qat, _qat_pmd.pci_drv);

-- 
David Marchand

[dpdk-dev] [PATCH 2/2] bonding: fix detach of bonded slave devices

2016-02-10 Thread Bernard Iremonger

Ensure that a bonded slave device is not detached,
until it is removed from the bonded device.

Fixes: 2efb58cbab6e ("bond: new link bonding library")
Fixes: a45b288ef21a ("bond: support link status polling")
Fixes: 494adb7f63f2 ("ethdev: add device fields from PCI layer")
Fixes: b1fb53a39d88 ("ethdev: remove some PCI specific handling")
Signed-off-by: Bernard Iremonger 
---
 drivers/net/bonding/rte_eth_bond_api.c | 33 +++--
 lib/librte_ether/rte_ethdev.c  |  8 ++--
 lib/librte_ether/rte_ethdev.h  |  4 +++-
 3 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_api.c 
b/drivers/net/bonding/rte_eth_bond_api.c
index 484a6f3..a0995ec 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -314,38 +314,23 @@ __eth_bond_slave_add_lock_free(uint8_t bonded_port_id, 
uint8_t slave_port_id)
 {
struct rte_eth_dev *bonded_eth_dev, *slave_eth_dev;
struct bond_dev_private *internals;
-   struct bond_dev_private *temp_internals;
struct rte_eth_link link_props;
struct rte_eth_dev_info dev_info;

-   int i, j;
-
if (valid_slave_port_id(slave_port_id) != 0)
return -1;

bonded_eth_dev = _eth_devices[bonded_port_id];
internals = bonded_eth_dev->data->dev_private;

-   /* Verify that new slave device is not already a slave of another
-* bonded device */
-   for (i = rte_eth_dev_count()-1; i >= 0; i--) {
-   if (check_for_bonded_ethdev(_eth_devices[i]) == 0) {
-   temp_internals = rte_eth_devices[i].data->dev_private;
-
-   for (j = 0; j < temp_internals->slave_count; j++) {
-   /* Device already a slave of a bonded device */
-   if (temp_internals->slaves[j].port_id == 
slave_port_id) {
-   RTE_BOND_LOG(ERR, "Slave port %d is 
already a slave",
-   slave_port_id);
-   return -1;
-   }
-   }
-   }
-   }
-
slave_eth_dev = _eth_devices[slave_port_id];
+   if (slave_eth_dev->data->dev_flags & RTE_ETH_DEV_BONDED_SLAVE) {
+   RTE_BOND_LOG(ERR, "Slave device is already a slave of a bonded 
device");
+   return -1;
+   }

/* Add slave details to bonded device */
+   slave_eth_dev->data->dev_flags |= RTE_ETH_DEV_BONDED_SLAVE;
slave_add(internals, slave_eth_dev);

rte_eth_dev_info_get(slave_port_id, _info);
@@ -385,6 +370,7 @@ __eth_bond_slave_add_lock_free(uint8_t bonded_port_id, 
uint8_t slave_port_id)
if (internals->link_props_set) {
if 
(link_properties_valid(&(bonded_eth_dev->data->dev_link),
  
&(slave_eth_dev->data->dev_link))) {
+   slave_eth_dev->data->dev_flags &= 
(~RTE_ETH_DEV_BONDED_SLAVE);
RTE_BOND_LOG(ERR,
"Slave port %d link 
speed/duplex not supported",
slave_port_id);
@@ -416,6 +402,7 @@ __eth_bond_slave_add_lock_free(uint8_t bonded_port_id, 
uint8_t slave_port_id)

if (bonded_eth_dev->data->dev_started) {
if (slave_configure(bonded_eth_dev, slave_eth_dev) != 0) {
+   slave_eth_dev->data->dev_flags &= 
(~RTE_ETH_DEV_BONDED_SLAVE);
RTE_BOND_LOG(ERR, "rte_bond_slaves_configure: port=%d",
slave_port_id);
return -1;
@@ -468,7 +455,7 @@ __eth_bond_slave_remove_lock_free(uint8_t bonded_port_id, 
uint8_t slave_port_id)
 {
struct rte_eth_dev *bonded_eth_dev;
struct bond_dev_private *internals;
-
+   struct rte_eth_dev *slave_eth_dev;
int i, slave_idx;

if (valid_slave_port_id(slave_port_id) != 0)
@@ -508,7 +495,9 @@ __eth_bond_slave_remove_lock_free(uint8_t bonded_port_id, 
uint8_t slave_port_id)
mac_address_set(_eth_devices[slave_port_id],
&(internals->slaves[slave_idx].persisted_mac_addr));

-   slave_remove(internals, _eth_devices[slave_port_id]);
+   slave_eth_dev = _eth_devices[slave_port_id];
+   slave_remove(internals, slave_eth_dev);
+   slave_eth_dev->data->dev_flags &= (~RTE_ETH_DEV_BONDED_SLAVE);

/*  first slave in the active list will be the primary by default,
 *  otherwise use first device in list */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 756b234..a6e83c1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c)

[dpdk-dev] [PATCH 1/2] bonding: fix detach of bonded device

2016-02-10 Thread Bernard Iremonger

Check that the bonded device has no slaves before detaching it.

Fixes: 8d30fe7fa737 ("bonding: support port hotplug")
Signed-off-by: Bernard Iremonger 
---
 drivers/net/bonding/rte_eth_bond_api.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_api.c 
b/drivers/net/bonding/rte_eth_bond_api.c
index 8a000c8..484a6f3 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -277,6 +277,7 @@ int
 rte_eth_bond_free(const char *name)
 {
struct rte_eth_dev *eth_dev = NULL;
+   struct bond_dev_private *internals;

/* now free all data allocation - for eth_dev structure,
 * dummy pci driver and internal (private) data
@@ -287,6 +288,10 @@ rte_eth_bond_free(const char *name)
if (eth_dev == NULL)
return -ENODEV;

+   internals = eth_dev->data->dev_private;
+   if (internals->slave_count != 0)
+   return -EBUSY;
+
if (eth_dev->data->dev_started == 1) {
bond_ethdev_stop(eth_dev);
bond_ethdev_close(eth_dev);
-- 
2.6.3

[dpdk-dev] [PATCH 0/2] bonding fixes

2016-02-10 Thread Bernard Iremonger

These patches fix segmentation faults which were occurring when
slave devices were detached before being removed from the bonding
device.
The slave devices must now be removed from the bonding device
before they can be detached.
The bonding device cannot be detached now until all slave devices
have been removed from it.

Bernare Iremonger (2):
  bonding: fix detach of bonded device
  bonding: fix detach of bonded slave devices

 drivers/net/bonding/rte_eth_bond_api.c | 40 +++---
 lib/librte_ether/rte_ethdev.c  |  8 +--
 lib/librte_ether/rte_ethdev.h  |  4 +++-
 3 files changed, 26 insertions(+), 26 deletions(-)

-- 
2.6.3

[dpdk-dev] Regarding mbuf allocation/free in secondary process

2016-02-10 Thread Bruce Richardson

On Tue, Feb 09, 2016 at 11:43:19PM -0800, Saravana Kumar wrote:
> Hi DPDK community,
> 
> 
> 
> I'd like to have DPDK NIC IO operations in (primary) process and
> execution logic in (secondary) processes.
> Primary process pushes NIC Rx mbufs to Secondary process through S/W ring
> 
> Seconary process allocates mbuf for Tx path and pushes down to Primary
> process for NIC Tx
> 
> 
> I have few doubts here:
> 
> 1. If Secondary process dies because of SIGKILL then how can the mbufs
> allocated in Secondary process can be freed.
>If it is normal signals like SIGINT/SIGTERM then we can be catch
> those and free in those respective signal handlers

If a process terminates abnormally then the buffers being used by that process
may well be leaked. The solution you propose of catching signals will certainly
help as you want to try and ensure that a process always frees all its buffers
properly on termination.

> 
> 2. Secondary process needs to poll on the S/W ring. This can consume 100% cpu.
>Is there a way to avoid polling in secondary process for Rx path

Not using DPDK software rings, no. You'd have to use some kernel constructs 
such as
fifo's/named pipes to do blocking reads on those. However, the overhead of using
such structures can be severe making them unusable for many packet processing
applications. An alternative might be to use some small sleep calls i.e. 
nanosleep
between polls of the SW ring in cases where traffic rates are low. That will 
reduce your cpu usage.

/Bruce

[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers

2016-02-10 Thread David Marchand

On Tue, Feb 9, 2016 at 6:05 PM, Jan Viktorin  wrote:
> Maybe, a better subject?
>
> drivers: init pdev drivers in constructors

Why not, I will try to find a best one, and if I can't, I will go with this.

> On Fri, 29 Jan 2016 15:08:30 +0100
> David Marchand  wrote:
>> -static int
>> -rte_qat_pmd_init(const char *name __rte_unused, const char *params 
>> __rte_unused)
>> +/* Driver registration */
>> +static void __attribute__((constructor, used))
>> +rte_qat_pmd_init(void)
>>  {
>>   PMD_INIT_FUNC_TRACE();
>>   rte_eal_pci_register(_qat_pmd.pci_drv);
>> - return 0;
>>  }
>> -
>> -static struct rte_driver pmd_qat_drv = {
>> - .type = PMD_PDEV,
>> - .init = rte_qat_pmd_init,
>> -};
>> -
>> -PMD_REGISTER_DRIVER(pmd_qat_drv);
>
> What about introducing a macro for this?
>
> RTE_REGISTER_PCI_DRIVER(rte_qad_pmd);

Yes.


-- 
David Marchand

[dpdk-dev] [PATCH v2 1/3] driver/net/mpipe: support native build on tilegx platform.

2016-02-10 Thread Bruce Richardson

On Tue, Feb 09, 2016 at 11:47:55PM +0100, Thomas Monjalon wrote:
> 2016-02-09 21:15, Liming Sun:
> > Looks like this patch serie has been merged into dpdk-next-net/rel_16_04.
> > What would be the usual way to submit changes for new comments? Would it be 
> > incremental changes (new commit) based on previous one? Thanks.
> 
> Good question.
> I think it's better if Bruce drops or reverts the commits from dpdk-next-net
> to let you re-submit a better new version.
> Bruce, do you agree?

Unless there is something actually broken - that was previously working - by 
this patchset I'd rather not revert it. This patch was sitting acked for a month
which is a reasonable time for comments before applying it. Allowing people to
step up post-apply and look for patches being reverted is not something we want
to encourage IMHO. There are already too many reviews being done at the last
minute, and allowing reverts may make that situation worse, while applying acked
patches within a reasonable time - irrespective of whether people subsequently
find issues with them - should encourage earlier reviews, and makes it easier on
contributors.

Therefore I'd rather see any additional enhancements or changes
done as incremental patches on top of this set.

Regards,
/Bruce

[dpdk-dev] Regarding mbuf allocation/free in secondary process

2016-02-10 Thread Saravana Kumar

Thanks for your response..

Sara

On Wed, Feb 10, 2016 at 2:01 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Feb 09, 2016 at 11:43:19PM -0800, Saravana Kumar wrote:
> > Hi DPDK community,
> >
> >
> >
> > I'd like to have DPDK NIC IO operations in (primary) process and
> > execution logic in (secondary) processes.
> > Primary process pushes NIC Rx mbufs to Secondary process through S/W ring
> >
> > Seconary process allocates mbuf for Tx path and pushes down to Primary
> > process for NIC Tx
> >
> >
> > I have few doubts here:
> >
> > 1. If Secondary process dies because of SIGKILL then how can the mbufs
> > allocated in Secondary process can be freed.
> >If it is normal signals like SIGINT/SIGTERM then we can be catch
> > those and free in those respective signal handlers
>
> If a process terminates abnormally then the buffers being used by that
> process
> may well be leaked. The solution you propose of catching signals will
> certainly
> help as you want to try and ensure that a process always frees all its
> buffers
> properly on termination.
>
> >
> > 2. Secondary process needs to poll on the S/W ring. This can consume
> 100% cpu.
> >Is there a way to avoid polling in secondary process for Rx path
>
> Not using DPDK software rings, no. You'd have to use some kernel
> constructs such as
> fifo's/named pipes to do blocking reads on those. However, the overhead of
> using
> such structures can be severe making them unusable for many packet
> processing
> applications. An alternative might be to use some small sleep calls i.e.
> nanosleep
> between polls of the SW ring in cases where traffic rates are low. That
> will
> reduce your cpu usage.
>
> /Bruce
>
>

[dpdk-dev] [PATCH v3] driver/net/mpipe: fix the crash/hung issue when testpmd quits

2016-02-10 Thread Liming Sun

Fixes: the hung/crash issue when quitting testpmd under high
traffic rate. The following issue were found and fixed.
1. edesc->size is not initialized properly in mpipe_do_xmit() and could
   cause buffer leak or corruption when HW buffer return is used.
2. Check the 'idesc.be' error bit in mpipe_recv_flush() to make sure
   buffer is valid before releasing it. This is to avoid issues when
   running out of buffers.
3. priv->rx_buffers counter is not accurate when HW buffer return is
   used. Remove this counter to simplify the code.

Signed-off-by: Liming Sun 
Acked-by: Zhigang Lu 
---
 drivers/net/mpipe/mpipe_tilegx.c | 46 ++--
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
index 8d006fa..4cb54c3 100644
--- a/drivers/net/mpipe/mpipe_tilegx.c
+++ b/drivers/net/mpipe/mpipe_tilegx.c
@@ -134,7 +134,6 @@ struct mpipe_dev_priv {
struct rte_mempool *rx_mpool;   /* mpool used by the rx queues. */
unsigned rx_offset; /* Receive head room. */
unsigned rx_size_code;  /* mPIPE rx buffer size code. */
-   unsigned rx_buffers;/* receive buffers on stack. */
int is_xaui:1,  /* Is this an xgbe or gbe? */
initialized:1,  /* Initialized port? */
running:1;  /* Running port? */
@@ -529,7 +528,6 @@ mpipe_recv_fill_stack(struct mpipe_dev_priv *priv, int 
count)
mpipe_recv_push(priv, mbuf);
}

-   priv->rx_buffers += count;
PMD_DEBUG_RX("%s: Filled %d/%d buffers\n", mpipe_name(priv), i, count);
 }

@@ -539,10 +537,9 @@ mpipe_recv_flush_stack(struct mpipe_dev_priv *priv)
const int offset = priv->rx_offset & ~RTE_MEMPOOL_ALIGN_MASK;
uint8_t in_port = priv->port_id;
struct rte_mbuf *mbuf;
-   unsigned count;
void *va;

-   for (count = 0; count < priv->rx_buffers; count++) {
+   while (1) {
va = gxio_mpipe_pop_buffer(priv->context, priv->stack);
if (!va)
break;
@@ -561,10 +558,6 @@ mpipe_recv_flush_stack(struct mpipe_dev_priv *priv)

__rte_mbuf_raw_free(mbuf);
}
-
-   PMD_DEBUG_RX("%s: Returned %d/%d buffers\n",
-mpipe_name(priv), count, priv->rx_buffers);
-   priv->rx_buffers -= count;
 }

 static void
@@ -1246,31 +1239,23 @@ mpipe_recv_flush(struct mpipe_dev_priv *priv)
gxio_mpipe_iqueue_t *iqueue;
gxio_mpipe_idesc_t idesc;
struct rte_mbuf *mbuf;
-   int retries = 0;
unsigned queue;

-   do {
-   mpipe_recv_flush_stack(priv);
-
-   /* Flush packets sitting in recv queues. */
-   for (queue = 0; queue < priv->nb_rx_queues; queue++) {
-   rx_queue = mpipe_rx_queue(priv, queue);
-   iqueue = _queue->iqueue;
-   while (gxio_mpipe_iqueue_try_get(iqueue, ) >= 0) {
-   mbuf = mpipe_recv_mbuf(priv, , in_port);
-   rte_pktmbuf_free(mbuf);
-   priv->rx_buffers--;
-   }
-   rte_free(rx_queue->rx_ring_mem);
-   }
-   } while (retries++ < 10 && priv->rx_buffers);
+   /* Release packets on the buffer stack. */
+   mpipe_recv_flush_stack(priv);

-   if (priv->rx_buffers) {
-   RTE_LOG(ERR, PMD, "%s: Leaked %d receive buffers.\n",
-   mpipe_name(priv), priv->rx_buffers);
-   } else {
-   PMD_DEBUG_RX("%s: Returned all receive buffers.\n",
-mpipe_name(priv));
+   /* Flush packets sitting in recv queues. */
+   for (queue = 0; queue < priv->nb_rx_queues; queue++) {
+   rx_queue = mpipe_rx_queue(priv, queue);
+   iqueue = _queue->iqueue;
+   while (gxio_mpipe_iqueue_try_get(iqueue, ) >= 0) {
+   /* Skip idesc with the 'buffer error' bit set. */
+   if (idesc.be)
+   continue;
+   mbuf = mpipe_recv_mbuf(priv, , in_port);
+   rte_pktmbuf_free(mbuf);
+   }
+   rte_free(rx_queue->rx_ring_mem);
}
 }

@@ -1339,6 +1324,7 @@ mpipe_do_xmit(struct mpipe_tx_queue *tx_queue, struct 
rte_mbuf **tx_pkts,
.xfer_size = rte_pktmbuf_data_len(mbuf),
.bound = next ? 0 : 1,
.stack_idx = mpipe_mbuf_stack_index(priv, mbuf),
+   .size  = priv->rx_size_code,
} };
if (mpipe_local.mbuf_push_debt[port_id] > 0) {
mpipe_local.mbuf_push_debt[port_id]--;
-- 
1.8.3.1

77 matches

Mail list logo