[dpdk-dev] [PATCH v2] examples: remove useless checking

2016-05-16 Thread Thomas Monjalon
> > The rte_eth_dev_count() function will never return a value greater
> > than RTE_MAX_ETHPORTS, so that checking is useless.
> > 
> > Signed-off-by: Mauricio Vasquez B  > studenti.polito.it>
> 
> Acked-by: Ferruh Yigit 

Applied, thanks


[dpdk-dev] [PATCH] examples/ip_pipline: fix memory initialization in firewall bulk functions

2016-05-16 Thread Thomas Monjalon
> > bulk functions expect that all memory is set with zeros
> > 
> > Fixes: 67ebdbef0c31 ("examples/ip_pipeline: add bulk update of firewall
> > rules")
> > 
> > Signed-off-by: Daniel Mrzyglod 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks


[dpdk-dev] [PATCH] examples/ip_pipeline: fix out-of-bounds write

2016-05-16 Thread Thomas Monjalon
> > CID 124567:
> > In the function app_init_eal(struct app params * app) number of
> > entries into array exceeds the size of the array if the conditions
> > are fulfilled.
> > 
> > Fixes: 7f64b9c004aa ("examples/ip_pipeline: rework config file syntax")
> > 
> > Signed-off-by: Marcin Kerlin 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks



[dpdk-dev] [PATCH 1/1] examples/distributor: fix unchecked return value

2016-05-16 Thread Thomas Monjalon
> > Fix issue reported by Coverity.
> > 
> > Coverity ID 13207:
> > Value returned from a function is not checked for errors before being used.
> > 
> > Fixes: 07db4a975094 ("examples/distributor: new sample app")
> > 
> > Signed-off-by: Marcin Kerlin 
> 
> Acked-by: Reshma Pattan 

Applied, thanks


[dpdk-dev] [PATCH v2] examples/netmap_compat: fix infinite loop

2016-05-16 Thread Thomas Monjalon
2016-04-27 15:22, Michal Kobylinski:
> Fix issue reported by Coverity.
> 
> Coverity ID 30701: Infinite loop: The loop does not have a normal
> termination condition, so will continue until an abnormal condition
> arises. In rte_netmap_poll: Infinite loop with unsatisfiable exit
> condition.
> 
> Fixes: 06371afe394d ("examples/netmap_compat: import netmap
> compatibility example")
> 
> Signed-off-by: Michal Kobylinski 

Applied, thanks


[dpdk-dev] [PATCH v4] examples/qos_meter: fix unchecked return value

2016-05-16 Thread Thomas Monjalon
"check flow configuration error"

> > Fix issue reported by Coverity.
> > 
> > Coverity ID 30693: Unchecked return value
> > check_return: Calling rte_meter_srtcm_config without checking return
> > value.
> > 
> > Fixes: e6541fdec8b2 ("meter: initial import")
> > 
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks


[dpdk-dev] [PATCH v4] examples/qos_sched: fix bad bit shift operation

2016-05-16 Thread Thomas Monjalon
> > Fix issue reported by Coverity.
> > 
> > Coverity ID 30690: Bad bit shift operation
> > large_shift: In expression 1ULL << i, left shifting by more than 63 bits
> > has undefined behavior. The shift amount, i, is as much as 127.
> > 
> > Fixes: de3cfa2c9823 ("sched: initial import")
> > 
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks



[dpdk-dev] [PATCH v4] examples/qos_sched: fix bad bit shift operation

2016-05-16 Thread Thomas Monjalon
> > Subject: [PATCH v4] examples/qos_sched: fix bad bit shift operation

Slawomir, please use --in-reply-to when sending a new revision,
to let us see the full history in our mailer and in the archives.

[...]
> > diff --git a/examples/qos_sched/args.c b/examples/qos_sched/args.c
> > index 3e7fd08..354372d 100644
> > --- a/examples/qos_sched/args.c
> > +++ b/examples/qos_sched/args.c
> > @@ -123,7 +123,7 @@ app_eal_core_mask(void)
> > uint64_t cm = 0;
> > struct rte_config *cfg = rte_eal_get_configuration();
> > 
> > -   for (i = 0; i < RTE_MAX_LCORE; i++) {
> > +   for (i = 0; i < APP_MAX_LCORE; i++) {
> > if (cfg->lcore_role[i] == ROLE_RTE)
> > cm |= (1ULL << i);
> > }
> > @@ -142,7 +142,7 @@ app_cpu_core_count(void)
> > char path[PATH_MAX];
> > uint32_t ncores = 0;
> > 
> > -   for(i = 0; i < RTE_MAX_LCORE; i++) {
> > +   for (i = 0; i < APP_MAX_LCORE; i++) {
> > len = snprintf(path, sizeof(path), SYS_CPU_DIR, i);
> > if (len <= 0 || (unsigned)len >= sizeof(path))
> > continue;
> > diff --git a/examples/qos_sched/main.h b/examples/qos_sched/main.h
> > index 82aa0fa..c7490c6 100644
> > --- a/examples/qos_sched/main.h
> > +++ b/examples/qos_sched/main.h
> > @@ -68,7 +68,10 @@ extern "C" {
> > 
> >  #define BURST_TX_DRAIN_US 100
> > 
> > -#define MAX_DATA_STREAMS (RTE_MAX_LCORE/2)
> > +#ifndef APP_MAX_LCORE
> > +#define APP_MAX_LCORE 64
> > +#endif
> > +#define MAX_DATA_STREAMS (APP_MAX_LCORE/2)
> >  #define MAX_SCHED_SUBPORTS 8
> >  #define MAX_SCHED_PIPES4096
> > 
> > --
> > 1.9.1
> 
> Acked-by: Cristian Dumitrescu 

Cristian, please remove patch content when acking.
My hand is tired of scrolling ;)



[dpdk-dev] [PATCH v3] examples/qos_sched: fix copy-paste error

2016-05-16 Thread Thomas Monjalon
copy-paste is the cause of this error.
The headline should show which error is fixed (or briefly its impact area).
Here: "fix error message"

> > Fix issue reported by Coverity.
> > 
> > Coverity ID 30699: Copy-paste error;
> > rx_port in pconf->rx_port looks like a copy-paste error.
> > 
> > Fixes: de3cfa2c9823 ("sched: initial import")
> > 
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks


[dpdk-dev] [PATCH v3] examples/qos_sched: fix negative loop bound

2016-05-16 Thread Thomas Monjalon
> > Fix issue reported by Coverity.
> > 
> > Coverity ID 30704: Negative loop bound
> > negative_returns: Using unsigned variable n_tokens in a loop exit condition.
> > 
> > Fixes: de3cfa2c9823 ("sched: initial import")
> > 
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Cristian Dumitrescu 

There is no explanation but as it is acked and not so important,
Applied


[dpdk-dev] [PATCH v2] examples/qos_sched: fix out-of-bounds read

2016-05-16 Thread Thomas Monjalon
> > Fix issue reported by Coverity.
> > 
> > Coverity ID 30708: Out-of-bounds read
> > overrun-local: Overrunning array tokens of 8 8-byte elements
> > at element index 4294967294 (byte offset 34359738352)
> > using index i (which evaluates to 4294967294).
> > 
> > Fixes: de3cfa2c9823 ("sched: initial import")
> > 
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Cristian Dumitrescu 

Applied, thanks


[dpdk-dev] [PATCH 1/1] lib/librte_cmdline: fix added checking return value

2016-05-16 Thread Marcin Kerlin
Unchecked return value: value returned from a function rdline_init is 
not checked, fix added checking return value and in the case of failure
frees memory and return null pointer.

Fixes: af75078fece3 ("first public release")
Coverity ID 13204

Signed-off-by: Marcin Kerlin 
---
 lib/librte_cmdline/cmdline.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/librte_cmdline/cmdline.c b/lib/librte_cmdline/cmdline.c
index c405878..a9c47be 100644
--- a/lib/librte_cmdline/cmdline.c
+++ b/lib/librte_cmdline/cmdline.c
@@ -130,6 +130,7 @@ struct cmdline *
 cmdline_new(cmdline_parse_ctx_t *ctx, const char *prompt, int s_in, int s_out)
 {
struct cmdline *cl;
+   int ret;

if (!ctx || !prompt)
return NULL;
@@ -142,8 +143,13 @@ cmdline_new(cmdline_parse_ctx_t *ctx, const char *prompt, 
int s_in, int s_out)
cl->s_out = s_out;
cl->ctx = ctx;

-   rdline_init(>rdl, cmdline_write_char,
-   cmdline_valid_buffer, cmdline_complete_buffer);
+   ret = rdline_init(>rdl, cmdline_write_char, cmdline_valid_buffer,
+   cmdline_complete_buffer);
+   if (ret != 0) {
+   free(cl);
+   return NULL;
+   }
+
cl->rdl.opaque = cl;
cmdline_set_prompt(cl, prompt);
rdline_newline(>rdl, cl->prompt);
-- 
1.9.1



[dpdk-dev] [PATCH] Add rte_mempool_free

2016-05-16 Thread Wiles, Keith
>There is no inverse of rte_mempool_create, so this patch adds one.
>The typical usage of rte_mempool_create is to create a pool at
>initialization time and only to free it upon program exit, so an
>rte_mempool_free function at first seems to be of little value.
>However, it is very useful as a sanity check for a clean shutdown
>when used in conjunction with tools like AddressSanitizer. Further,
>the call itself verifies that all elements have been returned to
>the pool or it fails.
>
>Signed-off-by: Ben Walker 
>---
> lib/librte_mempool/rte_dom0_mempool.c | 22 +++
> lib/librte_mempool/rte_mempool.c  | 70 +++
> lib/librte_mempool/rte_mempool.h  | 41 
> 3 files changed, 133 insertions(+)
>
>diff --git a/lib/librte_mempool/rte_dom0_mempool.c 
>b/lib/librte_mempool/rte_dom0_mempool.c
>index 0d6d750..edf2d58 100644
>--- a/lib/librte_mempool/rte_dom0_mempool.c
>+++ b/lib/librte_mempool/rte_dom0_mempool.c
>@@ -131,3 +131,25 @@ rte_dom0_mempool_create(const char *name, unsigned 
>elt_num, unsigned elt_size,
> 
>   return mp;
> }
>+
>+/* free the mempool supporting Dom0 */
>+int
>+rte_dom0_mempool_free(struct rte_mempool *mp)
>+{
>+  const struct rte_memzone *mz;
>+  char mz_name[RTE_MEMZONE_NAMESIZE];
>+  int rc;
>+
>+  rc = rte_mempool_xmem_free(mp);
>+  if (rc) {
>+  return rc;
>+  }

Remove {} on single line statements.
>+
>+  snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, mp->name);
>+  mz = rte_memzone_lookup(mz_name);
>+  if (mz) {
>+  rte_memzone_free(mz);
>+  }

Here too.
>+
>+  return 0;
>+}
>diff --git a/lib/librte_mempool/rte_mempool.c 
>b/lib/librte_mempool/rte_mempool.c
>index 70812d9..82f645e 100644
>--- a/lib/librte_mempool/rte_mempool.c
>+++ b/lib/librte_mempool/rte_mempool.c
>@@ -638,6 +638,76 @@ exit_unlock:
>   return NULL;
> }
> 
>+#ifndef RTE_LIBRTE_XEN_DOM0
>+/* stub if DOM0 support not configured */
>+int
>+rte_dom0_mempool_free(struct rte_mempool *mp __rte_unused)
>+{
>+  rte_errno = EINVAL;
>+  return -1;

I was thinking this should just return OK or zero. The chances of being called 
is very low and maybe will not be called, right? If so then do we need the 
function?
>+}
>+#endif
>+
>+int
>+rte_mempool_free(struct rte_mempool *mp)
>+{
>+  if (rte_xen_dom0_supported())
>+  return rte_dom0_mempool_free(mp);
>+  else
>+  return rte_mempool_xmem_free(mp);
>+}
>+
>+
>+/* Free the memory pool */
>+int
>+rte_mempool_xmem_free(struct rte_mempool *mp)
>+{
>+  char mz_name[RTE_MEMZONE_NAMESIZE];
>+  struct rte_mempool_list *mempool_list;
>+  struct rte_tailq_entry *te = NULL;
>+  const struct rte_memzone *mz;
>+  unsigned count;
>+
>+  if (!mp) {
>+  return 0;
>+  }
Remove the extra {}
>+
>+  count = rte_mempool_free_count(mp);
>+  if (count != 0) {
>+  /* All elements must be returned to the pool before free */
>+  return count;
>+  }

This one also does not really need the {}
>+
>+  rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK);
>+
>+  /* Free the ring associated with this mempool */
>+  if (mp->ring) {
>+  rte_ring_free(mp->ring);
>+  }

This one too.
>+
>+  /* Remove the entry from the mempool list and free it. */
>+  rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
>+  mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
>+  TAILQ_FOREACH(te, mempool_list, next) {
>+  if ((struct rte_mempool *)te->data == mp) {
>+  TAILQ_REMOVE(mempool_list, te, next);
>+  rte_free(te);
>+  break;
>+  }
>+  }
>+  rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
>+
>+  snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
>+  mz = rte_memzone_lookup(mz_name);
>+  if (mz) {
>+  rte_memzone_free(mz);
>+  }

This one too.
>+
>+  rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
>+
>+  return 0;
>+}

The big question is how do you know the mempool is not being used someplace?

>+
> /* Return the number of entries in the mempool */
> unsigned
> rte_mempool_count(const struct rte_mempool *mp)
>diff --git a/lib/librte_mempool/rte_mempool.h 
>b/lib/librte_mempool/rte_mempool.h
>index 9745bf0..26949c7 100644
>--- a/lib/librte_mempool/rte_mempool.h
>+++ b/lib/librte_mempool/rte_mempool.h
>@@ -728,6 +728,47 @@ rte_dom0_mempool_create(const char *name, unsigned n, 
>unsigned elt_size,
>   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
>   int socket_id, unsigned flags);
> 
>+/**
>+ * Free the memory pool created by rte_mempool_create
>+ *
>+ * All elements must be placed back in the pool prior to calling this 
>function.
>+ *
>+ * @param mp
>+ *   A pointer to the mempool structure.
>+ * @return
>+ *   0 on success. -1 on error  

[dpdk-dev] [PATCH 2/2] doc: announce ABI change of struct rte_port_sink_params

2016-05-16 Thread Panu Matilainen
On 05/16/2016 04:18 PM, Fan Zhang wrote:
> The ABI changes are planned for rte_port_sink_params, which will be
> supported from release 16.11. Here announces that ABI changes in detail.
>
> Signed-off-by: Fan Zhang 
> Acked-by: Cristian Dumitrescu 
> ---
>  doc/guides/rel_notes/deprecation.rst | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index d228bae..d2f7306 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -78,3 +78,7 @@ Deprecation Notices
>  * ABI will change for rte_port_source_params struct. The member file_name
>data type will be changed from char * to const char *. This change targets
>release 16.11.
> +
> +* ABI will change for rte_port_sink_params struct. The member file_name
> +  data type will be changed from char * to const char *. This change targets
> +  release 16.11.
>

Surely such a minor change doesn't require two separate announcements or 
patches for that matter. In fact I doubt it's an ABI change at all 
(although it is an API change certainly).

However to me the bigger issue is that a change like this alone doesn't 
seem like worth breaking the ABI. The ABI was just broken in 16.04 to 
introduce these struct members (among other things) and to break the ABI 
again just to fixup a const-correctness issue seems a bit much. Could it 
maybe wait until there's some actually compelling reason to break the ABI?

Note that I'm not against the change as such, const-correctness is a 
good thing.

- Panu -


[dpdk-dev] [PATCH 4/4] pmd_hw_support.py: Add tool to query binaries for hw support information

2016-05-16 Thread Neil Horman
This tool searches for the primer sting PMD_DRIVER_INFO= in any ELF binary,
and, if found parses the remainder of the string as a json encoded string,
outputting the results in either a human readable or raw, script parseable
format

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 tools/pmd_hw_support.py | 174 
 1 file changed, 174 insertions(+)
 create mode 100755 tools/pmd_hw_support.py

diff --git a/tools/pmd_hw_support.py b/tools/pmd_hw_support.py
new file mode 100755
index 000..0669aca
--- /dev/null
+++ b/tools/pmd_hw_support.py
@@ -0,0 +1,174 @@
+#!/usr/bin/python3
+#---
+# scripts/pmd_hw_support.py
+#
+# Utility to dump PMD_INFO_STRING support from an object file
+#
+#---
+import os, sys
+from optparse import OptionParser
+import string
+import json
+
+# For running from development directory. It should take precedence over the
+# installed pyelftools.
+sys.path.insert(0, '.')
+
+
+from elftools import __version__
+from elftools.common.exceptions import ELFError
+from elftools.common.py3compat import (
+ifilter, byte2int, bytes2str, itervalues, str2bytes)
+from elftools.elf.elffile import ELFFile
+from elftools.elf.dynamic import DynamicSection, DynamicSegment
+from elftools.elf.enums import ENUM_D_TAG
+from elftools.elf.segments import InterpSegment
+from elftools.elf.sections import SymbolTableSection
+from elftools.elf.gnuversions import (
+GNUVerSymSection, GNUVerDefSection,
+GNUVerNeedSection,
+)
+from elftools.elf.relocation import RelocationSection
+from elftools.elf.descriptions import (
+describe_ei_class, describe_ei_data, describe_ei_version,
+describe_ei_osabi, describe_e_type, describe_e_machine,
+describe_e_version_numeric, describe_p_type, describe_p_flags,
+describe_sh_type, describe_sh_flags,
+describe_symbol_type, describe_symbol_bind, describe_symbol_visibility,
+describe_symbol_shndx, describe_reloc_type, describe_dyn_tag,
+describe_ver_flags,
+)
+from elftools.elf.constants import E_FLAGS
+from elftools.dwarf.dwarfinfo import DWARFInfo
+from elftools.dwarf.descriptions import (
+describe_reg_name, describe_attr_value, set_global_machine_arch,
+describe_CFI_instructions, describe_CFI_register_rule,
+describe_CFI_CFA_rule,
+)
+from elftools.dwarf.constants import (
+DW_LNS_copy, DW_LNS_set_file, DW_LNE_define_file)
+from elftools.dwarf.callframe import CIE, FDE
+
+raw_output = False;
+
+class ReadElf(object):
+""" display_* methods are used to emit output into the output stream
+"""
+def __init__(self, file, output):
+""" file:
+stream object with the ELF file to read
+
+output:
+output stream to write to
+"""
+self.elffile = ELFFile(file)
+self.output = output
+
+# Lazily initialized if a debug dump is requested
+self._dwarfinfo = None
+
+self._versioninfo = None
+
+def _section_from_spec(self, spec):
+""" Retrieve a section given a "spec" (either number or name).
+Return None if no such section exists in the file.
+"""
+try:
+num = int(spec)
+if num < self.elffile.num_sections():
+return self.elffile.get_section(num)
+else:
+return None
+except ValueError:
+# Not a number. Must be a name then
+return self.elffile.get_section_by_name(str2bytes(spec))
+
+def parse_pmd_info_string(self, mystring):
+global raw_output
+i = mystring.index("=");
+mystring = mystring[i+2:]
+pmdinfo = json.loads(mystring)
+
+if raw_output:
+print(pmdinfo)
+return
+
+print("PMD NAME: " + pmdinfo["name"])
+print("PMD TYPE: " + pmdinfo["type"])
+if (pmdinfo["type"] == "PMD_PDEV"):
+print("PMD HW SUPPORT:")
+print("VENDOR\t DEVICE\t SUBVENDOR\t SUBDEVICE")
+for i in pmdinfo["pci_ids"]:
+print("0x%04x\t 0x%04x\t 0x%04x\t\t 0x%04x" % (i[0], i[1], 
i[2], i[3]))
+
+print("")
+
+
+def display_pmd_info_strings(self, section_spec):
+""" Display a strings dump of a section. section_spec is either a
+section number or a name.
+"""
+section = self._section_from_spec(section_spec)
+if section is None:
+return
+
+
+found = False
+data = section.data()
+dataptr = 0
+
+while dataptr < len(data):
+while ( dataptr < len(data) and
+not (32 <= byte2int(data[dataptr]) <= 127)):
+dataptr += 1
+
+if dataptr >= len(data):
+

[dpdk-dev] [PATCH 3/4] Makefile: Do post processing on objects that register a driver

2016-05-16 Thread Neil Horman
Modify the compilation makefile to identify C files that export PMD information,
and use that to trigger execution of the pmdinfo binary.  If the execution of
pmdinfo is successful, compile the output C file to an object, and use the
linker to do relocatable linking on the resultant object file into the parent
object that it came from.  This effectively just adds the json string into the
string table of the object that defines the PMD to the outside world.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 mk/internal/rte.compile-pre.mk | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/mk/internal/rte.compile-pre.mk b/mk/internal/rte.compile-pre.mk
index b9bff4a..2771887 100644
--- a/mk/internal/rte.compile-pre.mk
+++ b/mk/internal/rte.compile-pre.mk
@@ -80,7 +80,8 @@ C_TO_O_STR = $(subst ','\'',$(C_TO_O)) #'# fix syntax 
highlight
 C_TO_O_DISP = $(if $(V),"$(C_TO_O_STR)","  HOSTCC $(@)")
 else
 C_TO_O = $(CC) -Wp,-MD,$(call obj2dep,$(@)).tmp $(CFLAGS) \
-   $(CFLAGS_$(@)) $(EXTRA_CFLAGS) -o $@ -c $<
+$(CFLAGS_$(@)) $(EXTRA_CFLAGS) -o $@ -c $<
+
 C_TO_O_STR = $(subst ','\'',$(C_TO_O)) #'# fix syntax highlight
 C_TO_O_DISP = $(if $(V),"$(C_TO_O_STR)","  CC $(@)")
 endif
@@ -88,10 +89,26 @@ C_TO_O_CMD = 'cmd_$@ = $(C_TO_O_STR)'
 C_TO_O_DO = @set -e; \
echo $(C_TO_O_DISP); \
$(C_TO_O) && \
+   sh -c "grep -q \"PMD_REGISTER_DRIVER_[PV]DEV(.*)\" $<; \
+   if [ \$$? -eq 0 ]; \
+   then \
+   echo MODGEN $@; \
+   OBJF=`readlink -f $@`; \
+   ${RTE_OUTPUT}/buildtools/pmdinfo \$$OBJF \$$OBJF.mod.c; \
+   if [ \$$? -eq 0 ]; \
+   then \
+   echo MODBUILD $@; \
+   $(CC) -c -o \$$OBJF.mod.o \$$OBJF.mod.c; \
+   $(CROSS)ld -r -o \$$OBJF.o \$$OBJF.mod.o \$$OBJF; \
+   mv -f \$$OBJF.o \$$OBJF; \
+   fi; \
+   fi; \
+   true" && \
echo $(C_TO_O_CMD) > $(call obj2cmd,$(@)) && \
sed 's,'$@':,dep_'$@' =,' $(call obj2dep,$(@)).tmp > $(call 
obj2dep,$(@)) && \
rm -f $(call obj2dep,$(@)).tmp

+
 # return an empty string if string are equal
 compare = $(strip $(subst $(1),,$(2)) $(subst $(2),,$(1)))

-- 
2.5.5



[dpdk-dev] [PATCH 2/4] drivers: Update driver registration macro usage

2016-05-16 Thread Neil Horman
Modify the PMD_REGISTER_DRIVER macro, bifurcating it into two
(PMD_REGISTER_DRIVER_PDEV and PMD_REGISTER_DRIVER_VDEV.  Both of these do the
same thing the origional macro did, but both add the definition of a string
variable that informs interested parties of the name of the pmd, and the former
also defines an second string that holds the symbol name of the pci table that
is registered by this pmd.

pmdinfo uses this information to extract hardware support from an object file
and create a json string to make hardware support info discoverable later.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 drivers/Makefile   |  2 ++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  2 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  2 +-
 drivers/crypto/null/null_crypto_pmd.c  |  2 +-
 drivers/crypto/qat/rte_qat_cryptodev.c |  2 +-
 drivers/crypto/snow3g/rte_snow3g_pmd.c |  2 +-
 drivers/net/af_packet/rte_eth_af_packet.c  |  2 +-
 drivers/net/bnx2x/bnx2x_ethdev.c   |  4 ++--
 drivers/net/bonding/rte_eth_bond_pmd.c |  2 +-
 drivers/net/cxgbe/cxgbe_ethdev.c   |  2 +-
 drivers/net/e1000/em_ethdev.c  |  2 +-
 drivers/net/e1000/igb_ethdev.c |  4 ++--
 drivers/net/ena/ena_ethdev.c   |  2 +-
 drivers/net/enic/enic_ethdev.c |  2 +-
 drivers/net/fm10k/fm10k_ethdev.c   |  2 +-
 drivers/net/i40e/i40e_ethdev.c |  2 +-
 drivers/net/i40e/i40e_ethdev_vf.c  |  2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c   |  4 ++--
 drivers/net/mlx4/mlx4.c|  2 +-
 drivers/net/mlx5/mlx5.c|  2 +-
 drivers/net/mpipe/mpipe_tilegx.c   |  4 ++--
 drivers/net/nfp/nfp_net.c  |  2 +-
 drivers/net/null/rte_eth_null.c|  2 +-
 drivers/net/pcap/rte_eth_pcap.c|  2 +-
 drivers/net/ring/rte_eth_ring.c|  2 +-
 drivers/net/szedata2/rte_eth_szedata2.c|  2 +-
 drivers/net/vhost/rte_eth_vhost.c  |  2 +-
 drivers/net/virtio/virtio_ethdev.c |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c  |  2 +-
 lib/librte_eal/common/include/rte_dev.h| 21 ++---
 31 files changed, 53 insertions(+), 36 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 81c03a8..75a3168 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -34,4 +34,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 DIRS-y += net
 DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += crypto

+DEPDIRS-y += buildtools/pmdinfo
+
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c 
b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
index 2987ef6..1c1cbf7 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
@@ -521,4 +521,4 @@ static struct rte_driver aesni_gcm_pmd_drv = {
.uninit = aesni_gcm_uninit
 };

-PMD_REGISTER_DRIVER(aesni_gcm_pmd_drv);
+PMD_REGISTER_DRIVER_VDEV(aesni_gcm_pmd_drv, aesni_gcm);
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c 
b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index 3415ac1..bbd44ff 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -716,4 +716,4 @@ static struct rte_driver cryptodev_aesni_mb_pmd_drv = {
.uninit = cryptodev_aesni_mb_uninit
 };

-PMD_REGISTER_DRIVER(cryptodev_aesni_mb_pmd_drv);
+PMD_REGISTER_DRIVER_VDEV(cryptodev_aesni_mb_pmd_drvi, aesni_mb);
diff --git a/drivers/crypto/null/null_crypto_pmd.c 
b/drivers/crypto/null/null_crypto_pmd.c
index bdaf13c..03b79d9 100644
--- a/drivers/crypto/null/null_crypto_pmd.c
+++ b/drivers/crypto/null/null_crypto_pmd.c
@@ -275,4 +275,4 @@ static struct rte_driver cryptodev_null_pmd_drv = {
.uninit = cryptodev_null_uninit
 };

-PMD_REGISTER_DRIVER(cryptodev_null_pmd_drv);
+PMD_REGISTER_DRIVER_VDEV(cryptodev_null_pmd_drv, cryptodev_null_pmd);
diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..14794c2 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -137,4 +137,4 @@ static struct rte_driver pmd_qat_drv = {
.init = rte_qat_pmd_init,
 };

-PMD_REGISTER_DRIVER(pmd_qat_drv);
+PMD_REGISTER_DRIVER_PDEV(pmd_qat_drv, pci_id_qat_map, qat);
diff --git a/drivers/crypto/snow3g/rte_snow3g_pmd.c 
b/drivers/crypto/snow3g/rte_snow3g_pmd.c
index f3e0e66..0203670 100644
--- a/drivers/crypto/snow3g/rte_snow3g_pmd.c
+++ b/drivers/crypto/snow3g/rte_snow3g_pmd.c
@@ -548,4 +548,4 @@ static struct rte_driver cryptodev_snow3g_pmd_drv = {
.uninit = cryptodev_snow3g_uninit
 };

-PMD_REGISTER_DRIVER(cryptodev_snow3g_pmd_drv);
+PMD_REGISTER_DRIVER_VDEV(cryptodev_snow3g_pmd_drv, snow3g);
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index 

[dpdk-dev] [PATCH 1/4] pmdinfo: Add buildtools and pmdinfo utility

2016-05-16 Thread Neil Horman
pmdinfo is a tool used to parse object files and build json strings for use in
later determining hardware support in a dso or application binary.  pmdinfo
looks for the non-exported symbol names this_pmd_name and this_pmd_tbl
(where n is a integer counter).  It records the name of each of these tuples,
using the later to find the symbolic name of the pci_table for physical devices
that the object supports.  With this information, it outputs a C file with a
single line of the form:

static char *_driver_info[] __attribute__((used)) = " \
PMD_DRIVER_INFO=";

Where  is the arbitrary name of the pmd, and  is the json
encoded string that hold relevant pmd information, including the pmd name, type
and optional array of pci device/vendor ids that the driver supports.

This c file is suitable for compiling to object code, then relocatably linking
into the parent file from which the C was generated.  This creates an entry in
the string table of the object that can inform a later tool about hardware
support.

Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 
---
 GNUmakefile  |   2 +-
 buildtools/Makefile  |  36 
 buildtools/pmdinfo/Makefile  |  48 +
 buildtools/pmdinfo/pmdinfo.c | 435 +++
 buildtools/pmdinfo/pmdinfo.h | 210 +
 mk/rte.buildtools.mk | 148 +++
 mk/rte.sdkbuild.mk   |   3 +-
 7 files changed, 880 insertions(+), 2 deletions(-)
 create mode 100644 buildtools/Makefile
 create mode 100644 buildtools/pmdinfo/Makefile
 create mode 100644 buildtools/pmdinfo/pmdinfo.c
 create mode 100644 buildtools/pmdinfo/pmdinfo.h
 create mode 100644 mk/rte.buildtools.mk

diff --git a/GNUmakefile b/GNUmakefile
index b59e4b6..00fe0db 100644
--- a/GNUmakefile
+++ b/GNUmakefile
@@ -40,6 +40,6 @@ export RTE_SDK
 # directory list
 #

-ROOTDIRS-y := lib drivers app
+ROOTDIRS-y := buildtools lib drivers app

 include $(RTE_SDK)/mk/rte.sdkroot.mk
diff --git a/buildtools/Makefile b/buildtools/Makefile
new file mode 100644
index 000..0f15d58
--- /dev/null
+++ b/buildtools/Makefile
@@ -0,0 +1,36 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+DIRS-y += pmdinfo 
+
+include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/buildtools/pmdinfo/Makefile b/buildtools/pmdinfo/Makefile
new file mode 100644
index 000..3dea68b
--- /dev/null
+++ b/buildtools/pmdinfo/Makefile
@@ -0,0 +1,48 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software 

[dpdk-dev] [PATCH 0/4] Implement pmd hardware support exports

2016-05-16 Thread Neil Horman
Hey all-
So heres attempt number 2 at a method for exporting PMD hardware support
information.  As we discussed previously, the consensus seems to be that pmd
information should be:

1) Able to be interrogated on any ELF binary (application binary or individual
DSO)
2) Equally functional on statically linked applications or on DSO's
3) Resilient to symbol stripping
4) Script friendly
5) Show kernel dependencies
6) List driver options
7) Show driver name
8) Offer human readable output
9) Show DPDK version
10) Show driver version
11) Allow for expansion
12) Not place additional build environment dependencies on an application

I added item 12 myself, because I don't think its reasonable to require
applications to use a specific linker script to get hardware support information
(which precludes the use of a special .modinfo section like the kernel uses)

However, we still can use some tricks from the kernel to make this work.  In
this approach, what I've done is the following:

A) Modified the driver registration macro to also define two variables:
this_pmd_name
this_pmd_table

The second is optional, and omitted for virtual devices.  These symbols are
static, and marked as used, so they will always be emitted by the compiler, and
with their well known names, easy to search for in an object (.o) file

B) Added a utility called pmdinfo.  This utility is not meant for general use,
but rather is used by the dpdk build environment itself when compiling pmds.
for each object that uses the PMD_REGISTER_DRIVER macro, this utiity is run.  It
searches for the symbols in (A), and using those, extracts the hardware support
info, and module name from the object, using that to produce a new c file
containing a single variable in the following format:

static const char [] __attribute__((used)) = "PMD_DRIVER_INFO=";

The  is arbitrary, as its static and not referenced.  The relevant bit is
the string value assigned to it.  The  is a json encoded string of the
extracted hardware support information pmdinfo found for the corresponding
object.  This C file is suitable for compilation and relocatable linking back
into the parent object file.  The result of this operation is that the object
string table now contains a string that will not e removed by stripping, whos
leading text (PMD_DRIVER_INFO) can be easily searched for at any time weather
the symbol referring to it is stripped or not.

C) Added a utilty called pmd_hw_info.py.  This python script, searches the
string table of the .rodata section of any provided ELF file looking for the
PMD_DRIVER_INFO prefix on a string.  When found, it will interpret the remainder
of the string as json, and output the hardware support for that ELF file (if
any).


This approach ticks most of the above major boxes:
1) Impervious to stripping
2) Works on static and dynamic binaries
3) Human and script friendly
4) Allows for expansion
5) Doesn't require additional build environment needs for applications

Because of (4) the other items should be pretty easy to implement, as its just a
matter of modifying the macros to export the info, pmdinfo to encode it to json,
and pmd_hw_support.py to read it.  I'm not adding them now, as theres alot of
rote work to do to get some of them in place (e.g. DPDK has no current global
VERSION macro, drivers don't have a consistent version scheme, command line
strings have to be built for each driver, etc).  But once this is accepted,
those items can be done as time allows and should be fairly easy to implement.


Signed-off-by: Neil Horman 
CC: Bruce Richardson 
CC: Thomas Monjalon 
CC: Stephen Hemminger 
CC: Panu Matilainen 



[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages

2016-05-16 Thread Thomas Monjalon
2016-05-16 14:11, Wiles, Keith:
> >2016-05-13 15:48, Wiles, Keith:
> >> I create this new tool to combine some information and use /sys/devices 
> >> instead. What I was hoping was some of you could try this script and see 
> >> if it works correctly. Also I was hope to find out if this script is 
> >> useful and what other features you would like in this type of tool.
> >
> >What is the intent of this script? Is it to be used for bug report?
> >There already have some tools to display system informations. Why adding
> >one more?
> >Examples of useful tools: hwloc/lstopo, lspci, hugeadm.
> 
> I was looking to replace the cpu_layout.py tool which uses the /procfs 
> instead of /sysfs, just figured we could then add some extra information into 
> this script as well. Yes, we have other tools, but some people do not know or 
> use or install these tools. I was hoping this one would be able to display a 
> number of things to help the developer and us in helping them debug problems.
> 
> Stephen Hemminger sent an email about the use of sysfs instead of procfs.
> http://dpdk.org/ml/archives/dev/2016-May/038560.html

I agree that cpu_layout.py should be removed.
Should we implement something else? Or just point to existing tools?
Or call existing tools from a small script?
Is it the DPDK focus to develop and maintain such tool?


[dpdk-dev] [PATCH] doc: move rel_notes instructions as comments

2016-05-16 Thread Thomas Monjalon
Hi John,

> Reviewed-by: John McNamara 
> Acked-by: John McNamara 

The meaning of Reviewed-by and Acked-by can be slightly different
but I don't think we need to put both.
In this case, as the doc maintainer, your ack should be fine.

By the way, I'm totally OK to discuss a better description of
these tags in the guidelines.


[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues

2016-05-16 Thread Simon Kågström
On 2016-05-16 14:43, Pattan, Reshma wrote:
>>> This was added to allow devices,  at least with one direction (RX/TX)
>> supported. As, devices with both directions disabled doesn't make  sense 
>> right?
>>
>> Well, not for running them, no. But this is a part of the shutdown procedure
>> between tests (I should have been more clear I guess).
> 
> Yes I understood this. But I am not sure if you can use 
> rte_eth_dev_configure(port, 0, 0) to free the resources.
> Can you check if you can use rte_eth_dev_rx_queue_stop/ 
> rte_eth_dev_tx_queue_stop to achieve the same, because they do take care of
> releasing mbufs, but doesn't free the queue's sw-ring and queue.

But isn't that very strange behavior. Aren't the descriptor rings
allocated in rx_queue_setup()? If so, the sequence

  rx_queue_stop(); // Release buffers
  rx_queue_start();

would leave the descriptor ring empty after start, i.e., not able to
receive data.

// Simon



[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages

2016-05-16 Thread Richardson, Bruce


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Monday, May 16, 2016 3:32 PM
> To: Wiles, Keith 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH] tools:new tool for system info CPU,
> memory and huge pages
> 
> 2016-05-16 14:11, Wiles, Keith:
> > >2016-05-13 15:48, Wiles, Keith:
> > >> I create this new tool to combine some information and use
> /sys/devices instead. What I was hoping was some of you could try this
> script and see if it works correctly. Also I was hope to find out if this
> script is useful and what other features you would like in this type of
> tool.
> > >
> > >What is the intent of this script? Is it to be used for bug report?
> > >There already have some tools to display system informations. Why
> > >adding one more?
> > >Examples of useful tools: hwloc/lstopo, lspci, hugeadm.
> >
> > I was looking to replace the cpu_layout.py tool which uses the /procfs
> instead of /sysfs, just figured we could then add some extra information
> into this script as well. Yes, we have other tools, but some people do not
> know or use or install these tools. I was hoping this one would be able to
> display a number of things to help the developer and us in helping them
> debug problems.
> >
> > Stephen Hemminger sent an email about the use of sysfs instead of
> procfs.
> > http://dpdk.org/ml/archives/dev/2016-May/038560.html
> 
> I agree that cpu_layout.py should be removed.
> Should we implement something else? Or just point to existing tools?
> Or call existing tools from a small script?
> Is it the DPDK focus to develop and maintain such tool?

+1 for pointing to existing tools.

/Bruce


[dpdk-dev] Ring PMD: why are stats counters atomic?

2016-05-16 Thread Mauricio Vásquez
Hello Bruce,

Although having this support does not harm anyone, I am not convinced that
it is useful, mainly because there exists the single-thread limitation in
other PMDs. Then, if an application has to use different kind of NICs (i.e,
different PMDs) it has to implement the locking strategies. On the other
hand, if an application  only uses rte_rings, it could just use the
rte_ring library.

Thanks, Mauricio V

On Tue, May 10, 2016 at 11:36 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, May 10, 2016 at 11:13:08AM +0200, Mauricio V?squez wrote:
> > Hello,
> >
> > Per-queue stats counters are defined as rte_atomic64_t, in the tx/rx
> > functions, they are atomically increased if the rings have the multiple
> > consumers/producer flag enabled.
> >
> > According to the design principles, the application should not invoke
> those
> > functions on the same queue on different cores, then I think that atomic
> > increasing is not necessary.
> >
> > Is there something wrong with my reasoning?, If not, I am willing to
> send a
> > patch.
> >
> > Thank you very much,
> >
> Since the rte_rings, on which the ring pmd is obviously based, have
> multi-producer
> and multi-consumer support built-in, I thought it might be useful in the
> ring
> PMD itself to allow multiple threads to access the ring queues at the same
> time,
> if the underlying rings are marked as MP/MC safe. When doing enqueues and
> dequeue
> from the ring, the stats are either incremented atomically, or
> non-atomically,
> depending on the underlying queue type.
>
> const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng,
> ptrs, nb_bufs);
> if (r->rng->flags & RING_F_SC_DEQ)
> r->rx_pkts.cnt += nb_rx;
> else
> rte_atomic64_add(&(r->rx_pkts), nb_rx);
>
> If people don't think this behaviour is worthwhile keeping, I'm ok with
> removing
> it, since all other PMDs have the restriction that the queues are
> single-thread
> only.
>
> Regards,
> /Bruce
>


[dpdk-dev] [PATCH] power: fix argument cannot be negative

2016-05-16 Thread Thomas Monjalon
2016-05-16 14:39 GMT+02:00 Mcnamara, John :
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
>> The next statement is probably a useless copy paste of the coverity
>> report.
>>
>> > In send_msg: Negative value used as argument to a function expecting a
>> > positive value (for example, size of buffer or allocation)
>
> A question on this point. Is it just that the Coverity message is useless
> in this case or in general? For other error/warning fixes we include the
> message in the commit.

Sometimes, the coverity message is accurate, sometimes it s better to reword it.
Anyway, having 2 sentences saying the same thing is disturbing.


[dpdk-dev] [PATCH v2] examples/vm_power_manager: buffer not null terminated

2016-05-16 Thread Thomas Monjalon
2016-05-10 17:49, Daniel Mrzyglod:
> CID30691:
> If the buffer is treated as a null terminated string in later operations,
> a buffer overflow or over-read may occur.
> 
> In add_vm: The string buffer may not have a null terminator if the source
> string's length is equal to the buffer size
> 
> Fixes: e8ae9b662506 ("examples/vm_power: channel manager and monitor in host")
> 
> Signed-off-by: Daniel Mrzyglod 

Applied, thanks


[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages

2016-05-16 Thread Wiles, Keith
>2016-05-16 14:11, Wiles, Keith:
>> >2016-05-13 15:48, Wiles, Keith:
>> >> I create this new tool to combine some information and use /sys/devices 
>> >> instead. What I was hoping was some of you could try this script and see 
>> >> if it works correctly. Also I was hope to find out if this script is 
>> >> useful and what other features you would like in this type of tool.
>> >
>> >What is the intent of this script? Is it to be used for bug report?
>> >There already have some tools to display system informations. Why adding
>> >one more?
>> >Examples of useful tools: hwloc/lstopo, lspci, hugeadm.
>> 
>> I was looking to replace the cpu_layout.py tool which uses the /procfs 
>> instead of /sysfs, just figured we could then add some extra information 
>> into this script as well. Yes, we have other tools, but some people do not 
>> know or use or install these tools. I was hoping this one would be able to 
>> display a number of things to help the developer and us in helping them 
>> debug problems.
>> 
>> Stephen Hemminger sent an email about the use of sysfs instead of procfs.
>> http://dpdk.org/ml/archives/dev/2016-May/038560.html
>
>I agree that cpu_layout.py should be removed.
>Should we implement something else? Or just point to existing tools?

Well I did create something else :-) As for pointing to existing tools, we 
should do that anyway in the documentation.
>Or call existing tools from a small script?

Calling the existing tools could be a problem as they may not be installed, but 
they could install the tools too after the script points it out.

>Is it the DPDK focus to develop and maintain such tool?
I can not answer that question per say. I can at least be the maintainer for 
this script or someone else.

I think we need something that pulls all of the information from the system for 
the developer to see at a quick glance to help debug his system. Also we can 
then say run this script and post to the list, which is nice and simple as a 
debug aid.
>


Regards,
Keith






[dpdk-dev] [PATCH] examples/kni: unchecked return value

2016-05-16 Thread Thomas Monjalon
2016-05-09 11:38, Daniel Mrzyglod:
> Fix issue reported by Coverity.
> Coverity ID 30692

Better to put reference on top of Fixes: line.

> If the function returns an error value, the error value may be mistaken for
> a normal value.
> 
> In kni_free_kni: Value returned from a function is not checked for errors
> before being used

One of the 2 sentences is enough.

> Fixes: b475eb0bc400 ("examples/kni: new parameters")
> 
> Signed-off-by: Daniel Mrzyglod 
> ---
>  examples/kni/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/kni/main.c b/examples/kni/main.c
> index a5297f2..dcecd09 100644
> --- a/examples/kni/main.c
> +++ b/examples/kni/main.c
> @@ -831,7 +831,8 @@ kni_free_kni(uint8_t port_id)
>   return -1;
>  
>   for (i = 0; i < p[port_id]->nb_kni; i++) {
> - rte_kni_release(p[port_id]->kni[i]);
> + if (rte_kni_release(p[port_id]->kni[i]))
> + printf("fail to release kni\n");

Other error messages of this file start with an uppercase.

Applied with above changes, thanks


[dpdk-dev] [PATCH v3] i40e: configure MTU

2016-05-16 Thread Olivier Matz
Hi Beilei,

On 05/13/2016 10:15 AM, Beilei Xing wrote:
> This patch enables configuring MTU for i40e.
> Since changing MTU needs to reconfigure queue, stop port first
> before configuring MTU.
> 
> Signed-off-by: Beilei Xing 
> ---
> v3 changes:
>  Add frame size with extra I40E_VLAN_TAG_SIZE.
>  Delete i40e_dev_rx_init(pf) cause it will be called when port starts.
> 
> v2 changes:
>  If mtu is not within the allowed range, return -EINVAL instead of -EBUSY.
>  Delete rxq reconfigure cause rxq reconfigure will be finished in
>  i40e_dev_rx_init.
>
>  drivers/net/i40e/i40e_ethdev.c | 34 ++
>  1 file changed, 34 insertions(+)
> 
> [...]
> +static int
> +i40e_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
> +{
> + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> + struct rte_eth_dev_data *dev_data = pf->dev_data;
> + uint32_t frame_size = mtu + ETHER_HDR_LEN
> +   + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE;
> + int ret = 0;
> +
> + /* check if mtu is within the allowed range */
> + if ((mtu < ETHER_MIN_MTU) || (frame_size > I40E_FRAME_SIZE_MAX))
> + return -EINVAL;
> +
> + /* mtu setting is forbidden if port is start */
> + if (dev_data->dev_started) {
> + PMD_DRV_LOG(ERR,
> + "port %d must be stopped before configuration\n",
> + dev_data->port_id);
> + return -ENOTSUP;
> + }

I'm not convinced that ENOTSUP is the proper return value here.
It is usually returned when a function is not implemented, which
is not the case here: the function is implemented but is forbidden
because the port is running.

I saw that Julien commented on your v1 that the return value should
be one of:
 - (0) if successful.
 - (-ENOTSUP) if operation is not supported.
 - (-ENODEV) if *port_id* invalid.
 - (-EINVAL) if *mtu* invalid.

But I think your initial value (-EBUSY) was fine. Maybe it should be
added in the API instead, with the following description:
  (-EBUSY) if the operation is not allowed when the port is running

This would allow the application to take its dispositions to stop the
port and restart it with the proper jumbo_frame argument.

+CC Thomas which maintains ethdev API.


Regards,
Olivier


[dpdk-dev] [PATCH 2/2] doc: announce ABI change of struct rte_port_sink_params

2016-05-16 Thread Fan Zhang
The ABI changes are planned for rte_port_sink_params, which will be
supported from release 16.11. Here announces that ABI changes in detail.

Signed-off-by: Fan Zhang 
Acked-by: Cristian Dumitrescu 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index d228bae..d2f7306 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -78,3 +78,7 @@ Deprecation Notices
 * ABI will change for rte_port_source_params struct. The member file_name
   data type will be changed from char * to const char *. This change targets
   release 16.11.
+
+* ABI will change for rte_port_sink_params struct. The member file_name
+  data type will be changed from char * to const char *. This change targets
+  release 16.11.
-- 
2.5.5



[dpdk-dev] [PATCH 1/2] doc: announce ABI change of struct rte_port_source_params

2016-05-16 Thread Fan Zhang
The ABI changes are planned for rte_port_source_params, which will be
supported from release 16.11. Here announces that ABI changes in detail.

Signed-off-by: Fan Zhang 
Acked-by: Cristian Dumitrescu 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index fffe9c7..d228bae 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -74,3 +74,7 @@ Deprecation Notices
   a handle, like the way kernel exposes an fd to user for locating a
   specific file, and to keep all major structures internally, so that
   we are likely to be free from ABI violations in future.
+
+* ABI will change for rte_port_source_params struct. The member file_name
+  data type will be changed from char * to const char *. This change targets
+  release 16.11.
-- 
2.5.5



[dpdk-dev] [PATCH 0/2] doc: announce ABI change of struct rte_port_source_params

2016-05-16 Thread Fan Zhang
The ABI changes are planned for rte_port_source_params and
rte_port_sink_params, which will be supported from release 16.11. Here
announces that ABI changes in detail.

Fan Zhang (2):
  doc: announce ABI change of struct rte_port_source_params
  doc: announce ABI change of struct rte_port_sink_params

 doc/guides/rel_notes/deprecation.rst | 8 
 1 file changed, 8 insertions(+)

-- 
2.5.5



[dpdk-dev] [PATCH] power: fix argument cannot be negative

2016-05-16 Thread Thomas Monjalon
The title do not convey the real issue.
We should be more concerned by an issue of "wrong error message"
rather than an "argument" which "cannot be negative".

2016-04-20 16:39, Daniel Mrzyglod:
> Fix issue reported by Coverity.
> Coverity ID 13269 & 13266:

It is better to put these references below and start with the
explanation of the issue.

> Function strerror(errno) has built strings only for non-negative errno values.
> for negative values of errno it describe error as "Unknown error -errno"
> to be more descriptive i put string "channel not found" taken from header.
> 
> The negative argument will be interpreted as a very large unsigned value.

OK.
The next statement is probably a useless copy paste of the coverity report.

> In send_msg: Negative value used as argument to a function expecting
> a positive value (for example, size of buffer or allocation)

Coverity issue: 13266
Coverity issue: 13269

> Fixes: 445c6528b55f ("power: common interface for guest and host")
> 
> Signed-off-by: Daniel Mrzyglod 
[...]
>   RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' 
> communications "
> - "test: %s\n", fd_path, strerror(ret));
> + "test: %s\n", fd_path, ret > 0 ? strerror(ret) :
> + "channel not connected");

The indent is messy. I sugest this:

+   RTE_LOG(ERR, GUEST_CHANNEL,
+   "Error on channel '%s' communications test: 
%s\n",
+   fd_path, ret > 0 ? strerror(ret) :
+   "channel not connected");


> - RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", strerror(ret));
> + RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", ret > 0 ? 
> strerror(ret)
> + : "channel not connected");

+   RTE_LOG(DEBUG, POWER, "Error sending message: %s\n",
+   ret > 0 ? strerror(ret) : "channel not connected");

Applied with above changes, thanks.


[dpdk-dev] Ring PMD: why are stats counters atomic?

2016-05-16 Thread Bruce Richardson
On Mon, May 16, 2016 at 03:12:10PM +0200, Mauricio V?squez wrote:
> Hello Bruce,
> 
> Although having this support does not harm anyone, I am not convinced that
> it is useful, mainly because there exists the single-thread limitation in
> other PMDs. Then, if an application has to use different kind of NICs (i.e,
> different PMDs) it has to implement the locking strategies. On the other
> hand, if an application  only uses rte_rings, it could just use the
> rte_ring library.
> 
> Thanks, Mauricio V
> 
I agree. 
If you want, please submit a patch to remove this behaviour and see 
if anyone objects to it. If there are no objections, I have no problem accepting
the patch.

However, since this is a behaviour change to existing functionality, we may
need to implement function versionning for this for ABI compatibility. Please
take that into account when drafting any patch.

Regards,
/Bruce

> On Tue, May 10, 2016 at 11:36 AM, Bruce Richardson <
> bruce.richardson at intel.com> wrote:
> 
> > On Tue, May 10, 2016 at 11:13:08AM +0200, Mauricio V?squez wrote:
> > > Hello,
> > >
> > > Per-queue stats counters are defined as rte_atomic64_t, in the tx/rx
> > > functions, they are atomically increased if the rings have the multiple
> > > consumers/producer flag enabled.
> > >
> > > According to the design principles, the application should not invoke
> > those
> > > functions on the same queue on different cores, then I think that atomic
> > > increasing is not necessary.
> > >
> > > Is there something wrong with my reasoning?, If not, I am willing to
> > send a
> > > patch.
> > >
> > > Thank you very much,
> > >
> > Since the rte_rings, on which the ring pmd is obviously based, have
> > multi-producer
> > and multi-consumer support built-in, I thought it might be useful in the
> > ring
> > PMD itself to allow multiple threads to access the ring queues at the same
> > time,
> > if the underlying rings are marked as MP/MC safe. When doing enqueues and
> > dequeue
> > from the ring, the stats are either incremented atomically, or
> > non-atomically,
> > depending on the underlying queue type.
> >
> > const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng,
> > ptrs, nb_bufs);
> > if (r->rng->flags & RING_F_SC_DEQ)
> > r->rx_pkts.cnt += nb_rx;
> > else
> > rte_atomic64_add(&(r->rx_pkts), nb_rx);
> >
> > If people don't think this behaviour is worthwhile keeping, I'm ok with
> > removing
> > it, since all other PMDs have the restriction that the queues are
> > single-thread
> > only.
> >
> > Regards,
> > /Bruce
> >


[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages

2016-05-16 Thread Wiles, Keith
>2016-05-13 15:48, Wiles, Keith:
>> I create this new tool to combine some information and use /sys/devices 
>> instead. What I was hoping was some of you could try this script and see if 
>> it works correctly. Also I was hope to find out if this script is useful and 
>> what other features you would like in this type of tool.
>
>What is the intent of this script? Is it to be used for bug report?
>There already have some tools to display system informations. Why adding
>one more?
>Examples of useful tools: hwloc/lstopo, lspci, hugeadm.

I was looking to replace the cpu_layout.py tool which uses the /procfs instead 
of /sysfs, just figured we could then add some extra information into this 
script as well. Yes, we have other tools, but some people do not know or use or 
install these tools. I was hoping this one would be able to display a number of 
things to help the developer and us in helping them debug problems.

Stephen Hemminger sent an email about the use of sysfs instead of procfs.
http://dpdk.org/ml/archives/dev/2016-May/038560.html


>


Regards,
Keith






[dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF

2016-05-16 Thread Olivier Matz
Hi Wenzhuo,

On 05/04/2016 11:10 PM, Wenzhuo Lu wrote:
> When the physical link is down and recover later,
> the VF link cannot recover until the user stop and
> start it manually.
> This patch implements the automatic recovery of VF
> port.
> The automatic recovery bases on the link up/down
> message received from PF. When VF receives the link
> up/down message, it will replace the RX/TX and
> operation functions with fake ones to stop RX/TX
> and any future operation. Then reset the VF port.
> After successfully resetting the port, recover the
> RX/TX and operation functions.
> 
> Signed-off-by: Wenzhuo Lu 
> 
> [...]
> 
> +void
> +ixgbevf_dev_link_up_down_handler(struct rte_eth_dev *dev)
> +{
> + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct ixgbe_adapter *adapter =
> + (struct ixgbe_adapter *)dev->data->dev_private;
> + int diag;
> + uint32_t vteiam;
> +
> + /* Only one working core need to performance VF reset */
> + if (rte_spinlock_trylock(>vf_reset_lock)) {
> + /**
> +  * When fake rec/xmit is replaced, working thread may is running
> +  * into real RX/TX func, so wait long enough to assume all
> +  * working thread exit. The assumption is it will spend less
> +  * than 100us for each execution of RX and TX func.
> +  */
> + rte_delay_us(100);
> +
> + do {
> + dev->data->dev_started = 0;
> + ixgbevf_dev_stop(dev);
> + rte_delay_us(100);

If I understand well, ixgbevf_dev_link_up_down_handler() is called
by ixgbevf_recv_pkts_fake() on a dataplane core. It means that the
core that acquired the lock will loop during 100us + 1sec at least.
If this core was also in charge of polling other queues of other
ports, or timers, many packets will be dropped (even with a 100us
loop). I don't think it is acceptable to actively wait inside a
rx function.

I think it would avoid many issues to delegate this work to the
application, maybe by notifying it that the port is in a bad state
and must be restarted. The application could then properly stop
polling the queues, and stop and restart the port in a separate thread,
without bothering the dataplane cores.


Regards,
Olivier


[dpdk-dev] [PATCH] power: fix argument cannot be negative

2016-05-16 Thread Mcnamara, John
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, May 16, 2016 2:00 PM
> ...
> > A question on this point. Is it just that the Coverity message is
> > useless in this case or in general? For other error/warning fixes we
> > include the message in the commit.
> 
> Sometimes, the coverity message is accurate, sometimes it s better to
> reword it.
> Anyway, having 2 sentences saying the same thing is disturbing.

Ok. Noted.



[dpdk-dev] [PATCH] doc: known issue on EAL argv

2016-05-16 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jingjing Wu
> Sent: Friday, May 13, 2016 6:15 AM
> To: david.marchand at 6wind.com
> Cc: dev at dpdk.org; Wu, Jingjing ; Yang, Ziye
> ; Richardson, Bruce 
> Subject: [dpdk-dev] [PATCH] doc: known issue on EAL argv
> 
> This patch docs the issue on EAL argument that the last EAL argument is
> replaced by program name in argv[].
> 
> Reported-by: Ziye Yang 
> Signed-off-by: Jingjing Wu 


Hi Ziye,


The title would be better as "doc: add known issue with EAL argv"


Also, below are some suggested changes to the docs in the patch:




The last EAL argument is replaced by the program name in argv[]
---

**Description**:
   The last EAL argument is replaced by the program name in ``argv[]`` after 
``eal_parse_args()`` is called.
   This is the intended behavior but it causes the pointer to the last EAL 
argument to be lost.

**Implication**:
  If the last EAL argument in ``argv[]`` is generated by a malloc function, 
changing it will cause memory
  issues when freeing the argument.

**Resolution/Workaround**:
   An application should not consider the value in ``argv[]`` as unchanged.

**Affected Environment/Platform**:
   All.

**Driver/Module**:
   Environment Abstraction Layer (EAL).



[dpdk-dev] [PATCH] cfgfile: fix integer overflow

2016-05-16 Thread Mcnamara, John


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Kobylinski, MichalX
> Sent: Monday, May 16, 2016 1:51 PM
> To: Thomas Monjalon 
> Cc: dev at dpdk.org; Dumitrescu, Cristian 
> Subject: Re: [dpdk-dev] [PATCH] cfgfile: fix integer overflow
> ...
> Coverity show that there is overflowed value.
> But the second parameter will never be greater than 254 (its range is 0 -
> 254).
> I used cast this parameter to unsigned in order that resolved bug reported
> by static analysis.

Hi,

If the error cannot happen in a real application then you can mark the defect
as a false positive in Coverity. Add some of the information provided above as
to why it is a false positive to the comments section of the defect or add a 
link to the patchwork discussion.

John






[dpdk-dev] [PATCH] cfgfile: fix integer overflow

2016-05-16 Thread Kobylinski, MichalX


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, May 16, 2016 12:06 PM
> To: Kobylinski, MichalX 
> Cc: dev at dpdk.org; Dumitrescu, Cristian 
> Subject: Re: [dpdk-dev] [PATCH] cfgfile: fix integer overflow
> Importance: High
> 
> 2016-04-28 11:09, Dumitrescu, Cristian:
> > From: Kobylinski, MichalX
> > > Fix issue reported by Coverity.
> > >
> > > Coverity ID 13289: Integer overflowed argument: The argument will be
> > > too small or even negative, likely resulting in unexpected behavior
> > > (for example, under-allocation in a memory allocation function).
> > > In rte_cfgfile_load: An integer overflow occurs, with the overflowed
> > > value used as an argument to a function
> > >
> > > Fixes: eaafbad419bf ("cfgfile: library to interpret config files")
> > >
> > > Signed-off-by: Michal Kobylinski 
> >
> > I don't understand the root issue here, can you please explain?
> >
> > It looks to me that "end" is always going to point to a location bigger or
> equal to [1]. So the second parameter of _strip function is always
> going to be a positive number (0 included).
> 
> Michal, any answer please?

Hi Thomas, Cristian

Coverity show that there is overflowed value.
But the second parameter will never be greater than 254 (its range is 0 - 254).
I used cast this parameter to unsigned in order that resolved bug reported by 
static analysis.


[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues

2016-05-16 Thread Pattan, Reshma


> -Original Message-
> From: Simon K?gstr?m [mailto:simon.kagstrom at netinsight.net]
> Sent: Monday, May 16, 2016 11:33 AM
> To: Pattan, Reshma ; dev at dpdk.org;
> thomas.monjalon at 6wind.com
> Subject: Re: [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero
> RX/TX queues
> 
> On 2016-05-16 12:24, Pattan, Reshma wrote:
> >> diff --git a/lib/librte_ether/rte_ethdev.c
> >> b/lib/librte_ether/rte_ethdev.c index
> >> a31018e..5481d45 100644
> >> --- a/lib/librte_ether/rte_ethdev.c
> >> +++ b/lib/librte_ether/rte_ethdev.c
> >> @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t
> >> nb_rx_q, uint16_t nb_tx_q,
> >> */
> >>(*dev->dev_ops->dev_infos_get)(dev, _info);
> >>
> >> -  if (nb_rx_q == 0 && nb_tx_q == 0) {
> >> -  RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx
> >> queue cannot be 0\n", port_id);
> >> -  return -EINVAL;
> >> -  }
> >
> > This was added to allow devices,  at least with one direction (RX/TX)
> supported. As, devices with both directions disabled doesn't make  sense 
> right?
> 
> Well, not for running them, no. But this is a part of the shutdown procedure
> between tests (I should have been more clear I guess).
> 

Yes I understood this. But I am not sure if you can use 
rte_eth_dev_configure(port, 0, 0) to free the resources.
Can you check if you can use rte_eth_dev_rx_queue_stop/ 
rte_eth_dev_tx_queue_stop to achieve the same, because they do take care of
releasing mbufs, but doesn't free the queue's sw-ring and queue.

Thanks,
Reshma


[dpdk-dev] [PATCH] power: fix argument cannot be negative

2016-05-16 Thread Mcnamara, John


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Monday, May 16, 2016 1:17 PM
> To: Mrzyglod, DanielX T 
> Cc: dev at dpdk.org; Carew, Alan 
> Subject: Re: [dpdk-dev] [PATCH] power: fix argument cannot be negative
> 
> The next statement is probably a useless copy paste of the coverity
> report.
> 
> > In send_msg: Negative value used as argument to a function expecting a
> > positive value (for example, size of buffer or allocation)


A question on this point. Is it just that the Coverity message is useless
in this case or in general? For other error/warning fixes we include the
message in the commit.

John


[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues

2016-05-16 Thread Simon Kågström
On 2016-05-16 12:24, Pattan, Reshma wrote:
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c 
>> index
>> a31018e..5481d45 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t
>> nb_rx_q, uint16_t nb_tx_q,
>>   */
>>  (*dev->dev_ops->dev_infos_get)(dev, _info);
>>
>> -if (nb_rx_q == 0 && nb_tx_q == 0) {
>> -RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx
>> queue cannot be 0\n", port_id);
>> -return -EINVAL;
>> -}
> 
> This was added to allow devices,  at least with one direction (RX/TX)  
> supported. As, devices with both directions disabled doesn't make  sense 
> right?

Well, not for running them, no. But this is a part of the shutdown
procedure between tests (I should have been more clear I guess).

As far as I can see in the code, rte_eth_dev_configure() is the only
point which actually calls {rx,tx}_queue_release(), so without this
call, we can't get the memory pool full again.


So basically, our test suite looks like

  rte_eth_dev_configure(port, 32, 32); // For example
  
  rte_eth_dev_configure(port, 0, 0);
  Check that the mempool is full again

  rte_eth_dev_configure(port, 32, 32);
  
  rte_eth_dev_configure(port, 0, 0);
  Check that the mempool is full again
  ...

And without this fix, the mempool check fails since a few of the buffers
are tied up in the RX descriptor ring of the PMD.

// Simon


[dpdk-dev] possible kni bug and proposed fix

2016-05-16 Thread Ferruh Yigit
On 5/15/2016 5:48 AM, ALeX Wang wrote:
> Hi,
> 
> When using the kni module to test my application inside
> debian (virtualbox) VM (kernel version 4.4), I get the
> 
> "KNI: Out of memory"
> 
> from syslog every time I `tcpreply` packets through
> the kni interface.
> 
> After checking source code, I saw that when I call
> 'rte_kni_rx_burst()', no matter how many packets
> are actually retrieved, we always call 'kni_allocate_mbufs()'
> and try allocate 'MAX_MBUF_BURST_NUM' more
> mbufs...  I fix the issue via using this patch below,
> 
> Could you confirm if this is an actual bug?
> 

Hi Alex,

I don't think this is a bug.

kni_allocate_mbufs() will allocate MAX_MBUF_BURST_NUM mbufs as you
mentioned. And will fill alloc_q with these mubfs _until it gets full_.
And if the alloc_q is full and there are remaining mbufs, they will be
freed. So for some cases this isn't the most optimized way, but there is
no defect.

Since you are getting "KNI: Out of memory", somewhere else can be
missing freeing mbufs.

mbufs freeing done in rte_kni_tx_burst(), I can guess two cases that can
cause problem:
a) not calling rte_kni_tx_burst() frequent, so that all free mbufs consumed.
b) calling rte_kni_tx_burst() with number of mbufs bigger than
MAX_MBUF_BURST_NUM, because this function frees at most
MAX_MBUF_BURST_NUM of mbufs, but if you are calling calling
rte_kni_tx_burst() with bigger numbers, this will cause mbufs to stuck
in free_q


Regards,
ferruh




[dpdk-dev] [PATCH] cfgfile: fix integer overflow

2016-05-16 Thread Thomas Monjalon
2016-04-28 11:09, Dumitrescu, Cristian:
> From: Kobylinski, MichalX
> > Fix issue reported by Coverity.
> > 
> > Coverity ID 13289: Integer overflowed argument: The argument will be too
> > small or even negative, likely resulting in unexpected behavior (for
> > example, under-allocation in a memory allocation function).
> > In rte_cfgfile_load: An integer overflow occurs, with the overflowed
> > value used as an argument to a function
> > 
> > Fixes: eaafbad419bf ("cfgfile: library to interpret config files")
> > 
> > Signed-off-by: Michal Kobylinski 
> 
> I don't understand the root issue here, can you please explain?
> 
> It looks to me that "end" is always going to point to a location bigger or 
> equal to [1]. So the second parameter of _strip function is always 
> going to be a positive number (0 included).

Michal, any answer please?


[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching

2016-05-16 Thread Hiroyuki MIKITA
Hi Konstantin,

Now, the attach operation increases refcnt, but the detach does not decrease it.
I think both operations should affect refcnt or not.
Which design is intended?

In "6.7. Direct and Indirect Buffers" of Programmer's Guide,
it is mentioned that "...whenever an indirect buffer is attached to
the direct buffer,
the reference counter on the direct buffer is incremented.
Similarly, whenever the indirect buffer is detached,
the reference counter on the direct buffer is decremented."

Regards,
Hiroyuki

2016-05-16 9:05 GMT+09:00 Ananyev, Konstantin :
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroyuki Mikita
>> Sent: Sunday, May 15, 2016 4:51 PM
>> To: olivier.matz at 6wind.com
>> Cc: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
>>
>> The rte_pktmbuf_detach() function should decrease refcnt on a direct
>> buffer.
>
> Hmm, why is that?
> What's wrong with the current approach?
> Konstantin
>
>>
>> Signed-off-by: Hiroyuki Mikita 
>> ---
>>  lib/librte_mbuf/rte_mbuf.h | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 529debb..3b117ca 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
>> *mi, struct rte_mbuf *m)
>>   */
>>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>>  {
>> + struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>>   struct rte_mempool *mp = m->pool;
>>   uint32_t mbuf_size, buf_len, priv_size;
>>
>> + rte_mbuf_refcnt_update(md, -1);
>>   priv_size = rte_pktmbuf_priv_size(mp);
>>   mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>>   buf_len = rte_pktmbuf_data_room_size(mp);
>> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>>   if (RTE_MBUF_INDIRECT(m)) {
>>   struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>>   rte_pktmbuf_detach(m);
>> - if (rte_mbuf_refcnt_update(md, -1) == 0)
>> + if (rte_mbuf_refcnt_read(md) == 0)
>>   __rte_mbuf_raw_free(md);
>>   }
>>   return m;
>> --
>> 1.9.1
>


[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues

2016-05-16 Thread Simon Kagstrom
This allows releasing RX/TX queue memory.
---
We're using DPDK 16.04 and have a test suite which performs a sequence
of separate tests of the type

   allocate mempool
   rte_eth_dev_configure(port, n_rxq, n_txq, ...)
   setup rx/tx queues
   rte_eth_dev_start(port)

   

   stop rx/tx queues
   rte_eth_dev_stop(port)

-> rte_eth_dev_configure(port, 0, 0, ...)

   check that there are no leaks from the mempool

The crucial point is the marked line above. This is done so that the
rx_queue_release/tx_queue_release callbacks in the PMD is called, so
that mbufs allocated by the driver is released.

Without this patch, this explicitly isn't allowed. Is there a particular
reason why it shouldn't? It was introduced in

  d505ba80a165a9735f3d9d3c6ab68a7bd85f271b

  "ethdev: support unidirectional configuration"

 lib/librte_ether/rte_ethdev.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a31018e..5481d45 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
 */
(*dev->dev_ops->dev_infos_get)(dev, _info);

-   if (nb_rx_q == 0 && nb_tx_q == 0) {
-   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx queue 
cannot be 0\n", port_id);
-   return -EINVAL;
-   }
-
if (nb_rx_q > dev_info.max_rx_queues) {
RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_rx_queues=%d > %d\n",
port_id, nb_rx_q, dev_info.max_rx_queues);
-- 
1.9.1



[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages

2016-05-16 Thread Thomas Monjalon
2016-05-13 15:48, Wiles, Keith:
> I create this new tool to combine some information and use /sys/devices 
> instead. What I was hoping was some of you could try this script and see if 
> it works correctly. Also I was hope to find out if this script is useful and 
> what other features you would like in this type of tool.

What is the intent of this script? Is it to be used for bug report?
There already have some tools to display system informations. Why adding
one more?
Examples of useful tools: hwloc/lstopo, lspci, hugeadm.


[dpdk-dev] [PATCH] rte mempool: division or modulo by zero

2016-05-16 Thread Olivier Matz
Hi Slawomir,

On 05/12/2016 02:46 PM, Slawomir Mrozowicz wrote:
> Fix issue reported by Coverity.
> 
> Coverity ID 13243: Division or modulo by zero
> In function call rte_mempool_xmem_size, division by expression total_size
> which may be zero has undefined behavior.
> 
> Fixes: 148f963fb532 ("xen: core library changes")
> 
> Signed-off-by: Slawomir Mrozowicz 
> ---
>  lib/librte_mempool/rte_mempool.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c 
> b/lib/librte_mempool/rte_mempool.c
> index f8781e1..01668c1 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -327,15 +327,19 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t 
> flags,
>  size_t
>  rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
>  {
> - size_t n, pg_num, pg_sz, sz;
> + size_t n, pg_num, pg_sz;
> + size_t sz = 0;
>  
> - pg_sz = (size_t)1 << pg_shift;
> + if (elt_sz > 0) {
> + pg_sz = (size_t)1 << pg_shift;
> + n = pg_sz / elt_sz;
>  
> - if ((n = pg_sz / elt_sz) > 0) {
> - pg_num = (elt_num + n - 1) / n;
> - sz = pg_num << pg_shift;
> - } else {
> - sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
> + if (n > 0) {
> + pg_num = (elt_num + n - 1) / n;
> + sz = pg_num << pg_shift;
> + } else {
> + sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
> + }
>   }
>  
>   return sz;
> 

I think it would be clearer (either for the patch and the code) to avoid
an additional indent, and do something like that:

size_t
rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
uint32_t pg_shift)
{
if (elt_sz == 0)
return 0;

/* same code as before */

It will also facilitate the merge with
http://patchwork.dpdk.org/dev/patchwork/patch/12057/

Could you please submit a v2 with this logic?

Thanks,
Olivier


[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching

2016-05-16 Thread Thomas Monjalon
2016-05-16 11:46, Hiroyuki MIKITA:
> Now, the attach operation increases refcnt, but the detach does not decrease 
> it.
> I think both operations should affect refcnt or not.
> Which design is intended?
> 
> In "6.7. Direct and Indirect Buffers" of Programmer's Guide,
> it is mentioned that "...whenever an indirect buffer is attached to
> the direct buffer,
> the reference counter on the direct buffer is incremented.
> Similarly, whenever the indirect buffer is detached,
> the reference counter on the direct buffer is decremented."

The doc is the reference. The doxygen comment should explicit every
details of the behaviour.
And the unit tests must validate every parts of the behaviour.
Probably there is a bug which is not (yet) tested.
Please see the function testclone_testupdate_testdetach. Thanks



[dpdk-dev] [PATCH] doc: move rel_notes instructions as comments

2016-05-16 Thread Mcnamara, John
> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Friday, May 13, 2016 2:28 PM
> To: dev at dpdk.org
> Cc: Mcnamara, John 
> Subject: [PATCH] doc: move rel_notes instructions as comments
> 
> We don't want to have this instructions in the generated docs, so use
> comments. It's also less confusing for people adding entries in the
> documentation.
> 
> Signed-off-by: Olivier Matz 

Good idea.

Reviewed-by: John McNamara 
Acked-by: John McNamara 



[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching

2016-05-16 Thread Olivier Matz
Hi Hiroyuki,


On 05/15/2016 05:50 PM, Hiroyuki Mikita wrote:
> The rte_pktmbuf_detach() function should decrease refcnt on a direct
> buffer.
> 
> Signed-off-by: Hiroyuki Mikita 
> ---
>  lib/librte_mbuf/rte_mbuf.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 529debb..3b117ca 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
> *mi, struct rte_mbuf *m)
>   */
>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  {
> + struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>   struct rte_mempool *mp = m->pool;
>   uint32_t mbuf_size, buf_len, priv_size;
>  
> + rte_mbuf_refcnt_update(md, -1);
>   priv_size = rte_pktmbuf_priv_size(mp);
>   mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>   buf_len = rte_pktmbuf_data_room_size(mp);
> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>   if (RTE_MBUF_INDIRECT(m)) {
>   struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>   rte_pktmbuf_detach(m);
> - if (rte_mbuf_refcnt_update(md, -1) == 0)
> + if (rte_mbuf_refcnt_read(md) == 0)
>   __rte_mbuf_raw_free(md);
>   }
>   return m;
> 

Thanks for submitting this patch. I agree that rte_pktmbuf_attach()
and rte_pktmbuf_detach() are not symmetrical, but I think your patch
could trigger some race conditions.

Example:

- init: m, c1 and c2 are direct mbuf
- rte_pktmbuf_attach(c1, m);  # c1 becomes a clone of m
- rte_pktmbuf_attach(c2, m);  # c2 becomes another clone of m
- rte_pktmbuf_free(m);
- after that, we have:
  - m is a direct mbuf with refcnt = 2
  - c1 is an indirect mbuf pointing to data of m
  - c2 is an indirect mbuf pointing to data of m
- if we call rte_pktmbuf_free(c1) and rte_pktmbuf_free(c2) on 2
  different cores at the same time, m can be freed twice because
  (rte_mbuf_refcnt_read(md) == 0) can be true on both cores.

I think the proper way of doing would be to have rte_pktmbuf_detach()
returning the value of rte_mbuf_refcnt_update(md, -1), ensuring that
only one core will call _rte_mbuf_raw_free().

In the unit tests, in test_attach_from_different_pool(), the mbuf m
is never freed due to this behavior. That shows the current api is
a bit misleading. I think it should also be fixed in the patch.

Another issue is that it will break the API.
To avoid issues in applications relying on the current behavior of
rte_pktmbuf_detach(), I'd say we should keep the function as-is and
mark it as deprecated. Then, introduce a new function
rte_pktmbuf_detach2() (or any better name :) ) that changes the
behavior to what you suggest. An entry in the release note would
also be helpful.

The other option is to let things as-is and just document the behavior
of rte_pktmbuf_detach(), explicitly saying that it is not symmetrical
with the attach. But I'd prefer the first option.


Thoughts ?

Regards,
Olivier


[dpdk-dev] [PATCH v4 6/8] virtio-user: add new virtual pci driver for virtio

2016-05-16 Thread Yuanhan Liu
On Mon, May 16, 2016 at 01:48:01AM +, Tan, Jianfeng wrote:
> > On Fri, May 13, 2016 at 09:54:33AM +0800, Tan, Jianfeng wrote:
> > >
> > > So, I'd suggest something like following:
> > >
> > > if (is_vdev(..)) {
> > >
> > >
> > > The blocker issue of your suggestion is that we have no such condition.
> > >
> > > Previously, I use dev_type, but as David's comment said:
> > 
> > That's not the only option. There should be others, for example,
> > checking the existence of virtio_user_device. Or even, you could
> > add a new flag inside virtio hw while initiating your vdev.
> > 
> > > May I ask how many more such handling are needed, excluding the tx
> > queue
> > > header desc setup? And as stated, in generic, yes, we should try that.
> > >
> > >
> > > Those which need special handling:
> > > (1) vq->vq_ring_mem: it is set but never used, so it's out of question.
> > > (2) vq->virtio_net_hdr_mem and vring_hdr_desc_init
> > 
> > vring_hdr_desc_init is common.
> > 
> > > (3) vq->offset
> > >
> > > Just (2) and (3) so far. And the question is quite clear: where to put 
> > > these
> > > two special handling.
> > 
> > Apparently, you can't put it into the queue_setup(). And I still think
> > my proposal works great here.
> 
> OK, since it's indeed inappropriate to put these special handlings inside 
> queue_setup() from semantic perspective, I'll add them according to if 
> hw->vdev_private == NULL in the driver.

I'm thinking maybe we could rename "vdev_private" to "virtio_user_dev"
(or something like that), to make it explicit that it's for virtio user
device. I mean, there should be no other vdevs after all. OTOH, using
"hw->vdev_private != NULL" to say it's a virtio-user device is a bit
weird; it doesn't even make too much sense.

--yliu


[dpdk-dev] [RFC PATCH v2 1/3] rte: change xstats to use integer keys

2016-05-16 Thread Tahhan, Maryam
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Friday, May 6, 2016 12:11 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH v2 1/3] rte: change xstats to use integer
> keys
> 
> Signed-off-by: Remy Horton 
> ---
Acked-by: Maryam Tahhan 


[dpdk-dev] [RFC PATCH v2 0/3] Remove string operations from xstats

2016-05-16 Thread Tahhan, Maryam
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Friday, May 6, 2016 12:11 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH v2 0/3] Remove string operations from
> xstats
> 
> The current extended ethernet statistics fetching involve doing several
> string operations, which causes performance issues if there are lots of
> statistics and/or network interfaces. This RFC patchset changes the API
> for xstats to use integer identifiers instead of strings and implements this
> new API for the ixgbe driver. Others drivers to follow.
> 
> --
> 
> v2 changes:
> * Fetching xstats count now seperate API function
> * Added #define constants for some magic numbers
> * Fixed bug with virtual function count fetching
> * For non-xstats-supporting drivers, queue stats returned
> * Some refactoring/cleanups
> * Removed index assumption from example
> 
> 
> Remy Horton (3):
>   rte: change xstats to use integer keys
>   drivers/net/ixgbe: change xstats to use integer keys
>   examples/ethtool: add xstats display command
> 
>  drivers/net/ixgbe/ixgbe_ethdev.c  |  98
> -
>  examples/ethtool/ethtool-app/ethapp.c |  57 +++
>  lib/librte_ether/rte_ethdev.c | 100
> +-
>  lib/librte_ether/rte_ethdev.h |  55 +++
>  4 files changed, 284 insertions(+), 26 deletions(-)
> 
> --
> 2.5.5

Looks Great overall. Is there a need to update prog_guide/poll_mode_drv.rst 
with the new mods?

BR
Maryam


[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues

2016-05-16 Thread Pattan, Reshma


> -Original Message-
> From: Simon Kagstrom [mailto:simon.kagstrom at netinsight.net]
> Sent: Monday, May 16, 2016 10:34 AM
> To: dev at dpdk.org; thomas.monjalon at 6wind.com; Pattan, Reshma
> 
> Subject: [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX
> queues
> 
> This allows releasing RX/TX queue memory.
> ---
> 
> Without this patch, this explicitly isn't allowed. Is there a particular 
> reason why it
> shouldn't? It was introduced in
> 
>   d505ba80a165a9735f3d9d3c6ab68a7bd85f271b
> 
>   "ethdev: support unidirectional configuration"
> 
>  lib/librte_ether/rte_ethdev.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c 
> index
> a31018e..5481d45 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t
> nb_rx_q, uint16_t nb_tx_q,
>*/
>   (*dev->dev_ops->dev_infos_get)(dev, _info);
> 
> - if (nb_rx_q == 0 && nb_tx_q == 0) {
> - RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx
> queue cannot be 0\n", port_id);
> - return -EINVAL;
> - }
> -

This was added to allow devices,  at least with one direction (RX/TX)  
supported. As, devices with both directions disabled doesn't make  sense right?

Thanks,
Reshma


[dpdk-dev] [PATCH] Add rte_mempool_free

2016-05-16 Thread Ben Walker
There is no inverse of rte_mempool_create, so this patch adds one.
The typical usage of rte_mempool_create is to create a pool at
initialization time and only to free it upon program exit, so an
rte_mempool_free function at first seems to be of little value.
However, it is very useful as a sanity check for a clean shutdown
when used in conjunction with tools like AddressSanitizer. Further,
the call itself verifies that all elements have been returned to
the pool or it fails.

Signed-off-by: Ben Walker 
---
 lib/librte_mempool/rte_dom0_mempool.c | 22 +++
 lib/librte_mempool/rte_mempool.c  | 70 +++
 lib/librte_mempool/rte_mempool.h  | 41 
 3 files changed, 133 insertions(+)

diff --git a/lib/librte_mempool/rte_dom0_mempool.c 
b/lib/librte_mempool/rte_dom0_mempool.c
index 0d6d750..edf2d58 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -131,3 +131,25 @@ rte_dom0_mempool_create(const char *name, unsigned 
elt_num, unsigned elt_size,

return mp;
 }
+
+/* free the mempool supporting Dom0 */
+int
+rte_dom0_mempool_free(struct rte_mempool *mp)
+{
+   const struct rte_memzone *mz;
+   char mz_name[RTE_MEMZONE_NAMESIZE];
+   int rc;
+
+   rc = rte_mempool_xmem_free(mp);
+   if (rc) {
+   return rc;
+   }
+
+   snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, mp->name);
+   mz = rte_memzone_lookup(mz_name);
+   if (mz) {
+   rte_memzone_free(mz);
+   }
+
+   return 0;
+}
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 70812d9..82f645e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -638,6 +638,76 @@ exit_unlock:
return NULL;
 }

+#ifndef RTE_LIBRTE_XEN_DOM0
+/* stub if DOM0 support not configured */
+int
+rte_dom0_mempool_free(struct rte_mempool *mp __rte_unused)
+{
+   rte_errno = EINVAL;
+   return -1;
+}
+#endif
+
+int
+rte_mempool_free(struct rte_mempool *mp)
+{
+   if (rte_xen_dom0_supported())
+   return rte_dom0_mempool_free(mp);
+   else
+   return rte_mempool_xmem_free(mp);
+}
+
+
+/* Free the memory pool */
+int
+rte_mempool_xmem_free(struct rte_mempool *mp)
+{
+   char mz_name[RTE_MEMZONE_NAMESIZE];
+   struct rte_mempool_list *mempool_list;
+   struct rte_tailq_entry *te = NULL;
+   const struct rte_memzone *mz;
+   unsigned count;
+
+   if (!mp) {
+   return 0;
+   }
+
+   count = rte_mempool_free_count(mp);
+   if (count != 0) {
+   /* All elements must be returned to the pool before free */
+   return count;
+   }
+
+   rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK);
+
+   /* Free the ring associated with this mempool */
+   if (mp->ring) {
+   rte_ring_free(mp->ring);
+   }
+
+   /* Remove the entry from the mempool list and free it. */
+   rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
+   TAILQ_FOREACH(te, mempool_list, next) {
+   if ((struct rte_mempool *)te->data == mp) {
+   TAILQ_REMOVE(mempool_list, te, next);
+   rte_free(te);
+   break;
+   }
+   }
+   rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+   snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+   mz = rte_memzone_lookup(mz_name);
+   if (mz) {
+   rte_memzone_free(mz);
+   }
+
+   rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+
+   return 0;
+}
+
 /* Return the number of entries in the mempool */
 unsigned
 rte_mempool_count(const struct rte_mempool *mp)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 9745bf0..26949c7 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -728,6 +728,47 @@ rte_dom0_mempool_create(const char *name, unsigned n, 
unsigned elt_size,
rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
int socket_id, unsigned flags);

+/**
+ * Free the memory pool created by rte_mempool_create
+ *
+ * All elements must be placed back in the pool prior to calling this function.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   0 on success. -1 on error  with rte_errno set appropriately.
+ * Possible rte_errno values include:
+ *- EINVAL - Invalid input value.
+ */
+int rte_mempool_free(struct rte_mempool *mp);
+
+/**
+ * Free the memory pool created by rte_mempool_xmem_create.
+ *
+ * All elements must be placed back in the pool prior to calling this function.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   0 on success. -1 on error  with rte_errno set appropriately.
+ * Possible 

[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching

2016-05-16 Thread Ananyev, Konstantin
Hi Hiroyuki,

> 
> Hi Konstantin,
> 
> Now, the attach operation increases refcnt, but the detach does not decrease 
> it.
> I think both operations should affect refcnt or not.
> Which design is intended?
> 
> In "6.7. Direct and Indirect Buffers" of Programmer's Guide,
> it is mentioned that "...whenever an indirect buffer is attached to
> the direct buffer,
> the reference counter on the direct buffer is incremented.
> Similarly, whenever the indirect buffer is detached,
> the reference counter on the direct buffer is decremented."

Well, right now the rte_pktmbuf_detach(struct rte_mbuf *m) just restores 
the fields of indirect mbufs to the default values, nothing more.
Actual freeing of the mbuf it was attached to is done in the 
__rte_pktmbuf_prefree_seg().
I suppose the name rte_pktmbuf_detach() is a bit confusing here,
might be, when created, it should be named rte_pktmbuf_restore() or so.
About proposed changes - it would introduce an extra unnecessary read of refcnt 
(as it is a volatile field).
If we'll decide to go that way, then I think rte_pktmbuf_detach() have to deal 
with freeing md.
Something like that:

static inline void 
rte_pktmbuf_detach(struct rte_mbuf *m)
{
 struct rte_mbuf *md = rte_mbuf_from_indirect(m);

  /* former rte_pktmbuf_detach */
  rte_pktmbuf_restore(m);
  if (rte_mbuf_refcnt_update(md, -1) == 0)
   __rte_mbuf_raw_free(md);
}

That might be a good thing in terms of API usability and clearness,
but would cause a change in public API behaviour, so I am not sure it is worth 
it.
Konstantin 

> 
> Regards,
> Hiroyuki
> 
> 2016-05-16 9:05 GMT+09:00 Ananyev, Konstantin  intel.com>:
> >
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroyuki Mikita
> >> Sent: Sunday, May 15, 2016 4:51 PM
> >> To: olivier.matz at 6wind.com
> >> Cc: dev at dpdk.org
> >> Subject: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
> >>
> >> The rte_pktmbuf_detach() function should decrease refcnt on a direct
> >> buffer.
> >
> > Hmm, why is that?
> > What's wrong with the current approach?
> > Konstantin
> >
> >>
> >> Signed-off-by: Hiroyuki Mikita 
> >> ---
> >>  lib/librte_mbuf/rte_mbuf.h | 4 +++-
> >>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >> index 529debb..3b117ca 100644
> >> --- a/lib/librte_mbuf/rte_mbuf.h
> >> +++ b/lib/librte_mbuf/rte_mbuf.h
> >> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct 
> >> rte_mbuf *mi, struct rte_mbuf *m)
> >>   */
> >>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
> >>  {
> >> + struct rte_mbuf *md = rte_mbuf_from_indirect(m);
> >>   struct rte_mempool *mp = m->pool;
> >>   uint32_t mbuf_size, buf_len, priv_size;
> >>
> >> + rte_mbuf_refcnt_update(md, -1);
> >>   priv_size = rte_pktmbuf_priv_size(mp);
> >>   mbuf_size = sizeof(struct rte_mbuf) + priv_size;
> >>   buf_len = rte_pktmbuf_data_room_size(mp);
> >> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >>   if (RTE_MBUF_INDIRECT(m)) {
> >>   struct rte_mbuf *md = rte_mbuf_from_indirect(m);
> >>   rte_pktmbuf_detach(m);
> >> - if (rte_mbuf_refcnt_update(md, -1) == 0)
> >> + if (rte_mbuf_refcnt_read(md) == 0)
> >>   __rte_mbuf_raw_free(md);
> >>   }
> >>   return m;
> >> --
> >> 1.9.1
> >


[dpdk-dev] possible kni bug and proposed fix

2016-05-16 Thread ALeX Wang
Hi Ferruh,

Thx for pointing out the 'fill alloc_q with these mubfs _until it gets
full_.',

I saw the size of all queues are 'KNI_FIFO_COUNT_MAX (1024)'...
The corresponding memory required is more than what I specify as
'socket_mem' (since i'm using VM)...

Also, in my use case, I only `tcpreplay` through the kni interface, and
my application only do rx and then free the 'mbufs'.  So there is no tx at
all.

So, in my case, I still think this is a bug/defect, or somewhere i still
misunderstand,

P.S. The description here seems to be inverted,
http://dpdk.org/doc/api/rte__kni_8h.html#a0cdd727cdc227d005fef22c0189f3dfe
'rte_kni_rx_burst' does the 'It handles allocating the mbufs for KNI
interface alloc queue.'

Thanks,
Alex Wang,

On 16 May 2016 at 04:20, Ferruh Yigit  wrote:

> On 5/15/2016 5:48 AM, ALeX Wang wrote:
> > Hi,
> >
> > When using the kni module to test my application inside
> > debian (virtualbox) VM (kernel version 4.4), I get the
> >
> > "KNI: Out of memory"
> >
> > from syslog every time I `tcpreply` packets through
> > the kni interface.
> >
> > After checking source code, I saw that when I call
> > 'rte_kni_rx_burst()', no matter how many packets
> > are actually retrieved, we always call 'kni_allocate_mbufs()'
> > and try allocate 'MAX_MBUF_BURST_NUM' more
> > mbufs...  I fix the issue via using this patch below,
> >
> > Could you confirm if this is an actual bug?
> >
>
> Hi Alex,
>
> I don't think this is a bug.
>
> kni_allocate_mbufs() will allocate MAX_MBUF_BURST_NUM mbufs as you
> mentioned. And will fill alloc_q with these mubfs _until it gets full_.
> And if the alloc_q is full and there are remaining mbufs, they will be
> freed. So for some cases this isn't the most optimized way, but there is
> no defect.
>
> Since you are getting "KNI: Out of memory", somewhere else can be
> missing freeing mbufs.
>
> mbufs freeing done in rte_kni_tx_burst(), I can guess two cases that can
> cause problem:
> a) not calling rte_kni_tx_burst() frequent, so that all free mbufs
> consumed.
> b) calling rte_kni_tx_burst() with number of mbufs bigger than
> MAX_MBUF_BURST_NUM, because this function frees at most
> MAX_MBUF_BURST_NUM of mbufs, but if you are calling calling
> rte_kni_tx_burst() with bigger numbers, this will cause mbufs to stuck
> in free_q
>
>
> Regards,
> ferruh
>
>
>


-- 
Alex Wang,
Open vSwitch developer


[dpdk-dev] [PATCH v4 6/8] virtio-user: add new virtual pci driver for virtio

2016-05-16 Thread Tan, Jianfeng
Hi Yuanhan,

> -Original Message-
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Friday, May 13, 2016 12:45 PM
> To: Tan, Jianfeng
> Cc: dev at dpdk.org; Xie, Huawei; rich.lane at bigswitch.com; mst at 
> redhat.com;
> nakajima.yoshihiro at lab.ntt.co.jp; p.fedin at samsung.com;
> ann.zhuangyanying at huawei.com; mukawa at igel.co.jp;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v4 6/8] virtio-user: add new virtual pci driver for virtio
> 
> On Fri, May 13, 2016 at 09:54:33AM +0800, Tan, Jianfeng wrote:
> >
> > So, I'd suggest something like following:
> >
> > if (is_vdev(..)) {
> >
> >
> > The blocker issue of your suggestion is that we have no such condition.
> >
> > Previously, I use dev_type, but as David's comment said:
> 
> That's not the only option. There should be others, for example,
> checking the existence of virtio_user_device. Or even, you could
> add a new flag inside virtio hw while initiating your vdev.
> 
> > May I ask how many more such handling are needed, excluding the tx
> queue
> > header desc setup? And as stated, in generic, yes, we should try that.
> >
> >
> > Those which need special handling:
> > (1) vq->vq_ring_mem: it is set but never used, so it's out of question.
> > (2) vq->virtio_net_hdr_mem and vring_hdr_desc_init
> 
> vring_hdr_desc_init is common.
> 
> > (3) vq->offset
> >
> > Just (2) and (3) so far. And the question is quite clear: where to put these
> > two special handling.
> 
> Apparently, you can't put it into the queue_setup(). And I still think
> my proposal works great here.

OK, since it's indeed inappropriate to put these special handlings inside 
queue_setup() from semantic perspective, I'll add them according to if 
hw->vdev_private == NULL in the driver. Thanks for suggestion.

Thanks,
Jianfeng

> 
>   --yliu


[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching

2016-05-16 Thread Hiroyuki Mikita
The rte_pktmbuf_detach() function should decrease refcnt on a direct
buffer.

Signed-off-by: Hiroyuki Mikita 
---
 lib/librte_mbuf/rte_mbuf.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 529debb..3b117ca 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
*mi, struct rte_mbuf *m)
  */
 static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 {
+   struct rte_mbuf *md = rte_mbuf_from_indirect(m);
struct rte_mempool *mp = m->pool;
uint32_t mbuf_size, buf_len, priv_size;

+   rte_mbuf_refcnt_update(md, -1);
priv_size = rte_pktmbuf_priv_size(mp);
mbuf_size = sizeof(struct rte_mbuf) + priv_size;
buf_len = rte_pktmbuf_data_room_size(mp);
@@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
if (RTE_MBUF_INDIRECT(m)) {
struct rte_mbuf *md = rte_mbuf_from_indirect(m);
rte_pktmbuf_detach(m);
-   if (rte_mbuf_refcnt_update(md, -1) == 0)
+   if (rte_mbuf_refcnt_read(md) == 0)
__rte_mbuf_raw_free(md);
}
return m;
-- 
1.9.1



[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching

2016-05-16 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroyuki Mikita
> Sent: Sunday, May 15, 2016 4:51 PM
> To: olivier.matz at 6wind.com
> Cc: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
> 
> The rte_pktmbuf_detach() function should decrease refcnt on a direct
> buffer.

Hmm, why is that?
What's wrong with the current approach?
Konstantin

> 
> Signed-off-by: Hiroyuki Mikita 
> ---
>  lib/librte_mbuf/rte_mbuf.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 529debb..3b117ca 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
> *mi, struct rte_mbuf *m)
>   */
>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  {
> + struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>   struct rte_mempool *mp = m->pool;
>   uint32_t mbuf_size, buf_len, priv_size;
> 
> + rte_mbuf_refcnt_update(md, -1);
>   priv_size = rte_pktmbuf_priv_size(mp);
>   mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>   buf_len = rte_pktmbuf_data_room_size(mp);
> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>   if (RTE_MBUF_INDIRECT(m)) {
>   struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>   rte_pktmbuf_detach(m);
> - if (rte_mbuf_refcnt_update(md, -1) == 0)
> + if (rte_mbuf_refcnt_read(md) == 0)
>   __rte_mbuf_raw_free(md);
>   }
>   return m;
> --
> 1.9.1