[dpdk-dev] [dpdk-announce] important design choices - statistics - ABI

2015-06-18 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Morten Brørup
> Sent: Wednesday, June 17, 2015 10:54 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [dpdk-announce] important design choices -
> statistics - ABI
> 
> Dear Thomas,
> 
> I don't have time to follow the DPDK Developers mailing list, but since you
> call for feedback, I would like to share my thoughts regarding these design
> choices.
> 
> 
> Regarding the statistics discussion:
> 
> 1. The suggested solution assumes that, when statistics are disabled, the cost
> of allocating and maintaining zero-value statistics is negligible. If
> statistics counters are only available through accessor functions, this is
> probably true.
> 
> However, if statistics counters are directly accessible, e.g. as elements in
> the fast path data structures of a library, maintaining zero-value statistics
> may have a memory and/or performance impact.
> 

Counters are only accessible through API functions.

> Since the compile time flag
> CONFIG_RTE__STATS_COLLECT already tells the
> application if the statistics are present or not, the application should
> simply use this flag to determine if statistics are accessible or not.
> 
> 2. The suggested solution with only one single flag per library prevents
> implementing statistics with varying granularity for different purposes. E.g.
> a library could have one set of statistics counters for ordinary SNMP purposes,
> and another set of statistics counters for debugging/optimization purposes.
> 
> Multiple flags per library should be possible. A hierarchy of flags per
> library is probably not required.
> 

Morten, thank you for your input. It would be good if you could add your 
contribution to the guidelines documentation patch by replying to the 
thread that Thomas indicated: 
http://dpdk.org/ml/archives/dev/2015-June/019461.html.

Our initial stats patch submission had a much finer granularity of stats 
configuration: per object type instead of per library, but a lot of people on 
this mailing list are against this, so we are now looking for one configuration 
flag per library.

> 
> Regarding the PHY speed ABI:
> 
> 1. The Ethernet PHY ABI for speed, duplex, etc. should be common
> throughout the entire DPDK. It might be confusing if some
> structures/functions use a bitmask to indicate PHY
> speed/duplex/personality/etc. and other structures/functions use a
> combination of an unsigned integer, duplex flag, personality enumeration
> etc. (By personality enumeration, I am referring to PHYs with multiple
> electrical interfaces. E.g. a dual personality PHY might have both an RJ45
> copper interface and an SFP module interface, of which only one can be
> active at any time.)
> 
> 2. The auto-negotiation standard allows the PHY to announce any subset of
> its capabilities to its link partner. E.g. a standard
> 10/100/1000 Ethernet PHY (which can handle both 10 and 100 Mbit/s in both
> half and full duplex and 1 Gbit/s full duplex) can be configured to announce
> only 10 Mbit/s half duplex and 100 Mbit/s full duplex capabilities to its
> link partner.
> (Of course, more useful combinations are normally announced, but the
> purpose of the example is to show that any combination is possible.)
> 
> The ABI for auto-negotiation should include options to select the list of
> capabilities to announce to the link partner. The Linux PHY ABI only allows
> forcing a selected speed and duplex (thereby disabling auto-negotiation) or
> enabling auto-negotiation (thereby announcing all possible speeds and
> duplex combinations the PHY is capable of). Don't make the same mistake in
> DPDK.
> 
> PS: While working for Vitesse Semiconductors (an Ethernet chip company) a
> long time ago, I actually wrote the API for their line of Ethernet PHYs. So I
> have hands on experience in this area.
> 
> 
> Regarding the discussion about backwards/forwards compatibility in the ABI:
> 
> 1. Sometimes, ABI breakage is required. That is the cost the users pay for
> getting the benefits from upgrading to the latest and greatest version of any
> library. The current solution of requiring acknowledgement from several
> qualified developers is fine - these developers will consider the cost/benefit
> on behalf of all the DPDK users and make a qualified decision.
> 
> 2. It is my general experience that documentation is not always updated to
> reflect the fine details of the source code, and this also applies to release
> notes. For open source software, the primary point of documentation is
> usually the source code itself.
> 
> 2a. It should be clearly visible directly in the DPDK source code (including
> makefiles etc.) which ABI (i.e. functions, macros, type definitions etc.) is
> the current, the deprecated, and the future.
> 
> 2b. When a developer migrates a project using DPDK from a previous version
> of the DPDK, it should be easy for the 

[dpdk-dev] [PATCH v2 0/6] cfgfile: config file parsing extension

2015-06-17 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Wednesday, June 17, 2015 3:49 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/6] cfgfile: config file parsing extension
> 
> Added new implementation of section parsing in config file. Refactored
> existing code by splitting it into smaller functions. Changed section
> allocation scheme and added new features - variable length entry value
> and line continue character '\'.
> 

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH] doc: fix doxygen warnings for QoS API

2015-06-17 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Wednesday, June 17, 2015 3:37 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: fix doxygen warnings for QoS API
> 
> This patch fixes doxygen warnings when generating documentation
> for qos_meter and qos_sched.
> 
> Signed-off-by: Michal Jastrzebski 
> ---
>  lib/librte_meter/rte_meter.h  |  8 
>  lib/librte_sched/rte_bitmap.h | 10 +-
>  lib/librte_sched/rte_red.h| 44 +---
> ---
>  lib/librte_sched/rte_sched.h  |  2 ++
>  4 files changed, 33 insertions(+), 31 deletions(-)
> 

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH] doc: guidelines for library statistics

2015-06-15 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Thursday, June 11, 2015 1:05 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: guidelines for library statistics
> 
> Hi Cristian,
> 
> Thanks for trying to make a policy clearer.
> We need to make a decision in the coming week.
> Below are comments on the style and content.
> 
> 2015-06-08 15:50, Cristian Dumitrescu:
> >  doc/guides/guidelines/statistics.rst | 42
> 
> 
> Maybe we should have a more general file like design.rst.

I am not sure I correctly understood your suggestion. Do you want the section 
on statistics to be called design.rst? Do you want it to be part of 
doc/guides/guidelines or create a brand new document?
To me, the initial idea of doc/guidelines/statistics.rst makes more sense, but 
I am fine either way.

> In order to have a lot of readers of such guidelines, they must be concise.

I reordered the sections to provide the guidelines first and the motivation 
afterwards. I think it is good to keep the motivation together with the 
guidelines. Do you suggest to remove the motivation from this document?

> 
> Please wrap lines to be not too long and/or split lines after the end of a
> sentence.

Done in next version.

> 
> > +Library Statistics
> > +==
> > +
> > +Description
> > +---
> > +
> > +This document describes the guidelines for DPDK library-level statistics
> counter support. This includes guidelines for turning library statistics on 
> and
> off, requirements for preventing ABI changes when library statistics are
> turned on and off, etc.
> 
> Should we consider that driver stats and lib stats are different in DPDK? Why?

I did update the document to make it clear these guidelines are applicable for 
the case when the stats counters are maintained by the library itself.
I think the primary difference is whether the stats counters are implemented by 
the HW (e.g. NIC) or in SW by the CPU.
* In the first case, the CPU does not spend cycles maintaining the counters. 
System resources might be consumed by the NIC for maintaining the counters 
(e.g. memory bandwidth), so sometimes the NICs provide mechanisms to 
enable/disable stats, which is done as part of init code rather than build time.
* In the second case, the CPU maintains the counters, which comes at the usual 
cost of cycles and system resources. These guidelines focus on this case.

> 
> > +Motivation to allow the application to turn library statistics on and off
> > +-
> > +
> > +It is highly recommended that each library provides statistics counters to
> allow the application to monitor the library-level run-time events. Typical
> counters are: number of packets received/dropped/transmitted, number of
> buffers allocated/freed, number of occurrences for specific events, etc.
> > +
> > +Since the resources consumed for library-level statistics counter 
> > collection
> have to be spent out of the application budget and the counters collected by
> some libraries might not be relevant for the current application, in order to
> avoid any unwanted waste of resources and/or performance for the
> application, the application is to decide at build time whether the collection
> of library-level statistics counters should be turned on or off for each 
> library
> individually.
> 
> It would be good to have acknowledgements or other opinions on this.
> Some of them were expressed in other threads. Please comment here.
> 
> > +Library-level statistics counters can be relevant or not for specific
> applications:
> > +* For application A, counters maintained by library X are always relevant
> and the application needs to use them to implement certain features, as
> traffic accounting, logging, application-level statistics, etc. In this case, 
> the
> application requires that collection of statistics counters for library X is 
> always
> turned on;
> > +* For application B, counters maintained by library X are only useful 
> > during
> the application debug stage and not relevant once debug phase is over. In
> this case, the application may decide to turn on the collection of library X
> statistics counters during the debug phase and later on turn them off;
> 
> Users of binary package have not this choice.

There is a section in the document about allowing stats to be turned on/off 
without ABI impact.

> 
> > +* For application C, counters maintained by library X are not relevant at 
> > all.
> It might be that the application maintains its own set of statistics

[dpdk-dev] [PATCH v4 1/1] pipeline: add statistics for librte_pipeline ports and tables

2015-06-08 Thread Dumitrescu, Cristian
> This patch adds statistics collection for librte_pipeline.
> These statistics are disabled by default at build time.
> 

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v4 00/10] table: added table statistics

2015-06-08 Thread Dumitrescu, Cristian
> Added statistics for every type of table. By default all table statistics
> are disabled, user must activate them in config file.
> 
> Changes in v4:
>   - created single config option for all table statistics

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v4 00/13] port: added port statistics

2015-06-08 Thread Dumitrescu, Cristian
> Added statistics for every type of port. By default all port statistics
> are disabled, user must activate them in config file.
> 
> 
> Changes in v4:
>   - created single config option for all port statistics
> 


Acked-by: Cristian Dumitrescu 



[dpdk-dev] 4 Traffic classes per Pipe limitation

2015-06-07 Thread Dumitrescu, Cristian
No problem, Mike. Enjoy!

From: Michael Sardo [mailto:m...@bandex.io]
Sent: Sunday, June 7, 2015 12:24 AM
To: Dumitrescu, Cristian
Cc: Yeddula, Avinash; dev at dpdk.org
Subject: Re: [dpdk-dev] 4 Traffic classes per Pipe limitation

Oops, I should have searched a bit more before asking. I see that they've 
already been made available:
http://dpdk.org/ml/archives/dev/attachments/20150423/17a4d8de/attachment-0001.pdf

Thanks.

-Mike

On Sat, Jun 6, 2015 at 5:05 PM, Michael Sardo <mike at bandex.io> wrote:
Hello Cristian,

Are the slides shown in that video available? They're very helpful.

-Mike

On Fri, Jun 5, 2015 at 4:50 PM, Dumitrescu, Cristian <cristian.dumitrescu at intel.com> wrote:
Hi Avinash,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yeddula, Avinash
> Sent: Friday, June 5, 2015 6:06 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] 4 Traffic classes per Pipe limitation
>
> Hi,
> This is related to the QOS scheduler functionality provided by dpdk.
>
> I see a limit on the number of traffic classes to be 4.  I'm exploring the
> available options to increase that limit to 8.

Yes, there are 4x traffic classes (scheduled in strict priority), but each
traffic class has 4x queues (scheduled using WFQ); for big weight ratios
between queues (e.g. 1:4 or 1:8, etc.), WFQ becomes very similar to strict
priority, a kind of strict priority without starvation. So the 16x queues per
pipe can be considered 16x sub-traffic-classes.

You might want to watch this video on DPDK QoS: https://youtu.be/_PPklkWGugs

>
> This is what I found when I researched on this topic.
> The limitation on the number of TCs (and pipes) comes from the number of
> bits available. Since the QoS code overloads the 32-bit RSS field in
> the mbuf, there aren't enough bits to allot. But then again, if you add lots
> of pipes or subports, the memory footprint gets huge.

It is not that simple. The number of 4x traffic classes is deeply built into
the implementation for performance reasons. Increasing the number of bits
allocated to traffic class in mbuf->sched would not help.

>
> Any more info or suggestions on increasing the limit to 8 ?

Yes, look at the 16x pipe queues as 16x (sub)traffic classes.
>
> Thanks
> -Avinash




[dpdk-dev] 4 Traffic classes per Pipe limitation

2015-06-05 Thread Dumitrescu, Cristian
Hi Avinash,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yeddula, Avinash
> Sent: Friday, June 5, 2015 6:06 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] 4 Traffic classes per Pipe limitation
> 
> Hi,
> This is related to the QOS scheduler functionality provided by dpdk.
> 
> I see a limit on the number of traffic classes to be 4.  I'm exploring the
> available options to increase that limit to 8.

Yes, there are 4x traffic classes (scheduled in strict priority), but each
traffic class has 4x queues (scheduled using WFQ); for big weight ratios
between queues (e.g. 1:4 or 1:8, etc.), WFQ becomes very similar to strict
priority, a kind of strict priority without starvation. So the 16x queues per
pipe can be considered 16x sub-traffic-classes.

You might want to watch this video on DPDK QoS: https://youtu.be/_PPklkWGugs 

> 
> This is what I found when I researched on this topic.
> The limitation on the number of TCs (and pipes) comes from the number of
> bits available. Since the QoS code overloads the 32-bit RSS field in
> the mbuf, there aren't enough bits to allot. But then again, if you add lots
> of pipes or subports, the memory footprint gets huge.

It is not that simple. The number of 4x traffic classes is deeply built into
the implementation for performance reasons. Increasing the number of bits
allocated to traffic class in mbuf->sched would not help.

> 
> Any more info or suggestions on increasing the limit to 8 ?

Yes, look at the 16x pipe queues as 16x (sub)traffic classes.
> 
> Thanks
> -Avinash


[dpdk-dev] [PATCH] lib: fix RTE_MBUF_METADATA macros

2015-06-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Daniel Mrzyglod
> Sent: Friday, June 5, 2015 3:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] lib: fix RTE_MBUF_METADATA macros
> 
> Fix RTE_MBUF_METADATA macros to allow for unaligned accesses to
> meta-data fields.
> Forcing aligned accesses is not really required, so this is removing an
> unneeded constraint.
> This issue was met during testing of the new version of the ip_pipeline
> application. There is no performance impact.
> This change has no ABI impact, as the previous code that uses aligned
> accesses continues to run without any issues.
> 
> Signed-off-by: Daniel Mrzyglod 


Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH] qos_sched: example modification to use librte_cfgfile

2015-06-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Friday, June 5, 2015 10:28 AM
> To: dev at dpdk.org
> Cc: cristian.dumitrescu at intel.com
> Subject: [dpdk-dev] [PATCH] qos_sched: example modification to use
> librte_cfgfile
> 
> This is a modification of qos_sched example to use
> librte_cfgfile for parsing configuration file.
> 
> Signed-off-by: Michal Jastrzebski 
> ---
>  examples/qos_sched/cfg_file.c |  157 
> ++---
>  examples/qos_sched/cfg_file.h |   35 ++---
>  examples/qos_sched/init.c |   14 ++--
>  3 files changed, 47 insertions(+), 159 deletions(-)
> 
> diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
> index 05a8caf..71ddabb 100644
> --- a/examples/qos_sched/cfg_file.c
> +++ b/examples/qos_sched/cfg_file.c
> @@ -233,92 +233,7 @@ int cfg_close(struct cfg_file *cfg)
>  }
> 
>  int
> -cfg_num_sections(struct cfg_file *cfg, const char *sectionname, size_t
> length)
> -{
> - int i;
> - int num_sections = 0;
> - for (i = 0; i < cfg->num_sections; i++) {
> - if (strncmp(cfg->sections[i]->name, sectionname, length) ==
> 0)
> - num_sections++;
> - }
> - return num_sections;
> -}
> -
> -int
> -cfg_sections(struct cfg_file *cfg, char *sections[], int max_sections)
> -{
> - int i;
> - for (i = 0; i < cfg->num_sections && i < max_sections; i++) {
> - snprintf(sections[i], CFG_NAME_LEN, "%s",  cfg->sections[i]-
> >name);
> - }
> - return i;
> -}
> -
> -static const struct cfg_section *
> -_get_section(struct cfg_file *cfg, const char *sectionname)
> -{
> - int i;
> - for (i = 0; i < cfg->num_sections; i++) {
> - if (strncmp(cfg->sections[i]->name, sectionname,
> - sizeof(cfg->sections[0]->name)) == 0)
> - return cfg->sections[i];
> - }
> - return NULL;
> -}
> -
> -int
> -cfg_has_section(struct cfg_file *cfg, const char *sectionname)
> -{
> - return (_get_section(cfg, sectionname) != NULL);
> -}
> -
> -int
> -cfg_section_num_entries(struct cfg_file *cfg, const char *sectionname)
> -{
> - const struct cfg_section *s = _get_section(cfg, sectionname);
> - if (s == NULL)
> - return -1;
> - return s->num_entries;
> -}
> -
> -
> -int
> -cfg_section_entries(struct cfg_file *cfg, const char *sectionname,
> - struct cfg_entry *entries, int max_entries)
> -{
> - int i;
> - const struct cfg_section *sect = _get_section(cfg, sectionname);
> - if (sect == NULL)
> - return -1;
> - for (i = 0; i < max_entries && i < sect->num_entries; i++)
> - entries[i] = *sect->entries[i];
> - return i;
> -}
> -
> -const char *
> -cfg_get_entry(struct cfg_file *cfg, const char *sectionname,
> - const char *entryname)
> -{
> - int i;
> - const struct cfg_section *sect = _get_section(cfg, sectionname);
> - if (sect == NULL)
> - return NULL;
> - for (i = 0; i < sect->num_entries; i++)
> - if (strncmp(sect->entries[i]->name, entryname,
> CFG_NAME_LEN) == 0)
> - return sect->entries[i]->value;
> - return NULL;
> -}
> -
> -int
> -cfg_has_entry(struct cfg_file *cfg, const char *sectionname,
> - const char *entryname)
> -{
> - return (cfg_get_entry(cfg, sectionname, entryname) != NULL);
> -}
> -
> -
> -int
> -cfg_load_port(struct cfg_file *cfg, struct rte_sched_port_params
> *port_params)
> +cfg_load_port(struct rte_cfgfile *cfg, struct rte_sched_port_params
> *port_params)
>  {
>   const char *entry;
>   int j;
> @@ -326,19 +241,19 @@ cfg_load_port(struct cfg_file *cfg, struct
> rte_sched_port_params *port_params)
>   if (!cfg || !port_params)
>   return -1;
> 
> - entry = cfg_get_entry(cfg, "port", "frame overhead");
> + entry = rte_cfgfile_get_entry(cfg, "port", "frame overhead");
>   if (entry)
>   port_params->frame_overhead = (uint32_t)atoi(entry);
> 
> - entry = cfg_get_entry(cfg, "port", "number of subports per port");
> + entry = rte_cfgfile_get_entry(cfg, "port", "number of subports per
> port");
>   if (entry)
>   port_params->n_subports_per_port = (uint32_t)atoi(entry);
> 
> - entry = cfg_get_entry(cfg, "port", "number of pipes per subport");
> + entry = rte_cfgfile_get_entry(cfg, "port", "number of pipes per
> subport");
>   if (entry)
>   port_params->n_pipes_per_subport = (uint32_t)atoi(entry);
> 
> - entry = cfg_get_entry(cfg, "port", "queue sizes");
> + entry = rte_cfgfile_get_entry(cfg, "port", "queue sizes");
>   if (entry) {
>   char *next;
> 
> @@ -356,7 +271,7 @@ cfg_load_port(struct cfg_file *cfg, struct
> rte_sched_port_params *port_params)
> 
>   /* Parse WRED min thresholds */
>   snprintf(str, sizeof(str), "tc %d 

[dpdk-dev] [PATCH 5/5] rte_sched: allow reading without clearing

2015-06-04 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Wednesday, May 27, 2015 7:10 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH 5/5] rte_sched: allow reading without clearing
> 
> The rte_sched statistics API should allow reading statistics without
> clearing. Make auto-clear optional. In this version, this is handled
> by deprecating the old API and adding a new one.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  app/test/test_sched.c  |  4 +--
>  lib/librte_sched/rte_sched.c   | 65 
> +-
>  lib/librte_sched/rte_sched.h   | 37 ++-
>  lib/librte_sched/rte_sched_version.map |  2 ++
>  4 files changed, 74 insertions(+), 34 deletions(-)
> 
> diff --git a/app/test/test_sched.c b/app/test/test_sched.c
> index c7239f8..03f89b4 100644
> --- a/app/test/test_sched.c
> +++ b/app/test/test_sched.c
> @@ -198,13 +198,13 @@ test_sched(void)
> 
>   struct rte_sched_subport_stats subport_stats;
>   uint32_t tc_ov;
> - rte_sched_subport_read_stats(port, SUBPORT, &subport_stats, &tc_ov);
> + rte_sched_subport_stats(port, SUBPORT, &subport_stats, &tc_ov, 1);
>  #if 0
>   TEST_ASSERT_EQUAL(subport_stats.n_pkts_tc[TC-1], 10, "Wrong
> subport stats\n");
>  #endif
>   struct rte_sched_queue_stats queue_stats;
>   uint16_t qlen;
> - rte_sched_queue_read_stats(port, QUEUE, &queue_stats, &qlen);
> + rte_sched_queue_stats(port, QUEUE, &queue_stats, &qlen, 1);
>  #if 0
>   TEST_ASSERT_EQUAL(queue_stats.n_pkts, 10, "Wrong queue
> stats\n");
>  #endif
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 9c9419d..b4d7edd 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -965,61 +965,78 @@ rte_sched_port_pkt_read_color(const struct
> rte_mbuf *pkt)
>  }
> 
>  int
> -rte_sched_subport_read_stats(struct rte_sched_port *port,
> - uint32_t subport_id,
> - struct rte_sched_subport_stats *stats,
> - uint32_t *tc_ov)
> +rte_sched_subport_stats(struct rte_sched_port *port, uint32_t
> subport_id,
> + struct rte_sched_subport_stats *stats,
> + uint32_t *tc_ov, int clear)
>  {
>   struct rte_sched_subport *s;
> 
>   /* Check user parameters */
> - if ((port == NULL) ||
> - (subport_id >= port->n_subports_per_port) ||
> - (stats == NULL) ||
> - (tc_ov == NULL)) {
> + if (port == NULL || subport_id >= port->n_subports_per_port)
>   return -1;
> - }
> +
>   s = port->subport + subport_id;
> 
>   /* Copy subport stats and clear */
> - memcpy(stats, &s->stats, sizeof(struct rte_sched_subport_stats));
> - memset(&s->stats, 0, sizeof(struct rte_sched_subport_stats));
> + if (stats)
> + *stats = s->stats;
> + if (clear)
> + memset(&s->stats, 0, sizeof(struct
> rte_sched_subport_stats));
> 
>   /* Subport TC ovesubscription status */
> - *tc_ov = s->tc_ov;
> + if (tc_ov)
> + *tc_ov = s->tc_ov;
> 
>   return 0;
>  }
> 
> +/* Deprecated API, always clears */
>  int
> -rte_sched_queue_read_stats(struct rte_sched_port *port,
> - uint32_t queue_id,
> - struct rte_sched_queue_stats *stats,
> - uint16_t *qlen)
> +rte_sched_subport_read_stats(struct rte_sched_port *port, uint32_t
> subport_id,
> +  struct rte_sched_subport_stats *stats,
> +  uint32_t *tc_ov)
> +{
> + return rte_sched_subport_stats(port, subport_id, stats, tc_ov, 1);
> +}
> +
> +int
> +rte_sched_queue_stats(struct rte_sched_port *port, uint32_t queue_id,
> +   struct rte_sched_queue_stats *stats,
> +   uint16_t *qlen, int clear)
>  {
>   struct rte_sched_queue *q;
>   struct rte_sched_queue_extra *qe;
> 
>   /* Check user parameters */
> - if ((port == NULL) ||
> - (queue_id >= rte_sched_port_queues_per_port(port)) ||
> - (stats == NULL) ||
> - (qlen == NULL)) {
> + if (port == NULL || queue_id >=
> rte_sched_port_queues_per_port(port))
>   return -1;
> - }
> +
>   q = port->queue + queue_id;
>   qe = port->queue_extra + queue_id;
> 
>   /* Copy queue stats and clear */
> - memcpy(stats, &qe->stats, sizeof(struct rte_sched_queue_stats));
> - memset(&qe->stats, 0, sizeof(struct rte_sched_queue_stats));
> + if (stats)
> + *stats = qe->stats;

[dpdk-dev] [PATCH 01/11] ip_pipeline: add parsing for config files with new syntax

2015-06-04 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Monday, June 1, 2015 2:34 PM
> To: Gajdzica, MaciejX T
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 01/11] ip_pipeline: add parsing for config
> files with new syntax
> 
> On Fri, 29 May 2015 17:43:08 +0200
> Maciej Gajdzica  wrote:
> 
> > +/**
> > + * Find object of name *name* in *obj_array* which is constant size array
> of
> > + * elements that have field *name*.
> > + *
> > + * @param obj_array
> > + *  Constant size array
> > + * @param name
> > + *  name of object to find.
> > + * @return
> > + *  Pointer to object in *obj_array* or NULL if not found.
> > + */
> > +#define APP_PARAM_FIND(obj_array, key)  \
> > +({  \
> > +   ssize_t obj_idx;\
> > +   const ssize_t obj_count = RTE_DIM(obj_array);   \
> > +\
> > +   for (obj_idx = 0; obj_idx < obj_count; obj_idx++) { \
> > +   if (!APP_PARAM_VALID(&((obj_array)[obj_idx])))  \
> > +   continue;   \
> > +   \
> > +   if (strcmp(key, (obj_array)[obj_idx].name) == 0)\
> > +   break;  \
> > +   }   \
> > +   obj_idx < obj_count ? obj_idx : -ENOENT;\
> > +})
> 
> Converting all the functions to macro's is a step backwards in several ways.
>  * macro's are hard to support
>  * macro's lead to lots of programming errors
>  * macro's look ugly
> 
> Why not use real functions, or make the example into C++ if you have
> to do generic programming.

We are using macros here only because the C language does not offer us a better
choice (i.e. support for templates). The alternative would be to write a
quasi-identical function for each object type, which would lead to unnecessary
code duplication.

We did our best to keep the number of macros small and to implement each macro 
as straightforward as possible.

All the DPDK sample applications are written in C, so this is the main reason 
we want to keep this application as C code. As people expect C code from DPDK 
sample apps, it is easier for people to reuse parts of this application.




[dpdk-dev] [PATCH 0/3] librte_cfgfile rework and extension

2015-06-01 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Friday, May 29, 2015 4:27 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/3] librte_cfgfile rework and extension
> 
> From: Pawel Wodkowski 
> 
> This patchset provide extension to librte_cfgfile to overcome limit of 64
> characters for entry value and add line continue feature.
> 
> Pawel Wodkowski (3):
>   librte_cfgfile: fix code formating in header file
>   librte_compat: fix macro deffinition
>   librte_cfgfile: rework
> 
>  lib/librte_cfgfile/Makefile|   2 +-
>  lib/librte_cfgfile/rte_cfgfile.c   | 792 
> ++---
>  lib/librte_cfgfile/rte_cfgfile.h   |  52 +-
>  lib/librte_cfgfile/rte_cfgfile_version.map |   8 +
>  lib/librte_compat/rte_compat.h |   2 +-
>  5 files changed, 649 insertions(+), 207 deletions(-)
> 
> --
> 1.9.1

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v3 01/10] table: added structure for storing table stats

2015-05-28 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, May 26, 2015 10:58 PM
> To: Dumitrescu, Cristian
> Cc: Gajdzica, MaciejX T; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 01/10] table: added structure for storing
> table stats
> 
> On Tue, 26 May 2015 21:40:42 +0000
> "Dumitrescu, Cristian"  wrote:
> 
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> > > Hemminger
> > > Sent: Tuesday, May 26, 2015 3:58 PM
> > > To: Gajdzica, MaciejX T
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v3 01/10] table: added structure for
> storing
> > > table stats
> > >
> > > On Tue, 26 May 2015 14:39:38 +0200
> > > Maciej Gajdzica  wrote:
> > >
> > > > +
> > > >  /** Lookup table interface defining the lookup table operation */
> > > >  struct rte_table_ops {
> > > > rte_table_op_create f_create;   /**< Create */
> > > > @@ -194,6 +218,7 @@ struct rte_table_ops {
> > > > rte_table_op_entry_add f_add;   /**< Entry add */
> > > > rte_table_op_entry_delete f_delete; /**< Entry delete */
> > > > rte_table_op_lookup f_lookup;   /**< Lookup */
> > > > +   rte_table_op_stats_read f_stats;/**< Stats */
> > > >  };
> > >
> > > Another good idea, which is an ABI change.
> >
> > This is simply adding a new API function, this is not changing any function
> prototype. There is no change required in the map file of this library. Is 
> there
> anything we should have done and we did not do?
> >
> 
> But if I built an external set of code which had rte_table_ops (don't worry, I
> haven't) and that binary ran with the new definition, the core table code
> would reference outside the (old version of the) rte_table_ops structure and
> find garbage.

This is just adding a new field at the end of an API data structure. Based on 
input from multiple people and after reviewing the rules listed on 
http://dpdk.org/doc/guides/rel_notes/abi.html , I think this is an acceptable 
change. There are other patches in flight on this mailing list that are in the 
same situation. Any typical/well behaved application will not break due to this 
change.




[dpdk-dev] [PATCH v3] pipeline: add statistics for librte_pipeline

2015-05-28 Thread Dumitrescu, Cristian
Hi Raja,

Thanks for your input.

I think we have the following options identified so far for stats collection 
configuration:

1. Stats configuration through the RTE_LOG_LEVEL
2. Single configuration flag global for all DPDK libraries
3. Single configuration flag per DPDK library

It would be good if Thomas and Stephen, as well as others, would reply with 
their preference order.

My personal preference order is: 3., 2., 1., but I can work with any of the 
above that is identified by the majority of the replies. My goal right now is 
reaching a conclusion on this item as soon as we can.

Regards,
Cristian



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rajagopalan
> Sivaramakrishnan
> Sent: Wednesday, May 27, 2015 11:45 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] pipeline: add statistics for 
> librte_pipeline
> 
> 
> > > You also reiterate that you would like to have the stats always enabled.
> You
> > can definitely do this, it is one of the available choices, but why not also
> > accommodate the users that want to pick the opposite choice? Why force
> > apps to spend cycles on stats if the app either does not want these
> counters
> > (library counters not relevant for that app, maybe the app is only
> interested
> > in maintaining some other stats that it implements itself) or do not want
> > them anymore (maybe they only needed them during debug phase), etc?
> > Jay asked this question, and I did my best in my reply to describe our
> > motivation (http://www.dpdk.org/ml/archives/dev/2015-
> May/017992.html).
> > Maybe you missed that post, it would be good to get your reply on this one
> > too.
> >
> > I want to see DPDK get out of the config madness.
> > This is real code, not an Intel benchmark special.
> 
> 
> I agree that statistics will definitely be required in most real-world
> production environments, and the overhead from per-core stats gathering
> will be minimal if the data structures are such that CPU cache thrashing
> is avoided. However, if there are scenarios where it is desirable to turn
> stats off, I think we can live with a config option. I am not comfortable
> with using the log level to enable/disable statistics, as they are not
> really related. A separate config option for stats collection seems like
> a reasonable compromise.
> 
> Raja


[dpdk-dev] [PATCH v3] pipeline: add statistics for librte_pipeline ports and tables

2015-05-26 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, May 26, 2015 10:48 PM
> To: Dumitrescu, Cristian
> Cc: Gajdzica, MaciejX T; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> On Tue, 26 May 2015 21:35:22 +
> "Dumitrescu, Cristian"  wrote:
> 
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> > > Hemminger
> > > Sent: Tuesday, May 26, 2015 3:57 PM
> > > To: Gajdzica, MaciejX T
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v3] pipeline: add statistics for
> librte_pipeline
> > > ports and tables
> > >
> > > On Tue, 26 May 2015 15:39:18 +0200
> > > Maciej Gajdzica  wrote:
> > >
> > > > +#if RTE_LOG_LEVEL == RTE_LOG_DEBUG
> > > > +#define RTE_PIPELINE_STATS_ADD(counter, val) \
> > > > +   ({ (counter) += (val); })
> > > > +
> > > > +#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \
> > > > +   ({ (counter) += __builtin_popcountll(mask); })
> > > > +#else
> > > > +#define RTE_PIPELINE_STATS_ADD(counter, val)
> > > > +#define RTE_PIPELINE_STATS_ADD_M(counter, mask)
> > > > +#endif
> > >
> > > This is worse idea than the previous one.
> > > I want statistics done on a per lcore basis, and always available
> > > because real users on production system want statistics (but they
> > > don't want log spam).
> >
> > Stephen,
> >
> > Thomas and myself converged towards this solution, Thomas asked if
> anybody objects, you did not (http://www.dpdk.org/ml/archives/dev/2015-
> May/018099.html) . You say this idea is bad, but what exactly is your
> objection? Do you have an alternative proposal?
> 
> Yes. Always keep statistics.
> 
> We use functions like this internally.
> 
> struct xxx_stats {
>   uint64_t mib[XXX_MIB_MAX];
> };
> extern struct xxx_stats xxx_stats[RTE_MAX_LCORE];
> 
> #define XXX_STAT_INC(type) xxx_stats[rte_lcore_id()].mib[type]++
> 
> 

This can only be applied if stats are always enabled. I know this is your 
preferred choice, but other people have the requirement to be able to have the 
stats collection configurable, and we should also accommodate the needs of 
those people. Your code snippet does not provide a solution for this problem.

> > You already mentioned in the previous thread you would like to have per
> lcore statistics. I was kindly asking you to describe your idea, but you did 
> not
> do this yet (http://www.dpdk.org/ml/archives/dev/2015-May/017956.html ).
> Can you please describe it? Each port instance has its own statistics 
> counters.
> Each lcore can run one or more pipeline instances, therefore each lcore
> typically runs several port instances of the same or different type (each port
> instance with its own statistics), so how is "per lcore stats" requirement
> applicable here?
> 
> I thought you were familiar with the technique of having per-cpu structures
> and variables, widely used in the Linux kernel to prevent cache thrashing.
> Although you call it false sharing, it isn't false; it happens when the same
> counter ping-pongs between multiple threads for no added benefit.

The "per lcore stats" is applicable to stats structures that are accessed by 
more than one lcore. In our case, each port instance is handled by a single 
lcore, so it is never the case that two lcores would access the stats of the 
same port instance. So, we can safely say that port stats are "per lcore" 
already.

> 
> 
> > You also reiterate that you would like to have the stats always enabled. You
> can definitely do this, it is one of the available choices, but why not also
> accommodate the users that want to pick the opposite choice? Why force
> apps to spend cycles on stats if the app either does not want these counters
> (library counters not relevant for that app, maybe the app is only interested
> in maintaining some other stats that it implements itself) or do not want
> them anymore (maybe they only needed them during debug phase), etc?
> Jay asked this question, and I did my best in my reply to describe our
> motivation (http://www.dpdk.org/ml/archives/dev/2015-May/017992.html).
> Maybe you missed that post, it would be good to get your reply on this one
> too.
> 
> I want to see DPDK get out of the config madness.
> This is real code, not an Intel benchmark special.

You seem to make this assumption that our real reason for hav

[dpdk-dev] [PATCH v3 01/10] table: added structure for storing table stats

2015-05-26 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Tuesday, May 26, 2015 3:58 PM
> To: Gajdzica, MaciejX T
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 01/10] table: added structure for storing
> table stats
> 
> On Tue, 26 May 2015 14:39:38 +0200
> Maciej Gajdzica  wrote:
> 
> > +
> >  /** Lookup table interface defining the lookup table operation */
> >  struct rte_table_ops {
> > rte_table_op_create f_create;   /**< Create */
> > @@ -194,6 +218,7 @@ struct rte_table_ops {
> > rte_table_op_entry_add f_add;   /**< Entry add */
> > rte_table_op_entry_delete f_delete; /**< Entry delete */
> > rte_table_op_lookup f_lookup;   /**< Lookup */
> > +   rte_table_op_stats_read f_stats;/**< Stats */
> >  };
> 
> Another good idea, which is an ABI change.

This is simply adding a new API function; it is not changing any function 
prototype. There is no change required in the map file of this library. Is 
there anything we should have done and we did not do?



[dpdk-dev] [PATCH v3] pipeline: add statistics for librte_pipeline ports and tables

2015-05-26 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Tuesday, May 26, 2015 3:57 PM
> To: Gajdzica, MaciejX T
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> On Tue, 26 May 2015 15:39:18 +0200
> Maciej Gajdzica  wrote:
> 
> > +#if RTE_LOG_LEVEL == RTE_LOG_DEBUG
> > +#define RTE_PIPELINE_STATS_ADD(counter, val) \
> > +   ({ (counter) += (val); })
> > +
> > +#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \
> > +   ({ (counter) += __builtin_popcountll(mask); })
> > +#else
> > +#define RTE_PIPELINE_STATS_ADD(counter, val)
> > +#define RTE_PIPELINE_STATS_ADD_M(counter, mask)
> > +#endif
> 
> This is worse idea than the previous one.
> I want statistics done on a per lcore basis, and always available
> because real users on production system want statistics (but they
> don't want log spam).

Stephen,

Thomas and myself converged towards this solution, Thomas asked if anybody 
objects, you did not (http://www.dpdk.org/ml/archives/dev/2015-May/018099.html) 
. You say this idea is bad, but what exactly is your objection? Do you have an 
alternative proposal?

You already mentioned in the previous thread you would like to have per lcore 
statistics. I was kindly asking you to describe your idea, but you did not do 
this yet (http://www.dpdk.org/ml/archives/dev/2015-May/017956.html ). Can you 
please describe it? Each port instance has its own statistics counters. Each 
lcore can run one or more pipeline instances, therefore each lcore typically 
runs several port instances of the same or different type (each port instance 
with its own statistics), so how is "per lcore stats" requirement applicable 
here?

You also reiterate that you would like to have the stats always enabled. You 
can definitely do this, it is one of the available choices, but why not also 
accommodate the users that want to pick the opposite choice? Why force apps to 
spend cycles on stats if the app either does not want these counters (library 
counters not relevant for that app, maybe the app is only interested in 
maintaining some other stats that it implements itself) or do not want them 
anymore (maybe they only needed them during debug phase), etc? Jay asked this 
question, and I did my best in my reply to describe our motivation 
(http://www.dpdk.org/ml/archives/dev/2015-May/017992.html). Maybe you missed 
that post, it would be good to get your reply on this one too.

Thanks,
Cristian



[dpdk-dev] [PATCH v3 00/10] table: added table statistics

2015-05-26 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Tuesday, May 26, 2015 1:40 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 00/10] table: added table statistics
> 
> Added statistics for every type of table. By default all table statistics
> are disabled, user must activate them in config file.
> 
> Changes in v2:
>   - added missing signoffs
> 
> Changes in v3:
>   - removed new config options to enable/disable stats
>   - using RTE_LOG_LEVEL instead
> 
> Maciej Gajdzica (10):
>   table: added structure for storing table stats
>   table: added acl table stats
>   table: added array table stats
>   table: added hash_ext table stats
>   table: added hash_key16 table stats
>   table: added hash_key32 table stats
>   table: added hash_key8 table stats
>   table: added hash_lru table stats
>   table: added lpm_ipv6 table stats
>   table: added lpm table stats
> 
>  lib/librte_table/rte_table.h|   25 +++
>  lib/librte_table/rte_table_acl.c|   35 +
>  lib/librte_table/rte_table_array.c  |   34 +++-
>  lib/librte_table/rte_table_hash_ext.c   |   44
> ++
>  lib/librte_table/rte_table_hash_key16.c |   41
> 
>  lib/librte_table/rte_table_hash_key32.c |   41
> 
>  lib/librte_table/rte_table_hash_key8.c  |   52
> +++
>  lib/librte_table/rte_table_hash_lru.c   |   44
> ++
>  lib/librte_table/rte_table_lpm.c|   34 
>  lib/librte_table/rte_table_lpm_ipv6.c   |   34 
>  10 files changed, 383 insertions(+), 1 deletion(-)
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v3 00/13] port: added port statistics

2015-05-26 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Tuesday, May 26, 2015 10:23 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 00/13] port: added port statistics
> 
> Added statistics for every type of port. By default all port statistics
> are disabled, user must activate them in config file.
> 
> Changes in v2:
>   - added missing signoffs
> 
> Changes in v3:
>   - removed new config options to enable/disable stats
>   - using RTE_LOG_LEVEL instead
> 
> Maciej Gajdzica (13):
>   port: added structures for port stats
>   port: added port_ethdev_reader stats
>   port: added port_ethdev_writer stats
>   port: added port_ethdev_writer_nodrop stats
>   port: added port_frag stats
>   port: added port_ras stats
>   port: added port_ring_reader stats
>   port: added port_ring_writer stats
>   port: added port_ring_writer_nodrop stats
>   port: added port_sched_reader stats
>   port: added port_sched_writer stats
>   port: added port_source stats
>   port: added port_sink stats
> 
>  lib/librte_port/rte_port.h |   60 +++--
>  lib/librte_port/rte_port_ethdev.c  |  110
> +-
>  lib/librte_port/rte_port_frag.c|   36 ++
>  lib/librte_port/rte_port_ras.c |   38 +++
>  lib/librte_port/rte_port_ring.c|  114
> +++-
>  lib/librte_port/rte_port_sched.c   |   96 +--
>  lib/librte_port/rte_port_source_sink.c |   98
> +--
>  7 files changed, 535 insertions(+), 17 deletions(-)
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-21 Thread Dumitrescu, Cristian


From: Jay Rolette [mailto:role...@infiniteio.com]
Sent: Thursday, May 21, 2015 4:00 PM
To: Dumitrescu, Cristian
Cc: Thomas Monjalon; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline 
ports and tables


On Thu, May 21, 2015 at 8:33 AM, Dumitrescu, Cristian <cristian.dumitrescu at intel.com> wrote:
> > The problem I see with this approach is that you cannot turn off debug
> > messages while still having the statistics collected.  We should allow
> > people to collects the stats without getting the debug messages. How do
> > we do this?
>
> By setting build-time log level to debug, you would enable stats and debug
> messages. Then you can adjust the log messages level with the run-time
> option --log-level.

This is a really clever trick! I have to say it took me some time to make sure 
I got it right :)
So when RTE_LOG_LEVEL (build time option) is set to DEBUG (i.e. 8), then we get 
both the stats and the debug messages, but then we can adjust the level of 
debug messages at run-time through the --log-level EAL option.
I can work with this option, thank you!

There is one more simple proposal that came to my mind late last night: how 
about single config file option RTE_COLLECT_STATS=y/n that should be used by 
all the libs across the board to indicate whether stats should be enabled or 
not?
This is logically very similar to the solution above, as they both look at a 
single option in the config file, but maybe it is also more intuitive for 
people.

I will go with your choice. Which one do you pick?

As Stephen said previously, any real DPDK app needs stats. You can't support 
network products operationally without them. What's the point of making them 
optional other than trying to juice benchmarks?

Jay

Hi Jay,

As explained in this thread, our strategy as a library is to keep all options 
open for the application: the application should be the one to decide whether 
the statistics collection should be enabled or not.

The library should not take application level decisions:

a) For application A, these stats are relevant, so they should be collected;

b) For application B, these counters are not relevant, so they should not be 
collected, so we allow the application to spend the CPU cycles doing some 
other useful work;

c) For application C, these counters might only be relevant during the 
debugging phase, so they should be collected during that time; later on, when 
debugging is completed, their collection should be disabled.

I do not see why we should force the application to collect the stats if/when 
it does not need them. The library should allow the application to decide what 
the library should do, not the other way around.

Regards,
Cristian



[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-21 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Thursday, May 21, 2015 8:30 AM
> To: Dumitrescu, Cristian
> Cc: Wodkowski, PawelX; dev at dpdk.org; Jastrzebski, MichalX K
> Subject: Re: [dpdk-dev] [PATCH v2] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> 2015-05-20 23:41, Dumitrescu, Cristian:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2015-05-20 13:57, Dumitrescu, Cristian:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > > Thanks for the detailed explanation.
> > > > >
> > > > > You are raising a trade-off problem about
> > > > > feature/maintenance/performance.
> > > > > I think we must choose among 4 solutions:
> > > > >   1/ always enabled
> > > > >   2/ build-time log level
> > > > >   3/ run-time option
> > > > >   4/ build-time option
> > > > >
> > > > > It's good to have this discussion and let other contributors giving 
> > > > > their
> > > > > opinion.
> > > > >
> > > > > 2015-05-19 22:41, Dumitrescu, Cristian:
> > > > > > 1. What is the technical solution to avoid performance loss due to
> stats
> > > > > > support?
> > > > > > Generally, I would agree with you that config options should be
> avoided,
> > > > > > especially those that alter the API (function prototypes, data
> structure
> > > > > > layouts, etc). Please note this is not the case for any of our 
> > > > > > patches,
> > > > > > as only the stats collection is enabled/disabled, while the data
> > > > > > structures and functions are not changed by the build time option.
> > > > > >
> > > > > > But what can you do for statistics? We all know that collecting the
> stats
> > > > > > sucks up CPU cycles, especially due to memory accesses, so stats
> always
> > > > > > have a run-time cost. Traditionally, stats are typically enabled for
> > > > > > debugging purposes and then turned off for the release version
> when
> > > > > > performance is required. How can this be done if build time flags 
> > > > > > are
> > > > > > not used?
> > > > >
> > > > > Statistics in drivers are always enabled (first solution).
> > > > > If those statistics are only used for debugging, why not using the
> build-
> > > > > time option CONFIG_RTE_LOG_LEVEL? (second solution)
> > > >
> > > > Can you please describe what solution 2 on your list (build-time log
> > > > level) consists of?
> > > >
> > > > I see log level useful for printing messages when an event takes place,
> > > > but this is not what these stats patches are about. We want to poll
> > > > for those counters on demand: if the build-time flag is off, then the
> > > > value of those counters is 0; when the build-time is on, then the stats
> > > > counters have the real value. Basically, the build-time flag only
> > > > enables/disables the update of the counters at run-time, which is
> where
> > > > CPU cycles are consumed. I am not sure how the log levels can help
> here?
> > >
> > > I think that counting stats is a kind of logging.
> > > Some stats are always counted (see drivers) and you want to use these
> ones
> > > only for debugging (after rebuilding DPDK with some debug options).
> > > So I suggest, as second solution, to check CONFIG_RTE_LOG_LEVEL is at
> > > debug
> > > level instead of having one option per module.
> > > It would be implemented with "#if RTE_LOG_LEVEL == RTE_LOG_DEBUG"
> in
> > > RTE_PIPELINE_STATS_ADD.
> >
> > The problem I see with this approach is that you cannot turn off debug
> > messages while still having the statistics collected.  We should allow
> > people to collects the stats without getting the debug messages. How do
> > we do this?
> 
> By setting build-time log level to debug, you would enable stats and debug
> messages. Then you can adjust the log messages level with the run-time
> option --log-level.

This is a really clever trick! I have to say it took me some time to make sure 
I got it right :)
So when RTE_LOG_LEVEL (build time option) is set to DEBUG (i.e. 8), then we get 
both the stats and the debug messages, but then we can adjust the level of 
debug messages at run-time through the --log-level EAL option.
I can work with this option, thank you!

There is one more simple proposal that came to my mind late last night: how 
about single config file option RTE_COLLECT_STATS=y/n that should be used by 
all the libs across the board to indicate whether stats should be enabled or 
not?
This is logically very similar to the solution above, as they both look at a 
single option in the config file, but maybe it is also more intuitive for 
people.

I will go with your choice. Which one do you pick?



[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-21 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, May 20, 2015 11:02 PM
> To: Stephen Hemminger; Dumitrescu, Cristian
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> 2015-05-20 10:59, Stephen Hemminger:
> > On Wed, 20 May 2015 16:44:35 +0200
> > Thomas Monjalon  wrote:
> >
> > > Please Cristian, do not top post.
> > > I'm replacing your comment in the right context.
> > >
> > > 2015-05-20 13:57, Dumitrescu, Cristian:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > > Thanks for the detailed explanation.
> > > > >
> > > > > You are raising a trade-off problem about
> > > > > feature/maintenance/performance.
> > > > > I think we must choose among 4 solutions:
> > > > >   1/ always enabled
> > > > >   2/ build-time log level
> > > > >   3/ run-time option
> > > > >   4/ build-time option
> > > > >
> > > > > It's good to have this discussion and let other contributors giving 
> > > > > their
> > > > > opinion.
> > > > >
> > > > > 2015-05-19 22:41, Dumitrescu, Cristian:
> > > > > > 1. What is the technical solution to avoid performance loss due to
> stats
> > > > > > support?
> > > > > > Generally, I would agree with you that config options should be
> avoided,
> > > > > > especially those that alter the API (function prototypes, data
> structure
> > > > > > layouts, etc). Please note this is not the case for any of our 
> > > > > > patches,
> > > > > > as only the stats collection is enabled/disabled, while the data
> > > > > > structures and functions are not changed by the build time option.
> > > > > >
> > > > > > But what can you do for statistics? We all know that collecting the
> stats
> > > > > > sucks up CPU cycles, especially due to memory accesses, so stats
> always
> > > > > > have a run-time cost. Traditionally, stats are typically enabled for
> > > > > > debugging purposes and then turned off for the release version
> when
> > > > > > performance is required. How can this be done if build time flags 
> > > > > > are
> not
> > > > > > used?
> > > > >
> > > > > Statistics in drivers are always enabled (first solution).
> > > > > If those statistics are only used for debugging, why not using the
> build-time
> > > > > option CONFIG_RTE_LOG_LEVEL? (second solution)
> > > >
> > > > Can you please describe what solution 2 on your list (build-time log
> > > > level) consists of?
> > > >
> > > > I see log level useful for printing messages when an event takes place,
> > > > but this is not what these stats patches are about. We want to poll
> > > > for those counters on demand: if the build-time flag is off, then the
> > > > value of those counters is 0; when the build-time is on, then the stats
> > > > counters have the real value. Basically, the build-time flag only
> > > > enables/disables the update of the counters at run-time, which is
> where
> > > > CPU cycles are consumed. I am not sure how the log levels can help
> here?
> > >
> > > I think that counting stats is a kind of logging.
> > > Some stats are always counted (see drivers) and you want to use these
> ones
> > > only for debugging (after rebuilding DPDK with some debug options).
> > > So I suggest, as second solution, to check CONFIG_RTE_LOG_LEVEL is at
> debug
> > > level instead of having one option per module.
> > > It would be implemented with "#if RTE_LOG_LEVEL == RTE_LOG_DEBUG"
> in
> > > RTE_PIPELINE_STATS_ADD.
> > >
> >
> > In my experience, per-queue or per-cpu statistics have no visible
> performance
> > impact. And every real world deployment wants statistics
> 
> Yes. Maybe that having a benchmark with per-cpu stats would help to discuss
> the first solution (stats always enabled).

What are per-cpu stats? Are you referring to the false sharing problem when 
each cpu core maintains its own stats, but stats from several cores end up 
being stored in the same cache line? We don't have this problem here.



[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-21 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, May 20, 2015 3:45 PM
> To: Dumitrescu, Cristian
> Cc: Wodkowski, PawelX; dev at dpdk.org; Jastrzebski, MichalX K
> Subject: Re: [dpdk-dev] [PATCH v2] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> Please Cristian, do not top post.
> I'm replacing your comment in the right context.
> 
> 2015-05-20 13:57, Dumitrescu, Cristian:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > Thanks for the detailed explanation.
> > >
> > > You are raising a trade-off problem about
> > > feature/maintenance/performance.
> > > I think we must choose among 4 solutions:
> > >   1/ always enabled
> > >   2/ build-time log level
> > >   3/ run-time option
> > >   4/ build-time option
> > >
> > > It's good to have this discussion and let other contributors giving their
> > > opinion.
> > >
> > > 2015-05-19 22:41, Dumitrescu, Cristian:
> > > > 1. What is the technical solution to avoid performance loss due to stats
> > > > support?
> > > > Generally, I would agree with you that config options should be
> avoided,
> > > > especially those that alter the API (function prototypes, data structure
> > > > layouts, etc). Please note this is not the case for any of our patches,
> > > > as only the stats collection is enabled/disabled, while the data
> > > > structures and functions are not changed by the build time option.
> > > >
> > > > But what can you do for statistics? We all know that collecting the 
> > > > stats
> > > > sucks up CPU cycles, especially due to memory accesses, so stats always
> > > > have a run-time cost. Traditionally, stats are typically enabled for
> > > > debugging purposes and then turned off for the release version when
> > > > performance is required. How can this be done if build time flags are
> not
> > > > used?
> > >
> > > Statistics in drivers are always enabled (first solution).
> > > If those statistics are only used for debugging, why not using the build-
> time
> > > option CONFIG_RTE_LOG_LEVEL? (second solution)
> >
> > Can you please describe what solution 2 on your list (build-time log
> > level) consists of?
> >
> > I see log level useful for printing messages when an event takes place,
> > but this is not what these stats patches are about. We want to poll
> > for those counters on demand: if the build-time flag is off, then the
> > value of those counters is 0; when the build-time is on, then the stats
> > counters have the real value. Basically, the build-time flag only
> > enables/disables the update of the counters at run-time, which is where
> > CPU cycles are consumed. I am not sure how the log levels can help here?
> 
> I think that counting stats is a kind of logging.
> Some stats are always counted (see drivers) and you want to use these ones
> only for debugging (after rebuilding DPDK with some debug options).
> So I suggest, as second solution, to check CONFIG_RTE_LOG_LEVEL is at
> debug
> level instead of having one option per module.
> It would be implemented with "#if RTE_LOG_LEVEL == RTE_LOG_DEBUG" in
> RTE_PIPELINE_STATS_ADD.

The problem I see with this approach is that you cannot turn off debug messages 
while still having the statistics collected.  We should allow people to 
collects the stats without getting the debug messages. How do we do this?

Although most people use the stats for debugging and then turn them off, some 
other people are OK with the stats-related performance drop for the production 
code (e.g. they might want to log some stats periodically), but they want to 
get rid of the debug messages.

Potential idea:
Would it be acceptable to create a new log level that is dedicated to stats? 
This would have to be higher priority than all the present ones, so it needs to 
be: #define RTE_LOG_STATS 1U. This way, we can get rid of all the debug 
messages while still logging stats:

#if (RTE_LOG_STATS <= RTE_LOG_LEVEL)
#define STATS_COUNTER_ADD(level, type, counter, val) counter += val
#else
#define STATS_COUNTER_ADD(level, type, counter, val)
#endif

However, this won't allow getting the debug messages while stats collection is 
suppressed. But that case is probably not relevant.



[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-20 Thread Dumitrescu, Cristian
Hi Thomas,

Can you please describe what solution 2 on your list (build-time log level) 
consists of?

I see log level useful for printing messages when an event takes place, but 
this is not what these stats patches are about. We want to poll for those 
counters on demand: if the build-time flag is off, then the value of those 
counters is 0; when the build-time is on, then the stats counters have the real 
value. Basically, the build-time flag only enables/disables the update of the 
counters at run-time, which is where CPU cycles are consumed. I am not sure how 
the log levels can help here?

Regards,
Cristian

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, May 20, 2015 1:06 AM
> To: Dumitrescu, Cristian
> Cc: Wodkowski, PawelX; dev at dpdk.org; Jastrzebski, MichalX K
> Subject: Re: [dpdk-dev] [PATCH v2] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> Hi Cristian,
> 
> Thanks for the detailed explanation.
> 
> You are raising a trade-off problem about
> feature/maintenance/performance.
> I think we must choose among 4 solutions:
>   1/ always enabled
>   2/ build-time log level
>   3/ run-time option
>   4/ build-time option
> 
> It's good to have this discussion and let other contributors giving their
> opinion.
> 
> 2015-05-19 22:41, Dumitrescu, Cristian:
> > Hi Thomas,
> >
> > I am asking for your help to identify a technical solution for moving
> > forward. Enabling the stats is a very important part of our Packet
> > Framework enhancements, as it is a dependency for a set of patches
> > we plan to submit next.
> >
> > 1. What is the technical solution to avoid performance loss due to stats
> > support?
> > Generally, I would agree with you that config options should be avoided,
> > especially those that alter the API (function prototypes, data structure
> > layouts, etc). Please note this is not the case for any of our patches,
> > as only the stats collection is enabled/disabled, while the data
> > structures and functions are not changed by the build time option.
> >
> > But what can you do for statistics? We all know that collecting the stats
> > sucks up CPU cycles, especially due to memory accesses, so stats always
> > have a run-time cost. Traditionally, stats are typically enabled for
> > debugging purposes and then turned off for the release version when
> > performance is required. How can this be done if build time flags are not
> > used?
> 
> Statistics in drivers are always enabled (first solution).
> If those statistics are only used for debugging, why not using the build-time
> option CONFIG_RTE_LOG_LEVEL? (second solution)
> 
> > Enabling stats conditionally at run-time has a significant cost
> > potentially, as this involves adding branches to the code; this is
> 
> Yes, a run-time option (similar to run-time log level) will cost only
> an "unlikely if" branching. (third solution)
> 
> > particularly bad due to branches being added in the library code,
> > which is not aware of the overhead of the application running on top of
> > it on the same CPU core. We are battling cycles very hard in the
> > Packet Framework, trying to avoid any waste, and the cost of branch
> > misprediction of ~15 cycles is prohibitive when your packet budget
> > (for the overall app, not just the library) is less than ~200 cycles.
> 
> This kind of issue applies to many areas of DPDK.
> 
> > Sometimes, branches are predicted correctly by the CPU, some other
> > times not, for example when the application code adds a lot of additional
> > branches (which is the typical case for an app). This is outside of the
> > control of the library.
> >
> > I would not recommend the run-time conditional option for stats support
> > in DPDK, where performance is key. The only option that looks feasible
> > to me to avoid a performance impact is a well-behaved build-time option
> > (i.e. one that does not change the API functions and data structures).
> 
> The compile-time option makes testing coverage really complex.
> (fourth solution)
> 
> >  2. You already agreed that having build time options for performance
> >  reasons is fine, why change now?
> 
> Right, build time options are tolerated in DPDK because it is a performance
> oriented project. We also have to avoid 2 issues:
> - increasing number of compile-time options hide some bugs
> - binary distributions of DPDK cannot use these compile-time "features"
> 
> > We had a previous email thread on QoS stats
> (http://www.dpdk.org/ml/archives/dev/2015-February/0

[dpdk-dev] [PATCH v2 1/3] port: added WRITER_APPROACH == 1 implementation to ring port

2015-05-19 Thread Dumitrescu, Cristian
HI Thomas and David,

We can remove one of the code branches if you guys feel strongly about it.

The reason we recommended keeping both is that both of them are working, and 
we wanted to keep them for a while to get some feedback from other people 
about which one they prefer. It might be that some apps would prefer one over 
the other for performance reasons. But again, we can resend this patch with 
only one code path present.

Regards,
Cristian

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Sunday, May 17, 2015 10:45 PM
> To: Jastrzebski, MichalX K; Gajdzica, MaciejX T
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/3] port: added WRITER_APPROACH == 1
> implementation to ring port
> 
> 2015-04-30 13:58, Michal Jastrzebski:
> > From: Maciej Gajdzica 
> >
> > Added better optimized implementation of tx_bulk for ring writer port
> > based on
> > similar solution in ethdev_writer port. New implementation sends burst
> > without
> > copying data to internal buffer if it is possible.
> >
> > Signed-off-by: Maciej Gajdzica 
> [...]
> > +#if RTE_PORT_RING_WRITER_APPROACH == 0
> 
> Nack
> Maybe you missed the previous comment:
>   http://dpdk.org/ml/archives/dev/2015-March/015999.html


[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-19 Thread Dumitrescu, Cristian
Hi Thomas,

I am asking for your help to identify a technical solution for moving forward. 
Enabling the stats is a very important part of our Packet Framework 
enhancements, as it is a dependency for a set of patches we plan to submit next.

1. What is the technical solution to avoid performance loss due to stats 
support?
Generally, I would agree with you that config options should be avoided, 
especially those that alter the API (function prototypes, data structure 
layouts, etc). Please note this is not the case for any of our patches, as only 
the stats collection is enabled/disabled, while the data structures and 
functions are not changed by the build time option.

But what can you do for statistics? We all know that collecting the stats sucks 
up CPU cycles, especially due to memory accesses, so stats always have a 
run-time cost. Traditionally, stats are enabled for debugging purposes and 
then turned off in the release build when performance is required. How can 
this be done if build-time flags are not used?

Enabling stats conditionally at run-time has a potentially significant cost, as 
this involves adding branches to the code; this is particularly bad due to 
branches being added in the library code, which is not aware of the overhead of 
the application running on top of it on the same CPU core. We are battling 
cycles very hard in the Packet Framework, trying to avoid any waste, and the 
cost of branch misprediction of ~15 cycles is prohibitive when your packet 
budget (for the overall app, not just the library) is less than ~200 cycles. 
Sometimes, branches are predicted correctly by the CPU, some other times not, 
for example when the application code adds a lot of additional branches (which 
is the typical case for an app). This is outside of the control of the library.
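For reference, the run-time alternative (Thomas's third solution) is typically an "unlikely" branch like the sketch below; `stats_enabled` and `STATS_ADD` are illustrative names, not DPDK API:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative run-time switch; a real library would keep this in its
 * context structure, set once at init time. */
static int stats_enabled;

/* The "unlikely if" pattern: one conditional branch per counter update.
 * When stats are off the branch is usually predicted not-taken, but the
 * check itself can never be compiled out. */
#define STATS_ADD(counter, val)                         \
	do {                                            \
		if (__builtin_expect(stats_enabled, 0)) \
			(counter) += (val);             \
	} while (0)
```

The cost is exactly the branch discussed above: near-free when predicted, ~15 cycles on a misprediction.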

I would not recommend the run-time conditional option for stats support in 
DPDK, where performance is key. The only option that looks feasible to me to 
avoid a performance impact is a well-behaved build-time option (i.e. one that 
does not change the API functions and data structures).
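By contrast, the build-time approach in the patch under discussion compiles the updates out entirely; a condensed, self-contained version of its macros (enabled by defining RTE_PIPELINE_STATS_COLLECT at build time):

```c
#include <assert.h>
#include <stdint.h>

#ifdef RTE_PIPELINE_STATS_COLLECT
/* Enabled: plain counter updates; the _M variant counts the packets set
 * in a 64-bit mask with a single popcount. */
#define RTE_PIPELINE_STATS_ADD(counter, val) \
	({ (counter) += (val); })
#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \
	({ (counter) += __builtin_popcountll(mask); })
#else
/* Disabled: the macros expand to nothing, so there is no branch and no
 * memory access, and the API and struct layouts are unchanged either way. */
#define RTE_PIPELINE_STATS_ADD(counter, val)
#define RTE_PIPELINE_STATS_ADD_M(counter, mask)
#endif
```

With the flag off, every update site vanishes at preprocessing time, which is what makes this option "well-behaved" with respect to the ABI.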


2. You already agreed that having build-time options for performance reasons 
is fine, why change now?
We had a previous email thread on QoS stats 
(http://www.dpdk.org/ml/archives/dev/2015-February/013820.html ), and you 
stated:

>We must remove and avoid build-time options.
>The only ones which might be acceptable are the ones which allow more
>performance by disabling some features.

I think stats support is the perfect example where this is applicable. We took 
this as the guidance from you, and we worked out these stats exactly as you 
recommended. Let me also mention that the format of the API functions and data 
structures for stats were also designed to match exactly the requirements from 
this discussion (e.g. the clear-on-read approach). Our functions match exactly 
the strategy that Stephen and I agreed on at that time. Our work was done 
under this agreement.
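A minimal sketch of the clear-on-read convention mentioned above; the struct and function names are illustrative, not the actual librte_pipeline API:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative per-port counters. */
struct port_stats {
	uint64_t n_pkts_in;
	uint64_t n_pkts_drop;
};

/* Copy the counters out and, if requested, reset them so the next read
 * reports only events that occurred since this call. */
static void
port_stats_read(struct port_stats *s, struct port_stats *out, int clear)
{
	if (out != NULL)
		*out = *s;
	if (clear)
		memset(s, 0, sizeof(*s));
}
```

Clear-on-read keeps the fast path free of any "reset pending" checks: the reset happens entirely on the slow (reader) side.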


3. There are other libraries that already apply the build time option for stats 
support
Other DPDK libraries provide stats based on build time configuration: Ethdev, 
QoS scheduler, IP fragmentation. Are you suggesting we need to remove the build 
time stats option for these libraries, too? If yes, what about the performance 
impact, which would be prohibitive (I can definitely testify to this for the 
QoS scheduler, where every cycle counts)? This would result in choosing between 
functionality (stats support) vs. performance, and it might be that for some of 
these libraries the decision will be to remove the stats support altogether in 
order to keep the performance unchanged.

I think we should encourage people to add stats to more libraries, and if we 
forbid build-time options for stats support, we send a strong signal that 
discourages people from adding stats support to the applicable DPDK libraries.


4. Coding guidelines
Is there a rule on "build time options are no longer allowed in DPDK" published 
somewhere in our documentation? Is it part of the coding standard? Am I wrong 
to say this is not the case at this point?
If we intend to have such a rule, then I think it needs to be discussed on this 
list and published, to avoid any misinterpretation (leading to rework and 
debates). It should be published together with all the acceptable exceptions 
that are part of any rule definition. I personally think the stats support 
should be such an exception.

Thanks for your help!

Regards,
Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, May 18, 2015 12:02 PM
> To: Wodkowski, PawelX
> Cc: dev at dpdk.org; Dumitrescu, Cristian; Jastrzebski, MichalX K
> Subject: Re: [dpdk-dev] [PATCH v2] pipeline: add statistics for 
> librte_pipeline
> ports and tables
> 
> 2015-05-05 15:11, Dumitrescu, Cristian:
> > From: dev [mailto:dev-bounces 

[dpdk-dev] [PATCH] examples/ip_pipeline: fix build with clang 3.6

2015-05-19 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, May 19, 2015 11:37 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] examples/ip_pipeline: fix build with clang 3.6
> 
> This error is detected:
> examples/ip_pipeline/cmdline.c:272:15: error: address of array
>   'params->file_path' will always evaluate to 'true'
> if (!params->file_path) {
> ~^
> 
> file_path is an array in a structure so it's unneeded to check it.
> 
> Signed-off-by: Thomas Monjalon 
> ---
>  examples/ip_pipeline/cmdline.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/examples/ip_pipeline/cmdline.c
> b/examples/ip_pipeline/cmdline.c
> index 152acb5..3173fd0 100644
> --- a/examples/ip_pipeline/cmdline.c
> +++ b/examples/ip_pipeline/cmdline.c
> @@ -268,12 +268,6 @@ cmd_run_file_parsed(
>   struct cmdline *file_cl;
>   int fd;
> 
> - /* Check params */
> - if (!params->file_path) {
> - printf("Illegal value for file path (%s)\n", params->file_path);
> - return;
> - }
> -
>   fd = open(params->file_path, O_RDONLY, 0);
>   if (fd < 0) {
>   printf("Illegal value for file path (%s)\n", params->file_path);
> --
> 2.4.0

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v2 0/2] cmdline: add polling mode for command line

2015-05-13 Thread Dumitrescu, Cristian


> -Original Message-
> From: Wodkowski, PawelX
> Sent: Wednesday, May 13, 2015 1:00 PM
> To: dev at dpdk.org
> Cc: Dumitrescu, Cristian
> Subject: [PATCH v2 0/2] cmdline: add polling mode for command line
> 
> This patchset adds the ability to process console input in the same thread
> as packet processing by using poll() function and fixes some minor issues.
> 
> v2 changes:
>  - add doxygen documentation for cmdline_poll()
>  - map file issue fixed
>  - use proper email address.
>  - add addtional missing include in cmdline_parse_ipaddr.h
> 
> Pawel Wodkowski (2):
>   cmdline: fix missing include files
>   cmdline: add polling mode for command line
> 
>  doc/api/doxy-api.conf  |  1 +
>  lib/librte_cmdline/cmdline.c   | 35
> ++
>  lib/librte_cmdline/cmdline.h   | 24 
>  lib/librte_cmdline/cmdline_parse_ipaddr.h  |  2 ++
>  lib/librte_cmdline/cmdline_rdline.h|  1 +
>  lib/librte_cmdline/cmdline_vt100.h |  2 ++
>  lib/librte_cmdline/rte_cmdline_version.map |  8 +++
>  7 files changed, 73 insertions(+)
> 
> --
> 1.9.1

Acked by: Cristian Dumitrescu 



[dpdk-dev] [PATCH 0/2] cmdline: add polling mode for command line

2015-05-12 Thread Dumitrescu, Cristian


> -Original Message-
> From: Pawel Wodkowski [mailto:pwodkowx at stargo]
> Sent: Tuesday, May 12, 2015 12:10 PM
> To: dev at dpdk.org
> Cc: Dumitrescu, Cristian; Jastrzebski, MichalX K
> Subject: [PATCH 0/2] cmdline: add polling mode for command line
> 
> This patchset adds the ability to process console input in the same thread
> as packet processing by using poll() function and fixes some minor issues.
> 
> Pawel Wodkowski (2):
>   cmdline: fix missing include files
>   cmdline: add polling mode for command line
> 
>  lib/librte_cmdline/cmdline.c   | 35
> ++
>  lib/librte_cmdline/cmdline.h   |  4 
>  lib/librte_cmdline/cmdline_rdline.h|  1 +
>  lib/librte_cmdline/cmdline_vt100.h |  2 ++
>  lib/librte_cmdline/rte_cmdline_version.map |  1 +
>  5 files changed, 43 insertions(+)
> 
> --
> 1.9.1

Acked by: Cristian Dumitrescu 

The existing cmdline API requires a thread to be completely consumed for 
running the CLI, so it is not possible to push any other tasks (typically slow 
tasks, control plane related) on the same thread. This is because the 
cmdline_interact() function implements an infinite loop internally, so once 
called after initialization, it only returns when the application is terminated 
(e.g. as a result of the quit CLI command).

This patch removes this limitation by providing an alternative function called 
cmdline_poll(), which allows the application to own the thread dispatch loop 
and work with the CLI in polling mode, hence making it possible for the 
application to push additional work on the same thread. The thread dispatch 
loop should be owned by the application, and not by the cmdline library.
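The resulting application-owned loop can be sketched with plain poll(2), which is what cmdline_poll() builds on; dispatch_once() and the fast-path placeholder below are illustrative, not the cmdline API:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* One iteration of an application-owned dispatch loop: do the fast-path
 * work unconditionally, then service the console only when input is
 * pending (timeout 0 keeps the poll() call non-blocking). */
static int
dispatch_once(int console_fd, char *buf, size_t len)
{
	struct pollfd pfd = { .fd = console_fd, .events = POLLIN };

	/* Fast-path work would run here, e.g. rte_pipeline_run(p). */

	if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN))
		return (int)read(console_fd, buf, len); /* CLI bytes read */
	return 0; /* no console input this iteration */
}
```

The application calls this from its own while-loop, so control-plane work shares the core with packet processing instead of being trapped inside cmdline_interact().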



[dpdk-dev] [PATCH 0/6] rte_sched: patches against 2.0

2015-05-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Wednesday, April 29, 2015 6:05 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [PATCH 0/6] rte_sched: patches against 2.0
> 
> This is a subset of earlier patches for QoS (rte_sched) subsystem.
> Only changes were to fix whitespace and update against DPDK 2.0
> 
> Stephen Hemminger (6):
>   rte_sched: make RED optional at runtime
>   rte_sched: expand scheduler hierarchy for more VLAN's
>   rte_sched: keep track of RED drops
>   rte_sched: allow reading without clearing
>   rte_sched: don't put tabs in log messages
>   rte_sched: use correct log level
> 
>  app/test/test_sched.c|   4 +-
>  examples/qos_sched/stats.c   |  16 --
>  lib/librte_mbuf/rte_mbuf.h   |   5 +-
>  lib/librte_sched/rte_sched.c | 113 --
> -
>  lib/librte_sched/rte_sched.h |  62 +++-
>  5 files changed, 130 insertions(+), 70 deletions(-)
> 
> --
> 2.1.4

Acked by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Thursday, April 30, 2015 1:16 PM
> To: dev at dpdk.org
> Cc: Wodkowski, PawelX
> Subject: [dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline
> ports and tables
> 
> From: Pawel Wodkowski 
> 
> This patch adds statistics collection for librte_pipeline.
> Those statistics are disabled by default at build time.
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  config/common_bsdapp   |1 +
>  config/common_linuxapp |1 +
>  lib/librte_pipeline/rte_pipeline.c |  185
> +---
>  lib/librte_pipeline/rte_pipeline.h |   99 +++
>  4 files changed, 275 insertions(+), 11 deletions(-)
> 
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index 1d0f5b2..e4f0bf5 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -414,6 +414,7 @@ CONFIG_RTE_TABLE_LPM_STATS_COLLECT=n
>  # Compile librte_pipeline
>  #
>  CONFIG_RTE_LIBRTE_PIPELINE=y
> +CONFIG_RTE_PIPELINE_STATS_COLLECT=n
> 
>  #
>  # Compile librte_kni
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 8b01ca9..05553d9 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -421,6 +421,7 @@ CONFIG_RTE_TABLE_LPM_STATS_COLLECT=n
>  # Compile librte_pipeline
>  #
>  CONFIG_RTE_LIBRTE_PIPELINE=y
> +CONFIG_RTE_PIPELINE_STATS_COLLECT=n
> 
>  #
>  # Compile librte_kni
> diff --git a/lib/librte_pipeline/rte_pipeline.c
> b/lib/librte_pipeline/rte_pipeline.c
> index 36d92c9..69bf003 100644
> --- a/lib/librte_pipeline/rte_pipeline.c
> +++ b/lib/librte_pipeline/rte_pipeline.c
> @@ -48,6 +48,17 @@
> 
>  #define RTE_TABLE_INVALID UINT32_MAX
> 
> +#ifdef RTE_PIPELINE_STATS_COLLECT
> +#define RTE_PIPELINE_STATS_ADD(counter, val) \
> + ({ (counter) += (val); })
> +
> +#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \
> + ({ (counter) += __builtin_popcountll(mask); })
> +#else
> +#define RTE_PIPELINE_STATS_ADD(counter, val)
> +#define RTE_PIPELINE_STATS_ADD_M(counter, mask)
> +#endif
> +
>  struct rte_port_in {
>   /* Input parameters */
>   struct rte_port_in_ops ops;
> @@ -63,6 +74,8 @@ struct rte_port_in {
> 
>   /* List of enabled ports */
>   struct rte_port_in *next;
> +
> + uint64_t n_pkts_dropped_by_ah;
>  };
> 
>  struct rte_port_out {
> @@ -74,6 +87,8 @@ struct rte_port_out {
> 
>   /* Handle to low-level port */
>   void *h_port;
> +
> + uint64_t n_pkts_dropped_by_ah;
>  };
> 
>  struct rte_table {
> @@ -90,6 +105,12 @@ struct rte_table {
> 
>   /* Handle to the low-level table object */
>   void *h_table;
> +
> + /* Stats for this table. */
> + uint64_t n_pkts_dropped_by_lkp_hit_ah;
> + uint64_t n_pkts_dropped_by_lkp_miss_ah;
> + uint64_t n_pkts_dropped_lkp_hit;
> + uint64_t n_pkts_dropped_lkp_miss;
>  };
> 
>  #define RTE_PIPELINE_MAX_NAME_SZ   124
> @@ -1040,6 +1061,8 @@ rte_pipeline_action_handler_port_bulk(struct
> rte_pipeline *p,
> 
>   port_out->f_action_bulk(p->pkts, &pkts_mask, port_out->arg_ah);
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |= pkts_mask ^ mask;
> + RTE_PIPELINE_STATS_ADD_M(port_out->n_pkts_dropped_by_ah,
> + pkts_mask ^ mask);
>   }
> 
>   /* Output port TX */
> @@ -1071,6 +1094,9 @@ rte_pipeline_action_handler_port(struct
> rte_pipeline *p, uint64_t pkts_mask)
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |=
>   (pkt_mask ^ 1LLU) << i;
> 
> + RTE_PIPELINE_STATS_ADD(port_out->n_pkts_dropped_by_ah,
> + pkt_mask ^ 1LLU);
> +
>   /* Output port TX */
>   if (pkt_mask != 0)
>   port_out->ops.f_tx(port_out-
> >h_port,
> @@ -1104,6 +1130,9 @@ rte_pipeline_action_handler_port(struct
> rte_pipeline *p, uint64_t pkts_mask)
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |=
>   (pkt_mask ^ 1LLU) << i;
> 
> + RTE_PIPELINE_STATS_ADD(port_out->n_pkts_dropped_by_ah,
> + pkt_mask ^ 1LLU);
> +
>   /* Output port TX */
>   if (pkt_mask != 0)
>   port_out->ops.f_tx(port_out-
> >h_port,
> @@ -1140,6 +1169,9 @@ rte_pipeline_action_handler_port_meta(struct
> rte_pipeline *p,
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |=
>   (pkt_mask ^ 1LLU) << i;
> 
> + RTE_PIPELINE_STATS_ADD(port_out->n_pkts_dropped_by_ah,
> +

[dpdk-dev] [PATCH v2 00/10] table: added table statistics

2015-05-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Thursday, April 30, 2015 1:14 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 00/10] table: added table statistics
> 
> From: Maciej Gajdzica 
> 
> Added statistics for every type of table. By default all table statistics
> are disabled, user must activate them in config file.
> 
> Maciej Gajdzica (10):
>   table: added structure for storing table stats
>   table: added acl table stats
>   table: added array table stats
>   table: added hash_ext table stats
>   table: added hash_key16 table stats
>   table: added hash_key32 table stats
>   table: added hash_key8 table stats
>   table: added hash_lru table stats
>   table: added lpm_ipv6 table stats
>   table: added lpm table stats
> 
>  config/common_bsdapp|9 ++
>  config/common_linuxapp  |9 ++
>  lib/librte_table/rte_table.h|   25 +++
>  lib/librte_table/rte_table_acl.c|   35 +
>  lib/librte_table/rte_table_array.c  |   34 +++-
>  lib/librte_table/rte_table_hash_ext.c   |   44
> ++
>  lib/librte_table/rte_table_hash_key16.c |   41
> 
>  lib/librte_table/rte_table_hash_key32.c |   41
> 
>  lib/librte_table/rte_table_hash_key8.c  |   52
> +++
>  lib/librte_table/rte_table_hash_lru.c   |   44
> ++
>  lib/librte_table/rte_table_lpm.c|   34 
>  lib/librte_table/rte_table_lpm_ipv6.c   |   34 
>  12 files changed, 401 insertions(+), 1 deletion(-)
> 
> --
> 1.7.9.5

Acked by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v2 00/13] port: added port statistics

2015-05-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Thursday, April 30, 2015 1:07 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 00/13] port: added port statistics
> 
> From: Maciej Gajdzica 
> 
> Added statistics for every type of port. By default all port statistics
> are disabled, user must activate them in config file.
> 
> Maciej Gajdzica (13):
>   port: added structures for port stats
>   port: added port_ethdev_reader stats
>   port: added port_ethdev_writer stats
>   port: added port_ethdev_writer_nodrop stats
>   port: added port_frag stats
>   port: added port_ras stats
>   port: added port_ring_reader stats
>   port: added port_ring_writer stats
>   port: added port_ring_writer_nodrop stats
>   port: added port_sched_reader stats
>   port: added port_sched_writer stats
>   port: added port_source stats
>   port: added port_sink stats
> 
>  config/common_bsdapp   |   12 
>  config/common_linuxapp |   12 
>  lib/librte_port/rte_port.h |   60 ++--
>  lib/librte_port/rte_port_ethdev.c  |  113
> +-
>  lib/librte_port/rte_port_frag.c|   36 ++
>  lib/librte_port/rte_port_ras.c |   38 ++
>  lib/librte_port/rte_port_ring.c|  118
> +++-
>  lib/librte_port/rte_port_sched.c   |   96 --
>  lib/librte_port/rte_port_source_sink.c |   98
> --
>  9 files changed, 566 insertions(+), 17 deletions(-)
> 
> --
> 1.7.9.5

Acked by: Cristian Dumitrescu  


[dpdk-dev] [PATCH v2 0/3] port: added frag_ipv6 and ras_ipv6 ports

2015-05-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Thursday, April 30, 2015 1:03 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/3] port: added frag_ipv6 and ras_ipv6 ports
> 
> From: Maciej Gajdzica 
> 
> Added ipv6 versions of ip fragmentation and ip reassembly ports.
> 
> Maciej Gajdzica (3):
>   port: removed IPV4_MTU_DEFAULT define
>   port: added ipv6 fragmentation port
>   port: added ipv6 reassembly port
> 
>  lib/librte_port/rte_port_frag.c |   67 --
>  lib/librte_port/rte_port_frag.h |9 ++-
>  lib/librte_port/rte_port_ras.c  |  142 
> ---
>  lib/librte_port/rte_port_ras.h  |9 ++-
>  4 files changed, 167 insertions(+), 60 deletions(-)
> 
> --
> 1.7.9.5

Acked by: Cristian Dumitrescu  


[dpdk-dev] [PATCH v2 0/3] port: added ethdev_writer_nodrop and ring_writer_nodrop ports

2015-05-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> Sent: Thursday, April 30, 2015 12:58 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/3] port: added ethdev_writer_nodrop and
> ring_writer_nodrop ports
> 
> From: Maciej Gajdzica 
> 
> When the nodrop writer port fails to send data, it retries until it reaches
> the maximum number of retries. Also added a new tx_bulk implementation for
> the ring writer
> port.
> 
> Maciej Gajdzica (3):
>   port: added WRITER_APPROACH == 1 implementation to ring port
>   port: added ethdev_writer_nodrop port
>   port: added ring_writer_nodrop port
> 
>  lib/librte_port/rte_port_ethdev.c |  230
> ++
>  lib/librte_port/rte_port_ethdev.h |   19 +++
>  lib/librte_port/rte_port_ring.c   |  285
> +
>  lib/librte_port/rte_port_ring.h   |   16 +++
>  4 files changed, 550 insertions(+)
> 
> --
> 1.7.9.5

Acked by: Cristian Dumitrescu 



[dpdk-dev] QoS Question

2015-04-24 Thread Dumitrescu, Cristian
Hi Greg,

Great question, thank you! Please see my comments below.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Greg Smith
> Sent: Monday, April 20, 2015 7:40 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] QoS Question
> 
> Hi DPDK team,
> 
> The docs on QoS
> (http://dpdk.org/doc/guides/prog_guide/qos_framework.html# ) describe
> the traffic class (TC) as follows:
> 1 - The  TCs of the same pipe handled in strict priority order.
> 2 - Upper limit enforced per TC at the pipe level.
> 3 - Lower priority TCs able to reuse pipe bandwidth currently unused by
> higher priority TCs.
> 4 - When subport TC is oversubscribed (configuration time event), pipe TC
> upper limit is capped to a dynamically adjusted value that is shared by all 
> the
> subport pipes.
> 
> Can someone describe how and when the TC upper limit is "dynamically"
> changed?

This feature is described at length in Programmer's Guide section 21.2.4.6.6. 
Subport Traffic Class Oversubscription. Please note this feature is not enabled 
by default in the code base. To enable it, please set flag 
CONFIG_RTE_SCHED_SUBPORT_TC_OV=y in the DPDK configuration file 
(config/common_linuxapp or config/common_bsdapp).

Subport traffic class oversubscription is implemented only for the lowest 
priority traffic class (TC3, a.k.a. Best Effort), as usually the Best Effort 
traffic class (where most of the traffic is) is oversubscribed, while the high 
priority traffic classes (e.g. voice) are usually fully provisioned. 
Overprovisioning takes place when the service provider sells more bandwidth 
than is physically available, i.e. when the sum of the _nominal_ rates 
assigned to the users exceeds the rate of the subport. This does not 
necessarily represent a problem, as only a fraction of the users are looking to 
fully utilize their service (i.e. use up 100% of their nominal rate) at any 
given time: when the current total demand from all subport users does not 
exceed the subport rate, no problem exists, as each subscriber has its demand 
fully serviced; when the current total demand (which changes dynamically) 
exceeds the limit, it is obviously no longer possible to fully meet the entire 
demand.

In the latter case, it is important that some fairness is enforced. We do 
not want to have some subscribers getting close to 0% of their nominal rate, 
while others getting close to 100% of their nominal rate, as this would not be 
fair. On the other hand, we cannot reduce the nominal rate of the users (e.g. 
everybody is now allowed 73% of their rate), as the nominal rate of a 
subscriber is completely disconnected from its current demand: one user might 
demand only 10% of its rate at this moment, so reserving 73% of its rate for 
this user results in wasting 63% of its rate, which could otherwise be awarded 
to some other user which has a higher demand at the same moment.

What we need to do is this: we need to apply a water filling algorithm that 
computes a user quota (common for all the subport users), so that users with 
current demand less than this quota will be fully serviced, while users with 
high demand will be truncated. This user quota is updated periodically, with 
the new value being estimated based on subport consumption from the past: when 
we see that the previous quota resulted in some subport bandwidth not being 
consumed, we increase the quota incrementally until the entire subport 
bandwidth is consumed; when we see that the entire subport bandwidth is 
consumed, we start dropping the quota incrementally until we see that some 
subport bandwidth starts to be wasted.
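The water-filling computation described above amounts to finding the largest common quota q such that servicing min(demand_i, q) for every user does not exceed the subport rate. The bisection sketch below is illustrative only; as noted above, the scheduler instead adjusts the quota incrementally between time windows:

```c
#include <assert.h>

/* Find the common user quota q so that servicing min(demand[i], q) for
 * every user just fills subport_rate (when total demand exceeds it);
 * otherwise q converges to the maximum demand and everyone is serviced. */
static double
waterfill_quota(const double *demand, int n, double subport_rate)
{
	double lo = 0.0, hi = 0.0, mid, served;
	int i, it;

	for (i = 0; i < n; i++)
		if (demand[i] > hi)
			hi = demand[i]; /* quota never needs to exceed this */

	for (it = 0; it < 60; it++) {
		mid = (lo + hi) / 2.0;
		served = 0.0;
		for (i = 0; i < n; i++)
			served += demand[i] < mid ? demand[i] : mid;
		if (served > subport_rate)
			hi = mid; /* quota too generous */
		else
			lo = mid; /* quota can grow */
	}
	return lo;
}
```

Scaling the demand-distribution example later in this message down from 500 users per group to one user each, demands {0, 0.4, 0.7, 1.0} Mbps against a 2 Mbps share yield a quota of 0.9 Mbps, matching the awarded rates in that example.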

> 
> For example, assume there's a 1Gb/s port and a single 1Gb/s subport and
> 2000 pipes each of 1Mb/s (total pipes = 2Gb/s which is > the 1Gb/s subport
> which I think means "oversubscribed" as used in the doc). Each Pipe has a
> single TC.

Yes, agree this is an example of oversubscription. I used a similar example to 
describe this feature during the DPDK community readout earlier this week 
(https://youtu.be/_PPklkWGugs). 


> In that case, would each pipe be shaped to an upper limit of 0.5 Mb/s?

Only in the very unlikely event that all the 2000 users are active and each one 
is asking for 0.5 Mbps or more.

Typically, some of these users are currently inactive (demand = 0%) and some 
others will ask for less than e.g. 0.5 Mbps; whatever subport bandwidth is left 
unused by the low-demand users can be awarded to the high-demand users (of 
course, no user will ever get more than its nominal rate).

Let's refine the example: let's say that, currently, the demand distribution 
for the 2000 users is: [500 users: 0 Mbps; 500 users: 0.4 Mbps; 500 users: 0.7 
Mbps; 500 users: 1 Mbps].
These users will be awarded the following rates: [500 users: 0 Mbps; 500 users: 
0.4 Mbps; 500 users: 0.7 Mbps; 500 users: 0.9 Mbps].
Basically, all users are fully serviced, except the users demanding 1 Mbps, 
which will be truncated to 

[dpdk-dev] DPDK Community Call - Software QoS

2015-04-24 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, April 24, 2015 1:17 PM
> To: Dumitrescu, Cristian
> Cc: O'Driscoll, Tim; dev at dpdk.org
> Subject: Re: DPDK Community Call - Software QoS
> 
> 2015-04-24 09:14, Dumitrescu, Cristian:
> > Hi Tim,
> >
> > Thank you for your help in recording this readout!
> > The slides that you attached to the email are dropped by the dpdk.org
> email server.
> 
> No, they are not dropped:
>   http://dpdk.org/ml/archives/dev/2015-April/016842.html
>   http://dpdk.org/ml/archives/dev/attachments/20150423/17a4d8de/
> attachment-0001.pdf
> 
> > Thomas,
> >
> > I suggest we setup a page on dpdk.org to keep track of all the community
> readouts,
> > we can store there the slides and a link to the Youtube video, when
> available.
> > What do you think?
> 
> I think first that we must focus on having a good and up-to-date official
> documentation.
> 
> The document you attached and the audio explanation are a snapshot which
> can be
> referenced in a page dedicated to not updated or external documents. So it
> will
> be important to put a date on these entries and sort them by date.
> 
> I suggest to create a page "discussion supports" which will be linked from
> http://dpdk.org/doc in the "Archives" section.

Yes, any section that provides a list of "discussions" works for me. I think we 
only need a one-liner per discussion: Date, topic, event, author, link to 
slides/doc, link to video. Thanks, Thomas!


[dpdk-dev] DPDK Community Call - Software QoS

2015-04-24 Thread Dumitrescu, Cristian
Hi Tim,

Thank you for your help in recording this readout! The slides that you attached 
to the email are dropped by the dpdk.org email server.

Thomas,

I suggest we setup a page on dpdk.org to keep track of all the community 
readouts, we can store there the slides and a link to the Youtube video, when 
available. What do you think?

Regards,
Cristian

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of O'Driscoll, Tim
> Sent: Thursday, April 23, 2015 2:24 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] DPDK Community Call - Software QoS
> 
> Thanks to Cristian for a great presentation on QoS on Tuesday. For anybody
> who missed it, the slides are attached and the video is available at:
> https://youtu.be/_PPklkWGugs
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of O'driscoll, Tim
> > Sent: Wednesday, April 1, 2015 4:41 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] DPDK Community Call - Software QoS
> >
> > Agenda:
> > - Software Quality of Service (Cristian Dumitrescu)
> >
> >
> > Tue, Apr 21, 4:00-5:30 PM GMT Daylight Time
> >
> > Please join my meeting from your computer, tablet or smartphone.
> > https://global.gotomeeting.com/join/293233645
> >
> > You can also dial in using your phone.
> > Ireland : +353 (0) 15 290 180
> > United States : +1 (312) 757-3126
> > Australia : +61 2 9087 3604
> > Austria : +43 (0) 7 2088 0034
> > Belgium : +32 (0) 28 08 9321
> > Canada : +1 (647) 497-9350
> > Denmark : +45 (0) 69 91 80 05
> > Finland : +358 (0) 931 58 1746
> > France : +33 (0) 182 880 780
> > Germany : +49 (0) 692 5736 7211
> > Italy : +39 0 699 26 68 58
> > Netherlands : +31 (0) 208 908 267
> > New Zealand : +64 (0) 9 442 7358
> > Norway : +47 21 54 32 44
> > Spain : +34 911 23 0850
> > Sweden : +46 (0) 853 527 835
> > Switzerland : +41 (0) 435 0006 96
> > United Kingdom : +44 (0) 20 3713 5028
> >
> > Access Code: 293-233-645
> > Audio PIN: Shown after joining the meeting
> >



[dpdk-dev] [PATCH] pipeline: add statistics for librte_pipeline ports and tables

2015-03-30 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 30, 2015 1:06 PM
> To: dev at dpdk.org
> Cc: Wodkowski, PawelX
> Subject: [dpdk-dev] [PATCH] pipeline: add statistics for librte_pipeline ports
> and tables
> 
> From: Pawel Wodkowski 
> 
> This patch adds statistics collection for librte_pipeline.
> Those statistics are disabled by default at build time.
> 
> This patchset depends on patchset:
> port: added port statistics
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  config/common_bsdapp   |   1 +
>  config/common_linuxapp |   1 +
>  lib/librte_pipeline/rte_pipeline.c | 185
> ++---
>  lib/librte_pipeline/rte_pipeline.h |  99 
>  4 files changed, 275 insertions(+), 11 deletions(-)
> 
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index 829591c..25147b8 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -413,6 +413,7 @@ CONFIG_RTE_TABLE_LPM_STATS_COLLECT=n
>  # Compile librte_pipeline
>  #
>  CONFIG_RTE_LIBRTE_PIPELINE=y
> +CONFIG_RTE_PIPELINE_STATS_COLLECT=n
> 
>  #
>  # Compile librte_kni
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 891b5da..1497aa0 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -420,6 +420,7 @@ CONFIG_RTE_TABLE_LPM_STATS_COLLECT=n
>  # Compile librte_pipeline
>  #
>  CONFIG_RTE_LIBRTE_PIPELINE=y
> +CONFIG_RTE_PIPELINE_STATS_COLLECT=n
> 
>  #
>  # Compile librte_kni
> diff --git a/lib/librte_pipeline/rte_pipeline.c
> b/lib/librte_pipeline/rte_pipeline.c
> index 36d92c9..69bf003 100644
> --- a/lib/librte_pipeline/rte_pipeline.c
> +++ b/lib/librte_pipeline/rte_pipeline.c
> @@ -48,6 +48,17 @@
> 
>  #define RTE_TABLE_INVALID UINT32_MAX
> 
> +#ifdef RTE_PIPELINE_STATS_COLLECT
> +#define RTE_PIPELINE_STATS_ADD(counter, val) \
> + ({ (counter) += (val); })
> +
> +#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \
> + ({ (counter) += __builtin_popcountll(mask); })
> +#else
> +#define RTE_PIPELINE_STATS_ADD(counter, val)
> +#define RTE_PIPELINE_STATS_ADD_M(counter, mask)
> +#endif
> +
>  struct rte_port_in {
>   /* Input parameters */
>   struct rte_port_in_ops ops;
> @@ -63,6 +74,8 @@ struct rte_port_in {
> 
>   /* List of enabled ports */
>   struct rte_port_in *next;
> +
> + uint64_t n_pkts_dropped_by_ah;
>  };
> 
>  struct rte_port_out {
> @@ -74,6 +87,8 @@ struct rte_port_out {
> 
>   /* Handle to low-level port */
>   void *h_port;
> +
> + uint64_t n_pkts_dropped_by_ah;
>  };
> 
>  struct rte_table {
> @@ -90,6 +105,12 @@ struct rte_table {
> 
>   /* Handle to the low-level table object */
>   void *h_table;
> +
> + /* Stats for this table. */
> + uint64_t n_pkts_dropped_by_lkp_hit_ah;
> + uint64_t n_pkts_dropped_by_lkp_miss_ah;
> + uint64_t n_pkts_dropped_lkp_hit;
> + uint64_t n_pkts_dropped_lkp_miss;
>  };
> 
>  #define RTE_PIPELINE_MAX_NAME_SZ   124
> @@ -1040,6 +1061,8 @@ rte_pipeline_action_handler_port_bulk(struct
> rte_pipeline *p,
> 
>   port_out->f_action_bulk(p->pkts, &pkts_mask, port_out->arg_ah);
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |= pkts_mask ^ mask;
> + RTE_PIPELINE_STATS_ADD_M(port_out->n_pkts_dropped_by_ah,
> +  pkts_mask ^ mask);
>   }
> 
>   /* Output port TX */
> @@ -1071,6 +1094,9 @@ rte_pipeline_action_handler_port(struct
> rte_pipeline *p, uint64_t pkts_mask)
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |=
>   (pkt_mask ^ 1LLU) << i;
> 
> + RTE_PIPELINE_STATS_ADD(port_out->n_pkts_dropped_by_ah,
> +  pkt_mask ^ 1LLU);
> +
>   /* Output port TX */
>   if (pkt_mask != 0)
>   port_out->ops.f_tx(port_out-
> >h_port,
> @@ -1104,6 +1130,9 @@ rte_pipeline_action_handler_port(struct
> rte_pipeline *p, uint64_t pkts_mask)
>   p->action_mask0[RTE_PIPELINE_ACTION_DROP] |=
>   (pkt_mask ^ 1LLU) << i;
> 
> + RTE_PIPELINE_STATS_ADD(port_out->n_pkts_dropped_by_ah,
> +  pkt_mask ^ 1LLU);
> +
>   /* Output port TX */
>   if (pkt_mask != 0)
>   port_out->ops.f_tx(port_out-
> >h_port,
> @@ -1140,6 +1169,9 @@ rte_pipeline_action_handler_port_meta(struct
> rte_pipeline *p,
>   p-
> >action_mask0[RTE_PIPELINE_ACTION_DROP] |=
>   (pkt_mask ^ 1LLU) << i;
> 
> + RTE_PIPELINE_STATS_ADD(port_out-
> 

[dpdk-dev] [PATCH 00/10] table: added table statistics

2015-03-30 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 30, 2015 12:42 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 00/10] table: added table statistics
> 
> Added statistics for every type of table. By default all table statistics
> are disabled; the user must activate them in the config file.
> 
> Maciej Gajdzica (10):
>   table: added structure for storing table stats
>   table: added acl table stats
>   table: added array table stats
>   table: added hash_ext table stats
>   table: added hash_key16 table stats
>   table: added hash_key32 table stats
>   table: added hash_key8 table stats
>   table: added hash_lru table stats
>   table: added lpm_ipv6 table stats
>   table: added lpm table stats
> 
>  config/common_bsdapp|9 ++
>  config/common_linuxapp  |9 ++
>  lib/librte_table/rte_table.h|   25 +++
>  lib/librte_table/rte_table_acl.c|   35 +
>  lib/librte_table/rte_table_array.c  |   34 +++-
>  lib/librte_table/rte_table_hash_ext.c   |   44
> ++
>  lib/librte_table/rte_table_hash_key16.c |   41
> 
>  lib/librte_table/rte_table_hash_key32.c |   41
> 
>  lib/librte_table/rte_table_hash_key8.c  |   52
> +++
>  lib/librte_table/rte_table_hash_lru.c   |   44
> ++
>  lib/librte_table/rte_table_lpm.c|   34 
>  lib/librte_table/rte_table_lpm_ipv6.c   |   34 
>  12 files changed, 401 insertions(+), 1 deletion(-)
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH 00/13] port: added port statistics

2015-03-30 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 30, 2015 11:29 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 00/13] port: added port statistics
> 
> Added statistics for every type of port. By default all port statistics
> are disabled; the user must activate them in the config file.
> 
> This patchset depends on two patchsets:
> port: added ethdev_writer_nodrop and ring_writer_nodrop ports
> port: added frag_ipv6 and ras_ipv6 ports
> 
> Maciej Gajdzica (13):
>   port: added structures for port stats
>   port: added port_ethdev_reader stats
>   port: added port_ethdev_writer stats
>   port: added port_ethdev_writer_nodrop stats
>   port: added port_frag stats
>   port: added port_ras stats
>   port: added port_ring_reader stats
>   port: added port_ring_writer stats
>   port: added port_ring_writer_nodrop stats
>   port: added port_sched_reader stats
>   port: added port_sched_writer stats
>   port: added port_source stats
>   port: added port_sink stats
> 
>  config/common_bsdapp   |   12 
>  config/common_linuxapp |   12 
>  lib/librte_port/rte_port.h |   60 ++--
>  lib/librte_port/rte_port_ethdev.c  |  113
> +-
>  lib/librte_port/rte_port_frag.c|   36 ++
>  lib/librte_port/rte_port_ras.c |   38 ++
>  lib/librte_port/rte_port_ring.c|  118
> +++-
>  lib/librte_port/rte_port_sched.c   |   96 --
>  lib/librte_port/rte_port_source_sink.c |   98
> --
>  9 files changed, 566 insertions(+), 17 deletions(-)
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH 0/3] port: added frag_ipv6 and ras_ipv6 ports

2015-03-30 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 30, 2015 11:15 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/3] port: added frag_ipv6 and ras_ipv6 ports
> 
> Added ipv6 versions of ip fragmentation and ip reassembly ports.
> 
> Maciej Gajdzica (3):
>   port: removed IPV4_MTU_DEFAULT define
>   port: added ipv6 fragmentation port
>   port: added ipv6 reassembly port
> 
>  lib/librte_port/rte_port_frag.c |   67 --
>  lib/librte_port/rte_port_frag.h |9 ++-
>  lib/librte_port/rte_port_ras.c  |  142 
> ---
>  lib/librte_port/rte_port_ras.h  |9 ++-
>  4 files changed, 167 insertions(+), 60 deletions(-)
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 


[dpdk-dev] [PATCH 0/3] port: added ethdev_writer_nodrop and ring_writer_nodrop ports

2015-03-30 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 30, 2015 10:57 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/3] port: added ethdev_writer_nodrop and
> ring_writer_nodrop ports
> 
> When a nodrop writer port fails to send data, it retries until it reaches the
> maximum number of retries. Also added a new tx_bulk implementation for the
> ring writer port.
> 
> Maciej Gajdzica (3):
>   port: added WRITER_APPROACH == 1 implementation to ring port
>   port: added ethdev_writer_nodrop port
>   port: added ring_writer_nodrop port
> 
>  lib/librte_port/rte_port_ethdev.c |  230
> ++
>  lib/librte_port/rte_port_ethdev.h |   19 +++
>  lib/librte_port/rte_port_ring.c   |  285
> +
>  lib/librte_port/rte_port_ring.h   |   16 +++
>  4 files changed, 550 insertions(+)
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 



[dpdk-dev] FlowDirector Rules for TX packets

2015-03-27 Thread Dumitrescu, Cristian
Hi Gal,

No, flow director is just an RX-side packet filtering NIC feature.

Regards,
Cristian

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Gal Sagie
> Sent: Friday, March 27, 2015 10:44 AM
> To: 
> Subject: [dpdk-dev] FlowDirector Rules for TX packets
> 
> Hello All,
> 
> I can define drop filter rules for RX packets using FlowDirector.
> Support for configuring this from CLI is also available using ethtool with
> the --config-ntuple option
> 
> I am wondering, does FlowDirector has any support to drop TX packets?
> (Meaning that i can define rules which are if matched in the TX side will
> not send the packets out)
> 
> At least from ethtool it doesn't seems like this is supported..
> Any idea?
> 
> Thanks
> Gal.


[dpdk-dev] [PATCH] table: fix a crash during key8 and key32 overload

2015-03-23 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 23, 2015 3:09 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] table: fix a crash during key8 and key32
> overload
> 
> hash_key8_ext and hash_key32_ext tables allocate cache entries to
> support table overload cases. A crash can occur when a cache entry is
> freed after use. The problem is in computing the index of the freed
> cache entry. The same issue for key16 was fixed with an earlier patch.
> 
> Signed-off-by: Maciej Gajdzica 
> ---
>  lib/librte_table/rte_table_hash_key32.c |5 ++---
>  lib/librte_table/rte_table_hash_key8.c  |5 ++---
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_table/rte_table_hash_key32.c
> b/lib/librte_table/rte_table_hash_key32.c
> index da0ce6a..6790594 100644
> --- a/lib/librte_table/rte_table_hash_key32.c
> +++ b/lib/librte_table/rte_table_hash_key32.c
> @@ -540,9 +540,8 @@ rte_table_hash_entry_delete_key32_ext(
> 
>   memset(bucket, 0,
>   sizeof(struct
> rte_bucket_4_32));
> - bucket_index = (bucket -
> - ((struct rte_bucket_4_32 *)
> - f->memory)) - f->n_buckets;
> + bucket_index = (((uint8_t *)bucket -
> + (uint8_t *)f->memory) / f->bucket_size) - f->n_buckets;
>   f->stack[f->stack_pos++] =
> bucket_index;
>   }
> 
> diff --git a/lib/librte_table/rte_table_hash_key8.c
> b/lib/librte_table/rte_table_hash_key8.c
> index 443ca7d..6803eb2 100644
> --- a/lib/librte_table/rte_table_hash_key8.c
> +++ b/lib/librte_table/rte_table_hash_key8.c
> @@ -528,9 +528,8 @@ rte_table_hash_entry_delete_key8_ext(
> 
>   memset(bucket, 0,
>   sizeof(struct
> rte_bucket_4_8));
> - bucket_index = (bucket -
> - ((struct rte_bucket_4_8 *)
> - f->memory)) - f->n_buckets;
> + bucket_index = (((uint8_t *)bucket -
> + (uint8_t *)f->memory) / f->bucket_size) - f->n_buckets;
>   f->stack[f->stack_pos++] =
> bucket_index;
>   }
> 
> --
> 1.7.9.5

Acked-by: Cristian Dumitrescu 

Thanks, Maciej!



[dpdk-dev] [PATCH] dpdk: fix a crash during rte_table_hash_key16_ext overload

2015-03-23 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of
> miroslaw.walukiewicz at intel.com
> Sent: Tuesday, March 3, 2015 2:16 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] dpdk: fix a crash during
> rte_table_hash_key16_ext overload
> 
> From: Miroslaw Walukiewicz 
> 
> The hash_key16_ext table allocates cache entries to support
> table overload cases.
> 
> A crash can occur when a cache entry is freed after use. The problem
> is in computing the index of the freed cache entry.
> 
> The patch fixes the problem.
> 
> Signed-off-by: Mirek Walukiewicz 
> ---
>  lib/librte_table/rte_table_hash_key16.c |5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_table/rte_table_hash_key16.c
> b/lib/librte_table/rte_table_hash_key16.c
> index ee5f639..e0c99bd 100644
> --- a/lib/librte_table/rte_table_hash_key16.c
> +++ b/lib/librte_table/rte_table_hash_key16.c
> @@ -535,9 +535,8 @@ rte_table_hash_entry_delete_key16_ext(
> 
>   memset(bucket, 0,
>   sizeof(struct
> rte_bucket_4_16));
> - bucket_index = (bucket -
> - ((struct rte_bucket_4_16 *)
> - f->memory)) - f->n_buckets;
> + bucket_index = (((uint8_t *)bucket -
> + (uint8_t *)f->memory) / f->bucket_size) - f->n_buckets;
>   f->stack[f->stack_pos++] =
> bucket_index;
>   }
> 

Acked-by: Cristian Dumitrescu 

Mirek, an identical issue exists at the same place in rte_table_hash_key8.c and 
rte_table_hash_key32.c; would you please submit the same fix for those as well?

Thanks,
Cristian



[dpdk-dev] Interactive/dynamic QoS scheduler

2015-03-19 Thread Dumitrescu, Cristian
Hi Alexandre,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alexandre Frigon
> Sent: Wednesday, March 18, 2015 5:10 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Interactive/dynamic QoS scheduler
> 
> Hi,
> 
> I'm currently working with the QoS API. I would like to know if it is
> possible to interact at runtime with the parameters applied to pipes,
> subports and queues.

Yes, it is. You can call API functions rte_sched_subport_config() and 
rte_sched_pipe_config() at run-time to re-configure the subport and pipe rates.

This assumes that the port has been configured during initialization, with all 
the port subports and pipes fully provisioned at this time (through the port 
configuration parameters n_subports_per_port and n_pipes_per_subport). Please 
note you need to call the subport and pipe config functions before you put any 
packets in their associated queues (typically, these functions are first called 
at initialization for all the port subports and pipes), but you can call them 
again later on to modify their previously assigned rates.

All the pipe profiles are set in stone once the port is initialized (through 
port configuration parameter pipe_profiles), what you can do at run-time is to 
re-assign a pipe to a different pre-defined profile, which basically changes 
the rate of the pipe.

> 
> Looking at the Scheduler sample, I understand that all of these parameters
> are predetermined using profiles in the config file.
> What I would like to do is interact with the scheduler to change subport, pipe
> and tc rate or size with arbitrary value at runtime to have control over 
> traffic.
> 

This is just a limitation of the qos_sched application, which currently does 
not provide any mechanism for the dynamic update of the subport/pipe parameters.

> For example:
> The scheduler sample is running and at any given time I want a number of
> pipes to drop from 10 Gbit to 4.5 Gbit and then put it back to 8 Gbit.
> Profiles are not useful in this case because I would have to set a different
> profile for each value from 1 bit to 10 Gbit.
> 
> Is it actually possible to do that and how?
> 

Yes, it is relatively simple to do it. The qos_sched application has a CLI that 
only implements some stats-related commands, but new commands could be added to 
update the subport/pipe configuration. At this point, you need to write that 
(relatively low complexity) code yourself unfortunately. Please consider 
developing a code patch to add this support and share it with the DPDK 
community!

> Thanks,
> Alexandre F.



[dpdk-dev] [PATCH v2 5/6] rte_sched: allow reading statistics without clearing

2015-03-16 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Monday, March 16, 2015 8:11 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [PATCH v2 5/6] rte_sched: allow reading statistics without
> clearing
> 
> On Mon, 16 Mar 2015 19:58:51 +0000
> "Dumitrescu, Cristian"  wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > > Sent: Thursday, March 12, 2015 11:06 PM
> > > To: Dumitrescu, Cristian
> > > Cc: dev at dpdk.org; Stephen Hemminger
> > > Subject: Re: [PATCH v2 5/6] rte_sched: allow reading statistics without
> > > clearing
> > >
> > > On Thu, 12 Mar 2015 19:28:11 +
> > > "Dumitrescu, Cristian"  wrote:
> > >
> > > > Hi Stephen,
> > > >
> > > > Thank you for adding flexibility over clearing the stats or not.
> > > >
> > > > I have one concern though: why change the stats read API to add a
> clear
> > > parameter rather than keep prototype for the stats functions unchanged
> and
> > > add the flag as part of the port creation parameters in struct
> > > rte_sched_port_params? This parameter could be saved into the internal
> > > struct rte_sched_port, which is passed (as a pointer) to the stats read
> > > functions. In my opinion, this approach is slightly more elegant and it
> keeps
> > > the changes to a minimum.
> > > >
> > > > int
> > > > rte_sched_queue_read_stats(struct rte_sched_port *port,
> > > > uint32_t queue_id,
> > > > struct rte_sched_queue_stats *stats,
> > > > uint16_t *qlen)
> > > > {
> > > > ...
> > > > if (port->clear_stats_on_read)
> > > > memset(...);
> > > > }
> > > >
> > > > I made this suggestion during the previous round, but I did not get any
> > > opinion from you on it yet.
> > > >
> > > > Regards,
> > > > Cristian
> > >
> > > I rejected the config parameter idea because I don't like it is 
> > > inconsistent
> > > with other statistics in DPDK and in related software. There is not a
> > > config parameter that changes what BSD or Linux kernel API does.
> >
> > Your approach has the advantage of being able to clear/not clear the stats
> per each read operation rather than configuring the behavior globally. I think
> this approach allows for the ultimate flexibility, so I am OK to go with it.
> >
> > >
> > > The only reason for keeping the read and clear in one operation is
> > > because you like it, and that somebody might have built code
> > > that expects it.
> > >
> >
> > Clearing the stats with a delay after the stats have been read is prone to a
> race condition, as during this time more packets could be processed, and
> these packets will not show up in the counters that the user read.
> > I think it depends on the needs of each particular application whether this
> race condition is important or not: if the counters are read rarely (e.g. once
> per day) and only coarse accuracy is required, the error is probably
> irrelevant; if the app is looking for fine-grained accuracy (e.g. rate
> measurement, debugging, etc.), then the error is not acceptable. You seem
> to favour the former and ignore the latter case.
> >
> >
> > > Changing the function signature is a nice red flag so that people will
> > > notice the change.
> >
> > There is a small API change here. I am OK with it for the above reasons,
> provided that there are no objections on this from other contributors.
> >
> 
> If you really wanted a fast/safe read and clear operation, how about using
> exchange instruction to exchange in a zero?

When stats read is decoupled from stats clear, the exchange operation cannot 
fix this.

To your point: when stats are cleared on read, would it make sense to use the 
exchange operation? In my opinion, it does not make sense, as the rte_sched API 
is not thread safe (for performance reasons), which is explicitly stated. As 
clearly documented, the scheduler enqueue and dequeue operations need to run on 
the same CPU core. Therefore, IMHO having a thread-safe stats read function 
while the main functional API is not thread safe is not going to add value.



[dpdk-dev] [PATCH v2 5/6] rte_sched: allow reading statistics without clearing

2015-03-16 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, March 10, 2015 4:14 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger; Stephen Hemminger
> Subject: [PATCH v2 5/6] rte_sched: allow reading statistics without clearing
> 
> From: Stephen Hemminger 
> 
> The rte_sched statistics API should allow reading statistics without
> clearing. Make auto-clear optional.
> 
> Signed-off-by: Stephen Hemminger 
> ---
> v2 -- change to flag for clear from separate read/clear
> 
>  app/test/test_sched.c|  4 ++--
>  examples/qos_sched/stats.c   | 16 +++-
>  lib/librte_sched/rte_sched.c | 44 ++-
> -
>  lib/librte_sched/rte_sched.h | 18 ++
>  4 files changed, 45 insertions(+), 37 deletions(-)
> 
> diff --git a/app/test/test_sched.c b/app/test/test_sched.c
> index 60c62de..3c9c9e3 100644
> --- a/app/test/test_sched.c
> +++ b/app/test/test_sched.c
> @@ -208,13 +208,13 @@ test_sched(void)
> 
>   struct rte_sched_subport_stats subport_stats;
>   uint32_t tc_ov;
> - rte_sched_subport_read_stats(port, SUBPORT, &subport_stats, &tc_ov);
> + rte_sched_subport_read_stats(port, SUBPORT, &subport_stats, &tc_ov, 1);
>  #if 0
>   TEST_ASSERT_EQUAL(subport_stats.n_pkts_tc[TC-1], 10, "Wrong
> subport stats\n");
>  #endif
>   struct rte_sched_queue_stats queue_stats;
>   uint16_t qlen;
> - rte_sched_queue_read_stats(port, QUEUE, &queue_stats, &qlen);
> + rte_sched_queue_read_stats(port, QUEUE, &queue_stats, &qlen, 1);
>  #if 0
>   TEST_ASSERT_EQUAL(queue_stats.n_pkts, 10, "Wrong queue
> stats\n");
>  #endif
> diff --git a/examples/qos_sched/stats.c b/examples/qos_sched/stats.c
> index b4db7b5..88caa54 100644
> --- a/examples/qos_sched/stats.c
> +++ b/examples/qos_sched/stats.c
> @@ -61,7 +61,7 @@ qavg_q(uint8_t port_id, uint32_t subport_id, uint32_t
> pipe_id, uint8_t tc, uint8
>  average = 0;
> 
>  for (count = 0; count < qavg_ntimes; count++) {
> -rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
> +rte_sched_queue_read_stats(port, queue_id, &stats, &qlen, 1);
>  average += qlen;
>  usleep(qavg_period);
>  }
> @@ -99,7 +99,9 @@ qavg_tcpipe(uint8_t port_id, uint32_t subport_id,
> uint32_t pipe_id, uint8_t tc)
>  for (count = 0; count < qavg_ntimes; count++) {
>  part_average = 0;
>  for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
> -rte_sched_queue_read_stats(port, queue_id + (tc *
> RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + i), &stats, &qlen);
> +rte_sched_queue_read_stats(port,
> +queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + i),
> +&stats, &qlen, 1);
>  part_average += qlen;
>  }
>  average += part_average /
> RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
> @@ -138,7 +140,8 @@ qavg_pipe(uint8_t port_id, uint32_t subport_id,
> uint32_t pipe_id)
>  for (count = 0; count < qavg_ntimes; count++) {
>  part_average = 0;
>  for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
> RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
> -rte_sched_queue_read_stats(port, queue_id + i, &stats, &qlen);
> +rte_sched_queue_read_stats(port, queue_id + i,
> +&stats, &qlen, 1);
>  part_average += qlen;
>  }
>  average += part_average /
> (RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
> RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
> @@ -178,7 +181,9 @@ qavg_tcsubport(uint8_t port_id, uint32_t subport_id,
> uint8_t tc)
>  queue_id = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
> RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS * (subport_id *
> port_params.n_pipes_per_subport + i);
> 
>  for (j = 0; j < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; 
> j++) {
> -rte_sched_queue_read_stats(port, queue_id + (tc *
> RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + j), &stats, &qlen);
> +rte_sched_queue_read_stats(port,
> +queue_id + (tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + j),
> +&stats, &qlen, 1);
>  part_average += qlen;
>  }
>  }
> @@ -220,7 +225,8 @@ qavg_subport(uint8_t port_id, uint32_t 

[dpdk-dev] [PATCH v2 5/6] rte_sched: allow reading statistics without clearing

2015-03-16 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Thursday, March 12, 2015 11:06 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [PATCH v2 5/6] rte_sched: allow reading statistics without
> clearing
> 
> On Thu, 12 Mar 2015 19:28:11 +0000
> "Dumitrescu, Cristian"  wrote:
> 
> > Hi Stephen,
> >
> > Thank you for adding flexibility over clearing the stats or not.
> >
> > I have one concern though: why change the stats read API to add a clear
> parameter rather than keep prototype for the stats functions unchanged and
> add the flag as part of the port creation parameters in struct
> rte_sched_port_params? This parameter could be saved into the internal
> struct rte_sched_port, which is passed (as a pointer) to the stats read
> functions. In my opinion, this approach is slightly more elegant and it keeps
> the changes to a minimum.
> >
> > int
> > rte_sched_queue_read_stats(struct rte_sched_port *port,
> > uint32_t queue_id,
> > struct rte_sched_queue_stats *stats,
> > uint16_t *qlen)
> > {
> > ...
> > if (port->clear_stats_on_read)
> > memset(...);
> > }
> >
> > I made this suggestion during the previous round, but I did not get any
> opinion from you on it yet.
> >
> > Regards,
> > Cristian
> 
> I rejected the config parameter idea because I don't like it; it is inconsistent
> with other statistics in DPDK and in related software. There is no
> config parameter that changes what the BSD or Linux kernel API does.

Your approach has the advantage of being able to clear/not clear the stats per 
each read operation rather than configuring the behavior globally. I think this 
approach allows for the ultimate flexibility, so I am OK to go with it.

> 
> The only reason for keeping the read and clear in one operation is
> because you like it, and that somebody might have built code
> that expects it.
>

Clearing the stats with a delay after the stats have been read is prone to a 
race condition, as during this time more packets could be processed, and these 
packets will not show up in the counters that the user read.
I think it depends on the needs of each particular application whether this race 
condition is important or not: if the counters are read rarely (e.g. once per 
day) and only coarse accuracy is required, the error is probably irrelevant; if 
the app is looking for fine-grained accuracy (e.g. rate measurement, debugging, 
etc.), then the error is not acceptable. You seem to favour the former and 
ignore the latter case.


> Changing the function signature is a nice red flag so that people will
> notice the change.

There is a small API change here. I am OK with it for the above reasons, 
provided that there are no objections on this from other contributors.



[dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

2015-02-25 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, February 24, 2015 8:07 PM
> To: Stephen Hemminger
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when
> read
> 
> 2015-02-24 11:18, Stephen Hemminger:
> > On Mon, 23 Feb 2015 23:51:31 +
> > Thomas Monjalon  wrote:
> >
> > > 2015-02-05 07:43, Neil Horman:
> > > > On Wed, Feb 04, 2015 at 10:13:58PM -0800, Stephen Hemminger wrote:
> > > > > +
> > > > > +/**
> > > > > + * Hierarchical scheduler subport statistics reset
> > > > > + *
> > > > > + * @param port
> > > > > + *   Handle to port scheduler instance
> > > > > + * @param subport_id
> > > > > + *   Subport ID
> > > > > + * @return
> > > > > + *   0 upon success, error code otherwise
> > > > > + */
> > > > > +int
> > > > > +rte_sched_subport_stats_reset(struct rte_sched_port *port,
> > > > > +   uint32_t subport_id);
> > > > > +
> > > > >  /**
> > > > >   * Hierarchical scheduler queue statistics read
> > > > >   *
> > > > > @@ -338,6 +353,20 @@ rte_sched_queue_read_stats(struct
> rte_sched_port *port,
> > > > >   struct rte_sched_queue_stats *stats,
> > > > >   uint16_t *qlen);
> > > > >
> > > > > +/**
> > > > > + * Hierarchical scheduler queue statistics reset
> > > > > + *
> > > > > + * @param port
> > > > > + *   Handle to port scheduler instance
> > > > > + * @param queue_id
> > > > > + *   Queue ID within port scheduler
> > > > > + * @return
> > > > > + *   0 upon success, error code otherwise
> > > > > + */
> > > > > +int
> > > > > +rte_sched_queue_stats_reset(struct rte_sched_port *port,
> > > > > + uint32_t queue_id);
> > > > > +
> > > > Both need to be added to the version map to expose them properly.
> > > > Neil
> > >
> > > Stephen, this patchset is partially acked and could enter in 2.0.0-rc1.
> > > May you send a v3 addressing comments? Or should I break the serie by
> > > applying only some of them? Or postpone the serie to 2.1?
> >
> > I can resend v3. Wasn't clear that a conclusion was reached.
> > IMHO read should not clear.
> 
> Me too. I'm just saying that I cannot apply anything.
> So you have to decide the strategy to adopt for your patches.

How about my latest proposal to have the stats read functions either reset the 
counters or not, based on init-time user configuration? I did not see any reply 
on this.

Maybe you guys missed my reply, I am pasting it below:

"Personally, I think we should avoid proliferating the number of stats 
functions, I would keep a single set of stats read functions, which can clear 
the stats or not, depending on behaviour configured per rte_sched object at 
creation time. Basically, based on the value of configuration parameter struct 
rte_sched_params::clear_stats_on_reset, the stats read functions do clear the 
counters or not. In my opinion, this allows a clean init-time selection of the 
required behaviour, and it also provides backward compatibility. Any issues 
with this approach?"



[dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

2015-02-23 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, February 21, 2015 1:53 AM
> To: Dumitrescu, Cristian
> Cc: Thomas Monjalon; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when
> read
> 
> On Fri, 20 Feb 2015 21:28:55 +0000
> "Dumitrescu, Cristian"  wrote:
> 
> > Agree.
> > Stephen, how about a run-time solution (I agree it would be much better,
> why did I not consider this in the first place?) of adding a new bool 
> parameter
> in struct rte_sched_port_params: clear_stats_on_reset?
> > Both stats read functions get the port handle (struct rte_sched_port *) as
> a parameter, so there should be no ripple effect in propagating this flag.
> 
> Why not read_and_clear function if absolutely necessary.

I am not sure I understand what you mean. Are you suggesting a different set of 
stats read functions: one that reads and clears, and another that reads 
without clearing?

Personally, I think we should avoid proliferating the number of stats 
functions. I would keep a single set of stats read functions, which can either 
clear the stats or not, depending on the behaviour configured per rte_sched 
object at creation time. Basically, based on the value of the configuration 
parameter struct rte_sched_port_params::clear_stats_on_reset, the stats read 
functions either clear the counters or leave them untouched. In my opinion, 
this allows a clean init-time selection of the required behaviour, and it also 
preserves backward compatibility. Any issues with this approach?
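The init-time selection described above can be sketched as follows. This is an editor's illustration using simplified stand-in types and names (not the actual rte_sched API):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the rte_sched types; the names and the
 * clear_stats_on_read flag mirror the proposal, not shipped DPDK code. */
struct sched_stats {
	uint64_t n_pkts;
	uint64_t n_pkts_dropped;
};

struct sched_port {
	int clear_stats_on_read;	/* fixed once, at port creation */
	struct sched_stats stats;
};

/* One read function for both behaviours: the flag chosen at creation
 * time decides whether reading also zeroes the counters. */
static void sched_stats_read(struct sched_port *port, struct sched_stats *out)
{
	*out = port->stats;
	if (port->clear_stats_on_read)
		memset(&port->stats, 0, sizeof(port->stats));
}

/* Read twice and return what the second read sees. */
static uint64_t read_twice(int clear_on_read)
{
	struct sched_port p = { clear_on_read, { 10, 2 } };
	struct sched_stats s;

	sched_stats_read(&p, &s);	/* first read always sees 10 packets */
	sched_stats_read(&p, &s);	/* 0 if cleared on read, 10 if not */
	return s.n_pkts;
}
```

With clear-on-read enabled, the second read sees zeroed counters; with it disabled, reads are idempotent, which is the behaviour Stephen's scripts rely on.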




[dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, February 20, 2015 9:01 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when
> read
> 
> 2015-02-20 20:23, Dumitrescu, Cristian:
> > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > > On Fri, 20 Feb 2015 18:32:03 +
> > > "Dumitrescu, Cristian"  wrote:
> > >
> > > > Stephen, I suggest adding a new build-time configuration option for the
> > > > librte_sched library in config/common_* files:
> > > > CONFIG_RTE_SCHED_STATS_CLEAR_ON_READ.
> > >
> > > Build time config options do not work for distributions.
> >
> > Why?
> >
> > This does not affect the API, as the new API functions are always compiled
> > in, and the prototypes are not changed, and no data structures are
> affected.
> 
> Behaviour is an important part of the API. Think comments as part of the API.

Makes sense.

> 
> > This only changes the behavior of certain functions, so that user can
> > select which mode it needs.
> 
> When user doesn't or cannot rebuild, he has no choice.
> 
> > It also preserves backward compatibility.
> >
> > We have so many compilation options in config file, why is this one
> different?
> 
> We must remove and avoid build-time options.
> The only ones which might be acceptable are the ones which allow more
> performance by disabling some features.

Agree.
Stephen, how about a run-time solution (I agree it would be much better; why 
did I not consider this in the first place?): adding a new bool parameter, 
clear_stats_on_reset, to struct rte_sched_port_params?
Both stats read functions get the port handle (struct rte_sched_port *) as a 
parameter, so there should be no ripple effect from propagating this flag.

> 
> > This e-mail and any attachments may contain confidential material for the
> > sole use of the intended recipient(s). Any review or distribution by others
> > is strictly prohibited. If you are not the intended recipient, please 
> > contact
> > the sender and delete all copies.
> 
> Please ask to your administrator to remove this disclaimer.

I did earlier, this is still work in progress unfortunately.

--
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). Any review or distribution by others is 
strictly prohibited. If you are not the intended recipient, please contact the 
sender and delete all copies.




[dpdk-dev] RTE_PIPELINE_ACTION_PORT_META doesn't work properly

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, February 20, 2015 8:47 PM
> To: Dumitrescu, Cristian; imustafin at bk.ru
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] RTE_PIPELINE_ACTION_PORT_META doesn't work
> properly
> 
> 2015-02-20 19:26, Dumitrescu, Cristian:
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of ??
> 
> > > Sent: Friday, February 20, 2015 4:25 PM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] RTE_PIPELINE_ACTION_PORT_META doesn't work
> > > properly
> > >
> > >  Hi, list!
> > >
> > > RTE_PIPELINE_ACTION_PORT_META option in rte_pipeline library
> doesn't
> > > work for non-default table entries. Is this bug or correct behaviour?
> > >
> > > This is my patch, that changes this behaviour:
> > >
> > > --- a/lib/librte_pipeline/rte_pipeline.c
> > > +++ b/lib/librte_pipeline/rte_pipeline.c
> > > @@ -999,8 +999,10 @@ rte_pipeline_compute_masks(struct
> rte_pipeline
> > > *p, uint64_t pkts_mask)
> > >  {
> > > p->action_mask1[RTE_PIPELINE_ACTION_DROP] = 0;
> > > p->action_mask1[RTE_PIPELINE_ACTION_PORT] = 0;
> > > +p->action_mask1[RTE_PIPELINE_ACTION_PORT_META] = 0;
> > > p->action_mask1[RTE_PIPELINE_ACTION_TABLE] = 0;
> > >
> > > +
> > > if ((pkts_mask & (pkts_mask + 1)) == 0) {
> > > uint64_t n_pkts = __builtin_popcountll(pkts_mask);
> > > uint32_t i;
> > > @@ -1224,6 +1226,7 @@ rte_pipeline_run(struct rte_pipeline *p)
> > > pkts_mask = RTE_LEN2MASK(n_pkts, uint64_t);
> > > p->action_mask0[RTE_PIPELINE_ACTION_DROP] = 0;
> > > p->action_mask0[RTE_PIPELINE_ACTION_PORT] = 0;
> > > +p->action_mask0[RTE_PIPELINE_ACTION_PORT_META] = 0;
> > > p->action_mask0[RTE_PIPELINE_ACTION_TABLE] = 0;
> > >
> > > /* Input port user actions */
> > > @@ -1300,6 +1303,9 @@ rte_pipeline_run(struct rte_pipeline *p)
> > > p->action_mask0[RTE_PIPELINE_ACTION_PORT] 
> > > |=
> > > p->action_mask1[
> > > RTE_PIPELINE_ACTION_PORT];
> > > +p->action_mask0[RTE_PIPELINE_ACTION_PORT_META] |=
> > > +p->action_mask1[
> > > +RTE_PIPELINE_ACTION_PORT_META];
> > > 
> > > p->action_mask0[RTE_PIPELINE_ACTION_TABLE] |=
> > > p->action_mask1[
> > > 
> > > RTE_PIPELINE_ACTION_TABLE];
> > >
> > >
> > >
> > >
> > >
> > > Thanks, Ildar
> >
> > Acked by: Cristian Dumitrescu 
> 
> No Cristian, you cannot ack this patch, even if it fixes the problem.
> The format is not OK: indent is wrong and it's not signed.
> Please Ildar, check http://dpdk.org/dev#send to know how to submit a
> patch.
> 
> > Thank you, Ildar!

Sorry, Thomas.
Ildar, please fix the signature (the From line above currently shows only the 
garbled "On Behalf Of ??") and the indentation, then resubmit.

Regards,
Cristian





[dpdk-dev] RTE_PIPELINE_ACTION_PORT_META doesn't work properly

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of ?? 
> Sent: Friday, February 20, 2015 4:25 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] RTE_PIPELINE_ACTION_PORT_META doesn't work
> properly
> 
>  Hi, list!
> 
> RTE_PIPELINE_ACTION_PORT_META option in rte_pipeline library doesn't
> work for non-default table entries. Is this bug or correct behaviour?
> 
> This is my patch, that changes this behaviour:
> 
> --- a/lib/librte_pipeline/rte_pipeline.c
> +++ b/lib/librte_pipeline/rte_pipeline.c
> @@ -999,8 +999,10 @@ rte_pipeline_compute_masks(struct rte_pipeline
> *p, uint64_t pkts_mask)
>  {
>  	p->action_mask1[RTE_PIPELINE_ACTION_DROP] = 0;
>  	p->action_mask1[RTE_PIPELINE_ACTION_PORT] = 0;
> +	p->action_mask1[RTE_PIPELINE_ACTION_PORT_META] = 0;
>  	p->action_mask1[RTE_PIPELINE_ACTION_TABLE] = 0;
> 
> +
>  	if ((pkts_mask & (pkts_mask + 1)) == 0) {
>  		uint64_t n_pkts = __builtin_popcountll(pkts_mask);
>  		uint32_t i;
> @@ -1224,6 +1226,7 @@ rte_pipeline_run(struct rte_pipeline *p)
>  	pkts_mask = RTE_LEN2MASK(n_pkts, uint64_t);
>  	p->action_mask0[RTE_PIPELINE_ACTION_DROP] = 0;
>  	p->action_mask0[RTE_PIPELINE_ACTION_PORT] = 0;
> +	p->action_mask0[RTE_PIPELINE_ACTION_PORT_META] = 0;
>  	p->action_mask0[RTE_PIPELINE_ACTION_TABLE] = 0;
> 
>  	/* Input port user actions */
> @@ -1300,6 +1303,9 @@ rte_pipeline_run(struct rte_pipeline *p)
>  		p->action_mask0[RTE_PIPELINE_ACTION_PORT] |=
>  			p->action_mask1[RTE_PIPELINE_ACTION_PORT];
> +		p->action_mask0[RTE_PIPELINE_ACTION_PORT_META] |=
> +			p->action_mask1[RTE_PIPELINE_ACTION_PORT_META];
>  		p->action_mask0[RTE_PIPELINE_ACTION_TABLE] |=
>  			p->action_mask1[RTE_PIPELINE_ACTION_TABLE];
> 
> 
> 
> 
> 
> Thanks, Ildar

Acked by: Cristian Dumitrescu 

Thank you, Ildar!





[dpdk-dev] [PATCH v2 5/7] rte_sched: don't put tabs in log messages

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Thursday, February 5, 2015 6:14 AM
> To: dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: [dpdk-dev] [PATCH v2 5/7] rte_sched: don't put tabs in log
> messages
> 
> From: Stephen Hemminger 
> 
> syslog does not like tabs in log messages; tab gets translated to #011
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_sched/rte_sched.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index d891e50..55fbc14 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -495,10 +495,10 @@ rte_sched_port_log_pipe_profile(struct
> rte_sched_port *port, uint32_t i)
>   struct rte_sched_pipe_profile *p = port->pipe_profiles + i;
> 
>   RTE_LOG(INFO, SCHED, "Low level config for pipe profile %u:\n"
> - "\tToken bucket: period = %u, credits per period = %u, size =
> %u\n"
> - "\tTraffic classes: period = %u, credits per period = [%u, %u,
> %u, %u]\n"
> - "\tTraffic class 3 oversubscription: weight = %hhu\n"
> - "\tWRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu,
> %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu]\n",
> + "Token bucket: period = %u, credits per period = %u, size =
> %u\n"
> + "Traffic classes: period = %u, credits per period = [%u, %u,
> %u, %u]\n"
> + "Traffic class 3 oversubscription: weight = %hhu\n"
> + "WRR cost: [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu,
> %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu], [%hhu, %hhu, %hhu, %hhu]\n",
>   i,
> 
>   /* Token bucket */
> @@ -716,9 +716,9 @@ rte_sched_port_log_subport_config(struct
> rte_sched_port *port, uint32_t i)
>   struct rte_sched_subport *s = port->subport + i;
> 
>   RTE_LOG(INFO, SCHED, "Low level config for subport %u:\n"
> - "\tToken bucket: period = %u, credits per period = %u, size =
> %u\n"
> - "\tTraffic classes: period = %u, credits per period = [%u, %u,
> %u, %u]\n"
> - "\tTraffic class 3 oversubscription: wm min = %u, wm max =
> %u\n",
> + "Token bucket: period = %u, credits per period = %u, size =
> %u\n"
> + "Traffic classes: period = %u, credits per period = [%u, %u,
> %u, %u]\n"
> + "Traffic class 3 oversubscription: wm min = %u, wm max =
> %u\n",
>   i,
> 
>   /* Token bucket */
> --
> 2.1.4

Acked by: Cristian Dumitrescu 





[dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Monday, February 9, 2015 10:55 PM
> To: Dumitrescu, Cristian
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when
> read
> 
> On Mon, 9 Feb 2015 22:48:36 +0000
> "Dumitrescu, Cristian"  wrote:
> 
> > Hi Stephen,
> >
> > What is the reason not to clear statistics on read? Do you have a use-case /
> justification for it?
> >
> > (BTW, I see you added the reset functions, but was it also your intention to
> remove the memset to 0 from the stats read functions? :) )
> >
> > Regards,
> > Cristian
> 
> Read and clear is a non-standard model. Interface statistics are not
> read/clear.
> We have lots of scripts that read statistics. Users don't like it when
> statistics disappear.

Stephen, I suggest adding a new build-time configuration option for the 
librte_sched library in config/common_* files: 
CONFIG_RTE_SCHED_STATS_CLEAR_ON_READ.
- When set to YES, we clear the stats counters on read. When set to NO, we do 
not clear stats counters on read;
- The default value for this configuration option should be YES, in order to 
preserve backward compatibility with the current behavior;
- The stats reset functions introduced by this patch should always be 
enabled/compiled into the code, regardless of whether this new option is set to 
YES or NO.

What do you think?

Regards,
Cristian
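A minimal sketch of how such a build-time option could gate the behaviour, with the reset function always compiled in. This is an editor's illustration: the macro and function names are stand-ins, not actual librte_sched code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Would normally come from config/common_*; default YES per the proposal. */
#ifndef RTE_SCHED_STATS_CLEAR_ON_READ
#define RTE_SCHED_STATS_CLEAR_ON_READ 1
#endif

struct queue_stats {
	uint64_t n_pkts;
	uint64_t n_bytes;
};

static void queue_stats_read(struct queue_stats *src, struct queue_stats *dst)
{
	*dst = *src;
#if RTE_SCHED_STATS_CLEAR_ON_READ
	memset(src, 0, sizeof(*src));	/* legacy clear-on-read behaviour */
#endif
}

/* The explicit reset is compiled unconditionally, independent of the flag. */
static void queue_stats_reset(struct queue_stats *src)
{
	memset(src, 0, sizeof(*src));
}
```

Because only the body of the read function changes, the API prototypes stay identical under either setting — which is exactly what Thomas objects to: the behavioural difference is invisible to a binary distribution.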






[dpdk-dev] [PATCH v2 3/7] rte_sched: keep track of RED drops

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Thursday, February 5, 2015 6:05 AM
> To: dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: [dpdk-dev] [PATCH v2 3/7] rte_sched: keep track of RED drops
> 
> From: Stephen Hemminger 
> 
> Add new statistic to keep track of drops due to RED.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_sched/rte_sched.c | 28 +++-
>  lib/librte_sched/rte_sched.h |  6 ++
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 6928c98..8cb8bf1 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -1028,7 +1028,9 @@ rte_sched_port_update_subport_stats(struct
> rte_sched_port *port, uint32_t qindex
>  }
> 
>  static inline void
> -rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port
> *port, uint32_t qindex, struct rte_mbuf *pkt)
> +rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port
> *port,
> + uint32_t qindex,
> + struct rte_mbuf *pkt, uint32_t red)
>  {
>   struct rte_sched_subport *s = port->subport + (qindex /
> rte_sched_port_queues_per_subport(port));
>   uint32_t tc_index = (qindex >> 2) & 0x3;
> @@ -1036,6 +1038,9 @@
> rte_sched_port_update_subport_stats_on_drop(struct rte_sched_port
> *port, uint32_
> 
>   s->stats.n_pkts_tc_dropped[tc_index] += 1;
>   s->stats.n_bytes_tc_dropped[tc_index] += pkt_len;
> +#ifdef RTE_SCHED_RED
> + s->stats.n_pkts_red_dropped[tc_index] += red;
> +#endif
>  }
> 
>  static inline void
> @@ -1049,13 +1054,18 @@ rte_sched_port_update_queue_stats(struct
> rte_sched_port *port, uint32_t qindex,
>  }
> 
>  static inline void
> -rte_sched_port_update_queue_stats_on_drop(struct rte_sched_port
> *port, uint32_t qindex, struct rte_mbuf *pkt)
> +rte_sched_port_update_queue_stats_on_drop(struct rte_sched_port
> *port,
> +   uint32_t qindex,
> +   struct rte_mbuf *pkt, uint32_t red)
>  {
>   struct rte_sched_queue_extra *qe = port->queue_extra + qindex;
>   uint32_t pkt_len = pkt->pkt_len;
> 
>   qe->stats.n_pkts_dropped += 1;
>   qe->stats.n_bytes_dropped += pkt_len;
> +#ifdef RTE_SCHED_RED
> + qe->stats.n_pkts_red_dropped += red;
> +#endif
>  }
> 
>  #endif /* RTE_SCHED_COLLECT_STATS */
> @@ -1206,12 +1216,20 @@ rte_sched_port_enqueue_qwa(struct
> rte_sched_port *port, uint32_t qindex, struct
>   qlen = q->qw - q->qr;
> 
>   /* Drop the packet (and update drop stats) when queue is full */
> - if (unlikely(rte_sched_port_red_drop(port, pkt, qindex, qlen) ||
> (qlen >= qsize))) {
> + if (unlikely(rte_sched_port_red_drop(port, pkt, qindex, qlen))) {
> +#ifdef RTE_SCHED_COLLECT_STATS
> + rte_sched_port_update_subport_stats_on_drop(port,
> qindex, pkt, 1);
> + rte_sched_port_update_queue_stats_on_drop(port,
> qindex, pkt, 1);
> +#endif
>   rte_pktmbuf_free(pkt);
> + }
> +
> + if (qlen >= qsize) {
>  #ifdef RTE_SCHED_COLLECT_STATS
> - rte_sched_port_update_subport_stats_on_drop(port,
> qindex, pkt);
> - rte_sched_port_update_queue_stats_on_drop(port,
> qindex, pkt);
> + rte_sched_port_update_subport_stats_on_drop(port,
> qindex, pkt, 0);
> + rte_sched_port_update_queue_stats_on_drop(port,
> qindex, pkt, 0);
>  #endif
> + rte_pktmbuf_free(pkt);
>   return 0;
>   }
> 
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index dda287f..e9bf18a 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -140,6 +140,9 @@ struct rte_sched_subport_stats {
> subport for each traffic class*/
>   uint32_t
> n_bytes_tc_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE]; /**<
> Number of bytes dropped by the current
>subport for each traffic class due 
> to subport queues
> being full or congested */
> +#ifdef RTE_SCHED_RED
> + uint32_t
> n_pkts_red_dropped[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE]; /**<
> Number of packets dropped by red */
> +#endif
>  };
> 
>  /** Pipe configuration parameters. The period and credits_per_period
> parameters are measured
> @@ -168,6 +171,9 @@ struct rte_sched_queue_stats {
>   /* Packets */
>   uint32_t n_pkts; /**< Number of packets successfully
> written to current queue */
>   uint32_t n_pkts_dropped; /**< Number of packets dropped due
> to current queue being full or congested */
> +#ifdef RTE_SCHED_RED
> + uint32_t n_pkts_red_dropped;
> +#endif
> 
>   /* Bytes */
>   uint32_t n_bytes;/**< Number of bytes successfully 
> written
> to current queue */
> --
> 2.1.4


[dpdk-dev] [PATCH v2 2/7] rte_sched: expand scheduler hierarchy for more VLAN's

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Thursday, February 5, 2015 6:05 AM
> To: dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: [dpdk-dev] [PATCH v2 2/7] rte_sched: expand scheduler hierarchy
> for more VLAN's
> 
> From: Stephen Hemminger 
> 
> The QoS subport is limited to 8 bits in original code.
> But customers demanded ability to support full number of VLAN's (4096)
> therefore use the full part of the tag field of mbuf.
> 
> Resize the pipe as well to allow for more pipes in future and
> avoid expensive bitfield access.
> 
> Signed-off-by: Stephen Hemminger 
> ---
> v2 use tag area rather than claiming reserved bit which isn't documented
> 
>  lib/librte_mbuf/rte_mbuf.h   |  5 -
>  lib/librte_sched/rte_sched.h | 38 -
> -
>  2 files changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 16059c6..8f0c3a4 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -258,7 +258,10 @@ struct rte_mbuf {
>   /**< First 4 flexible bytes or FD ID, dependent on
>PKT_RX_FDIR_* flag in ol_flags. */
>   } fdir;   /**< Filter identifier if FDIR enabled */
> - uint32_t sched;   /**< Hierarchical scheduler */
> + struct {
> + uint32_t lo;
> + uint32_t hi;
> + } sched;  /**< Hierarchical scheduler */
>   uint32_t usr; /**< User defined tags. See
> @rte_distributor_process */
>   } hash;   /**< hash information */
> 
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index e6bba22..dda287f 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -195,16 +195,20 @@ struct rte_sched_port_params {
>  #endif
>  };
> 
> -/** Path through the scheduler hierarchy used by the scheduler enqueue
> operation to
> -identify the destination queue for the current packet. Stored in the field
> hash.sched
> -of struct rte_mbuf of each packet, typically written by the classification
> stage and read by
> -scheduler enqueue.*/
> +/*
> + * Path through the scheduler hierarchy used by the scheduler enqueue
> + * operation to identify the destination queue for the current
> + * packet. Stored in the field pkt.hash.sched of struct rte_mbuf of
> + * each packet, typically written by the classification stage and read
> + * by scheduler enqueue.
> + */
>  struct rte_sched_port_hierarchy {
> - uint32_t queue:2;/**< Queue ID (0 .. 3) */
> - uint32_t traffic_class:2;/**< Traffic class ID (0 .. 3)*/
> - uint32_t pipe:20;/**< Pipe ID */
> - uint32_t subport:6;  /**< Subport ID */
> - uint32_t color:2;/**< Color */
> + uint16_t queue:2;/**< Queue ID (0 .. 3) */
> + uint16_t traffic_class:2;/**< Traffic class ID (0 .. 3)*/
> + uint16_t color:2;/**< Color */
> + uint16_t unused:10;
> + uint16_t subport;/**< Subport ID */
> + uint32_t pipe;   /**< Pipe ID */
>  };

Extending the number of bits allocated for mbuf->sched makes sense to me. I 
agree with this partitioning.
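The repartitioned structure doubles the path field from 32 to 64 bits, matching the widened mbuf->hash.sched {lo, hi} pair. A quick layout check (an editor's sketch reproducing the bitfields from the patch; the 8-byte packing assumes a conventional ABI such as x86-64 System V):

```c
#include <assert.h>
#include <stdint.h>

/* Bitfield layout copied from the v2 patch: 16 bits of queue/traffic
 * class/color flags, a full 16-bit subport and a full 32-bit pipe, i.e.
 * exactly the 64 bits of the new mbuf sched.{lo, hi} field. */
struct sched_port_hierarchy {
	uint16_t queue:2;		/* Queue ID (0 .. 3) */
	uint16_t traffic_class:2;	/* Traffic class ID (0 .. 3) */
	uint16_t color:2;		/* Color */
	uint16_t unused:10;
	uint16_t subport;		/* Subport ID, now a whole 16 bits */
	uint32_t pipe;			/* Pipe ID, no bitfield access needed */
};
```

Making subport and pipe whole storage units (rather than bitfields) also avoids the read-modify-write cost of bitfield access that the commit message calls out.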

> 
>  /*
> @@ -350,12 +354,15 @@ rte_sched_queue_read_stats(struct
> rte_sched_port *port,
>   */
>  static inline void
>  rte_sched_port_pkt_write(struct rte_mbuf *pkt,
> - uint32_t subport, uint32_t pipe, uint32_t traffic_class, uint32_t
> queue, enum rte_meter_color color)
> +  uint32_t subport, uint32_t pipe,
> +  uint32_t traffic_class,
> +  uint32_t queue, enum rte_meter_color color)
>  {
> - struct rte_sched_port_hierarchy *sched = (struct
> rte_sched_port_hierarchy *) >hash.sched;
> + struct rte_sched_port_hierarchy *sched
> + = (struct rte_sched_port_hierarchy *) >hash.sched;
> 
> - sched->color = (uint32_t) color;
>   sched->subport = subport;
> + sched->color = (uint32_t) color;
>   sched->pipe = pipe;
>   sched->traffic_class = traffic_class;
>   sched->queue = queue;
> @@ -379,9 +386,12 @@ rte_sched_port_pkt_write(struct rte_mbuf *pkt,
>   *
>   */
>  static inline void
> -rte_sched_port_pkt_read_tree_path(struct rte_mbuf *pkt, uint32_t
> *subport, uint32_t *pipe, uint32_t *traffic_class, uint32_t *queue)
> +rte_sched_port_pkt_read_tree_path(struct rte_mbuf *pkt, uint32_t
> *subport,
> +   uint32_t *pipe, uint32_t *traffic_class,
> +   uint32_t *queue)
>  {
> - struct rte_sched_port_hierarchy *sched = (struct
> rte_sched_port_hierarchy *) >hash.sched;
> + struct rte_sched_port_hierarchy *sched
> + = (struct rte_sched_port_hierarchy *) >hash.sched;
> 
>   

[dpdk-dev] [PATCH v2 1/7] rte_sched: make RED optional at runtime

2015-02-20 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Thursday, February 5, 2015 6:05 AM
> To: dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: [dpdk-dev] [PATCH v2 1/7] rte_sched: make RED optional at runtime
> 
> From: Stephen Hemminger 
> 
> Want to be able to build with RTE_SCHED_RED enabled but
> allow disabling RED on a per-queue basis at runtime.
> 
> RED is disabled unless min/max thresholds set.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_sched/rte_sched.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 95dee27..6928c98 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -636,6 +636,12 @@ rte_sched_port_config(struct
> rte_sched_port_params *params)
>   uint32_t j;
> 
>   for (j = 0; j < e_RTE_METER_COLORS; j++) {
> + /* if min/max are both zero, then RED is disabled */
> + if ((params->red_params[i][j].min_th |
> +  params->red_params[i][j].max_th) == 0) {
> + continue;
> + }
> +
>   if (rte_red_config_init(>red_config[i][j],
>   params->red_params[i][j].wq_log2,
>   params->red_params[i][j].min_th,
> @@ -1069,6 +1075,9 @@ rte_sched_port_red_drop(struct rte_sched_port
> *port, struct rte_mbuf *pkt, uint3
>   color = rte_sched_port_pkt_read_color(pkt);
>   red_cfg = >red_config[tc_index][color];
> 
> + if ( (red_cfg->min_th | red_cfg->max_th) == 0)
> + return 0;
> +
>   qe = port->queue_extra + qindex;
>   red = >red;
> 
> --
> 2.1.4

Acked by: Cristian Dumitrescu 

I agree with this approach of leaving RED as a library option that can be 
enabled/disabled at build-time in order to avoid any performance penalty when 
RED is not required.

On enqueue side (when RED is enabled):
-Makes sense to avoid the RED processing when RED is not enabled for the 
current queue;

On dequeue side (when RED is enabled):
-Not sure what's best when RED is not enabled for the current queue: go ahead 
anyway and apply the timestamp, which is harmless (but might trigger a cache 
miss), or check first (which might also trigger a cache miss). It makes sense 
to go with the former option, which is what this patch implements.
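The runtime check from the patch can be exercised in isolation (an editor's sketch with a stand-in struct; the bitwise OR is the same trick the patch uses to test both thresholds with a single compare):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the min/max threshold pair in rte_red_params. */
struct red_cfg {
	uint32_t min_th;
	uint32_t max_th;
};

/* RED counts as disabled for a queue when both thresholds are zero;
 * OR-ing them lets the enqueue hot path test both with one branch. */
static int red_enabled(const struct red_cfg *cfg)
{
	return (cfg->min_th | cfg->max_th) != 0;
}
```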





[dpdk-dev] [PATCH v2 6/7] rte_sched: eliminate floating point in calculating byte clock

2015-02-16 Thread Dumitrescu, Cristian
Hi Stephen,

Sorry, NACK.

1. Overflow issue
As you declare cycles_per_byte as uint32_t, for a CPU frequency of 2-3 GHz, the 
line of code below results in overflow:
port->cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT) / 
params->rate;
Therefore, there is most likely a significant accuracy loss, which might result 
in more packets being allowed out than there should be.
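An editor's sketch quantifying point 1 under stated assumptions (a 3 GHz TSC and the RTE_SCHED_TIME_SHIFT value of 20 from the patch): it computes the smallest port rate for which the scaled cycles_per_byte still fits in uint32_t, and how large cycles_diff can grow before the left shift in the resync path wraps uint64_t.

```c
#include <assert.h>
#include <stdint.h>

#define RTE_SCHED_TIME_SHIFT 20	/* value from the patch */

/* Smallest port rate (bytes/sec) for which the scaled result
 * (tsc_hz << SHIFT) / rate still fits in a uint32_t. Rates below this
 * overflow the uint32_t cycles_per_byte field. */
static uint64_t min_rate_fitting_u32(uint64_t tsc_hz)
{
	return ((tsc_hz << RTE_SCHED_TIME_SHIFT) + UINT32_MAX - 1) / UINT32_MAX;
}

/* Largest cycles_diff for which cycles_diff << SHIFT does not wrap,
 * i.e. the maximum interval between time resyncs. */
static uint64_t max_cycles_between_resyncs(void)
{
	return UINT64_MAX >> RTE_SCHED_TIME_SHIFT;
}
```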

2. Integer division has a higher cost than floating point division
My understanding is we are considering a performance improvement by replacing 
the double precision floating point division in:
double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
with an integer division:
uint64_t bytes_diff = (cycles_diff << RTE_SCHED_TIME_SHIFT) / 
port->cycles_per_byte;
I don't think this is going to have the claimed benefit, as according to the 
"Intel 64 and IA-32 Architectures Optimization Reference Manual" (Appendix C), 
the latency of the integer division instruction is significantly higher than 
that of double precision floating point division:
Instruction FDIV double precision: latency = 38-40 cycles
Instruction IDIV: latency = 56-80 cycles

3. Alternative
I do hear your suggestion, though, about replacing the floating point division 
with a more performant construction. One option would be to replace it with an 
integer multiplication followed by a shift right, probably by using a uint64_t 
bytes_per_cycle_scaled_up (the inverse of cycles_per_byte). I need to prototype 
this code myself. Would you be OK to look into providing an alternative 
implementation?

Thanks,
Cristian
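The multiply-and-shift alternative from point 3 could look like the following. This is an editor's illustration with hypothetical names; the scale factor trades range against precision and would need tuning against real rates and resync intervals.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_CYCLE_SHIFT 24	/* illustrative scale factor */

/* Done once at port-config time: precompute bytes per cycle, scaled up
 * by 2^24 — i.e. the inverse of cycles_per_byte. This is the only
 * division, and it runs outside the hot path. */
static uint64_t bytes_per_cycle_scaled(uint64_t rate_bytes_per_sec,
				       uint64_t tsc_hz)
{
	return (rate_bytes_per_sec << BYTES_PER_CYCLE_SHIFT) / tsc_hz;
}

/* Done per dequeue: one multiply and one shift, no division instruction.
 * Wraps if cycles_diff * inv exceeds 64 bits, so the shift must be
 * chosen with the worst-case resync interval in mind. */
static uint64_t cycles_to_bytes(uint64_t cycles_diff, uint64_t inv)
{
	return (cycles_diff * inv) >> BYTES_PER_CYCLE_SHIFT;
}
```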


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Stephen Hemminger
Sent: Thursday, February 5, 2015 6:14 AM
To: dev at dpdk.org
Cc: Stephen Hemminger
Subject: [dpdk-dev] [PATCH v2 6/7] rte_sched: eliminate floating point in 
calculating byte clock

From: Stephen Hemminger 

The old code was doing a floating point divide for each rte_dequeue()
which is very expensive. Change to using fixed point scaled math instead.
This improved performance from 5Gbit/sec to 10 Gbit/sec

Signed-off-by: Stephen Hemminger 
---
 lib/librte_sched/rte_sched.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 55fbc14..3023457 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -102,6 +102,9 @@

 #define RTE_SCHED_BMP_POS_INVALID UINT32_MAX

+/* For cycles_per_byte calculation */
+#define RTE_SCHED_TIME_SHIFT 20
+
 struct rte_sched_subport {
/* Token bucket (TB) */
uint64_t tb_time; /* time of last update */
@@ -239,7 +242,7 @@ struct rte_sched_port {
uint64_t time_cpu_cycles; /* Current CPU time measured in CPU cyles 
*/
uint64_t time_cpu_bytes;  /* Current CPU time measured in bytes */
uint64_t time;/* Current NIC TX time measured in bytes 
*/
-   double cycles_per_byte;   /* CPU cycles per byte */
+   uint32_t cycles_per_byte;   /* CPU cycles per byte (scaled) */

/* Scheduling loop detection */
uint32_t pipe_loop;
@@ -657,7 +660,9 @@ rte_sched_port_config(struct rte_sched_port_params *params)
port->time_cpu_cycles = rte_get_tsc_cycles();
port->time_cpu_bytes = 0;
port->time = 0;
-   port->cycles_per_byte = ((double) rte_get_tsc_hz()) / ((double) 
params->rate);
+
+   port->cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT)
+   / params->rate;

/* Scheduling loop detection */
port->pipe_loop = RTE_SCHED_PIPE_INVALID;
@@ -2156,11 +2161,12 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 {
uint64_t cycles = rte_get_tsc_cycles();
uint64_t cycles_diff = cycles - port->time_cpu_cycles;
-   double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
+   uint64_t bytes_diff = (cycles_diff << RTE_SCHED_TIME_SHIFT)
+   / port->cycles_per_byte;

/* Advance port time */
port->time_cpu_cycles = cycles;
-   port->time_cpu_bytes += (uint64_t) bytes_diff;
+   port->time_cpu_bytes += bytes_diff;
if (port->time < port->time_cpu_bytes) {
port->time = port->time_cpu_bytes;
}
-- 
2.1.4





[dpdk-dev] Explanation of the QoS offset values used in the QoS scheduler example app.

2015-02-16 Thread Dumitrescu, Cristian
Hi,

These are offsets into the frame expressed in 16-bit words (pdata is a 
uint16_t pointer, so offset 7 means bytes 14-15), used for reading these packet 
fields. Packet bytes are stored in memory in network order, while the CPU is 
little endian, so byte swapping takes place on read.

This is probably not the best way to write this code, and I agree this portion 
of the app code is a bit more cryptic than it should be. Using data structures 
to describe the header format for the input packet (Ethernet/SVLAN/CVLAN/IPv4) 
and using portable byte swapping macros is probably a better alternative.
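As an illustration of that alternative (an editor's sketch: the field names are invented, and __attribute__((packed)) assumes GCC/Clang), a packed struct for the expected Ethernet/SVLAN/CVLAN/IPv4 stack makes the magic word offsets self-documenting:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the header stack the example app expects.
 * Byte offsets divided by 2 recover the uint16_t word offsets used in
 * app_thread.c, since pdata is read through a uint16_t pointer. */
struct qinq_ipv4_hdr {
	uint8_t  dst_mac[6];
	uint8_t  src_mac[6];
	uint16_t svlan_tpid;
	uint16_t svlan_tci;		/* outer VLAN ID -> subport */
	uint16_t cvlan_tpid;
	uint16_t cvlan_tci;		/* inner VLAN ID -> pipe */
	uint16_t ethertype;
	uint8_t  ip_pre_addr[12];	/* IPv4 version .. header checksum */
	uint8_t  src_ip[4];
	uint8_t  dst_ip[4];		/* last octets -> TC, queue, color */
} __attribute__((packed));
```

For instance, TC_OFFSET 20 is the 16-bit word at bytes 40-41 — the last two octets of the destination IP — which the app reads without a swap, relying on the little-endian CPU to put byte 40 in the low half.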

This being said, the code implementation, code comments and Sample App Guide 
description seem to be consistent and correct.

Regards,
Cristian


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Aws Ismail
Sent: Saturday, February 14, 2015 7:35 PM
To: dev at dpdk.org
Subject: [dpdk-dev] Explanation of the QoS offset values used in the QoS 
scheduler example app.

Hi everyone,

I am looking at this portion of the code in the app_thread.c file of the
QoS scheduler example application:

/*
* QoS parameters are encoded as follows:
* Outer VLAN ID defines subport
* Inner VLAN ID defines pipe
* Destination IP 0.0.XXX.0 defines traffic class
* Destination IP host (0.0.0.XXX) defines queue
* Values below define offset to each field from start of frame
*/
#define SUBPORT_OFFSET 7
#define PIPE_OFFSET 9
#define TC_OFFSET 20
#define QUEUE_OFFSET 20
#define COLOR_OFFSET 19

static inline int get_pkt_sched(struct rte_mbuf *m, uint32_t *subport,
	uint32_t *pipe, uint32_t *traffic_class, uint32_t *queue, uint32_t *color)
{
	uint16_t *pdata = rte_pktmbuf_mtod(m, uint16_t *);

	*subport = (rte_be_to_cpu_16(pdata[SUBPORT_OFFSET]) & 0x0FFF) &
		(port_params.n_subports_per_port - 1); /* Outer VLAN ID */

	*pipe = (rte_be_to_cpu_16(pdata[PIPE_OFFSET]) & 0x0FFF) &
		(port_params.n_pipes_per_subport - 1); /* Inner VLAN ID */

	*traffic_class = (pdata[QUEUE_OFFSET] & 0x0F) &
		(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE - 1); /* Destination IP */

	*queue = ((pdata[QUEUE_OFFSET] >> 8) & 0x0F) &
		(RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS - 1); /* Destination IP */

	*color = pdata[COLOR_OFFSET] & 0x03; /* Destination IP */

	return 0;
}

The offset values do not make sense to me. According to the programmer
guide, the queue selection is SVID/CVID/TC/QID based, and those offsets seem
off in this case. Is this because it is assumed that the packet is
altered before it gets to this stage?

Can anyone provide a better explanation, or at least the reason behind
choosing the offset values shown above?

Thanks.
--
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). Any review or distribution by others is 
strictly prohibited. If you are not the intended recipient, please contact the 
sender and delete all copies.



[dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

2015-02-09 Thread Dumitrescu, Cristian
Hi Stephen,

What is the reason not to clear statistics on read? Do you have a use-case / 
justification for it?

(BTW, I see you added the reset functions, but was it also your intention to 
remove the memset to 0 from the stats read functions? :) )

Regards,
Cristian

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Stephen Hemminger
Sent: Thursday, February 5, 2015 6:14 AM
To: dev at dpdk.org
Cc: Stephen Hemminger
Subject: [dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

From: Stephen Hemminger 

Make rte_sched statistics API work like the ethernet statistics API.
Don't auto-clear statistics.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_sched/rte_sched.c | 30 ++
 lib/librte_sched/rte_sched.h | 29 +
 2 files changed, 59 insertions(+)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 8cb8bf1..d891e50 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -935,6 +935,21 @@ rte_sched_subport_read_stats(struct rte_sched_port *port,
 }

 int
+rte_sched_subport_stats_reset(struct rte_sched_port *port,
+ uint32_t subport_id)
+{
+   struct rte_sched_subport *s;
+
+   /* Check user parameters */
+   if (port == NULL || subport_id >= port->n_subports_per_port)
+   return -1;
+
+   s = port->subport + subport_id;
+   memset(&s->stats, 0, sizeof(struct rte_sched_subport_stats));
+   return 0;
+}
+
+int
 rte_sched_queue_read_stats(struct rte_sched_port *port,
uint32_t queue_id,
struct rte_sched_queue_stats *stats,
@@ -963,6 +978,21 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
return 0;
 }

+int
+rte_sched_queue_stats_reset(struct rte_sched_port *port,
+   uint32_t queue_id)
+{
+   struct rte_sched_queue_extra *qe;
+
+   /* Check user parameters */
+   if (port == NULL || queue_id >= rte_sched_port_queues_per_port(port))
+   return -1;
+
+   qe = port->queue_extra + queue_id;
+   memset(&qe->stats, 0, sizeof(struct rte_sched_queue_stats));
+   return 0;
+}
+
 static inline uint32_t
 rte_sched_port_qindex(struct rte_sched_port *port, uint32_t subport, uint32_t 
pipe, uint32_t traffic_class, uint32_t queue)
 {
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index e9bf18a..3d007e4 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -317,6 +317,21 @@ rte_sched_subport_read_stats(struct rte_sched_port *port,
struct rte_sched_subport_stats *stats,
uint32_t *tc_ov);

+
+/**
+ * Hierarchical scheduler subport statistics reset
+ *
+ * @param port
+ *   Handle to port scheduler instance
+ * @param subport_id
+ *   Subport ID
+ * @return
+ *   0 upon success, error code otherwise
+ */
+int
+rte_sched_subport_stats_reset(struct rte_sched_port *port,
+ uint32_t subport_id);
+
 /**
  * Hierarchical scheduler queue statistics read
  *
@@ -338,6 +353,20 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
struct rte_sched_queue_stats *stats,
uint16_t *qlen);

+/**
+ * Hierarchical scheduler queue statistics reset
+ *
+ * @param port
+ *   Handle to port scheduler instance
+ * @param queue_id
+ *   Queue ID within port scheduler
+ * @return
+ *   0 upon success, error code otherwise
+ */
+int
+rte_sched_queue_stats_reset(struct rte_sched_port *port,
+   uint32_t queue_id);
+
 /*
  * Run-time
  *
-- 
2.1.4





[dpdk-dev] [PATCH] MAINTAINERS: claim metering, sched and pkt framework

2015-02-09 Thread Dumitrescu, Cristian
Thank you, Thomas!

We are working on some enhancements to librte_cfgfile for release 2.1, so in order 
to avoid unnecessary code churn, it is probably better to have the librte_cfgfile 
changes done first, then have a subsequent patch on qos_sched.

Regards,
Cristian


-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] 
Sent: Monday, February 9, 2015 3:15 PM
To: Dumitrescu, Cristian
Cc: dev at dpdk.org; Gonzalez Monroy, Sergio
Subject: Re: [dpdk-dev] [PATCH] MAINTAINERS: claim metering, sched and pkt 
framework

2015-02-06 13:13, Gonzalez Monroy, Sergio:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> > Sent: Wednesday, February 4, 2015 3:53 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH] MAINTAINERS: claim metering, sched and pkt
> > framework
> > 
> > As original author of these DPDK components, I am volunteering to maintain
> > them going forward:
> > - Traffic Metering
> > - Hierarchical Scheduler
> > - Packet Framework
> > - Configuration File
> > 
> > Signed-off-by: Cristian Dumitrescu 
> 
> Acked-by: Sergio Gonzalez Monroy 

Acked-by: Thomas Monjalon 

Applied, thanks

About cfgfile, we are still waiting for the cleanup in qos_sched example:
http://dpdk.org/ml/archives/dev/2014-October/006774.html
Do you have news?




[dpdk-dev] [PATCH] examples: fix unchecked malloc return value in ip_pipeline

2014-12-12 Thread Dumitrescu, Cristian
Acked-by: 

-Original Message-
From: Richardson, Bruce 
Sent: Friday, December 12, 2014 12:24 PM
To: dev at dpdk.org; Dumitrescu, Cristian
Cc: Richardson, Bruce
Subject: [PATCH] examples: fix unchecked malloc return value in ip_pipeline

Static analysis shows that one instance of rte_zmalloc is missing
a return value check. This is fixed by adding a return value check.
The malloc call itself is moved earlier in the function so that no
work is done unless all memory allocation requests have succeeded,
thereby removing the need for rollback on error.

Signed-off-by: Bruce Richardson 
---
 examples/ip_pipeline/cmdline.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/examples/ip_pipeline/cmdline.c b/examples/ip_pipeline/cmdline.c
index 13d565e..152acb5 100644
--- a/examples/ip_pipeline/cmdline.c
+++ b/examples/ip_pipeline/cmdline.c
@@ -1093,7 +1093,7 @@ cmd_firewall_add_parsed(
__attribute__((unused)) void *data)
 {
struct cmd_firewall_add_result *params = parsed_result;
-   struct app_rule rule, *old_rule;
+   struct app_rule rule, *old_rule, *new_rule = NULL;
struct rte_mbuf *msg;
struct app_msg_req *req;
struct app_msg_resp *resp;
@@ -1148,6 +1148,18 @@ cmd_firewall_add_parsed(
if (msg == NULL)
rte_panic("Unable to allocate new message\n");

+   /* if we need a new rule structure, allocate it before we go further */
+   if (old_rule == NULL) {
+   new_rule = rte_zmalloc_socket("CLI", sizeof(struct app_rule),
+   RTE_CACHE_LINE_SIZE, rte_socket_id());
+   if (new_rule == NULL) {
+   printf("Cannot allocate memory for new rule\n");
+   rte_ctrlmbuf_free(msg);
+   return;
+   }
+   }
+
+
/* Fill request message */
req = (struct app_msg_req *)rte_ctrlmbuf_data(msg);
req->type = APP_MSG_REQ_FW_ADD;
@@ -1190,12 +1202,6 @@ cmd_firewall_add_parsed(
printf("Request FIREWALL_ADD failed (%u)\n", resp->result);
else {
if (old_rule == NULL) {
-   struct app_rule *new_rule = (struct app_rule *)
-   rte_zmalloc_socket("CLI",
-   sizeof(struct app_rule),
-   RTE_CACHE_LINE_SIZE,
-   rte_socket_id());
-
memcpy(new_rule, &rule, sizeof(rule));
TAILQ_INSERT_TAIL(_table, new_rule, entries);
n_firewall_rules++;
-- 
1.9.3





[dpdk-dev] [PATCH] table: fix table_array for incomplete bitmask

2014-12-04 Thread Dumitrescu, Cristian
Acked-by: 

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Bruce Richardson
Sent: Thursday, December 4, 2014 2:24 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] table: fix table_array for incomplete bitmask

When a lookup was done on a table_array structure with an incomplete
bitmask, the result was always zero hits. This was because the
pkts_mask value was cleared as each entry was processed, and the result
was assigned at the end of the loop, when pkts_mask was zero.
Changing the assignment to occur at the start, before the pkts_mask
gets cleared, fixes this issue.

Signed-off-by: Bruce Richardson 
---
 lib/librte_table/rte_table_array.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_table/rte_table_array.c 
b/lib/librte_table/rte_table_array.c
index 0b1d42a..4d3c05e 100644
--- a/lib/librte_table/rte_table_array.c
+++ b/lib/librte_table/rte_table_array.c
@@ -164,6 +164,8 @@ rte_table_array_lookup(
 {
struct rte_table_array *t = (struct rte_table_array *) table;

+   *lookup_hit_mask = pkts_mask;
+
if ((pkts_mask & (pkts_mask + 1)) == 0) {
uint64_t n_pkts = __builtin_popcountll(pkts_mask);
uint32_t i;
@@ -190,8 +192,6 @@ rte_table_array_lookup(
}
}

-   *lookup_hit_mask = pkts_mask;
-
return 0;
 }

-- 
1.9.3





[dpdk-dev] Performance impact with QoS

2014-11-17 Thread Dumitrescu, Cristian
Hi Satish,

The QoS traffic manager has a large memory footprint due to large number of 
packet queues (e.g. 64K queues of 64 packets each) and large tables (e.g. 4K 
pipes with one cache line of context per pipe) that far exceeds the amount of 
CPU cache physically available. There are a lot of data structures that need to 
be brought into the L1 cache of the traffic manager core in order to take the 
scheduling decision: bitmap, pipe table entry, queue read/write pointers, queue 
elements, packet metadata (mbuf), etc. To minimize the penalties associated 
with the CPU pipeline stalling due to memory accesses, all these data 
structures are prefetched.

So, the point I am trying to make is there are a lot of critical CPU resources 
involved: size of L1/L2 cache (per CPU core), size of L3 cache (shared by all 
CPU cores), bandwidth of L1/L2 cache (per core), bandwidth of L3 cache (shared 
by all CPU cores), number of outstanding prefetches (per CPU core), etc.

If you map the QoS traffic manager on the same core with packet I/O (i.e. Poll 
Mode Driver RX/TX), my guess is these two I/O intensive workloads will both 
compete for the CPU resources listed above and will also impact each other by 
thrashing each other data structures in and out of L1/L2 cache. If you split 
them on different CPU cores, their operation is more performant and more 
predictable, as each one is having its own L1/L2 cache now.

Did you try a CPU core chaining setup (through rte_rings) similar to qos_sched 
application, like: RX -> (TM enqueue & dequeue) -> TX or RX -> (TM enqueue & TM 
dequeue & TX)? I am sure you will find the right setup for you by conducting 
similar experiments. Of course, result also depends on which other workloads 
your application is performing.

Regards,
Cristian

From: satish [mailto:nsatishb...@gmail.com]
Sent: Monday, November 17, 2014 6:03 AM
To: dev at dpdk.org
Cc: Dumitrescu, Cristian
Subject: Re: Performance impact with QoS

Hi All,
Can someone please provide comments on queries in below mail?

Regards,
Satish Babu

On Mon, Nov 10, 2014 at 4:24 PM, satish mailto:nsatishbabu at gmail.com>> wrote:
Hi,
I need comments on performance impact with DPDK-QoS.

We are working on developing a application based on DPDK.
Our application supports IPv4 forwarding with and without QoS.

Without QOS, we are achieving almost full wire rate (bi-directional traffic) 
with 128, 256 and 512 byte packets.
But when we enabled QoS, performance dropped to half for 128 and 256 byte 
packets.
For 512 byte packet, we didn't observe any drop even after enabling QoS 
(Achieving full wire rate).
Traffic used in both the cases is same. ( One stream with Qos match to first 
queue in traffic class 0)

In our application, we are using memory buffer pools to receive the packet 
bursts (Ring buffer is not used).
Same buffer is used during packet processing and TX (enqueue and dequeue). All 
above handled on the same core.

For normal forwarding(without QoS), we are using rte_eth_tx_burst for TX.

For forwarding with QoS, using rte_sched_port_pkt_write(), 
rte_sched_port_enqueue () and rte_sched_port_dequeue ()
before rte_eth_tx_burst ().

We understood that the performance dip in the case of 128 and 256 byte packets is
because more packets are processed compared to 512 byte packets.

Can someone comment on the performance dip in my case with QoS enabled?
[1] can this be because of inefficient use of RTE calls for QoS?
[2] Is it the poor buffer management?
[3] any other comments?

To achieve good performance in QoS case, is it must to use worker thread 
(running on different core) with ring buffer?

Please provide your comments.

Thanks in advance.

Regards,
Satish Babu




--
Regards,
Satish Babu



[dpdk-dev] Max throughput Using QOS Scheduler

2014-11-06 Thread Dumitrescu, Cristian
Hi Srikanth,

>>Is there any difference between scheduler behavior  for above two scenarios  
>>while enqueing and de-queueing ??
All the pipe queues share the bandwidth allocated to their pipe. The 
distribution of available pipe bandwidth between the pipe queues is governed by 
features like traffic class strict priority, bandwidth sharing between pipe 
traffic classes, weights of the queues within the same traffic class, etc. In 
the case you mention, you are just using one queue for each traffic class.

Let's take an example:

- Configuration: pipe rate = 10 Mbps; pipe traffic class 0..3 rates = [20% of pipe rate = 2 Mbps, 30% of pipe rate = 3 Mbps, 40% of pipe rate = 4 Mbps, 100% of pipe rate = 10 Mbps]. Convention is that traffic class 0 is the highest priority.

- Injected traffic per traffic class for this pipe: [3, 0, 0, 0] Mbps => Output traffic per traffic class for this pipe: [2, 0, 0, 0] Mbps

- Injected traffic per traffic class for this pipe: [0, 0, 0, 15] Mbps => Output traffic per traffic class for this pipe: [0, 0, 0, 10] Mbps

- Injected traffic per traffic class for this pipe: [1, 10, 2, 15] Mbps => Output traffic per traffic class for this pipe: [1, 3, 2, 4] Mbps

Makes sense?

>>Queue size is 64 , and number of packets enqueued and dequeued is 64 as well.
I strongly recommend you never use a dequeue burst size that is equal to 
enqueue burst size, as performance will be bad.

In the qos_sched sample app, we use [enqueue burst size, dequeue burst size] 
set to [64, 32]; other reasonable values could be [64, 48], [32, 16], etc. An 
enqueue burst bigger than the dequeue burst allows the big packet reservoir 
that is the traffic manager/port scheduler to fill up to a reasonable level, 
which lets dequeue function optimally, and then the system regulates itself.

The reason is: since we interlace enqueue and dequeue calls, if on every 
iteration you push e.g. 64 packets in and then look to get 64 packets out, 
you'll only have 64 packets in the queues, you'll work hard to find them, and 
you'll get out exactly those 64 packets that you pushed in.

>>And what is the improvements i would gain if i move to DPDK 1.7 w.r.t QOS ?
The QoS code is pretty stable since release 1.4, not many improvements added 
(maybe it's the right time to revisit this feature and push it to the next 
level?), but there are improvements in other DPDK libraries that are 
dependencies for QoS (e.g. packet Rx/Tx).

Hope this helps.

Regards,
Cristian



From: Srikanth Akula [mailto:srikanth...@gmail.com]
Sent: Thursday, October 30, 2014 4:10 PM
To: dev at dpdk.org; Dumitrescu, Cristian
Subject: Max throughput Using QOS Scheduler

Hello All ,

I am currently trying to implement QOS scheduler using DPDK 1.6 . I have 
configured 1 subport , 4096 pipes for the sub port and 4 TC's and 4 Queues .

Currently i am trying to send packets destined to single Queue of the available 
16 queues of one of the pipe .

Could some body explain what could be the throughput we can achieve using this 
scheme.  The reason for asking this is , i could sense different behavior each 
time when i send traffic destined to different destination Queues  .

for example :

1. << Only one stream>>> Stream destined Q0 of TC0 ..


2. << 4 streams >>>> 1st Stream destined for Q3 of Tc3 ...
 2nd stream destined for Q2 of Tc2
 3rd stream destined for Q1 of TC1
 4th Stream destined for Q0 of TC0

Is there any difference between scheduler behavior  for above two scenarios  
while enqueing and de-queueing ??

Queue size is 64, and the number of packets enqueued and dequeued is 64 as well.
And what is the improvements i would gain if i move to DPDK 1.7 w.r.t QOS ?


Could you please clarify my queries ?


Thanks & Regards,
Srikanth





[dpdk-dev] Dynamic port/pipe QoS configuration

2014-10-20 Thread Dumitrescu, Cristian
Hi Satish,

Yes, you can change the pipe configuration during run-time, but in a slightly 
different way. The way to do it is by defining multiple pipe profiles at the 
port level initialization time and reassigning the pipe to a different profile 
at run-time.

The pipe profiles are statically defined per port at initialization time (their 
number is configurable through rte_sched_port_params:: n_pipe_profiles 
parameter), so you need to know in advance the set of profiles you want to use 
at run-time. You can assign pipe X to profile A initially, then re-assign it to 
profile B later on (by using function rte_sched_pipe_config()).

Regards,
Cristian

From: satish mailto:nsatishb...@gmail.com>>
Date: October 17, 2014 at 2:59:35 PM PDT
To: mailto:dev at dpdk.org>>
Subject: Re: [dpdk-dev] Dynamic port/pipe QoS configuration
Hi,
Can someone please respond to below?
Thank you.

On Mon, Oct 13, 2014 at 3:54 PM, satish mailto:nsatishbabu at gmail.com>> wrote:


Hi,
We are trying to provide QoS support for one of our clients using
rte_sched.
In our implementation we are treating each pipe as a customer.
So, we can have maximum of 4096 customers per sub-port.
Customers(pipe) can be added, deleted or modified dynamically. Each
customer can have different profiles.

Currently we are using DPDK-v1.6.
Can I modify pipe profile during run time using rte_sched_pipe_config ()?

Our plan is to have initial configs as below (similar to examples in DPDK)
[1] Specify port params at the initialization of port as below
static struct rte_sched_port_params port_param = {
:
:
.n_subports_per_port = 1,
.n_pipes_per_subport = 4096,
.qsize = {64, 64, 64, 64},
.pipe_profiles = pipe_profile,
.n_pipe_profiles = 1,
}

[2]
static struct rte_sched_subport_params subport_param[] = {
{
.tb_rate = Link speed (1G/10G..) divided by 8 (bits),
.tb_size = 100,

.tc_rate = {Same as tb_rate, Same as tb_rate, Same as tb_rate, Same as
tb_rate},
.tc_period = 10,
},
};

[3]
static struct rte_sched_pipe_params pipe_profile[] = {
{ /* Profile #0 */
.tb_rate = Link speed (1G/10G..) divided by 8 (bits)/4096 (maximum number
of pipes),
.tb_size = 100,

.tc_rate = {pipe's tb_rate, pipe's tb_rate, pipe's tb_rate, pipe's
tb_rate},
.tc_period = 40,

.wrr_weights = {16, 4, 2, 1,  16, 4, 2, 1,  16, 4, 2, 1,  16, 4, 2, 1},
},
};

Our plan here is to initialize the pipe with default profile and modify
each pipe based on user configurations.
My questions are
[a] Can I modify pipe profile during run time using rte_sched_pipe_config
()? (question repeated)
If I can modify at pipe level,
[b] Can we have different profiles for pipes, With one default profile at
initialization?
[c] Can we modify port level params without deleting the port using
rte_sched_port_config ()?

Please provide your valuable comments.
Thanks in advance.

--
Regards,
Satish Babu




--
Regards,
Satish Babu



[dpdk-dev] [v2 20/23] librte_cfgfile: interpret config files

2014-10-17 Thread Dumitrescu, Cristian
Hi Thomas,

Yes, you're right, we need to close on this pending item. Thanks for bringing 
it up.

I am currently working on a patch series, once I send it out I will come back 
and look into to qos_sched. Is this OK with you?

Regards,
Cristian

-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] 
Sent: Thursday, October 16, 2014 5:46 PM
To: Dumitrescu, Cristian
Cc: dev at dpdk.org; Wu, Jingjing; Liu, Jijiang
Subject: Re: [dpdk-dev] [v2 20/23] librte_cfgfile: interpret config files

Hi Cristian,

2014-06-04 19:08, Cristian Dumitrescu:
> This library provides a tool to interpret config files that have standard
> structure.
> 
> It is used by the Packet Framework examples/ip_pipeline sample application.
> 
> It originates from examples/qos_sched sample application and now it makes
> this code available as a library for other sample applications to use.
> The code duplication with qos_sched sample app to be addressed later.

4 months ago, you said that this duplication would be addressed later.
Neither you nor anyone at Intel submitted a patch to clean up that.
I just want to be sure that "later" doesn't mean "never" because
I'm accepting another "later" word for cleaning old filtering API.

Maybe you just forgot it, so please prove to me that I'm right to accept
"later" clean-ups in general.

Thanks
-- 
Thomas




[dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf metadata

2014-09-16 Thread Dumitrescu, Cristian
Hi Olivier,

I agree that your suggested approach for application-dependent metadata makes 
sense, in fact the two approaches work in exactly the same way (packet metadata 
immediately after the regular mbuf), there is only a subtle difference, which 
is related to defining consistent DPDK usage guidelines.

1. Advertising the presence of application-dependent meta-data as supported 
mechanism
If we explicitly have a metadata zero-size field at the end of the mbuf, we 
basically tell people that adding their own application meta-data at the end of 
the mandatory meta-data (mbuf structure) is a mechanism that DPDK allows and 
supports, and will continue to do so for the foreseeable future. In other 
words, we guarantee that an application doing so will continue to build 
successfully with future releases of DPDK, and we will not introduce changes in 
DPDK that could potentially break this mechanism. It is also a hint to people 
of where to put their application dependent meta-data.

2. Defining a standard base address for the application-dependent metadata
- There are also libraries in DPDK that work with application dependent 
meta-data, currently these are the Packet Framework libraries: librte_port, 
librte_table, librte_pipeline. Of course, the library does not have the 
knowledge of the application dependent meta-data format, so they treat it as 
opaque array of bytes, with the offset and size of the array given as 
arguments. In my opinion, it is safer (and more elegant) if these libraries 
(and others) can rely on an mbuf API to access the application dependent 
meta-data (in an opaque way) rather than make an assumption about the mbuf 
(i.e. the location of custom metadata relative to the mbuf) that is not clearly 
supported/defined by the mbuf library. 
- By having this API, we basically say: we define the custom meta-data base 
address (first location where custom metadata _could_ be placed) immediately 
after the mbuf, so libraries and apps accessing custom meta-data should do so 
by using a relative offset from this base rather than each application defining 
its own base: immediately after mbuf, or 128 bytes after mbuf, or 64 bytes 
before the end of the buffer, or other.

More (minor) comments inline below.

Thanks,
Cristian

-Original Message-
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Friday, September 12, 2014 10:02 PM
To: Dumitrescu, Cristian; Richardson, Bruce; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf 
metadata

Hello Cristian,

> What is the reason to remove this field? Please explain the
> rationale of removing this field.

The rationale is explained in
http://dpdk.org/ml/archives/dev/2014-September/005232.html

"The format of the metadata is up to the application".

The type of data the application stores after the mbuf does not have
to be defined in the mbuf. These macros limit the types of
metadata to uint8, uint16, uint32, uint64. What should I do
if I need a void*, a struct foo? Should we add a macro for
each possible type?

[Cristian] Actually, this is not correct, as macros to access metadata through 
pointers (to void or scalar types) are provided as well. This pointer can be 
converted by the application to the format it defines.

> We previously agreed we need to provide an easy and standard
> mechanism for applications to extend the mandatory per buffer
> metadata (mbuf) with optional application-dependent
> metadata.

Defining a structure in the application which does not pollute
the rte_mbuf structure is "easy and standard(TM)" too.

[Cristian] I agree, both approaches work the same really, it is just the 
difference in advertising the presence of meta-data as supported mechanism and 
defining a standard base address for it.

> This field just provides a clean way for the apps to
> know where is the end of the mandatory metadata, i.e. the first
> location in the packet buffer where the app can add its own
> metadata (of course, the app has to manage the headroom space
> before the first byte of packet data). A zero-size field is the
> standard mechanism that DPDK uses extensively in pretty much
> every library to access memory immediately after a header
> structure.

Having the following is clean too:

struct metadata {
 ...
};

struct app_mbuf {
 struct rte_mbuf mbuf;
 struct metadata metadata;
};

There is no need to define anything in the mbuf structure.

[Cristian] I agree, both approaches work the same really, it is just the 
difference in advertising the presence of meta-data as supported mechanism and 
defining a standard base address for it.

> 
> The impact of removing this field is that there is no standard
> way to identify where the end of the mandatory metadata is, so
> each application will have to reinvent this. With no clear
> convention, we will end up with a lot of non-standard ways. Eve

[dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf metadata

2014-09-12 Thread Dumitrescu, Cristian
Bruce, Olivier, 

What is the reason to remove this field? Please explain the rationale of 
removing this field.

We previously agreed we need to provide an easy and standard mechanism for 
applications to extend the mandatory per buffer metadata (mbuf) with optional 
application-dependent metadata. This field just provides a clean way for the 
apps to know where the end of the mandatory metadata is, i.e. the first 
location in the packet buffer where the app can add its own metadata (of 
course, the app has to manage the headroom space before the first byte of 
packet data). A zero-size field is the standard mechanism that DPDK uses 
extensively in pretty much every library to access memory immediately after a 
header structure.

The impact of removing this field is that there is no standard way to identify 
where the end of the mandatory metadata is, so each application will have to 
reinvent this. With no clear convention, we will end up with a lot of 
non-standard ways. Every time the format of the mbuf structure changes, 
applications that use custom metadata can potentially break, while this 
simple standard mechanism would prevent that. So why remove this?

Having applications define their optional meta-data is a real need. Please take 
a look at the Service Chaining IETF emerging protocols 
(https://datatracker.ietf.org/wg/sfc/documents/), which provide standard 
mechanisms for applications to define their own packet meta-data and share it 
between the elements of the processing pipeline (for Service Chaining, these 
are typically virtual machines scattered amongst the data center).

And, in my opinion, there is no negative impact/cost associated with keeping 
this field.

Regards,
Cristian


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Richardson, Bruce
Sent: Tuesday, September 9, 2014 10:01 AM
To: Olivier MATZ; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf 
metadata

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, September 08, 2014 1:06 PM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the
> mbuf metadata
> 
> Hi Bruce,
> 
> On 09/03/2014 05:49 PM, Bruce Richardson wrote:
> > Removed the explicit zero-sized metadata definition at the end of the
> > mbuf data structure. Updated the metadata macros to take account of this
> > change so that all existing code which uses those macros still works.
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 22 --
> >  1 file changed, 8 insertions(+), 14 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 5260001..ca66d9a 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -166,31 +166,25 @@ struct rte_mbuf {
> > struct rte_mempool *pool; /**< Pool from which mbuf was allocated.
> */
> > struct rte_mbuf *next;/**< Next segment of scattered packet. */
> >
> > -   union {
> > -   uint8_t metadata[0];
> > -   uint16_t metadata16[0];
> > -   uint32_t metadata32[0];
> > -   uint64_t metadata64[0];
> > -   } __rte_cache_aligned;
> >  } __rte_cache_aligned;
> >
> >  #define RTE_MBUF_METADATA_UINT8(mbuf, offset)  \
> > -   (mbuf->metadata[offset])
> > +   (((uint8_t *)&(mbuf)[1])[offset])
> >  #define RTE_MBUF_METADATA_UINT16(mbuf, offset) \
> > -   (mbuf->metadata16[offset/sizeof(uint16_t)])
> > +   (((uint16_t *)&(mbuf)[1])[offset/sizeof(uint16_t)])
> >  #define RTE_MBUF_METADATA_UINT32(mbuf, offset) \
> > -   (mbuf->metadata32[offset/sizeof(uint32_t)])
> > +   (((uint32_t *)&(mbuf)[1])[offset/sizeof(uint32_t)])
> >  #define RTE_MBUF_METADATA_UINT64(mbuf, offset) \
> > -   (mbuf->metadata64[offset/sizeof(uint64_t)])
> > +   (((uint64_t *)&(mbuf)[1])[offset/sizeof(uint64_t)])
> >
> >  #define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)  \
> > -   (&mbuf->metadata[offset])
> > +   (&RTE_MBUF_METADATA_UINT8(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset) \
> > -   (&mbuf->metadata16[offset/sizeof(uint16_t)])
> > +   (&RTE_MBUF_METADATA_UINT16(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset) \
> > -   (&mbuf->metadata32[offset/sizeof(uint32_t)])
> > +   (&RTE_MBUF_METADATA_UINT32(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset) \
> > -   (&mbuf->metadata64[offset/sizeof(uint64_t)])
> > +   (&RTE_MBUF_METADATA_UINT64(mbuf, offset))
> >
> >  /**
> >   * Given the buf_addr returns the pointer to corresponding mbuf.
> >
> 
> I think it goes in the good direction. So:
> Acked-by: Olivier Matz 
> 
> Just one question: why not removing RTE_MBUF_METADATA*() macros?
> I'd just provide one macro that gives a (void*) to the first byte
> after the mbuf structure.
> 
> The 

[dpdk-dev] [PATCH 0/4] lib/librte_table: Fix bugs occurring in corner cases

2014-09-11 Thread Dumitrescu, Cristian
Acked-by: Cristian Dumitrescu 

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Balazs Nemeth
Sent: Thursday, September 11, 2014 6:47 PM
To: dev at dpdk.org
Cc: Nemeth, Balazs
Subject: [dpdk-dev] [PATCH 0/4] lib/librte_table: Fix bugs occurring in corner 
cases

This set of patches fixes bugs in the packet framework. Some of the
bugs occur in corner cases (i.e. when a lookup is performed on a few
packets or when buckets are in extended states) while others can cause
memory to be accessed beyond what is reserved during initialization
time.

Balazs Nemeth (4):
  lib/librte_table: Fix empty bucket removal during entry deletion in
rte_table_hash_ext
  lib/librte_table: Fix checking extended buckets in unoptimized case
  lib/librte_table: Fix incorrect t->data_size_shl initialization
  lib/librte_table: Fix pointer calculations at initialization

 lib/librte_table/rte_table_hash_ext.c   | 13 ++---
 lib/librte_table/rte_table_hash_key16.c |  4 ++--
 lib/librte_table/rte_table_hash_key32.c |  4 ++--
 lib/librte_table/rte_table_hash_key8.c  |  8 
 lib/librte_table/rte_table_hash_lru.c   |  7 +++
 5 files changed, 17 insertions(+), 19 deletions(-)

--
1.9.1
Intel Corporation NV/SA
Kings Square, Veldkant 31
2550 Kontich
RPM (Bruxelles) 0415.497.718. 
Citibank, Brussels, account 570/1031255/09

This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). Any review or distribution by others is 
strictly prohibited. If you are not the intended recipient, please contact the 
sender and delete all copies.

--
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). Any review or distribution by others is 
strictly prohibited. If you are not the intended recipient, please contact the 
sender and delete all copies.




[dpdk-dev] [PATCH] examples/qos_sched: fix flow pause after 2M packets

2014-07-03 Thread Dumitrescu, Cristian
Acked-by: Cristian Dumitrescu 

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yong Liu
Sent: Thursday, July 3, 2014 10:00 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] examples/qos_sched: fix flow pause after 2M packets

After enable vector pmd, qos_sched only send 32 packets every burst.
That will cause some packets not transmitted and therefore mempool
will be drain after a while.
App qos_sched now will re-send the packets which failed to send out in 
previous tx function.

Signed-off-by: Yong Liu 
Acked-by: Cristian Dumitrescu 
Tested-by: Waterman Cao 
---
 examples/qos_sched/app_thread.c |   16 +---
 1 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 7501147..59c4014 100755
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -139,17 +139,11 @@ app_send_burst(struct thread_conf *qconf)

do {
ret = rte_eth_tx_burst(qconf->tx_port, qconf->tx_queue, mbufs, 
(uint16_t)n);
-   if (unlikely(ret < n)) { /* we cannot drop the packets, so 
re-send */
-   /* update number of packets to be sent */
-   n -= ret;
-   mbufs = (struct rte_mbuf **)&mbufs[ret];
-   /* limit number of retries to avoid endless loop */
-   /* reset retry counter if some packets were sent */
-   if (likely(ret != 0)) {
-   continue;
-   }
-   }
-   } while (ret != n);
+   /* we cannot drop the packets, so re-send */
+   /* update number of packets to be sent */
+   n -= ret;
+   mbufs = (struct rte_mbuf **)&mbufs[ret];
+   } while (n);
 }


-- 
1.7.7.6





[dpdk-dev] [PATCH] examples/qos_sched: fix flow pause after 2M packets

2014-07-03 Thread Dumitrescu, Cristian


>>How about "mbufs += ret" rather than "mbufs = (struct rte_mbuf 
>>**)&mbufs[ret]"? 
Functionally it is the same, but "mbufs = (struct rte_mbuf **)&mbufs[ret]" is 
likely less prone to compiler warnings, so my vote is to keep it as it is and 
get this patch integrated asap into 1.7.

Regards,
Cristian






[dpdk-dev] [PATCH] vfio: open VFIO container at startup rather than during init

2014-06-18 Thread Dumitrescu, Cristian
Hi Anatoly,

I would suggest we add a log message explaining which mechanism is loaded 
(igb_uio/vfio) and why (e.g. tried vfio first but container could not be 
opened, so falling back to igb_uio, etc).

Regards,
Cristian

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Burakov, Anatoly
Sent: Wednesday, June 18, 2014 9:57 AM
To: Thomas Monjalon
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vfio: open VFIO container at startup rather 
than during init

Hi Thomas,

> Subject: Re: [dpdk-dev] [PATCH] vfio: open VFIO container at startup rather
> than during init
> 
> > Signed-off-by: Anatoly Burakov 
> 
> Please Anatoly, could you provide a text explaining what was broken and
> why you fixed it this way?

What was broken was if, for some reason, VFIO is loaded but the user can't 
initialize it (an example would be wrong permissions, or unsupported IOMMU 
type, which is what Bruce seems to be having... which shouldn't happen as far 
as I know, but there's nothing I can do on DPDK's side to fix this as this is 
the kernel reporting wrong kind of IOMMU type), DPDK would fail to load. The 
fix makes DPDK simply not try VFIO support at all if the container cannot be 
opened for some reason.

Best regards,
Anatoly Burakov
DPDK SW Engineer







[dpdk-dev] [PATCH] Add an API to query enabled core index

2014-06-12 Thread Dumitrescu, Cristian
Maybe we could simplify this discussion by simply creating a new function to 
return the mask of all enabled cores (as provided through -c coremask EAL 
option) and have the user utilize this mask to derive whatever info it needs?

Right now, to get the mask of enabled cores, a for loop is required to test 
each core index one by one and re-create the mask.

In several instances, I needed to know just the number of enabled cores (i.e. 
number of bits set in -c coremask), and there was no alternative to the for 
loop above. But given such a function, we can quickly do:
uint64_t coremask = rte_eal_coremask();
n_lcores = __builtin_popcountll(coremask);

For what Patrick needs: 
uint32_t lcore_enabled_pos = __builtin_popcountll(coremask & 
RTE_LEN2MASK(lcore_index));

Regards,
Cristian

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Richardson, Bruce
Sent: Thursday, June 12, 2014 12:28 AM
To: Thomas Monjalon
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] Add an API to query enabled core index

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, June 11, 2014 3:50 PM
> To: Richardson, Bruce
> Cc: Lu, Patrick; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] Add an API to query enabled core index
> 
> 2014-06-11 21:57, Richardson, Bruce:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > > 2014-06-11 13:45, Patrick Lu:
> > > > The new API rte_lcore_id2() will return an index from enabled lcores
> > > > starting from zero.
> > >
> > > I think core_id2 is not a representative name.
> > > What do you think of renaming core_id as lcore_hwid and core_id2 as
> > > lcore_index?
> >
> > I like lcore_index as the name for the new function. However, I'm not sure
> > in that case that we want/need to rename the old one.
> 
> I think it would be not easy to distinguish id and index. So I prefer
> hwid/index. And lcore is more precise than core.
> 

The function is already called "rte_lcore_id()" so there is no need to change 
it to make it an "lcore" function. That function has been around for a long 
time and is commonly used, so I'd prefer it not be changed unless it really is 
necessary. "rte_lcore_index" is a sufficiently different function name, in my 
opinion. The API documentation should clear up any confusion for the user 
anyway.




[dpdk-dev] [v2 22/23] Packet Framework IPv4 pipeline sample app

2014-06-09 Thread Dumitrescu, Cristian
Hi Olivier,

A few notes on using pktmbuf here:
1. As the name implies, pktmbuf should be used for packets and ctrlmbuf should 
be used for control messages :). IMHO, using pktmbuf for control messages is a 
confusing workaround.
2. Pktmbuf has a lot of overhead that is not needed in order to send short 
messages between cores. Pktmbuf has a lot of pointers and other fields that do 
not make sense for messages. I don't think we want people to say DPDK is 
difficult to use because e.g. sending 2 bytes from core A to core B requires 
initializing a bunch of pointers and other fields that do not make sense.
3. Once we start using pktmbuf to send messages, it is likely that other people 
will follow this example, and they might do it incorrectly. I don't think we 
want to see emails on this list from people asking e.g:
i) Why does my app segfault, when all I want to do is send 2 bytes 
from core A to core B?
ii) Why does my app segfault when core A writes a message to a NIC TX 
queue? :)

Using an app dependent structure requires duplicating the work to create/free 
the pool of such structures, and alloc/free mechanism. And then some people 
will ask why are we not using ctrlmbuf, as long as ctrlmbuf exists in DPDK.

I think that, as long as we have ctrlmbuf and pktmbuf in DPDK, we should follow 
the existing model. We should not look for workarounds that we know we plan to 
change anyway, we should look for the right solution. We both agree we need to 
refresh pktmbuf and ctrlmbuf, but my point is we should not do changes as long 
as we don't know what the agreed solution will look like?

Thanks,
Cristian


-Original Message-
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Monday, June 9, 2014 1:14 PM
To: Dumitrescu, Cristian
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [v2 22/23] Packet Framework IPv4 pipeline sample app

Hi Christian,

> We need a message type defined for message passing between cores, and
> pktmbuf is definitely not the right approach.

Could you please explain why a pktmbuf is not the right approach?

As proposed in http://dpdk.org/ml/archives/dev/2014-May/002759.html
I think the control mbuf could be replaced by a packet mbuf or an
application private structure.


Regards,
Olivier





[dpdk-dev] [v2 22/23] Packet Framework IPv4 pipeline sample app

2014-06-09 Thread Dumitrescu, Cristian
Hi Olivier,

We could remove the ctrlmbuf from this app and replace it with something else, 
but I am afraid we do not have that something else yet defined and agreed. And 
I would like to avoid doing the same work twice:   change this app now to 
replace the ctrlmbuf with something else, and then replace this something else 
with whatever we decide to use for message passing as part of the 1.8 mbuf 
refresh discussion.

We need a message type defined for message passing between cores, and pktmbuf 
is definitely not the right approach. I can also invent something new, but it 
is unlikely people will accept it now without a debate, so it will only make 
this problem worse. Not to mention that we do not even have consensus to remove 
ctrlmbuf :(. 

My proposal is (as also discussed with Ivan on a different thread) to take the 
mbuf refresh discussion during 1.8 timeframe, which should include the decision 
on what to use for message passing. I can commit now to send a patch for this 
app at that time to do these changes, would this work?

Thanks,
Cristian


-Original Message-
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Monday, June 9, 2014 10:12 AM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: Re: [dpdk-dev] [v2 22/23] Packet Framework IPv4 pipeline sample app

Hi Cristian,

On 06/04/2014 08:08 PM, Cristian Dumitrescu wrote:
> This Packet Framework sample application illustrates the capabilities of the 
> Intel DPDK Packet Framework toolbox.
>
> It creates different functional blocks used by a typical IPv4 framework like: 
> flow classification, firewall, routing, etc.
>
> CPU cores are connected together through standard interfaces built on SW 
> rings, with each CPU core running a separate pipeline instance.
>
> Please refer to Intel DPDK Sample App Guide for full description.
>
> Signed-off-by: Cristian Dumitrescu 

Would it be possible to replace the ctrlmbuf by something else (a
pktmbuf for instance)?

As you know this would conflict if we want to remove the ctrlmbuf from
the rte_mbuf structure.

Regards,
Olivier




[dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in the packet buffer just after mbuf

2014-06-05 Thread Dumitrescu, Cristian
Hi,

In order to minimize the number of iterations, Ivan and I had an offline 
discussion on this.

Ivan is concerned with the way the mbuf cloning/scatter-gather feature is 
currently implemented in DPDK, and not with this particular patch.

We agreed to take the discussion on cloning/scatter-gather implementation 
during the DPDK 1.8 time-frame, as this belongs to the mbuf refresh discussion. 
The mbuf library is not just the format of the mbuf data structure, it also 
includes all the features associated with the mbuf, as: accessing mbuf fields 
through get/set methods, buffer chaining, cloning/scatter-gather, enabling HW 
offloads through mbuf flags and fields, etc.

Regards,
Cristian

-Original Message-
From: Ivan Boule [mailto:ivan.bo...@6wind.com] 
Sent: Monday, June 2, 2014 1:24 PM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in 
the packet buffer just after mbuf

Hi Cristian,

I agree with you, the natural way to store application metadata into 
mbufs consists in putting it right after the rte_mbuf data structure.
This can be simply implemented this way:

struct metadata {
 ...
};

struct app_mbuf {
 struct rte_mbuf mbuf;
 struct metadata mtdt;
};

With such a representation, the application initializes the "buf_addr" 
field of each mbuf pointed to by a (struct app_mbuf *)amb pointer as:

 amb->mbuf.buf_addr = (char *)amb + sizeof(struct app_mbuf);

But such a natural programming approach breaks the assumptions of the 
macros RTE_MBUF_FROM_BADDR, RTE_MBUF_TO_BADDR, RTE_MBUF_INDIRECT, and 
RTE_MBUF_DIRECT.

For instance, in the function  __rte_pktmbuf_prefree_seg(m), after the line
 struct rte_mbuf *md = RTE_MBUF_FROM_BADDR(m->buf_addr);
"md" is always different from "m", and thus systematically (and most of 
the time wrongly) considered as an indirect mbuf.

In the same way, the operations done by the function rte_pktmbuf_detach(m)
 void *buf = RTE_MBUF_TO_BADDR(m);
 m->buf_addr = buf;
do not set buf_addr to its [default] correct value.

To summarize, adding metadata after the rte_mbuf data structure is
incompatible with the packet cloning feature behind the [wrongly named]
RTE_MBUF_SCATTER_GATHER configuration option.

By the way, you suggest to use the headroom to also store packet metadata.
But, then, how can an application store both types of information 
in a given mbuf at the same time, when the actual length of network 
headers in a mbuf is variable, as it depends on the protocol path 
followed by the packet in a networking stack (tunnels, etc)?

Regards,
Ivan


On 05/30/2014 12:28 AM, Dumitrescu, Cristian wrote:
> Hi Ivan,
>
> I hate to disagree with you :), but no, we do not break the scatter-gather 
> feature. We actually implemented the Packet Framework IPv4 fragmentation and 
> reassembly ports to validate exactly this.
>
> Maybe the confusion comes from the RTE_MBUF_TO_BADDR macro. This macro only 
> works (and it is only used) for direct mbufs, so probably the correct name 
> for this macro should be RTE_MBUF_DIRECT_TO_BADDR, as it cannot work (and it 
> is not used) for indirect mbufs.
>
> I am describing the rationale behind meta-data design below, sorry for the 
> long description that looks like a design document ...
>
> Quick recap:
> - mbuf->buf_addr points to the first address where a byte of data for the 
> current segment _can_ be placed
> - direct mbuf: the packet descriptor (mbuf) sits in the same mempool buffer 
> with the packet itself, so mbuf->buf_addr = mbuf + 1;
> - indirect mbuf: the packet descriptor is located in a different mempool 
> buffer than the packet itself, so mbuf->buf_addr != mbuf + 1;
> - Regardless of the mbuf type, it is not necessarily true that the first byte 
> of data is located at mbuf->buf_addr, as we save a headroom (configurable, by 
> default initially of CONFIG_RTE_PKTMBUF_HEADROOM = 128 bytes) at the start of 
> the data buffer (mbuf->buf_addr) for prepending packet headers ( ... or, why 
> not, meta-data!).  The location of the first real data byte is 
> mbuf->pkt.data, which is variable, as opposed to mbuf->buf_addr, which does 
> not change.
>
> Packet meta-data rationale:
> - I think we both agree that we need to be able to store packet meta-data in 
> the packet buffer. The packet buffer regions where meta-data could be stored 
> can only be the in the headroom or in the tailroom, which are both managed by 
> the application and both go up and down during the packet lifetime.
>
> - To me, packet-metadata is part of the packet descriptor: mbuf is the 
> mandatory & standard part of the packet descriptor, while meta-data is the 
> optional & non-standard (per application) part of the packet descriptor. 
> Therefore, it makes sense to put

[dpdk-dev] IMPORTANT: feature freeze for version 1.7.0

2014-06-02 Thread Dumitrescu, Cristian
Hi Thomas,

I am getting emails from people asking about Intel DPDK docs update for Packet 
Framework.

Is there a way to upload documents to dpdk.org to make them available for 
review?

We have Packet Framework updates to Programmer's Guide and Sample App Guide (MS 
Word with lots of tables and diagrams), and it would probably be very 
beneficial to people reviewing the code to have access to these docs as well, 
as opposed to getting the code now and the docs later once release is done.

Thanks,
Cristian

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
Sent: Friday, May 30, 2014 2:12 PM
To: dev at dpdk.org
Subject: [dpdk-dev] IMPORTANT: feature freeze for version 1.7.0

Hello all,

We have a lot of new features mostly ready for DPDK 1.7.0.
It's now time to be sure that they are well reviewed and integrated.
Then release candidate should be tagged in mid-June in order to have some 
validation tests before end of June (planned released date).

So requests for integration of features which have not yet been sent will be 
delayed to next version. Only reworks or small fixes of pending patches should 
be accepted.

A new branch "rebased-next" will be created for patches which are reviewed and 
waiting next version. This branch will be regularly rebased on master HEAD.

Being currently on vacation, I won't be very active until Monday, 9th.
During this time, I'd like to see some reviews/acknowledgements for these 
patches:
- socket id detection fallback
http://dpdk.org/ml/archives/dev/2014-April/001987.html
- fix rte_free run time in O(n) free blocks
http://dpdk.org/ml/archives/dev/2014-May/002296.html
- fix mbuf corruption in pcap_tx
http://dpdk.org/ml/archives/dev/2014-May/002919.html
- igb_uio fixes
http://dpdk.org/ml/archives/dev/2014-May/002428.html
- vfio
http://dpdk.org/ml/archives/dev/2014-May/002914.html
- ethernet enhancements
http://dpdk.org/ml/archives/dev/2014-May/002436.html
- splitted statistics
http://dpdk.org/ml/archives/dev/2014-May/002707.html
- mtu/flow control
http://dpdk.org/ml/archives/dev/2014-May/002752.html
- link up/down
http://dpdk.org/ml/archives/dev/2014-May/002660.html
- Tx rate limitation
http://dpdk.org/ml/archives/dev/2014-May/002696.html
- lpm optimization
http://dpdk.org/ml/archives/dev/2014-May/002703.html
- generic filter
http://dpdk.org/ml/archives/dev/2014-May/002740.html
http://dpdk.org/ml/archives/dev/2014-May/002577.html
- ip_frag
http://dpdk.org/ml/archives/dev/2014-May/002930.html
- distributor
http://dpdk.org/ml/archives/dev/2014-May/002598.html
- link bonding
http://dpdk.org/ml/archives/dev/2014-May/002922.html
- acl
http://dpdk.org/ml/archives/dev/2014-May/002945.html
- packet framework
http://dpdk.org/ml/archives/dev/2014-May/002820.html

Thanks for your participation
-- 
Thomas




[dpdk-dev] Please any one who can help me with librte_sched

2014-05-30 Thread Dumitrescu, Cristian
Hi Ariel,

So how are you making sure you are not getting more packets in while you are 
waiting for dequeue() to return zero for some time?

As stated, I don't like this approach, firstly because it forces the on-the-fly 
BW configuration changes to result in dropping packets for a considerable 
amount of time in an uncontrolled way, and it should not be this way. Secondly, 
because making it work with no buffer leakage and no race conditions is 
probably not that simple ;)

Regards,
Cristian


From: Ariel Rodriguez [mailto:arodrig...@callistech.com]
Sent: Wednesday, May 28, 2014 11:50 AM
To: Dumitrescu, Cristian
Cc: dev at dpdk.org; Stephen Hemminger
Subject: RE: [dpdk-dev] Please any one who can help me with librte_sched


Ok, I can do that... but still, is there a way to ask the rte_sched_port 
something like is_empty? Or simply, if the dequeue function returns 0 packets 
retrieved from the old port structure running on another core, can I assume 
that the port is empty?

Regards

Ariel.
On May 28, 2014 7:10 AM, "Dumitrescu, Cristian" <cristian.dumitrescu at intel.com> wrote:
Hi Ariel,

I think you put your finger precisely on the problem associated with your 
approach: you have to iterate through all the queues and free up the packets, 
which takes a lot of time. Obviously this is not done by the rte_sched API.

Maybe a limited workaround for this approach would be to create and service the 
parallel rte_sched using a different CPU core, while the previous CPU core 
takes its time to free up all the packets and data structures correctly.

Regards,
Cristian

From: Ariel Rodriguez [mailto:arodriguez at callistech.com]
Sent: Wednesday, May 28, 2014 1:46 AM
To: Dumitrescu, Cristian
Cc: Stephen Hemminger; dev at dpdk.org
Subject: Re: [dpdk-dev] Please any one who can help me with librte_sched

Thank you, perfect explanation. I think I am going to create a new parallel 
rte_sched_port and change the reference, with the management core updating the 
tx/sched core. So, what happens with the packets on the old reference if I just 
call rte_sched_port_free() on it, are they leaked? Is there a way to flush the 
rte_sched_port, or maybe get the total packet size somewhere?
Anyway, the RCU algorithm fits OK in this approach... but maybe there is a way 
to flush the old reference port, and work from there with the recently created 
rte_sched_port.

Regards,
Ariel.

On Tue, May 27, 2014 at 3:31 PM, Dumitrescu, Cristian <cristian.dumitrescu at intel.com> wrote:
Hi Ariel,

What's wrong with calling rte_sched_subport_config() and 
rte_sched_pipe_config() during run-time?

This assumes that:

1. Port initialization is done, which includes the following:
a) the number of subports, pipes per subport are fixed
b) the queues are all created and their size is fixed
c) the pipe profiles are defined
d) Basically the maximal data structures get created (maximum number of 
subports, pipes and queues) with no run-time changes allowed, apart from the 
bandwidth related parameters. Queues that do not receive packets are not used 
now, they will be used as soon as they get packets. The packets-to-queues 
mapping logic can change over time, as well as the level of activity for 
different users/queues.

2. The CPU core calling the subport/pipe config functions is the same as the 
core doing enque/dequeue for this port (for thread safety reasons).
a) As you say, the management core can send update requests to the core running 
the scheduler, with the latter sampling the request queue regularly and 
performing the updates.

Regards,
Cristian


-Original Message-
From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger
Sent: Tuesday, May 27, 2014 5:35 PM
To: Ariel Rodriguez
Cc: dev at dpdk.org<mailto:dev at dpdk.org>
Subject: Re: [dpdk-dev] Please any one who can help me with librte_sched

On Tue, 27 May 2014 10:33:02 -0300
Ariel Rodriguez <arodriguez at callistech.com> wrote:

> Hello , this is my third mail , the previous mails have not been answered
> yet.
>
> I justo need someone explains to me  how the librte_sched framework behaves
> in a specific situation.
>
> I have a managment application , this connects with a ring with the tx
> core, when a user applies some configuration of the bandwith mangement ,
> the tx core read the message in the ring parse the configuration in a
> rte_port_params struct , subport_params and pipe_params, then creates a new
> rte_sched from scratch , and then changes the pointer of the current
> rte_sched_port currently doing scheduling and then the code execurte
> rte_sched_port_free() for the unreference (reference by temporal pointer)
> rte_sched_port . This is the only way i found for applying dinamic
> configuration or changes to the qos framework.
> So, with this, what ha

[dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in the packet buffer just after mbuf

2014-05-29 Thread Dumitrescu, Cristian
There is a tricky typo below (leave of -> live off), correcting ...

-Original Message-
From: Dumitrescu, Cristian 
Sent: Thursday, May 29, 2014 11:28 PM
To: 'Ivan Boule'; dev at dpdk.org
Subject: RE: [dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in 
the packet buffer just after mbuf

Hi Ivan,

I hate to disagree with you :), but no, we do not break the scatter-gather 
feature. We actually implemented the Packet Framework IPv4 fragmentation and 
reassembly ports to validate exactly this.

Maybe the confusion comes from the RTE_MBUF_TO_BADDR macro. This macro only 
works (and it is only used) for direct mbufs, so probably the correct name for 
this macro should be RTE_MBUF_DIRECT_TO_BADDR, as it cannot work (and it is not 
used) for indirect mbufs.

I am describing the rationale behind meta-data design below, sorry for the long 
description that looks like a design document ...

Quick recap:
- mbuf->buf_addr points to the first address where a byte of data for the 
current segment _can_ be placed
- direct mbuf: the packet descriptor (mbuf) sits in the same mempool buffer 
with the packet itself, so mbuf->buf_addr = mbuf + 1;
- indirect mbuf: the packet descriptor is located in a different mempool buffer 
than the packet itself, so mbuf->buf_addr != mbuf + 1;
- Regardless of the mbuf type, it is not necessarily true that the first byte 
of data is located at mbuf->buf_addr, as we save a headroom (configurable, by 
default initially of CONFIG_RTE_PKTMBUF_HEADROOM = 128 bytes) at the start of 
the data buffer (mbuf->buf_addr) for prepending packet headers ( ... or, why 
not, meta-data!).  The location of the first real data byte is mbuf->pkt.data, 
which is variable, as opposed to mbuf->buf_addr, which does not change.

Packet meta-data rationale:
- I think we both agree that we need to be able to store packet meta-data in 
the packet buffer. The packet buffer regions where meta-data could be stored 
can only be in the headroom or in the tailroom, which are both managed by 
the application and both go up and down during the packet lifetime.

- To me, packet-metadata is part of the packet descriptor: mbuf is the 
mandatory & standard part of the packet descriptor, while meta-data is the 
optional & non-standard (per application) part of the packet descriptor. 
Therefore, it makes sense to put the meta-data immediately after mbuf, but this 
is not mandatory.

- The zero-size field called meta-data is really just an offset: it points to 
the first buffer location where meta-data _can_ be placed. The reason for 
having this field is to provide a standard way to place meta-data in the packet 
buffer rather than hide it in the Packet Framework libraries and potentially 
conflict with other mbuf changes in the future. It flags that meta-data should 
be taken into consideration when any mbuf change is done in  the future.

- For direct mbufs, the same buffer space (the headroom) is shared between 
packet data (header prepending) and meta-data similar to how the stack and the 
heap manage the same memory. Of course, since it is managed by the application, 
it is the responsibility of the application to make sure the packet bytes and 
the meta-data do not step on one another, but this problem is not at all new, 
nor introduced by the meta-data field: even currently, the application has to 
make sure the headroom is dimensioned correctly, so that even in the worst case 
scenario (application-specific), the packet bytes do not run into the mbuf 
fields, right?

- For indirect mbufs, the packet meta-data is associated with the indirect mbuf 
rather than the direct mbuf (the one the indirect mbuf attaches to, as the 
direct mbuf contains a different packet that has different meta-data), so it 
makes sense that meta-data of the indirect mbuf is stored in the same buffer as 
the indirect mbuf (rather than the buffer of the attached direct mbuf). So this 
means that the buffer size used to store the indirect mbuf is sized 
appropriately (mbuf size, currently 64 bytes, plus the max size for additional 
meta-data). This is illustrated in the code of the IPv4 fragmentation port, 
where for every child packet (output fragment) we copy the parent meta-data in 
the child buffer (which stores an indirect buffer attached to the direct buffer 
of the input jumbo, plus now additional meta-data).

- We are also considering having a user_data field in the mbuf itself to point 
to meta-data in the same buffer or any other buffer. The Packet Framework 
functionally works with both approaches, but there is a performance problem 
associated with the mbuf->user_data approach that we are not addressing for 
this 1.7 release timeframe. The issue is the data dependency that is created, 
as in order to find out the location of the meta-data, we need to first read 
the mbuf  and then read meta-data from mbuf->user_data. The cost of the 
additional memory access is high, due t

 the mbuf or the packet (efficiency). For 
the time being, if somebody needs yet more meta-data in yet another buffer, 
they can add (for their application) a user_data pointer as part of their 
application meta-data (instead of standard mbuf).

- As said, the mbuf->metadata points to the first location where meta-data 
_can_ be placed, if somebody needs a different offset, they can add it on top 
of the mbuf->metadata (e.g. by having a reserved field in their struct 
app_pkt_metadata). We have demonstrated the use of meta-data in the 
examples/ip_pipeline sample app (see struct app_pkt_metadata in "main.h").

Please let me know if this makes sense.

Regards,
Cristian

PS: This is where a white board would help a lot ...


-Original Message-
From: Ivan Boule [mailto:ivan.bo...@6wind.com] 
Sent: Wednesday, May 28, 2014 1:03 PM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in 
the packet buffer just after mbuf

Hi Cristian,

Currently, the DPDK framework does not make any assumption on the actual
layout of a mbuf.
More precisely, the DPDK does not impose any constraint on the actual
location of additional metadata, if any, or on the actual location and
size of its associated payload data buffer.
This is coherent with the fact that mbuf pools are not created by the
DPDK itself, but by network applications that are free to choose
whatever packet mbuf layout that fits their particular needs.

There is one exception to this basic DPDK rule: the mbuf cloning feature 
available through the RTE_MBUF_SCATTER_GATHER configuration option 
assumes that the payload data buffer of the mbuf immediately follows the 
rte_mbuf data structure (see the macros RTE_MBUF_FROM_BADDR, 
RTE_MBUF_TO_BADDR, RTE_MBUF_INDIRECT, and RTE_MBUF_DIRECT in the file 
lib/librte_mbuf/rte_mbuf.h).

The cloning feature prevents building packet mbufs with their metadata 
located immediately after the rte_mbuf data structure, which is exactly 
what your patch introduces.

At least, a comment that clearly warns the user of this incompatibility
might be worth adding into both the code and your patch log.

Regards,
Ivan

On 05/27/2014 07:09 PM, Cristian Dumitrescu wrote:
> Added zero-size field (offset in data structure) to specify the beginning of 
> packet meta-data in the packet buffer just after the mbuf.
>
> The size of the packet meta-data is application specific and the packet 
> meta-data is managed by the application.
>
> The packet meta-data should always be accessed through the provided macros.
>
> This is used by the Packet Framework libraries (port, table, pipeline).
>
> There is absolutely no performance impact due to this mbuf field, as it does 
> not take any space in the mbuf structure (zero-size field).
>
> Signed-off-by: Cristian Dumitrescu 
> ---
>   lib/librte_mbuf/rte_mbuf.h |   17 +
>   1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 4a9ab41..bf09618 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -201,8 +201,25 @@ struct rte_mbuf {
>   struct rte_ctrlmbuf ctrl;
>   struct rte_pktmbuf pkt;
>   };
> + 
> + union {
> + uint8_t metadata[0];
> + uint16_t metadata16[0];
> + uint32_t metadata32[0];
> + uint64_t metadata64[0];
> + };
>   } __rte_cache_aligned;
>
> +#define RTE_MBUF_METADATA_UINT8(mbuf, offset)   (mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16(mbuf, offset)  (mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32(mbuf, offset)  (mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64(mbuf, offset)  (mbuf->metadata64[offset/sizeof(uint64_t)])
> +
> +#define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)   (&mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset)  (&mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset)  (&mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset)  (&mbuf->metadata64[offset/sizeof(uint64_t)])
> +
>   /**
>* Given the buf_addr returns the pointer to corresponding mbuf.
>*/
>


-- 
Ivan Boule
6WIND Development Engineer
--
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). Any review or distribution by others is 
strictly prohibited. If you are not the intended recipient, please contact the 
sender and delete all copies.




[dpdk-dev] [PATCH 00/29] Packet Framework

2014-05-29 Thread Dumitrescu, Cristian
Hi Neil,

Packet Framework does not compete against OVS. OVS is an application (for 
virtual switching), while Packet Framework is a toolbox to build applications.

Can somebody pick OVS building blocks and reuse them to build other 
applications that use the OpenFlow design principles (port, table, pipeline, 
actions, etc)? Probably not easily, if at all.
Can somebody use Packet Framework to build a virtual switch application? 
Hopefully yes.
Can somebody use Packet Framework to develop various applications with a custom 
actions extended outside the small OVS hardwired set (suitable for a switch, 
but not for e.g. a base station)? Hopefully yes.

This being said, OVS and Packet Framework do use similar design principles 
originating from OpenFlow.

Regards,
Cristian



-Original Message-
From: Neil Horman [mailto:nhor...@tuxdriver.com] 
Sent: Tuesday, May 27, 2014 8:48 PM
To: Dumitrescu, Cristian
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 00/29] Packet Framework

On Tue, May 27, 2014 at 06:09:23PM +0100, Cristian Dumitrescu wrote:
> Intel DPDK Packet Framework provides a standard methodology (logically 
> similar to OpenFlow) for rapid development of complex packet processing 
> pipelines out of ports, tables and actions.
> 
> A pipeline is constructed by connecting its input ports to its output ports 
> through a chain of lookup tables. As result of lookup operation into the 
> current table, one of the table entries (or the default table entry, in case 
> of lookup miss) is identified to provide the actions to be executed on the 
> current packet and the associated action meta-data. The behavior of user 
> actions is defined through the configurable table action handler, while the 
> reserved actions define the next hop for the current packet (either another 
> table, an output port or packet drop) and are handled transparently by the 
> framework.
> 
> Three new Intel DPDK libraries are introduced for Packet Framework: 
> librte_port, librte_table, librte_pipeline. Please check the Intel DPDK 
> Programmer's Guide for full description of the Packet Framework design.
> 
> Two sample applications are provided for Packet Framework: app/test-pipeline 
> and examples/ip_pipeline. Please check the Intel Sample Apps Guide for a 
detailed description of how these sample apps work.
> 
Isn't this at least in part functionality that OVS provides on top of DPDK?  Why
re-invent the wheel?

Neil





[dpdk-dev] Please any one who can help me with librte_sched

2014-05-28 Thread Dumitrescu, Cristian
Hi Ariel,

I think you put your finger precisely on the problem associated with your 
approach: you have to iterate through all the queues and free up the packets, 
which takes a lot of time. Obviously this is not done by the rte_sched API.

Maybe a limited workaround for this approach would be to create and service the 
parallel rte_sched using a different CPU core, while the previous CPU core 
takes its time to free up all the packets and data structures correctly.

Regards,
Cristian

From: Ariel Rodriguez [mailto:arodrig...@callistech.com]
Sent: Wednesday, May 28, 2014 1:46 AM
To: Dumitrescu, Cristian
Cc: Stephen Hemminger; dev at dpdk.org
Subject: Re: [dpdk-dev] Please any one who can help me with librte_sched

Thank you, perfect explanation. I think I'm going to create a new parallel 
rte_sched_port and change the reference, with the management core updating the 
tx/sched core. So, what happens with the packets on the old reference if I just 
do rte_sched_port_free() on it, are they leaked? Is there a way to flush the 
rte_sched_port, or maybe get the total packet size somewhere?
Anyway, the RCU algorithm fits this approach well ... but maybe there is a way 
to flush the old reference port and work from there with the recently created 
rte_sched_port.

Regards,
Ariel.

On Tue, May 27, 2014 at 3:31 PM, Dumitrescu, Cristian mailto:cristian.dumitrescu at intel.com>> wrote:
Hi Ariel,

What's wrong with calling rte_sched_subport_config() and 
rte_sched_pipe_config() during run-time?

This assumes that:

1. Port initialization is done, which includes the following:
a) the number of subports, pipes per subport are fixed
b) the queues are all created and their size is fixed
c) the pipe profiles are defined
d) Basically the maximal data structures get created (maximum number of 
subports, pipes and queues) with no run-time changes allowed, apart from the 
bandwidth related parameters. Queues that do not receive packets are not used 
now, they will be used as soon as they get packets. The packets-to-queues 
mapping logic can change over time, as well as the level of activity for 
different users/queues.

2. The CPU core calling the subport/pipe config functions is the same as the 
core doing enqueue/dequeue for this port (for thread safety reasons).
a) As you say, the management core can send update requests to the core running 
the scheduler, with the latter sampling the request queue regularly and 
performing the updates.

Regards,
Cristian


-Original Message-
From: dev [mailto:dev-bounces at dpdk.org<mailto:dev-boun...@dpdk.org>] On 
Behalf Of Stephen Hemminger
Sent: Tuesday, May 27, 2014 5:35 PM
To: Ariel Rodriguez
Cc: dev at dpdk.org<mailto:dev at dpdk.org>
Subject: Re: [dpdk-dev] Please any one who can help me with librte_sched

On Tue, 27 May 2014 10:33:02 -0300
Ariel Rodriguez mailto:arodriguez at 
callistech.com>> wrote:

> Hello, this is my third mail; the previous mails have not been answered
> yet.
>
> I just need someone to explain to me how the librte_sched framework behaves
> in a specific situation.
>
> I have a management application; it connects via a ring with the tx
> core. When a user applies some configuration of the bandwidth management,
> the tx core reads the message in the ring, parses the configuration into a
> rte_port_params struct, subport_params and pipe_params, then creates a new
> rte_sched from scratch, then changes the pointer of the
> rte_sched_port currently doing scheduling, and then the code executes
> rte_sched_port_free() on the unreferenced (referenced by temporal pointer)
> rte_sched_port. This is the only way I found for applying dynamic
> configuration or changes to the qos framework.
> So, with this, what happens with the packets attached to the old
> rte_sched_port when it is deleted? Are those lost packets inside the
> rte_sched_port generating memory leaks? How can I recover these packets,
> just by dequeuing from the port scheduler? Where does the port scheduler
> indicate empty packets in the queue state?
>
> Is there a better way to achieve this kind of behaviour? I just need to
> update the rte_sched_port configuration dynamically, and I want to change
> the current pipe configuration and subport configuration also.
>
> Regards.

If you need to do dynamic changes, I would recommend using an RCU type
algorithm where you exchange in new parameters and then cleanup/free
after a grace period.  See http://lttng.org/urcu

[dpdk-dev] [PATCH] cmdline: finish at EOF

2014-05-27 Thread Dumitrescu, Cristian
Hi Olivier,

Great, so then maybe we should plan for an update of the DPDK command line 
library to your latest code base.

Are the code changes massive, i.e. would this be a big task?

Thanks,
Cristian

-Original Message-
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Monday, May 26, 2014 3:27 PM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] cmdline: finish at EOF

Hi Cristian,

On 05/23/2014 05:32 PM, Olivier MATZ wrote:
> On 05/23/2014 05:21 PM, Cristian Dumitrescu wrote:
>> Bug fix in cmdline library to allow return on EOF as opposed to
>> infinite loop.
>>
>> Signed-off-by: Cristian Dumitrescu 
>> ---
>>   lib/librte_cmdline/cmdline.c |2 +-
>>   1 files changed, 1 insertions(+), 1 deletions(-)
>>   mode change 100644 => 100755 lib/librte_cmdline/cmdline.c
>
> Acked-by: Olivier Matz 

By the way, I just noticed that this bug had already been fixed into
my cmdline project onto my personal website some time ago:
http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=52e1d5772d1f465a013e65b8ae6e43a027438ed6

Regards,
Olivier




[dpdk-dev] Please any one who can help me with librte_sched

2014-05-27 Thread Dumitrescu, Cristian
Hi Ariel,

What's wrong with calling rte_sched_subport_config() and 
rte_sched_pipe_config() during run-time?

This assumes that:

1. Port initialization is done, which includes the following:
a) the number of subports, pipes per subport are fixed
b) the queues are all created and their size is fixed
c) the pipe profiles are defined
d) Basically the maximal data structures get created (maximum number of 
subports, pipes and queues) with no run-time changes allowed, apart from the 
bandwidth related parameters. Queues that do not receive packets are not used 
now, they will be used as soon as they get packets. The packets-to-queues 
mapping logic can change over time, as well as the level of activity for 
different users/queues.

2. The CPU core calling the subport/pipe config functions is the same as the 
core doing enqueue/dequeue for this port (for thread safety reasons).
a) As you say, the management core can send update requests to the core running 
the scheduler, with the latter sampling the request queue regularly and 
performing the updates.

Regards,
Cristian


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Stephen Hemminger
Sent: Tuesday, May 27, 2014 5:35 PM
To: Ariel Rodriguez
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Please any one who can help me with librte_sched

On Tue, 27 May 2014 10:33:02 -0300
Ariel Rodriguez  wrote:

> Hello, this is my third mail; the previous mails have not been answered
> yet.
>
> I just need someone to explain to me how the librte_sched framework behaves
> in a specific situation.
>
> I have a management application; it connects via a ring with the tx
> core. When a user applies some configuration of the bandwidth management,
> the tx core reads the message in the ring, parses the configuration into a
> rte_port_params struct, subport_params and pipe_params, then creates a new
> rte_sched from scratch, then changes the pointer of the
> rte_sched_port currently doing scheduling, and then the code executes
> rte_sched_port_free() on the unreferenced (referenced by temporal pointer)
> rte_sched_port. This is the only way I found for applying dynamic
> configuration or changes to the qos framework.
> So, with this, what happens with the packets attached to the old
> rte_sched_port when it is deleted? Are those lost packets inside the
> rte_sched_port generating memory leaks? How can I recover these packets,
> just by dequeuing from the port scheduler? Where does the port scheduler
> indicate empty packets in the queue state?
>
> Is there a better way to achieve this kind of behaviour? I just need to
> update the rte_sched_port configuration dynamically, and I want to change
> the current pipe configuration and subport configuration also.
>
> Regards.

If you need to do dynamic changes, I would recommend using an RCU type
algorithm where you exchange in new parameters and then cleanup/free
after a grace period.  See http://lttng.org/urcu





[dpdk-dev] [PATCH RFC 03/11] mbuf: remove rte_ctrlmbuf

2014-05-26 Thread Dumitrescu, Cristian
I am also using the rte_ctrlmbuf to send messages between cores in one of the 
Packet Framework sample apps that I am going to send as a patch tomorrow or 
later this week.

Removing rte_ctrlmbuf would require additional rework to this (complex) sample 
app. It can be done, but it is additional work very close to the code freeze 
cycle.

Thanks,
Cristian

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Gilmore, Walter E
Sent: Sunday, May 25, 2014 10:39 PM
To: Olivier Matz; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC 03/11] mbuf: remove rte_ctrlmbuf

Olivier you're making an assumption that customer application code running on 
the Intel DPDK isn't using the rte_ctrlmbuf structure. 
Remember there are more than 300 customers using the Intel DPDK and there is no 
way you can predict that this is not used by them. 
The purpose of this structure is to send commands, events or any other type of 
information between user application tasks (normally from a manager task).
It has been there since the beginning in the original design and it's up to the 
user to define what is in the data field and how they wish to use it. 
It's one thing to fix a bug, but to remove a structure like this because you 
don't see it used in the other parts is asking for trouble with customers. 

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
Sent: Friday, May 09, 2014 10:51 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH RFC 03/11] mbuf: remove rte_ctrlmbuf

The initial role of rte_ctrlmbuf is to carry generic messages (data pointer + 
data length) but it's not used by the DPDK or its applications.
Keeping it implies:
  - losing 1 byte in the rte_mbuf structure
  - having some dead code in rte_mbuf.[ch]

This patch removes this feature. Thanks to it, it is now possible to simplify 
the rte_mbuf structure by merging the rte_pktmbuf structure in it. This is done 
in next commit.

Signed-off-by: Olivier Matz 
---
 app/test-pmd/cmdline.c   |   1 -
 app/test-pmd/testpmd.c   |   2 -
 app/test-pmd/txonly.c|   2 +-
 app/test/commands.c  |   1 -
 app/test/test_mbuf.c |  72 +
 examples/ipv4_multicast/main.c   |   2 +-
 lib/librte_mbuf/rte_mbuf.c   |  65 +++-
 lib/librte_mbuf/rte_mbuf.h   | 175 ++-
 lib/librte_pmd_e1000/em_rxtx.c   |   2 +-
 lib/librte_pmd_e1000/igb_rxtx.c  |   2 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c|   4 +-
 lib/librte_pmd_virtio/virtio_rxtx.c  |   2 +-
 lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c|   2 +-
 lib/librte_pmd_xenvirt/rte_eth_xenvirt.c |   2 +-
 14 files changed, 54 insertions(+), 280 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 
7becedc..e3d1849 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -5010,7 +5010,6 @@ dump_struct_sizes(void)  #define DUMP_SIZE(t) 
printf("sizeof(" #t ") = %u\n", (unsigned)sizeof(t));
DUMP_SIZE(struct rte_mbuf);
DUMP_SIZE(struct rte_pktmbuf);
-   DUMP_SIZE(struct rte_ctrlmbuf);
DUMP_SIZE(struct rte_mempool);
DUMP_SIZE(struct rte_ring);
 #undef DUMP_SIZE
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index 
9c56914..76b3823 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -389,13 +389,11 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
mb_ctor_arg = (struct mbuf_ctor_arg *) opaque_arg;
mb = (struct rte_mbuf *) raw_mbuf;

-   mb->type = RTE_MBUF_PKT;
mb->pool = mp;
mb->buf_addr = (void *) ((char *)mb + mb_ctor_arg->seg_buf_offset);
mb->buf_physaddr = (uint64_t) (rte_mempool_virt2phy(mp, mb) +
mb_ctor_arg->seg_buf_offset);
mb->buf_len  = mb_ctor_arg->seg_buf_size;
-   mb->type = RTE_MBUF_PKT;
mb->ol_flags = 0;
mb->pkt.data = (char *) mb->buf_addr + RTE_PKTMBUF_HEADROOM;
mb->pkt.nb_segs  = 1;
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c index 
1cf2574..1f066d0 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -93,7 +93,7 @@ tx_mbuf_alloc(struct rte_mempool *mp)
struct rte_mbuf *m;

m = __rte_mbuf_raw_alloc(mp);
-   __rte_mbuf_sanity_check_raw(m, RTE_MBUF_PKT, 0);
+   __rte_mbuf_sanity_check_raw(m, 0);
return (m);
 }

diff --git a/app/test/commands.c b/app/test/commands.c index b145036..c69544b 
100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -262,7 +262,6 @@ dump_struct_sizes(void)  #define DUMP_SIZE(t) 
printf("sizeof(" #t ") = %u\n", (unsigned)sizeof(t));
DUMP_SIZE(struct rte_mbuf);
DUMP_SIZE(struct rte_pktmbuf);
-   DUMP_SIZE(struct rte_ctrlmbuf);
DUMP_SIZE(struct rte_mempool);
DUMP_SIZE(struct rte_ring);
 #undef DUMP_SIZE
diff --git 

[dpdk-dev] 4 Traffic classes per Pipe limitation

2013-11-29 Thread Dumitrescu, Cristian
Hi Ariel, some comments inlined below. Regards, Cristian

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ariel Rodriguez
Sent: Thursday, November 28, 2013 8:53 PM
To: dev at dpdk.org
Subject: [dpdk-dev] 4 Traffic classes per Pipe limitation

 Hi, I'm working with the qos scheduler framework and I have a few
questions. Why the 4 traffic classes per pipe limitation?

[Cristian] The DPDK hierarchical scheduler allows for 4 traffic classes with 4 
packet queues per traffic class for each pipe (user). Traffic classes and 
scheduled in strict priority, while queues within a pipe traffic class are 
scheduled using byte-level Weighted Round Robin (WRR) with configured weights. 
Since we have 16 packet queues per pipe (user), we could argue that actually 16 
(sub)traffic classes are supported. When the weight ratio between different 
queues of the same traffic class is high (e.g. 4:1, 8:1, 16:1, etc), then WRR 
starts behaving like strict priority. If your mapping to traffic classes is 
done using DSCP, you can simply map some DSCP values to different queues within 
same traffic class as well.

I'm developing a deep packet inspection solution for a telecom company and
we need more than just 4 traffic classes per pipe. I'm able to recognise
almost all layer 7 applications, such as youtube, p2p, netflix,
google-ads, etc., and I really need to map these types of flows to
different traffic classes.

[Cristian] Not sure I completely agree with you here. 
The traffic classes are there for the reason of providing different levels of 
delay (latency), delay variation (jitter), packet loss rate, etc. So, for 
example, one possible mapping of traffic types to the 4 traffic classes might 
be: voice = TC0 (highest priority), interactive video = TC1, 
non-interactive/cached video = TC2, best effort traffic (file downloads, web 
browsing, email, etc) = TC3 (lowest priority). In my opinion, youtube and 
netflix could be considered as being part of the same traffic class (e.g. TC2), 
as they have very similar (if not identical) requirements, probably same for 
p2p and google-ads, email, web browsing, etc (where best effort traffic class 
is probably applicable). If really needed, youtube and netflix could be mapped 
to different queues of TC2.
If different service / actions need to be applied to different applications 
that have the same scheduling requirements (and part of the same traffic 
class), then this would probably have to be decided earlier during the 
classification phase and e.g. rate limit youtube traffic per user using traffic 
metering algorithms, block p2p traffic if you are a firewall, etc; these are 
probably actions that could be enforced outside of scheduling/traffic 
management.

 The idea is to mark each flow depending on the provisioning
information and assign those flows to different subports depending on the
information given, and assign a pipe with the subscriber contract rate, but
we really need to have more than 4 traffic classes, because we want to
control the bandwidth of different layer 7 protocol flows. At most we
need 32 or 64 traffic classes per subscriber.

[Cristian] Bandwidth control could be done on both ingress side as well as 
egress side. 
On ingress, the amount of incoming traffic for a specific user 
(flow/connection/application) could be limited to predefined values, with 
potentially different levels for different classes of users (e.g. regular / 
premium / company / etc).
On egress, several pipe profiles can be defined using the DPDK hierarchical 
scheduler, which would allow setting up a different rate limit for each traffic 
class for each user. Likewise, traffic classes can be rate limited at the 
subport level (group of users).

 I understand that in a given interval of time a subscriber doesn't
use more than 4 protocols simultaneously, generally speaking, or 4
traffic classes in dpdk qos terminology, but the framework doesn't allow us
to configure more traffic classes.

 I'm looking at the code of the qos scheduler and I'm not seeing why this
restriction exists. Is it a performance problem, or a waste-of-resources
problem? Maybe when the port grinder searches for the active queues of each
traffic class, the delay of iterating over all pipes and each traffic class
is too high.
 Cisco has a bandwidth management solution that claims to control a
million subscribers simultaneously with 64 traffic classes per
subscriber (pipes) and 4 queues per traffic class (the Cisco solution calls
traffic classes "Bandwidth Controller per service" or BWC; a subscriber
can have 64 BWCs simultaneously). Is this possible? Maybe these guys
implement the bandwidth management in hardware?
 Anyway, I really need this feature, but if the qos scheduler
cannot scale to more than 4 traffic classes per pipe I would have to
implement a custom packet scheduler from scratch, and I really don't want to
do that.

 Thanks for the patience, and 
