from:"Adrien Mazarguil"

[dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)

2016-12-01 Thread Adrien Mazarguil

Hi Yulong,

On Mon, Nov 28, 2016 at 10:03:53AM +, Pei, Yulong wrote:
> Hi Adrien,
> 
> I  think that you already did test for your patchset,  do you have any 
> automated test scripts can be shared for validation since there did not have 
> testpmd flow command documentation yet?

No automated script, at least not yet. I intend to submit v2 with extra
API documentation, testpmd commands with examples of expected behavior and
output, as well as fixes for the issues pointed out by Nelio.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API

2016-12-01 Thread Adrien Mazarguil

Hi Kevin,

On Wed, Nov 30, 2016 at 05:47:17PM +, Kevin Traynor wrote:
> Hi Adrien,
> 
> On 11/16/2016 04:23 PM, Adrien Mazarguil wrote:
> > This new API supersedes all the legacy filter types described in
> > rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> > PMDs to process and validate flow rules.
> > 
> > Benefits:
> > 
> > - A unified API is easier to program for, applications do not have to be
> >   written for a specific filter type which may or may not be supported by
> >   the underlying device.
> > 
> > - The behavior of a flow rule is the same regardless of the underlying
> >   device, applications do not need to be aware of hardware quirks.
> > 
> > - Extensible by design, API/ABI breakage should rarely occur if at all.
> > 
> > - Documentation is self-standing, no need to look up elsewhere.
> > 
> > Existing filter types will be deprecated and removed in the near future.
> 
> I'd suggest to add a deprecation notice to deprecation.rst, ideally with
> a target release.

Will do, not sure about the target release though. It seems a bit early
since no PMD really supports this API yet.

[...]
> > diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
> > new file mode 100644
> > index 000..064963d
> > --- /dev/null
> > +++ b/lib/librte_ether/rte_flow.c
> > @@ -0,0 +1,159 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright 2016 6WIND S.A.
> > + *   Copyright 2016 Mellanox.
> 
> There's Mellanox copyright but you are the only signed-off-by - is that
> right?

Yes, I'm the primary maintainer for Mellanox PMDs and this API was designed
on their behalf to expose several features from mlx4/mlx5 as the existing
filter types had too many limitations.

[...]
> > +/* Get generic flow operations structure from a port. */
> > +const struct rte_flow_ops *
> > +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
> > +{
> > +   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > +   const struct rte_flow_ops *ops;
> > +   int code;
> > +
> > +   if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> > +      code = ENODEV;
> > +   else if (unlikely(!dev->dev_ops->filter_ctrl ||
> > +                     dev->dev_ops->filter_ctrl(dev,
> > +                                               RTE_ETH_FILTER_GENERIC,
> > +                                               RTE_ETH_FILTER_GET,
> > +                                               &ops) ||
> > +                     !ops))
> > +      code = ENOTSUP;
> > +   else
> > +      return ops;
> > +   rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> > +                      NULL, rte_strerror(code));
> > +   return NULL;
> > +}
> > +
> 
> Is it expected that the application or pmd will provide locking between
> these functions if required? I think it's going to have to be the app.

Locking is indeed expected to be performed by applications. This API only
documents the expected behavior and the places where locking would make
sense if necessary.

Like all control path APIs, this one assumes a single control thread.
Applications must take the necessary precautions.

[...]
> > +/**
> > + * Flow rule attributes.
> > + *
> > + * Priorities are set on two levels: per group and per rule within groups.
> > + *
> > + * Lower values denote higher priority, the highest priority for both levels
> > + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> > + * rule with priority 8 in group 0.
> > + *
> > + * Although optional, applications are encouraged to group similar rules as
> > + * much as possible to fully take advantage of hardware capabilities
> > + * (e.g. optimized matching) and work around limitations (e.g. a single
> > + * pattern type possibly allowed in a given group).
> > + *
> > + * Group and priority levels are arbitrary and up to the application, they
> > + * do not need to be contiguous nor start from 0, however the maximum number
> > + * varies between devices and may be affected by existing flow rules.
> > + *
> > + * If a packet is matched by several rules of a given group for a given
> > + * priority level, the outcome is undefined. It can take any path, may be
> > + * duplicated or even cause unrecoverable errors.
> 
> I get what you are trying to do here wrt supporting multiple
> pmds/hardware implementations and it's a good idea to keep it flexible.
> 
> Given that the outcome is undefined, it would be nice th
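
The ordering rule in the quoted comment (groups first, then priorities
within a group, lower values matched first) can be sketched as a comparison
function. This is an illustration only: struct rule_attr and rule_attr_cmp()
are hypothetical names that do not appear in the patch, and the structure
only models the two fields discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the group/priority pair of a flow rule;
 * this is not the real struct rte_flow_attr. */
struct rule_attr {
	uint32_t group;    /* priority group, group 0 is matched first */
	uint32_t priority; /* level within the group, 0 is the highest */
};

/*
 * Return a negative value if rule a is matched before rule b, a positive
 * value if it is matched after, and 0 when both share the same group and
 * priority level (the case where the outcome is documented as undefined).
 */
static int
rule_attr_cmp(const struct rule_attr *a, const struct rule_attr *b)
{
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}
```

With this model, a rule in group 8 with priority 0 compares as "after" a
rule in group 0 with priority 8, which is the example given in the comment.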

[dpdk-dev] [PATCH v12 0/6] add Tx preparation

2016-12-01 Thread Adrien Mazarguil

Hi Tomasz,

On Wed, Nov 30, 2016 at 10:30:54AM +, Kulasek, TomaszX wrote:
[...]
> > > In my opinion the second approach is both faster to applications and
> > > more friendly from a usability perspective, am I missing something
> > > obvious?
> > 
> > I think it was not clearly explained in this patchset, but this is my
> > understanding:
> > tx_prepare and tx_burst can be called at different stages of a pipeline,
> > on different cores.
> 
> Yes, this API is intended to be used optionally, not only just before tx_burst.
> 
> 1. Separating both stages:
>    a) We may have control over the burst (packet content, validation) when
>       needed.
>    b) For invalid packets we may restore them or do some other task if
>       needed (even at an early stage of processing).
>    c) Tx burst is kept as simple as it should be.
> 
> 2. Joining the functionality of tx_prepare and tx_burst has some
>    disadvantages:
>    a) When a packet is invalid it cannot be restored by the application and
>       should be dropped.
>    b) Tx burst needs to modify the content of the packet.
>    c) We have no way to eliminate the overhead of preparation (tx_prepare)
>       for applications where performance is key.
> 
> 3. Using tx callbacks
>    a) We still need to have different implementations for different devices.
>    b) The overhead in performance (compared to the pair tx_prepare/tx_burst)
>       will not be better since both ways use a very similar mechanism.
> 
> In addition, the tx_prepare mechanism can be turned off by a compilation
> flag (as discussed with Jerin in http://dpdk.org/dev/patchwork/patch/15770/)
> to provide real NOOP functionality (e.g. for low-end CPUs, where even an
> unnecessary memory dereference and check can have a significant impact on
> performance).

Thanks for the reminder, also I've missed v12 for some reason and still
thought rte_phdr_cksum_fix() was some generic function that applications had
to use directly regardless.

Although I agree with your description, I still think there is an issue,
please see my reply to Konstantin [1].

[1] http://dpdk.org/ml/archives/dev/2016-December/050970.html

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v12 0/6] add Tx preparation

2016-12-01 Thread Adrien Mazarguil

[...]
> > 
> > Actually I do not think we'll ever need tx_prep() unless we add our own
> > quirks to struct rte_eth_desc_lim (and friends) which are currently quietly
> > handled by TX burst functions.
> 
> Ok, so MLX PMD is not affected by these changes and tx_prep for MLX can be
> safely set to NULL, correct?

Correct, actually the rest of this message should be in a separate
thread. From the MLX side, there is no issue with tx_prepare().

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v12 0/6] add Tx preparation

2016-11-30 Thread Adrien Mazarguil

On Mon, Nov 28, 2016 at 12:03:06PM +0100, Thomas Monjalon wrote:
> We need attention of every PMD developers on this thread.

I've been following this thread from the beginning while working on rte_flow
and wanted to see where it was headed before replying. (I know, v11 was
submitted about 1 month ago but still.)

> Reminder of what Konstantin suggested:
> "
> - if the PMD supports TX offloads AND
> - if to be able use any of these offloads the upper layer SW would have to:
> * modify the contents of the packet OR
> * obey HW specific restrictions
> then it is a PMD developer responsibility to provide tx_prep() that would
> implement expected modifications of the packet contents and restriction
> checks.
> Otherwise, tx_prep() implementation is not required and can be safely set to
> NULL.
> "
> 
> I copy/paste also my previous conclusion:
> 
> Before txprep, there is only one API: the application must prepare the
> packets checksum itself (get_psd_sum in testpmd).
> With txprep, the application have 2 choices: keep doing the job itself
> or call txprep which calls a PMD-specific function.

Something is definitely needed here, and only PMDs can provide it. I think
applications should not have to clear checksum fields or initialize them to
some magic value; the same goes for any other offload or hardware limitation
that needs to be worked around.

tx_prep() is one possible answer to this issue, however as mentioned in the
original patch it can be very expensive if exposed by the PMD.

Another issue I'm more concerned about is the way limitations are managed
(struct rte_eth_desc_lim). While not officially tied to tx_prep(), this
structure contains new fields that are only relevant to a few devices, and I
fear it will keep growing with each new hardware quirk to manage, breaking
ABIs in the process.

What are applications supposed to do, check each of them regardless before
attempting to send a burst?

I understand tx_prep() automates this process, however I'm wondering why
the TX burst function isn't doing that itself. Using nb_mtu_seg_max as an
example, tx_prep() has an extra check in case of TSO that the TX burst
function does not perform. This ends up being much more expensive to
applications due to the additional loop doing redundant testing on each
mbuf.

If, say as a performance improvement, we decided to leave the validation
part to the TX burst function, what remains in tx_prep() is basically heavy
"preparation" requiring mbuf changes (i.e. erasing checksums, for now).

Following the same logic, why can't such a thing be made part of the TX
burst function as well (through a direct call to rte_phdr_cksum_fix()
whenever necessary)? From an application standpoint, what are the advantages
of having to:

 if (tx_prep()) // iterate and update mbufs as needed
     tx_burst(); // iterate and send

Compared to:

 tx_burst(); // iterate, update as needed and send

Note that PMDs could still provide different TX callbacks depending on the
set of enabled offloads so performance is not unnecessarily impacted.

In my opinion the second approach is both faster for applications and more
friendly from a usability perspective, am I missing something obvious?
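
To make the comparison above concrete, here is a minimal sketch of both call
patterns over a mock packet type. Everything in it is hypothetical (struct
pkt, the sketch_* functions, and modelling the fix-up as clearing a checksum
field); it is not the ethdev API, only an illustration of the loop structure
being discussed: two passes over the burst versus one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet, reduced to what the sketch needs. */
struct pkt {
	uint16_t cksum;      /* checksum field as left by the application */
	int needs_cksum_fix; /* offload requested, field must be cleared */
	int sent;            /* set once the packet has been "transmitted" */
};

/* Hypothetical preparation step: a first pass over the whole burst. */
static uint16_t
sketch_tx_prep(struct pkt *pkts, uint16_t n)
{
	uint16_t i;

	for (i = 0; i != n; ++i)
		if (pkts[i].needs_cksum_fix)
			pkts[i].cksum = 0;
	return n;
}

/* Hypothetical plain burst function: a second pass that only sends. */
static uint16_t
sketch_tx_burst(struct pkt *pkts, uint16_t n)
{
	uint16_t i;

	for (i = 0; i != n; ++i)
		pkts[i].sent = 1;
	return n;
}

/* Hypothetical combined burst function: fix up and send in a single pass. */
static uint16_t
sketch_tx_burst_fixup(struct pkt *pkts, uint16_t n)
{
	uint16_t i;

	for (i = 0; i != n; ++i) {
		if (pkts[i].needs_cksum_fix)
			pkts[i].cksum = 0;
		pkts[i].sent = 1;
	}
	return n;
}
```

Both sketch_tx_burst(pkts, sketch_tx_prep(pkts, n)) and
sketch_tx_burst_fixup(pkts, n) leave the packets in the same state; the
difference is that the first form walks the burst twice. The sketch does not
model the case raised elsewhere in the thread where preparation and
transmission happen at different pipeline stages.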

> The question is: does non-Intel drivers need a checksum preparation for TSO?
> Will it behave well if txprep does nothing in these drivers?
> 
> When looking at the code, most of drivers handle the TSO flags.
> But it is hard to know whether they rely on the pseudo checksum or not.
> 
> git grep -l 'PKT_TX_UDP_CKSUM\|PKT_TX_TCP_CKSUM\|PKT_TX_TCP_SEG' drivers/net/
> 
> drivers/net/bnxt/bnxt_txr.c
> drivers/net/cxgbe/sge.c
> drivers/net/e1000/em_rxtx.c
> drivers/net/e1000/igb_rxtx.c
> drivers/net/ena/ena_ethdev.c
> drivers/net/enic/enic_rxtx.c
> drivers/net/fm10k/fm10k_rxtx.c
> drivers/net/i40e/i40e_rxtx.c
> drivers/net/ixgbe/ixgbe_rxtx.c
> drivers/net/mlx4/mlx4.c
> drivers/net/mlx5/mlx5_rxtx.c
> drivers/net/nfp/nfp_net.c
> drivers/net/qede/qede_rxtx.c
> drivers/net/thunderx/nicvf_rxtx.c
> drivers/net/virtio/virtio_rxtx.c
> drivers/net/vmxnet3/vmxnet3_rxtx.c
> 
> Please, we need a comment for each driver saying
> "it is OK, we do not need any checksum preparation for TSO"
> or
> "yes we have to implement tx_prepare or TSO will not work in this mode"

For both mlx4 and mlx5 then,
"it is OK, we do not need any checksum preparation for TSO".

Actually I do not think we'll ever need tx_prep() unless we add our own
quirks to struct rte_eth_desc_lim (and friends) which are currently quietly
handled by TX burst functions.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v2] drivers: advertise kmod dependencies in pmdinfo

2016-11-22 Thread Adrien Mazarguil

Hi Olivier,

Neither mlx4 nor mlx5 depend on igb/uio/vfio modules, please see below.

On Tue, Nov 22, 2016 at 10:50:57AM +0100, Olivier Matz wrote:
> Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
> declare the list of kernel modules required to run properly.
> 
> Today, most PCI drivers require uio/vfio.
> 
> Signed-off-by: Olivier Matz 
> Acked-by: Fiona Trahe 
> ---
[...]
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index da61a85..a0065bf 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -5937,3 +5937,4 @@ rte_mlx4_pmd_init(void)
>  
>  RTE_PMD_EXPORT_NAME(net_mlx4, __COUNTER__);
>  RTE_PMD_REGISTER_PCI_TABLE(net_mlx4, mlx4_pci_id_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_mlx4, "* igb_uio | uio_pci_generic | vfio");

RTE_PMD_REGISTER_KMOD_DEP(net_mlx4, "* ib_uverbs & mlx4_en & mlx4_core & mlx4_ib");

> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 90cc35e..b0343f3 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -759,3 +759,4 @@ rte_mlx5_pmd_init(void)
> 
>  RTE_PMD_EXPORT_NAME(net_mlx5, __COUNTER__);
>  RTE_PMD_REGISTER_PCI_TABLE(net_mlx5, mlx5_pci_id_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* igb_uio | uio_pci_generic | vfio");

RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* ib_uverbs & mlx5_core & mlx5_ib");

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API

2016-11-18 Thread Adrien Mazarguil

Hi Beilei,

On Fri, Nov 18, 2016 at 06:36:31AM +, Xing, Beilei wrote:
> Hi Adrien,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Thursday, November 17, 2016 12:23 AM
> > To: dev at dpdk.org
> > Cc: Thomas Monjalon ; De Lara Guarch,
> > Pablo ; Olivier Matz
> > 
> > Subject: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> > 
> > This new API supersedes all the legacy filter types described in
> > rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> > PMDs to process and validate flow rules.
> > 
> > Benefits:
> > 
> > - A unified API is easier to program for, applications do not have to be
> >   written for a specific filter type which may or may not be supported by
> >   the underlying device.
> > 
> > - The behavior of a flow rule is the same regardless of the underlying
> >   device, applications do not need to be aware of hardware quirks.
> > 
> > - Extensible by design, API/ABI breakage should rarely occur if at all.
> > 
> > - Documentation is self-standing, no need to look up elsewhere.
> > 
> > Existing filter types will be deprecated and removed in the near future.
> > 
> > Signed-off-by: Adrien Mazarguil 
> 
> 
> > +
> > +/**
> > + * Opaque type returned after successfully creating a flow.
> > + *
> > + * This handle can be used to manage and query the related flow (e.g. to
> > + * destroy it or retrieve counters).
> > + */
> > +struct rte_flow;
> > +
> 
> As we talked before, we use attr/pattern/actions to create and destroy a flow
> in PMD, but I don't think it's easy to clone the user-provided parameters and
> return the result to the application as a rte_flow pointer.  As you suggested:
> /* PMD-specific code. */
>  struct rte_flow {
>     struct rte_flow_attr attr;
>     struct rte_flow_item *pattern;
>     struct rte_flow_action *actions;
>  };

Just to provide some context to the community since the above snippet comes
from private exchanges, I've suggested the above structure as a means to
create and remove rules in the same fashion as FDIR, by providing the rule
used for creation to the destroy callback.

As an opaque type, each PMD currently needs to implement its own version of
struct rte_flow. The above definition may ease transition from FDIR to
rte_flow for some PMDs, however they need to clone the entire
application-provided rule to do so because there is no requirement for it to
be kept allocated.

I've implemented such a function in testpmd (port_flow_new() in commit [1])
as an example.

 [1] http://dpdk.org/ml/archives/dev/2016-November/050266.html

However my suggestion is for PMDs to use their own HW-specific structure
that only contains relevant information instead of being forced to drag
around large, non-native data that lacks useful context and requires
parsing every time. This is one benefit of using an opaque type in the first
place, the other being ABI breakage avoidance.
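
As an illustration of that point, the following is a minimal sketch of an
opaque handle backed by a compact, driver-private structure. All names
(struct sketch_flow, struct sketch_rule, the sketch_* functions) and the way
the rule is encoded are hypothetical; this is not the rte_flow API nor any
PMD's actual layout. It only shows that the generic rule can live on the
caller's stack for the duration of the create call while the driver keeps
its own representation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Application-visible side: the handle type is declared, not defined. */
struct sketch_flow;

/* Hypothetical generic rule as supplied by the application. */
struct sketch_rule {
	uint16_t udp_dst_port; /* match: UDP destination port */
	uint16_t queue_index;  /* action: target queue */
};

/* Driver-private side: only what this imaginary hardware needs. */
struct sketch_flow {
	uint32_t hw_entry; /* hypothetical packed encoding of the rule */
};

/* Convert the generic rule once; nothing points back into *rule. */
static struct sketch_flow *
sketch_flow_create(const struct sketch_rule *rule)
{
	struct sketch_flow *flow = malloc(sizeof(*flow));

	if (flow == NULL)
		return NULL;
	flow->hw_entry = ((uint32_t)rule->udp_dst_port << 16) |
			 rule->queue_index;
	return flow;
}

/* Destroy using the handle only, the original rule is not needed. */
static int
sketch_flow_destroy(struct sketch_flow *flow)
{
	free(flow);
	return 0;
}

/* Application side: the rule is temporary and goes away on return. */
static struct sketch_flow *
sketch_app_add_rule(void)
{
	struct sketch_rule rule = { .udp_dst_port = 4789, .queue_index = 3 };

	return sketch_flow_create(&rule);
}
```

Because the handle does not reference the caller's rule, no deep copy of
pattern and action lists is required in this model.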

> Because both pattern and actions are pointers, and there're also pointers in
> structure rte_flow_item and struct rte_flow_action. We need to iterate
> allocation during clone and iterate free during destroy, then seems that the
> code is something ugly, right?

Well since I wrote that code, I won't easily admit it's ugly. I think PMDs
should not require the duplication of generic rules actually, which are only
defined as a common language between applications and PMDs. Both are free to
store rules in their own preferred and efficient format internally.

> I think application saves info when creating a flow rule, so why not
> application provide attr/pattern/actions info to PMD before calling PMD API?

They have to do so temporarily (e.g. allocated on the stack) while calling
rte_flow_create() and rte_flow_validate(), that's it. Once a rule is
created, there's no requirement for applications to keep anything around.

For simple applications such as testpmd, the generic format is probably
enough. More complex and existing applications such as ovs-dpdk may rather
choose to keep using their internal format that already fits their needs;
partially duplicating this information in rte_flow_attr and
rte_flow_item/rte_flow_action lists would waste memory. The conversion in
this case should only be performed when creating/validating flow rules.

In short, I fail to see any downside with maintaining struct rte_flow opaque
to applications.

Best regards,

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH 3/3] net/mlx5: do not invalidate title CQE

2016-11-17 Thread Adrien Mazarguil

On Thu, Nov 17, 2016 at 10:49:56AM +0100, Nelio Laranjeiro wrote:
> We can leave the title completion queue entry untouched since its contents
> are not modified.
> 
> Reported-by: Liming Sun 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 04860bb..ffd09ac 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1162,7 +1162,7 @@ mlx5_rx_poll_len(struct rxq *rxq, volatile struct mlx5_cqe *cqe,
>   zip->na += 8;
>   }
>   if (unlikely(rxq->zip.ai == rxq->zip.cqe_cnt)) {
> - uint16_t idx = rxq->cq_ci;
> + uint16_t idx = rxq->cq_ci + 1;
>   uint16_t end = zip->cq_ci;
>  
>       while (idx != end) {
> -- 
> 2.1.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH 2/3] net/mlx5: fix wrong htons

2016-11-17 Thread Adrien Mazarguil

On Thu, Nov 17, 2016 at 10:49:55AM +0100, Nelio Laranjeiro wrote:
> Completion queue entry data uses network endian, to access them we should use
> ntoh*().
> 
> Fixes: c305090bbaf8 ("net/mlx5: replace countdown with threshold for Tx completions")
> 
> CC: stable at dpdk.org
> Reported-by: Liming Sun 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 9bd4d80..04860bb 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -201,7 +201,7 @@ txq_complete(struct txq *txq)
>   } while (1);
>   if (unlikely(cqe == NULL))
>   return;
> - wqe = &(*txq->wqes)[htons(cqe->wqe_counter) &
> + wqe = &(*txq->wqes)[ntohs(cqe->wqe_counter) &
>   ((1 << txq->wqe_n) - 1)].hdr;
>   elts_tail = wqe->ctrl[3];
>   assert(elts_tail < (1 << txq->wqe_n));
> -- 
> 2.1.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND
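
For context on the patch above, here is a minimal sketch of reading a 16-bit
counter stored in network byte order and reducing it to a ring index. struct
sketch_cqe and sketch_wqe_index() are hypothetical and much simpler than the
mlx5 structures. Since htons() and ntohs() perform the same transformation
on a given platform, the sketch also suggests why such a mix-up would not
change the computed value; using ntohs() states the intended direction of
the conversion.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical completion entry: the counter is stored big-endian. */
struct sketch_cqe {
	uint16_t wqe_counter; /* network byte order */
};

/*
 * Convert the counter to host order, then mask it to index a ring of
 * (1 << ring_log2) entries.
 */
static uint16_t
sketch_wqe_index(const struct sketch_cqe *cqe, unsigned int ring_log2)
{
	return ntohs(cqe->wqe_counter) & ((1u << ring_log2) - 1);
}
```

The conversion is applied when reading from the device-written field, which
is the "network to host" direction.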

[dpdk-dev] [PATCH 1/3] net/mlx5: fix leak when starvation occurs

2016-11-17 Thread Adrien Mazarguil

On Thu, Nov 17, 2016 at 10:49:54AM +0100, Nelio Laranjeiro wrote:
> The list of segments to free was wrongly manipulated ending by only freeing
> the first segment instead of freeing all of them.  The last one still
> belongs to the NIC and thus should not be freed.
> 
> Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")
> 
> CC: stable at dpdk.org
> Reported-by: Liming Sun 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index beff580..9bd4d80 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1312,10 +1312,10 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>   }
>   while (pkt != seg) {
>   assert(pkt != (*rxq->elts)[idx]);
> - seg = NEXT(pkt);
> + rep = NEXT(pkt);
>   rte_mbuf_refcnt_set(pkt, 0);
>   __rte_mbuf_raw_free(pkt);
> - pkt = seg;
> + pkt = rep;
>       }
>   break;
>   }
> -- 
> 2.1.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND
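
The bug fixed by the patch above can be sketched with a plain linked list.
struct sketch_seg and sketch_free_chain() are hypothetical stand-ins for the
mbuf chain and the loop in mlx5_rx_burst(); marking a segment as freed
replaces the actual mbuf release.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment: a singly linked chain, like chained mbufs. */
struct sketch_seg {
	struct sketch_seg *next;
	int freed;
};

/*
 * Release every segment from pkt up to, but not including, last (which is
 * assumed to still belong to the device). Returns the number released.
 */
static unsigned int
sketch_free_chain(struct sketch_seg *pkt, struct sketch_seg *last)
{
	unsigned int n = 0;

	while (pkt != last) {
		/*
		 * Use a separate cursor: storing the next pointer in
		 * `last` instead would make pkt == last after the first
		 * iteration and leave the remaining segments unreleased.
		 */
		struct sketch_seg *next = pkt->next;

		pkt->freed = 1;
		pkt = next;
		++n;
	}
	return n;
}
```

For a chain a -> b -> c with last pointing at c, this releases a and b and
leaves c untouched.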

[dpdk-dev] [PATCH 22/22] app/testpmd: add queue actions to flow command

2016-11-16 Thread Adrien Mazarguil

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 152 +++
 1 file changed, 152 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e166045..70e2b76 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -157,8 +157,15 @@ enum index {
ACTION_MARK,
ACTION_MARK_ID,
ACTION_FLAG,
+   ACTION_QUEUE,
+   ACTION_QUEUE_INDEX,
ACTION_DROP,
ACTION_COUNT,
+   ACTION_DUP,
+   ACTION_DUP_INDEX,
+   ACTION_RSS,
+   ACTION_RSS_QUEUES,
+   ACTION_RSS_QUEUE,
ACTION_PF,
ACTION_VF,
ACTION_VF_ORIGINAL,
@@ -172,6 +179,14 @@ enum index {
 #define ITEM_RAW_SIZE \
(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)

+/** Number of queue[] entries in struct rte_flow_action_rss. */
+#define ACTION_RSS_NUM 32
+
+/** Storage size for struct rte_flow_action_rss including queues. */
+#define ACTION_RSS_SIZE \
+   (offsetof(struct rte_flow_action_rss, queue) + \
+sizeof(*((struct rte_flow_action_rss *)0)->queue) * ACTION_RSS_NUM)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16

@@ -489,8 +504,11 @@ static const enum index next_action[] = {
ACTION_PASSTHRU,
ACTION_MARK,
ACTION_FLAG,
+   ACTION_QUEUE,
ACTION_DROP,
ACTION_COUNT,
+   ACTION_DUP,
+   ACTION_RSS,
ACTION_PF,
ACTION_VF,
0,
@@ -502,6 +520,24 @@ static const enum index action_mark[] = {
0,
 };

+static const enum index action_queue[] = {
+   ACTION_QUEUE_INDEX,
+   ACTION_NEXT,
+   0,
+};
+
+static const enum index action_dup[] = {
+   ACTION_DUP_INDEX,
+   ACTION_NEXT,
+   0,
+};
+
+static const enum index action_rss[] = {
+   ACTION_RSS_QUEUES,
+   ACTION_NEXT,
+   0,
+};
+
 static const enum index action_vf[] = {
ACTION_VF_ORIGINAL,
ACTION_VF_ID,
@@ -519,6 +555,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_action_rss_queue(struct context *, const struct token *,
+const char *, unsigned int, void *,
+unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 const char *, unsigned int,
 void *, unsigned int);
@@ -568,6 +607,8 @@ static int comp_port(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
unsigned int, char *, unsigned int);
+static int comp_vc_action_rss_queue(struct context *, const struct token *,
+   unsigned int, char *, unsigned int);

 /** Token definitions. */
 static const struct token token_list[] = {
@@ -1169,6 +1210,21 @@ static const struct token token_list[] = {
.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
.call = parse_vc,
},
+   [ACTION_QUEUE] = {
+   .name = "queue",
+   .help = "assign packets to a given queue index",
+   .priv = PRIV_ACTION(QUEUE,
+   sizeof(struct rte_flow_action_queue)),
+   .next = NEXT(action_queue),
+   .call = parse_vc,
+   },
+   [ACTION_QUEUE_INDEX] = {
+   .name = "index",
+   .help = "queue index to use",
+   .next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_action_queue, index)),
+   .call = parse_vc_conf,
+   },
[ACTION_DROP] = {
.name = "drop",
.help = "drop packets (note: passthru has priority)",
@@ -1183,6 +1239,39 @@ static const struct token token_list[] = {
.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
.call = parse_vc,
},
+   [ACTION_DUP] = {
+   .name = "dup",
+   .help = "duplicate packets to a given queue index",
+   .priv = PRIV_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+   .next = NEXT(action_dup),
+   .call = parse_vc,
+   },
+   [ACTION_DUP_INDEX] = {
+   .name = "index",
+   .help = "queue index to duplicate packets to",
+   .next = NEXT(actio
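
The ACTION_RSS_SIZE macro in this patch sizes a structure that ends with a
variable-length queue[] array. A minimal sketch of the same offsetof()
technique follows, assuming a simplified layout: struct sketch_action_rss,
SKETCH_RSS_NUM, SKETCH_RSS_SIZE and sketch_rss_add_queue() are hypothetical
names, not the definitions from rte_flow.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in: a small header followed by queue indices. */
struct sketch_action_rss {
	uint16_t num;     /* number of valid entries in queue[] */
	uint16_t queue[]; /* flexible array member */
};

/* Maximum number of queue[] entries the storage is sized for. */
#define SKETCH_RSS_NUM 32

/*
 * Storage size including queues: offset of the array plus room for
 * SKETCH_RSS_NUM elements. sizeof does not evaluate its operand, so the
 * null pointer is never dereferenced.
 */
#define SKETCH_RSS_SIZE \
	(offsetof(struct sketch_action_rss, queue) + \
	 sizeof(*((struct sketch_action_rss *)0)->queue) * SKETCH_RSS_NUM)

/* Append a queue index; return 0 on success, -1 when queue[] is full. */
static int
sketch_rss_add_queue(struct sketch_action_rss *rss, uint16_t queue)
{
	if (rss->num >= SKETCH_RSS_NUM)
		return -1;
	rss->queue[rss->num++] = queue;
	return 0;
}
```

A buffer of SKETCH_RSS_SIZE bytes (for instance from calloc()) can then hold
the header and up to SKETCH_RSS_NUM queue indices.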

[dpdk-dev] [PATCH 21/22] app/testpmd: add various actions to flow command

2016-11-16 Thread Adrien Mazarguil

- MARK: attach 32 bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 121 +++
 1 file changed, 121 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 892f300..e166045 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -154,6 +154,15 @@ enum index {
ACTION_END,
ACTION_VOID,
ACTION_PASSTHRU,
+   ACTION_MARK,
+   ACTION_MARK_ID,
+   ACTION_FLAG,
+   ACTION_DROP,
+   ACTION_COUNT,
+   ACTION_PF,
+   ACTION_VF,
+   ACTION_VF_ORIGINAL,
+   ACTION_VF_ID,
 };

 /** Size of pattern[] field in struct rte_flow_item_raw. */
@@ -478,6 +487,25 @@ static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
ACTION_PASSTHRU,
+   ACTION_MARK,
+   ACTION_FLAG,
+   ACTION_DROP,
+   ACTION_COUNT,
+   ACTION_PF,
+   ACTION_VF,
+   0,
+};
+
+static const enum index action_mark[] = {
+   ACTION_MARK_ID,
+   ACTION_NEXT,
+   0,
+};
+
+static const enum index action_vf[] = {
+   ACTION_VF_ORIGINAL,
+   ACTION_VF_ID,
+   ACTION_NEXT,
0,
 };

@@ -489,6 +517,8 @@ static int parse_vc(struct context *, const struct token *,
void *, unsigned int);
 static int parse_vc_spec(struct context *, const struct token *,
 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_conf(struct context *, const struct token *,
+const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 const char *, unsigned int,
 void *, unsigned int);
@@ -1118,6 +1148,70 @@ static const struct token token_list[] = {
.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
.call = parse_vc,
},
+   [ACTION_MARK] = {
+   .name = "mark",
+   .help = "attach 32 bit value to packets",
+   .priv = PRIV_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+   .next = NEXT(action_mark),
+   .call = parse_vc,
+   },
+   [ACTION_MARK_ID] = {
+   .name = "id",
+   .help = "32 bit value to return with packets",
+   .next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_action_mark, id)),
+   .call = parse_vc_conf,
+   },
+   [ACTION_FLAG] = {
+   .name = "flag",
+   .help = "flag packets",
+   .priv = PRIV_ACTION(FLAG, 0),
+   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+   .call = parse_vc,
+   },
+   [ACTION_DROP] = {
+   .name = "drop",
+   .help = "drop packets (note: passthru has priority)",
+   .priv = PRIV_ACTION(DROP, 0),
+   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+   .call = parse_vc,
+   },
+   [ACTION_COUNT] = {
+   .name = "count",
+   .help = "enable counters for this rule",
+   .priv = PRIV_ACTION(COUNT, 0),
+   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+   .call = parse_vc,
+   },
+   [ACTION_PF] = {
+   .name = "pf",
+   .help = "redirect packets to physical device function",
+   .priv = PRIV_ACTION(PF, 0),
+   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+   .call = parse_vc,
+   },
+   [ACTION_VF] = {
+   .name = "vf",
+   .help = "redirect packets to virtual device function",
+   .priv = PRIV_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+   .next = NEXT(action_vf),
+   .call = parse_vc,
+   },
+   [ACTION_VF_ORIGINAL] = {
+   .name = "original",
+   .help = "use original VF ID if possible",
+   .next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
+   .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
+  original)),
+   .call = parse_vc_conf,
+   },
+   [ACTION_VF_ID] = {
+   .name = "id",
+   .help = "VF ID to redirect packets to",
+   .next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_action_vf, id)),
+   .call = parse_vc_conf,
+   },
 };

 /** Remove and return last entry from argument stack. */
@@ -1
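
The token tables in this patch pair each keyword with a zero-terminated list
of tokens allowed to follow it. A much reduced sketch of that idea is shown
below, assuming a simplified design: the sk_* names are hypothetical,
argument values (such as the number after "id") are left out, and the real
parser in cmdline_flow.c does considerably more (argument storage,
completion, help).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token indices; 0 is reserved as the list terminator. */
enum sketch_index { SK_ZERO = 0, SK_MARK, SK_MARK_ID, SK_DROP, SK_NEXT };

struct sketch_token {
	const char *name;              /* keyword as typed by the user */
	const enum sketch_index *next; /* zero-terminated follow-up list */
};

static const enum sketch_index sk_actions[] = { SK_MARK, SK_DROP, 0 };
static const enum sketch_index sk_mark[] = { SK_MARK_ID, SK_NEXT, 0 };
static const enum sketch_index sk_after[] = { SK_NEXT, 0 };

static const struct sketch_token sk_tokens[] = {
	[SK_MARK] = { "mark", sk_mark },
	[SK_MARK_ID] = { "id", sk_after },
	[SK_DROP] = { "drop", sk_after },
	[SK_NEXT] = { "/", sk_actions },
};

/* Look up word among the candidates; return its index or SK_ZERO. */
static enum sketch_index
sketch_match(const enum sketch_index *list, const char *word)
{
	for (; *list != 0; ++list)
		if (strcmp(sk_tokens[*list].name, word) == 0)
			return *list;
	return SK_ZERO;
}

/* Return how many leading words form a valid sequence. */
static unsigned int
sketch_parse(const char *const words[], unsigned int n)
{
	const enum sketch_index *list = sk_actions;
	unsigned int i;

	for (i = 0; i != n; ++i) {
		enum sketch_index idx = sketch_match(list, words[i]);

		if (idx == SK_ZERO)
			break;
		list = sk_tokens[idx].next;
	}
	return i;
}
```

Adding a keyword in this model means adding one enum entry, one table entry
and listing it in the follow-up lists where it is allowed, which mirrors the
shape of the additions in the diff.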

[dpdk-dev] [PATCH 20/22] app/testpmd: add L4 items to flow command

2016-11-16 Thread Adrien Mazarguil

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 163 +++
 1 file changed, 163 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 75096df..892f300 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -133,6 +133,20 @@ enum index {
ITEM_IPV6,
ITEM_IPV6_SRC,
ITEM_IPV6_DST,
+   ITEM_ICMP,
+   ITEM_ICMP_TYPE,
+   ITEM_ICMP_CODE,
+   ITEM_UDP,
+   ITEM_UDP_SRC,
+   ITEM_UDP_DST,
+   ITEM_TCP,
+   ITEM_TCP_SRC,
+   ITEM_TCP_DST,
+   ITEM_SCTP,
+   ITEM_SCTP_SRC,
+   ITEM_SCTP_DST,
+   ITEM_VXLAN,
+   ITEM_VXLAN_VNI,

/* Validate/create actions. */
ACTIONS,
@@ -360,6 +374,11 @@ static const enum index next_item[] = {
ITEM_VLAN,
ITEM_IPV4,
ITEM_IPV6,
+   ITEM_ICMP,
+   ITEM_UDP,
+   ITEM_TCP,
+   ITEM_SCTP,
+   ITEM_VXLAN,
0,
 };

@@ -421,6 +440,40 @@ static const enum index item_ipv6[] = {
0,
 };

+static const enum index item_icmp[] = {
+   ITEM_ICMP_TYPE,
+   ITEM_ICMP_CODE,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_udp[] = {
+   ITEM_UDP_SRC,
+   ITEM_UDP_DST,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_tcp[] = {
+   ITEM_TCP_SRC,
+   ITEM_TCP_DST,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_sctp[] = {
+   ITEM_SCTP_SRC,
+   ITEM_SCTP_DST,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_vxlan[] = {
+   ITEM_VXLAN_VNI,
+   ITEM_NEXT,
+   0,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -936,6 +989,103 @@ static const struct token token_list[] = {
.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 hdr.dst_addr)),
},
+   [ITEM_ICMP] = {
+   .name = "icmp",
+   .help = "match ICMP header",
+   .priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+   .next = NEXT(item_icmp),
+   .call = parse_vc,
+   },
+   [ITEM_ICMP_TYPE] = {
+   .name = "type",
+   .help = "ICMP packet type",
+   .next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+hdr.icmp_type)),
+   },
+   [ITEM_ICMP_CODE] = {
+   .name = "code",
+   .help = "ICMP packet code",
+   .next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+hdr.icmp_code)),
+   },
+   [ITEM_UDP] = {
+   .name = "udp",
+   .help = "match UDP header",
+   .priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+   .next = NEXT(item_udp),
+   .call = parse_vc,
+   },
+   [ITEM_UDP_SRC] = {
+   .name = "src",
+   .help = "UDP source port",
+   .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+hdr.src_port)),
+   },
+   [ITEM_UDP_DST] = {
+   .name = "dst",
+   .help = "UDP destination port",
+   .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+hdr.dst_port)),
+   },
+   [ITEM_TCP] = {
+   .name = "tcp",
+   .help = "match TCP header",
+   .priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+   .next = NEXT(item_tcp),
+   .call = parse_vc,
+   },
+   [ITEM_TCP_SRC] = {
+   .name = "src",
+   .help = "TCP source port",
+   .next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+hdr.src_port)),
+   },
+   [ITEM_TCP_DST] = {
+   .name = "dst",
+   .help = "TCP destination port",
+   .next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_

[dpdk-dev] [PATCH 19/22] app/testpmd: add items ipv4/ipv6 to flow command

2016-11-16 Thread Adrien Mazarguil

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).
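
The body of parse_ipv4_addr() is cut off in this archive, so the following
is not the patch's code. It is a minimal sketch, assuming POSIX inet_pton()
is available, of how a dotted-quad string can be turned into the four
network-order bytes an IPv4 address field expects. The function name is
hypothetical:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse dotted-quad text into 4 bytes in network order.
 * Returns 0 on success, -1 when text is not a valid IPv4 address.
 */
static int ipv4_text_to_bytes(const char *text, uint8_t out[4])
{
	struct in_addr addr;

	if (inet_pton(AF_INET, text, &addr) != 1)
		return -1;
	/* s_addr is already stored in network byte order. */
	memcpy(out, &addr.s_addr, 4);
	return 0;
}
```

The same approach should work for IPv6 with AF_INET6 and a 16-byte
destination.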

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 177 +++
 1 file changed, 177 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f2bd405..75096df 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -61,6 +62,8 @@ enum index {
BOOLEAN,
STRING,
MAC_ADDR,
+   IPV4_ADDR,
+   IPV6_ADDR,
RULE_ID,
PORT_ID,
GROUP_ID,
@@ -124,6 +127,12 @@ enum index {
ITEM_VLAN,
ITEM_VLAN_TPID,
ITEM_VLAN_TCI,
+   ITEM_IPV4,
+   ITEM_IPV4_SRC,
+   ITEM_IPV4_DST,
+   ITEM_IPV6,
+   ITEM_IPV6_SRC,
+   ITEM_IPV6_DST,

/* Validate/create actions. */
ACTIONS,
@@ -349,6 +358,8 @@ static const enum index next_item[] = {
ITEM_RAW,
ITEM_ETH,
ITEM_VLAN,
+   ITEM_IPV4,
+   ITEM_IPV6,
0,
 };

@@ -396,6 +407,20 @@ static const enum index item_vlan[] = {
0,
 };

+static const enum index item_ipv4[] = {
+   ITEM_IPV4_SRC,
+   ITEM_IPV4_DST,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_ipv6[] = {
+   ITEM_IPV6_SRC,
+   ITEM_IPV6_DST,
+   ITEM_NEXT,
+   0,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -441,6 +466,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+  const char *, unsigned int,
+  void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+  const char *, unsigned int,
+  void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
@@ -511,6 +542,20 @@ static const struct token token_list[] = {
.call = parse_mac_addr,
.comp = comp_none,
},
+   [IPV4_ADDR] = {
+   .name = "{IPv4 address}",
+   .type = "IPV4 ADDRESS",
+   .help = "standard IPv4 address notation",
+   .call = parse_ipv4_addr,
+   .comp = comp_none,
+   },
+   [IPV6_ADDR] = {
+   .name = "{IPv6 address}",
+   .type = "IPV6 ADDRESS",
+   .help = "standard IPv6 address notation",
+   .call = parse_ipv6_addr,
+   .comp = comp_none,
+   },
[RULE_ID] = {
.name = "{rule id}",
.type = "RULE ID",
@@ -849,6 +894,48 @@ static const struct token token_list[] = {
.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
},
+   [ITEM_IPV4] = {
+   .name = "ipv4",
+   .help = "match IPv4 header",
+   .priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+   .next = NEXT(item_ipv4),
+   .call = parse_vc,
+   },
+   [ITEM_IPV4_SRC] = {
+   .name = "src",
+   .help = "source address",
+   .next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+hdr.src_addr)),
+   },
+   [ITEM_IPV4_DST] = {
+   .name = "dst",
+   .help = "destination address",
+   .next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+hdr.dst_addr)),
+   },
+   [ITEM_IPV6] = {
+   .name = "ipv6",
+   .help = "match IPv6 header",
+   .priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+   .next = NEXT(item_ipv6),
+   .call = parse_vc,
+   },
+   [ITEM_IPV6_SRC] = {
+   .name = "src",
+   .help = "source address",
+   .next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+hdr.src_addr)),
+   },
+   [ITEM_IPV6_DST] = {
+

[dpdk-dev] [PATCH 18/22] app/testpmd: add items eth/vlan to flow command

2016-11-16 Thread Adrien Mazarguil

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.
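
As background for the "tci" field: in 802.1Q the tag control information is
a 16-bit value made of a 3-bit priority (PCP), a 1-bit drop eligible
indicator (DEI) and a 12-bit VLAN ID, sent in network byte order. A minimal
sketch of how such a value could be composed follows; the helper names are
hypothetical and not taken from the patch:

```c
#include <stdint.h>

/* Compose a host-order TCI from its three components (extra bits dropped). */
static uint16_t vlan_tci_make(unsigned int pcp, unsigned int dei,
			      unsigned int vid)
{
	return (uint16_t)(((pcp & 0x7u) << 13) |
			  ((dei & 0x1u) << 12) |
			  (vid & 0xfffu));
}

/* Write a 16-bit value in network byte order (most significant byte first). */
static void u16_to_be(uint16_t v, uint8_t out[2])
{
	out[0] = (uint8_t)(v >> 8);
	out[1] = (uint8_t)(v & 0xff);
}
```

With testpmd the user would supply the composed number directly; the
byte-order conversion is what ARGS_ENTRY_HTON() requests from the parser.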

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 126 +++
 1 file changed, 126 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6f2f26c..f2bd405 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "testpmd.h"
@@ -59,6 +60,7 @@ enum index {
PREFIX,
BOOLEAN,
STRING,
+   MAC_ADDR,
RULE_ID,
PORT_ID,
GROUP_ID,
@@ -115,6 +117,13 @@ enum index {
ITEM_RAW_OFFSET,
ITEM_RAW_LIMIT,
ITEM_RAW_PATTERN,
+   ITEM_ETH,
+   ITEM_ETH_DST,
+   ITEM_ETH_SRC,
+   ITEM_ETH_TYPE,
+   ITEM_VLAN,
+   ITEM_VLAN_TPID,
+   ITEM_VLAN_TCI,

/* Validate/create actions. */
ACTIONS,
@@ -239,6 +248,14 @@ struct token {
.size = (sz), \
})

+/** Same as ARGS_ENTRY() using network byte ordering. */
+#define ARGS_ENTRY_HTON(s, f) \
+   (&(const struct arg){ \
+   .hton = 1, \
+   .offset = offsetof(s, f), \
+   .size = sizeof(((s *)0)->f), \
+   })
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
enum index command; /**< Flow command. */
@@ -330,6 +347,8 @@ static const enum index next_item[] = {
ITEM_VF,
ITEM_PORT,
ITEM_RAW,
+   ITEM_ETH,
+   ITEM_VLAN,
0,
 };

@@ -362,6 +381,21 @@ static const enum index item_raw[] = {
0,
 };

+static const enum index item_eth[] = {
+   ITEM_ETH_DST,
+   ITEM_ETH_SRC,
+   ITEM_ETH_TYPE,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_vlan[] = {
+   ITEM_VLAN_TPID,
+   ITEM_VLAN_TCI,
+   ITEM_NEXT,
+   0,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -404,6 +438,9 @@ static int parse_boolean(struct context *, const struct token *,
 static int parse_string(struct context *, const struct token *,
const char *, unsigned int,
void *, unsigned int);
+static int parse_mac_addr(struct context *, const struct token *,
+ const char *, unsigned int,
+ void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
@@ -467,6 +504,13 @@ static const struct token token_list[] = {
.call = parse_string,
.comp = comp_none,
},
+   [MAC_ADDR] = {
+   .name = "{MAC address}",
+   .type = "MAC-48",
+   .help = "standard MAC address notation",
+   .call = parse_mac_addr,
+   .comp = comp_none,
+   },
[RULE_ID] = {
.name = "{rule id}",
.type = "RULE ID",
@@ -761,6 +805,50 @@ static const struct token token_list[] = {
pattern,
ITEM_RAW_PATTERN_SIZE)),
},
+   [ITEM_ETH] = {
+   .name = "eth",
+   .help = "match Ethernet header",
+   .priv = PRIV_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+   .next = NEXT(item_eth),
+   .call = parse_vc,
+   },
+   [ITEM_ETH_DST] = {
+   .name = "dst",
+   .help = "destination MAC",
+   .next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, dst)),
+   },
+   [ITEM_ETH_SRC] = {
+   .name = "src",
+   .help = "source MAC",
+   .next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, src)),
+   },
+   [ITEM_ETH_TYPE] = {
+   .name = "type",
+   .help = "EtherType",
+   .next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_eth, type)),
+   },
+   [ITEM_VLAN] = {
+   .name = "vlan",
+   .help = "match 802.1Q/ad VLAN tag",
+   .priv = PRIV_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+   .next = NEXT(item_vlan),
+   .call = parse_vc,
+   },
+   [ITEM_VLAN_TPID] = {
+   .name = "tpid",
+   .help = "tag protocol identifier",
+   .next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_para

[dpdk-dev] [PATCH 17/22] app/testpmd: add item raw to flow command

2016-11-16 Thread Adrien Mazarguil

Matches arbitrary byte strings with properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length.
- pattern: byte string to look for.
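
To illustrate how these properties interact, here is a simplified
stand-alone sketch. It is not the PMD or testpmd implementation. It assumes
that "limit" is the maximum distance from "offset" at which the pattern may
start and that a limit of 0 means "no limit". It leaves "relative" to the
caller, who would pass a data pointer positioned after the previous item.
The function name is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the position where pattern matches in data, or -1 if none.
 * search == 0: pattern must sit exactly at offset.
 * search != 0: pattern may start anywhere from offset up to offset + limit
 *              (limit == 0 means bounded only by the end of data).
 */
static long raw_match(const uint8_t *data, size_t data_len,
		      const uint8_t *pattern, size_t length,
		      size_t offset, int search, size_t limit)
{
	size_t last;
	size_t pos;

	if (length > data_len || offset > data_len - length)
		return -1;
	last = data_len - length; /* last position where pattern still fits */
	if (!search)
		last = offset;
	else if (limit && limit < last - offset)
		last = offset + limit;
	for (pos = offset; pos <= last; ++pos)
		if (!memcmp(data + pos, pattern, length))
			return (long)pos;
	return -1;
}
```

For example, looking for "GET" in "xxGETyy" fails at offset 0 without
search, succeeds at offset 2, and succeeds from offset 0 with search
enabled provided limit is 0 or at least 2.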

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 206 +++
 1 file changed, 206 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c61e31e..6f2f26c 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -57,6 +57,8 @@ enum index {
INTEGER,
UNSIGNED,
PREFIX,
+   BOOLEAN,
+   STRING,
RULE_ID,
PORT_ID,
GROUP_ID,
@@ -107,6 +109,12 @@ enum index {
ITEM_VF_ID,
ITEM_PORT,
ITEM_PORT_INDEX,
+   ITEM_RAW,
+   ITEM_RAW_RELATIVE,
+   ITEM_RAW_SEARCH,
+   ITEM_RAW_OFFSET,
+   ITEM_RAW_LIMIT,
+   ITEM_RAW_PATTERN,

/* Validate/create actions. */
ACTIONS,
@@ -116,6 +124,13 @@ enum index {
ACTION_PASSTHRU,
 };

+/** Size of pattern[] field in struct rte_flow_item_raw. */
+#define ITEM_RAW_PATTERN_SIZE 36
+
+/** Storage size for struct rte_flow_item_raw including pattern. */
+#define ITEM_RAW_SIZE \
+   (offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16

@@ -217,6 +232,13 @@ struct token {
.size = sizeof(*((s *)0)->f), \
})

+/** Static initializer for ARGS() with arbitrary size. */
+#define ARGS_ENTRY_USZ(s, f, sz) \
+   (&(const struct arg){ \
+   .offset = offsetof(s, f), \
+   .size = (sz), \
+   })
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
enum index command; /**< Flow command. */
@@ -307,6 +329,7 @@ static const enum index next_item[] = {
ITEM_PF,
ITEM_VF,
ITEM_PORT,
+   ITEM_RAW,
0,
 };

@@ -329,6 +352,16 @@ static const enum index item_port[] = {
0,
 };

+static const enum index item_raw[] = {
+   ITEM_RAW_RELATIVE,
+   ITEM_RAW_SEARCH,
+   ITEM_RAW_OFFSET,
+   ITEM_RAW_LIMIT,
+   ITEM_RAW_PATTERN,
+   ITEM_NEXT,
+   0,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -365,11 +398,19 @@ static int parse_int(struct context *, const struct token *,
 static int parse_prefix(struct context *, const struct token *,
const char *, unsigned int,
void *, unsigned int);
+static int parse_boolean(struct context *, const struct token *,
+const char *, unsigned int,
+void *, unsigned int);
+static int parse_string(struct context *, const struct token *,
+   const char *, unsigned int,
+   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
+static int comp_boolean(struct context *, const struct token *,
+   unsigned int, char *, unsigned int);
 static int comp_action(struct context *, const struct token *,
   unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
@@ -412,6 +453,20 @@ static const struct token token_list[] = {
.call = parse_prefix,
.comp = comp_none,
},
+   [BOOLEAN] = {
+   .name = "{boolean}",
+   .type = "BOOLEAN",
+   .help = "any boolean value",
+   .call = parse_boolean,
+   .comp = comp_boolean,
+   },
+   [STRING] = {
+   .name = "{string}",
+   .type = "STRING",
+   .help = "fixed string",
+   .call = parse_string,
+   .comp = comp_none,
+   },
[RULE_ID] = {
.name = "{rule id}",
.type = "RULE ID",
@@ -662,6 +717,50 @@ static const struct token token_list[] = {
.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
},
+   [ITEM_RAW] = {
+   .name = "raw",
+   .help = "match an arbitrary byte string",
+   .priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
+   .next = NEXT(item_raw),
+   .call = parse_vc,
+   },
+   [ITEM_RAW_RELATIVE] = {
+   .name

[dpdk-dev] [PATCH 16/22] app/testpmd: add various items to flow command

2016-11-16 Thread Adrien Mazarguil

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 53 
 1 file changed, 53 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5816be4..c61e31e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -102,6 +102,11 @@ enum index {
ITEM_ANY,
ITEM_ANY_MIN,
ITEM_ANY_MAX,
+   ITEM_PF,
+   ITEM_VF,
+   ITEM_VF_ID,
+   ITEM_PORT,
+   ITEM_PORT_INDEX,

/* Validate/create actions. */
ACTIONS,
@@ -299,6 +304,9 @@ static const enum index next_item[] = {
ITEM_VOID,
ITEM_INVERT,
ITEM_ANY,
+   ITEM_PF,
+   ITEM_VF,
+   ITEM_PORT,
0,
 };

@@ -309,6 +317,18 @@ static const enum index item_any[] = {
0,
 };

+static const enum index item_vf[] = {
+   ITEM_VF_ID,
+   ITEM_NEXT,
+   0,
+};
+
+static const enum index item_port[] = {
+   ITEM_PORT_INDEX,
+   ITEM_NEXT,
+   0,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -609,6 +629,39 @@ static const struct token token_list[] = {
.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, max)),
},
+   [ITEM_PF] = {
+   .name = "pf",
+   .help = "match packets addressed to the physical function",
+   .priv = PRIV_ITEM(PF, 0),
+   .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+   .call = parse_vc,
+   },
+   [ITEM_VF] = {
+   .name = "vf",
+   .help = "match packets addressed to a virtual function ID",
+   .priv = PRIV_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+   .next = NEXT(item_vf),
+   .call = parse_vc,
+   },
+   [ITEM_VF_ID] = {
+   .name = "id",
+   .help = "destination VF ID",
+   .next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_vf, id)),
+   },
+   [ITEM_PORT] = {
+   .name = "port",
+   .help = "device-specific physical port index to use",
+   .priv = PRIV_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+   .next = NEXT(item_port),
+   .call = parse_vc,
+   },
+   [ITEM_PORT_INDEX] = {
+   .name = "index",
+   .help = "physical port index",
+   .next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
+   },
/* Validate/create actions. */
[ACTIONS] = {
.name = "actions",
-- 
2.1.4

[dpdk-dev] [PATCH 15/22] app/testpmd: add item any to flow command

2016-11-16 Thread Adrien Mazarguil

This pattern item matches any protocol in place of the current layer and
has two properties:

- min: minimum number of layers covered (0 or more).
- max: maximum number of layers covered (0 means infinity).
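
The min/max semantics described above can be summed up by a small
predicate. This is only a sketch of the intended meaning with a
hypothetical function name, not code from the patch or from any PMD:

```c
/*
 * Return nonzero when "layers" protocol layers fall within [min, max],
 * where max == 0 stands for "no upper bound".
 */
static int any_covers(unsigned int layers, unsigned int min, unsigned int max)
{
	if (layers < min)
		return 0;
	return max == 0 || layers <= max;
}
```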

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 81930e1..5816be4 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,6 +99,9 @@ enum index {
ITEM_END,
ITEM_VOID,
ITEM_INVERT,
+   ITEM_ANY,
+   ITEM_ANY_MIN,
+   ITEM_ANY_MAX,

/* Validate/create actions. */
ACTIONS,
@@ -295,6 +298,14 @@ static const enum index next_item[] = {
ITEM_END,
ITEM_VOID,
ITEM_INVERT,
+   ITEM_ANY,
+   0,
+};
+
+static const enum index item_any[] = {
+   ITEM_ANY_MIN,
+   ITEM_ANY_MAX,
+   ITEM_NEXT,
0,
 };

@@ -579,6 +590,25 @@ static const struct token token_list[] = {
.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
.call = parse_vc,
},
+   [ITEM_ANY] = {
+   .name = "any",
+   .help = "match any protocol for the current layer",
+   .priv = PRIV_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+   .next = NEXT(item_any),
+   .call = parse_vc,
+   },
+   [ITEM_ANY_MIN] = {
+   .name = "min",
+   .help = "minimum number of layers covered",
+   .next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, min)),
+   },
+   [ITEM_ANY_MAX] = {
+   .name = "max",
+   .help = "maximum number of layers covered, 0 for infinity",
+   .next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, max)),
+   },
/* Validate/create actions. */
[ACTIONS] = {
.name = "actions",
-- 
2.1.4

[dpdk-dev] [PATCH 14/22] app/testpmd: add rte_flow bit-field support

2016-11-16 Thread Adrien Mazarguil

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.
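
The idea can be shown outside of testpmd with a stand-alone sketch modeled
on arg_entry_bf_fill() from the diff below. The function name and signature
here are hypothetical, and the mask is passed explicitly instead of through
struct arg:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write the low bits of val into the bit positions of dst selected by
 * mask, lowest mask bit first, and return how many bits mask selects.
 * With dst == NULL only the bit count is computed.
 */
static size_t bf_spread(uint8_t *dst, uintmax_t val,
			const uint8_t *mask, size_t size)
{
	size_t len = 0;
	size_t i;

	for (i = 0; i != size; ++i) {
		unsigned int shift;

		for (shift = 0; shift != 8; ++shift) {
			if (!(mask[i] & (1u << shift)))
				continue;
			++len;
			if (!dst)
				continue;
			/* Clear the target bit, then copy in the next bit of val. */
			dst[i] = (uint8_t)((dst[i] & ~(1u << shift)) |
					   ((val & 1) << shift));
			val >>= 1;
		}
	}
	return len;
}
```

Bits outside the mask are left untouched, which is what allows several
bit-fields sharing the same bytes to be set one after another. Where a
given bit-field lands in memory depends on the compiler, which is
presumably why the patch derives the mask from a compound literal with the
field set to -1 rather than hard-coding it.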

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 59 
 1 file changed, 59 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 89307cb..81930e1 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -136,6 +136,7 @@ struct arg {
uint32_t sign:1; /**< Value is signed. */
uint32_t offset; /**< Relative offset from ctx->object. */
uint32_t size; /**< Field size. */
+   const uint8_t *mask; /**< Bit-mask to use instead of offset/size. */
 };

 /** Parser token definition. */
@@ -195,6 +196,13 @@ struct token {
.size = sizeof(((s *)0)->f), \
})

+/** Static initializer for ARGS() to target a bit-field. */
+#define ARGS_ENTRY_BF(s, f) \
+   (&(const struct arg){ \
+   .size = sizeof(s), \
+   .mask = (const void *)&(const s){ .f = -1 }, \
+   })
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
(&(const struct arg){ \
@@ -622,6 +630,34 @@ push_args(struct context *ctx, const struct arg *arg)
return 0;
 }

+/** Spread value into buffer according to bit-mask. */
+static size_t
+arg_entry_bf_fill(void *dst, uintmax_t val, const struct arg *arg)
+{
+   uint32_t i;
+   size_t len = 0;
+
+   /* Endian conversion is not supported on bit-fields. */
+   if (!arg->mask || arg->hton)
+   return 0;
+   for (i = 0; i != arg->size; ++i) {
+   unsigned int shift = 0;
+   uint8_t *buf = (uint8_t *)dst + i;
+
+   for (shift = 0; arg->mask[i] >> shift; ++shift) {
+   if (!(arg->mask[i] & (1 << shift)))
+   continue;
+   ++len;
+   if (!dst)
+   continue;
+   *buf &= ~(1 << shift);
+   *buf |= (val & 1) << shift;
+   val >>= 1;
+   }
+   }
+   return len;
+}
+
 /**
  * Parse a prefix length and generate a bit-mask.
  *
@@ -648,6 +684,23 @@ parse_prefix(struct context *ctx, const struct token *token,
u = strtoumax(str, &end, 0);
if (errno || (size_t)(end - str) != len)
goto error;
+   if (arg->mask) {
+   uintmax_t v = 0;
+
+   extra = arg_entry_bf_fill(NULL, 0, arg);
+   if (u > extra)
+   goto error;
+   if (!ctx->object)
+   return len;
+   extra -= u;
+   while (u--)
+   (v <<= 1, v |= 1);
+   v <<= extra;
+   if (!arg_entry_bf_fill(ctx->object, v, arg) ||
+   !arg_entry_bf_fill(ctx->objmask, -1, arg))
+   goto error;
+   return len;
+   }
bytes = u / 8;
extra = u % 8;
size = arg->size;
@@ -1071,6 +1124,12 @@ parse_int(struct context *ctx, const struct token *token,
goto error;
if (!ctx->object)
return len;
+   if (arg->mask) {
+   if (!arg_entry_bf_fill(ctx->object, u, arg) ||
+   !arg_entry_bf_fill(ctx->objmask, -1, arg))
+   goto error;
+   return len;
+   }
buf = (uint8_t *)ctx->object + arg->offset;
size = arg->size;
 objmask:
-- 
2.1.4

[dpdk-dev] [PATCH 13/22] app/testpmd: add rte_flow item spec prefix length

2016-11-16 Thread Adrien Mazarguil

Generating bit-masks from prefix lengths is often more convenient than
providing them entirely (e.g. to define IPv4 and IPv6 subnets).

This commit adds the "prefix" operator that assigns generated bit-masks to
any pattern item specification field.
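
As an illustration, a /24 prefix on a 4-byte field yields ff ff ff 00. The
sketch below covers only the network-order layout; the patch additionally
handles host-order fields on little-endian targets. The function name is
hypothetical and the code is a simplified model, not the patch itself:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill mask[0..size-1] with "prefix" leading one bits followed by zeros.
 * Returns 0 on success, -1 when the prefix does not fit in the field.
 */
static int prefix_to_mask(unsigned int prefix, uint8_t *mask, size_t size)
{
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe
	};
	size_t bytes = prefix / 8;       /* number of fully set bytes */
	unsigned int extra = prefix % 8; /* leftover bits in the next byte */

	if (prefix > size * 8)
		return -1;
	memset(mask, 0xff, bytes);
	memset(mask + bytes, 0x00, size - bytes);
	if (extra)
		mask[bytes] = conv[extra];
	return 0;
}
```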

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 80 
 1 file changed, 80 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 790b4b8..89307cb 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
/* Common tokens. */
INTEGER,
UNSIGNED,
+   PREFIX,
RULE_ID,
PORT_ID,
GROUP_ID,
@@ -93,6 +94,7 @@ enum index {
ITEM_PARAM_SPEC,
ITEM_PARAM_LAST,
ITEM_PARAM_MASK,
+   ITEM_PARAM_PREFIX,
ITEM_NEXT,
ITEM_END,
ITEM_VOID,
@@ -277,6 +279,7 @@ static const enum index item_param[] = {
ITEM_PARAM_SPEC,
ITEM_PARAM_LAST,
ITEM_PARAM_MASK,
+   ITEM_PARAM_PREFIX,
0,
 };

@@ -320,6 +323,9 @@ static int parse_list(struct context *, const struct token *,
 static int parse_int(struct context *, const struct token *,
 const char *, unsigned int,
 void *, unsigned int);
+static int parse_prefix(struct context *, const struct token *,
+   const char *, unsigned int,
+   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
@@ -360,6 +366,13 @@ static const struct token token_list[] = {
.call = parse_int,
.comp = comp_none,
},
+   [PREFIX] = {
+   .name = "{prefix}",
+   .type = "PREFIX",
+   .help = "prefix length for bit-mask",
+   .call = parse_prefix,
+   .comp = comp_none,
+   },
[RULE_ID] = {
.name = "{rule id}",
.type = "RULE ID",
@@ -527,6 +540,11 @@ static const struct token token_list[] = {
.help = "specify bit-mask with relevant bits set to one",
.call = parse_vc_spec,
},
+   [ITEM_PARAM_PREFIX] = {
+   .name = "prefix",
+   .help = "generate bit-mask from a prefix length",
+   .call = parse_vc_spec,
+   },
[ITEM_NEXT] = {
.name = "/",
.help = "specify next pattern item",
@@ -604,6 +622,62 @@ push_args(struct context *ctx, const struct arg *arg)
return 0;
 }

+/**
+ * Parse a prefix length and generate a bit-mask.
+ *
+ * Last argument (ctx->args) is retrieved to determine mask size, storage
+ * location and whether the result must use network byte ordering.
+ */
+static int
+parse_prefix(struct context *ctx, const struct token *token,
+const char *str, unsigned int len,
+void *buf, unsigned int size)
+{
+   const struct arg *arg = pop_args(ctx);
+   static const uint8_t conv[] = "\x00\x80\xc0\xe0\xf0\xf8\xfc\xfe\xff";
+   char *end;
+   uintmax_t u;
+   unsigned int bytes;
+   unsigned int extra;
+
+   (void)token;
+   /* Argument is expected. */
+   if (!arg)
+   return -1;
+   errno = 0;
+   u = strtoumax(str, &end, 0);
+   if (errno || (size_t)(end - str) != len)
+   goto error;
+   bytes = u / 8;
+   extra = u % 8;
+   size = arg->size;
+   if (bytes > size || bytes + !!extra > size)
+   goto error;
+   if (!ctx->object)
+   return len;
+   buf = (uint8_t *)ctx->object + arg->offset;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+   if (!arg->hton) {
+   memset((uint8_t *)buf + size - bytes, 0xff, bytes);
+   memset(buf, 0x00, size - bytes);
+   if (extra)
+   ((uint8_t *)buf)[size - bytes - 1] = conv[extra];
+   } else
+#endif
+   {
+   memset(buf, 0xff, bytes);
+   memset((uint8_t *)buf + bytes, 0x00, size - bytes);
+   if (extra)
+   ((uint8_t *)buf)[bytes] = conv[extra];
+   }
+   if (ctx->objmask)
+   memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+   return len;
+error:
+   push_args(ctx, arg);
+   return -1;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -775,6 +849,12 @@ parse_vc_spec(struct context *ctx, const struct token *token,
case ITEM_PARAM_LAST:
index = 1;
break;
+   case ITEM_PARAM_PREFIX:
+   /* M

[dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler

2016-11-16 Thread Adrien Mazarguil

Add parser code to fully set individual fields of pattern item
specification structures, using the following operators:

- fix: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets the upper bound of the range that starts at spec.
- mask: sets bit-mask affecting both spec and last from arbitrary value.
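
How spec, last and mask combine when a packet field is compared can be
modeled on a single 32-bit value. This is a simplified sketch of the
matching rule as I understand it from the API description, with
hypothetical type and function names. Real items apply it per field and
PMDs may restrict what they support:

```c
#include <stdint.h>

/* One matched field: value, optional range end and bit-mask. */
struct field_rule {
	uint32_t spec;
	uint32_t last; /* upper bound, used only when has_last is nonzero */
	uint32_t mask;
	int has_last;
};

/* Equivalent of the "fix" operator: exact value with a full bit-mask. */
static struct field_rule field_fix(uint32_t value)
{
	struct field_rule r = { .spec = value, .mask = UINT32_MAX };

	return r;
}

/* Compare a packet value against the rule, after masking both sides. */
static int field_matches(const struct field_rule *r, uint32_t value)
{
	uint32_t v = value & r->mask;
	uint32_t lo = r->spec & r->mask;

	if (!r->has_last)
		return v == lo;
	return v >= lo && v <= (r->last & r->mask);
}
```

With this model, "fix 80" matches only 80, "spec 0x0a000000" combined with
"mask 0xff000000" matches anything whose top byte is 0x0a, and adding
"last" turns the comparison into a range check.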

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 110 +++
 1 file changed, 110 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e70e8e2..790b4b8 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {

/* Validate/create pattern. */
PATTERN,
+   ITEM_PARAM_FIX,
+   ITEM_PARAM_SPEC,
+   ITEM_PARAM_LAST,
+   ITEM_PARAM_MASK,
ITEM_NEXT,
ITEM_END,
ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
uint16_t port; /**< Current port ID (for completions). */
uint32_t objdata; /**< Object-specific data. */
void *object; /**< Address of current object for relative offsets. */
+   void *objmask; /**< Object a full mask must be written to. */
 };

 /** Token argument. */
@@ -267,6 +272,14 @@ static const enum index next_list_attr[] = {
0,
 };

+static const enum index item_param[] = {
+   ITEM_PARAM_FIX,
+   ITEM_PARAM_SPEC,
+   ITEM_PARAM_LAST,
+   ITEM_PARAM_MASK,
+   0,
+};
+
 static const enum index next_item[] = {
ITEM_END,
ITEM_VOID,
@@ -287,6 +300,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
const char *, unsigned int,
void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 const char *, unsigned int,
 void *, unsigned int);
@@ -492,6 +507,26 @@ static const struct token token_list[] = {
.next = NEXT(next_item),
.call = parse_vc,
},
+   [ITEM_PARAM_FIX] = {
+   .name = "fix",
+   .help = "match value perfectly (with full bit-mask)",
+   .call = parse_vc_spec,
+   },
+   [ITEM_PARAM_SPEC] = {
+   .name = "spec",
+   .help = "match value according to configured bit-mask",
+   .call = parse_vc_spec,
+   },
+   [ITEM_PARAM_LAST] = {
+   .name = "last",
+   .help = "specify upper bound to establish a range",
+   .call = parse_vc_spec,
+   },
+   [ITEM_PARAM_MASK] = {
+   .name = "mask",
+   .help = "specify bit-mask with relevant bits set to one",
+   .call = parse_vc_spec,
+   },
[ITEM_NEXT] = {
.name = "/",
.help = "specify next pattern item",
@@ -605,6 +640,7 @@ parse_init(struct context *ctx, const struct token *token,
memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
ctx->objdata = 0;
ctx->object = out;
+   ctx->objmask = NULL;
return len;
 }

@@ -632,11 +668,13 @@ parse_vc(struct context *ctx, const struct token *token,
out->command = ctx->curr;
ctx->objdata = 0;
ctx->object = out;
+   ctx->objmask = NULL;
out->args.vc.data = (uint8_t *)out + size;
return len;
}
ctx->objdata = 0;
ctx->object = &out->args.vc.attr;
+   ctx->objmask = NULL;
switch (ctx->curr) {
case GROUP:
case PRIORITY:
@@ -652,6 +690,7 @@ parse_vc(struct context *ctx, const struct token *token,
(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
   sizeof(double));
ctx->object = out->args.vc.pattern;
+   ctx->objmask = NULL;
return len;
case ACTIONS:
out->args.vc.actions =
@@ -660,6 +699,7 @@ parse_vc(struct context *ctx, const struct token *token,
out->args.vc.pattern_n),
   sizeof(double));
ctx->object = out->args.vc.actions;
+   ctx->objmask = NULL;
return len;
default:
if (!token->priv)
@@ -682,6 +722,7 @@ parse_vc(struct context *ctx, const struct token *token,
};
++out->args.vc.pattern_n;
ctx->object = ite

[dpdk-dev] [PATCH 11/22] app/testpmd: add flow query command

2016-11-16 Thread Adrien Mazarguil

Syntax:

 flow query {port_id} {rule_id} {action}

Query a specific action of an existing flow rule.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline.c  |   3 +
 app/test-pmd/cmdline_flow.c | 121 ++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 851cc16..edd1ee3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -831,6 +831,9 @@ static void cmd_help_long_parsed(void *parsed_result,
"flow flush {port_id}\n"
"Destroy all flow rules.\n\n"

+   "flow query {port_id} {rule_id} {action}\n"
+   "Query an existing flow rule.\n\n"
+
"flow list {port_id} [group {group_id}] [...]\n"
"List existing flow rules sorted by priority,"
" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 1874849..e70e8e2 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -69,11 +69,15 @@ enum index {
CREATE,
DESTROY,
FLUSH,
+   QUERY,
LIST,

/* Destroy arguments. */
DESTROY_RULE,

+   /* Query arguments. */
+   QUERY_ACTION,
+
/* List arguments. */
LIST_GROUP,

@@ -208,6 +212,10 @@ struct buffer {
uint32_t rule_n;
} destroy; /**< Destroy arguments. */
struct {
+   uint32_t rule;
+   enum rte_flow_action_type action;
+   } query; /**< Query arguments. */
+   struct {
uint32_t *group;
uint32_t group_n;
} list; /**< List arguments. */
@@ -285,6 +293,12 @@ static int parse_destroy(struct context *, const struct token *,
 static int parse_flush(struct context *, const struct token *,
   const char *, unsigned int,
   void *, unsigned int);
+static int parse_query(struct context *, const struct token *,
+  const char *, unsigned int,
+  void *, unsigned int);
+static int parse_action(struct context *, const struct token *,
+   const char *, unsigned int,
+   void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
@@ -296,6 +310,8 @@ static int parse_port(struct context *, const struct token *,
  void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
+static int comp_action(struct context *, const struct token *,
+  unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
@@ -367,7 +383,8 @@ static const struct token token_list[] = {
  CREATE,
  DESTROY,
  FLUSH,
- LIST)),
+ LIST,
+ QUERY)),
.call = parse_init,
},
/* Sub-level commands. */
@@ -399,6 +416,17 @@ static const struct token token_list[] = {
.args = ARGS(ARGS_ENTRY(struct buffer, port)),
.call = parse_flush,
},
+   [QUERY] = {
+   .name = "query",
+   .help = "query an existing flow rule",
+   .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
+NEXT_ENTRY(RULE_ID),
+NEXT_ENTRY(PORT_ID)),
+   .args = ARGS(ARGS_ENTRY(struct buffer, args.query.action),
+ARGS_ENTRY(struct buffer, args.query.rule),
+ARGS_ENTRY(struct buffer, port)),
+   .call = parse_query,
+   },
[LIST] = {
.name = "list",
.help = "list existing flow rules",
@@ -414,6 +442,14 @@ static const struct token token_list[] = {
.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
.call = parse_destroy,
},
+   /* Query arguments. */
+   [QUERY_ACTION] = {
+   .name = "{action}",
+   .type = "ACTION",
+   .help = "action to query, must be part of the rule",
+   .call = parse_action,
+   .comp = comp_action,
+   },
/* List ar

[dpdk-dev] [PATCH 10/22] app/testpmd: add flow validate/create commands

2016-11-16 Thread Adrien Mazarguil

Syntax:

 flow (validate|create) {port_id}
[group {group_id}] [priority {level}] [ingress] [egress]
pattern {item} [/ {item} [...]] / end
actions {action} [/ {action} [...]] / end

Either check the validity of a flow rule or create it. Any number of
pattern items and actions can be provided in any order. Completion is
available for convenience.

This commit only adds support for the most basic item and action types,
namely:

- END: terminates pattern items and actions lists.
- VOID: item/action filler, no operation.
- INVERT: inverted pattern matching, process packets that do not match.
- PASSTHRU: action that leaves packets up for additional processing by
  subsequent flow rules.
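
As a rough illustration of how these basic types interact, the sketch below
walks an END-terminated pattern list, skips VOID fillers and records INVERT.
The enum, structures and function name are hypothetical stand-ins, not the
rte_flow definitions; only the END/VOID/INVERT semantics come from the
description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types mirroring the names in this commit message. */
enum item_type {
	ITEM_TYPE_END = 0, /* terminates the list */
	ITEM_TYPE_VOID,    /* filler, no operation */
	ITEM_TYPE_INVERT,  /* process packets that do not match */
	ITEM_TYPE_OTHER,   /* placeholder for any real matching item */
};

/* Hypothetical pattern item, reduced to its type. */
struct item {
	enum item_type type;
};

/* What a consumer might learn from one pass over the list. */
struct summary {
	size_t total;     /* entries before END, VOID included */
	size_t effective; /* entries that take part in matching */
	int inverted;     /* nonzero if INVERT was seen */
};

/* Walk an END-terminated pattern list once. */
static struct summary
summarize(const struct item *pattern)
{
	struct summary s = { 0, 0, 0 };

	assert(pattern != NULL);
	for (; pattern->type != ITEM_TYPE_END; ++pattern) {
		++s.total;
		switch (pattern->type) {
		case ITEM_TYPE_VOID:
			break; /* filler: counted but otherwise ignored */
		case ITEM_TYPE_INVERT:
			s.inverted = 1;
			break;
		default:
			++s.effective;
			break;
		}
	}
	return s;
}
```

An action list could presumably be walked the same way, with PASSTHRU handled
like any other non-terminating entry.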

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline.c  |  14 ++
 app/test-pmd/cmdline_flow.c | 314 ++-
 2 files changed, 327 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 20a64b6..851cc16 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,20 @@ static void cmd_help_long_parsed(void *parsed_result,
" (select|add)\n"
"Set the input set for FDir.\n\n"

+   "flow validate {port_id}"
+   " [group {group_id}] [priority {level}]"
+   " [ingress] [egress]"
+   " pattern {item} [/ {item} [...]] / end"
+   " actions {action} [/ {action} [...]] / end\n"
+   "Check whether a flow rule can be created.\n\n"
+
+   "flow create {port_id}"
+   " [group {group_id}] [priority {level}]"
+   " [ingress] [egress]"
+   " pattern {item} [/ {item} [...]] / end"
+   " actions {action} [/ {action} [...]] / end\n"
+   "Create a flow rule.\n\n"
+
"flow destroy {port_id} rule {rule_id} [...]\n"
"Destroy specific flow rules.\n\n"

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5a8980c..1874849 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -59,11 +59,14 @@ enum index {
RULE_ID,
PORT_ID,
GROUP_ID,
+   PRIORITY_LEVEL,

/* Top-level command. */
FLOW,

/* Sub-level commands. */
+   VALIDATE,
+   CREATE,
DESTROY,
FLUSH,
LIST,
@@ -73,6 +76,26 @@ enum index {

/* List arguments. */
LIST_GROUP,
+
+   /* Validate/create arguments. */
+   GROUP,
+   PRIORITY,
+   INGRESS,
+   EGRESS,
+
+   /* Validate/create pattern. */
+   PATTERN,
+   ITEM_NEXT,
+   ITEM_END,
+   ITEM_VOID,
+   ITEM_INVERT,
+
+   /* Validate/create actions. */
+   ACTIONS,
+   ACTION_NEXT,
+   ACTION_END,
+   ACTION_VOID,
+   ACTION_PASSTHRU,
 };

 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -92,6 +115,7 @@ struct context {
uint32_t eol:1; /**< EOL has been detected. */
uint32_t last:1; /**< No more arguments. */
uint16_t port; /**< Current port ID (for completions). */
+   uint32_t objdata; /**< Object-specific data. */
void *object; /**< Address of current object for relative offsets. */
 };

@@ -109,6 +133,8 @@ struct token {
const char *type;
/** Help displayed during completion (defaults to token name). */
const char *help;
+   /** Private data used by parser functions. */
+   const void *priv;
/**
 * Lists of subsequent tokens to push on the stack. Each call to the
 * parser consumes the last entry of that stack.
@@ -170,6 +196,14 @@ struct buffer {
uint16_t port; /**< Affected port ID. */
union {
struct {
+   struct rte_flow_attr attr;
+   struct rte_flow_item *pattern;
+   struct rte_flow_action *actions;
+   uint32_t pattern_n;
+   uint32_t actions_n;
+   uint8_t *data;
+   } vc; /**< Validate/create arguments. */
+   struct {
uint32_t *rule;
uint32_t rule_n;
} destroy; /**< Destroy arguments. */
@@ -180,6 +214,39 @@ struct buffer {
} args; /**< Command arguments. */
 };

+/** Private data for pattern items. */
+struct parse_item_priv {
+   enum rte_flow_item_type type; /**< Item type. */
+   uint32_t size; /**< Size of item specification structure. */
+};
+
+#define PRIV_ITEM(t, s) \
+   (&(const struct parse_item_priv){

[dpdk-dev] [PATCH 09/22] app/testpmd: add flow destroy command

2016-11-16 Thread Adrien Mazarguil

Syntax:

 flow destroy {port_id} rule {rule_id} [...]

Destroy a given set of flow rules associated with a port.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline.c  |   3 ++
 app/test-pmd/cmdline_flow.c | 106 ++-
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 9f124fc..20a64b6 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
" (select|add)\n"
"Set the input set for FDir.\n\n"

+   "flow destroy {port_id} rule {rule_id} [...]\n"
+   "Destroy specific flow rules.\n\n"
+
"flow flush {port_id}\n"
"Destroy all flow rules.\n\n"

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 414bacc..5a8980c 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
/* Common tokens. */
INTEGER,
UNSIGNED,
+   RULE_ID,
PORT_ID,
GROUP_ID,

@@ -63,9 +64,13 @@ enum index {
FLOW,

/* Sub-level commands. */
+   DESTROY,
FLUSH,
LIST,

+   /* Destroy arguments. */
+   DESTROY_RULE,
+
/* List arguments. */
LIST_GROUP,
 };
@@ -165,12 +170,22 @@ struct buffer {
uint16_t port; /**< Affected port ID. */
union {
struct {
+   uint32_t *rule;
+   uint32_t rule_n;
+   } destroy; /**< Destroy arguments. */
+   struct {
uint32_t *group;
uint32_t group_n;
} list; /**< List arguments. */
} args; /**< Command arguments. */
 };

+static const enum index next_destroy_attr[] = {
+   DESTROY_RULE,
+   END,
+   0,
+};
+
 static const enum index next_list_attr[] = {
LIST_GROUP,
END,
@@ -180,6 +195,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
+static int parse_destroy(struct context *, const struct token *,
+const char *, unsigned int,
+void *, unsigned int);
 static int parse_flush(struct context *, const struct token *,
   const char *, unsigned int,
   void *, unsigned int);
@@ -196,6 +214,8 @@ static int comp_none(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
+static int comp_rule_id(struct context *, const struct token *,
+   unsigned int, char *, unsigned int);

 /** Token definitions. */
 static const struct token token_list[] = {
@@ -225,6 +245,13 @@ static const struct token token_list[] = {
.call = parse_int,
.comp = comp_none,
},
+   [RULE_ID] = {
+   .name = "{rule id}",
+   .type = "RULE ID",
+   .help = "rule identifier",
+   .call = parse_int,
+   .comp = comp_rule_id,
+   },
[PORT_ID] = {
.name = "{port_id}",
.type = "PORT ID",
@@ -245,11 +272,19 @@ static const struct token token_list[] = {
.type = "{command} {port_id} [{arg} [...]]",
.help = "manage ingress/egress flow rules",
.next = NEXT(NEXT_ENTRY
-(FLUSH,
+(DESTROY,
+ FLUSH,
  LIST)),
.call = parse_init,
},
/* Sub-level commands. */
+   [DESTROY] = {
+   .name = "destroy",
+   .help = "destroy specific flow rules",
+   .next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
+   .args = ARGS(ARGS_ENTRY(struct buffer, port)),
+   .call = parse_destroy,
+   },
[FLUSH] = {
.name = "flush",
.help = "destroy all flow rules",
@@ -264,6 +299,14 @@ static const struct token token_list[] = {
.args = ARGS(ARGS_ENTRY(struct buffer, port)),
.call = parse_list,
},
+   /* Destroy arguments. */
+   [DESTROY_RULE] = {
+   .name = "rule",
+   .help = "specify a rule identifier",
+   .next = NEXT(next_destroy_attr,

[dpdk-dev] [PATCH 08/22] app/testpmd: add flow flush command

2016-11-16 Thread Adrien Mazarguil

Syntax:

 flow flush {port_id}

Destroy all flow rules on a port.

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline.c  |  3 +++
 app/test-pmd/cmdline_flow.c | 43 +++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 09357c0..9f124fc 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
" (select|add)\n"
"Set the input set for FDir.\n\n"

+   "flow flush {port_id}\n"
+   "Destroy all flow rules.\n\n"
+
"flow list {port_id} [group {group_id}] [...]\n"
"List existing flow rules sorted by priority,"
" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 727fe78..414bacc 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -63,6 +63,7 @@ enum index {
FLOW,

/* Sub-level commands. */
+   FLUSH,
LIST,

/* List arguments. */
@@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
+static int parse_flush(struct context *, const struct token *,
+  const char *, unsigned int,
+  void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
@@ -240,10 +244,19 @@ static const struct token token_list[] = {
.name = "flow",
.type = "{command} {port_id} [{arg} [...]]",
.help = "manage ingress/egress flow rules",
-   .next = NEXT(NEXT_ENTRY(LIST)),
+   .next = NEXT(NEXT_ENTRY
+(FLUSH,
+ LIST)),
.call = parse_init,
},
/* Sub-level commands. */
+   [FLUSH] = {
+   .name = "flush",
+   .help = "destroy all flow rules",
+   .next = NEXT(NEXT_ENTRY(PORT_ID)),
+   .args = ARGS(ARGS_ENTRY(struct buffer, port)),
+   .call = parse_flush,
+   },
[LIST] = {
.name = "list",
.help = "list existing flow rules",
@@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token *token,
return len;
 }

+/** Parse tokens for flush command. */
+static int
+parse_flush(struct context *ctx, const struct token *token,
+   const char *str, unsigned int len,
+   void *buf, unsigned int size)
+{
+   struct buffer *out = buf;
+
+   /* Token name must match. */
+   if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+   return -1;
+   /* Nothing else to do if there is no buffer. */
+   if (!out)
+   return len;
+   if (!out->command) {
+   if (ctx->curr != FLUSH)
+   return -1;
+   if (sizeof(*out) > size)
+   return -1;
+   out->command = ctx->curr;
+   ctx->object = out;
+   }
+   return len;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -698,6 +736,9 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
switch (in->command) {
+   case FLUSH:
+   port_flow_flush(in->port);
+   break;
case LIST:
port_flow_list(in->port, in->args.list.group_n,
   in->args.list.group);
-- 
2.1.4

[dpdk-dev] [PATCH 07/22] app/testpmd: add flow list command

2016-11-16 Thread Adrien Mazarguil

Syntax:

 flow list {port_id} [group {group_id}] [...]

List configured flow rules on a port. Output can optionally be limited to a
given set of group identifiers.
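
A minimal sketch of the filtering and ordering described here, assuming rules
are kept in a plain array. `struct rule`, `list_rules()` and the tie-break on
rule ID are assumptions made for illustration and do not reflect the testpmd
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flow rule summary. */
struct rule {
	uint32_t id;
	uint32_t group;
	uint32_t priority;
};

/* An empty group list means "no filter". */
static int
group_selected(uint32_t group, const uint32_t *groups, size_t groups_n)
{
	size_t i;

	if (groups_n == 0)
		return 1;
	for (i = 0; i < groups_n; ++i)
		if (groups[i] == group)
			return 1;
	return 0;
}

/* qsort() callback: ascending priority, ties broken by rule ID (assumed). */
static int
by_priority(const void *a, const void *b)
{
	const struct rule *ra = a;
	const struct rule *rb = b;

	if (ra->priority != rb->priority)
		return ra->priority < rb->priority ? -1 : 1;
	return (ra->id > rb->id) - (ra->id < rb->id);
}

/* Copy selected rules to out (room for n entries), sort, return the count. */
static size_t
list_rules(const struct rule *rules, size_t n,
	   const uint32_t *groups, size_t groups_n, struct rule *out)
{
	size_t count = 0;
	size_t i;

	assert(out != NULL && (rules != NULL || n == 0));
	for (i = 0; i < n; ++i)
		if (group_selected(rules[i].group, groups, groups_n))
			out[count++] = rules[i];
	qsort(out, count, sizeof(*out), by_priority);
	return count;
}
```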

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline.c  |   4 ++
 app/test-pmd/cmdline_flow.c | 141 +++
 2 files changed, 145 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b7d10b3..09357c0 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -810,6 +810,10 @@ static void cmd_help_long_parsed(void *parsed_result,
"sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
" (select|add)\n"
"Set the input set for FDir.\n\n"
+
+   "flow list {port_id} [group {group_id}] [...]\n"
+   "List existing flow rules sorted by priority,"
+   " filtered by group identifiers.\n\n"
);
}
 }
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7078f80..727fe78 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,9 +56,17 @@ enum index {
/* Common tokens. */
INTEGER,
UNSIGNED,
+   PORT_ID,
+   GROUP_ID,

/* Top-level command. */
FLOW,
+
+   /* Sub-level commands. */
+   LIST,
+
+   /* List arguments. */
+   LIST_GROUP,
 };

 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -77,6 +85,7 @@ struct context {
uint32_t reparse:1; /**< Start over from the beginning. */
uint32_t eol:1; /**< EOL has been detected. */
uint32_t last:1; /**< No more arguments. */
+   uint16_t port; /**< Current port ID (for completions). */
void *object; /**< Address of current object for relative offsets. */
 };

@@ -153,16 +162,36 @@ struct token {
 struct buffer {
enum index command; /**< Flow command. */
uint16_t port; /**< Affected port ID. */
+   union {
+   struct {
+   uint32_t *group;
+   uint32_t group_n;
+   } list; /**< List arguments. */
+   } args; /**< Command arguments. */
+};
+
+static const enum index next_list_attr[] = {
+   LIST_GROUP,
+   END,
+   0,
 };

 static int parse_init(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
+static int parse_list(struct context *, const struct token *,
+ const char *, unsigned int,
+ void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 const char *, unsigned int,
 void *, unsigned int);
+static int parse_port(struct context *, const struct token *,
+ const char *, unsigned int,
+ void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 unsigned int, char *, unsigned int);
+static int comp_port(struct context *, const struct token *,
+unsigned int, char *, unsigned int);

 /** Token definitions. */
 static const struct token token_list[] = {
@@ -192,13 +221,44 @@ static const struct token token_list[] = {
.call = parse_int,
.comp = comp_none,
},
+   [PORT_ID] = {
+   .name = "{port_id}",
+   .type = "PORT ID",
+   .help = "port identifier",
+   .call = parse_port,
+   .comp = comp_port,
+   },
+   [GROUP_ID] = {
+   .name = "{group_id}",
+   .type = "GROUP ID",
+   .help = "group identifier",
+   .call = parse_int,
+   .comp = comp_none,
+   },
/* Top-level command. */
[FLOW] = {
.name = "flow",
.type = "{command} {port_id} [{arg} [...]]",
.help = "manage ingress/egress flow rules",
+   .next = NEXT(NEXT_ENTRY(LIST)),
.call = parse_init,
},
+   /* Sub-level commands. */
+   [LIST] = {
+   .name = "list",
+   .help = "list existing flow rules",
+   .next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
+   .args = ARGS(ARGS_ENTRY(struct buffer, port)),
+   .call = parse_list,
+   },
+   /* List arguments. */
+   [LIST_GROUP] = {
+   .name = "group",
+   .help = "specify a group",
+   .next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
+   .args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
+

[dpdk-dev] [PATCH 06/22] app/testpmd: add rte_flow integer support

2016-11-16 Thread Adrien Mazarguil

Parse all integer types and handle conversion to network byte order in a
single function.
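
The idea can be sketched as follows, under several assumptions: only unsigned
values of 1, 2 or 4 bytes are handled, `struct field` is a simplified stand-in
for the `struct arg` introduced by this patch, and `store_uint()` is a
hypothetical name. `htons()`/`htonl()` come from POSIX `<arpa/inet.h>`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical argument descriptor, loosely modelled on struct arg. */
struct field {
	unsigned int hton; /* store in network byte order */
	size_t offset;     /* offset of the field inside the object */
	size_t size;       /* field size in bytes: 1, 2 or 4 */
};

/* Hypothetical object receiving parsed values. */
struct object {
	uint8_t proto;
	uint16_t port;
	uint32_t addr;
};

/* Parse str as an unsigned integer and store it as described by f. */
static int
store_uint(void *obj, const struct field *f, const char *str)
{
	uint8_t *dst;
	uintmax_t u;
	char *end;

	assert(obj != NULL && f != NULL && str != NULL);
	errno = 0;
	u = strtoumax(str, &end, 0); /* base 0 accepts decimal, hex, octal */
	if (errno || end == str || *end != '\0')
		return -1;
	dst = (uint8_t *)obj + f->offset;
	if (f->size == sizeof(uint8_t)) {
		uint8_t v = (uint8_t)u;

		if (u > UINT8_MAX)
			return -1;
		memcpy(dst, &v, sizeof(v)); /* byte order is irrelevant */
	} else if (f->size == sizeof(uint16_t)) {
		uint16_t v = (uint16_t)u;

		if (u > UINT16_MAX)
			return -1;
		if (f->hton)
			v = htons(v);
		memcpy(dst, &v, sizeof(v));
	} else if (f->size == sizeof(uint32_t)) {
		uint32_t v = (uint32_t)u;

		if (u > UINT32_MAX)
			return -1;
		if (f->hton)
			v = htonl(v);
		memcpy(dst, &v, sizeof(v));
	} else {
		return -1; /* unsupported field size in this sketch */
	}
	return 0;
}
```

Describing the destination by offset and size is what lets one function serve
every field, which is presumably the point of the ARGS_ENTRY() macros in the
diff below.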

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline_flow.c | 148 +++
 1 file changed, 148 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7dbda84..7078f80 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -34,11 +34,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 

 #include 
 #include 
+#include 
 #include 
 #include 

@@ -50,6 +53,10 @@ enum index {
ZERO = 0,
END,

+   /* Common tokens. */
+   INTEGER,
+   UNSIGNED,
+
/* Top-level command. */
FLOW,
 };
@@ -61,12 +68,24 @@ enum index {
 struct context {
/** Stack of subsequent token lists to process. */
const enum index *next[CTX_STACK_SIZE];
+   /** Arguments for stacked tokens. */
+   const void *args[CTX_STACK_SIZE];
enum index curr; /**< Current token index. */
enum index prev; /**< Index of the last token seen. */
int next_num; /**< Number of entries in next[]. */
+   int args_num; /**< Number of entries in args[]. */
uint32_t reparse:1; /**< Start over from the beginning. */
uint32_t eol:1; /**< EOL has been detected. */
uint32_t last:1; /**< No more arguments. */
+   void *object; /**< Address of current object for relative offsets. */
+};
+
+/** Token argument. */
+struct arg {
+   uint32_t hton:1; /**< Use network byte ordering. */
+   uint32_t sign:1; /**< Value is signed. */
+   uint32_t offset; /**< Relative offset from ctx->object. */
+   uint32_t size; /**< Field size. */
 };

 /** Parser token definition. */
@@ -80,6 +99,8 @@ struct token {
 * parser consumes the last entry of that stack.
 */
const enum index *const *next;
+   /** Arguments stack for subsequent tokens that need them. */
+   const struct arg *const *args;
/**
 * Token-processing callback, returns -1 in case of error, the
 * length of the matched string otherwise. If NULL, attempts to
@@ -112,6 +133,22 @@ struct token {
 /** Static initializer for a NEXT() entry. */
 #define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, 0, }

+/** Static initializer for the args field. */
+#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for ARGS() to target a field. */
+#define ARGS_ENTRY(s, f) \
+   (&(const struct arg){ \
+   .offset = offsetof(s, f), \
+   .size = sizeof(((s *)0)->f), \
+   })
+
+/** Static initializer for ARGS() to target a pointer. */
+#define ARGS_ENTRY_PTR(s, f) \
+   (&(const struct arg){ \
+   .size = sizeof(*((s *)0)->f), \
+   })
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
enum index command; /**< Flow command. */
@@ -121,6 +158,11 @@ struct buffer {
 static int parse_init(struct context *, const struct token *,
  const char *, unsigned int,
  void *, unsigned int);
+static int parse_int(struct context *, const struct token *,
+const char *, unsigned int,
+void *, unsigned int);
+static int comp_none(struct context *, const struct token *,
+unsigned int, char *, unsigned int);

 /** Token definitions. */
 static const struct token token_list[] = {
@@ -135,6 +177,21 @@ static const struct token token_list[] = {
.type = "RETURN",
.help = "command may end here",
},
+   /* Common tokens. */
+   [INTEGER] = {
+   .name = "{int}",
+   .type = "INTEGER",
+   .help = "integer value",
+   .call = parse_int,
+   .comp = comp_none,
+   },
+   [UNSIGNED] = {
+   .name = "{unsigned}",
+   .type = "UNSIGNED",
+   .help = "unsigned integer value",
+   .call = parse_int,
+   .comp = comp_none,
+   },
/* Top-level command. */
[FLOW] = {
.name = "flow",
@@ -144,6 +201,23 @@ static const struct token token_list[] = {
},
 };

+/** Remove and return last entry from argument stack. */
+static const struct arg *
+pop_args(struct context *ctx)
+{
+   return ctx->args_num ? ctx->args[--ctx->args_num] : NULL;
+}
+
+/** Add entry on top of the argument stack. */
+static int
+push_args(struct context *ctx, const struct arg *arg)
+{
+   if (ctx->args_num == CTX_STACK_SIZE)
+   return -1;
+   ctx->args[ctx->args_num++] = arg;
+   return 0;
+}
+
 /** Default parsing function for token name ma

[dpdk-dev] [PATCH 05/22] app/testpmd: add flow command

2016-11-16 Thread Adrien Mazarguil

Managing generic flow API functions from the command line requires dynamic
tokens, since flow rules are not fixed and cannot be defined statically.

This commit adds dedicated, flexible parser code and a command object for a
new "flow" command in a separate file.
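
To give an idea of the approach, here is a much reduced sketch of a parser
context holding a stack of candidate token lists, where each parsing step
consumes the top list. The enum values, `names[]` table and function names are
hypothetical and far simpler than the code in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token indices; ZERO terminates candidate lists. */
enum tok_index {
	ZERO = 0,
	END,
	FLOW,
	LIST,
	FLUSH,
};

#define STACK_SIZE 16

/* Token names, indexed by token (NULL for special tokens). */
static const char *const names[] = {
	[FLOW] = "flow",
	[LIST] = "list",
	[FLUSH] = "flush",
};

/* Reduced parser context: a stack of candidate lists. */
struct context {
	const enum tok_index *next[STACK_SIZE];
	int next_num;
};

/* Push a ZERO-terminated candidate list on the stack. */
static int
push_next(struct context *ctx, const enum tok_index *list)
{
	assert(ctx != NULL && list != NULL);
	if (ctx->next_num == STACK_SIZE)
		return -1;
	ctx->next[ctx->next_num++] = list;
	return 0;
}

/* Pop the top list and return the candidate named word, ZERO if none. */
static enum tok_index
match_next(struct context *ctx, const char *word)
{
	const enum tok_index *list;

	assert(ctx != NULL && word != NULL);
	if (ctx->next_num == 0)
		return ZERO;
	list = ctx->next[--ctx->next_num];
	for (; *list != ZERO; ++list)
		if (names[*list] != NULL && strcmp(names[*list], word) == 0)
			return *list;
	return ZERO;
}
```

Because lists are pushed in reverse order of use, a token can queue several
follow-up steps at once, and the candidates for the next word depend on what
has been parsed so far.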

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/Makefile   |   1 +
 app/test-pmd/cmdline.c  |   4 +
 app/test-pmd/cmdline_flow.c | 439 +++
 3 files changed, 444 insertions(+)

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 891b85a..5988c3e 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -47,6 +47,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-y := testpmd.c
 SRCS-y += parameters.c
 SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline.c
+SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_flow.c
 SRCS-y += config.c
 SRCS-y += iofwd.c
 SRCS-y += macfwd.c
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index c5b015c..b7d10b3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -9520,6 +9520,9 @@ cmdline_parse_inst_t cmd_set_flow_director_flex_payload = {
},
 };

+/* Generic flow interface command. */
+extern cmdline_parse_inst_t cmd_flow;
+
 /* *** Classification Filters Control *** */
 /* *** Get symmetric hash enable per port *** */
 struct cmd_get_sym_hash_ena_per_port_result {
@@ -11557,6 +11560,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_set_hash_global_config,
(cmdline_parse_inst_t *)&cmd_set_hash_input_set,
(cmdline_parse_inst_t *)&cmd_set_fdir_input_set,
+   (cmdline_parse_inst_t *)&cmd_flow,
(cmdline_parse_inst_t *)&cmd_mcast_addr,
(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_all,
(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_specific,
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
new file mode 100644
index 000..7dbda84
--- /dev/null
+++ b/app/test-pmd/cmdline_flow.c
@@ -0,0 +1,439 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "testpmd.h"
+
+/** Parser token indices. */
+enum index {
+   /* Special tokens. */
+   ZERO = 0,
+   END,
+
+   /* Top-level command. */
+   FLOW,
+};
+
+/** Maximum number of subsequent tokens and arguments on the stack. */
+#define CTX_STACK_SIZE 16
+
+/** Parser context. */
+struct context {
+   /** Stack of subsequent token lists to process. */
+   const enum index *next[CTX_STACK_SIZE];
+   enum index curr; /**< Current token index. */
+   enum index prev; /**< Index of the last token seen. */
+   int next_num; /**< Number of entries in next[]. */
+   uint32_t reparse:1; /**< Start over from the beginning. */
+   uint32_t eol:1; /**< EOL has been detected. */
+   uint32_t last:1; /**< No more arguments. */
+};
+
+/** Parser token definition. */
+struct token {
+   /** Type displayed during completion (defaults to "TOKEN"). */
+   const char *type;
+   /** Help displayed during completion (defaults to token name). */
+   const char *help;
+   /**
+* Lists of subsequent tokens to push on the stack. Each call to the
+* parser consumes the last entry

[dpdk-dev] [PATCH 04/22] app/testpmd: implement basic support for rte_flow

2016-11-16 Thread Adrien Mazarguil

Add basic management functions for the generic flow API (validate, create,
destroy, flush, query and list). Flow rule objects and properties are
arranged in lists associated with each port.
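
A minimal sketch of such a per-port list, assuming a singly linked list with
the newest rule at the head. The structure layout, the ID allocation scheme
and the function names are assumptions for illustration and are not taken
from this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flow rule entry. */
struct port_flow {
	struct port_flow *next;
	uint32_t id;
};

/* Hypothetical port, reduced to its flow list. */
struct port {
	struct port_flow *flow_list;
};

/* Create an entry at the head; its ID follows the previous head (assumed). */
static int
flow_add(struct port *port, uint32_t *id)
{
	struct port_flow *pf;

	assert(port != NULL && id != NULL);
	pf = malloc(sizeof(*pf));
	if (pf == NULL)
		return -1;
	pf->id = port->flow_list ? port->flow_list->id + 1 : 0;
	pf->next = port->flow_list;
	port->flow_list = pf;
	*id = pf->id;
	return 0;
}

/* Unlink and free the entry with the given ID; -1 if it does not exist. */
static int
flow_destroy(struct port *port, uint32_t id)
{
	struct port_flow **pp;

	assert(port != NULL);
	for (pp = &port->flow_list; *pp != NULL; pp = &(*pp)->next) {
		struct port_flow *pf = *pp;

		if (pf->id != id)
			continue;
		*pp = pf->next;
		free(pf);
		return 0;
	}
	return -1;
}

/* Free every entry and return how many were released. */
static unsigned int
flow_flush(struct port *port)
{
	unsigned int n = 0;

	assert(port != NULL);
	while (port->flow_list != NULL) {
		struct port_flow *pf = port->flow_list;

		port->flow_list = pf->next;
		free(pf);
		++n;
	}
	return n;
}
```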

Signed-off-by: Adrien Mazarguil 
---
 app/test-pmd/cmdline.c |   1 +
 app/test-pmd/config.c  | 484 
 app/test-pmd/csumonly.c|   1 +
 app/test-pmd/flowgen.c |   1 +
 app/test-pmd/icmpecho.c|   1 +
 app/test-pmd/ieee1588fwd.c |   1 +
 app/test-pmd/iofwd.c   |   1 +
 app/test-pmd/macfwd.c  |   1 +
 app/test-pmd/macswap.c |   1 +
 app/test-pmd/parameters.c  |   1 +
 app/test-pmd/rxonly.c  |   1 +
 app/test-pmd/testpmd.c |   6 +
 app/test-pmd/testpmd.h |  27 +++
 app/test-pmd/txonly.c  |   1 +
 14 files changed, 528 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 63b55dc..c5b015c 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -75,6 +75,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 36c47ab..c9dc872 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -92,6 +92,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include "testpmd.h"

@@ -750,6 +752,488 @@ port_mtu_set(portid_t port_id, uint16_t mtu)
printf("Set MTU failed. diag=%d\n", diag);
 }

+/* Generic flow management functions. */
+
+/** Generate flow_item[] entry. */
+#define MK_FLOW_ITEM(t, s) \
+   [RTE_FLOW_ITEM_TYPE_ ## t] = { \
+   .name = # t, \
+   .size = s, \
+   }
+
+/** Information about known flow pattern items. */
+static const struct {
+   const char *name;
+   size_t size;
+} flow_item[] = {
+   MK_FLOW_ITEM(END, 0),
+   MK_FLOW_ITEM(VOID, 0),
+   MK_FLOW_ITEM(INVERT, 0),
+   MK_FLOW_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+   MK_FLOW_ITEM(PF, 0),
+   MK_FLOW_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+   MK_FLOW_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+   MK_FLOW_ITEM(RAW, sizeof(struct rte_flow_item_raw)), /* +pattern[] */
+   MK_FLOW_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+   MK_FLOW_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+   MK_FLOW_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+   MK_FLOW_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+   MK_FLOW_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+   MK_FLOW_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+   MK_FLOW_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+   MK_FLOW_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+   MK_FLOW_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+};
+
+/** Compute storage space needed by item specification. */
+static void
+flow_item_spec_size(const struct rte_flow_item *item,
+   size_t *size, size_t *pad)
+{
+   if (!item->spec)
+   goto empty;
+   switch (item->type) {
+   union {
+   const struct rte_flow_item_raw *raw;
+   } spec;
+
+   case RTE_FLOW_ITEM_TYPE_RAW:
+   spec.raw = item->spec;
+   *size = offsetof(struct rte_flow_item_raw, pattern) +
+   spec.raw->length * sizeof(*spec.raw->pattern);
+   break;
+   default:
+empty:
+   *size = 0;
+   break;
+   }
+   *pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate flow_action[] entry. */
+#define MK_FLOW_ACTION(t, s) \
+   [RTE_FLOW_ACTION_TYPE_ ## t] = { \
+   .name = # t, \
+   .size = s, \
+   }
+
+/** Information about known flow actions. */
+static const struct {
+   const char *name;
+   size_t size;
+} flow_action[] = {
+   MK_FLOW_ACTION(END, 0),
+   MK_FLOW_ACTION(VOID, 0),
+   MK_FLOW_ACTION(PASSTHRU, 0),
+   MK_FLOW_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+   MK_FLOW_ACTION(FLAG, 0),
+   MK_FLOW_ACTION(QUEUE, sizeof(struct rte_flow_action_queue)),
+   MK_FLOW_ACTION(DROP, 0),
+   MK_FLOW_ACTION(COUNT, 0),
+   MK_FLOW_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+   MK_FLOW_ACTION(RSS, sizeof(struct rte_flow_action_rss)), /* +queue[] */
+   MK_FLOW_ACTION(PF, 0),
+   MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+};
+
+/** Compute storage space needed by action configuration. */
+static void
+flow_action_conf_size(const struct rte_flow_action *action,
+ size_t *size, size_t *pad)
+{
+   if (!action->conf)
+   goto empty;
+   switch (action->type) {
+   union {
+   const struct rte_flow_action_rss *rss;
+   } conf;
+
+   case RTE_FLOW_ACTION_TYPE_RSS:
+   conf.rss = action->conf;
+   *size = offsetof(struct rte_flow_action_rss, queue) +
+

[dpdk-dev] [PATCH 03/22] cmdline: add alignment constraint

2016-11-16 Thread Adrien Mazarguil

This prevents SIGBUS errors on architectures that cannot handle unexpected
unaligned accesses to the output buffer.
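
The technique can be sketched in isolation as follows, assuming a C11 compiler
for `_Alignof`. `BUF_SIZE`, `struct parsed` and the helper names are
hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 64 /* hypothetical; stands in for the real buffer size */

/* Hypothetical result structure a callback might cast the buffer to. */
struct parsed {
	uint64_t id;
	void *ptr;
};

/*
 * Wrapping the char array in a union with a strictly aligned member forces
 * the whole object, and therefore buf, to share that alignment.
 */
union result {
	char buf[BUF_SIZE];
	long double align; /* never read; only raises the alignment of buf */
};

/* Return nonzero when addr is suitably aligned for struct parsed. */
static int
aligned_for_parsed(const void *addr)
{
	assert(addr != NULL);
	return (uintptr_t)addr % _Alignof(struct parsed) == 0;
}

/* Return nonzero when a stack-allocated union result is usable as-is. */
static int
result_is_aligned(void)
{
	union result r;

	return aligned_for_parsed(r.buf);
}
```

A bare `char` array only guarantees byte alignment, so casting it to a
structure pointer may fault on strict-alignment targets; the union avoids that
without changing how the buffer is used.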

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cmdline/cmdline_parse.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 14f5553..763c286 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -255,7 +255,10 @@ cmdline_parse(struct cmdline *cl, const char * buf)
unsigned int inst_num=0;
cmdline_parse_inst_t *inst;
const char *curbuf;
-   char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+   union {
+   char buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+   long double align; /* strong alignment constraint for buf */
+   } result;
cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
void (*f)(void *, struct cmdline *, void *) = NULL;
void *data = NULL;
@@ -318,7 +321,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
debug_printf("INST %d\n", inst_num);

/* fully parsed */
-   tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+   tok = match_inst(inst, buf, 0, result.buf, sizeof(result.buf),
 &dyn_tokens);

if (tok > 0) /* we matched at least one token */
@@ -353,7 +356,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)

/* call func */
if (f) {
-   f(result_buf, cl, data);
+   f(result.buf, cl, data);
}

/* no match */
-- 
2.1.4

[dpdk-dev] [PATCH 02/22] cmdline: add support for dynamic tokens

2016-11-16 Thread Adrien Mazarguil

Because tokens must be hard-coded in a list that is part of the instruction
structure, context-dependent tokens cannot be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.
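
The lookup logic can be sketched as below. The types are simplified stand-ins
for the cmdline structures, the generator callback has a reduced signature
compared with the one used in the diff, and all names (`struct tok`,
`struct inst`, `get_token()`, `DYN_MAX`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DYN_MAX 8 /* hypothetical cap on dynamic tokens */

/* Hypothetical token, reduced to a name. */
struct tok {
	const char *name;
};

/* Hypothetical instruction with a static list and a token generator. */
struct inst {
	/* Stores the token for position index in *out, or NULL if none. */
	void (*gen)(const struct tok **out, size_t index);
	const struct tok *const *tokens; /* NULL-terminated static list */
};

/* Return the token at position index, generating it on demand if needed. */
static const struct tok *
get_token(const struct inst *inst, size_t index,
	  const struct tok *dyn[DYN_MAX])
{
	assert(inst != NULL && inst->tokens != NULL && dyn != NULL);
	if (inst->tokens[0] != NULL) {
		size_t i;

		/* Static list: never read past its NULL terminator. */
		for (i = 0; i < index; ++i)
			if (inst->tokens[i] == NULL)
				return NULL;
		return inst->tokens[index];
	}
	/* Empty static list: fall back to the generator, caching results. */
	if (inst->gen == NULL || index >= DYN_MAX - 1)
		return NULL;
	if (dyn[index] == NULL)
		inst->gen(&dyn[index], index);
	return dyn[index];
}

/* Example generator producing "flow" then "list". */
static const struct tok words[] = { { "flow" }, { "list" } };

static void
gen_words(const struct tok **out, size_t index)
{
	*out = index < 2 ? &words[index] : NULL;
}
```

Caching generated tokens in a caller-provided array keeps the instruction
structure itself untouched, which is consistent with the ABI remark above.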

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cmdline/cmdline_parse.c | 60 +
 lib/librte_cmdline/cmdline_parse.h | 21 
 2 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index b496067..14f5553 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -146,7 +146,9 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-  unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
+  unsigned int nb_match_token, void *resbuf, unsigned resbuf_size,
+  cmdline_parse_token_hdr_t
+   *(*dyn_tokens)[CMDLINE_PARSE_DYNAMIC_TOKENS])
 {
unsigned int token_num=0;
cmdline_parse_token_hdr_t * token_p;
@@ -155,6 +157,11 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
struct cmdline_token_hdr token_hdr;

token_p = inst->tokens[token_num];
+   if (!token_p && dyn_tokens && inst->f) {
+   if (!(*dyn_tokens)[0])
+   inst->f(&(*dyn_tokens)[0], NULL, dyn_tokens);
+   token_p = (*dyn_tokens)[0];
+   }
if (token_p)
memcpy(&token_hdr, token_p, sizeof(token_hdr));

@@ -196,7 +203,17 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
buf += n;

token_num ++;
-   token_p = inst->tokens[token_num];
+   if (!inst->tokens[0]) {
+   if (token_num < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+   if (!(*dyn_tokens)[token_num])
+   inst->f(&(*dyn_tokens)[token_num],
+   NULL,
+   dyn_tokens);
+   token_p = (*dyn_tokens)[token_num];
+   } else
+   token_p = NULL;
+   } else
+   token_p = inst->tokens[token_num];
if (token_p)
memcpy(&token_hdr, token_p, sizeof(token_hdr));
}
@@ -239,6 +256,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
cmdline_parse_inst_t *inst;
const char *curbuf;
char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+   cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
void (*f)(void *, struct cmdline *, void *) = NULL;
void *data = NULL;
int comment = 0;
@@ -255,6 +273,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
return CMDLINE_PARSE_BAD_ARGS;

ctx = cl->ctx;
+   memset(&dyn_tokens, 0, sizeof(dyn_tokens));

/*
 * - look if the buffer contains at least one line
@@ -299,7 +318,8 @@ cmdline_parse(struct cmdline *cl, const char * buf)
debug_printf("INST %d\n", inst_num);

/* fully parsed */
-   tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
+   tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+&dyn_tokens);

if (tok > 0) /* we matched at least one token */
err = CMDLINE_PARSE_BAD_ARGS;
@@ -355,6 +375,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
cmdline_parse_token_hdr_t *token_p;
struct cmdline_token_hdr token_hdr;
char tmpbuf[CMDLINE_BUFFER_SIZE], comp_buf[CMDLINE_BUFFER_SIZE];
+   cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
unsigned int partial_tok_len;
int comp_len = -1;
int tmp_len = -1;
@@ -374,6 +395,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,

debug_printf("%s called\n", __func__);
memset(&token_hdr, 0, sizeof(token_hdr));
+   memset(&dyn_tokens, 0, sizeof(dyn_tokens));

/* count the number of complete token to parse */
for (i=0 ; buf[i] ; i++) {
@@ -396,11 +418,24 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
inst = ctx[inst_num];
while (inst) {
/* parse the first tokens of the inst */
-   if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+   if (nb_token &&
+

[dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API

2016-11-16 Thread Adrien Mazarguil

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for, applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil 
---
 MAINTAINERS|   4 +
 lib/librte_ether/Makefile  |   3 +
 lib/librte_ether/rte_eth_ctrl.h|   1 +
 lib/librte_ether/rte_ether_version.map |  10 +
 lib/librte_ether/rte_flow.c| 159 +
 lib/librte_ether/rte_flow.h| 947 
 lib/librte_ether/rte_flow_driver.h | 177 ++
 7 files changed, 1301 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6bb8f8..3b46630 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -243,6 +243,10 @@ M: Thomas Monjalon 
 F: lib/librte_ether/
 F: scripts/test-null.sh

+Generic flow API
+M: Adrien Mazarguil 
+F: lib/librte_ether/rte_flow*
+
 Crypto API
 M: Declan Doherty 
 F: lib/librte_cryptodev/
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index efe1e5f..9335361 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
 LIBABIVER := 5

 SRCS-y += rte_ethdev.c
+SRCS-y += rte_flow.c

 #
 # Export include files
@@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
+SYMLINK-y-include += rte_flow_driver.h

 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index fe80eb0..8386904 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -99,6 +99,7 @@ enum rte_filter_type {
RTE_ETH_FILTER_FDIR,
RTE_ETH_FILTER_HASH,
RTE_ETH_FILTER_L2_TUNNEL,
+   RTE_ETH_FILTER_GENERIC,
RTE_ETH_FILTER_MAX
 };

diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 72be66d..b5d2547 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -147,3 +147,13 @@ DPDK_16.11 {
rte_eth_dev_pci_remove;

 } DPDK_16.07;
+
+DPDK_17.02 {
+   global:
+
+   rte_flow_validate;
+   rte_flow_create;
+   rte_flow_destroy;
+   rte_flow_query;
+
+} DPDK_16.11;
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
new file mode 100644
index 000..064963d
--- /dev/null
+++ b/lib/librte_ether/rte_flow.c
@@ -0,0 +1,159 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+#include 
+#include "rte_ethdev.h"
+#include "rte_flow_driver.h"
+#include "rte_flow.h"
+
+

[dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)

2016-11-16 Thread Adrien Mazarguil

As previously discussed in RFC v1 [1], RFC v2 [2], with changes
described in [3] (also pasted below), here is the first non-draft series
for this new API.

Its capabilities are so generic that its name had to be vague. It may be
called "Generic flow API" or "Generic flow interface" (possibly shortened
to "GFI") after the name of the new filter type, or "rte_flow" after
the prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for
all PMDs to expose a common filtering/classification interface, it may
eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed the ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect struct
  rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
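
As an illustration of the spec/last/mask ranges described above, here is a
minimal sketch for a single 16-bit field kept in host byte order. struct
item16 and its helpers are hypothetical simplifications (the real struct
rte_flow_item carries untyped pointers to type-specific structures), and
the defaults chosen when a pointer is NULL are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified pattern item covering one 16-bit field. */
struct item16 {
	const uint16_t *spec; /* value to match, or start of the range */
	const uint16_t *last; /* end of the range, optional */
	const uint16_t *mask; /* bits that take part in the comparison */
};

/*
 * Match value against the inclusive interval between (spec & mask) and
 * (last & mask). Assumptions: no spec matches anything, no mask means all
 * bits are significant, no last means a single value.
 */
static bool item16_match(const struct item16 *item, uint16_t value)
{
	uint16_t mask = item->mask ? *item->mask : UINT16_MAX;
	uint16_t lo;
	uint16_t hi;

	if (item->spec == NULL)
		return true;
	lo = (uint16_t)(*item->spec & mask);
	hi = item->last ? (uint16_t)(*item->last & mask) : lo;
	value = (uint16_t)(value & mask);
	return value >= lo && value <= hi;
}

/* Example: does a port number fall within 1000-2000 inclusive? */
static bool port_in_example_range(uint16_t port)
{
	static const uint16_t spec = 1000;
	static const uint16_t last = 2000;
	static const uint16_t mask = 0xffff;
	const struct item16 item = { &spec, &last, &mask };

	return item16_match(&item, port);
}
```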

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.
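
The relationship between the public wrappers and the PMD callbacks can be
sketched as follows. This is a self-contained model, not the code from
rte_flow.c: struct device, struct flow_ops, the FILTER_* constants and the
error codes returned are all hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ethdev types; names are illustrative. */
enum filter_type { FILTER_FDIR, FILTER_GENERIC };

struct flow_ops {
	int (*validate)(int port_id);
	int (*flush)(int port_id);
};

struct device {
	int (*filter_ctrl)(struct device *dev, enum filter_type type,
			   void *arg);
};

/* PMD side: implements validate only and hands out its callback table. */
static int pmd_validate(int port_id)
{
	(void)port_id;
	return 0;
}

static const struct flow_ops pmd_ops = { .validate = pmd_validate };

static int pmd_filter_ctrl(struct device *dev, enum filter_type type,
			   void *arg)
{
	(void)dev;
	if (type != FILTER_GENERIC)
		return -ENOTSUP;
	*(const struct flow_ops **)arg = &pmd_ops;
	return 0;
}

/* Library side: fetch the callbacks through the generic filter query. */
static const struct flow_ops *flow_ops_get(struct device *dev)
{
	const struct flow_ops *ops = NULL;

	if (dev->filter_ctrl == NULL ||
	    dev->filter_ctrl(dev, FILTER_GENERIC, &ops) != 0)
		return NULL;
	return ops;
}

static int flow_validate(struct device *dev, int port_id)
{
	const struct flow_ops *ops = flow_ops_get(dev);

	if (ops == NULL)
		return -ENOSYS; /* no generic flow support at all */
	if (ops->validate == NULL)
		return -ENOTSUP; /* this callback is not implemented */
	return ops->validate(port_id);
}

static int flow_flush(struct device *dev, int port_id)
{
	const struct flow_ops *ops = flow_ops_get(dev);

	if (ops == NULL)
		return -ENOSYS;
	if (ops->flush == NULL)
		return -ENOTSUP;
	return ops->flush(port_id);
}

/* Two example devices: one exposing the callbacks, one without them. */
static struct device dev_with_flow = { .filter_ctrl = pmd_filter_ctrl };
static struct device dev_without_flow = { .filter_ctrl = NULL };
```

Keeping the callback table behind a query means applications only ever see
the wrappers, while a PMD can implement any subset of the callbacks.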

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.
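
Regarding the 24-bit VXLAN field mentioned in the changes above, a byte
array sidesteps both bit-field layout and host endianness. The following is
a minimal sketch under that assumption; struct vxlan_item and the vni_*()
helpers are hypothetical names, not the definitions from rte_flow.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical item layout: the 24-bit VNI is kept as three bytes. */
struct vxlan_item {
	uint8_t flags;
	uint8_t rsvd0[3];
	uint8_t vni[3]; /* network byte order, most significant byte first */
	uint8_t rsvd1;
};

/* Store the low 24 bits of a host-order value, byte by byte. */
static void vni_set(struct vxlan_item *item, uint32_t vni)
{
	item->vni[0] = (uint8_t)(vni >> 16);
	item->vni[1] = (uint8_t)(vni >> 8);
	item->vni[2] = (uint8_t)vni;
}

/* Rebuild the host-order value, identical on any endianness. */
static uint32_t vni_get(const struct vxlan_item *item)
{
	return ((uint32_t)item->vni[0] << 16) |
	       ((uint32_t)item->vni[1] << 8) |
	       (uint32_t)item->vni[2];
}

/* Convenience helper for checking that set and get are inverses. */
static uint32_t vni_roundtrip(uint32_t vni)
{
	struct vxlan_item item = { 0 };

	vni_set(&item, vni);
	return vni_get(&item);
}
```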

Adrien Mazarguil (22):
  ethdev: introduce generic flow API
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command

 MAINTAINERS|4 +
 app/test-pmd/Makefile  |1 +
 app/test-pmd/cmdline.c |   32 +
 app/test-pmd/cmdline_flow.c| 2581 +++
 app/test-pmd/config.c  |  484 +
 app/test-pmd/csumonly.c|1 +
 app/test-pmd/flowgen.c

[dpdk-dev] [RFC v2] Generic flow director/filtering/classification API

2016-11-09 Thread Adrien Mazarguil

Hi Helin and PMD maintainers,

On Tue, Nov 08, 2016 at 01:31:05AM +, Zhang, Helin wrote:
> Hi Adrien
> 
> Any update on the v1 APIs? We are struggling on that, as we need that for our 
> development.
> May I bring another idea to remove the blocking?
> Can we send out the APIs with PMD changes based on our understanding of the 
> RFC we discussed recently on community? Then you can just update any 
> modification on top of it, or ask the submitters to change with your review 
> comments?
> Any comments on this idea? If not, then we may go this way. I guess this 
> might be the most efficient way. Thank you very much!

Not wanting to hold back anyone's progress anymore (not that I was doing it
on purpose), here's my work tree with the updated and functional API
(rte_flow branch based on top of v16.11-rc3) while I'm preparing the
patchset for official submission:

 https://github.com/am6/dpdk.org/tree/rte_flow

As a work in progress, this branch is subject to change.

API changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed the ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect struct
  rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
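
For reference, END-terminated lists such as those described above can be
walked without a separate count. The sketch below uses hypothetical names
(struct item, enum item_type and its ITEM_* values); it only models the
termination convention, not the real rte_flow definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of item types; END terminates a list. */
enum item_type { ITEM_END, ITEM_ETH, ITEM_IPV4, ITEM_UDP };

struct item {
	enum item_type type;
	const void *spec; /* unused in this sketch */
};

/* Count the items preceding the END terminator. */
static size_t pattern_len(const struct item *pattern)
{
	size_t n = 0;

	while (pattern[n].type != ITEM_END)
		n++;
	return n;
}

/* Example pattern: Ethernet, then IPv4, then UDP. */
static size_t example_pattern_len(void)
{
	static const struct item pattern[] = {
		{ .type = ITEM_ETH },
		{ .type = ITEM_IPV4 },
		{ .type = ITEM_UDP },
		{ .type = ITEM_END },
	};

	return pattern_len(pattern);
}
```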

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] dpdk16.11 RC2 package ipv4 reassembly example can't work

2016-11-04 Thread Adrien Mazarguil

On Fri, Nov 04, 2016 at 06:36:30AM +, Lu, Wenzhuo wrote:
> Hi Adrien,
> 
> > -Original Message-
> > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > Sent: Wednesday, November 2, 2016 11:21 PM
> > To: Lu, Wenzhuo
> > Cc: Ananyev, Konstantin; Liu, Yu Y; Chen, WeichunX; Xu, HuilongX;
> > dev at dpdk.org
> > Subject: Re: dpdk16.11 RC2 package ipv4 reassembly example can't work
> > 
> > Hi all,
> > 
> > On Wed, Nov 02, 2016 at 08:39:31AM +, Lu, Wenzhuo wrote:
> > > Correct the typo of receiver.
> > >
> > > Hi Adrien,
> > > The change from struct ip_frag_pkt pkt[0]  to struct ip_frag_pkt pkt[] 
> > > will
> > make IP reassembly not working. I think this is not the root cause. Maybe
> > Konstantin can give us some idea.
> > > But I notice one thing, you change some from [0] to [], but others just 
> > > add
> > '__extension__'. I believe if you add '__extension__' for struct 
> > ip_frag_pkt pkt[0],
> > we'll not hit this issue. Just curious why you use 2 ways to resolve the 
> > same
> > problem.
> > 
> > I've used the __extension__ method whenever the C99 syntax could not work
> > due to invalid usage in the code, e.g. a flexible array cannot be the only 
> > member
> > of a struct, you cannot make arrays out of structures that contain such 
> > fields,
> > while there is no such constraint with the GNU syntax.
> > 
> > For example see __extension__ uint8_t action_data[0] in struct
> > rte_pipeline_table_entry. The C99 could not be used because of
> > test_table_acl.c:
> > 
> >   struct rte_pipeline_table_entry entries[5];
> > 
> > If replacing ip_frag_pkt[] with __extension__ ip_frag_pkt pkt[0] in 
> > rte_ip_frag.h
> > solves the issue, either some code is breaking some constraint somewhere or
> > this change broke the ABI (unlikely considering a simple recompilation 
> > should
> > have taken care of the issue). I did not notice any change in sizeof(struct
> > rte_ip_frag_tbl) nor offsetof(struct rte_ip_frag_tbl, pkt) on my setup, 
> > perhaps
> > the compilation flags used in your test affect them somehow.
> Thanks for your explanation. I also checked sizeof(struct rte_ip_frag_tbl). I 
> don't see any change either.
> 
> > 
> > Can you confirm whether only reverting this particular field solves the 
> > issue?
> Yes. ip_frag_pkt pkt[0] or even ip_frag_pkt pkt[1] can work but ip_frag_pkt 
> pkt[] cannot :(
> Do you like the idea of changing the ip_frag_pkt[] to __extension__ 
> ip_frag_pkt pkt[0]?

Yes, restoring the original code (with __extension__) as a workaround until
we understand what is going on is safer, that's fine by me. The commit log
should explicitly state that weirdness occurs for an unknown reason with the
C99 syntax though (compiler bug is also a possibility).

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH 0/2] update mlx5 release note and guide

2016-11-02 Thread Adrien Mazarguil

On Wed, Nov 02, 2016 at 02:46:42PM +0100, Nelio Laranjeiro wrote:
> Nelio Laranjeiro (2):
>   doc: update mlx5 dependencies
>   doc: add mlx5 release notes
> 
>  doc/guides/nics/mlx5.rst   |   8 +-
>  doc/guides/rel_notes/release_16_11.rst | 136 
> ++---
>  2 files changed, 114 insertions(+), 30 deletions(-)
> 
> -- 
> 2.1.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH 0/3] fix Rx checksum offloads

2016-11-02 Thread Adrien Mazarguil

On Wed, Nov 02, 2016 at 11:39:36AM +0100, Nelio Laranjeiro wrote:
> Fill correctly the Mbuf Rx offloads.
> 
> Nelio Laranjeiro (3):
>   net/mlx5: fix Rx checksum macros
>   net/mlx5: define explicit fields for Rx offloads
>   net/mlx: fix support for new Rx checksum flags
> 
>  drivers/net/mlx4/mlx4.c  | 21 --
>  drivers/net/mlx5/mlx5_prm.h  | 37 +-
>  drivers/net/mlx5/mlx5_rxtx.c | 93 
> 
>  3 files changed, 87 insertions(+), 64 deletions(-)
> 
> -- 
> 2.1.4

Thanks. For the series:

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: fix wrong use of vector instruction

2016-11-02 Thread Adrien Mazarguil

On Tue, Nov 01, 2016 at 08:13:27AM +, Elad Persiko wrote:
> Constraint alignment was not respected in Tx.
> 
> Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
> 
> Signed-off-by: Elad Persiko 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 21164ba..ba8e202 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -309,7 +309,7 @@ mlx5_tx_dbrec(struct txq *txq)
>   *txq->qp_db = htonl(txq->wqe_ci);
>   /* Ensure ordering between DB record and BF copy. */
>   rte_wmb();
> - rte_mov16(dst, (uint8_t *)data);
> + memcpy(dst, (uint8_t *)data, 16);
>   txq->bf_offset ^= (1 << txq->bf_buf_size);
>  }
>  
> @@ -449,7 +449,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, 
> uint16_t pkts_n)
>   wqe->eseg.mss = 0;
>   wqe->eseg.rsvd2 = 0;
>   /* Start by copying the Ethernet Header. */
> - rte_mov16((uint8_t *)raw, (uint8_t *)addr);
> + memcpy((uint8_t *)raw, ((uint8_t *)addr), 16);
>   length -= MLX5_WQE_DWORD_SIZE;
>   addr += MLX5_WQE_DWORD_SIZE;
>   /* Replace the Ethernet type by the VLAN if necessary. */
> -- 
> 1.8.3.1

Acked-by: Adrien Mazarguil 
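
For context, memcpy() places no alignment requirement on its arguments,
which is what makes it a safe substitute when the alignment of either
buffer is not guaranteed. A minimal sketch with hypothetical helper names,
unrelated to the actual mlx5 data path:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COPY_SIZE 16

/* Copy 16 bytes between buffers of arbitrary alignment. */
static void copy16_any_alignment(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, COPY_SIZE);
}

/* Copy from and to addresses offset by one byte; return 0 on success. */
static int example_unaligned_copy(void)
{
	uint8_t src[COPY_SIZE + 1];
	uint8_t dst[COPY_SIZE + 1] = { 0 };
	unsigned int i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)(i + 1);
	copy16_any_alignment(dst + 1, src + 1);
	return memcmp(dst + 1, src + 1, COPY_SIZE);
}
```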

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC]Generic flow filtering API Sample Application

2016-11-02 Thread Adrien Mazarguil

Hi Wei,

On Wed, Nov 02, 2016 at 05:27:50AM +, Zhao1, Wei wrote:
> Hi  All,
> We are now planning a sample application for the Generic flow 
> filtering API feature, and I have finished the RFC for this example app.
> Adrien Mazarguil has sent the v2 version of the Generic flow 
> filtering API, and this sample application RFC is based on that.
> 
> Thank you.

Thanks for your RFC, and sorry for the late notice: I've essentially been
working on a similar implementation in testpmd in order to validate the API
before sending v1, which I concede is taking way longer than expected.

I have yet to submit my patches, but this should happen soon. If you
haven't started working on your own implementation yet, please wait until my
implementation gets rejected to avoid any more duplicated effort in the
meantime.

BTW, I find a lot of similarities between our respective command-line
handling approaches, which is great! We're going in the same direction.

> Generic flow filtering API Sample Application
> 
> 
> The application is a simple example of generic flow filtering API using the 
> DPDK.
> The application performs flow director/filtering/classification in packet 
> processing.
> 
> Overview
> 
> 
> The application demonstrates the use of the generic flow 
> director/filtering/classification API 
> in the DPDK to implement packet forwarding. This document focuses on 
> guidelines for writing rule configuration 
> files and on prompt command usage. It also supplies the definition of the 
> available EAL option arguments, which are useful
> in DPDK packet forwarding processing.
> 
> 
> Compiling the Application
> -
> 
> To compile the application:
> 
> #.Go to the sample application directory:
> 
>   .. code-block:: console
> 
>   export RTE_SDK=/path/to/rte_sdk
>   cd ${RTE_SDK}/examples/gen_filter
> 
> #.Set the target (a default target is used if not specified). For example:
> 
>   .. code-block:: console
> 
>   export RTE_TARGET=x86_64-native-linuxapp-gcc
> 
>   See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
> 
> #.Build the application:
> 
>   .. code-block:: console
> 
>   make
> 
> Running the Application
> ---
> The application has a number of EAL options::
> 
>   ./gen_filter [EAL options] -- 
> 
> EAL options:
> * -c
>   Codemask, set the hexadecimal bitmask of the cores to run on.
> 
> * -n
>   Num, set the number of memory channels to use.
> 
> APP PARAMS:
>   The following are the application options parameters, they must be 
> separated
>   from the EAL options with a "--" separator.
> 
> * -i
>   Interactive, run this app in interactive mode. In this mode, the app 
> starts with a prompt that can
>   be used to start and stop forwarding, then manage the generic filter rules 
> configured in the application,
>   refer to the following description for more details. In 
> non-interactive mode, the application starts with the configuration specified 
> on the
>   command-line and immediately enters forwarding mode.
> 
> * --portmask=0xXX
>   Set the hexadecimal bitmask of the ports which can be used by the 
> generic flow director test in packet forwarding.
>   
> * --coremask=0xXX
>   Set the hexadecimal bitmask of the cores running the packet forwarding 
> test. The master
>   lcore is reserved for command line parsing only and cannot be masked on 
> for packet forwarding.
> 
> * --nb-ports=N 
>   Set the number of forwarding ports, where 1 <= N <= "number of ports" 
> on the board
>   or CONFIG_RTE_MAX_ETHPORTS from the configuration file. The default 
> value is the number of ports on the board.
> 
> * --rxq=N
>   Set the number of RX queues per port to N, where 1 <= N <= 65535. The 
> default value is 1.
> 
> * --txq=N
>   Set the number of TX queues per port to N, where 1 <= N <= 65535. The 
> default value is 1.
> 
> 
> ###this part need to complete later after decision of which EAL commands 
> arguments need to be support in this application###
> 
> 
> Interactive mode
> 
> *   when the gen_filter application is started in interactive mode, 
> (-i|--interactive), it displays a prompt 
>   that can be used to start and stop forwarding, and configure the 
> application to set the Flow Director,
>   display statistics, set the Flow Director and other tasks. The 
> application has a number of commands

[dpdk-dev] dpdk16.11 RC2 package ipv4 reassembly example can't work

2016-11-02 Thread Adrien Mazarguil

Hi all,

On Wed, Nov 02, 2016 at 08:39:31AM +, Lu, Wenzhuo wrote:
> Correct the typo of receiver.
> 
> Hi Adrien,
> The change from struct ip_frag_pkt pkt[0]  to struct ip_frag_pkt pkt[] will 
> make IP reassembly not working. I think this is not the root cause. Maybe 
> Konstantin can give us some idea.
> But I notice one thing, you change some from [0] to [], but others just add 
> '__extension__'. I believe if you add '__extension__' for struct ip_frag_pkt 
> pkt[0], we'll not hit this issue. Just curious why you use 2 ways to resolve 
> the same problem.

I've used the __extension__ method whenever the C99 syntax could not work
due to invalid usage in the code, e.g. a flexible array cannot be the only
member of a struct, you cannot make arrays out of structures that contain
such fields, while there is no such constraint with the GNU syntax.

For example see __extension__ uint8_t action_data[0] in struct
rte_pipeline_table_entry. The C99 could not be used because of
test_table_acl.c:

  struct rte_pipeline_table_entry entries[5];

If replacing ip_frag_pkt pkt[] with __extension__ ip_frag_pkt pkt[0] in
rte_ip_frag.h solves the issue, either some code is breaking some constraint
somewhere or this change broke the ABI (unlikely considering a simple
recompilation should have taken care of the issue). I did not notice any
change in sizeof(struct rte_ip_frag_tbl) nor offsetof(struct
rte_ip_frag_tbl, pkt) on my setup, perhaps the compilation flags used in
your test affect them somehow.
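
The kind of check described above can be sketched as follows. The
structures are hypothetical stand-ins (not the real rte_ip_frag_tbl), and
the GNU form requires a compiler that accepts zero-length arrays, such as
GCC or clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in element type; not the real struct ip_frag_pkt. */
struct pkt {
	uint64_t key;
	uint32_t len;
};

/* C99 syntax: flexible array member. */
struct tbl_c99 {
	uint32_t nb_entries;
	struct pkt pkt[];
};

/* GNU syntax: zero-length array, __extension__ silences -pedantic. */
struct tbl_gnu {
	uint32_t nb_entries;
	__extension__ struct pkt pkt[0];
};

/*
 * Accepted with the GNU syntax. C99 does not allow arrays of structures
 * that contain a flexible array member.
 */
struct tbl_gnu gnu_tables[5];

/* Return 1 if both variants have the same size and field offset. */
static int layouts_match(void)
{
	return sizeof(struct tbl_c99) == sizeof(struct tbl_gnu) &&
	       offsetof(struct tbl_c99, pkt) == offsetof(struct tbl_gnu, pkt);
}
```

If layouts_match() holds with the same compiler and flags as the failing
build, a layout change is unlikely to be the cause.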

Can you confirm whether only reverting this particular field solves the
issue?

> From: Xu, HuilongX
> Sent: Wednesday, November 2, 2016 4:29 PM
> To: drien.mazarguil at 6wind.com
> Cc: Ananyev, Konstantin; Liu, Yu Y; Chen, WeichunX; Lu, Wenzhuo; Xu, HuilongX
> Subject: dpdk16.11 RC2 package ipv4 reassembly example can't work
> 
> Hi mazarguil,
> I find ip reassembly example can't work with dpdk16.11 rc2 package.
> But when I reset dpdk code before 347a1e037fd323e6c2af55d17f7f0dc4bfe1d479, 
> it works ok.
> Could you have time to check this issue, thanks  a lot.
> Unzip password: intel123
> 
> Test detail info:
> 
> os:4.2.3-300.fc23.x86_64
> gcc version:5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)
> NIC:03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection 
> X552/X557-AT 10GBASE-T [8086:15ad] and
> 84:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit 
> SFI/SFP+ Network Connection [8086:10fb] (rev 01)
> package: dpdk16.11.rc2.tar.gz
> test steps:
> 1. build and install dpdk
> 2. build ip_reassembly example
> 3. run ip_reassembly
> ./examples/ip_reassembly/build/ip_reassembly -c 0x2 -n 4 -- -p 0x1 --maxflows=1024 --flowttl=10s
> 4. set tester port mtu
> ip link set mtu 9000 dev ens160f1
> 5. setup scapy on tester and send packet
> scapy
> pcap = rdpcap("file.pcap")
> sendp(pcap, iface="ens160f1")
> 6. sniff packet on tester and check packet
> test result:
> dpdk16.04 reassembly packet successful but dpdk16.11 reassembly pack failed.
> 
> comments:
> file.pcap: send packets pcap file
> tcpdump_16.04_reassembly_successful.pcap: sniff packets by tcpdump on 16.04.
> tcpdump_reset_code_reassembly_failed.pcap: sniff packets by tcpdump on 16.11
> reset_code_reassembly_successful_.jpg: reassembly a packets successful detail 
> info
> dpdk16.11_reassembly_failed.jpg: reassembly a packets failed detail info
> 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC v2] Generic flow director/filtering/classification API

2016-11-02 Thread Adrien Mazarguil

Hi Helin,

On Mon, Oct 31, 2016 at 07:19:18AM +, Zhang, Helin wrote:
> Hi Adrien
> 
> Just a double check, do you have any update on the v1 patch set, as now it is 
> the end of October?
> We are extremely eager to see the v1 patch set for development.
> I don't think we need full validation on the v1 patch set for API. It should 
> be together with PMD and example application.
> If we can see the v1 API patch set earlier, we can help to validate it with 
> our code changes. That should be more efficient and helpful.
> Any comments on my personal understanding?
> 
> Thank you very much for the hard work and kind helps!

I intend to send it shortly, likely this week. For the record, a large part
of this task was also dedicated to implementing it on the client side (I've
just read Wei's RFC for a client-side application, to which I will reply
separately) in order to validate it from a usability standpoint, which led
me to make a few necessary adjustments to the API.

My next submission will include both the updated API with several changes
discussed on this ML and testpmd code (not a separate application) that uses
it. Just hang on a bit longer!

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Friday, September 30, 2016 1:11 AM
> > To: dev at dpdk.org
> > Cc: Thomas Monjalon
> > Subject: Re: [dpdk-dev] [RFC v2] Generic flow 
> > director/filtering/classification
> > API
> > 
> > On Fri, Aug 19, 2016 at 08:50:44PM +0200, Adrien Mazarguil wrote:
> > > Hi All,
> > >
> > > Thanks to many for the positive and constructive feedback I've
> > > received so far. Here is the updated specification (v0.7) at last.
> > >
> > > I've attempted to address as many comments as possible but could not
> > > process them all just yet. A new section "Future evolutions" has been
> > > added for the remaining topics.
> > >
> > > This series adds rte_flow.h to the DPDK tree. Next time I will attempt
> > > to convert the specification as a documentation commit part of the
> > > patchset and actually implement API functions.
> > [...]
> > 
> > A quick update, we initially targeted 16.11 as the DPDK release this API 
> > would
> > be available for, turns out this goal was somewhat too optimistic as
> > September is ending and we are about to overshoot the deadline for
> > integration (basically everything took longer than expected, big surprise).
> > 
> > So instead of rushing things now to include a botched API in 16.11 with no
> > PMD support, we simply modified the target, now set to 17.02. On the plus
> > side this should leave developers more time to refine and test the API 
> > before
> > applications and PMDs start to use it.
> > 
> > I intend to send the patchset for the first non-draft version mid-October
> > worst case (ASAP in fact). I still haven't replied to several comments but 
> > did
> > take them into account, thanks for your feedback.
> > 
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] mbuf changes

2016-10-28 Thread Adrien Mazarguil

On Fri, Oct 28, 2016 at 04:11:45PM +0200, Morten Brørup wrote:
> Comments at the end.
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pattan, Reshma
> > Sent: Friday, October 28, 2016 3:35 PM
> > To: Olivier Matz
> > Cc: dev at dpdk.org; Morten Brørup
> > Subject: Re: [dpdk-dev] mbuf changes
> > 
> > Hi Olivier,
> > 
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> > > Sent: Tuesday, October 25, 2016 1:49 PM
> > > To: Richardson, Bruce ; Morten Brørup
> > > 
> > > Cc: Adrien Mazarguil ; Wiles, Keith
> > > ; dev at dpdk.org; Oleg Kuporosov
> > > 
> > > Subject: Re: [dpdk-dev] mbuf changes
> > >
> > >
> > >
> > > On 10/25/2016 02:45 PM, Bruce Richardson wrote:
> > > > On Tue, Oct 25, 2016 at 02:33:55PM +0200, Morten Brørup wrote:
> > > >> Comments at the end.
> > > >>
> > > >> Med venlig hilsen / kind regards
> > > >> - Morten Brørup
> > > >>
> > > >>> -Original Message-
> > > >>> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > > >>> Sent: Tuesday, October 25, 2016 2:20 PM
> > > >>> To: Morten Brørup
> > > >>> Cc: Adrien Mazarguil; Wiles, Keith; dev at dpdk.org; Olivier Matz;
> > > >>> Oleg Kuporosov
> > > >>> Subject: Re: [dpdk-dev] mbuf changes
> > > >>>
> > > >>> On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Brørup wrote:
> > > >>>> Comments inline.
> > > >>>>
> > > >>>>> -Original Message-
> > > >>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce
> > > >>>>> Richardson
> > > >>>>> Sent: Tuesday, October 25, 2016 1:14 PM
> > > >>>>> To: Adrien Mazarguil
> > > >>>>> Cc: Morten Brørup; Wiles, Keith; dev at dpdk.org; Olivier Matz;
> > > >>>>> Oleg Kuporosov
> > > >>>>> Subject: Re: [dpdk-dev] mbuf changes
> > > >>>>>
> > > >>>>> On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil
> > wrote:
> > > >>>>>> On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> > > >>>>>>> Comments inline.
> > > >>>>>>>
> > > >>>>>>> Med venlig hilsen / kind regards
> > > >>>>>>> - Morten Brørup
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>> -Original Message-
> > > >>>>>>>> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > > >>>>>>>> Sent: Tuesday, October 25, 2016 11:39 AM
> > > >>>>>>>> To: Bruce Richardson
> > > >>>>>>>> Cc: Wiles, Keith; Morten Brørup; dev at dpdk.org; Olivier Matz;
> > > >>>>>>>> Oleg Kuporosov
> > > >>>>>>>> Subject: Re: [dpdk-dev] mbuf changes
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson
> > > >>> wrote:
> > > >>>>>>>>> On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith
> > > >>> wrote:
> > > >>>>>>>> [...]
> > > >>>>>>>>>>> On Oct 24, 2016, at 10:49 AM, Morten Brørup
> > > >>>>>>>>  wrote:
> > > >>>>>>>> [...]
> > > >>>>
> > > >>>>>>>>> One other point I'll mention is that we need to have a
> > > >>>>>>>>> discussion on how/where to add in a timestamp value into
> > > >>> the
> > > >>>>>>>>> mbuf. Personally, I think it can be in a union with the
> > > >>>>> sequence
> > > >>>>>>>>> number value, but I also suspect that 32-bits of a
> > > >>> timestamp
> > > >>>>>>>>> is not going to be enough for
> > > >>>>>>>> many.
> > > >>>>>>>>>

[dpdk-dev] [PATCH] net/mlx5: fix handling of small mbuf sizes

2016-10-28 Thread Adrien Mazarguil

On Mon, Oct 24, 2016 at 11:10:59AM +0300, Raslan Darawsheh wrote:
> When mbufs are smaller than MRU, multi-segment support must be enabled to
> default set when not in promiscuous or allmulticast modes.
> 
> Fixes: 9964b965ad69 ("net/mlx5: re-add Rx scatter support")
> 
> Signed-off-by: Raslan Darawsheh 
> ---
>  drivers/net/mlx5/mlx5_rxq.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index 4dc5cc3..62253ed 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -946,6 +946,12 @@ rxq_ctrl_setup(struct rte_eth_dev *dev, struct rxq_ctrl *rxq_ctrl,
>   (void)conf; /* Thresholds configuration (ignored). */
>   /* Enable scattered packets support for this queue if necessary. */
>   assert(mb_len >= RTE_PKTMBUF_HEADROOM);
> + /* If smaller than MRU, multi-segment support must be enabled. */
> + if (mb_len < (priv->mtu > dev->data->dev_conf.rxmode.max_rx_pkt_len ?
> +  dev->data->dev_conf.rxmode.max_rx_pkt_len :
> +  priv->mtu
> +  ))

Let's move poor "))" to the end of the previous line.

> + dev->data->dev_conf.rxmode.jumbo_frame = 1;
>   if ((dev->data->dev_conf.rxmode.jumbo_frame) &&
>   (dev->data->dev_conf.rxmode.max_rx_pkt_len >
>(mb_len - RTE_PKTMBUF_HEADROOM))) {
> -- 
> 1.9.1

Besides the above comment:

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: fix default set for multicast traffic

2016-10-28 Thread Adrien Mazarguil

On Mon, Oct 24, 2016 at 10:59:14AM +0300, Raslan Darawsheh wrote:
> Remove non-IPv6 multicast traffic with destination MAC 33:33:* from the
> default set when not in promiscuous or allmulticast modes.
> 
> Fixes: 0497ddaac511 ("mlx5: add special flows for broadcast and IPv6 multicast")
> 
> Signed-off-by: Raslan Darawsheh 
> ---
>  drivers/net/mlx5/mlx5_rxmode.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
> index 173e6e8..4ffe703 100644
> --- a/drivers/net/mlx5/mlx5_rxmode.c
> +++ b/drivers/net/mlx5/mlx5_rxmode.c
> @@ -104,7 +104,6 @@ static const struct special_flow_init special_flow_init[] = {
>   .hash_types =
>   1 << HASH_RXQ_UDPV6 |
>   1 << HASH_RXQ_IPV6 |
> - 1 << HASH_RXQ_ETH |
>   0,
>   .per_vlan = 1,
>   },
> -- 
> 1.9.1

(NACK)

While technically correct, it looks like this patch sometimes breaks IPv6
multicast traffic as well, let's drop it until we figure out the reason.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] mbuf changes

2016-10-25 Thread Adrien Mazarguil

On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Brørup wrote:
> Comments inline.

I'm only replying to the nb_segs bits here.

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Tuesday, October 25, 2016 1:14 PM
> > To: Adrien Mazarguil
> > Cc: Morten Brørup; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg
> > Kuporosov
> > Subject: Re: [dpdk-dev] mbuf changes
> > 
> > On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil wrote:
> > > On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> > > > Comments inline.
> > > >
> > > > Med venlig hilsen / kind regards
> > > > - Morten Brørup
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > > > > Sent: Tuesday, October 25, 2016 11:39 AM
> > > > > To: Bruce Richardson
> > > > > Cc: Wiles, Keith; Morten Brørup; dev at dpdk.org; Olivier Matz; Oleg
> > > > > Kuporosov
> > > > > Subject: Re: [dpdk-dev] mbuf changes
> > > > >
> > > > > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson wrote:
> > > > > > On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith wrote:
> > > > > [...]
> > > > > > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup
> > > > >  wrote:
> > > > > [...]
> > > > > > > > 5.
> > > > > > > >
> > > > > > > > > And here's something new to think about:
> > > > > > > > >
> > > > > > > > > m->next already reveals if there are more segments to a
> > > > > > > > > packet. Which purpose does m->nb_segs serve that is not
> > > > > > > > > already covered by m->next?
> > > > > >
> > > > > > It is duplicate info, but nb_segs can be used to check the
> > > > > > validity
> > > > > of
> > > > > > the next pointer without having to read the second mbuf
> > cacheline.
> > > > > >
> > > > > > Whether it's worth having is something I'm happy enough to
> > > > > > discuss, though.
> > > > >
> > > > > Although slower in some cases than a full blown "next packet"
> > > > > pointer, nb_segs can also be conveniently abused to link several
> > > > > packets and their segments in the same list without wasting
> > space.
> > > >
> > > > > I don't understand that; can you please elaborate? Are you abusing
> > > > > m->nb_segs as an index into an array in your application? If that is
> > > > > the case, and it is endorsed by the community, we should get rid of
> > > > > m->nb_segs and add a member for application specific use instead.
> > >
> > > Well, that's just an idea, I'm not aware of any application using
> > > this, however the ability to link several packets with segments seems
> > > useful to me (e.g. buffering packets). Here's a diagram:
> > >
> > >  .---.   .---.   .---.   .---.   .---
> > ---
> > >  | pkt 0 |   | seg 1 |   | seg 2 |   | pkt 1 |   |
> > pkt 2
> > >  |  next --->|  next --->|  next --->|  next --->|
> > ...
> > >  | nb_segs 3 |   | nb_segs 1 |   | nb_segs 1 |   | nb_segs 1 |   |
> > >  `---'   `---'   `---'   `---'   `---
> > ---
> 
> I see. It makes it possible to refer to a burst of packets (with segments or 
> not) by a single mbuf reference, as an alternative to the current design 
> pattern of using an array and length (struct rte_mbuf **mbufs, unsigned 
> count).
> 
> This would require implementation in the PMDs etc.
> 
> And even in this case, m->nb_segs does not need to be an integer, but could 
> be replaced by a single bit indicating if the segment is a continuation of a 
> packet or the beginning (alternatively the end) of a packet, i.e. the bit can 
> be set for either the first or the last segment in the packet.

Sure, however if we keep the current definition, a single bit would not be
enough as it must be nonzero for the buffer to be valid. I think an 8-bit
field is not that expensive for a counter.

> It is an almost equivalent alternative to the fundamental design pattern of 
> using an array of mbuf with count, which is widely implemented in DPDK. And 
> m->next still lives in the second cache line, so I don't see any gain by this.

That's right, it does not have to live in the first cache line, my only
concern was its entire removal.

> I still don't get how m->nb_segs can be abused without m->next.

By "abused" I mean that applications are not supposed to pass this kind of
mbuf list directly to existing mbuf-handling functions (TX burst,
rte_pktmbuf_free() and so on), however these same applications (even PMDs)
can do so internally and temporarily because it's so simple.

The next pointer of the last segment of a packet must still be set to NULL
every time a packet is retrieved from such a list to be processed.
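
To illustrate that last point, here is a minimal sketch of taking one packet
out of such a list. It assumes a hypothetical "struct seg" that stands in for
struct rte_mbuf and carries only the two fields discussed here; the helper
name chain_pop_packet() is made up for the example and is not part of DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf (illustration only). */
struct seg {
	struct seg *next;  /* next segment, or first segment of the next packet */
	uint8_t nb_segs;   /* on a first segment: number of segments in the packet */
};

/*
 * Detach the first packet from the list at *list and return it.
 * The next pointer of its last segment is reset to NULL so the packet
 * can be handed to code that expects a regular segment chain.
 */
static struct seg *
chain_pop_packet(struct seg **list)
{
	struct seg *pkt = *list;
	struct seg *last = pkt;
	uint8_t n;

	if (pkt == NULL || pkt->nb_segs == 0)
		return NULL; /* empty list or invalid buffer */
	/* Walk to the last segment of this packet. */
	for (n = pkt->nb_segs; n > 1 && last->next != NULL; --n)
		last = last->next;
	*list = last->next; /* the list now starts at the next packet */
	last->next = NULL;  /* terminate the packet being returned */
	return pkt;
}
```

Calling it repeatedly until it returns NULL drains the list one packet at a
time.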

> > However, nb_segs may be a good candidate for demotion, along with
> > possibly the port value, or the reference count.

Yes, I think that's fine as long as it's kept somewhere.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] mbuf changes

2016-10-25 Thread Adrien Mazarguil

On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> Comments inline.
> 
> Med venlig hilsen / kind regards
> - Morten Brørup
> 
> 
> > -Original Message-
> > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > Sent: Tuesday, October 25, 2016 11:39 AM
> > To: Bruce Richardson
> > Cc: Wiles, Keith; Morten Brørup; dev at dpdk.org; Olivier Matz; Oleg
> > Kuporosov
> > Subject: Re: [dpdk-dev] mbuf changes
> > 
> > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson wrote:
> > > On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith wrote:
> > [...]
> > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup
> >  wrote:
> > [...]
> > > > > 5.
> > > > >
> > > > > And here's something new to think about:
> > > > >
> > > > > m->next already reveals if there are more segments to a packet.
> > > > > Which purpose does m->nb_segs serve that is not already covered by
> > > > > m->next?
> > >
> > > It is duplicate info, but nb_segs can be used to check the validity
> > of
> > > the next pointer without having to read the second mbuf cacheline.
> > >
> > > Whether it's worth having is something I'm happy enough to discuss,
> > > though.
> > 
> > Although slower in some cases than a full blown "next packet" pointer,
> > nb_segs can also be conveniently abused to link several packets and
> > their segments in the same list without wasting space.
> 
> I don't understand that; can you please elaborate? Are you abusing m->nb_segs 
> as an index into an array in your application? If that is the case, and it is 
> endorsed by the community, we should get rid of m->nb_segs and add a member 
> for application specific use instead. 

Well, that's just an idea, I'm not aware of any application using this,
however the ability to link several packets with segments seems
useful to me (e.g. buffering packets). Here's a diagram:

 .-----------.   .-----------.   .-----------.   .-----------.   .------
 | pkt 0     |   | seg 1     |   | seg 2     |   | pkt 1     |   | pkt 2
 |      next --->|      next --->|      next --->|      next --->| ...
 | nb_segs 3 |   | nb_segs 1 |   | nb_segs 1 |   | nb_segs 1 |   |
 `-----------'   `-----------'   `-----------'   `-----------'   `------
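
As a rough sketch of how such a list could be walked, the following counts
the packets it holds by skipping nb_segs segments at a time. "struct seg" is a
hypothetical stand-in for struct rte_mbuf reduced to the two fields shown in
the diagram, and chain_count_packets() is a name invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf (illustration only). */
struct seg {
	struct seg *next;  /* next segment, or first segment of the next packet */
	uint8_t nb_segs;   /* on a first segment: number of segments in the packet */
};

/* Count the packets stored in a list laid out as in the diagram. */
static unsigned int
chain_count_packets(const struct seg *head)
{
	unsigned int pkts = 0;

	while (head != NULL) {
		uint8_t n = head->nb_segs;

		if (n == 0)
			break; /* nb_segs must be nonzero for a valid buffer */
		++pkts;
		/* Skip the n segments that belong to this packet. */
		while (n-- != 0 && head != NULL)
			head = head->next;
	}
	return pkts;
}
```

With the layout above (one 3-segment packet followed by two single-segment
packets), the walk visits pkt 0, pkt 1 and pkt 2 as packet boundaries.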

> > > One other point I'll mention is that we need to have a discussion on
> > > how/where to add in a timestamp value into the mbuf. Personally, I
> > > think it can be in a union with the sequence number value, but I also
> > > suspect that 32-bits of a timestamp is not going to be enough for
> > many.
> > >
> > > Thoughts?
> > 
> > If we consider that timestamp representation should use nanosecond
> > granularity, a 32-bit value may likely wrap around too quickly to be
> > useful. We can also assume that applications requesting timestamps may
> > care more about latency than throughput, Oleg found that using the
> > second cache line for this purpose had a noticeable impact [1].
> > 
> >  [1] http://dpdk.org/ml/archives/dev/2016-October/049237.html
> 
> I agree with Oleg about the latency vs. throughput importance for such 
> applications.
> 
> If you need high resolution timestamps, consider them to be generated by the 
> NIC RX driver, possibly by the hardware itself 
> (http://w3new.napatech.com/features/time-precision/hardware-time-stamp), so 
> the timestamp belongs in the first cache line. And I am proposing that it 
> should have the highest possible accuracy, which makes the value hardware 
> dependent.
> 
> Furthermore, I am arguing that we leave it up to the application to keep 
> track of the slowly moving bits (i.e. counting whole seconds, hours and 
> calendar date) out of band, so we don't use precious space in the mbuf. The 
> application doesn't need the NIC RX driver's fast path to capture which date 
> (or even which second) a packet was received. Yes, it adds complexity to the 
> application, but we can't set aside 64 bit for a generic timestamp. Or as a 
> weird tradeoff: Put the fast moving 32 bit in the first cache line and the 
> slow moving 32 bit in the second cache line, as a placeholder for the 
> application to fill out if needed. Yes, it means that the application needs 
> to check the time and update its variable holding the slow moving time once 
> every second or so; but that should be doable without significant effort.

That's a good point, however without a 64 bit value, elapsed time between
two arbitrary mbufs cannot be measured reliably for lack of context;
one way or another the low resolution value is also needed.
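
A small sketch of the problem, assuming nanosecond timestamps of which only
the low 32 bits are stored (elapsed_ns_32() is a made-up helper, not an
existing API):

```c
#include <stdint.h>

/*
 * Elapsed time computed from timestamps truncated to 32 bits, as a
 * 32-bit mbuf field would store them. The result is only correct when
 * the real interval is shorter than 2^32 ns; longer intervals alias.
 */
static uint32_t
elapsed_ns_32(uint64_t t0_ns, uint64_t t1_ns)
{
	uint32_t lo0 = (uint32_t)t0_ns; /* low 32 bits only */
	uint32_t lo1 = (uint32_t)t1_ns;

	return lo1 - lo0; /* unsigned arithmetic, modulo 2^32 */
}
```

Two mbufs received 2 microseconds apart give the right answer, even across a
wrap of the low bits, but two mbufs received 5 seconds apart report about
0.7 seconds, and nothing in the 32-bit values reveals the error.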

Obviously latency-sensitive applications are unlikely to perform lengthy
buffering and require this, but I'm not sure about all the possible
use-cases. Considering many NICs expose 64 bit timestamps, I suggest we do
not truncate them.

I'm not a fan of the weird tradeoff either, PMDs will be tempted to fill the
extra 32 bits whenever they can and negate the performance improvement of
the first cache line.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] mbuf changes

2016-10-25 Thread Adrien Mazarguil

On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson wrote:
> On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith wrote:
[...]
> > > On Oct 24, 2016, at 10:49 AM, Morten Brørup
> > > wrote:
[...]
> > > 5.
> > > 
> > > And here's something new to think about:
> > > 
> > > m->next already reveals if there are more segments to a packet. Which 
> > > purpose does m->nb_segs serve that is not already covered by m->next?
> 
> It is duplicate info, but nb_segs can be used to check the validity of
> the next pointer without having to read the second mbuf cacheline.
> 
> Whether it's worth having is something I'm happy enough to discuss,
> though.

Although slower in some cases than a full blown "next packet" pointer,
nb_segs can also be conveniently abused to link several packets and their
segments in the same list without wasting space.

> One other point I'll mention is that we need to have a discussion on
> how/where to add in a timestamp value into the mbuf. Personally, I think
> it can be in a union with the sequence number value, but I also suspect
> that 32-bits of a timestamp is not going to be enough for many.
> 
> Thoughts?

If we consider that timestamp representation should use nanosecond
granularity, a 32-bit value may likely wrap around too quickly to be
useful. We can also assume that applications requesting timestamps may care
more about latency than throughput, Oleg found that using the second cache
line for this purpose had a noticeable impact [1].

 [1] http://dpdk.org/ml/archives/dev/2016-October/049237.html
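
For reference, the wrap-around period is simple arithmetic; the helper below
is only an illustration (its name is invented) and assumes a free-running
counter incremented at a fixed rate.

```c
/*
 * Seconds before a counter that is `bits` wide wraps around when it is
 * incremented `hz` times per second.
 */
static double
wrap_period_seconds(unsigned int bits, double hz)
{
	double ticks = 1.0;

	while (bits-- != 0)
		ticks *= 2.0; /* ticks = 2^bits */
	return ticks / hz;
}
```

At nanosecond granularity (hz = 1e9), 32 bits wrap after about 4.29 seconds,
while 64 bits last for roughly 584 years.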

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v2] net/mlx5: fix init on secondary process

2016-10-19 Thread Adrien Mazarguil

On Wed, Oct 19, 2016 at 10:31:48AM +0100, Bruce Richardson wrote:
> On Mon, Oct 17, 2016 at 04:18:59PM +0200, Adrien Mazarguil wrote:
> > On Mon, Oct 17, 2016 at 02:52:39PM +0100, Ferruh Yigit wrote:
> > > Hi Adrien,
> > > 
> > > On 10/17/2016 1:56 PM, Olivier Gournet wrote:
> > > > Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
> > > > Fixes: 21c8bb4928c9 ("net/mlx5: split Tx queue structure")
> > > > 
> > > > Signed-off-by: Olivier Gournet 
> > > 
> > > According to your comment on the previous version, I think you have
> > > acked this patch, but can you please confirm?
> > 
> > Yes it's fine, thanks.
> > 
> > Acked-by: Adrien Mazarguil 
> > 
> > -- 
> While this patch is acked, I'd still like a bit of detail in the commit
> message describing what the problem is and how the patch fixes it. The
> Chuck Norris approach of trying to stare down the code until it tells
> me just isn't working for me today! :-)
> 
> Adrien or Olivier, if you can supply a brief description of what this
> patch is doing and why I'll add it to the commit log on apply.

*cough* this patch restores the original behavior of not causing a secondary
process to segfault during init.

Seriously, one needs to look at mlx5_secondary_data_setup() in both commits
to really understand what happened, my suggestion for a commit log:

The changes introduced by these commits made secondaries attempt to
reinitialize the TX queue structures of the primary instead of their own,
for which they also do not allocate enough memory, leading to crashes.
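
The memory part of the fix boils down to allocating room for the trailing
array of mbuf pointers along with the control structure. A generic sketch of
that pattern follows; the structure layout and names are hypothetical and do
not reproduce the actual mlx5 definitions, and plain calloc() stands in for
rte_calloc_socket().

```c
#include <stdlib.h>

struct pkt_buf; /* opaque stand-in for struct rte_mbuf */

/* Hypothetical control structure followed by elts_n pointers. */
struct txq_ctrl_sketch {
	unsigned int elts_n;
	struct pkt_buf *elts[]; /* storage must be allocated with the structure */
};

/* Size of the structure plus its trailing array. */
static size_t
txq_ctrl_sketch_size(unsigned int elts_n)
{
	return sizeof(struct txq_ctrl_sketch) +
	       elts_n * sizeof(struct pkt_buf *);
}

/* Allocate a zeroed structure with room for elts_n pointers. */
static struct txq_ctrl_sketch *
txq_ctrl_sketch_alloc(unsigned int elts_n)
{
	struct txq_ctrl_sketch *ctrl = calloc(1, txq_ctrl_sketch_size(elts_n));

	if (ctrl != NULL)
		ctrl->elts_n = elts_n;
	return ctrl;
}
```

Allocating only sizeof(struct txq_ctrl_sketch) would leave elts[] without any
storage, so the first write to it would land outside the allocation.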

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v2] net/mlx5: fix init on secondary process

2016-10-17 Thread Adrien Mazarguil

On Mon, Oct 17, 2016 at 02:52:39PM +0100, Ferruh Yigit wrote:
> Hi Adrien,
> 
> On 10/17/2016 1:56 PM, Olivier Gournet wrote:
> > Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
> > Fixes: 21c8bb4928c9 ("net/mlx5: split Tx queue structure")
> > 
> > Signed-off-by: Olivier Gournet 
> 
> According to your comment on the previous version, I think you have
> acked this patch, but can you please confirm?

Yes it's fine, thanks.

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: fix Rx function selection

2016-10-11 Thread Adrien Mazarguil

On Tue, Oct 11, 2016 at 04:44:50PM +0200, Nelio Laranjeiro wrote:
> mlx5_rx_queue_setup() was setting the Rx function by itself instead of
> using priv_select_rx_function() written for that purpose.
> 
> Fixes: cdab90cb5c8d ("net/mlx5: add Tx/Rx burst function selection wrapper")
> 
> Signed-off-by: Nelio Laranjeiro 

Acked-by: Adrien Mazarguil 

> ---
>  drivers/net/mlx5/mlx5_rxq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index b9a5fe6..fe27d22 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1264,7 +1264,7 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
> (void *)dev, (void *)rxq_ctrl);
>   (*priv->rxqs)[idx] = &rxq_ctrl->rxq;
>   /* Update receive callback. */
> - dev->rx_pkt_burst = mlx5_rx_burst;
> + priv_select_rx_function(priv);
>   }
>   priv_unlock(priv);
>   return -ret;
> -- 
> 2.1.4
> 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: fix init on secondary process

2016-10-11 Thread Adrien Mazarguil

Hi Olivier,

Secondary process support got overlooked during this refactoring, thanks
for the patch. However, can you describe the issue you're addressing as part
of the commit log?

I think problems started when txq got mistakenly converted to
primary_txq_ctrl in 21c8bb4928c9 ("net/mlx5: split Tx queue structure"), you
may add a Fixes line for that one as well.

Otherwise, this patch looks fine to me.

On Wed, Sep 28, 2016 at 04:24:18PM +0200, Olivier Gournet wrote:
> Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
> 
> Signed-off-by: Olivier Gournet 
> ---
>  drivers/net/mlx5/mlx5_ethdev.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index 130e15d..6f39965 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -1308,11 +1308,13 @@ mlx5_secondary_data_setup(struct priv *priv)
>   continue;
>   primary_txq_ctrl = container_of(primary_txq,
>   struct txq_ctrl, txq);
> - txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
> + txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl) +
> +  primary_txq->elts_n *
> +  sizeof(struct rte_mbuf *), 0,
>primary_txq_ctrl->socket);
>   if (txq_ctrl != NULL) {
>   if (txq_ctrl_setup(priv->dev,
> -primary_txq_ctrl,
> +txq_ctrl,
>  primary_txq->elts_n,
>      primary_txq_ctrl->socket,
>  NULL) == 0) {
> -- 
> 2.1.4

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC v2] Generic flow director/filtering/classification API

2016-10-11 Thread Adrien Mazarguil

Hi Wei,

On Tue, Oct 11, 2016 at 01:47:53AM +, Zhao1, Wei wrote:
> Hi  Adrien Mazarguil,
>  There is a struct rte_flow_action_rss in rte_flow.txt, the  member 
> rss_conf is a pointer type, is there any convenience in using pointer?
> Why not using  struct rte_eth_rss_conf rss_conf type, as rte_flow_item_ipv4/ 
> rte_flow_item_ipv6 struct member?
> 
> Thank you.
> 
>  struct rte_flow_action_rss {
>   struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
>   uint16_t queues; /**< Number of entries in queue[]. */
>   uint16_t queue[]; /**< Queues indices to use. */
> };

Well I thought it made sharing flow RSS configuration with its counterpart
in struct rte_eth_conf easier (this pointer should even be const). Also,
while ABI breakage would still occur if rte_eth_rss_conf happened to be
modified, the impact on this API would be limited as it would not cause a
change in structure size. We'd ideally need some kind of version field to be
completely safe but I guess that would be somewhat overkill.
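
To make the size argument concrete, here is a sketch comparing both layouts
across two imaginary revisions of the RSS configuration. Every structure below
is hypothetical and only loosely modeled on rte_eth_rss_conf and
rte_flow_action_rss.

```c
#include <stddef.h>
#include <stdint.h>

/* Two imaginary revisions of an RSS configuration structure. */
struct rss_conf_v1 {
	uint8_t *key;
	uint8_t key_len;
	uint64_t hf;
};

struct rss_conf_v2 {
	uint8_t *key;
	uint8_t key_len;
	uint64_t hf;
	uint32_t extra[4]; /* field added by a later revision */
};

/* Action referencing the configuration through a pointer. */
struct action_by_ptr_v1 { const struct rss_conf_v1 *conf; uint16_t queues; };
struct action_by_ptr_v2 { const struct rss_conf_v2 *conf; uint16_t queues; };

/* Action embedding the configuration by value. */
struct action_by_val_v1 { struct rss_conf_v1 conf; uint16_t queues; };
struct action_by_val_v2 { struct rss_conf_v2 conf; uint16_t queues; };

/* Nonzero when growing the configuration leaves the action size unchanged. */
static int
by_ptr_size_is_stable(void)
{
	return sizeof(struct action_by_ptr_v1) ==
	       sizeof(struct action_by_ptr_v2);
}
```

The by-pointer action keeps the same size and member offsets in both
revisions, whereas the by-value action grows with the embedded structure.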

Now considering this API was written without an initial implementation, all
structure definitions that do not make sense are still open to debate, we
can adjust them as needed.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC v2] Generic flow director/filtering/classification API

2016-10-10 Thread Adrien Mazarguil

Hi Wei,

On Mon, Oct 10, 2016 at 09:42:53AM +, Zhao1, Wei wrote:
> Hi Adrien Mazarguil,
> 
> In your v2 version of rte_flow.txt , there is an action type 
> RTE_FLOW_ACTION_TYPE_MARK,  but there is no definition of struct 
> rte_flow_action_mark.
> And there is  an definition of struct rte_flow_action_id. Is it a typo or 
> other usage?
> 
> Thank you.
> 
> struct rte_flow_action_id {
>   uint32_t id; /**< 32 bit value to return with packets. */
> };

That is indeed a mistake, this struct should be named
"rte_flow_action_mark". I'll fix it for the next update, thanks.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx: align drivers to latest naming convention

2016-10-07 Thread Adrien Mazarguil

On Fri, Oct 07, 2016 at 03:04:13PM +0200, David Marchand wrote:
> Fixes: 2f45703c17ac ("drivers: make driver names consistent")
> 
> Signed-off-by: David Marchand 
> ---
>  drivers/net/mlx4/mlx4.h  | 2 +-
>  drivers/net/mlx5/mlx5_defs.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
> index d0c7bc2..4c7505e 100644
> --- a/drivers/net/mlx4/mlx4.h
> +++ b/drivers/net/mlx4/mlx4.h
> @@ -96,7 +96,7 @@ enum {
>   PCI_DEVICE_ID_MELLANOX_CONNECTX3PRO = 0x1007,
>  };
>  
> -#define MLX4_DRIVER_NAME "librte_pmd_mlx4"
> +#define MLX4_DRIVER_NAME "net_mlx4"
>  
>  /* Bit-field manipulation. */
>  #define BITFIELD_DECLARE(bf, type, size) \
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index cc2a6f3..b32816e 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -37,7 +37,7 @@
>  #include "mlx5_autoconf.h"
>  
>  /* Reported driver name. */
> -#define MLX5_DRIVER_NAME "librte_pmd_mlx5"
> +#define MLX5_DRIVER_NAME "net_mlx5"
>  
>  /* Maximum number of simultaneous MAC addresses. */
>  #define MLX5_MAX_MAC_ADDRESSES 128
> -- 
> 2.7.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v2] cryptodev: fix compilation error in SUSE 11 SP2

2016-10-05 Thread Adrien Mazarguil

On Wed, Oct 05, 2016 at 03:45:51AM +0100, Pablo de Lara wrote:
> This commit fixes following build error, which happens in SUSE 11 SP2,
> with gcc 4.5.1:
> 
> In file included from lib/librte_cryptodev/rte_cryptodev.c:70:0:
> lib/librte_cryptodev/rte_cryptodev.h:772:7:
> error: flexible array member in otherwise empty struct
> 
> Fixes: 347a1e037fd3 ("lib: use C99 syntax for zero-size arrays")
> 
> Signed-off-by: Pablo de Lara 
> ---
> 
> Changes in v2:
> - Fixed commit message
> 
>  lib/librte_cryptodev/rte_cryptodev.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
> b/lib/librte_cryptodev/rte_cryptodev.h
> index d565f39..6ad5e91 100644
> --- a/lib/librte_cryptodev/rte_cryptodev.h
> +++ b/lib/librte_cryptodev/rte_cryptodev.h
> @@ -773,7 +773,7 @@ struct rte_cryptodev_sym_session {
>   } __rte_aligned(8);
>   /**< Public symmetric session details */
>  
> - char _private[];
> + __extension__ char _private[0];
>   /**< Private session material */
>  };
>  
> -- 
> 2.7.4

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] Possible bug in mlx5_tx_burst_mpw?

2016-09-16 Thread Adrien Mazarguil

On Wed, Sep 14, 2016 at 09:33:18PM +0200, Luke Gorrie wrote:
> Hi Adrien,
> 
> On 14 September 2016 at 16:30, Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> wrote:
> 
> > Your interpretation is correct (this is intentional and not a bug).
> >
> 
> Thanks very much for clarifying.
> 
> This is interesting to me because I am also working on a ConnectX-4 (Lx)
> driver based on the newly released driver interface specification [1] and I
> am wondering how interested I should be in this MPW feature that is
> currently not documented.

Seems like this document only describes established features whose interface
won't be subject to firmware evolutions, I think MPW is not one of them.
AFAIK currently MPW cannot be used with LSO which we intend to support soon.

Our implementation is a stripped down version of the code found in
libmlx5. I guess you could ask Mellanox directly if you need more
information.

> In the event successive packets share a few properties (length, number of
> > segments, offload flags), these can be factored out as an optimization to
> > lower the amount of traffic on the PCI bus. This feature is currently
> > supported by the ConnectX-4 Lx family of adapters.
> >
> 
> I have a concern here that I hope you will forgive me for voicing.
> 
> This optimization seems to run the risk of inflating scores on
> constant-packet-size IXIA-style benchmarks like [2] and making them less
> useful for predicting real-world performance. That seems like a negative to
> me as an application developer. I wonder if I am overlooking some practical
> benefits that motivate implementing this in silicon and in the driver and
> enabling it by default?

Your concern is understandable, no offense taken. You are obviously right
about benchmarks with constant packets, whose results can be improved by
MPW.

Performance-wise, with the right traffic patterns MPW allows ConnectX-4 Lx
adapters to outperform their non-Lx counterparts (e.g. comparing 40G EN Lx
PCIe 8x vs. 40G EN PCIe 8x) when measuring traffic rate (Mpps), not
throughput. Disabling MPW yields comparable results, which is why it is
considered to be an optimization.

Since processing MPW consumes a few additional CPU cycles, it can be
disabled at runtime with the txq_mpw_en switch (documented in mlx5.rst).

Now about the real-world scenario, we are not talking about needing millions
of identical packets to notice an improvement. MPW is effective from 2 to at
most 5 consecutive packets that share some meta-data (length, number of
segments and offload flags), all within the same burst. Just to be clear,
neither their destination nor their payload need to be the same, it would
have been useless otherwise.
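
As a rough model of the grouping (this is not the mlx5 code, the names are
invented, and the limit of 5 is simply the figure quoted above), the
following counts how many groups a burst would be split into:

```c
#include <stddef.h>
#include <stdint.h>

#define MPW_MAX_PKTS 5 /* assumed upper bound per group, from the text above */

/* Hypothetical per-packet meta-data relevant to grouping. */
struct pkt_meta {
	uint32_t length;   /* packet length */
	uint8_t nb_segs;   /* number of segments */
	uint32_t ol_flags; /* offload flags */
};

static int
meta_equal(const struct pkt_meta *a, const struct pkt_meta *b)
{
	return a->length == b->length &&
	       a->nb_segs == b->nb_segs &&
	       a->ol_flags == b->ol_flags;
}

/*
 * Count the groups needed for a burst when consecutive packets with
 * identical meta-data are merged, up to MPW_MAX_PKTS per group.
 */
static unsigned int
mpw_count_groups(const struct pkt_meta *burst, unsigned int n)
{
	unsigned int groups = 0;
	unsigned int in_group = 0;
	unsigned int i;

	for (i = 0; i < n; ++i) {
		/* Start a new group on the first packet, when the current
		 * group is full, or when the meta-data changes. */
		if (in_group == 0 || in_group == MPW_MAX_PKTS ||
		    !meta_equal(&burst[i], &burst[i - 1])) {
			++groups;
			in_group = 0;
		}
		++in_group;
	}
	return groups;
}
```

Seven identical packets end up in two groups (5 + 2), while a single packet
of a different length in the middle of a burst splits it into three.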

Sending a few packets at once with such similar properties is a common
occurrence in the real world, think about forwarding TCP traffic that has
been shaped to a constant size by LSO or MTU.

Like many optimizations, this one targets a specific yet common use-case.
If you would rather get a constant rate out of any traffic pattern for
predictable latency, DPDK, which is burst-oriented, is probably not what your
application needs if used as-is.

> [1]
> http://www.mellanox.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf
> [2]
> https://www.mellanox.com/blog/2016/06/performance-beyond-numbers-stephen-curry-style-server-io/

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] Possible bug in mlx5_tx_burst_mpw?

2016-09-14 Thread Adrien Mazarguil

Hi Luke,

On Wed, Sep 14, 2016 at 03:24:07PM +0200, Luke Gorrie wrote:
> Howdy,
> 
> Just noticed a line of code that struck me as odd and so I am writing just
> in case it is a bug:
> 
> http://dpdk.org/browse/dpdk/tree/drivers/net/mlx5/mlx5_rxtx.c#n1014
> 
> Specifically the check "(mpw.length != length)" in mlx5_tx_burst_mpw() looks
> like a descriptor-format optimization for the special case where
> consecutive packets on the wire are exactly the same size. This would
> strike me as peculiar.
> 
> Just wanted to check, is that interpretation correct and if so then is this
> intentional?

Your interpretation is correct (this is intentional and not a bug).

In the event successive packets share a few properties (length, number of
segments, offload flags), these can be factored out as an optimization to
lower the amount of traffic on the PCI bus. This feature is currently
supported by the ConnectX-4 Lx family of adapters.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v9 01/25] eal: define macro container_of

2016-09-09 Thread Adrien Mazarguil

On Fri, Sep 09, 2016 at 09:49:06AM +0530, Shreyansh Jain wrote:
> Hi Ferruh,
> 
> On Thursday 08 September 2016 07:46 PM, Ferruh Yigit wrote:
> >On 9/7/2016 3:07 PM, Shreyansh Jain wrote:
> >>Signed-off-by: Jan Viktorin 
> >>Signed-off-by: Shreyansh Jain 
> >>---
> >> lib/librte_eal/common/include/rte_common.h | 21 +
> >> 1 file changed, 21 insertions(+)
> >>
> >>diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
> >>index 332f2a4..c5d94f3 100644
> >>--- a/lib/librte_eal/common/include/rte_common.h
> >>+++ b/lib/librte_eal/common/include/rte_common.h
> >>@@ -322,6 +322,27 @@ rte_bsf32(uint32_t v)
> >> #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
> >> #endif
> >>
> >>+/**
> >>+ * Return pointer to the wrapping struct instance.
> >>+ * Example:
> >>+ *
> >>+ *  struct wrapper {
> >>+ *  ...
> >>+ *  struct child c;
> >>+ *  ...
> >>+ *  };
> >>+ *
> >>+ *  struct child *x = obtain(...);
> >>+ *  struct wrapper *w = container_of(x, struct wrapper, c);
> >>+ *
> >>+ * Some implementation already have this defined, thus, conditional
> >>+ * declaration.
> >>+ */
> >>+#ifndef container_of
> >>+#define container_of(p, type, member) \
> >>+   ((type *) (((char *) (p)) - offsetof(type, member)))
> >>+#endif
> >>+
> >> #define _RTE_STR(x) #x
> >> /** Take a macro value and get a string version of it */
> >> #define RTE_STR(x) _RTE_STR(x)
> >>
> >
> >Some mlx5 files include the dpdk version of container_of first, and they
> >produce the following warning:
> >
> >In file included from .../dpdk/build/include/rte_mbuf.h:57:0,
> > from .../dpdk/build/include/rte_ether.h:52,
> > from .../dpdk/drivers/net/mlx5/mlx5_trigger.c:38:
> >/usr/include/infiniband/verbs.h: In function ‘verbs_get_device’:
> >/dpdk/build/include/rte_common.h:343:14: warning: cast discards
> >‘const’ qualifier from pointer target type [-Wcast-qual]
> >  ((type *) (((char *) (p)) - offsetof(type, member)))
> >
> >The verbs.h version of container_of is same with dpdk one, I am not able
> >to find why one gives warning but other not.
> 
> Thanks for highlighting. I am setting up my environment and will have a
> look.

This warning is a known issue in the Verbs header that will be addressed
eventually. It occurs even without Shreyansh's patch (more likely when
CONFIG_RTE_LIBRTE_MLX4_DEBUG and/or CONFIG_RTE_LIBRTE_MLX5_DEBUG are
enabled).

Your container_of() macro is fine, no need to spend time on this.
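
For reference, a minimal sketch of how the macro is meant to be used. The
macro body is the one from the quoted patch; struct wrapper, struct child,
wrapper_of() and wrapper_id_of() are made-up names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Same definition and guard as in the quoted patch. */
#ifndef container_of
#define container_of(p, type, member) \
	((type *) (((char *) (p)) - offsetof(type, member)))
#endif

/* Hypothetical embedded object. */
struct child {
	int value;
};

/* Hypothetical enclosing object. */
struct wrapper {
	int id;
	struct child c;
};

/* Recover the enclosing wrapper from a pointer to its embedded child. */
static struct wrapper *
wrapper_of(struct child *x)
{
	assert(x != NULL);
	return container_of(x, struct wrapper, c);
}

/* Return the id of the wrapper that owns x. */
static int
wrapper_id_of(struct child *x)
{
	return wrapper_of(x)->id;
}
```

The macro only subtracts the member offset from the child address, so it is
valid only when the pointer really points into a struct wrapper.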

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v4 00/10] Fix build errors related to exported headers

2016-09-08 Thread Adrien Mazarguil

On Tue, Aug 23, 2016 at 06:36:57PM +0200, Thomas Monjalon wrote:
> After rebasing the patchset, the compilation of each patch seems good.
> But the new checks fail with clang:
>   rte_memcpy.h:814:2: error:
>   implicit declaration of function '_mm_alignr_epi8' is invalid 
> in C99

This is an unfortunate false positive. mmintrin.h and other x86 intrinsics
header files define their macros and types only if compiled with the right
-march or CPU flags options. check-includes.sh does not provide any of
those and relies on whatever the C compiler falls back to by default.

The problem is actually that we haven't implemented any fallback in DPDK for
such cases. In the meantime it can be worked around like this:

 EXTRA_CFLAGS=-march=core2 EXTRA_CXXFLAGS=-march=core2 ./scripts/check-includes.sh
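
To illustrate what such a fallback could look like, here is a rough sketch
only, not what rte_memcpy.h does. It assumes the compiler defines __SSSE3__
when SSSE3 is enabled (GCC and clang do); window16_at4() is a made-up helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#ifdef __SSSE3__
#include <tmmintrin.h> /* declares _mm_alignr_epi8() */
#endif

/*
 * Hypothetical helper: copy to dst the 16 bytes found at offset 4 of the
 * 32-byte buffer src.
 */
static void
window16_at4(uint8_t dst[16], const uint8_t src[32])
{
	assert(dst != NULL && src != NULL);
#ifdef __SSSE3__
	__m128i lo = _mm_loadu_si128((const __m128i *)src);
	__m128i hi = _mm_loadu_si128((const __m128i *)(src + 16));

	/* Concatenate hi:lo, shift right by 4 bytes, keep the low 16 bytes. */
	_mm_storeu_si128((__m128i *)dst, _mm_alignr_epi8(hi, lo, 4));
#else
	/* Plain C path used when the intrinsic is not available. */
	memcpy(dst, src + 4, 16);
#endif
}
```

Without any -march or -mssse3 option the second branch is compiled and the
implicit declaration error cannot occur.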

> Other comments about the script:
> - it is too long (can it be parallelized?)
> - it does not stop printing errors after the first one

Addressing these concerns would require a complete redesign of that script
as a Makefile, and even then it would most likely end up taking too long
when there are no errors (all headers end up being checked).

I've removed it from test-build.sh, people will have to run it manually like
check-git-log.sh, updated v5 accordingly.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v5 10/10] scripts: check compilation of exported header files

2016-09-08 Thread Adrien Mazarguil

This script checks that header files in a given directory do not miss
dependencies when included on their own, do not conflict and accept being
compiled with the strictest possible flags.

It is too slow at the moment to be automatically executed by test-build.sh
and should be run voluntarily (like check-git-log.sh and friends) after
making changes to exported header files.

Signed-off-by: Adrien Mazarguil 
---
 MAINTAINERS   |   1 +
 scripts/check-includes.sh | 286 +
 2 files changed, 287 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index bc9aa02..0e78941 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26,6 +26,7 @@ T: git://dpdk.org/dpdk
 F: MAINTAINERS
 F: scripts/check-maintainers.sh
 F: scripts/check-git-log.sh
+F: scripts/check-includes.sh
 F: scripts/checkpatches.sh
 F: scripts/load-devel-config
 F: scripts/test-build.sh
diff --git a/scripts/check-includes.sh b/scripts/check-includes.sh
new file mode 100755
index 000..d65adc6
--- /dev/null
+++ b/scripts/check-includes.sh
@@ -0,0 +1,286 @@
+#!/bin/sh -e
+#
+#   BSD LICENSE
+#
+#   Copyright 2016 6WIND S.A.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of 6WIND S.A. nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# This script checks that header files in a given directory do not miss
+# dependencies when included on their own, do not conflict and accept being
+# compiled with the strictest possible flags.
+#
+# Files are looked up in the directory provided as the first argument,
+# otherwise build/include by default.
+#
+# Recognized environment variables:
+#
+# VERBOSE=1 is the same as -v.
+#
+# QUIET=1 is the same as -q.
+#
+# SUMMARY=1 is the same as -s.
+#
+# CC, CPPFLAGS, CFLAGS, EXTRA_CPPFLAGS, EXTRA_CFLAGS, CXX, CXXFLAGS and
+# EXTRA_CXXFLAGS are taken into account.
+#
+# PEDANTIC_CFLAGS, PEDANTIC_CXXFLAGS and PEDANTIC_CPPFLAGS provide strict
+# C/C++ compilation flags.
+#
+# IGNORE contains a list of shell patterns matching files (relative to the
+# include directory) to avoid. It is set by default to known DPDK headers
+# which must not be included on their own.
+#
+# IGNORE_CXX provides additional files for C++.
+
+while getopts hqvs arg; do
+   case $arg in
+   h)
+   cat < /dev/null
+
+[ "$VERBOSE" = 1 ] &&
+output ()
+{
+   local CCV
+   local CXXV
+
+   shift
+   CCV=$CC
+   CXXV=$CXX
+   CC="echo $CC" CXX="echo $CXX" "$@"
+   CC=$CCV
+   CXX=$CXXV
+
+   "$@"
+} ||
+output ()
+{
+
+   printf '  %s\n' "$1"
+   shift
+   "$@"
+}
+
+trap 'rm -f "$temp_cc" "$temp_cxx"' EXIT
+
+compile_cc ()
+{
+   ${CC} -I"$include_dir" \
+   ${PEDANTIC_CPPFLAGS} ${CPPFLAGS} ${EXTRA_CPPFLAGS} \
+   ${PEDANTIC_CFLAGS} ${CFLAGS} ${EXTRA_CFLAGS} \
+   -c -o /dev/null "${temp_cc}"
+}
+
+compile_cxx ()
+{
+   ${CXX} -I"$include_dir" \
+   ${PEDANTIC_CPPFLAGS} ${CPPFLAGS} ${EXTRA_CPPFLAGS} \
+   ${PEDANTIC_CXXFLAGS} ${CXXFLAGS} ${EXTRA_CXXFLAGS} \
+   -c -o /dev/null "${temp_cxx}"
+}
+
+ignore ()
+{
+   file="$1"
+   shift
+   while [ $# -ne 0 ]; do
+   case "$file" in
+   $1)
+   return 0
+   ;;
+   esac
+   shift
+   done
+   return 1

[dpdk-dev] [PATCH v5 09/10] lib: hide static functions never defined

2016-09-08 Thread Adrien Mazarguil

Arch-specific functions not defined for all architectures (missing on x86
in this case) and not used anywhere should not expose a prototype.

This commit prevents the following error:

 error: `rte_mov48' declared `static' but never defined

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_eal/common/include/generic/rte_memcpy.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_memcpy.h 
b/lib/librte_eal/common/include/generic/rte_memcpy.h
index afb0afe..4e9d879 100644
--- a/lib/librte_eal/common/include/generic/rte_memcpy.h
+++ b/lib/librte_eal/common/include/generic/rte_memcpy.h
@@ -64,6 +64,8 @@ rte_mov16(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov32(uint8_t *dst, const uint8_t *src);

+#ifdef __DOXYGEN__
+
 /**
  * Copy 48 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
@@ -76,6 +78,8 @@ rte_mov32(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov48(uint8_t *dst, const uint8_t *src);

+#endif /* __DOXYGEN__ */
+
 /**
  * Copy 64 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
-- 
2.1.4

[dpdk-dev] [PATCH v5 08/10] lib: remove named variadic macros in exported headers

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

Since there is no way to force named variadic macros as extensions, use a
standard __VA_ARGS__ with an extra dummy argument to format strings.

This commit prevents the following errors:

 error: ISO C does not permit named variadic macros
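
As an illustration of the dummy argument trick, here is a small sketch. The
three RTE_FMT* helpers are copied from this patch; DEMO_LOG() and the two
demo functions are hypothetical and write to a buffer instead of calling
RTE_LOG().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Helpers as added to rte_common.h by this patch. */
#define RTE_FMT(fmt, ...) fmt "%.0s", __VA_ARGS__ ""
#define RTE_FMT_HEAD(fmt, ...) fmt
#define RTE_FMT_TAIL(fmt, ...) __VA_ARGS__

/* Hypothetical logging macro: prefixes the message with the function name. */
#define DEMO_LOG(buf, ...) \
	snprintf(buf, sizeof(buf), \
		 RTE_FMT("%s(): " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
			 __func__, RTE_FMT_TAIL(__VA_ARGS__,)))

/* Format string only, no extra arguments. */
static const char *
demo_no_args(void)
{
	static char buf[64];

	DEMO_LOG(buf, "hello");
	return buf;
}

/* Format string followed by two arguments. */
static const char *
demo_with_args(void)
{
	static char buf[64];

	DEMO_LOG(buf, "x=%d y=%s", 5, "ok");
	return buf;
}
```

In the first case RTE_FMT_TAIL() expands to nothing and the trailing ""
consumed by "%.0s" is what keeps the argument list from ending with a comma.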

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cryptodev/rte_cryptodev.h   | 32 ++---
 lib/librte_cryptodev/rte_cryptodev_pmd.h   |  2 +-
 lib/librte_eal/common/include/rte_common.h |  9 +++
 3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index cf28541..d047ba8 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -77,26 +77,30 @@ extern const char **rte_cyptodev_names;

 /* Logging Macros */

-#define CDEV_LOG_ERR(fmt, args...) \
-   RTE_LOG(ERR, CRYPTODEV, "%s() line %u: " fmt "\n",  \
-   __func__, __LINE__, ## args)
+#define CDEV_LOG_ERR(...) \
+   RTE_LOG(ERR, CRYPTODEV, \
+   RTE_FMT("%s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))

-#define CDEV_PMD_LOG_ERR(dev, fmt, args...)\
-   RTE_LOG(ERR, CRYPTODEV, "[%s] %s() line %u: " fmt "\n", \
-   dev, __func__, __LINE__, ## args)
+#define CDEV_PMD_LOG_ERR(dev, ...) \
+   RTE_LOG(ERR, CRYPTODEV, \
+   RTE_FMT("[%s] %s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   dev, __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))

 #ifdef RTE_LIBRTE_CRYPTODEV_DEBUG
-#define CDEV_LOG_DEBUG(fmt, args...)   \
-   RTE_LOG(DEBUG, CRYPTODEV, "%s() line %u: " fmt "\n",\
-   __func__, __LINE__, ## args)\
+#define CDEV_LOG_DEBUG(...) \
+   RTE_LOG(DEBUG, CRYPTODEV, \
+   RTE_FMT("%s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))

-#define CDEV_PMD_TRACE(fmt, args...)   \
-   RTE_LOG(DEBUG, CRYPTODEV, "[%s] %s: " fmt "\n", \
-   dev, __func__, ## args)
+#define CDEV_PMD_TRACE(...) \
+   RTE_LOG(DEBUG, CRYPTODEV, \
+   RTE_FMT("[%s] %s: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   dev, __func__, RTE_FMT_TAIL(__VA_ARGS__,)))

 #else
-#define CDEV_LOG_DEBUG(fmt, args...)
-#define CDEV_PMD_TRACE(fmt, args...)
+#define CDEV_LOG_DEBUG(...) (void)0
+#define CDEV_PMD_TRACE(...) (void)0
 #endif

 /**
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index a929ef1..cd46674 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -62,7 +62,7 @@ extern "C" {
 #define RTE_PMD_DEBUG_TRACE(...) \
rte_pmd_debug_trace(__func__, __VA_ARGS__)
 #else
-#define RTE_PMD_DEBUG_TRACE(fmt, args...)
+#define RTE_PMD_DEBUG_TRACE(...)
 #endif

 struct rte_cryptodev_session {
diff --git a/lib/librte_eal/common/include/rte_common.h 
b/lib/librte_eal/common/include/rte_common.h
index 98ecc1c..db5ac91 100644
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -335,6 +335,15 @@ rte_bsf32(uint32_t v)
 /** Take a macro value and get a string version of it */
 #define RTE_STR(x) _RTE_STR(x)

+/**
+ * ISO C helpers to modify format strings using variadic macros.
+ * This is a replacement for the ", ## __VA_ARGS__" GNU extension.
+ * An empty %s argument is appended to avoid a dangling comma.
+ */
+#define RTE_FMT(fmt, ...) fmt "%.0s", __VA_ARGS__ ""
+#define RTE_FMT_HEAD(fmt, ...) fmt
+#define RTE_FMT_TAIL(fmt, ...) __VA_ARGS__
+
 /** Mask value of type "tp" for the first "ln" bit set. */
 #define RTE_LEN2MASK(ln, tp) \
((tp)((uint64_t)-1 >> (sizeof(uint64_t) * CHAR_BIT - (ln))))
-- 
2.1.4

[dpdk-dev] [PATCH v5 07/10] lib: work around forward reference to enum types

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

This commit prevents the following errors:

 error: ISO C forbids forward references to `enum' types

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_eal/common/include/generic/rte_cpuflags.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h 
b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index c1da357..71321f3 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -44,6 +44,7 @@
 /**
  * Enumeration of all CPU features supported
  */
+__extension__
 enum rte_cpu_flag_t;

 /**
@@ -55,6 +56,7 @@ enum rte_cpu_flag_t;
  * flag name
  * NULL if flag ID is invalid
  */
+__extension__
 const char *
 rte_cpu_get_flag_name(enum rte_cpu_flag_t feature);

@@ -68,6 +70,7 @@ rte_cpu_get_flag_name(enum rte_cpu_flag_t feature);
  * 0 if flag is not available
  * -ENOENT if flag is invalid
  */
+__extension__
 int
 rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);

-- 
2.1.4

[dpdk-dev] [PATCH v5 06/10] lib: add missing include dependencies

2016-09-08 Thread Adrien Mazarguil

Exported header files for use by applications should be self sufficient and
allow out of order inclusion. Moreover, they must include all the system
headers they need for types and macros.

This commit prevents the following errors:

 error: `RTE_MAX_LCORE' undeclared here (not in a function)
 error: `RTE_LPM_VALID_EXT_ENTRY_BITMASK' undeclared
  (first use in this function)
 error: #error "Unsupported cache line size"
 error: `asm' undeclared (first use in this function)
 error: implicit declaration of function `[...]'
 error: unknown type name `[...]'
 error: field `mac_addr' has incomplete type
 error: `CHAR_BIT' undeclared here (not in a function)
 error: `struct [...]' declared inside parameter list
 error: unknown type name `uint8_t'

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cfgfile/rte_cfgfile.h  | 2 ++
 lib/librte_cmdline/cmdline.h  | 1 +
 lib/librte_cmdline/cmdline_parse_portlist.h   | 1 +
 lib/librte_cmdline/cmdline_socket.h   | 3 +++
 lib/librte_eal/common/include/arch/arm/rte_byteorder.h| 2 ++
 lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h  | 1 +
 lib/librte_eal/common/include/arch/arm/rte_prefetch_64.h  | 1 +
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 1 +
 lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h| 1 +
 lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h | 1 +
 lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h  | 1 +
 lib/librte_eal/common/include/arch/x86/rte_atomic.h   | 2 ++
 lib/librte_eal/common/include/arch/x86/rte_atomic_32.h| 6 ++
 lib/librte_eal/common/include/arch/x86/rte_atomic_64.h| 8 
 lib/librte_eal/common/include/arch/x86/rte_byteorder.h| 2 ++
 lib/librte_eal/common/include/arch/x86/rte_byteorder_32.h | 7 +++
 lib/librte_eal/common/include/arch/x86/rte_byteorder_64.h | 7 +++
 lib/librte_eal/common/include/arch/x86/rte_prefetch.h | 1 +
 lib/librte_eal/common/include/arch/x86/rte_rtm.h  | 1 +
 lib/librte_eal/common/include/arch/x86/rte_vect.h | 2 ++
 lib/librte_eal/common/include/generic/rte_atomic.h| 1 +
 lib/librte_eal/common/include/generic/rte_byteorder.h | 2 ++
 lib/librte_eal/common/include/rte_eal.h   | 1 +
 lib/librte_eal/common/include/rte_memory.h| 2 ++
 lib/librte_eal/common/include/rte_time.h  | 8 
 lib/librte_eal/common/include/rte_version.h   | 1 +
 lib/librte_ether/rte_dev_info.h   | 2 ++
 lib/librte_ether/rte_eth_ctrl.h   | 4 
 lib/librte_lpm/rte_lpm_neon.h | 1 +
 lib/librte_lpm/rte_lpm_sse.h  | 1 +
 lib/librte_pdump/rte_pdump.h  | 4 
 lib/librte_reorder/rte_reorder.h  | 2 ++
 lib/librte_sched/rte_bitmap.h | 1 +
 lib/librte_sched/rte_reciprocal.h | 2 ++
 lib/librte_sched/rte_sched_common.h   | 1 +
 35 files changed, 84 insertions(+)

diff --git a/lib/librte_cfgfile/rte_cfgfile.h b/lib/librte_cfgfile/rte_cfgfile.h
index f649836..e81a5a2 100644
--- a/lib/librte_cfgfile/rte_cfgfile.h
+++ b/lib/librte_cfgfile/rte_cfgfile.h
@@ -34,6 +34,8 @@
 #ifndef __INCLUDE_RTE_CFGFILE_H__
 #define __INCLUDE_RTE_CFGFILE_H__

+#include 
+
 #ifdef __cplusplus
 extern "C" {
 #endif
diff --git a/lib/librte_cmdline/cmdline.h b/lib/librte_cmdline/cmdline.h
index 2578ca8..65d73b0 100644
--- a/lib/librte_cmdline/cmdline.h
+++ b/lib/librte_cmdline/cmdline.h
@@ -63,6 +63,7 @@

 #include 
 #include 
+#include 

 /**
  * @file
diff --git a/lib/librte_cmdline/cmdline_parse_portlist.h 
b/lib/librte_cmdline/cmdline_parse_portlist.h
index 73d70e0..058df3e 100644
--- a/lib/librte_cmdline/cmdline_parse_portlist.h
+++ b/lib/librte_cmdline/cmdline_parse_portlist.h
@@ -61,6 +61,7 @@
 #ifndef _PARSE_PORTLIST_H_
 #define _PARSE_PORTLIST_H_

+#include 
 #include 

 #ifdef __cplusplus
diff --git a/lib/librte_cmdline/cmdline_socket.h 
b/lib/librte_cmdline/cmdline_socket.h
index 8cc2dfb..aa6068e 100644
--- a/lib/librte_cmdline/cmdline_socket.h
+++ b/lib/librte_cmdline/cmdline_socket.h
@@ -61,6 +61,9 @@
 #ifndef _CMDLINE_SOCKET_H_
 #define _CMDLINE_SOCKET_H_

+#include 
+#include 
+
 #ifdef __cplusplus
 extern "C" {
 #endif
diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h 
b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
index 3f2dd1f..1b312b3 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -41,6 +41,8 @@
 extern "C" {
 #endif

+#include 
+#include 
 #include "generic/rte_byteorder.h"

 /* fix missing __builtin_bswap16 for gcc older then 4.8 */
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch_32.h 
b/lib/librte_e

[dpdk-dev] [PATCH v5 05/10] lib: work around unnamed structs/unions

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked to avoid warnings and compilation failures.

Unnamed structs/unions are allowed since C11, however many compiler
versions do not use this mode by default.

This commit prevents the following errors:

 error: ISO C99 doesn't support unnamed structs/unions
 error: struct has no named members
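
A minimal sketch of the marking approach follows. DEMO_STD_C11 is a made-up
stand-in for the RTE_STD_C11 marker used in the diff below (its exact
definition is not shown here), and it assumes a compiler that understands
__extension__, such as GCC or clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker: only expands to __extension__ before C11. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 201112L
#define DEMO_STD_C11 __extension__
#else
#define DEMO_STD_C11
#endif

/* Hypothetical structure with an unnamed union. */
struct demo_op {
	uint8_t type;
	DEMO_STD_C11
	union {
		uint32_t sym;
		uint64_t raw;
	}; /* unnamed: members are accessed directly through demo_op */
};

/* Read a member of the unnamed union. */
static uint32_t
demo_op_sym(const struct demo_op *op)
{
	assert(op != NULL);
	return op->sym;
}
```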

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cryptodev/rte_crypto.h | 2 ++
 lib/librte_cryptodev/rte_crypto_sym.h | 3 +++
 lib/librte_cryptodev/rte_cryptodev.h  | 4 
 lib/librte_cryptodev/rte_cryptodev_pmd.h  | 2 ++
 lib/librte_eal/common/include/arch/ppc_64/rte_cycles.h| 2 ++
 lib/librte_eal/common/include/arch/x86/rte_atomic_32.h| 3 +++
 lib/librte_eal/common/include/arch/x86/rte_cycles.h   | 2 ++
 lib/librte_eal/common/include/rte_common.h| 7 +++
 lib/librte_eal/common/include/rte_devargs.h   | 1 +
 lib/librte_eal/common/include/rte_interrupts.h| 2 ++
 lib/librte_eal/common/include/rte_memory.h| 1 +
 lib/librte_eal/common/include/rte_memzone.h   | 2 ++
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h | 1 +
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 4 
 lib/librte_hash/rte_thash.h   | 3 +++
 lib/librte_lpm/rte_lpm.h  | 1 +
 lib/librte_mbuf/rte_mbuf.h| 5 +
 lib/librte_mempool/rte_mempool.h  | 2 ++
 lib/librte_pipeline/rte_pipeline.h| 2 ++
 lib/librte_timer/rte_timer.h  | 2 ++
 20 files changed, 51 insertions(+)

diff --git a/lib/librte_cryptodev/rte_crypto.h 
b/lib/librte_cryptodev/rte_crypto.h
index 5bc3eaa..9019518 100644
--- a/lib/librte_cryptodev/rte_crypto.h
+++ b/lib/librte_cryptodev/rte_crypto.h
@@ -48,6 +48,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 

 #include "rte_crypto_sym.h"

@@ -111,6 +112,7 @@ struct rte_crypto_op {
void *opaque_data;
/**< Opaque pointer for user data */

+   RTE_STD_C11
union {
struct rte_crypto_sym_op *sym;
/**< Symmetric operation parameters */
diff --git a/lib/librte_cryptodev/rte_crypto_sym.h 
b/lib/librte_cryptodev/rte_crypto_sym.h
index d9bd821..8178e5a 100644
--- a/lib/librte_cryptodev/rte_crypto_sym.h
+++ b/lib/librte_cryptodev/rte_crypto_sym.h
@@ -51,6 +51,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 


 /** Symmetric Cipher Algorithms */
@@ -333,6 +334,7 @@ struct rte_crypto_sym_xform {
/**< next xform in chain */
enum rte_crypto_sym_xform_type type
; /**< xform type */
+   RTE_STD_C11
union {
struct rte_crypto_auth_xform auth;
/**< Authentication / hash xform */
@@ -371,6 +373,7 @@ struct rte_crypto_sym_op {

enum rte_crypto_sym_op_sess_type sess_type;

+   RTE_STD_C11
union {
struct rte_cryptodev_sym_session *session;
/**< Handle for the initialised session context */
diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index 957bdd7..cf28541 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -48,6 +48,7 @@ extern "C" {
 #include "rte_kvargs.h"
 #include "rte_crypto.h"
 #include "rte_dev.h"
+#include 

 #define CRYPTODEV_NAME_NULL_PMD cryptodev_null_pmd
 /**< Null crypto PMD device name */
@@ -104,6 +105,7 @@ extern const char **rte_cyptodev_names;
 struct rte_cryptodev_symmetric_capability {
enum rte_crypto_sym_xform_type xform_type;
/**< Transform type : Authentication / Cipher */
+   RTE_STD_C11
union {
struct {
enum rte_crypto_auth_algorithm algo;
@@ -177,6 +179,7 @@ struct rte_cryptodev_capabilities {
enum rte_crypto_op_type op;
/**< Operation type */

+   RTE_STD_C11
union {
struct rte_cryptodev_symmetric_capability sym;
/**< Symmetric operation capability parameters */
@@ -751,6 +754,7 @@ rte_cryptodev_enqueue_burst(uint8_t dev_id, uint16_t qp_id,

 /** Cryptodev symmetric crypto session */
 struct rte_cryptodev_sym_session {
+   RTE_STD_C11
struct {
uint8_t dev_id;
/**< Device Id */
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 42e7b79..a929ef1 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rt

[dpdk-dev] [PATCH v5 04/10] lib: work around nonstandard bit-fields

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

This commit prevents the following errors:

 error: type of bit-field `[...]' is a GCC extension

Note: the standard does not require implementations to issue a diagnostic
message with these, and such errors do not occur with recent GCC or clang
versions. However, GCC 4.7 is still common and using the extension keyword
is easier than checking compiler version.
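
For illustration, a short sketch of a marked bit-field declaration. The
structure and function names are hypothetical, and __extension__ is assumed
to be available (GCC, clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device structure using uint8_t bit-fields. */
struct demo_dev {
	char name[16];
	__extension__
	uint8_t attached : 1; /* ISO C only guarantees int-based bit-fields */
	__extension__
	uint8_t started : 1;
};

/* Return nonzero when the device is both attached and started. */
static int
demo_dev_is_ready(const struct demo_dev *dev)
{
	assert(dev != NULL);
	return dev->attached && dev->started;
}
```

One keyword covers a whole declaration, so a comma-separated list of
bit-fields only needs it once.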

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cryptodev/rte_cryptodev.h  | 2 ++
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 1 +
 lib/librte_ether/rte_ethdev.h | 4 
 lib/librte_kni/rte_kni.h  | 1 +
 lib/librte_lpm/rte_lpm.h  | 4 
 lib/librte_mbuf/rte_mbuf.h| 1 +
 6 files changed, 13 insertions(+)

diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index 1e30a19..957bdd7 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -619,6 +619,7 @@ struct rte_cryptodev {
struct rte_cryptodev_cb_list link_intr_cbs;
/**< User application callback for interrupts if present */

+   __extension__
uint8_t attached : 1;
/**< Flag indicating the device is attached */
 } __rte_cache_aligned;
@@ -642,6 +643,7 @@ struct rte_cryptodev_data {
char name[RTE_CRYPTODEV_NAME_MAX_LEN];
/**< Unique identifier name */

+   __extension__
uint8_t dev_started : 1;
/**< Device state: STARTED(1)/STOPPED(0) */

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h 
b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 7f458a3..2ef0506 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -159,6 +159,7 @@ struct rte_kni_device_info {
uint16_t group_id;/**< Group ID */
uint32_t core_id; /**< core ID to bind for kernel thread */

+   __extension__
uint8_t force_bind : 1;   /**< Flag for kernel thread binding */

/* mbuf size */
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b0fe033..96575e8 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -255,6 +255,7 @@ struct rte_eth_stats {
 /**
  * A structure used to retrieve link-level information of an Ethernet port.
  */
+__extension__
 struct rte_eth_link {
uint32_t link_speed;/**< ETH_SPEED_NUM_ */
uint16_t link_duplex  : 1;  /**< ETH_LINK_[HALF/FULL]_DUPLEX */
@@ -346,6 +347,7 @@ struct rte_eth_rxmode {
enum rte_eth_rx_mq_mode mq_mode;
uint32_t max_rx_pkt_len;  /**< Only used if jumbo_frame enabled. */
uint16_t split_hdr_size;  /**< hdr buf size (header_split enabled).*/
+   __extension__
uint16_t header_split : 1, /**< Header Split enable. */
hw_ip_checksum   : 1, /**< IP/UDP/TCP checksum offload enable. 
*/
hw_vlan_filter   : 1, /**< VLAN filter enable. */
@@ -645,6 +647,7 @@ struct rte_eth_txmode {

/* For i40e specifically */
uint16_t pvid;
+   __extension__
uint8_t hw_vlan_reject_tagged : 1,
/**< If set, reject sending out tagged pkts */
hw_vlan_reject_untagged : 1,
@@ -1691,6 +1694,7 @@ struct rte_eth_dev_data {
struct ether_addr* hash_mac_addrs;
/** Device Ethernet MAC addresses of hash filtering. */
uint8_t port_id;   /**< Device [external] port identifier. */
+   __extension__
uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
scattered_rx : 1,  /**< RX of scattered packets is ON(1) / 
OFF(0) */
all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
diff --git a/lib/librte_kni/rte_kni.h b/lib/librte_kni/rte_kni.h
index 7363e6c..5f6f9e4 100644
--- a/lib/librte_kni/rte_kni.h
+++ b/lib/librte_kni/rte_kni.h
@@ -88,6 +88,7 @@ struct rte_kni_conf {
struct rte_pci_addr addr;
struct rte_pci_id id;

+   __extension__
uint8_t force_bind : 1; /* Flag to bind kernel thread */
 };

diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index 79a4593..28668a3 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -93,6 +93,7 @@ extern "C" {

 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 /** @internal Tbl24 entry structure. */
+__extension__
 struct rte_lpm_tbl_entry_v20 {
/**
 * Stores Next hop (tbl8 or tbl24 when valid_group is not set) or
@@ -116,6 +117

[dpdk-dev] [PATCH v5 03/10] lib: use C99 syntax for zero-size arrays

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

The extension keyword is used whenever the C99 syntax cannot do it.

This commit prevents the following errors:

 error: ISO C forbids zero-size array `[...]'
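
As a quick illustration of the C99 syntax (hypothetical names, not taken from
any DPDK header), a flexible array member is declared with empty brackets and
sized at allocation time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table with a trailing variable-length payload. */
struct demo_table {
	unsigned int len;
	int entry[]; /* C99 flexible array member, instead of entry[0] */
};

/* Allocate a table with room for len entries; the caller must free() it. */
static struct demo_table *
demo_table_create(unsigned int len)
{
	struct demo_table *t;
	unsigned int i;

	t = malloc(sizeof(*t) + len * sizeof(t->entry[0]));
	if (t == NULL)
		return NULL;
	t->len = len;
	for (i = 0; i < len; i++)
		t->entry[i] = (int)i;
	return t;
}
```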

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_acl/rte_acl.h  | 2 +-
 lib/librte_cryptodev/rte_cryptodev.h  | 2 +-
 lib/librte_cryptodev/rte_cryptodev_pmd.h  | 2 +-
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 2 +-
 lib/librte_hash/rte_fbk_hash.h| 2 +-
 lib/librte_ip_frag/rte_ip_frag.h  | 2 +-
 lib/librte_lpm/rte_lpm.h  | 2 +-
 lib/librte_mbuf/rte_mbuf.h| 3 +++
 lib/librte_pipeline/rte_pipeline.h| 2 +-
 lib/librte_ring/rte_ring.h| 2 +-
 lib/librte_sched/rte_bitmap.h | 2 +-
 11 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/lib/librte_acl/rte_acl.h b/lib/librte_acl/rte_acl.h
index 0979a09..c059dc3 100644
--- a/lib/librte_acl/rte_acl.h
+++ b/lib/librte_acl/rte_acl.h
@@ -144,7 +144,7 @@ struct rte_acl_rule_data {
struct rte_acl_field field[fld_num]; \
 }

-RTE_ACL_RULE_DEF(rte_acl_rule, 0);
+RTE_ACL_RULE_DEF(rte_acl_rule,);

 #define RTE_ACL_RULE_SZ(fld_num) \
(sizeof(struct rte_acl_rule) + sizeof(struct rte_acl_field) * (fld_num))
diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index affbdec..1e30a19 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -759,7 +759,7 @@ struct rte_cryptodev_sym_session {
} __rte_aligned(8);
/**< Public symmetric session details */

-   char _private[0];
+   char _private[];
/**< Private session material */
 };

diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 7d049ea..42e7b79 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -71,7 +71,7 @@ struct rte_cryptodev_session {
struct rte_mempool *mp;
} __rte_aligned(8);

-   char _private[0];
+   __extension__ char _private[0];
 };

 struct rte_cryptodev_driver;
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h 
b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 2acdfd9..7f458a3 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -102,7 +102,7 @@ struct rte_kni_fifo {
volatile unsigned read;  /**< Next position to be read */
unsigned len;/**< Circular buffer length */
unsigned elem_size;  /**< Pointer size - for 32/64 bit OS */
-   void * volatile buffer[0];   /**< The buffer contains mbuf pointers */
+   void *volatile buffer[]; /**< The buffer contains mbuf pointers */
 };

 /*
diff --git a/lib/librte_hash/rte_fbk_hash.h b/lib/librte_hash/rte_fbk_hash.h
index a430961..bd46048 100644
--- a/lib/librte_hash/rte_fbk_hash.h
+++ b/lib/librte_hash/rte_fbk_hash.h
@@ -115,7 +115,7 @@ struct rte_fbk_hash_table {
uint32_t init_val;  /**< For initialising hash function. */

/** A flat table of all buckets. */
-   union rte_fbk_hash_entry t[0];
+   union rte_fbk_hash_entry t[];
 };

 /**
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index 9ac7081..69596ab 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -124,7 +124,7 @@ struct rte_ip_frag_tbl {
struct ip_frag_pkt *last; /**< last used entry. */
struct ip_pkt_list lru;   /**< LRU list for table entries. */
struct ip_frag_tbl_stat stat; /**< statistics counters. */
-   struct ip_frag_pkt pkt[0];/**< hash table. */
+   struct ip_frag_pkt pkt[]; /**< hash table. */
 };

 /** IPv6 fragment extension header */
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index 2df1d67..79a4593 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -193,7 +193,7 @@ struct rte_lpm_v20 {
__rte_cache_aligned; /**< LPM tbl24 table. */
struct rte_lpm_tbl_entry_v20 tbl8[RTE_LPM_TBL8_NUM_ENTRIES]
__rte_cache_aligned; /**< LPM tbl8 table. */
-   struct rte_lpm_rule_v20 rules_tbl[0] \
+   struct rte_lpm_rule_v20 rules_tbl[]
__rte_cache_aligned; /**< LPM rules. */
 };

diff --git a/lib/lib

[dpdk-dev] [PATCH v5 02/10] lib: work around large enum values

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

This commit prevents the following errors:

 error: ISO C restricts enumerator values to range of `int'
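
A small sketch of the same approach on a made-up enum. It assumes GCC or
clang, which accept enumerators wider than int as an extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum with a value that does not fit in an int. */
__extension__
enum demo_page_sizes {
	DEMO_PGSIZE_4K = 1ULL << 12,
	DEMO_PGSIZE_1G = 1ULL << 30,
	DEMO_PGSIZE_4G = 1ULL << 32
};

/* Convert an enumerator to a byte count. */
static uint64_t
demo_page_bytes(enum demo_page_sizes sz)
{
	return (uint64_t)sz;
}
```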

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_eal/common/include/rte_memory.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/common/include/rte_memory.h 
b/lib/librte_eal/common/include/rte_memory.h
index d9e8c21..12e0ebb 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -54,6 +54,7 @@ extern "C" {

 #include 

+__extension__
 enum rte_page_sizes {
RTE_PGSIZE_4K= 1ULL << 12,
RTE_PGSIZE_64K   = 1ULL << 16,
-- 
2.1.4

[dpdk-dev] [PATCH v5 01/10] lib: work around braced-groups within expressions

2016-09-08 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

This commit prevents the following errors:

 error: ISO C forbids braced-groups within expressions
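
As a rough sketch of the pattern applied throughout this commit, assuming GCC
or Clang. SKETCH_MIN and clamp_len are hypothetical names used for
illustration, not DPDK macros:

```c
/*
 * Hypothetical macro, not part of DPDK. A statement expression "({ ... })"
 * is a GNU C extension; the __extension__ prefix tells the compiler not to
 * warn about it under -pedantic. Each argument is evaluated exactly once.
 */
#define SKETCH_MIN(a, b) \
    __extension__ ({ \
        __typeof__(a) _a = (a); \
        __typeof__(b) _b = (b); \
        _a < _b ? _a : _b; \
    })

/* Clamp a length to a buffer capacity using the macro above. */
static inline unsigned int
clamp_len(unsigned int len, unsigned int cap)
{
    return SKETCH_MIN(len, cap);
}
```

Note the use of __typeof__ rather than typeof, since the latter is not
available when compiling with -std=c99.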

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 3 ++-
 lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h | 3 ++-
 lib/librte_eal/common/include/arch/x86/rte_memcpy.h| 4 ++--
 lib/librte_eal/common/include/arch/x86/rte_vect.h  | 6 ++++--
 lib/librte_eal/common/include/rte_common.h | 6 ++++--
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h
index da6c233..c3a2619 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h
@@ -148,7 +148,8 @@ rte_mov256(uint8_t *dst, const uint8_t *src)
 }

 #define rte_memcpy(dst, src, n)  \
-   ({ (__builtin_constant_p(n)) ?   \
+   __extension__ ({ \
+   (__builtin_constant_p(n)) ?  \
memcpy((dst), (src), (n)) :  \
rte_memcpy_func((dst), (src), (n)); })

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
index acf7aac..ca9d1dc 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
@@ -95,7 +95,8 @@ rte_mov256(uint8_t *dst, const uint8_t *src)
 }

 #define rte_memcpy(dst, src, n)  \
-   ({ (__builtin_constant_p(n)) ?   \
+   __extension__ ({ \
+   (__builtin_constant_p(n)) ?  \
memcpy((dst), (src), (n)) :  \
rte_memcpy_func((dst), (src), (n)); })

diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
index 413035e..b3bfc23 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
@@ -594,7 +594,7 @@ rte_mov256(uint8_t *dst, const uint8_t *src)
  * - __m128i  ~  must be pre-defined
  */
 #define MOVEUNALIGNED_LEFT47_IMM(dst, src, len, offset) \
-({ \
+__extension__ ({ \
 int tmp; \
 while (len >= 128 + 16 - offset) { \
 xmm0 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 0 * 16)); \
@@ -655,7 +655,7 @@ rte_mov256(uint8_t *dst, const uint8_t *src)
  * - __m128i  ~  used in MOVEUNALIGNED_LEFT47_IMM must be pre-defined
  */
 #define MOVEUNALIGNED_LEFT47(dst, src, len, offset)   \
-({\
+__extension__ ({  \
 switch (offset) { \
 case 0x01: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x01); break;\
 case 0x02: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x02); break;\
diff --git a/lib/librte_eal/common/include/arch/x86/rte_vect.h b/lib/librte_eal/common/include/arch/x86/rte_vect.h
index b698797..2836f2c 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_vect.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_vect.h
@@ -106,7 +106,8 @@ typedef union rte_ymm {
 #endif /* __AVX__ */

 #ifdef RTE_ARCH_I686
-#define _mm_cvtsi128_si64(a) ({ \
+#define _mm_cvtsi128_si64(a)\
+__extension__ ({\
rte_xmm_t m;\
m.x = (a);  \
(m.u64[0]); \
@@ -117,7 +118,8 @@ typedef union rte_ymm {
  * Prior to version 12.1 icc doesn't support _mm_set_epi64x.
  */
 #if (defined(__ICC) && __ICC < 1210)
-#define _mm_set_epi64x(a, b)  ({ \
+#define _mm_set_epi64x(a, b) \
+__extension__ ({ \
rte_xmm_t m; \
m.u64[0] = b;\
m.u64[1] = a;\
diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
index 332f2a4..477472b 100644
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -268,7 +268,8 @@ rte_align64pow2(uint64_t v)
 /**
  * Macro to return the minimum of two numbers
  */
-#define RTE_MIN(a, b) ({ \
+#define RTE_MIN(a, b) \
+   __extension__ ({ \

[dpdk-dev] [PATCH v5 00/10] Fix build errors related to exported headers

2016-09-08 Thread Adrien Mazarguil

DPDK uses GNU C language extensions in most of its code base. This is fine
for internal source files whose compilation flags are controlled by DPDK,
however user applications that use exported "public" headers may experience
compilation failures when enabling strict error/standard checks (-std and
-pedantic for instance).

Exported headers are installed system-wide and must be as clean as possible
so applications do not have to resort to workarounds.

This patchset affects exported headers only, compilation problems are
addressed as follows:

- Adding the __extension__ keyword to nonstandard constructs (same method
  as existing libraries when there is no other choice).
- Adding the __extension__ keyword to C11 constructs to remain compatible
  with pure C99.
- Adding missing includes so exported files can be included out of order
  and on their own.
- Fixing GNU printf-like variadic macros as there is no magic keyword for
  these.
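
To make two of the approaches listed above more concrete, here is a minimal
sketch of the standard C99 forms involved. All names (sketch_table,
sketch_table_alloc, SKETCH_FMT) are hypothetical and chosen for illustration;
this is one common way to write such code, not a copy of what the patches do:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical table type. "entries[]" is a C99 flexible array member,
 * the standard replacement for the GNU zero-length "entries[0]".
 */
struct sketch_table {
    unsigned int count;
    int entries[];
};

/* Allocate a table with room for n entries, all zeroed. */
static struct sketch_table *
sketch_table_alloc(unsigned int n)
{
    struct sketch_table *t;

    t = calloc(1, sizeof(*t) + n * sizeof(t->entries[0]));
    if (t != NULL)
        t->count = n;
    return t;
}

/*
 * GNU named variadic macro (nonstandard):
 *   #define SKETCH_FMT(buf, len, fmt, args...) \
 *           snprintf(buf, len, fmt, ##args)
 * C99 form: the format string is folded into __VA_ARGS__ so that at least
 * one variadic argument is always present.
 */
#define SKETCH_FMT(buf, len, ...) snprintf((buf), (len), __VA_ARGS__)
```

Both forms compile with -std=c99 -pedantic without requiring __extension__.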

Changes in v5:

- Fixed compilation error (RH 6.7) in struct rte_cryptodev_session by using
  the extension keyword instead of the C99 syntax.
- Removed call to check-includes.sh from test-build.sh as it takes too long
  to complete. This script should be run voluntarily like check-git-log.sh
  and friends.

Changes in v4:

- Dropped "lib: work around structs with no members" patch, now addressed
  as a separate issue outside of this patchset by "mempool: fix empty
  structure definition".
- Fixed remaining compilation error with ICC reported by Ferruh. Finally
  settled on using the __extension__ keyword directly in struct
  rte_pipeline_table_entry as converting it to a standard flexible array
  may break existing programs.

Changes in v3:

- Fixed compilation issue on ARM and POWER8 due to missing parenthesis.
- Added bit-field fix for rte_kni.h.

Changes in v2:

- Rebased on top of the current HEAD.
- Added script to check headers automatically (check-includes.sh), for both
  C and C++ compilation.
- Updated test-build.sh to use it.
- Fixed consistency of new #include directives, now inside extern "C"
  blocks for files that already do that (Jan, fixing these was too much
  work for this patchset so I settled on this solution in the meantime).
- Updated headlines to address check-git-log.sh complaints.

Adrien Mazarguil (10):
  lib: work around braced-groups within expressions
  lib: work around large enum values
  lib: use C99 syntax for zero-size arrays
  lib: work around nonstandard bit-fields
  lib: work around unnamed structs/unions
  lib: add missing include dependencies
  lib: work around forward reference to enum types
  lib: remove named variadic macros in exported headers
  lib: hide static functions never defined
  scripts: check compilation of exported header files

 MAINTAINERS |   1 +
 lib/librte_acl/rte_acl.h|   2 +-
 lib/librte_cfgfile/rte_cfgfile.h|   2 +
 lib/librte_cmdline/cmdline.h|   1 +
 lib/librte_cmdline/cmdline_parse_portlist.h |   1 +
 lib/librte_cmdline/cmdline_socket.h |   3 +
 lib/librte_cryptodev/rte_crypto.h   |   2 +
 lib/librte_cryptodev/rte_crypto_sym.h   |   3 +
 lib/librte_cryptodev/rte_cryptodev.h|  40 ++-
 lib/librte_cryptodev/rte_cryptodev_pmd.h|   6 +-
 .../common/include/arch/arm/rte_byteorder.h |   2 +
 .../common/include/arch/arm/rte_memcpy_32.h |   3 +-
 .../common/include/arch/arm/rte_prefetch_32.h   |   1 +
 .../common/include/arch/arm/rte_prefetch_64.h   |   1 +
 .../common/include/arch/arm/rte_vect.h  |   1 +
 .../common/include/arch/ppc_64/rte_atomic.h |   1 +
 .../common/include/arch/ppc_64/rte_byteorder.h  |   1 +
 .../common/include/arch/ppc_64/rte_cycles.h |   2 +
 .../common/include/arch/ppc_64/rte_memcpy.h |   3 +-
 .../common/include/arch/ppc_64/rte_prefetch.h   |   1 +
 .../common/include/arch/x86/rte_atomic.h|   2 +
 .../common/include/arch/x86/rte_atomic_32.h |   9 +
 .../common/include/arch/x86/rte_atomic_64.h |   8 +
 .../common/include/arch/x86/rte_byteorder.h |   2 +
 .../common/include/arch/x86/rte_byteorder_32.h  |   7 +
 .../common/include/arch/x86/rte_byteorder_64.h  |   7 +
 .../common/include/arch/x86/rte_cycles.h|   2 +
 .../common/include/arch/x86/rte_memcpy.h|   4 +-
 .../common/include/arch/x86/rte_prefetch.h  |   1 +
 .../common/include/arch/x86/rte_rtm.h   |   1 +
 .../common/include/arch/x86/rte_vect.h  |   8 +-
 .../common/include/generic/rte_atomic.h |   1 +
 .../common/include/generic/rte_byteorder.h  |   2 +
 .../common/include/generic/rte_cpuflags.h   |   3 +
 .../common/include/generic/rte_memcpy.h |   4 +
 lib/librte_eal/common/include/rte_common.h  |  22 +-
 lib/librte_eal/common/include/rte_devargs.h |   1 +
 lib/librte_eal/common/include/rte_eal.h |   1 +
 lib/librte_eal/commo

[dpdk-dev] [RFC v2] ethdev: introduce generic flow API

2016-08-19 Thread Adrien Mazarguil

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

It has the following benefits:

- A unified API is easier to program for, applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

The existing filter types will be deprecated and removed in the near
future.

Note that it is not complete yet. This commit only provides the header
file. The specification is provided separately, see below.

HTML version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.html

PDF version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf

Git tree:
 https://github.com/6WIND/rte_flow

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_ether/Makefile   |   2 +
 lib/librte_ether/rte_flow.h | 941 +++
 2 files changed, 943 insertions(+)

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 0bb5dc9..a6f7cd5 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -52,8 +52,10 @@ SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h

 # this lib depends upon:
 DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
+DEPDIRS-y += lib/librte_net

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 000..0aa6094
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,941 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority, the highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application, they
+ * do not need to be contiguous nor start from 0, however the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause

[dpdk-dev] [RFC v2] Generic flow director/filtering/classification API

2016-08-19 Thread Adrien Mazarguil

Hi All,

Thanks to many for the positive and constructive feedback I've received so
far. Here is the updated specification (v0.7) at last.

I've attempted to address as many comments as possible but could not
process them all just yet. A new section "Future evolutions" has been
added for the remaining topics.

This series adds rte_flow.h to the DPDK tree. Next time I will attempt to
convert the specification into a documentation commit as part of the patchset
and actually implement API functions.

I think including the entire document here makes it easier to annotate on
the ML, apologies in advance for the resulting traffic.

Finally I'm off for the next two weeks, do not expect replies from me in
the meantime.

Updates are also available online:

HTML version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.html

PDF version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf  

Related draft header file (also in the next patch):
 https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h

Git tree:
 https://github.com/6WIND/rte_flow

Changes from v1:

 Specification:

 - Settled on [generic] "flow interface" / "flow API" as the name of this
   framework, matches the rte_flow prefix better.
 - Minor wording changes in several places.
 - Partially added egress (TX) support.
 - Added "unrecoverable errors" as another consequence of overlapping
   rules.
 - Described flow rules groups and their interaction with flow rule
   priorities.
 - Fully described PF and VF meta pattern items so they are not open to
   interpretation anymore.
 - Removed the SIGNATURE meta pattern item as its description was too
   vague, may be re-added later if necessary.
 - Added the PORT pattern item to apply rules to non-default physical
   ports.
 - Entirely redefined the RAW pattern item.
 - Fixed tag error in the ETH item definition.
 - Updated protocol definitions (IPV4, IPV6, ICMP, UDP).
 - Added missing protocols (SCTP, VXLAN).
 - Converted ID action to MARK and FLAG actions, described interaction
   with the RSS hash result in mbufs.
 - Updated COUNT query structure to retrieve the number of bytes.
 - Updated VF action.
 - Documented negative item and action types, those will be used for
   dynamic types generated at run-time.
 - Added blurb about IPv4 options and IPv6 extension headers matching.
 - Updated function definitions.
 - Documented a flush method to remove all rules on a given port at once.
 - Documented the verbose error reporting interface.
 - Documented how the private interface for PMD use will work.
 - Documented expected behavior between successive port initializations.
 - Documented expected behavior for ports not under DPDK control.
 - Updated API migration section.
 - Added future evolutions section.

 Header file:

 - Not a draft anymore and can be used as-is for preliminary
   implementations.
 - Flow rule attributes (group, priority, etc) now have their own
   structure provided separately to API functions (struct rte_flow_attr).
 - Group and priority interactions have been documented.
 - Added PORT item.
 - Removed SIGNATURE item.
 - Defined ICMP, SCTP and VXLAN items.
 - Redefined PF, VF, RAW, IPV4, IPV6, UDP and TCP items.
 - Fixed tag error in the ETH item definition.
 - Converted ID action to MARK and FLAG actions.
 - Updated COUNT query structure.
 - Updated VF action.
 - Added verbose errors interface.
 - Updated function prototypes according to the above.
 - Defined rte_flow_flush().
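
As a rough illustration of the group and priority interaction documented in
the header (lower values denote higher priority, groups are compared before
per-rule priorities), the ordering could be sketched as the comparison below.
The structure and function names are hypothetical stand-ins, not the actual
struct rte_flow_attr layout:

```c
#include <stdint.h>

/* Hypothetical stand-in for the two ordering fields of a flow rule. */
struct sketch_flow_attr {
    uint32_t group;    /* lower value = higher priority group */
    uint32_t priority; /* lower value = higher priority within the group */
};

/*
 * Return a negative value if rule a takes precedence over rule b, a positive
 * value if b takes precedence, and 0 if both share the same group and
 * priority (in which case the outcome for overlapping patterns is undefined).
 */
static inline int
sketch_flow_attr_cmp(const struct sketch_flow_attr *a,
                     const struct sketch_flow_attr *b)
{
    if (a->group != b->group)
        return a->group < b->group ? -1 : 1;
    if (a->priority != b->priority)
        return a->priority < b->priority ? -1 : 1;
    return 0;
}
```

With this ordering, a rule with priority 8 in group 0 precedes a rule with
priority 0 in group 8.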



======================
Generic flow interface
======================

.. footer::

   v0.7

.. contents::
.. sectnum::
.. raw:: pdf

   PageBreak

Overview
========

DPDK provides several competing interfaces added over time to perform packet
matching and related actions such as filtering and classification.

They must be extended to implement the features supported by newer devices
in order to expose them to applications, however the current design has
several drawbacks:

- Complicated filter combinations which have not been hard-coded cannot be
  expressed.
- Prone to API/ABI breakage when new features must be added to an existing
  filter type, which frequently happens.

From an application point of view:

- Having disparate interfaces, all optional and lacking in features does not
  make this API easy to use.
- Seemingly arbitrary built-in limitations of filter types based on the
  device they were initially designed for.
- Undefined relationship between different filter types.
- High complexity, considerable undocumented and/or undefined behavior.

Considering the growing number of devices supported by DPDK, adding a new
filter type each time a new feature must be implemented is not sustainable
in the long term. Applications not written to target a specific device
cannot really benefit from such an API.

For these reasons, this document defines an extensible unified API that
encompasses and supersedes these legacy filter types.

.. raw::

[dpdk-dev] ConnectX4 100GbE - Compilation problem

2016-08-18 Thread Adrien Mazarguil

MLNX_DPDK_2.2_2.7/mk/internal/rte.compile-pre.mk:126: recipe for
> target 'mlx4.o' failed
> 
> I would appreciate any suggestions and guidance.

Well fortunately these errors are also present in v2.2.0 and should have
been addressed since v16.07 by the following commit:

 http://dpdk.org/browse/dpdk/commit/?id=d06c608c013c36711e7a693b3fece68a93ae4369

You can either upgrade to v16.07, back-port this commit yourself or wait for
an update from Mellanox.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-10 Thread Adrien Mazarguil

On Tue, Aug 09, 2016 at 02:47:44PM -0700, John Fastabend wrote:
> On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
> > On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
[...]
> >> The problem is keeping priorities in order and/or possibly breaking
> >> rules apart (e.g. you have an L2 table and an L3 table) becomes very
> >> complex to manage at driver level. I think it's easier for the
> >> application which has some context to do this. The application "knows"
> >> if it's a router for example and will likely be able to pack rules better
> >> than a PMD will.
> > 
> > I don't think most applications know they are L2 or L3 routers. They may not
> > know more than the pattern provided to the PMD, which may indeed end at a L2
> > or L3 protocol. If the application simply chooses a table based on this
> > information, then the PMD could have easily done the same.
> > 
> 
> But when we start thinking about encap/decap then its natural to start
> using this interface to implement various forwarding dataplanes. And one
> common way to organize a switch is into a TEP, router, switch
> (mac/vlan), ACL tables, etc. In fact we see this topology starting to
> show up in the NICs now.
> 
> Further each table may be "managed" by a different entity. In which
> case the software will want to manage the physical and virtual networks
> separately.
> 
> It doesn't make sense to me to require a software aggregator object to
> marshal the rules into a flat table then for a PMD to split them apart
> again.

OK, my point was mostly about handling basic cases easily and making sure
applications do not have to bother with petty HW details when they do not
want to, yet still get maximum performance by having the PMD make the most
appropriate choices automatically.

You've convinced me that in many cases PMDs won't be able to optimize
efficiently and that conscious applications will know better. The API has to
provide the ability to do so. I think it's fine as long as it is not
mandatory.

> > I understand the issue is what happens when applications really want to
> > define e.g. L2/L3/L2 rules in this specific order (or any ordering that
> > cannot be satisfied by HW due to table constraints).
> > 
> > By exposing tables, in such a case applications should move all rules from
> > L2 to a L3 table themselves (assuming this is even supported) to guarantee
> > ordering between rules, or fail to add them. This is basically what the PMD
> > could have done, possibly in a more efficient manner in my opinion.
> 
> I disagree with the more efficient comment :)
> 
> If the software layer is working on L2/TEP/ACL/router layers merging
> them just to pull them back apart is not going to be more efficient.

Moving flow rules around cannot be efficient by definition, however I think
that attempting to describe table capabilities may be as complicated as
describing HW bit-masking features. Applications may get it wrong as a
result while a PMD would not make any mistake.

Your use case is valid though, if the application already groups rules, then
sharing this information with the PMD would make sense from a performance
standpoint.

> > Let's assume two opposite scenarios for this discussion:
> > 
> > - App #1 is a command-line interface directly mapped to flow rules, which
> >   basically gets slow random input from users depending on how they want to
> >   configure their traffic. All rules differ considerably (L2, L3, L4, some
> >   with incomplete bit-masks, etc). All in all, few but complex rules with
> >   specific priorities.
> > 
> 
> Agree with this and in this case the application should be behind any
> network physical/virtual and not giving rules like encap/decap/etc. This
> application either sits on the physical function and "owns" the hardware
> resource or sits behind a virtual switch.
> 
> 
> > - App #2 is something like OVS, creating and deleting a large number of very
> >   specific (without incomplete bit-masks) and mostly identical
> >   single-priority rules automatically and very frequently.
> > 
> 
> Maybe for OVS but not all virtual switches are built with flat tables
> at the bottom like this. Nor is it necessarily optimal.
> 
> Another application (the one I'm concerned about :) would be built as
> a pipeline, something like
> 
>   ACL -> TEP -> ACL -> VEB -> ACL
> 
> If I have hardware that supports a TEP hardware block an ACL hardware
> block and a VEB  block for example I don't want to merge my control
> plane into a single table. The merging in this case is just pure
> overhead/complexity for no gain.

It could be do

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-10 Thread Adrien Mazarguil

> Also in the current scheme how would I match an ipv6 option or specific
> nsh option or mpls tag?

Ideally through specific pattern items defined for this purpose, which is
how I thought the API would evolve. Of course it wouldn't be fully dynamic
and you'd have to wait for a DPDK release that implements them.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-04 Thread Adrien Mazarguil

me point, some combination
won't be possible. Getting there was only more complicated from
users/applications point of view.

For app #2 if the first rule can be created then subsequent rules shouldn't
be a problem until their number reaches device limits. Selecting the proper
table to use for these can easily be done by the PMD.

> >>> I don't see how the PMD can sort this out in any meaningful way and it
> >>> has to be exposed to the application that has the intelligence to 'know'
> >>> priorities between masks and non-masks filters. I'm sure you could come
> >>> up with something but it would be less than ideal in many cases I would
> >>> guess and we can't have the driver getting priorities wrong or we may
> >>> not get the correct behavior.
> > 
> > It may be solved by having the PMD maintain a SW state to quickly know which
> > rules are currently created and in what state the device is so basically the
> > application doesn't have to perform this work.
> > 
> > This API allows applications to express basic needs such as "redirect
> > packets matching this pattern to that queue". It must not deal with HW
> > details and limitations in my opinion. If a request cannot be satisfied,
> > then the rule cannot be created. No help from the application must be
> > expected by PMDs, otherwise it opens the door to the same issues as the
> > legacy filtering APIs.
> 
> This depends on the application and what/how it wants to manage the
> device. If the application manages a pipeline with some set of tables,
> then mapping this down to a single table, which then the PMD has to
> unwind back to a multi-table topology to me seems like a waste.

Of course, only I am not sure applications will behave differently if they
are aware of HW tables. I fear it will make things more complicated for
them and they will just stick with the most capable table all the time, but
I agree it should be easier for PMDs.

> > [...]
> >>>> Unfortunately, our maskfull region is extremely small too compared to
> >>>> maskless region.
> >>>>
> >>>
> >>> To me this means a userspace application would want to pack it
> >>> carefully to get the full benefit. So you need some mechanism to specify
> >>> the "region" hence the above table proposal.
> >>>
> >>
> >> Right. Makes sense.
> > 
> > I do not agree, applications should not be aware of it. Note this case can
> > be handled differently, so that rules do not have to be moved back and forth
> > between both tables. If the first created rule requires a maskfull entry,
> > then all subsequent rules will be entered into that table. Otherwise no
> > maskfull entry can be created as long as there is one maskless entry. When
> > either table is full, no more rules may be added. Would that work for you?
> > 
> 
> It's not about mask vs no mask. The devices with multiple tables that I
> have don't have these mask limitations. It's about how to optimally pack
> the rules and who implements that logic. I think its best done in the
> application where I have the context.
> 
> Is there a way to omit the table field if the PMD is expected to do
> a best effort and add the table field if the user wants explicit
> control over table mgmt. This would support both models. I at least
> would like to have explicit control over rule population in my pipeline
> for use cases where I'm building a pipeline on top of the hardware.

Yes that's a possibility. Perhaps the table ID to use could be specified as
a meta pattern item? We'd still need methods to report how many tables exist
and perhaps some way to report their limitations, these could be later
through a separate set of functions.

[...]
> >>> For this adding a meta-data item seems simplest to me. And if you want
> >>> to make the default to be only a single port that would maybe make it
> >>> easier for existing apps to port from flow director. Then if an
> >>> application cares it can create a list of ports if needed.
> >>>
> >>
> >> Agreed.
> > 
> > However although I'm not opposed to adding dedicated meta items, remember
> > applications will not automatically benefit from the increased performance
> > if a single PMD implements this feature, their maintainers will probably not
> > bother with it.
> > 
> 
> Unless as we noted in other thread the application is closely bound to
> its hardware for capability reasons. In this case it would make sense
> to implement.

Sure.

[...]

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-04 Thread Adrien Mazarguil

ength headers like IP. The limitation is it can't get past
> >>> undefined variable length headers.
> > 
> > RTE_FLOW_ITEM_TYPE_ANY is made for that purpose. Is that what you are
> > looking for?
> > 
> 
> But FLOW_ITEM_TYPE_ANY skips "any" header type is my understanding if
> we have new variable length header in the future we will have to add
> a new type RTE_FLOW_ITEM_TYPE_FOO for example. The RAW type will work
> for fixed headers as noted above.

I'm (slowly) starting to get it. How about the suggestion I made above for
RAW items then?

[...]
> The two open items from me are do we need to support adding new variable
> length headers? And how do we handle multiple tables I'll take that up
> in the other thread.

I think variable length headers may eventually be supported through pattern
tricks or a separate conversion layer.

> >>> I looked at the git repo but I only saw the header definition I guess
> >>> the implementation is TBD after there is enough agreement on the
> >>> interface?
> > 
> > Precisely, I intend to update the tree and send a v2 soon (unfortunately did
> > not have much time these past few days to work on this).
> > 
> > Now what if, instead of a seemingly complex parse graph and still in
> > addition to the query method, enum values were defined for PMDs to report
> > an array of supported items, typical patterns and actions so applications
> > can get a quick idea of what devices are capable of without being too
> > specific. Something like:
> > 
> >  enum rte_flow_capability {
> >  RTE_FLOW_CAPABILITY_ITEM_ETH,
> >  RTE_FLOW_CAPABILITY_PATTERN_ETH_IP_TCP,
> >  RTE_FLOW_CAPABILITY_ACTION_ID,
> >  ...
> >  };
> > 
> > I'm not convinced about the usefulness of this because it would have to
> > be maintained separately, but that would be easier than building a dummy
> > flow rule for simple query purposes.
> 
> I'm not sure it's necessary either at first.

Then I'll discard this idea.

> > The main question I have for you is, do you think the core of the specified
> > API is adequate enough assuming it can be extended later with new methods?
> > 
> 
> The above two items are my only opens at this point, I agree with your
> summary of my capabilities proposal namely it can be added.

Thanks, see you in the other thread.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-03 Thread Adrien Mazarguil

Replying to everything at once, please see below.

On Tue, Jul 26, 2016 at 03:37:35PM +0530, Rahul Lakkireddy wrote:
> On Monday, July 07/25/16, 2016 at 09:40:02 -0700, John Fastabend wrote:
> > On 16-07-25 04:32 AM, Rahul Lakkireddy wrote:
> > > Hi Adrien,
> > > 
> > > On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil 
> > > wrote:
> > >> Hi Rahul,
> > >>
> > >> Please see below.
> > >>
> > >> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> > >>> Hi Adrien,
> > >>>
> > >>> The proposal looks very good.  It satisfies most of the features
> > >>> supported by Chelsio NICs.  We are looking for suggestions on exposing
> > >>> more additional features supported by Chelsio NICs via this API.
> > >>>
> > >>> Chelsio NICs have two regions in which filters can be placed -
> > >>> Maskfull and Maskless regions.  As their names imply, maskfull region
> > >>> can accept masks to match a range of values; whereas, maskless region
> > >>> don't accept any masks and hence perform a more strict exact-matches.
> > >>> Filters without masks can also be placed in maskfull region.  By
> > >>> default, maskless region have higher priority over the maskfull region.
> > >>> However, the priority between the two regions is configurable.
> > >>
> > >> I understand this configuration affects the entire device. Just to be 
> > >> clear,
> > >> assuming some filters are already configured, are they affected by a 
> > >> change
> > >> of region priority later?
> > >>
> > > 
> > > Both the regions exist at the same time in the device.  Each filter can
> > > either belong to maskfull or the maskless region.
> > > 
> > > The priority is configured at time of filter creation for every
> > > individual filter and cannot be changed while the filter is still
> > > active. If priority needs to be changed for a particular filter then,
> > > it needs to be deleted first and re-created.
> > 
> > Could you model this as two tables and add a table_id to the API? This
> > way user space could populate the table it chooses. We would have to add
> > some capabilities attributes to "learn" if tables support masks or not
> > though.
> > 
> 
> This approach sounds interesting.

Now I understand the idea behind these tables, however from an application
point of view I still think it is better if the PMD takes care of flow rule
optimizations automatically. Think about it: a PMD manages exactly one kind
of device, which it knows perfectly well, while applications want the best
possible performance out of any device in the most generic fashion.

> > I don't see how the PMD can sort this out in any meaningful way and it
> > has to be exposed to the application that has the intelligence to 'know'
> > priorities between masks and non-masks filters. I'm sure you could come
> > up with something but it would be less than ideal in many cases I would
> > guess and we can't have the driver getting priorities wrong or we may
> > not get the correct behavior.

It may be solved by having the PMD maintain a SW state that tracks which
rules currently exist and what state the device is in, so that the
application does not have to perform this work.

This API allows applications to express basic needs such as "redirect
packets matching this pattern to that queue". It must not deal with HW
details and limitations in my opinion. If a request cannot be satisfied,
then the rule cannot be created. No help from the application must be
expected by PMDs, otherwise it opens the door to the same issues as the
legacy filtering APIs.
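
To make the "redirect packets matching this pattern to that queue" contract
concrete, here is a minimal sketch in C. It assumes a toy device with a
fixed-size rule table; every demo_* name is made up for this example and is
not part of DPDK or of the proposed API. The only point is that a request
either succeeds entirely or fails with an error code, with no help expected
from the application.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; none of these names exist in DPDK. */
struct demo_rule {
	uint8_t ip_proto;  /* IP protocol to match, e.g. 6 for TCP */
	uint16_t dst_port; /* destination port to match */
	uint16_t queue;    /* RX queue matching packets are redirected to */
};

#define DEMO_MAX_RULES 4

struct demo_dev {
	uint16_t nb_queues; /* number of configured RX queues */
	size_t nb_rules;    /* rules currently installed */
	struct demo_rule rules[DEMO_MAX_RULES]; /* SW copy kept by the PMD */
};

/*
 * Install a rule or fail without side effects. The application only states
 * what it wants; device limits surface as an error code, nothing more.
 */
static int
demo_rule_create(struct demo_dev *dev, const struct demo_rule *rule)
{
	assert(dev != NULL && rule != NULL);
	if (rule->queue >= dev->nb_queues)
		return -EINVAL; /* target queue does not exist */
	if (dev->nb_rules == DEMO_MAX_RULES)
		return -ENOSPC; /* no room left in the device table */
	dev->rules[dev->nb_rules++] = *rule;
	return 0;
}
```

A caller would simply check the return value and, on failure, fall back to
handling the traffic in software.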

[...]
> > > Unfortunately, our maskfull region is extremely small too compared to
> > > maskless region.
> > > 
> > 
> > To me this means a userspace application would want to pack it
> > carefully to get the full benefit. So you need some mechanism to specify
> > the "region" hence the above table proposal.
> > 
> 
> Right. Makes sense.

I do not agree, applications should not be aware of it. Note this case can
be handled differently, so that rules do not have to be moved back and forth
between both tables. If the first created rule requires a maskfull entry,
then all subsequent rules will be entered into that table. Otherwise no
maskfull entry can be created as long as there is one maskless entry. When
either table is full, no more rules may be added. Would that work f
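
A minimal sketch of the placement policy described in the paragraph above,
assuming a single counter per device and assuming the region choice resets
once the last rule is destroyed (that reset is my own assumption, it is not
stated above). All demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_region {
	DEMO_REGION_NONE,     /* no rule installed yet */
	DEMO_REGION_MASKFULL, /* every rule lives in the maskfull table */
	DEMO_REGION_MASKLESS, /* every rule lives in the maskless table */
};

struct demo_filter_dev {
	enum demo_region active;    /* region chosen by the first rule */
	unsigned int used;          /* entries used in the active region */
	unsigned int maskfull_size; /* capacity of the maskfull table */
	unsigned int maskless_size; /* capacity of the maskless table */
};

/* Add one rule; needs_mask tells whether it requires a maskfull entry. */
static int
demo_filter_add(struct demo_filter_dev *dev, bool needs_mask)
{
	enum demo_region region;
	unsigned int limit;

	assert(dev != NULL);
	region = dev->active;
	if (region == DEMO_REGION_NONE)
		/* The first rule decides which table is used from now on. */
		region = needs_mask ?
			DEMO_REGION_MASKFULL : DEMO_REGION_MASKLESS;
	else if (region == DEMO_REGION_MASKLESS && needs_mask)
		/* No maskfull entry while maskless entries exist. */
		return -EBUSY;
	limit = region == DEMO_REGION_MASKFULL ?
		dev->maskfull_size : dev->maskless_size;
	if (dev->used >= limit)
		return -ENOSPC; /* active table is full */
	dev->active = region;
	++dev->used;
	return 0;
}

/* Remove one rule; an empty device lets the next rule choose again. */
static void
demo_filter_del(struct demo_filter_dev *dev)
{
	assert(dev != NULL && dev->used > 0);
	if (--dev->used == 0)
		dev->active = DEMO_REGION_NONE;
}
```

With this policy rules never move between tables, at the cost of leaving
one of the two tables unused at any given time.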

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-03 Thread Adrien Mazarguil

Hi Kieran,

On Mon, Aug 01, 2016 at 04:08:51PM +0100, Kieran Mansley wrote:
> Apologies for coming a little late to this thread about the proposed new
> API for filtering etc.
> 
> I've reviewed it based on Solarflare's needs and hardware capabilities,
> and think the proposal is likely to be a big improvement on the current
> system.
> 
> I worry slightly that the goal of having applications that are not aware
> of the hardware they are running on will be difficult to meet.  My guess
> is that the different hardware platforms will have so little overlap in
> the functionality they support that to get best performance the
> applications will still be heavily tailored to the subsets of the API
> that the hardware they are using provides.  The discussion of filter
> priorities is a good example of this: to get best performance the
> application will want to use the hardware's filtering capabilities to do
> all the heavy lifting, but the abilities of different NICs to support
> particular priorities and combinations of filters will mean what works
> very well for one NIC may well return "I can't do that" for another, and
> vice versa.

I also think most applications will end up using mostly generic rules, while
applications tailored for specific devices will use more features. In my
mind this is like how applications would handle SSE/AVX/AltiVec/etc
optimizations. They need to be aware such features exist and have both
specific and more generic code. The query interface should help with that.
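
As a rough illustration of that "specific first, generic fallback" approach,
here is a sketch in C. It assumes validation can be modeled as a feature
bitmask check, which is a strong simplification since a real query would
depend on the current device state. The demo_* names and DEMO_FEAT_* flags
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature flags standing in for device capabilities. */
#define DEMO_FEAT_QUEUE (UINT32_C(1) << 0)
#define DEMO_FEAT_VXLAN (UINT32_C(1) << 1)
#define DEMO_FEAT_RSS   (UINT32_C(1) << 2)

struct demo_candidate {
	const char *name;  /* human-readable label for the rule variant */
	uint32_t required; /* features this variant depends on */
};

/* Dry run: 0 if the device could accept the variant, -ENOTSUP otherwise. */
static int
demo_validate(uint32_t dev_features, const struct demo_candidate *c)
{
	assert(c != NULL);
	return (c->required & ~dev_features) ? -ENOTSUP : 0;
}

/*
 * Candidates are ordered from most specific to most generic; return the
 * first one the device accepts, or NULL when none can be used.
 */
static const struct demo_candidate *
demo_pick(uint32_t dev_features, const struct demo_candidate *list, size_t n)
{
	size_t i;

	for (i = 0; i != n; ++i)
		if (demo_validate(dev_features, &list[i]) == 0)
			return &list[i];
	return NULL;
}
```

This mirrors how SIMD code paths are usually selected: probe once, then use
the best variant the hardware accepts.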

> One suggestion for extending the API further would be to allow it to
> also describe transmit filters as well as receive filters.

Yes, TX is probably the next step. I think it will be part of the same API,
using pattern/actions similarly only they would affect the TX path. But
let's focus on the RX side for now.

> There are also some filters that can prove very useful to our customers
> that while they could be achieved through the careful insertion of
> multiple filters with the right order and priorities, could be made more
> application-friendly by having a more meaningful alias. For example:
>  - multicast-mismatch (all multicast traffic that doesn't match another
> filter and would otherwise be discarded)
>  - unicast-mismatch (all unicast traffic that doesn't match another
> filter and would otherwise be discarded)
>  - all-multicast (all multicast traffic)
>  - all-unicast (all unicast traffic)

Why not, those may be added as new pattern items if the community feels they
are necessary. But right now I do not think these are difficult to specify.
Of course one should dedicate priority levels far apart to avoid collisions
with more specific rules, but you still need priorities to determine which
of "all-multicast" or "unicast-mismatch" should match first.
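
To show what I mean by priority levels far apart, here is a small sketch in
C where catch-all rules sit at a much lower priority than an exact MAC
match. It assumes rules are matched on destination MAC only and that 0 is
the highest priority; the demo_* names are hypothetical. Multicast is
detected through the least significant bit of the first address octet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_match {
	DEMO_MATCH_DMAC,          /* exact destination MAC */
	DEMO_MATCH_ALL_MULTICAST, /* any multicast destination */
	DEMO_MATCH_ALL_UNICAST,   /* any unicast destination */
};

struct demo_prio_rule {
	enum demo_match match;
	uint8_t dmac[6];   /* only used by DEMO_MATCH_DMAC */
	uint32_t priority; /* 0 is the highest priority */
	uint16_t queue;    /* target RX queue */
};

static bool
demo_is_multicast(const uint8_t dmac[6])
{
	return (dmac[0] & 1) != 0;
}

static bool
demo_rule_matches(const struct demo_prio_rule *r, const uint8_t dmac[6])
{
	switch (r->match) {
	case DEMO_MATCH_DMAC:
		return memcmp(r->dmac, dmac, 6) == 0;
	case DEMO_MATCH_ALL_MULTICAST:
		return demo_is_multicast(dmac);
	case DEMO_MATCH_ALL_UNICAST:
		return !demo_is_multicast(dmac);
	}
	return false;
}

/*
 * Return the matching rule with the lowest priority value. On a tie the
 * first rule wins here, whereas a real device leaves the outcome undefined.
 */
static const struct demo_prio_rule *
demo_lookup(const struct demo_prio_rule *rules, size_t n,
	    const uint8_t dmac[6])
{
	const struct demo_prio_rule *best = NULL;
	size_t i;

	assert(rules != NULL || n == 0);
	for (i = 0; i != n; ++i)
		if (demo_rule_matches(&rules[i], dmac) &&
		    (best == NULL || rules[i].priority < best->priority))
			best = &rules[i];
	return best;
}
```

The "mismatch" aliases then amount to catch-all rules placed at a priority
level no specific rule is ever expected to use.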

> Finally, I wonder if any thought has been given to dealing with
> situations where there is a conflict between different virtual or
> physical functions.  E.g. attempting to insert a MAC filter on one VF
> that would steal traffic destined to a different VF.  Will it be up to
> each driver to enforce these sorts of policies or will there be a
> general vendor-neutral framework to deal with this?

PFs and VFs are a complex topic, eh? Consider that a PF is not even
guaranteed to see VF-addressed traffic, as is currently the case for mlx4
and mlx5 (AFAIK). It will be up to each PMD, but they must all follow the
same logic.

A flow rule with a VF pattern item should not be allowed on a PF device if
the PF is either unable to receive VF-addressed traffic, or if doing so
would prevent traffic from being received by a VF when the flow rule
specifies that it is supposed to pass through (either implicitly or through
a VF action).

Simply matching the MAC address of a VF from a PF (without specifying the VF
pattern item) should be allowed though. It may not work as packets may not
be received at all, but if it does, the application should take care of the
consequences as the VF may no longer receive packets.

Creating or updating the MAC address of a VF after adding a conflicting
flow rule on a PF should not be allowed or remain undefined.

All of this is not described in the specification yet because PF/VF patterns
and actions are not fully defined at the moment, there is still some
uncertainty about them.

> I should reiterate that I think this will be a big improvement, so thank
> you for proposing it.

Thanks!

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-03 Thread Adrien Mazarguil

Hi John,

I'm replying below to both messages.

On Tue, Aug 02, 2016 at 11:19:15AM -0700, John Fastabend wrote:
> On 16-07-23 02:10 PM, John Fastabend wrote:
> > On 16-07-21 12:20 PM, Adrien Mazarguil wrote:
> >> Hi Jerin,
> >>
> >> Sorry, looks like I missed your reply. Please see below.
> >>
> > 
> > Hi Adrian,
> > 
> > Sorry for a bit delay but a few comments that may be worth considering.
> > 
> > To start with completely agree on the general problem statement and the
> > nice summary of all the current models. Also good start on this.

Thanks.

> >> Considering that allowed pattern/actions combinations cannot be known in
> >> advance and would result in an unpractically large number of capabilities 
> >> to
> >> expose, a method is provided to validate a given rule from the current
> >> device configuration state without actually adding it (akin to a "dry run"
> >> mode).
> > 
> > Rather than have a query/validate process why did we jump over having an
> > intermediate representation of the capabilities? Here you state it is
> > unpractical but we know how to represent parse graphs and the drivers
> > could report their supported parse graph via a single query to a middle
> > layer.
> > 
> > This will actually reduce the msg chatter imagine many applications at
> > init time or in boundary cases where a large set of applications come
> > online at once and start banging on the interface all at once seems less
> > than ideal.

Well, I also thought about a kind of graph to represent capabilities but
feared the extra complexity would not be worth the trouble, thus settled on
the query idea. A couple more reasons:

- Capabilities evolve at the same time as devices are configured. For
  example, if a device supports a single RSS context, then a single rule
  with a RSS action may be created. The graph would have to be rewritten
  accordingly and thus queried/parsed again by the application.

- Expressing capabilities at bit granularity (say, for a matching pattern
  item mask) is complex, there is no way to simplify the representation of
  capabilities without either losing information or making the graph more
  complex to parse than simply providing a flow rule from an application
  point of view.
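
The first point can be illustrated with a short sketch in C, assuming a toy
device that owns a fixed number of RSS contexts. The demo_* names are
hypothetical and the real validation function is not specified here; the
sketch only shows why the answer to the same query changes once a rule has
been created.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_rss_dev {
	unsigned int rss_contexts; /* RSS contexts offered by the device */
	unsigned int rss_in_use;   /* RSS contexts taken by existing rules */
};

/* Dry run: checks a rule against the current state without changing it. */
static int
demo_rss_validate(const struct demo_rss_dev *dev, bool wants_rss)
{
	assert(dev != NULL);
	if (wants_rss && dev->rss_in_use >= dev->rss_contexts)
		return -EBUSY; /* every RSS context is already taken */
	return 0;
}

/* Creation runs the same check, then commits the resource usage. */
static int
demo_rss_create(struct demo_rss_dev *dev, bool wants_rss)
{
	int ret = demo_rss_validate(dev, wants_rss);

	if (ret != 0)
		return ret;
	if (wants_rss)
		++dev->rss_in_use;
	return 0;
}
```

A static capability graph would have to be regenerated after the call to
demo_rss_create(), while the query simply starts returning an error.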

With that in mind, I am not opposed to the idea, both methods could even
coexist, with the query function eventually evolving to become a front-end
to a capability graph. Just remember that I am only defining the
fundamentals for the initial implementation, i.e. how rules are expressed as
patterns/actions and the basic functions to manage them, ideally without
having to redefine them ever.

> A bit more details on possible interface for capabilities query,
> 
> One way I've used to describe these graphs from driver to software
> stacks is to use a set of structures to build the graph. For fixed
> graphs this could just be *.h file for programmable hardware (typically
> coming from fw update on nics) the driver can read the parser details
> out of firmware and render the structures.

I understand, however I think this approach may be too low-level to express
all the possible combinations. This graph would have to include possible
actions for each possible pattern, all while considering that some actions
are not possible with some patterns and that there are exclusive actions.

Also while memory consumption is not really an issue, such a graph may be
huge. It could take a while for the PMD to update it when adding a rule
impacting capabilities.

> I've done this two ways: one is to define all the fields in their
> own structures using something like,
> 
> struct field {
>   char *name;
>   u32 uid;
>   u32 bitwidth;
> };
> 
> This gives a unique id (uid) for each field along with its
> width and a user friendly name. The fields are organized into
> headers via a header structure,
> 
> struct header_node {
>   char *name;
>   u32 uid;
>   u32 *fields;
>   struct parse_graph *jump;
> };
> 
> Each node has a unique id and then a list of fields. Where 'fields'
> is a list of uid's of fields its also easy enough to embed the field
> struct in the header_node if that is simpler its really a style
> question.
> 
> The 'struct parse_graph' gives the list of edges from this header node
> to other header nodes. Using a parse graph structure defined
> 
> struct parse_graph {
>   struct field_reference ref;
>   __u32 jump_uid;
> };
> 
> Again as a matter of style you can embed the parse graph in the header
> node as I did above or do it as its own object.
> 
> The field_reference noted below gives the id of the field and the value
> e.

[dpdk-dev] [PATCH v2] net/mlx5: Fix possible NULL deref in RX path

2016-08-02 Thread Adrien Mazarguil

On Tue, Aug 02, 2016 at 03:02:18PM +0300, Sagi Grimberg wrote:
> The user is allowed to call ->rx_pkt_burst() even without free
> mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
> on the first iteration (where pkt is still NULL). This would cause us
> to deref a NULL pkt (reset refcount and free).
> 
> Fix this by checking the pkt before freeing it.
> 
> Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")
> Signed-off-by: Sagi Grimberg 
> ---
> Changes from v1:
> - check pkt only once in case we failed to allocate a buffer
> 
>  drivers/net/mlx5/mlx5_rxtx.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index fce3381ae87a..37573668e43e 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1572,6 +1572,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
> uint16_t pkts_n)
>   rte_prefetch0(wqe);
>   rep = rte_mbuf_raw_alloc(rxq->mp);
>   if (unlikely(rep == NULL)) {
> + ++rxq->stats.rx_nombuf;
> + if (!pkt) {
> + /*
> +  * no buffers before we even started,
> +  * bail out silently.
> +  */
> + break;
> + }
>   while (pkt != seg) {
>   assert(pkt != (*rxq->elts)[idx]);
>   seg = NEXT(pkt);
> @@ -1579,7 +1587,6 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
> uint16_t pkts_n)
>   __rte_mbuf_raw_free(pkt);
>   pkt = seg;
>   }
> - ++rxq->stats.rx_nombuf;
>   break;
>   }
>   if (!pkt) {
> -- 
> 1.9.1

A few nit-picks from check-git-log.sh:

 Wrong headline uppercase:
 net/mlx5: Fix possible NULL deref in RX path
 Wrong headline lowercase:
 net/mlx5: Fix possible NULL deref in RX path
 Missing blank line after 'Fixes' tag:
 Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")

Otherwise,

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: Fix possible NULL deref in RX path

2016-08-02 Thread Adrien Mazarguil

On Tue, Aug 02, 2016 at 01:47:55PM +0300, Sagi Grimberg wrote:
> 
> 
> On 02/08/16 12:58, Adrien Mazarguil wrote:
> >On Tue, Aug 02, 2016 at 12:31:35PM +0300, Sagi Grimberg wrote:
> >>
> >>
> >>On 01/08/16 19:43, Adrien Mazarguil wrote:
> >>>Hi Sagi,
> >>>
> >>>On Mon, Aug 01, 2016 at 11:44:21AM +0300, Sagi Grimberg wrote:
> >>>>The user is allowed to call ->rx_pkt_burst() even without free
> >>>>mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
> >>>>on the first iteration (where pkt is still NULL). This would cause us
> >>>>to deref a NULL pkt (reset refcount and free).
> >>>>
> >>>>Fix this by checking the pkt before freeing it.
> >>>
> >>>Just to be sure, did you get an actual NULL deref crash here or is that an
> >>>assumed possibility?
> >>>
> >>>I'm asking because this problem was supposed to be addressed by:
> >>>
> >>>a1bdb71a32da ("net/mlx5: fix crash in Rx")
> >>
> >>I actually got the NULL deref. This happens when the application doesn't
> >>restore mbufs to the pool correctly. In the case rte_mbuf_raw_alloc
> >>will fail on the first iteration (pkt wasn't assigned) unlike the
> >>condition handled in a1bdb71a32da.
> >>
> >>With this applied, I didn't see the crash.
> >
> >Thanks for confirming this,
> 
> Hey Adrien, I just noticed that I missed the rest of
> your response in the previous message (pre-coffee mail
> browsing...)
> 
> You analysis was on spot.
> 
> >now what about the different approach I
> >suggested in my previous message to avoid the extra check in the inner loop:
> >
> > if (!pkt)
> > pkt = seg;
> > while (pkt != seg) {
> >  ...
> > }
> 
> We can go this way, but it looks kinda confusing to set pkt = seg and
> then iterate on pkt != seg.
> 
> How about a more explicit approach:
> --
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index fce3381ae87a..37573668e43e 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1572,6 +1572,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts,
> uint16_t pkts_n)
> rte_prefetch0(wqe);
> rep = rte_mbuf_raw_alloc(rxq->mp);
> if (unlikely(rep == NULL)) {
> +   ++rxq->stats.rx_nombuf;
> +   if (!pkt) {
> +   /*
> +* no buffers before we even started,
> +* bail out silently.
> +*/
> +   break;
> +   }
> while (pkt != seg) {
> assert(pkt != (*rxq->elts)[idx]);
> seg = NEXT(pkt);
> @@ -1579,7 +1587,6 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts,
> uint16_t pkts_n)
> __rte_mbuf_raw_free(pkt);
> pkt = seg;
> }
> -   ++rxq->stats.rx_nombuf;
> break;
> }
> if (!pkt) {
> --

Yes, that's also fine.

> >Also the fixes line in your commit message?
> 
> I'll add it in v2. Thanks.

Go ahead, thanks!

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: Fix possible NULL deref in RX path

2016-08-02 Thread Adrien Mazarguil

On Tue, Aug 02, 2016 at 12:31:35PM +0300, Sagi Grimberg wrote:
> 
> 
> On 01/08/16 19:43, Adrien Mazarguil wrote:
> >Hi Sagi,
> >
> >On Mon, Aug 01, 2016 at 11:44:21AM +0300, Sagi Grimberg wrote:
> >>The user is allowed to call ->rx_pkt_burst() even without free
> >>mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
> >>on the first iteration (where pkt is still NULL). This would cause us
> >>to deref a NULL pkt (reset refcount and free).
> >>
> >>Fix this by checking the pkt before freeing it.
> >
> >Just to be sure, did you get an actual NULL deref crash here or is that an
> >assumed possibility?
> >
> >I'm asking because this problem was supposed to be addressed by:
> >
> > a1bdb71a32da ("net/mlx5: fix crash in Rx")
> 
> I actually got the NULL deref. This happens when the application doesn't
> restore mbufs to the pool correctly. In the case rte_mbuf_raw_alloc
> will fail on the first iteration (pkt wasn't assigned) unlike the
> condition handled in a1bdb71a32da.
> 
> With this applied, I didn't see the crash.

Thanks for confirming this, now what about the different approach I
suggested in my previous message to avoid the extra check in the inner loop:

 if (!pkt)
 pkt = seg;
 while (pkt != seg) {
  ...
 }

Also the fixes line in your commit message?

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: Fix possible NULL deref in RX path

2016-08-01 Thread Adrien Mazarguil

Hi Sagi,

On Mon, Aug 01, 2016 at 11:44:21AM +0300, Sagi Grimberg wrote:
> The user is allowed to call ->rx_pkt_burst() even without free
> mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
> on the first iteration (where pkt is still NULL). This would cause us
> to deref a NULL pkt (reset refcount and free).
> 
> Fix this by checking the pkt before freeing it.

Just to be sure, did you get an actual NULL deref crash here or is that an
assumed possibility?

I'm asking because this problem was supposed to be addressed by:

 a1bdb71a32da ("net/mlx5: fix crash in Rx")

> Signed-off-by: Sagi Grimberg 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index fce3381ae87a..a07cc4794023 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1572,7 +1572,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
> uint16_t pkts_n)
>   rte_prefetch0(wqe);
>   rep = rte_mbuf_raw_alloc(rxq->mp);
>   if (unlikely(rep == NULL)) {
> - while (pkt != seg) {
> + while (pkt && pkt != seg) {
>   assert(pkt != (*rxq->elts)[idx]);
>   seg = NEXT(pkt);
>   rte_mbuf_refcnt_set(pkt, 0);
> -- 
> 1.9.1

I've reviewed your patch and it indeed seems to address an issue, please
confirm my analysis below.

When rep cannot be allocated and is thus NULL, pkt is either still NULL
because the first packet segment has not been seen yet, or it points to the
first segment.

Either way at this point, seg points to the current segment to process in
the queue and is never NULL.

Thus when pkt is still NULL (first segment) and rep cannot be allocated, the
comparison (pkt != seg) between a valid pointer (seg) and NULL (pkt)
succeeds and the loop body is entered with a NULL pkt. This case is not
caught by the assert() statement and a crash occurs.

We really want to avoid useless code in the data path, particularly inside
loops. The extra check you added is performed for each iteration, so what
about modifying your patch by adding the following if statement instead?

 if (!pkt)
 pkt = seg;
 while (pkt != seg) {
  ...
 }
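
For reference, the effect of that guard can be reproduced outside the driver
with a stand-alone sketch in C. It uses a made-up singly linked segment type
instead of rte_mbuf and a simplified loop, so it is not the mlx5 code, only
a model of the same idea: when nothing has been accumulated yet, aliasing
pkt to seg turns the cleanup loop into a no-op. It also assumes seg is
reachable from pkt whenever pkt is not NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet segment; not struct rte_mbuf. */
struct demo_seg {
	struct demo_seg *next; /* next segment of the same packet */
	bool freed;            /* set once the segment has been released */
};

/*
 * Release the segments accumulated so far, from pkt up to but excluding
 * seg, and return how many were released.
 */
static unsigned int
demo_release_partial(struct demo_seg *pkt, struct demo_seg *seg)
{
	unsigned int released = 0;

	assert(seg != NULL);
	if (pkt == NULL)
		pkt = seg; /* nothing accumulated: skip the loop entirely */
	while (pkt != seg) {
		struct demo_seg *next = pkt->next;

		pkt->freed = true;
		++released;
		pkt = next;
	}
	return released;
}
```

The check is performed once before the loop, so the loop condition itself
stays as cheap as it was.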

I guess you could add a "Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")"
line to your commit log as well, because the original patch only solved half
of the issue.

Thanks.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] doc: announce driver name changes

2016-07-22 Thread Adrien Mazarguil

On Fri, Jul 22, 2016 at 02:15:39PM +, De Lara Guarch, Pablo wrote:
> 
> 
> > -Original Message-
> > From: Yigit, Ferruh
> > Sent: Friday, July 22, 2016 2:19 PM
> > To: De Lara Guarch, Pablo; dev at dpdk.org; Mcnamara, John
> > Subject: Re: [dpdk-dev] [PATCH] doc: announce driver name changes
> > 
> > On 7/22/2016 1:54 PM, Adrien Mazarguil wrote:
> > > Hi Pablo,
> > >
> > > On Fri, Jul 22, 2016 at 12:37:22PM +, De Lara Guarch, Pablo wrote:
> > >> Hi,
> > >>
> > >>> -Original Message-
> > >>> From: De Lara Guarch, Pablo
> > >>> Sent: Saturday, July 09, 2016 5:57 PM
> > >>> To: dev at dpdk.org
> > >>> Cc: Mcnamara, John; De Lara Guarch, Pablo
> > >>> Subject: [PATCH] doc: announce driver name changes
> > >>>
> > >>> Driver names for all the supported devices in DPDK do not have
> > >>> a naming convention. Some are using a prefix, some are not
> > >>> and some have long names. Driver names are used when creating
> > >>> virtual devices, so it is useful to have consistency in the names.
> > >>>
> > >>> Signed-off-by: Pablo de Lara 
> > >>> ---
> > >>>  doc/guides/rel_notes/deprecation.rst | 5 +
> > >>>  1 file changed, 5 insertions(+)
> > >>>
> > >>> diff --git a/doc/guides/rel_notes/deprecation.rst
> > >>> b/doc/guides/rel_notes/deprecation.rst
> > >>> index f502f86..37d65c8 100644
> > >>> --- a/doc/guides/rel_notes/deprecation.rst
> > >>> +++ b/doc/guides/rel_notes/deprecation.rst
> > >>> @@ -41,3 +41,8 @@ Deprecation Notices
> > >>>  * The mempool functions for single/multi producer/consumer are
> > >>> deprecated and
> > >>>will be removed in 16.11.
> > >>>It is replaced by rte_mempool_generic_get/put functions.
> > >>> +
> > >>> +* Driver names are quite inconsistent among each others and they will
> > be
> > >>> +  renamed to something more consistent (net_ prefix for net drivers and
> > >>> +  crypto_ for crypto drivers) in 16.11. Some of these driver names are
> > used
> > >>> +  publicly, to create virtual devices, so a deprecation notice is 
> > >>> necessary.
> > >>> --
> > >>> 2.7.4
> > >>
> > >> Any more comments on this (apart from Christian Ehrhardt's)?
> > >
> > > Yes, since you're suggesting to prefix driver names, shall 
> > > "librte_pmd_mlx5"
> > > really become "net_librte_pmd_mlx5" or shortened to "net_mlx5" instead?
> > >
> > > What about using a '/' separator instead of '_'?
> > >
> > > Will this impact directories as well ("net/mlx5" -> "net/net_mlx5")?
> > >
> 
> We will leave these untouched, although I don't think renaming the 
> directories was necessary.

My feeling as well, the deprecation notice wasn't clear about the extent of
name changes.

> > For physical net devices, driver name is same as folder name (mlnx5,
> > ixgbe ...)
> > 
> > For virtual net devices, driver name is folder name with "eth_" prefix
> > (eth_pcap, eth_ring)
> > 
> > Driver names for net devices looks consistent already, I don't know
> > about crypto devices but if crypto driver names are inconsistent what do
> > you think renaming crypto drivers only?
> 
> Sure, as long as virtual Ethernet devices are consistent, I think it is ok.
> My main intention here was to have consistent (and short) driver names,
> to call rte_eal_vdev_init (or --vdev in command line).

So what about using "net/" instead of "net_" to share names with commit
prefixes and folders?

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] doc: announce driver name changes

2016-07-22 Thread Adrien Mazarguil

Hi Pablo,

On Fri, Jul 22, 2016 at 12:37:22PM +, De Lara Guarch, Pablo wrote:
> Hi,
> 
> > -Original Message-
> > From: De Lara Guarch, Pablo
> > Sent: Saturday, July 09, 2016 5:57 PM
> > To: dev at dpdk.org
> > Cc: Mcnamara, John; De Lara Guarch, Pablo
> > Subject: [PATCH] doc: announce driver name changes
> > 
> > Driver names for all the supported devices in DPDK do not have
> > a naming convention. Some are using a prefix, some are not
> > and some have long names. Driver names are used when creating
> > virtual devices, so it is useful to have consistency in the names.
> > 
> > Signed-off-by: Pablo de Lara 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index f502f86..37d65c8 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -41,3 +41,8 @@ Deprecation Notices
> >  * The mempool functions for single/multi producer/consumer are
> > deprecated and
> >will be removed in 16.11.
> >It is replaced by rte_mempool_generic_get/put functions.
> > +
> > +* Driver names are quite inconsistent among each others and they will be
> > +  renamed to something more consistent (net_ prefix for net drivers and
> > +  crypto_ for crypto drivers) in 16.11. Some of these driver names are used
> > +  publicly, to create virtual devices, so a deprecation notice is 
> > necessary.
> > --
> > 2.7.4
> 
> Any more comments on this (apart from Christian Ehrhardt's)?

Yes, since you're suggesting to prefix driver names, shall "librte_pmd_mlx5"
really become "net_librte_pmd_mlx5" or shortened to "net_mlx5" instead?

What about using a '/' separator instead of '_'?

Will this impact directories as well ("net/mlx5" -> "net/net_mlx5")?

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-21 Thread Adrien Mazarguil

Hi Jerin,

Sorry, looks like I missed your reply. Please see below.

On Mon, Jul 11, 2016 at 04:11:43PM +0530, Jerin Jacob wrote:
> On Tue, Jul 05, 2016 at 08:16:46PM +0200, Adrien Mazarguil wrote:
> 
> Hi Adrien,
> 
> Overall this proposal looks very good. I could easily map to the
> classification hardware engines I am familiar with.

Great, thanks.

> > Priorities
> > ~~~~~~~~~~
> > 
> > A priority can be assigned to a matching pattern.
> > 
> > The default priority level is 0 and is also the highest. Support for more
> > than a single priority level in hardware is not guaranteed.
> > 
> > If a packet is matched by several filters at a given priority level, the
> > outcome is undefined. It can take any path and can even be duplicated.
> 
> In some cases fatal unrecoverable error too

Right, do you think I need to elaborate regarding unrecoverable errors?

How much unrecoverable by the way? Like not being able to receive any more
packets?

> > Matching pattern items for packet data must be naturally stacked (ordered
> > from lowest to highest protocol layer), as in the following examples:
> > 
> > +-------------------+
> > | TCPv4 as L4       |
> > +===+===============+
> > | 0 | Ethernet      |
> > +---+---------------+
> > | 1 | IPv4          |
> > +---+---------------+
> > | 2 | TCP           |
> > +---+---------------+
> > 
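> > +-------------------+
> > | TCPv6 in VXLAN    |
> > +===+===============+
> > | 0 | Ethernet      |
> > +---+---------------+
> > | 1 | IPv4          |
> > +---+---------------+
> > | 2 | UDP           |
> > +---+---------------+
> > | 3 | VXLAN         |
> > +---+---------------+
> > | 4 | Ethernet      |
> > +---+---------------+
> > | 5 | IPv6          |
> > +---+---------------+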
> 
> How about enumerating as "Inner-IPV6" flow type to avoid any confusion. 
> Though spec
> can be same for both IPv6 and Inner-IPV6.

I'm not sure, if we have more than two encapsulated IPv6 headers, knowing
that one of them is "inner" is not really useful. This is why I chose to
enforce the stack ordering instead, I think it makes more sense.

> > | 6 | TCP           |
> > +---+---------------+
> > 
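> > +-----------------------------+
> > | TCPv4 as L4 with meta items |
> > +===+=========================+
> > | 0 | VOID                    |
> > +---+-------------------------+
> > | 1 | Ethernet                |
> > +---+-------------------------+
> > | 2 | VOID                    |
> > +---+-------------------------+
> > | 3 | IPv4                    |
> > +---+-------------------------+
> > | 4 | TCP                     |
> > +---+-------------------------+
> > | 5 | VOID                    |
> > +---+-------------------------+
> > | 6 | VOID                    |
> > +---+-------------------------+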
> > 
> > The above example shows how meta items do not affect packet data matching
> > items, as long as those remain stacked properly. The resulting matching
> > pattern is identical to "TCPv4 as L4".
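
A short sketch in C of what "identical" means for the two patterns quoted
above: comparing item lists while skipping VOID entries. It assumes patterns
are arrays of item types terminated by an END marker; the demo_* names and
that representation are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical item types; DEMO_ITEM_END terminates a pattern. */
enum demo_item_type {
	DEMO_ITEM_END,
	DEMO_ITEM_VOID, /* meta item, ignored when matching packet data */
	DEMO_ITEM_ETH,
	DEMO_ITEM_IPV4,
	DEMO_ITEM_IPV6,
	DEMO_ITEM_UDP,
	DEMO_ITEM_TCP,
	DEMO_ITEM_VXLAN,
};

/* Advance past any VOID items and return the next meaningful one. */
static const enum demo_item_type *
demo_skip_void(const enum demo_item_type *item)
{
	while (*item == DEMO_ITEM_VOID)
		++item;
	return item;
}

/* Two patterns are equivalent if they only differ by VOID items. */
static bool
demo_pattern_equal(const enum demo_item_type *a,
		   const enum demo_item_type *b)
{
	assert(a != NULL && b != NULL);
	for (;;) {
		a = demo_skip_void(a);
		b = demo_skip_void(b);
		if (*a != *b)
			return false;
		if (*a == DEMO_ITEM_END)
			return true;
		++a;
		++b;
	}
}
```

Applied to the "TCPv4 as L4" and "TCPv4 as L4 with meta items" tables, the
comparison reports both as the same pattern.
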
> > 
> > +-------------------+
> > | UDPv6 anywhere    |
> > +===+===============+
> > | 0 | IPv6          |
> > +---+---------------+
> > | 1 | UDP           |
> > +---+---------------+
> > 
> > If supported by the PMD, omitting one or several protocol layers at the
> > bottom of the stack as in the above example (missing an Ethernet
> > specification) enables hardware to look anywhere in packets.
> 
> It would be good if the common code can give it as Ethernet, IPV6, UDP
> to PMD(to avoid common code duplication across PMDs)

I left this mostly at PMD's discretion for now. Applications must provide
explicit rules if they need a consistent behavior. PMDs may not support this
at all, I've just documented what applications should expect when attempting
this kind of pattern.

> > It is unspecified whether the payload of supported encapsulations
> > (e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
> > inner, outer or both packets.
> 
> a separate flow type enumeration may fix that problem. like "Inner-IPV6"
> mentioned above.

Not sure about that, for the same reason as above. Which "inner" level would
be matched by such a pattern? Note that it could have started with VXLAN
followed by ETH and then IPv6 if the application cared.

This is basically the ability to remain vague about a rule. I didn't want to
forbid it outright because I'm sure there are possible use cases:

- PMD validation and debugging.

- Rough filtering according to protocols a packet might contain somewhere
  (think of the network admins who cannot stand anything ot

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-21 Thread Adrien Mazarguil

Hi Rahul,

Please see below.

On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> Hi Adrien,
> 
> The proposal looks very good.  It satisfies most of the features
> supported by Chelsio NICs.  We are looking for suggestions on exposing
> more additional features supported by Chelsio NICs via this API.
> 
> Chelsio NICs have two regions in which filters can be placed -
> Maskfull and Maskless regions.  As their names imply, maskfull region
> can accept masks to match a range of values; whereas, maskless region
> don't accept any masks and hence perform a more strict exact-matches.
> Filters without masks can also be placed in maskfull region.  By
> default, maskless region have higher priority over the maskfull region.
> However, the priority between the two regions is configurable.

I understand this configuration affects the entire device. Just to be clear,
assuming some filters are already configured, are they affected by a change
of region priority later?

> Please suggest on how we can let the apps configure in which region
> filters must be placed and set the corresponding priority accordingly
> via this API.

Okay. Applications, like customers, are always right.

With this in mind, PMDs are not allowed to break existing flow rules, and
face two options when applications provide a flow specification that would
break an existing rule:

- Refuse to create it (easiest).

- Find a workaround to apply it anyway (possibly quite complicated).

The reason you have these two regions is performance, right? Otherwise I'd
just say put everything in the maskfull region.

PMDs are allowed to rearrange existing rules or change device parameters as
long as existing constraints are satisfied. In my opinion it does not matter
which region has the highest default priority. Applications always want the
best performance so the first created rule should be in the most appropriate
region.

If a subsequent rule requires it to be in the other region but the
application specified the wrong priority for this to work, then the PMD can
either choose to swap region priorities on the device (assuming it does not
affect other rules), or destroy and recreate the original rule in a way that
satisfies all constraints (i.e. moving conflicting rules from the maskless
region to the maskfull one).

Going further, when subsequent rules get destroyed the PMD should ideally
move maskfull rules back into the maskless region for better
performance.

This is only a suggestion. PMDs have the right to say "no" at some point.
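
To make the placement idea a bit more concrete, here is a minimal sketch of
how a PMD might pick a region for a new rule. It assumes a made-up device
model with a fixed number of slots per region; none of the toy_* names exist
in DPDK or in any Chelsio driver, and rearranging existing rules is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device with two filter regions. */
enum toy_region {
	TOY_REGION_NONE = -1, /* rule could not be placed */
	TOY_REGION_MASKLESS,  /* exact-match only, assumed faster */
	TOY_REGION_MASKFULL   /* accepts masks, also takes exact matches */
};

struct toy_dev {
	unsigned int maskless_free; /* remaining exact-match slots */
	unsigned int maskfull_free; /* remaining masked slots */
};

/*
 * Pick a region for a new rule without touching existing ones.
 * On TOY_REGION_NONE a PMD would either refuse the rule or attempt the
 * (more complicated) rearrangement of existing rules.
 */
static enum toy_region
toy_place_rule(struct toy_dev *dev, bool has_mask)
{
	assert(dev != NULL);
	if (!has_mask && dev->maskless_free > 0) {
		--dev->maskless_free;
		return TOY_REGION_MASKLESS;
	}
	if (dev->maskfull_free > 0) {
		--dev->maskfull_free;
		return TOY_REGION_MASKFULL;
	}
	return TOY_REGION_NONE;
}
```

The application never names a region here; it only describes the rule, and
the placement decision stays inside the PMD.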

More important in my opinion is to make sure applications can create a given
set of flow rules in any order. If rules a/b/c can be created, then it won't
make sense from an application point of view if c/a/b for some reason cannot
and the PMD maintainers will rightfully get a bug report.

> More comments below.
> 
> On Tuesday, July 07/05/16, 2016 at 20:16:46 +0200, Adrien Mazarguil wrote:
> > Hi All,
> > 
> [...]
> 
> > 
> > ``ETH``
> > ^^^^^^^
> > 
> > Matches an Ethernet header.
> > 
> > - ``dst``: destination MAC.
> > - ``src``: source MAC.
> > - ``type``: EtherType.
> > - ``tags``: number of 802.1Q/ad tags defined.
> > - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> > 
> >  - ``tpid``: Tag protocol identifier.
> >  - ``tci``: Tag control information.
> > 
> > ``IPV4``
> > ^^^^^^^^
> > 
> > Matches an IPv4 header.
> > 
> > - ``src``: source IP address.
> > - ``dst``: destination IP address.
> > - ``tos``: ToS/DSCP field.
> > - ``ttl``: TTL field.
> > - ``proto``: protocol number for the next layer.
> > 
> > ``IPV6``
> > ^^^^^^^^
> > 
> > Matches an IPv6 header.
> > 
> > - ``src``: source IP address.
> > - ``dst``: destination IP address.
> > - ``tc``: traffic class field.
> > - ``nh``: Next header field (protocol).
> > - ``hop_limit``: hop limit field (TTL).
> > 
> > ``ICMP``
> > ^^^^^^^^
> > 
> > Matches an ICMP header.
> > 
> > - TBD.
> > 
> > ``UDP``
> > ^^^^^^^
> > 
> > Matches a UDP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - ``length``: UDP length.
> > - ``checksum``: UDP checksum.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``TCP``
> > ^^^^^^^
> > 
> > Matches a TCP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - All other TCP fields and bits.
> > 
> > ``VXLAN``
> > ^^^^^^^^^
> > 
> > Matches a VXLAN header.
> > 
> > - TBD.
> > 
> 
> In addition to above ma

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-20 Thread Adrien Mazarguil

Hi Sugesh,

Please see below.

On Wed, Jul 20, 2016 at 04:32:50PM +, Chandran, Sugesh wrote:
[...]
> > > How about a hardware flow flag in the packet descriptor that is set when
> > > a packet hits any hardware rule? This way software does not have to
> > > worry about or be blocked by a hardware rule. Even though there is an
> > > additional overhead of validating this flag, the software datapath can
> > > easily identify hardware-processed packets.
> > > This way packets traverse the software fallback path until the rule
> > > configuration is complete. This flag avoids setting an ID action for
> > > every hardware flow being configured.
> > 
> > That makes sense. I see it as a sort of single bit ID but it could be
> > implemented through a different action for less capable devices. PMDs that
> > support 32 bit IDs could reuse the same code for both features.
> > 
> > I understand you'd prefer having this feature always present, however we
> > already know that not all PMDs/devices support it, and like everything else
> > this is a kind of offload that needs to be explicitly requested by the
> > application as it may not be needed.
> > 
> > If we go with the separate action, then perhaps it would make sense to
> > rename "ID" to "MARK" to make things clearer:
> > 
> >  RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */
> > 
> >  RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */
> > 
> > I guess the result of the FLAG action would be something in ol_flags.
> > 
> [Sugesh] This looks fine for me.

Great, I will update the specification accordingly.

> > Thoughts?
> > 
> [Sugesh] Two more queries that I missed out in the earlier comments are,
> Support for PTYPE :- Intel NICs can report packet type in mbuf.
> This can be used by software for the packet processing. Is generic API
> capable of handling that as well? 

Yes, however no PTYPE action has been defined for this (yet). It is only a
matter of adding one.

Currently packet type recognition is enabled per port using a separate API.
Correct me if I'm wrong, but I am not aware of any adapter with the ability
to enable it per flow rule, so I do not think such an action needs to be
defined from the start. We may add it later.

> RSS hashing support :- Just to confirm, the RSS flow action allows application
> to decide the header fields to produce the hash. This gives
> programmability on load sharing across different queues. The
> application can program the NIC to calculate the RSS hash only using mac or
> mac+ip or ip only using this.

I'd say yes but from your summary, I'm not sure we share the same idea of
what the RSS action is supposed to do, so here is mine.

As with all flow rules, the pattern part of a rule carrying the RSS action
only selects the packets on which the action will be performed.

The rss_conf parameter (struct rte_eth_rss_conf) only provides a key and a
RSS hash function to use (ETH_RSS_IPV4, ETH_RSS_NONFRAG_IPV6_UDP, etc).

Nothing prevents the RSS hash function from being applied to protocol
headers which are not necessarily present in the flow rule pattern. These
are two independent things, e.g. you could have a pattern matching IPv4
packets yet perform RSS hashing only on UDP headers.

Finally, the RSS action configuration only affects packets coming from this
flow rule. It is not performed on the device globally so packets which are
not matched are not affected by RSS processing. As a result it might not be
possible to configure two flow rules specifying incompatible RSS actions
simultaneously if the underlying device supports only a single global RSS
context.
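
To illustrate that the pattern and the hashed headers are independent, here
is a small self-contained model. It is only a sketch: the toy_* types and
TOY_RSS_* bits are invented for this example (they are not struct
rte_eth_rss_conf nor the ETH_RSS_* flags), and the mixing step is a cheap
stand-in for a real RSS hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified packet description. */
struct toy_pkt {
	bool is_ipv4;
	bool is_udp;
	uint32_t ip_src, ip_dst;
	uint16_t udp_sport, udp_dport;
};

#define TOY_RSS_IPV4 (1u << 0) /* hash on IPv4 addresses */
#define TOY_RSS_UDP  (1u << 1) /* hash on UDP ports */

struct toy_rss_rule {
	bool match_ipv4;        /* pattern: which packets the rule applies to */
	uint32_t rss_hf;        /* action: which headers feed the hash */
	unsigned int nb_queues; /* action: number of queues to spread over */
};

/* Return the target queue, or -1 if the packet does not match the rule. */
static int
toy_rss_queue(const struct toy_rss_rule *rule, const struct toy_pkt *pkt)
{
	uint32_t hash = 0;

	assert(rule->nb_queues > 0);
	if (rule->match_ipv4 && !pkt->is_ipv4)
		return -1; /* unmatched packets are not affected by this rule */
	if ((rule->rss_hf & TOY_RSS_IPV4) && pkt->is_ipv4)
		hash ^= pkt->ip_src ^ pkt->ip_dst;
	if ((rule->rss_hf & TOY_RSS_UDP) && pkt->is_udp)
		hash ^= ((uint32_t)pkt->udp_sport << 16) | pkt->udp_dport;
	hash *= 2654435761u; /* cheap mixing, not a real Toeplitz hash */
	return (int)((hash >> 16) % rule->nb_queues);
}
```

With match_ipv4 set and rss_hf limited to TOY_RSS_UDP, two IPv4/UDP packets
that differ only by their IP addresses land on the same queue, while a
non-IPv4 packet is not handled by the rule at all.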

Are we on the same page?

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-20 Thread Adrien Mazarguil

> > - Given that the opaque rte_flow pointer associated with a flow rule is to
> >   be stored by the application, PMDs do not even have to keep references to
> >   them.
> Don't understand. More details?

In an application:

 struct rte_flow *foo = rte_flow_create(...);

In the above example, foo cannot be dereferenced by the application or RTE,
only the PMD is aware of its contents. This object can only be used with
rte_flow_*() functions.

PMDs are thus free to make this object grow as needed when adding internal
features without breaking any kind of public API/ABI.

What I meant is, given that the application is supposed to store foo
somewhere in order to destroy it later, the PMD does not have to keep track
of that pointer assuming it does not need to access it later on its own for
some reason.
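
For those less familiar with the pattern, this is the usual opaque handle
idiom in C. A minimal sketch follows; toy_flow and its functions are
hypothetical stand-ins, not the proposed rte_flow API, and everything is
shown in one file only so that it compiles on its own (in practice the
structure definition would live in PMD source files only).

```c
#include <assert.h>
#include <stdlib.h>

/* What the application sees: an incomplete type and a few functions. */
struct toy_flow;
struct toy_flow *toy_flow_create(int queue);
int toy_flow_destroy(struct toy_flow *flow);

/* What only the (hypothetical) PMD sees: the actual layout. */
struct toy_flow {
	int queue;    /* target queue requested by the application */
	int hw_index; /* private bookkeeping, free to change or grow */
};

struct toy_flow *
toy_flow_create(int queue)
{
	struct toy_flow *flow = malloc(sizeof(*flow));

	if (flow == NULL)
		return NULL;
	flow->queue = queue;
	flow->hw_index = -1; /* nothing is programmed in this sketch */
	return flow;
}

int
toy_flow_destroy(struct toy_flow *flow)
{
	assert(flow != NULL);
	free(flow); /* the PMD kept no other reference to it */
	return 0;
}
```

The application stores the returned pointer and hands it back on
destruction; adding fields to struct toy_flow later changes nothing on its
side.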

> > - The flow rules format described in this specification (pattern / actions)
> >   will be used by applications directly, which will be free to arrange
> >   them in lists, trees or in any other way if they need to keep flow
> >   specifications around for further processing.
> Who will create the lists, trees or something else? According to previous
> discussion, I think the APP will program the rules one by one. So if APP
> organize the rules to lists, trees..., PMD doesn't know that.
> And you said " Given that the opaque rte_flow pointer associated with a flow
> rule is to be stored by the application ". I'm lost here.

I guess that's because we're discussing two different things, flow rule
specifications and flow rule objects. Let me sum it up:

- Flow rule specifications are the patterns/actions combinations provided by
  applications to rte_flow_create(). Applications can store those as needed
  and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
  will do it for them.

- Flow rule objects (struct rte_flow *) are generated when a flow rule is
  created. Applications must keep these around if they want to manipulate
  them later (i.e. destroy or query existing rules).

Then PMDs *may* need to keep and arrange flow rule objects internally for
management purposes. Could be because HW requires it, detecting conflicting
rules, managing priorities and so on. Possible reasons are not described in
this API because these are thought of as PMD-specific needs.

> > > When the port is stopped and restarted, rte can reconfigure the rules.
> > > Is the concern that PMD may adjust the sequence of the rules according
> > > to the priority, so every NIC has a different list of rules? But PMD
> > > can adjust them again when rte reconfiguring the rules.
> > 
> > What about PMDs able to stop and restart ports without destroying their own
> > flow rules? If we assume flow rules must be destroyed when stopping a port,
> > these PMDs are needlessly penalized with slower stop/start cycles. Think
> > about it assuming thousands of flow rules.
> I believe the rules maintained by SW should not be destroyed, because they're
> used to be re-programmed when the device starts again.

Do we agree that applications should not care? Flow rules configured before
stopping a port must still be there after restarting it.

What we seem to not agree about is that you think RTE should be responsible
for restoring flow rules of devices that lose them when stopped. I think
doing so is unfair to devices for which it is not the case and not really
nice to applications, so my opinion is that the PMD is responsible for
restoring flow rules however it wants. It is free to use RTE helpers to keep
track of them, as long as it's all managed internally.
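
As a rough sketch of what I mean, consider a made-up PMD for a device that
loses its rules when stopped. All names below are hypothetical. The PMD keeps
its own flow objects, re-programs them on start, and the handles held by the
application remain valid even though the HW slots differ.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FLOWS 4

/* Hypothetical PMD-private flow object; applications only hold pointers. */
struct toy_flow {
	int in_use;
	int queue;   /* behavior requested by the application */
	int hw_slot; /* where the rule currently lives in HW, -1 if none */
};

struct toy_port {
	struct toy_flow flows[TOY_MAX_FLOWS];
	int next_hw_slot;
};

static struct toy_flow *
toy_flow_create(struct toy_port *port, int queue)
{
	assert(port != NULL);
	for (size_t i = 0; i < TOY_MAX_FLOWS; ++i) {
		struct toy_flow *flow = &port->flows[i];

		if (flow->in_use)
			continue;
		flow->in_use = 1;
		flow->queue = queue;
		flow->hw_slot = port->next_hw_slot++;
		return flow;
	}
	return NULL;
}

/* This device loses its HW rules when stopped. */
static void
toy_port_stop(struct toy_port *port)
{
	for (size_t i = 0; i < TOY_MAX_FLOWS; ++i)
		port->flows[i].hw_slot = -1;
}

/* The PMD re-programs every rule; HW slots may differ, handles do not. */
static void
toy_port_start(struct toy_port *port)
{
	for (size_t i = 0; i < TOY_MAX_FLOWS; ++i)
		if (port->flows[i].in_use)
			port->flows[i].hw_slot = port->next_hw_slot++;
}
```

A PMD for a device that keeps its rules across stop/start would simply make
both functions no-ops as far as flows are concerned, which is why I would
rather not impose the restore step from RTE.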

> > Thus from an application point of view, whatever happens when stopping and
> > restarting a port should not matter. If a flow rule was present before, it
> > must still be present afterwards. If the PMD had to destroy flow rules and
> > re-create them, it does not actually matter if they differ slightly at the
> > HW level, as long as:
> > 
> > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> >   and refer to the same rules.
> > 
> > - The overall behavior of all rules is the same.
> > 
> > The list of rules you think of (patterns / actions) is maintained by
> > applications (not RTE), and only if they need them. RTE would needlessly
> > duplicate this.
> As said before, need more details to understand this. Maybe an example is
> better :)

The generic format both RTE and applications might understand is the one
described in this API (struct rte_flow_pattern and struct
rte_flow_actions).

If we wanted RTE to maintain some sort of per-port state for flow rule
specifications, it would have to be a copy of these structures arranged
somehow (list or something else).

If we consider that PMDs need to keep a context object associated to a flow
rule (the opaque struct rte_flow *), then RTE would most likely have to
store it along with the flow specification.

Such a list may not be useful to applications (list lookups take time), so
they would implement their own redundant method. They might also require
extra room to attach some application context to flow rules. A generic list
cannot plan for it.

Applications know what they want to do with flow rules and are responsible
for managing them efficiently with RTE out of the way.
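
Since an example was requested, here is a minimal sketch of an application
keeping its own records. Everything in it is an assumption made for
illustration: the simplified specification (a UDP port and a queue), the
direct-mapped table and the app_* names are not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_flow; /* opaque handle returned by a hypothetical PMD */

/* Application-defined entry: specification, handle and private context. */
struct app_rule {
	uint16_t udp_dport;      /* simplified flow specification (pattern) */
	uint16_t queue;          /* simplified flow specification (action) */
	struct toy_flow *handle; /* needed later to destroy or query the rule */
	void *app_ctx;           /* room a generic RTE list could not plan for */
};

#define APP_TABLE_SIZE 256 /* must be a power of two */

struct app_table {
	struct app_rule slots[APP_TABLE_SIZE];
	unsigned char used[APP_TABLE_SIZE];
};

/* Direct-mapped lookup keyed on the destination port, no list walk. */
static struct app_rule *
app_rule_lookup(struct app_table *t, uint16_t udp_dport)
{
	size_t i = udp_dport & (APP_TABLE_SIZE - 1);

	assert(t != NULL);
	if (t->used[i] && t->slots[i].udp_dport == udp_dport)
		return &t->slots[i];
	return NULL;
}

/* Returns 0 on success, -1 if the slot is already taken. */
static int
app_rule_insert(struct app_table *t, const struct app_rule *rule)
{
	size_t i = rule->udp_dport & (APP_TABLE_SIZE - 1);

	if (t->used[i])
		return -1;
	t->slots[i] = *rule;
	t->used[i] = 1;
	return 0;
}
```

Another application might prefer a tree or no storage at all, which is the
point: the layout follows the application's needs, not a generic RTE list.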

I'm not sure if this answered your question, if not, please describe a
scenario where a RTE-managed list of flow rules would be mandatory.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-19 Thread Adrien Mazarguil

pinion, the generic API has enough
  constraints for this to work and maintain consistency between flow
  rules. Note this is currently how most PMDs implement FDIR and other
  filter types.

- RTE can (and will) provide helpers to avoid most of the code redundancy,
  PMDs are free to use them or manage everything by themselves.

- Given that the opaque rte_flow pointer associated with a flow rule is to
  be stored by the application, PMDs do not even have to keep references to
  them.

- The flow rules format described in this specification (pattern / actions)
  will be used by applications directly, which will be free to arrange them
  in lists, trees or in any other way if they need to keep flow
  specifications around for further processing.

> When the port is stopped and restarted, rte can reconfigure the rules. Is the 
> concern that PMD may adjust the sequence of the rules according to the 
> priority, so every NIC has a different list of rules? But PMD can adjust them 
> again when rte reconfiguring the rules.

What about PMDs able to stop and restart ports without destroying their own
flow rules? If we assume flow rules must be destroyed when stopping a port,
these PMDs are needlessly penalized with slower stop/start cycles. Think
about it assuming thousands of flow rules.

Thus from an application point of view, whatever happens when stopping and
restarting a port should not matter. If a flow rule was present before, it
must still be present afterwards. If the PMD had to destroy flow rules and
re-create them, it does not actually matter if they differ slightly at the
HW level, as long as:

- Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
  and refer to the same rules.

- The overall behavior of all rules is the same.

The list of rules you think of (patterns / actions) is maintained by
applications (not RTE), and only if they need them. RTE would needlessly
duplicate this.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-18 Thread Adrien Mazarguil

e elapsed time for the
> > > > rules they create, make statistics and extrapolate from this
> > > > information for future rules. I do not think the PMD can help much here.
> > > [Sugesh] From an application point of view this can be an issue.
> > > There is even a security concern when we program a short lived flow.
> > > Let's consider the case:
> > >
> > > 1) Control plane programs the hardware with Queue termination flow.
> > > 2) Software dataplane is programmed to treat the packets from the
> > >    specific queue accordingly.
> > > 3) Remove the flow from the hardware. (Let's consider this is a long
> > >    wait process.)
> > > Or there is even a chance that hardware takes more time to report the
> > > status than to remove it physically. Now the packets in the queue are
> > > no longer considered as matched/flow hit, because the software
> > > dataplane update is yet to happen.
> > > We need a way to sync between software datapath and classifier
> > > APIs even though they are both programmed from a different control
> > > thread.
> > >
> > > Are we saying these APIs are only meant for user defined static flows??
> > 
> > No, that is definitely not the intent. These are good points.
> > 
> > With the specified API, applications may have to adapt their logic and take
> > extra precautions in order to remain on the safe side at all times.
> > 
> > For your above example, the application cannot assume a rule is
> > added/deleted as long as the PMD has not completed the related operation,
> > which means keeping the SW rule/fallback in place in the meantime. This
> > should handle security concerns as long as after removing a rule, packets
> > end up in a default queue entirely processed by SW. Obviously this may
> > worsen response time.
> > 
> > The ID action can help with this. By knowing which rule a received packet is
> > associated with, processing can be temporarily offloaded by another thread
> > without much complexity.
> [Sugesh] Setting ID for every flow may not be viable, especially when the
> size of ID is small (just only 8 bits). I am not sure if this is a valid
> case though.

Agreed, I'm not saying this solution works for all devices, particularly
those that do not support ID at all.

> How about a hardware flow flag in the packet descriptor that is set when a
> packet hits any hardware rule? This way software does not have to worry
> about or be blocked by a hardware rule. Even though there is an additional
> overhead of validating this flag, the software datapath can easily identify
> hardware-processed packets.
> This way packets traverse the software fallback path until the rule
> configuration is complete. This flag avoids setting an ID action for every
> hardware flow being configured.

That makes sense. I see it as a sort of single bit ID but it could be
implemented through a different action for less capable devices. PMDs that
support 32 bit IDs could reuse the same code for both features.

I understand you'd prefer having this feature always present, however we
already know that not all PMDs/devices support it, and like everything else
this is a kind of offload that needs to be explicitly requested by the
application as it may not be needed.

If we go with the separate action, then perhaps it would make sense to
rename "ID" to "MARK" to make things clearer:

 RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */

 RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */

I guess the result of the FLAG action would be something in ol_flags.

Thoughts?
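
To give an idea of the software side, here is a sketch of how a datapath
might consume the results of such FLAG and MARK actions. The toy_mbuf layout
and TOY_RX_* bits are assumptions made up for this example, not existing
mbuf fields or flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet metadata, loosely modeled on an mbuf. */
#define TOY_RX_FLOW_FLAG (1u << 0) /* set by a FLAG action */
#define TOY_RX_FLOW_MARK (1u << 1) /* set by a MARK action, mark is valid */

struct toy_mbuf {
	uint32_t ol_flags;
	uint32_t mark; /* 32 bit value attached by a MARK action */
};

enum toy_path {
	TOY_PATH_SW_FALLBACK, /* no HW rule hit, full software processing */
	TOY_PATH_HW_FLAGGED,  /* some HW rule hit, which one is unknown */
	TOY_PATH_HW_MARKED    /* HW rule hit and identified through *mark */
};

/*
 * Decide how the software datapath treats a received packet. Packets
 * carrying neither bit did not hit a HW rule (or the rule is not in place
 * yet) and take the software fallback path.
 */
static enum toy_path
toy_classify(const struct toy_mbuf *m, uint32_t *mark)
{
	assert(m != NULL && mark != NULL);
	if (m->ol_flags & TOY_RX_FLOW_MARK) {
		*mark = m->mark;
		return TOY_PATH_HW_MARKED;
	}
	if (m->ol_flags & TOY_RX_FLOW_FLAG)
		return TOY_PATH_HW_FLAGGED;
	return TOY_PATH_SW_FALLBACK;
}
```

A device that only supports the single bit would never set TOY_RX_FLOW_MARK,
and the same receive code keeps working.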

> > I think applications have to implement SW fallbacks all the time, as even
> > some sort of guarantee on the flow rule processing time may not be enough
> > to avoid misdirected packets and related security issues.
> [Sugesh] Software fallback will be there always. However I am little bit
> confused on the way software going to identify the packets that are already
> hardware processed. I feel we need some notification in the packet itself,
> when a hardware rule hits. ID/flag/any other options?

Yeah I think so too, as long as it is optional because we cannot assume all
PMDs will support it.

> > Let's wait for applications to start using this API and then consider an
> > extra set of asynchronous / real-time functions when the need arises. It
> > should not impact the way rules are specified.
> [Sugesh] Sure. I think the rule definition may not impact with this.

Thanks for your comments.

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH v4 00/10] Fix build errors related to exported headers

2016-07-18 Thread Adrien Mazarguil

On Fri, Jul 15, 2016 at 10:03:02PM +0100, Bruce Richardson wrote:
> On Wed, Jul 13, 2016 at 03:02:37PM +0200, Adrien Mazarguil wrote:
> > DPDK uses GNU C language extensions in most of its code base. This is fine
> > for internal source files whose compilation flags are controlled by DPDK,
> > however user applications that use exported "public" headers may experience
> > compilation failures when enabling strict error/standard checks (-std and
> > -pedantic for instance).
> > 
> > Exported headers are installed system-wide and must be as clean as possible
> > so applications do not have to resort to workarounds.
> > 
> > This patchset affects exported headers only, compilation problems are
> > addressed as follows:
> > 
> > - Adding the __extension__ keyword to nonstandard constructs (same method
> >   as existing libraries when there is no other choice).
> > - Adding the __extension__ keyword to C11 constructs to remain compatible
> >   with pure C99.
> > - Adding missing includes so exported files can be included out of order
> >   and on their own.
> > - Fixing GNU printf-like variadic macros as there is no magic keyword for
> >   these.
> > 
> 
> Having upgraded to Fedora 24, I'm seeing quite a few errors compiling with gcc
> 6.1.1 in debug mode. Applying this patchset seems to really cut down on those
> errors, so may need to be applied for 16.07 release.
> 
> The remaining error I'm seeing is, in mlx drivers, complaints about the
> pedantic flag (the flag which I think was causing all the other errors to be
> triggered too):
> 
>   error: `-pedantic' is not an option that controls warnings

Saw this as well with GCC 6, I've planned to drop these #pragmas as soon as
possible after this series is applied, however there is some work left to do
on the libibverbs side before that.

> For this set though, I don't see any new errors introduced into gcc or clang
> builds for the libs or drivers, and a number of errors cleared, so:
> 
> Tested-by: Bruce Richardson 

Thanks for testing.
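
For readers who have not looked at the series, the two kinds of change it
makes to exported headers look roughly like this. This is a made-up example
(toy_hdr and TOY_LOG do not exist in DPDK), not an excerpt from the patches.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Anonymous unions are standard only since C11. Marking the member with
 * __extension__ keeps GCC and clang quiet under -std=c99 -pedantic.
 */
struct toy_hdr {
	int type;
	__extension__
	union {
		int as_int;
		unsigned char as_bytes[sizeof(int)];
	};
};

static char toy_log_buf[64];

/*
 * C99 variadic macro. The GNU named form, TOY_LOG(fmt, args...), draws a
 * warning under -pedantic and has no __extension__ equivalent, hence the
 * rewrite to __VA_ARGS__.
 */
#define TOY_LOG(...) snprintf(toy_log_buf, sizeof(toy_log_buf), __VA_ARGS__)

/* Format a header into toy_log_buf, returning the formatted length. */
static int
toy_hdr_describe(const struct toy_hdr *hdr)
{
	assert(hdr != NULL);
	return TOY_LOG("type=%d value=%d", hdr->type, hdr->as_int);
}
```

Building a file like this with -std=c99 -pedantic -Werror is essentially
what the header check script does for each exported header.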

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-15 Thread Adrien Mazarguil

long as the PMD has not completed the related operation,
which means keeping the SW rule/fallback in place in the meantime. This
should handle security concerns as long as after removing a rule, packets
end up in a default queue entirely processed by SW. Obviously this may
worsen response time.

The ID action can help with this. By knowing which rule a received packet is
associated with, processing can be temporarily offloaded by another thread
without much complexity.

I think applications have to implement SW fallbacks all the time, as even
some sort of guarantee on the flow rule processing time may not be enough to
avoid misdirected packets and related security issues.

Let's wait for applications to start using this API and then consider an
extra set of asynchronous / real-time functions when the need arises. It
should not impact the way rules are specified.

> > > > > [Sugesh] Another query is on the synchronization part. What if
> > > > > same rules are handled from different threads? Is application
> > > > > responsible for handling the concurrent hardware programming?
> > > >
> > > > Like most (if not all) DPDK APIs, applications are responsible for
> > > > managing locking issues as described in 4.3 (Behavior). Since this is
> > > > a control path API and applications usually have a single control
> > > > thread, locking should not be necessary in most cases.
> > > >
> > > > Regarding my above comment about using several control threads to
> > > > manage different devices, section 4.3 says:
> > > >
> > > >  "There is no provision for reentrancy/multi-thread safety, although
> > > > nothing should prevent different devices from being configured at
> > > > the same time. PMDs may protect their control path functions
> > > > accordingly."
> > > >
> > > > I'd like to emphasize it is not "per port" but "per device", since
> > > > in a few cases a configurable resource is shared by several ports.
> > > > It may be difficult for applications to determine which ports are
> > > > shared by a given device but this falls outside the scope of this API.
> > > >
> > > > Do you think adding the guarantee that it is always safe to
> > > > configure two different ports simultaneously without locking from
> > > > the application side is necessary? In which case the PMD would be
> > > > responsible for locking shared resources.
> > > [Sugesh] This would be little bit complicated when some of ports are
> > > not under DPDK itself(what if one port is managed by Kernel) Or ports
> > > are tied by different application. Locking in PMD helps when the ports
> > > are accessed by multiple DPDK application. However what if the port
> > > itself not under DPDK?
> > 
> > Well, either we do not care about what happens outside of the DPDK
> > context, or PMDs must find a way to satisfy everyone. I'm not a fan of
> > locking either but it would be nice if flow rules configuration could be
> > attempted on different ports simultaneously without the risk of wrecking
> > anything, so that applications do not need to care.
> > 
> > Possible cases for a dual port device with global flow rule settings
> > affecting both ports:
> > 
> > 1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that needs
> >    to alter a global setting necessary for an existing rule on any port is
> >    not allowed (EEXIST). PMD must maintain a device context common to both
> >    ports in order for this to work. This context is either under lock, or
> >    the first port on which a flow rule is created owns all future flow
> >    rules.
> > 
> > 2) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
> >    it and knows that port 2 may modify the global context: no flow rules can
> >    be created from the DPDK application due to safety issues (EBUSY?).
> > 
> > 3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
> >    it and knows that port 2 will not modify flow rules: PMD should not care,
> >    no lock necessary.
> > 
> > 4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
> >    aware of it: either flow rules cannot be created ever at all, or we say
> >    it is user's responsibility to make sure this does not happen.
> > 
> > Considering that most control operations performed by DPDK affect the
> > device regardless of other applications, I think 1) is the only case that
> > should be defined, otherwise 4), defined as user's responsibility.

No more comments on this part? What do you suggest?

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-13 Thread Adrien Mazarguil

e, the PMD is aware of
   it and knows that port 2 may modify the global context: no flow rules can
   be created from the DPDK application due to safety issues (EBUSY?).

3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
   it and knows that port 2 will not modify flow rules: PMD should not care,
   no lock necessary.

4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
   aware of it: either flow rules cannot be created ever at all, or we say
   it is user's responsibility to make sure this does not happen.

Considering that most control operations performed by DPDK affect the device
regardless of other applications, I think 1) is the only case that should be
defined, otherwise 4), defined as user's responsibility.
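
For case 1), the device context common to both ports could look like the
sketch below. It is only an illustration under assumptions: the toy_dev_ctx
structure, the notion of a single integer "mode" and the function names are
invented, and a real PMD would track more than one global setting.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical context shared by both ports of one device. */
struct toy_dev_ctx {
	pthread_mutex_t lock;
	int global_mode;    /* device-wide setting flow rules depend on */
	unsigned int users; /* number of rules relying on global_mode */
};

/*
 * Called from the flow creation path of either port.
 * Returns 0 on success or -EEXIST when the requested device-wide mode
 * conflicts with the one existing rules depend on.
 */
static int
toy_dev_acquire_mode(struct toy_dev_ctx *ctx, int mode)
{
	int ret = 0;

	pthread_mutex_lock(&ctx->lock);
	if (ctx->users != 0 && ctx->global_mode != mode) {
		ret = -EEXIST;
	} else {
		ctx->global_mode = mode;
		++ctx->users;
	}
	pthread_mutex_unlock(&ctx->lock);
	return ret;
}

/* Called when a rule is destroyed on either port. */
static void
toy_dev_release_mode(struct toy_dev_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	assert(ctx->users > 0);
	--ctx->users;
	pthread_mutex_unlock(&ctx->lock);
}
```

Once the last rule relying on the setting is gone, a rule needing a
different mode can be created from either port.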

> > > > Destruction
> > > > ~~~~~~~~~~~
> > > >
> > > > Flow rules destruction is not automatic, and a queue should not be
> > > > released if any are still attached to it. Applications must take care
> > > > of performing this step before releasing resources.
> > > >
> > > > ::
> > > >
> > > >  int
> > > >  rte_flow_destroy(uint8_t port_id,
> > > >                   struct rte_flow *flow);
> > > >
> > > >
> > > [Sugesh] I would suggest having a clean-up API is really useful as the
> > > releasing of Queue(is it applicable for releasing of port too?) is not
> > > guaranteeing the automatic flow destruction.
> > 
> > Would something like rte_flow_flush(port_id) do the trick? I wanted to
> > emphasize in this first draft that applications should really keep the flow
> > pointers around in order to manage/destroy them. It is their responsibility,
> > not PMD's.
> [Sugesh] Thanks, I think the flush call will do.

Noted, will add it.
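
Semantically I have something like the following in mind. This is a sketch
only, assuming a PMD that happens to keep its flows in a linked list; the
toy_* names are placeholders and the final prototype may well differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical PMD-side bookkeeping: flows of a port kept in a list. */
struct toy_flow {
	struct toy_flow *next;
	int queue;
};

struct toy_port {
	struct toy_flow *flows;
};

static struct toy_flow *
toy_flow_create(struct toy_port *port, int queue)
{
	struct toy_flow *flow = malloc(sizeof(*flow));

	if (flow == NULL)
		return NULL;
	flow->queue = queue;
	flow->next = port->flows;
	port->flows = flow;
	return flow;
}

/*
 * Destroy every rule of a port and return how many were removed.
 * Handles previously returned to the application become invalid.
 */
static unsigned int
toy_flow_flush(struct toy_port *port)
{
	unsigned int n = 0;

	assert(port != NULL);
	while (port->flows != NULL) {
		struct toy_flow *flow = port->flows;

		port->flows = flow->next;
		free(flow);
		++n;
	}
	return n;
}
```

Note that after a flush the application must drop the pointers it stored,
which is why keeping track of them remains its responsibility.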

> > > This way application can initialize the port,
> > > clean-up all the existing rules and create new rules  on a clean slate.
> > 
> > No resource can be released as long as a flow rule is using it (bad things
> > may happen otherwise), all flow rules must be destroyed first, thus none can
> > possibly remain after initializing a port. It is assumed that PMDs do
> > automatic clean up during init if necessary to ensure this.
> [Sugesh] That will do.

I will make it more explicit as well.

[...]

-- 
Adrien Mazarguil
6WIND

[dpdk-dev] [PATCH] net/mlx5: work around compilation issue

2016-07-13 Thread Adrien Mazarguil

From: Olga Shern <ol...@mellanox.com>

RHEL 7.1's GCC for POWER8 reports the following error in one rte_memcpy()
macro call by mlx5:

 error: array subscript is above array bounds [-Werror=array-bounds]

It appears to be a GCC bug which can be worked around by making parentheses
more explicit.

Signed-off-by: Olga Shern 
Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 615de94..fce3381 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -472,8 +472,8 @@ mlx5_wqe_write_inline_vlan(struct txq *txq, volatile union 
mlx5_wqe *wqe,
   (uint8_t *)addr, 12);
rte_memcpy((uint8_t *)(uintptr_t)wqe->inl.eseg.inline_hdr_start + 12,
   , sizeof(vlan));
-   rte_memcpy((uint8_t *)(uintptr_t)wqe->inl.eseg.inline_hdr_start + 16,
-  ((uint8_t *)addr + 12), 2);
+   rte_memcpy((uint8_t *)((uintptr_t)wqe->inl.eseg.inline_hdr_start + 16),
+  (uint8_t *)(addr + 12), 2);
addr += MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
length -= MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
size = (sizeof(wqe->inl.ctrl.ctrl) +
-- 
2.1.4

[dpdk-dev] [PATCH v4 10/10] scripts: check compilation of exported header files

2016-07-13 Thread Adrien Mazarguil

This script checks that header files in a given directory do not miss
dependencies when included on their own, do not conflict and accept being
compiled with the strictest possible flags.

Signed-off-by: Adrien Mazarguil 
---
 MAINTAINERS   |   1 +
 scripts/check-includes.sh | 286 +
 scripts/test-build.sh |  14 ++
 3 files changed, 301 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f996c2e..e2933c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26,6 +26,7 @@ T: git://dpdk.org/dpdk
 F: MAINTAINERS
 F: scripts/check-maintainers.sh
 F: scripts/check-git-log.sh
+F: scripts/check-includes.sh
 F: scripts/checkpatches.sh
 F: scripts/load-devel-config.sh
 F: scripts/test-build.sh
diff --git a/scripts/check-includes.sh b/scripts/check-includes.sh
new file mode 100755
index 000..d65adc6
--- /dev/null
+++ b/scripts/check-includes.sh
@@ -0,0 +1,286 @@
+#!/bin/sh -e
+#
+#   BSD LICENSE
+#
+#   Copyright 2016 6WIND S.A.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of 6WIND S.A. nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# This script checks that header files in a given directory do not miss
+# dependencies when included on their own, do not conflict and accept being
+# compiled with the strictest possible flags.
+#
+# Files are looked up in the directory provided as the first argument,
+# otherwise build/include by default.
+#
+# Recognized environment variables:
+#
+# VERBOSE=1 is the same as -v.
+#
+# QUIET=1 is the same as -q.
+#
+# SUMMARY=1 is the same as -s.
+#
+# CC, CPPFLAGS, CFLAGS, EXTRA_CPPFLAGS, EXTRA_CFLAGS, CXX, CXXFLAGS and
+# EXTRA_CXXFLAGS are taken into account.
+#
+# PEDANTIC_CFLAGS, PEDANTIC_CXXFLAGS and PEDANTIC_CPPFLAGS provide strict
+# C/C++ compilation flags.
+#
+# IGNORE contains a list of shell patterns matching files (relative to the
+# include directory) to avoid. It is set by default to known DPDK headers
+# which must not be included on their own.
+#
+# IGNORE_CXX provides additional files for C++.
+
+while getopts hqvs arg; do
+   case $arg in
+   h)
+   cat < /dev/null
+
+[ "$VERBOSE" = 1 ] &&
+output ()
+{
+   local CCV
+   local CXXV
+
+   shift
+   CCV=$CC
+   CXXV=$CXX
+   CC="echo $CC" CXX="echo $CXX" "$@"
+   CC=$CCV
+   CXX=$CXXV
+
+   "$@"
+} ||
+output ()
+{
+
+   printf '  %s\n' "$1"
+   shift
+   "$@"
+}
+
+trap 'rm -f "$temp_cc" "$temp_cxx"' EXIT
+
+compile_cc ()
+{
+   ${CC} -I"$include_dir" \
+   ${PEDANTIC_CPPFLAGS} ${CPPFLAGS} ${EXTRA_CPPFLAGS} \
+   ${PEDANTIC_CFLAGS} ${CFLAGS} ${EXTRA_CFLAGS} \
+   -c -o /dev/null "${temp_cc}"
+}
+
+compile_cxx ()
+{
+   ${CXX} -I"$include_dir" \
+   ${PEDANTIC_CPPFLAGS} ${CPPFLAGS} ${EXTRA_CPPFLAGS} \
+   ${PEDANTIC_CXXFLAGS} ${CXXFLAGS} ${EXTRA_CXXFLAGS} \
+   -c -o /dev/null "${temp_cxx}"
+}
+
+ignore ()
+{
+   file="$1"
+   shift
+   while [ $# -ne 0 ]; do
+   case "$file" in
+   $1)
+   return 0
+   ;;
+   esac
+   shift
+   done
+   return 1
+}
+
+# Check C/C++ compilation for each header file.
+
+while read -r path
+do
+   file=${path#$include_dir}
+   file=${file##/}
+   if ignor

[dpdk-dev] [PATCH v4 09/10] lib: hide static functions never defined

2016-07-13 Thread Adrien Mazarguil

Arch-specific functions not defined for all architectures (missing on x86
in this case) and not used anywhere should not expose a prototype.

This commit prevents the following error:

 error: `rte_mov48' declared `static' but never defined

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_eal/common/include/generic/rte_memcpy.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_memcpy.h b/lib/librte_eal/common/include/generic/rte_memcpy.h
index afb0afe..4e9d879 100644
--- a/lib/librte_eal/common/include/generic/rte_memcpy.h
+++ b/lib/librte_eal/common/include/generic/rte_memcpy.h
@@ -64,6 +64,8 @@ rte_mov16(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov32(uint8_t *dst, const uint8_t *src);

+#ifdef __DOXYGEN__
+
 /**
  * Copy 48 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
@@ -76,6 +78,8 @@ rte_mov32(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov48(uint8_t *dst, const uint8_t *src);

+#endif /* __DOXYGEN__ */
+
 /**
  * Copy 64 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
-- 
2.1.4
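
For readers who want to see the pattern outside the DPDK tree, here is a minimal sketch of what the patch above does. The names demo_mov16(), demo_mov48() and demo_copy16_ok() are hypothetical and exist only for illustration. The assumption is that the compiler never defines __DOXYGEN__, so the guarded prototype is visible only to the documentation tool.

```c
#include <stdint.h>
#include <string.h>

/* Prototype whose definition follows in this translation unit: no issue. */
static inline void
demo_mov16(uint8_t *dst, const uint8_t *src);

#ifdef __DOXYGEN__

/*
 * Documentation-only prototype (hypothetical). This sketch never defines
 * it, so it is hidden from the compiler to avoid a "declared static but
 * never defined" diagnostic under strict flags.
 */
static inline void
demo_mov48(uint8_t *dst, const uint8_t *src);

#endif /* __DOXYGEN__ */

static inline void
demo_mov16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

/* Returns 1 when the 16-byte copy produced an identical buffer. */
int
demo_copy16_ok(void)
{
	uint8_t src[16];
	uint8_t dst[16] = { 0 };
	unsigned int i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)i;
	demo_mov16(dst, src);
	return memcmp(dst, src, sizeof(src)) == 0;
}
```

Removing the #ifdef guard would bring the unused demo_mov48() prototype back into the translation unit, which is the situation the commit message describes for rte_mov48 on x86.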

[dpdk-dev] [PATCH v4 08/10] lib: remove named variadic macros in exported headers

2016-07-13 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

Since there is no way to force named variadic macros as extensions, use a
standard __VA_ARGS__ with an extra dummy argument to format strings.

This commit prevents the following error:

 error: ISO C does not permit named variadic macros

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_cryptodev/rte_cryptodev.h   | 32 ++---
 lib/librte_cryptodev/rte_cryptodev_pmd.h   |  2 +-
 lib/librte_eal/common/include/rte_common.h |  9 +++
 3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h
index cf28541..d047ba8 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -77,26 +77,30 @@ extern const char **rte_cyptodev_names;

 /* Logging Macros */

-#define CDEV_LOG_ERR(fmt, args...) \
-   RTE_LOG(ERR, CRYPTODEV, "%s() line %u: " fmt "\n",  \
-   __func__, __LINE__, ## args)
+#define CDEV_LOG_ERR(...) \
+   RTE_LOG(ERR, CRYPTODEV, \
+   RTE_FMT("%s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))

-#define CDEV_PMD_LOG_ERR(dev, fmt, args...)\
-   RTE_LOG(ERR, CRYPTODEV, "[%s] %s() line %u: " fmt "\n", \
-   dev, __func__, __LINE__, ## args)
+#define CDEV_PMD_LOG_ERR(dev, ...) \
+   RTE_LOG(ERR, CRYPTODEV, \
+   RTE_FMT("[%s] %s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   dev, __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))

 #ifdef RTE_LIBRTE_CRYPTODEV_DEBUG
-#define CDEV_LOG_DEBUG(fmt, args...)   \
-   RTE_LOG(DEBUG, CRYPTODEV, "%s() line %u: " fmt "\n",\
-   __func__, __LINE__, ## args)\
+#define CDEV_LOG_DEBUG(...) \
+   RTE_LOG(DEBUG, CRYPTODEV, \
+   RTE_FMT("%s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))

-#define CDEV_PMD_TRACE(fmt, args...)   \
-   RTE_LOG(DEBUG, CRYPTODEV, "[%s] %s: " fmt "\n", \
-   dev, __func__, ## args)
+#define CDEV_PMD_TRACE(...) \
+   RTE_LOG(DEBUG, CRYPTODEV, \
+   RTE_FMT("[%s] %s: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
+   dev, __func__, RTE_FMT_TAIL(__VA_ARGS__,)))

 #else
-#define CDEV_LOG_DEBUG(fmt, args...)
-#define CDEV_PMD_TRACE(fmt, args...)
+#define CDEV_LOG_DEBUG(...) (void)0
+#define CDEV_PMD_TRACE(...) (void)0
 #endif

 /**
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index cf08a50..4a07362 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -62,7 +62,7 @@ extern "C" {
 #define RTE_PMD_DEBUG_TRACE(...) \
rte_pmd_debug_trace(__func__, __VA_ARGS__)
 #else
-#define RTE_PMD_DEBUG_TRACE(fmt, args...)
+#define RTE_PMD_DEBUG_TRACE(...)
 #endif

 struct rte_cryptodev_session {
diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
index 98ecc1c..db5ac91 100644
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -335,6 +335,15 @@ rte_bsf32(uint32_t v)
 /** Take a macro value and get a string version of it */
 #define RTE_STR(x) _RTE_STR(x)

+/**
+ * ISO C helpers to modify format strings using variadic macros.
+ * This is a replacement for the ", ## __VA_ARGS__" GNU extension.
+ * An empty %s argument is appended to avoid a dangling comma.
+ */
+#define RTE_FMT(fmt, ...) fmt "%.0s", __VA_ARGS__ ""
+#define RTE_FMT_HEAD(fmt, ...) fmt
+#define RTE_FMT_TAIL(fmt, ...) __VA_ARGS__
+
 /** Mask value of type "tp" for the first "ln" bit set. */
 #define RTE_LEN2MASK(ln, tp) \
((tp)((uint64_t)-1 >> (sizeof(uint64_t) * CHAR_BIT - (ln
-- 
2.1.4
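
The RTE_FMT helpers introduced above can be hard to follow from the diff alone, so here is a small sketch of how they expand. The three RTE_FMT* macros are copied from the patch. DEMO_LOG, DEMO_ERR, demo_plain() and demo_args() are hypothetical stand-ins (DEMO_LOG wraps snprintf() in place of RTE_LOG) so the example builds without DPDK.

```c
#include <stddef.h>
#include <stdio.h>

/* Helpers as introduced by the patch. */
#define RTE_FMT(fmt, ...) fmt "%.0s", __VA_ARGS__ ""
#define RTE_FMT_HEAD(fmt, ...) fmt
#define RTE_FMT_TAIL(fmt, ...) __VA_ARGS__

/* Hypothetical stand-in for RTE_LOG(): formats into a caller buffer. */
#define DEMO_LOG(buf, size, ...) snprintf(buf, size, __VA_ARGS__)

/*
 * Hypothetical macro built the same way as CDEV_LOG_ERR in the patch:
 * RTE_FMT_HEAD() extracts the caller's format string, RTE_FMT_TAIL()
 * the remaining arguments (possibly none), and RTE_FMT() appends a
 * "%.0s" conversion with an empty string so that the argument list
 * never ends with a dangling comma.
 */
#define DEMO_ERR(buf, size, ...) \
	DEMO_LOG(buf, size, \
		RTE_FMT("%s(): " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
			__func__, RTE_FMT_TAIL(__VA_ARGS__,)))

/* Format string only, no extra arguments. */
int
demo_plain(char *buf, size_t size)
{
	return DEMO_ERR(buf, size, "no arguments");
}

/* Format string followed by two arguments. */
int
demo_args(char *buf, size_t size)
{
	return DEMO_ERR(buf, size, "value %d of %s", 42, "x");
}
```

In demo_plain() the call reduces to snprintf(buf, size, "%s(): no arguments\n%.0s", __func__, ""). The trailing "%.0s" prints nothing, which is how the helpers avoid the GNU ", ## args" form while still accepting a bare format string.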

[dpdk-dev] [PATCH v4 07/10] lib: work around forward reference to enum types

2016-07-13 Thread Adrien Mazarguil

Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.

This commit prevents the following error:

 error: ISO C forbids forward references to `enum' types

Signed-off-by: Adrien Mazarguil 
---
 lib/librte_eal/common/include/generic/rte_cpuflags.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index c1da357..71321f3 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -44,6 +44,7 @@
 /**
  * Enumeration of all CPU features supported
  */
+__extension__
 enum rte_cpu_flag_t;

 /**
@@ -55,6 +56,7 @@ enum rte_cpu_flag_t;
  * flag name
  * NULL if flag ID is invalid
  */
+__extension__
 const char *
 rte_cpu_get_flag_name(enum rte_cpu_flag_t feature);

@@ -68,6 +70,7 @@ rte_cpu_get_flag_name(enum rte_cpu_flag_t feature);
  * 0 if flag is not available
  * -ENOENT if flag is invalid
  */
+__extension__
 int
 rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);

-- 
2.1.4
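
As a rough illustration of the workaround above, the sketch below forward-declares an enum, uses it in a prototype, and completes the type afterwards. It assumes a GCC-compatible compiler, since both forward references to enum types and the __extension__ keyword are GNU extensions. All demo_* and DEMO_* names are hypothetical.

```c
#include <stddef.h>

/*
 * Forward reference to an enum type, as a generic header might do
 * before the arch-specific definition is visible. __extension__ marks
 * the declaration so that pedantic builds do not reject it.
 */
__extension__
enum demo_flag_t;

/* Each use of the still-incomplete type is marked the same way. */
__extension__
const char *
demo_flag_name(enum demo_flag_t flag);

/* Stand-in for the arch-specific part: this completes the enum. */
enum demo_flag_t {
	DEMO_FLAG_A,
	DEMO_FLAG_B,
	DEMO_FLAG_COUNT
};

/* Returns the flag name, or NULL if the flag ID is invalid. */
const char *
demo_flag_name(enum demo_flag_t flag)
{
	static const char *const names[] = { "A", "B" };

	if ((unsigned int)flag >= DEMO_FLAG_COUNT)
		return NULL;
	return names[flag];
}
```

The prototype needs its own __extension__ because the enum is still incomplete at that point, which matches the three places the patch touches in rte_cpuflags.h.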
