[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-22 Thread Chandran, Sugesh
Hi Adrien,
Thank you for your effort and for considering the inputs and comments.
The design looks fine to me now.


Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Thursday, July 21, 2016 2:37 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern ;
> Chilikin, Andrey 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> I do not have much to add, please see below.
> 
> On Thu, Jul 21, 2016 at 11:06:52AM +, Chandran, Sugesh wrote:
> [...]
> > > > RSS hashing support :- Just to confirm, the RSS flow action allows the
> > > > application to decide the header fields used to produce the hash. This
> > > > gives programmability on load sharing across different queues. The
> > > > application can program the NIC to calculate the RSS hash using only
> > > > MAC, MAC+IP, or IP fields.
> > >
> > > I'd say yes but from your summary, I'm not sure we share the same
> > > idea of what the RSS action is supposed to do, so here is mine.
> > >
> > > Like all flow rules, the pattern part of the RSS action only filters
> > > the packets on which the action will be performed.
> > >
> > > The rss_conf parameter (struct rte_eth_rss_conf) only provides a key
> > > and a RSS hash function to use (ETH_RSS_IPV4,
> > > ETH_RSS_NONFRAG_IPV6_UDP, etc).
> > >
> > > Nothing prevents the RSS hash function from being applied to
> > > protocol headers which are not necessarily present in the flow rule
> > > pattern. These are two independent things, e.g. you could have a
> > > pattern matching IPv4 packets yet perform RSS hashing only on UDP
> headers.
> > >
> > > Finally, the RSS action configuration only affects packets coming
> > > from this flow rule. It is not performed on the device globally so
> > > packets which are not matched are not affected by RSS processing. As
> > > a result it might not be possible to configure two flow rules
> > > specifying incompatible RSS actions simultaneously if the underlying
> > > device supports only a single global RSS context.
> > >
> > [Sugesh] Thank you for the explanation. This means I can have a rule
> > that matches every incoming packet (an all-field wildcard rule) and
> > does the RSS hash on selected fields: MAC only, IP only, or IP & MAC?
> 
> Yes, I guess it could even replace the current method for configuring RSS on a
> device in a more versatile fashion, but this is a topic for another debate.
> 
> Let's implement this API first!
> 
> > This can be useful to do a packet lookup in software by just using
> > the hash.
> 
> Not sure to fully understand your idea, but I'm positive it could be done
> somehow :)
> 
> --
> Adrien Mazarguil
> 6WIND


[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-21 Thread Chandran, Sugesh

Hi Adrien,
Please find my comments below

Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Wednesday, July 20, 2016 6:11 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern ;
> Chilikin, Andrey 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> Please see below.
> 
> On Wed, Jul 20, 2016 at 04:32:50PM +, Chandran, Sugesh wrote:
> [...]
> > > > How about a hardware flow flag in the packet descriptor that is set
> > > > when a packet hits any hardware rule. This way software isn't worried
> > > > about/blocked by a hardware rule. Even though there is an additional
> > > > overhead of validating this flag, the software datapath can identify
> > > > the hardware-processed packets easily.
> > > > This way the packets traverse the software fallback path until
> > > > the rule configuration is complete. This flag avoids setting an ID
> > > > action for every hardware flow being configured.
> > >
> > > That makes sense. I see it as a sort of single bit ID but it could
> > > be implemented through a different action for less capable devices.
> > > PMDs that support 32 bit IDs could reuse the same code for both
> features.
> > >
> > > I understand you'd prefer having this feature always present,
> > > however we already know that not all PMDs/devices support it, and
> > > like everything else this is a kind of offload that needs to be
> > > explicitly requested by the application as it may not be needed.
> > >
> > > If we go with the separate action, then perhaps it would make sense
> > > to rename "ID" to "MARK" to make things clearer:
> > >
> > >  RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule.
> > > */
> > >
> > >  RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */
> > >
> > > I guess the result of the FLAG action would be something in ol_flag.
> > >
> > [Sugesh] This looks fine for me.
> 
> Great, I will update the specification accordingly.
[Sugesh] Thank you!
> 
> > > Thoughts?
> > >
> > [Sugesh] Two more queries that I missed in the earlier comments:
> > Support for PTYPE :- Intel NICs can report the packet type in the mbuf.
> > This can be used by software for packet processing. Is the generic API
> > capable of handling that as well?
> 
> Yes, however no PTYPE action has been defined for this (yet). It is only a
> matter of adding one.
[Sugesh] Thank you for confirming. It's fine for me.
> 
> Currently packet type recognition is enabled per port using a separate API, so
> correct me if I'm wrong but I am not aware of any adapter with the ability to
> enable it per flow rule, so I do not think such an action needs to be defined
> from the start. We may add it later.
> 
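For reference, this is how an application consumes the packet type the NIC reports today. A minimal sketch using the existing RTE_PTYPE_* macros from librte_mbuf; a per-flow PTYPE action, if ever added, would only control whether this field gets filled:

  #include <rte_mbuf.h>

  /* Branch on the packet type the NIC reported in the mbuf, avoiding a
   * software header parse. */
  static void
  dispatch_by_ptype(struct rte_mbuf *m)
  {
      if ((m->packet_type & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4) {
          if ((m->packet_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP) {
              /* IPv4/UDP fast path */
          } else {
              /* other IPv4 traffic */
          }
      }
      /* otherwise: non-IPv4 slow path */
  }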
> > RSS hashing support :- Just to confirm, the RSS flow action allows the
> > application to decide the header fields used to produce the hash. This
> > gives programmability on load sharing across different queues. The
> > application can program the NIC to calculate the RSS hash using only
> > MAC, MAC+IP, or IP fields.
> 
> I'd say yes but from your summary, I'm not sure we share the same idea of
> what the RSS action is supposed to do, so here is mine.
> 
> Like all flow rules, the pattern part of the RSS action only filters the
> packets on which the action will be performed.
> 
> The rss_conf parameter (struct rte_eth_rss_conf) only provides a key and a
> RSS hash function to use (ETH_RSS_IPV4, ETH_RSS_NONFRAG_IPV6_UDP,
> etc).
> 
> Nothing prevents the RSS hash function from being applied to protocol
> headers which are not necessarily present in the flow rule pattern. These are
> two independent things, e.g. you could have a pattern matching IPv4 packets
> yet perform RSS hashing only on UDP headers.
> 
> Finally, the RSS action configuration only affects packets coming from this
> flow rule. It is not performed on the device globally so packets which are not
> matched are not affected by RSS processing. As a result it might not be
> possible to configure two flow rules specifying incompatible RSS actions
> simultaneously if the underlying device supports only a single global RSS
> context.
> 
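To make the two-independent-things point concrete, here is a minimal sketch of such a rule: a pattern matching any IPv4 packet, with an RSS action hashing on UDP fields only. All structure layouts (rte_flow_attr, rte_flow_action_rss) are assumptions based on the draft header and may change; the queue list is omitted for brevity:

  #include <rte_ethdev.h>
  #include <rte_flow.h> /* draft header from this RFC */

  /* Hedged sketch: match every IPv4 packet (no spec, i.e. wildcard) and
   * perform RSS hashing over UDP/IPv4 fields only. */
  static struct rte_flow *
  rss_udp_hash_rule(uint8_t port_id)
  {
      struct rte_flow_attr attr = { .ingress = 1 }; /* assumed layout */
      struct rte_flow_item pattern[] = {
          { .type = RTE_FLOW_ITEM_TYPE_ETH },
          { .type = RTE_FLOW_ITEM_TYPE_IPV4 }, /* no spec: any IPv4 */
          { .type = RTE_FLOW_ITEM_TYPE_END },
      };
      struct rte_eth_rss_conf rss_conf = {
          .rss_key = NULL,                    /* keep the default key */
          .rss_hf = ETH_RSS_NONFRAG_IPV4_UDP, /* hash UDP fields only */
      };
      struct rte_flow_action_rss rss = { .rss_conf = &rss_conf };
      struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
          { .type = RTE_FLOW_ACTION_TYPE_END },
      };
      struct rte_flow_error error;

      return rte_flow_create(port_id, &attr, pattern, actions, &error);
  }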
[Sugesh] Thank you for the explanation. This means I can have a rule that
matches every incoming packet (an all-field wildcard rule) and does the RSS
hash on selected fields: MAC only, IP only, or IP & MAC? This can be useful
to do a packet lookup in software by just using the hash.
> Are we on the same page?
> 
> --
> Adrien Mazarguil
> 6WIND


[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-20 Thread Chandran, Sugesh
Hi Adrien,

Sorry for the late reply.
Snipped out the parts we agreed.

Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Monday, July 18, 2016 4:00 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern ;
> Chilikin, Andrey 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Mon, Jul 18, 2016 at 01:26:09PM +, Chandran, Sugesh wrote:
> > Hi Adrien,
> > Thank you for getting back on this.
> > Please find my comments below.
> 
> Hi Sugesh,
> 
> Same for me, removed again the parts we agree on.
> 
> [...]
> > > For your above example, the application cannot assume a rule is
> > > added/deleted as long as the PMD has not completed the related
> > > operation, which means keeping the SW rule/fallback in place in the
> > > meantime. Should handle security concerns as long as after removing
> > > a rule, packets end up in a default queue entirely processed by SW.
> > > Obviously this may worsen response time.
> > >
> > > The ID action can help with this. By knowing which rule a received
> > > packet is associated with, processing can be temporarily offloaded
> > > by another thread without much complexity.
> > [Sugesh] Setting an ID for every flow may not be viable, especially when
> > the size of the ID is small (only 8 bits). I am not sure if this is a
> > valid case though.
> 
> Agreed, I'm not saying this solution works for all devices, particularly those
> that do not support ID at all.
> 
> > How about a hardware flow flag in the packet descriptor that is set when
> > a packet hits any hardware rule. This way software isn't worried
> > about/blocked by a hardware rule. Even though there is an additional
> > overhead of validating this flag, the software datapath can identify the
> > hardware-processed packets easily.
> > This way the packets traverse the software fallback path until the rule
> > configuration is complete. This flag avoids setting an ID action for every
> > hardware flow being configured.
> 
> That makes sense. I see it as a sort of single bit ID but it could be
> implemented through a different action for less capable devices. PMDs that
> support 32 bit IDs could reuse the same code for both features.
> 
> I understand you'd prefer having this feature always present, however we
> already know that not all PMDs/devices support it, and like everything else
> this is a kind of offload that needs to be explicitly requested by the
> application as it may not be needed.
> 
> If we go with the separate action, then perhaps it would make sense to
> rename "ID" to "MARK" to make things clearer:
> 
>  RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */
> 
>  RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */
> 
> I guess the result of the FLAG action would be something in ol_flag.
> 
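For illustration, a hypothetical sketch of the two actions side by side, plus the RX-side check. The exact mbuf fields receiving the results are still open at this point, so PKT_RX_FDIR and hash.fdir.hi below are placeholders, as are the two application helpers:

  #include <rte_mbuf.h>
  #include <rte_flow.h> /* draft header */

  void handle_offloaded(struct rte_mbuf *m, uint32_t id); /* hypothetical */
  void software_classify(struct rte_mbuf *m);             /* hypothetical */

  /* Rule side: request both offloads (layouts are assumptions). */
  static struct rte_flow_action_mark mark = { .id = 42 };
  static struct rte_flow_action actions[] = {
      { .type = RTE_FLOW_ACTION_TYPE_FLAG },                /* single bit */
      { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark }, /* 32-bit id */
      { .type = RTE_FLOW_ACTION_TYPE_END },
  };

  /* RX side: skip the software classifier for HW-matched packets. */
  static void
  rx_dispatch(struct rte_mbuf *m)
  {
      if (m->ol_flags & PKT_RX_FDIR)                /* assumed FLAG result */
          handle_offloaded(m, m->hash.fdir.hi);     /* assumed MARK location */
      else
          software_classify(m);
  }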
[Sugesh] This looks fine for me.
> Thoughts?
> 
[Sugesh] Two more queries that I missed in the earlier comments:
Support for PTYPE :- Intel NICs can report the packet type in the mbuf.
This can be used by software for packet processing. Is the generic API
capable of handling that as well?
RSS hashing support :- Just to confirm, the RSS flow action allows the
application to decide the header fields used to produce the hash. This gives
programmability on load sharing across different queues. The application can
program the NIC to calculate the RSS hash using only MAC, MAC+IP, or IP
fields.


> > > I think applications have to implement SW fallbacks all the time, as
> > > even some sort of guarantee on the flow rule processing time may not
> > > be enough to avoid misdirected packets and related security issues.
> > [Sugesh] Software fallback will always be there. However I am a little
> > bit confused about how software is going to identify the packets that
> > have already been hardware processed. I feel we need some notification in
> > the packet itself when a hardware rule hits. An ID/flag/any other options?
> 
> Yeah I think so too, as long as it is optional because we cannot assume all
> PMDs will support it.
> 
> > > Let's wait for applications to start using this API and then
> > > consider an extra set of asynchronous / real-time functions when the
> > > need arises. It should not impact the way rules are specified
> > [Sugesh] Sure. I think the rule definition may not be impacted by this.
> 
> Thanks for your comments.
> 
> --
> Adrien Mazarguil
> 6WIND


[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-18 Thread Chandran, Sugesh
Hi Adrien,
Thank you for getting back on this.
Please find my comments below.

Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Friday, July 15, 2016 4:04 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern ;
> Chilikin, Andrey 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Fri, Jul 15, 2016 at 09:23:26AM +, Chandran, Sugesh wrote:
> > Thank you Adrien,
> > Please find below for some more comments/inputs
> >
> > Let me know your thoughts on this.
> 
> Thanks, stripping again non relevant parts.
> 
> [...]
> > > > > > [Sugesh] Is it a limitation to use only a 32 bit ID? Is it
> > > > > > possible to have a 64 bit ID? So that the application can use the
> > > > > > control plane flow pointer itself as an ID. Does it make sense?
> > > > >
> > > > > I've specified a 32 bit ID for now because this is what FDIR
> > > > > supports and also what existing devices can report today AFAIK
> > > > > (i40e and
> > > mlx5).
> > > > >
> > > > > We could use 64 bit for future-proofness in a separate action like
> "ID64"
> > > > > when at least one device supports it.
> > > > >
> > > > > To PMD maintainers: please comment if you know devices that
> > > > > support tagging matching packets with more than 32 bits of
> > > > > user-provided data!
> > > > [Sugesh] I guess the flow director ID is 64 bit; the XL710 datasheet
> > > > says so.
> > > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with the
> > > > RSS hash. This can be a software driver limitation that exposes only
> > > > 32 bits, possibly because of cache alignment issues? Since the
> > > > hardware can support 64 bit, I feel it makes sense to support 64 bit
> > > > as well.
> > >
> > > I agree we need 64 bit support, but then we also need a solution for
> > > devices that support only 32 bit. Possible methods I can think of:
> > >
> > > - A separate "ID64" action (or a "ID32" one, perhaps with a better name).
> > >
> > > - A single ID action with an unlimited number of bytes to return with
> > >   packets (would actually be a string). PMDs can then refuse to create
> > >   flow rules requesting an unsupported number of bytes. Devices
> > >   supporting fewer than 32 bits are also included this way without the
> > >   need for yet another action.
> > >
> > > Thoughts?
> > [Sugesh] I feel the single ID approach is much better. But I would say
> > a fixed size ID is easier to handle at upper layers. Say the PMD returns
> > a 64 bit ID in which the MSBs are masked out, based on how many bits the
> > hardware can support.
> > The PMD can refuse an unsupported number of bytes when requested. So the
> > size of the ID is going to be a parameter to program the flow.
> > What do you think?
> 
> What you suggest if I am not mistaken is:
> 
>  struct rte_flow_action_id {
>      uint64_t id;
>      uint64_t mask; /* either a bit-mask or a prefix/suffix length? */
>  };
> 
> I think in this case a mask is more versatile than a prefix/suffix length as
> the value itself comes in an unknown endian (from the PMD's POV). It also
> allows specific bits to be taken into account, like when HW only supports 32
> bits: with some black magic the full original 64 bit value can be restored
> as long as the application only cares about at most 32 bits anywhere.
> 
> However I do not think many applications "won't care" about specific bits in
> a given value, and having to provide a properly crafted mask will be a
> hassle; they will just fill it with ones and hope for the best. As a result
> they won't take advantage of this feature, or will stick to 32 bits all the
> time, or whatever happens to be the least common denominator.
> 
> My previous suggestion was:
> 
>  struct rte_flow_action_id {
>      uint8_t size; /* number of bytes in id[] */
>      uint8_t id[];
>  };
> 
> It does not solve the issue if an application requests more bytes than
> supported, however as a 
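For concreteness, a hypothetical sketch of filling the size-prefixed variant above. Since the structure ends in a flexible array member it has to be heap-allocated; a PMD limited to, say, 4 bytes would then refuse the rule at creation time:

  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>

  struct rte_flow_action_id { /* as proposed above (hypothetical) */
      uint8_t size; /* number of bytes in id[] */
      uint8_t id[];
  };

  /* Build an ID action carrying an application cookie of arbitrary length. */
  static struct rte_flow_action_id *
  make_id_action(const uint8_t *cookie, uint8_t len)
  {
      struct rte_flow_action_id *act = malloc(sizeof(*act) + len);

      if (act == NULL)
          return NULL;
      act->size = len;
      memcpy(act->id, cookie, len);
      return act;
  }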

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-18 Thread Chandran, Sugesh
Hi Andrey,

Regards
_Sugesh


> -Original Message-
> From: Chilikin, Andrey
> Sent: Friday, July 15, 2016 11:02 AM
> To: Chandran, Sugesh ; 'Adrien Mazarguil'
> 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern 
> Subject: RE: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> > -Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chandran, Sugesh
> > Sent: Friday, July 15, 2016 10:23 AM
> > To: 'Adrien Mazarguil' 
> 
> 
> 
> > > > > To PMD maintainers: please comment if you know devices that
> > > > > support tagging matching packets with more than 32 bits of
> > > > > user-provided data!
> > > > [Sugesh] I guess the flow director ID is 64 bit; the XL710 datasheet
> > > > says so.
> > > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with the
> > > > RSS hash. This can be a software driver limitation that exposes only
> > > > 32 bits, possibly because of cache alignment issues? Since the
> > > > hardware can support 64 bit, I feel it makes sense to support 64 bit
> > > > as well.
> 
> XL710 supports a 32bit FDIR ID only; I believe you are mixing it up with the
> flexible payload data, which can take 0, 4 or 8 bytes of the RX descriptor.
[Sugesh] Thank you for the correction, Andrey.
It was my mistake; I mixed it up with the flexible payload data. The FDIR ID
is only 32 bit.

> 
> Regards,
> Andrey


[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-15 Thread Chandran, Sugesh
Thank you Adrien,
Please find below for some more comments/inputs

Let me know your thoughts on this.


Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Wednesday, July 13, 2016 9:03 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Mon, Jul 11, 2016 at 10:42:36AM +, Chandran, Sugesh wrote:
> > Hi Adrien,
> >
> > Thank you for your response,
> > Please see my comments inline.
> 
> Hi Sugesh,
> 
> Sorry for the delay, please see my answers inline as well.
> 
> [...]
> > > > > Flow director
> > > > > -
> > > > >
> > > > > Flow director (FDIR) is the name of the most capable filter
> > > > > type, which covers most features offered by others. As such, it
> > > > > is the most widespread in PMDs that support filtering (i.e. all
> > > > > of them besides **e1000**).
> > > > >
> > > > > It is also the only type that allows an arbitrary 32 bits value
> > > > > provided by applications to be attached to a filter and returned
> > > > > with matching packets instead of relying on the destination queue
> > > > > to recognize flows.
> > > > >
> > > > > Unfortunately, even FDIR requires applications to be aware of
> > > > > low-level capabilities and limitations (most of which come
> > > > > directly from **ixgbe** and **i40e**):
> > > > >
> > > > > - Bitmasks are set globally per device (port?), not per filter.
> > > > [Sugesh] This means the application cannot define filters that match
> > > > on arbitrary different offsets?
> > > > If that's the case, I assume the application has to program the
> > > > bitmask in advance. Otherwise how does the API framework deduce this
> > > > bitmask information from the rules? It is not very clear to me how
> > > > the application passes down the bitmask information for multiple
> > > > filters on the same port?
> > >
> > > This is my understanding of how flow director currently works;
> > > perhaps someone more familiar with it can answer this question better
> > > than I could.
> > >
> > > Let me take an example: if a particular device can only handle a
> > > single IPv4 mask common to all flow rules (say only to match
> > > destination addresses), updating that mask to also match the source
> > > address affects all defined and future flow rules simultaneously.
> > >
> > > That is how FDIR currently works and I think it is wrong, as it
> > > penalizes devices that do support individual bit-masks per rule, and
> > > is a little awkward from an application point of view.
> > >
> > > What I suggest for the new API instead is the ability to specify one
> > > bit-mask per rule, and let the PMD deal with HW limitations by
> > > automatically configuring global bitmasks from the first added rule,
> > > then refusing to add subsequent rules if they specify a conflicting
> > > bit-mask. Existing rules remain unaffected that way, and
> > > applications do not have to be extra cautious.
> > >
> > [Sugesh] The issue with that approach is, the hardware simply discards
> > the rule when it is a superset of the first one even though the hardware
> > is capable of handling it. How is it guaranteed that the first rule will
> > set the bitmask for all the subsequent rules?
> 
> Just to clarify, the API only says that new rules cannot affect existing ones
> (which I think makes sense from a user's perspective), so as long as the PMD
> does whatever is needed to make all rules work together, there should not
> be any problem with this approach.
> 
> Even if the PMD has to temporarily remove an existing rule and reconfigure
> global masks in order to add subsequent rules, it is fine as long as packets
> aren't misdirected in the meantime (they may be dropped if there is no
> other choice).
[Sugesh] I feel this is fine. Thank you for confirming.
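A sketch of the PMD-side policy agreed on above, for a hypothetical device exposing one global IPv4 mask; all names are illustrative, not taken from a real driver:

  #include <errno.h>
  #include <stdint.h>
  #include <string.h>

  struct ipv4_rule_mask { uint32_t src; uint32_t dst; };

  struct pmd_state {
      unsigned int nb_rules;
      struct ipv4_rule_mask global_mask;
  };

  /* The first rule programs the global mask; later rules must match it
   * exactly, so existing rules are never silently affected. */
  static int
  check_ipv4_mask(struct pmd_state *st, const struct ipv4_rule_mask *req)
  {
      if (st->nb_rules == 0) {
          st->global_mask = *req;  /* first rule sets the global mask */
          return 0;
      }
      if (memcmp(&st->global_mask, req, sizeof(*req)) != 0)
          return -ENOTSUP;         /* conflicting mask: refuse the new rule */
      return 0;
  }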
> 
> > How abou

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-11 Thread Chandran, Sugesh
Hi Adrien,

Thank you for your response,
Please see my comments inline.

Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Friday, July 8, 2016 2:03 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> On Thu, Jul 07, 2016 at 11:15:07PM +, Chandran, Sugesh wrote:
> > Hi Adrien,
> >
> > Thank you for proposing this. It would be really useful for application such
> as OVS-DPDK.
> > Please find my comments and questions inline below prefixed with
> [Sugesh]. Most of them are from the perspective of enabling these APIs in
> application such as OVS-DPDK.
> 
> Thanks, I'm replying below.
> 
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien
> Mazarguil
> > > Sent: Tuesday, July 5, 2016 7:17 PM
> > > To: dev at dpdk.org
> > > Cc: Thomas Monjalon ; Zhang, Helin
> > > ; Wu, Jingjing ; 
> > > Rasesh
> > > Mody ; Ajit Khaparde
> > > ; Rahul Lakkireddy
> > > ; Lu, Wenzhuo
> ;
> > > Jan Medala ; John Daley ;
> Chen,
> > > Jing D ; Ananyev, Konstantin
> > > ; Matej Vido ;
> > > Alejandro Lucero ; Sony Chacko
> > > ; Jerin Jacob
> > > ; De Lara Guarch, Pablo
> > > ; Olga Shern 
> > > Subject: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> > >

<<<<<Snipped out >>>>>
> > > Flow director
> > > -
> > >
> > > Flow director (FDIR) is the name of the most capable filter type, which
> > > covers most features offered by others. As such, it is the most
> > > widespread in PMDs that support filtering (i.e. all of them besides
> > > **e1000**).
> > >
> > > It is also the only type that allows an arbitrary 32 bits value provided
> > > by applications to be attached to a filter and returned with matching
> > > packets instead of relying on the destination queue to recognize flows.
> > >
> > > Unfortunately, even FDIR requires applications to be aware of low-level
> > > capabilities and limitations (most of which come directly from **ixgbe**
> > > and **i40e**):
> > >
> > > - Bitmasks are set globally per device (port?), not per filter.
> > [Sugesh] This means the application cannot define filters that match on
> > arbitrary different offsets?
> > If that's the case, I assume the application has to program the bitmask in
> > advance. Otherwise how does the API framework deduce this bitmask
> > information from the rules? It is not very clear to me how the application
> > passes down the bitmask information for multiple filters on the same port?
> 
> This is my understanding of how flow director currently works; perhaps
> someone more familiar with it can answer this question better than I could.
> 
> Let me take an example: if a particular device can only handle a single IPv4
> mask common to all flow rules (say only to match destination addresses),
> updating that mask to also match the source address affects all defined and
> future flow rules simultaneously.
> 
> That is how FDIR currently works and I think it is wrong, as it penalizes
> devices that do support individual bit-masks per rule, and is a little
> awkward from an application point of view.
> 
> What I suggest for the new API instead is the ability to specify one
> bit-mask per rule, and let the PMD deal with HW limitations by automatically
> configuring global bitmasks from the first added rule, then refusing to add
> subsequent rules if they specify a conflicting bit-mask. Existing rules
> remain unaffected that way, and applications do not have to be extra
> cautious.
> 
[Sugesh] The issue with that approach is, the hardware simply discards the rule
when it is a superset of the first one even though the hardware is capable of
handling it. How is it guaranteed that the first rule will set the bitmask for
all the subsequent rules?
How about having a CLASSIFIER_TYPE for the classifier. Every port can have a
set of supported flow types (e.g. L3_TYPE, L4_TYPE, L4_TYPE_8BYTE_FLEX,
L4_TYPE_16BYTE_FLEX) based on the underlying FDIR support. Application 

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-08 Thread Chandran, Sugesh
Hi Adrien,

Thank you for proposing this. It would be really useful for application such as 
OVS-DPDK.
Please find my comments and questions inline below prefixed with [Sugesh]. Most 
of them are from the perspective of enabling these APIs in application such as 
OVS-DPDK.

Regards
_Sugesh


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, July 5, 2016 7:17 PM
> To: dev at dpdk.org
> Cc: Thomas Monjalon ; Zhang, Helin
> ; Wu, Jingjing ; Rasesh
> Mody ; Ajit Khaparde
> ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern 
> Subject: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
> 
> Hi All,
> 
> First, forgive me for this large message, I know our mailboxes already
> suffer quite a bit from the amount of traffic on this ML.
> 
> This is not exactly yet another thread about how flow director should be
> extended, rather about a brand new API to handle filtering and
> classification for incoming packets in the most PMD-generic and
> application-friendly fashion we can come up with. Reasons described below.
> 
> I think this topic is important enough to include both the users of this API
> as well as PMD maintainers. So far I have CC'ed librte_ether (especially
> rte_eth_ctrl.h contributors), testpmd and PMD maintainers (with and
> without
> a .filter_ctrl implementation), but if you know application maintainers
> other than testpmd who use FDIR or might be interested in this discussion,
> feel free to add them.
> 
> The issues we found with the current approach are already summarized in
> the following document, but here is a quick summary for TL;DR folks:
> 
> - PMDs do not expose a common set of filter types and even when they do,
>   their behavior more or less differs.
> 
> - Applications need to determine and adapt to device-specific limitations
>   and quirks on their own, without help from PMDs.
> 
> - Writing an application that creates flow rules targeting all devices
>   supported by DPDK is thus difficult, if not impossible.
> 
> - The current API has too many unspecified areas (particularly regarding
>   side effects of flow rules) that make PMD implementation tricky.
> 
> This RFC API handles everything currently supported by .filter_ctrl, the
> idea being to reimplement all of these to make them fully usable by
> applications in a more generic and well defined fashion. It has a very small
> set of mandatory features and an easy method to let applications probe for
> supported capabilities.
> 
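The probing mentioned above boils down to checking a rule without creating it. A hedged sketch using the function names from the draft header; exact signatures are assumptions, and install_sw_fallback_rule() is a hypothetical application helper:

  #include <rte_flow.h> /* draft header */

  void install_sw_fallback_rule(void); /* hypothetical */

  static struct rte_flow *
  offload_or_fallback(uint8_t port_id, const struct rte_flow_attr *attr,
                      const struct rte_flow_item *pattern,
                      const struct rte_flow_action *actions)
  {
      struct rte_flow_error error;

      /* Probe first: a refused rule stays in the software classifier. */
      if (rte_flow_validate(port_id, attr, pattern, actions, &error) != 0) {
          install_sw_fallback_rule();
          return NULL;
      }
      return rte_flow_create(port_id, attr, pattern, actions, &error);
  }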
> The only downside is more work for the software control side of PMDs
> because they have to adapt to the API instead of the reverse. I think
> helpers can be added to EAL to assist with this.
> 
> HTML version:
> 
>  https://rawgit.com/6WIND/rte_flow/master/rte_flow.html
> 
> PDF version:
> 
>  https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf
> 
> Related draft header file (for reference while reading the specification):
> 
>  https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h
> 
> Git tree for completeness (latest .rst version can be retrieved from here):
> 
>  https://github.com/6WIND/rte_flow
> 
> What follows is the ReST source of the above, for inline comments and
> discussion. I intend to update that specification accordingly.
> 
> 
> Generic filter interface
> 
> 
> .. footer::
> 
>v0.6
> 
> .. contents::
> .. sectnum::
> .. raw:: pdf
> 
>PageBreak
> 
> Overview
> 
> 
> DPDK provides several competing interfaces added over time to perform
> packet matching and related actions such as filtering and classification.
> 
> They must be extended to implement the features supported by newer devices
> in order to expose them to applications; however, the current design has
> several drawbacks:
> 
> - Complicated filter combinations which have not been hard-coded cannot be
>   expressed.
> - Prone to API/ABI breakage when new features must be added to an existing
>   filter type, which frequently happens.
> 
> From an application point of view:
> 
> - Having disparate interfaces, all optional and lacking in features does not
>   make this API easy to use.
> - Seemingly arbitrary built-in limitations of filter types based on the
>   device they were initially designed for.
> - Undefined relationship between different filter types.
> - High complexity, considerable undocumented and/or undefined behavior.
> 
> Considering the growing number of devices supported by DPDK, adding a new
> filter type each time a new feature must be implemented is not sustainable
> in the long term. Applications not written to target a specific device
> cannot really benefit from such an API.
> 
> For these reasons, this document defines an extensible unified API that
> encompasses and 

[dpdk-dev] about rx checksum flags

2016-07-06 Thread Chandran, Sugesh
Hi Olivier,

Just to confirm, is this rx checksum patch already submitted on the DPDK ML?
We would like to use these flags to speed up tunneling in OVS.



Regards
_Sugesh


> -Original Message-
> From: Chandran, Sugesh
> Sent: Friday, June 10, 2016 5:16 PM
> To: 'Olivier Matz' ; Ananyev, Konstantin
> ; Stephen Hemminger
> 
> Cc: Yuanhan Liu ; dev at dpdk.org; Richardson,
> Bruce ; Adrien Mazarguil
> ; Tan, Jianfeng 
> Subject: RE: [dpdk-dev] about rx checksum flags
> 
> 
> 
> Regards
> _Sugesh
> 
> > -Original Message-
> > From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> > Sent: Wednesday, June 8, 2016 2:02 PM
> > To: Chandran, Sugesh ; Ananyev, Konstantin
> > ; Stephen Hemminger
> > 
> > Cc: Yuanhan Liu ; dev at dpdk.org;
> > Richardson, Bruce ; Adrien Mazarguil
> > ; Tan, Jianfeng 
> > Subject: Re: [dpdk-dev] about rx checksum flags
> >
> > Hi,
> >
> > On 06/08/2016 10:22 AM, Chandran, Sugesh wrote:
> > >>> I guess the IP checksum is as important as L4. In some cases, the UDP
> > >>> checksum is zero and there is no need to validate it. But the IP
> > >>> checksum is present on all packets and must be validated all the time.
> > >>> At a higher packet rate, the IP checksum offload can offer a slight
> > >>> performance improvement. What do you think?
> > >>>
> > >>
> > >> Agree, in some situations (and this is even more true with packet
> > >> types / smartnics), the application could process without accessing
> > >> the packet data if we keep the IP cksum flags.
> > > [Sugesh] True. If that's the case, will you consider implementing the
> > > IP checksum flags as well along with L4?
> > > As you said, this will be useful when we offload the packet lookup
> > > itself into the NICs (maybe when using flow director)?
> >
> > Yes, I plan to implement the same rx status flags (good, bad, unknown,
> > none) for rx IP checksum too.
> [Sugesh] That's great!, Thank you Olivier.
> >
> > Regards,
> > Olivier


[dpdk-dev] about rx checksum flags

2016-06-10 Thread Chandran, Sugesh


Regards
_Sugesh

> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, June 8, 2016 2:02 PM
> To: Chandran, Sugesh ; Ananyev, Konstantin
> ; Stephen Hemminger
> 
> Cc: Yuanhan Liu ; dev at dpdk.org; Richardson,
> Bruce ; Adrien Mazarguil
> ; Tan, Jianfeng 
> Subject: Re: [dpdk-dev] about rx checksum flags
> 
> Hi,
> 
> On 06/08/2016 10:22 AM, Chandran, Sugesh wrote:
> >>> I guess the IP checksum is as important as L4. In some cases, the UDP
> >>> checksum is zero and there is no need to validate it. But the IP
> >>> checksum is present on all packets and must be validated all the time.
> >>> At a higher packet rate, the IP checksum offload can offer a slight
> >>> performance improvement. What do you think?
> >>>
> >>
> >> Agree, in some situations (and this is even more true with packet
> >> types / smartnics), the application could process without accessing
> >> the packet data if we keep the IP cksum flags.
> > [Sugesh] True. If that's the case, will you consider implementing the
> > IP checksum flags as well along with L4?
> > As you said, this will be useful when we offload the packet lookup itself
> > into the NICs (maybe when using flow director)?
> 
> Yes, I plan to implement the same rx status flags (good, bad, unknown,
> none) for rx IP checksum too.
[Sugesh] That's great!, Thank you Olivier.
> 
> Regards,
> Olivier
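A sketch of how an application might consume the planned four-state status once it lands, assuming IP flag names patterned after the L4 proposal (PKT_RX_IP_CKSUM_GOOD/BAD plus a mask); the final names may differ, and sw_verify_ip_cksum() is a hypothetical fallback:

  #include <rte_mbuf.h>

  int sw_verify_ip_cksum(const struct rte_mbuf *m); /* hypothetical */

  /* GOOD -> trust HW, BAD -> drop or log, UNKNOWN/NONE -> decide in SW. */
  static int
  ip_cksum_ok(const struct rte_mbuf *m)
  {
      uint64_t st = m->ol_flags & PKT_RX_IP_CKSUM_MASK; /* assumed name */

      if (st == PKT_RX_IP_CKSUM_GOOD)
          return 1; /* verified by hardware, skip the SW check */
      if (st == PKT_RX_IP_CKSUM_BAD)
          return 0; /* definitely bad */
      return sw_verify_ip_cksum(m); /* UNKNOWN or NONE */
  }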


[dpdk-dev] about rx checksum flags

2016-06-08 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Friday, June 3, 2016 1:43 PM
> To: Chandran, Sugesh ; Ananyev, Konstantin
> ; Stephen Hemminger
> 
> Cc: Yuanhan Liu ; dev at dpdk.org; Richardson,
> Bruce ; Adrien Mazarguil
> ; Tan, Jianfeng 
> Subject: Re: [dpdk-dev] about rx checksum flags
> 
> Hi,
> 
> On 06/02/2016 09:42 AM, Chandran, Sugesh wrote:
> >>>> Do you also suggest to drop IP checksum flags?
> >>> > >
> >>> > > IP checksum offload is mostly useless. If application needs to
> >>> > > look at IP, it can do whole checksum in very few instructions,
> >>> > > the whole header is in the same cache line as src/dst so the HW
> >>> > > offload is really no help.
> >>> > >
> > [Sugesh] The checksum offload can boost the tunneling performance in OVS.
> > I guess the IP checksum is as important as L4. In some cases, the UDP
> > checksum is zero and there is no need to validate it. But the IP checksum
> > is present on all packets and must be validated all the time.
> > At a higher packet rate, the IP checksum offload can offer a slight
> > performance improvement. What do you think?
> >
> 
> Agree, in some situations (and this is even more true with packet types /
> smartnics), the application could process without accessing the packet data if
> we keep the IP cksum flags.
[Sugesh] True. If that's the case, will you consider implementing the IP
checksum flags as well along with L4?
As you said, this will be useful when we offload the packet lookup itself into
the NICs (maybe when using flow director)?



> 
> Regards,
> Olivier


[dpdk-dev] about rx checksum flags

2016-06-02 Thread Chandran, Sugesh
Hi Olivier,

Thank you for working on this.
A comment on the proposal is given below.


Regards
_Sugesh

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev,
> Konstantin
> Sent: Wednesday, June 1, 2016 10:07 AM
> To: Stephen Hemminger ; Olivier MATZ
> 
> Cc: Yuanhan Liu ; dev at dpdk.org; Richardson,
> Bruce ; Adrien Mazarguil
> ; Tan, Jianfeng 
> Subject: Re: [dpdk-dev] about rx checksum flags
> 
> 
> 
> > -Original Message-
> > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > Sent: Tuesday, May 31, 2016 11:03 PM
> > To: Olivier MATZ
> > Cc: Yuanhan Liu; dev at dpdk.org; Ananyev, Konstantin; Richardson, Bruce;
> > Adrien Mazarguil; Tan, Jianfeng
> > Subject: Re: [dpdk-dev] about rx checksum flags
> >
> > On Tue, 31 May 2016 22:58:57 +0200
> > Olivier MATZ  wrote:
> >
> > > Hi Stephen,
> > >
> > > On 05/31/2016 10:28 PM, Stephen Hemminger wrote:
> > > > On Tue, 31 May 2016 21:11:59 +0200 Olivier MATZ
> > > >  wrote:
> > > >
> > > >>
> > > >>
> > > >> On 05/31/2016 10:09 AM, Yuanhan Liu wrote:
> > > >>> On Mon, May 30, 2016 at 05:26:21PM +0200, Olivier Matz wrote:
> > >   PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the
> > >  packet  data, but the integrity of the L4 header is verified.
> > >    -> the application can process the packet but must not verify the
> > >   checksum by sw. It has to take care to recalculate the cksum
> > >   if the packet is transmitted (either by sw or using tx
> > >  offload)
> > > >>>
> > > >>> I like the explanation you made at [1] better :)
> > > >>>
> > > >>> So in general, I think this proposal is good to have.
> > > >>
> > > >> Thanks everyone for your feedback.
> > > >>
> > > >> I'll try to send a first patch proposition soon.
> > > >>
> > > >> Regards,
> > > >> Olivier
> > > >
> > > > I think it is time to ditch the old definitions of Rx checksum and
> > > > instead use something more compatible with virtio (and Linux), i.e.
> > > > have three values:
> > > >   1) checksum is known good for packet contents
> > > >   2) checksum value is the one's complement for packet contents
> > > >   3) checksum is undetermined
> > > > The original definition seems to be Intel HW centric and applies
> > > > to a limited range of devices, making it unusable by general applications.
> > > >
> > > > Break the ABI, and ditch the old values (ok mark
> > > > PKT_RX_L4_CKSUM_BAD as __deprecated and remove all usage).
> > > >
> > >
> > > Don't you think knowing that a checksum is bad could be useful?
> >
> > Not really. They should be marked as undetermined; then software can
> > recheck for the possibly buggy hardware.
> 
> Hmm, I don't see the point here.
> If the HW clearly reports that a checksum is invalid (not unknown), why does
> SW have to assume it is 'undetermined' and recheck it?
> To me that means just wasted cycles.
> In general, it sounds like a really strange approach to me:
> write your SW with the assumption that all HW you are going to use will not
> work correctly.
> 
> >
> > > In that case the application can drop/log the packet without any
> > > additional cpu cost.
> > >
> > > What do you mean by beeing unusable by general application?
> >
> > Right now the application can only see "known bad" or "indeterminate";
> > there is no way to know which packets are good. Since good is the
> > desired/expected case, software has to checksum every packet.
> >
> > >
> > > I think the "2)" also requires a csum_start + csum_offset in mbuf
> > > structure, right?
> >
> > Not really, it would mean having a way to get the raw one's complement
> > sum out of the hardware. This is a good way to support the tunnel
> > protocol du jour without having to have firmware support.
> > Unfortunately, most hardware vendors don't believe in doing it that way.
> 
> It might be a good feature to have, but if most HW vendors don't support it,
> why bother?
> 
> >
> >
> > > Do you also suggest to drop IP checksum flags?
> >
> > IP checksum offload is mostly useless. If the application needs to look at
> > IP, it can do the whole checksum in very few instructions; the whole
> > header is in the same cache line as src/dst, so the HW offload is really
> > no help.
> >
[Sugesh] The checksum offload can boost the tunneling performance in OVS.
I guess the IP checksum is as important as L4. In some cases, the UDP checksum
is zero and there is no need to validate it. But the IP checksum is present on
all packets and must be validated all the time. At a higher packet rate, the
IP checksum offload can offer a slight performance improvement. What do you
think?

> > >
> > > Will it be possible to manage tunnel checksums?
> > >
> > > I think this would be a pretty big change. If there is no additional
> > > argument other than being more compatible with virtio/linux, I'm
> > > wondering if it's worth breaking the API. Let's wait for other opinions.
> 
> I think that what Olivier proposed is good enough and definitely a step
> forward from what we have right 

[dpdk-dev] Fw: RE: DPDK vhostuser with vxlan# Does issue with igb_uio in ovs+dpdk setup

2016-02-04 Thread Chandran, Sugesh
Hi Abhijeet,

It looks to me that the ARP entries may not be populated correctly for the
VxLAN ports in OVS.
Can you please refer to the debug section in
http://openvswitch.org/support/config-cookbooks/userspace-tunneling/
to verify, and insert the right ARP entries in case they are missing?


Regards
_Sugesh


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Abhijeet Karve
> Sent: Thursday, February 4, 2016 2:55 PM
> To: Czesnowicz, Przemyslaw 
> Cc: dev at dpdk.org; discuss at openvswitch.org
> Subject: Re: [dpdk-dev] Fw: RE: DPDK vhostuser with vxlan# Does issue with
> igb_uio in ovs+dpdk setup
> 
> Hi All,
> 
> Is the issue which we are facing, as described in the previous threads,
> because of setting up OVS+DPDK with the igb_uio driver instead of vfio_pci?
> 
> Would appreciate any suggestions on this.
> 
> Thanks & Regards
> Abhijeet Karve
> 
> 
> To: przemyslaw.czesnowicz at intel.com
> From: Abhijeet Karve
> Date: 01/30/2016 07:32PM
> Cc: "dev at dpdk.org" , "discuss at openvswitch.org"
> , "Gray, Mark D" 
> Subject: Fw: RE: [dpdk-dev] DPDK vhostuser with vxlan
> 
> 
>  Hi przemek,
> 
> 
> We have set up a VxLAN tunnel between our two compute nodes and can see the
> traffic on the vxlan port on br-tun of the source instance's compute node.
> 
> We are in the same situation described in the thread below; I looked through
> the dev mailing archives for it, but it seems no one has responded to it.
> 
> 
> http://comments.gmane.org/gmane.linux.network.openvswitch.general/9878
> 
> Would really appreciate it if you could provide us any suggestions on it.
> 
> 
> 
> Thanks & Regards
> Abhijeet Karve
> 
>  -Forwarded by on 01/30/2016 07:24PM -
> 
>  ===
>  To: "Czesnowicz, Przemyslaw" 
>  From: Abhijeet Karve/AHD/TCS at TCS
>  Date: 01/27/2016 09:52PM
>  Cc: "dev at dpdk.org" , "discuss at openvswitch.org"
> , "Gray, Mark D" 
>  Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved#
> Inter-VM communication & IP allocation through DHCP issue
> ===
>Hi Przemek,
> 
> Thanks for the quick response. We are now able to get the DHCP IPs for the 2
> vhostuser instances, and they are able to ping each other. The issue was a
> bug in the cirros 0.3.0 images which we were using in OpenStack; after using
> the 0.3.1 image as given in the URL
> (https://www.redhat.com/archives/rhos-list/2013-August/msg00032.html), we
> are able to get the IPs in the vhostuser VM instances.
> 
> As per our understanding, the packet flow across the DPDK datapath will be:
> vhostuser ports are connected to the br-int bridge, and the same is patched
> to the br-dpdk bridge, where our physical network (NIC) is connected to the
> dpdk0 port.
> 
> So for testing the flow we have to connect that physical network (NIC) to an
> external packet generator (e.g. Ixia, iperf) and run the testpmd application
> in the vhostuser VM, right?
> 
> Is it required to add any flows in the bridge configurations (either br-int
> or br-dpdk)?
> 
> 
> Thanks & Regards
> Abhijeet Karve
> 
> 
> 
> 
> From: "Czesnowicz, Przemyslaw" 
> To: Abhijeet Karve 
> Cc: "dev at dpdk.org" , "discuss at openvswitch.org"
> , "Gray, Mark D" 
> Date: 01/27/2016 05:11 PM
> Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved#
> Inter-VM communication & IP allocation through DHCP issue
> 
> 
> 
> Hi Abhijeet,
> 
> 
> It seems you are almost there!
> When booting the VMs do you request hugepage memory for them
> (by setting hw:mem_page_size=large in flavor extra_spec)?
> If not then please do, if yes then please look into libvirt logfiles for the
> VMs (in /var/log/libvirt/qemu/instance-xxx), I think there could be a
> clue.
> 
> 
> Regards
> Przemek
> 
> From: Abhijeet Karve [mailto:abhijeet.karve at tcs.com]
> Sent: Monday, January 25, 2016 6:13 PM
> To: Czesnowicz, Przemyslaw
> Cc: dev at dpdk.org; discuss at openvswitch.org; Gray, Mark D
> Subject: RE: [dpdk-dev] DPDK OVS on Ubuntu 14.04# Issue's Resolved#
> Inter-VM communication & IP allocation through DHCP issue
> 
> Hi Przemek,
> 
> Thank you for your response; it really provided us a breakthrough.
> 
> After setting up DPDK on the compute node for stable/kilo, we are trying to
> set up an OpenStack stable/liberty all-in-one setup. At present we are not
> able to get IP allocation for the vhost type instances through DHCP. We also
> tried assigning IPs manually to them, but the inter-VM communication is not
> happening either,
> 
> #neutron agent-list
> root at nfv-dpdk-devstack:/etc/neutron# neutron agent-list
> +--++---+---+--
> --+---+
> | id   | agent_type | host
>   | alive | admin_state_up
> | binary|
> +--++---+---+--
> --+---+
> | 3b29e93c-3a25-4f7d-bf6c-6bb309db5ec0 | DPDK OVS Agent | nfv-dpdk-
>