Hi,

Adding a small comment on top of Adrien's.
Tuesday, April 10, 2018 1:20 PM, Adrien Mazarguil:
> On Mon, Apr 09, 2018 at 05:10:35PM +0100, Mohammad Abdul Awal wrote:
> > On 06/04/2018 21:26, Adrien Mazarguil wrote:
> > > On Fri, Apr 06, 2018 at 01:24:01PM +0100, Declan Doherty wrote:
> > > > Add new flow action types and associated action data structures to
> > > > support the encapsulation and decapsulation of the virtual tunnel
> > > > endpoints.
> > > >
> > > > The RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP action will cause the
> > > > matching flow to be encapsulated in the virtual tunnel endpoint
> > > > overlay defined in the tunnel_encap action data.
> > > >
> > > > The RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP action will cause all
> > > > virtual tunnel endpoint overlays up to and including the first
> > > > instance of the flow item type defined in the tunnel_decap action
> > > > data to be removed from the matching flows.
> > > >
> > > > Signed-off-by: Declan Doherty <declan.dohe...@intel.com>
> > >
> > > This generic approach looks flexible enough to cover the use cases
> > > that immediately come to mind (VLAN, VXLAN); its design is sound.
> > >
> > > However, while I'm aware it's not a concern at this point, it won't
> > > be able to deal with stateful tunnel or encapsulation types (e.g.
> > > IPsec or TCP), which will require additional metadata or some
> > > run-time assistance from the application.
> > >
> > > Eventually, for more complex use cases, dedicated encap/decap
> > > actions will have to appear, so the issue I wanted to raise before
> > > going further is this:
> > >
> > > Going generic inevitably trades away some usability; flat
> > > structures dedicated to VXLAN encap/decap with only the info needed
> > > to get the job done would likely be easier to implement in PMDs and
> > > use in applications. Any number of such actions can be added to
> > > rte_flow without ABI impact.
> > >
> > > If VXLAN is the only use case at this point, my suggestion would be
> > > to go with simpler RTE_FLOW_ACTION_TYPE_VXLAN_(ENCAP|DECAP)
> > > actions, with fixed L2/L3/L4/L5 header definitions to prepend
> > > according to RFC 7348.
> >
> > We can go this way, but this will multiply the number of actions as
> > more and more tunneling protocols are added. The current proposal is
> > already a generic approach which covers all the tunneling protocols
> > with a single pair of actions.
>
> Right; on the other hand there are not that many standard
> encapsulations offloaded by existing devices. rte_flow could easily
> handle dedicated actions for each of them without problem.
>
> My point is that many of those (will eventually) have their own quirks
> to manage when doing encap/decap; it's not just a matter of prepending
> or removing a bunch of header definitions, otherwise we could as well
> let applications simply provide an arbitrary buffer to prepend.
>
> Consider that the "generic" part is already built into rte_flow as the
> way patterns and actions are handled. Adding another generic layer on
> top of that could make things more inconvenient than necessary for
> applications (my main concern).
>
> You'd need another layer of validation/error reporting machinery to
> properly let applications know they cannot encap VXLAN on top of TCP
> on top of QinQinQinQinQ, for instance. Either a single bounded
> encapsulation definition or a combination at the action list level is
> needed to avoid that.
>
> > > Now we can start with the generic approach, see how it fares and
> > > add dedicated encap/decap later as needed.
> > >
> > > More comments below.
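To make the fixed-header alternative concrete, here is a rough sketch of
what such a dedicated VXLAN encap action could look like. Nothing below
exists in rte_flow today; the struct and field names are purely
illustrative:

/* Purely illustrative sketch of a dedicated VXLAN encap action as
 * suggested above; neither the action type nor this struct exists.
 * Header types are the existing ones from rte_ether.h, rte_ip.h and
 * rte_udp.h, fixed per RFC 7348. */
struct rte_flow_action_vxlan_encap {
	struct ether_hdr eth;   /* Outer L2 header to prepend. */
	struct ipv4_hdr ipv4;   /* Outer L3 header. */
	struct udp_hdr udp;     /* Outer L4 header, dst port 4789. */
	struct vxlan_hdr vxlan; /* VXLAN header carrying the 24-bit VNI. */
};

A flat struct like this leaves nothing open for interpretation, which is
exactly the usability argument being made.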
> <snip>
> > > > +Action: ``TUNNEL_ENCAP``
> > > > +^^^^^^^^^^^^^^^^^^^^^^^^

The ENCAP/DECAP doesn't have to be in the context of a tunnel. For
example, take GRE: an application may want to decap the GRE and encap
with L2. The L2 encapsulation is not related to any tunnel. Same for the
other direction: a VM sends an Eth frame, and we want to decap the Eth
and encap with GRE. I think those actions should be free from the tunnel
association and just take the flow items we want to encap/decap, or, more
generically, an offset into the packet headers plus a buffer to encap
(not sure how many devices support that; it may be overkill at this
point).

> > > > +
> > > > +Performs an encapsulation action by encapsulating the flows
> > > > +matched by the pattern items according to the network overlay
> > > > +defined in the ``rte_flow_action_tunnel_encap`` pattern items.
> > > > +
> > > > +This action modifies the payload of matched flows. The pattern
> > > > +items specified in the ``rte_flow_action_tunnel_encap`` action
> > > > +structure must define a valid set of overlay headers, from the
> > > > +Ethernet header up to the overlay header. The pattern must be
> > > > +terminated with the RTE_FLOW_ITEM_TYPE_END item type.
> > >
> > > Regarding the use of a pattern list, if you consider PMDs are
> > > already iterating on a list of actions when encountering
> > > RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, it adds yet another inner
> > > loop.
> >
> > We understand that this is implementation-specific. If we do not go
> > for another inner loop, all the bundling needs to be handled in the
> > same function, which seems more clumsy to me. This also breaks the
> > tunnel endpoint concept.
> >
> > > How about making each encountered
> > > RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP provide exactly one item
> > > instead (in encap, i.e. reverse order)?
> >
> > Again, if we have a tunnel action, a security action, and other
> > actions, all the processing and tracking need to be done in one
> > function. Now we will need ETH_ENCAP/DECAP, UDP_ENCAP/DECAP,
> > NVGRE_ENCAP/DECAP, etc.
>
> Well, the number of DECAP actions doesn't need to perfectly reflect
> that of ENCAP since it implies all preceding layers. No problem with
> that.
>
> Regarding multiple dedicated actions, my suggestion was for a single
> generic one as in this patch, but each instance on the ENCAP side
> would deal with a single protocol layer, instead of having a single
> ENCAP action with multiple inner layers (and thus an inner loop).
>
> PMDs also gain the ability to precisely report which encap step fails
> by making rte_flow_error point to the problematic object to ease
> debugging of flow rules on the application side.
>
> Why would that break the tunnel idea and, more importantly, how would
> it prevent PMD developers from splitting their processing into
> multiple functions?
>
> > > In which case perhaps "GENERIC" would be a better fit than
> > > "TUNNEL".
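If it helps, here is roughly how the one-layer-per-action variant could
look from the application side. The GENERIC_ENCAP action type is
hypothetical and the conf layout (a bare pattern item spec) is only a
guess:

/* One hypothetical GENERIC_ENCAP action per protocol layer. Layers are
 * listed in encap order per the note above; the exact ordering
 * convention would still need to be pinned down. The specs would be
 * filled with the desired outer header values. */
struct rte_flow_item_eth eth_spec = { 0 };     /* outer MACs */
struct rte_flow_item_ipv4 ipv4_spec = { 0 };   /* outer IP addresses */
struct rte_flow_item_udp udp_spec = { 0 };     /* outer ports */
struct rte_flow_item_vxlan vxlan_spec = { 0 }; /* VNI */

struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_GENERIC_ENCAP, .conf = &eth_spec },
	{ .type = RTE_FLOW_ACTION_TYPE_GENERIC_ENCAP, .conf = &ipv4_spec },
	{ .type = RTE_FLOW_ACTION_TYPE_GENERIC_ENCAP, .conf = &udp_spec },
	{ .type = RTE_FLOW_ACTION_TYPE_GENERIC_ENCAP, .conf = &vxlan_spec },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};

With this shape, a failing layer maps directly to one action entry for
error reporting, which is the debugging benefit described above.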
> > > > <snip>
> > > > +
> > > > + +-------+--------------------------+------------+
> > > > + | Index | Flow Item Type           | Flow Item  |
> > > > + +=======+==========================+============+
> > > > + | 0     | RTE_FLOW_ITEM_TYPE_ETH   | eth item   |
> > > > + +-------+--------------------------+------------+
> > > > + | 1     | RTE_FLOW_ITEM_TYPE_IPV4  | ipv4 item  |
> > > > + +-------+--------------------------+------------+
> > > > + | 2     | RTE_FLOW_ITEM_TYPE_UDP   | udp item   |
> > > > + +-------+--------------------------+------------+
> > > > + | 3     | RTE_FLOW_ITEM_TYPE_VXLAN | vxlan item |
> > > > + +-------+--------------------------+------------+
> > > > + | 4     | RTE_FLOW_ITEM_TYPE_END   | NULL       |
> > > > + +-------+--------------------------+------------+
> > >
> > > One possible issue is that it relies on objects normally found on
> > > the pattern side of flow rules. Those are supposed to match
> > > something; they are not intended for packet header generation.
> > > While their "spec" and "mask" fields might make sense in this
> > > context, the "last" field is odd.
> > >
> > > You must define them without leaving anything open for
> > > interpretation by PMDs and users alike. Defining things as
> > > "undefined" is fine as long as it's covered.
> >
> > Please note that the "void *item" in the
> > "rte_flow_action_tunnel_encap.pattern" points to the data structure
> > defined for the corresponding rte_flow_item_type instead of a
> > rte_flow_item structure. As an example, for the rte_flow_item_eth
> > type, the "void *item" will point to a "struct rte_flow_item_eth"
> > instance. That's why we have defined struct rte_flow_action_item
> > inside struct rte_flow_action_tunnel_encap. So, no question of
> > spec, mask, last anymore.
>
> Right, I noticed that after commenting its structure definition below.
>
> I think I won't be the only one confused by this approach, also
> because a mask is needed in addition to a specification structure;
> otherwise how do you plan to tell what fields are relevant in
> application-provided protocol headers?
>
> An application might set unusual IPv4/UDP/VXLAN fields and expect
> them to be part of the encapsulated traffic. Without a mask, a PMD
> must take headers verbatim, and I don't think many devices are ready
> for that yet.
>
> Hence my other suggestion: defining inflexible
> $PROTOCOL_(ENCAP|DECAP) actions that do not allow more than what's
> defined by official RFCs for $PROTOCOL.
>
> <snip>
> > > > + */
> > > > +struct rte_flow_action_tunnel_encap {
> > > > +	struct rte_flow_action_item {
> > > > +		enum rte_flow_item_type type;
> > > > +		/**< Flow item type. */
> > > > +		const void *item;
> > > > +		/**< Flow item definition which points to the data of
> > > > +		 * corresponding rte_flow_item_type.
> > > > +		 */
> > >
> > > I see it's a new action type, albeit a bit confusing (there is no
> > > RTE_FLOW_ACTION_TYPE_ITEM).
> > >
> > > I suggest the standard pattern item type since you're going with
> > > enum rte_flow_item_type anyway. Keep in mind you need some kind of
> > > mask to tell what fields are relevant. An application might
> > > otherwise want to encap with unsupported properties (e.g. a
> > > specific IPv4 ToS field and whatnot).
> > >
> > > How about a single "struct rte_flow_pattern_item item", neither
> > > const nor a pointer? It's generic enough; the enclosed
> > > spec/last/mask pointers take care of the specifics. You just need
> > > to define what's supposed to happen when "last" is set.
> >
> > Please see the comment above regarding this field.
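For reference, the table above would translate to something like this on
the application side. I'm assuming the outer struct exposes the item
array through a ".pattern" member as the discussion implies; the *_item
variables would be filled with the desired outer header values:

/* Sketch of filling the proposed action for the ETH/IPV4/UDP/VXLAN
 * overlay in the table above, terminated by an END entry. */
struct rte_flow_item_eth eth_item = { 0 };     /* outer L2 header */
struct rte_flow_item_ipv4 ipv4_item = { 0 };   /* outer L3 header */
struct rte_flow_item_udp udp_item = { 0 };     /* outer L4 header */
struct rte_flow_item_vxlan vxlan_item = { 0 }; /* VXLAN header/VNI */

struct rte_flow_action_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH,   .item = &eth_item },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4,  .item = &ipv4_item },
	{ .type = RTE_FLOW_ITEM_TYPE_UDP,   .item = &udp_item },
	{ .type = RTE_FLOW_ITEM_TYPE_VXLAN, .item = &vxlan_item },
	{ .type = RTE_FLOW_ITEM_TYPE_END,   .item = NULL },
};
struct rte_flow_action_tunnel_encap encap = { .pattern = pattern };

Note this illustrates the mask concern: nothing here tells the PMD which
fields of eth_item and friends it must honor verbatim.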
> Point still stands; if you need to distinguish spec and mask, a more
> complete structure is needed. Instead of adding a new (confusing)
> type, you should use rte_flow_item and define what happens when
> "last" is set.
>
> You should define it as reserved for now; any non-NULL value is an
> error. This field might also be useful later.
>
> <snip>
> > > > +};
> > > > +
> > > > +/**
> > > > + * RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP
> > > > + *
> > > > + * Virtual tunnel end-point decapsulation action data.
> > > > + *
> > > > + * Non-terminating action by default.
> > > > + */
> > > > +struct rte_flow_action_tunnel_decap {
> > > > +	enum rte_flow_item_type type;
> > > > +	/**<
> > > > +	 * Flow item type of virtual tunnel end-point to be
> > > > +	 * decapsulated
> > > > +	 */
> > > > +};
> > >
> > > Note that contrary to ENCAP, DECAP wouldn't necessarily need
> > > repeated actions to peel each layer off. The current definition
> > > is fine.
> >
> > To clarify, the decap leaves it up to the PMD to remove all the
> > headers for a specified type. For example, for rte_flow_item_type
> > type=RTE_FLOW_ITEM_TYPE_VXLAN, the PMD will peel off the (ETH,
> > IPV4, UDP, VXLAN) headers all together.
>
> Yep, that's fine. Whether we use multiple actions or a single one
> doing multiple things, a single DECAP can peel them all off at once :)
>
> > > > +
> > > > +/**
> > > > * Definition of a single action.
> > > > *
> > > > * A list of actions is terminated by a END action.
> > > > --
> > > > 2.7.4
> > >
>
> If the reasons I gave did not manage to convince you about choosing
> between either fixed (VLAN|VXLAN)_(ENCAP|DECAP) actions or generic
> encap/decap actions that deal with a single protocol layer at once
> instead of the proposed approach, I'm ready to try it out as an
> experimental API (all new objects tagged as experimental) *if* you
> address the lack of mask, which remains an open issue.
>
> --
> Adrien Mazarguil
> 6WIND
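To close with a concrete picture of the DECAP semantics clarified above,
a minimal sketch against the proposed structs (nothing here is final):

/* A single DECAP action with type = VXLAN asks the PMD to peel
 * ETH/IPV4/UDP/VXLAN off together, per the clarification above. */
struct rte_flow_action_tunnel_decap decap = {
	.type = RTE_FLOW_ITEM_TYPE_VXLAN,
};
const struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP, .conf = &decap },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};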