> >-----Original Message----- > >From: Stokes, Ian <ian.sto...@intel.com> > >Sent: 21. marts 2018 14:38 > >To: Finn Christensen <f...@napatech.com>; 'Yuanhan Liu' > ><y...@fridaylinux.org>; d...@openvswitch.org; Shahaf Shuler > ><shah...@mellanox.com> > >Cc: Darrell Ball <db...@vmware.com>; Chandran, Sugesh > ><sugesh.chand...@intel.com>; Simon Horman <simon.hor...@netronome.com> > >Subject: RE: [PATCH v7 3/6] netdev-dpdk: implement flow offload with > >rte flow > > > >> -----Original Message----- > >> From: Finn Christensen [mailto:f...@napatech.com] > >> Sent: Friday, March 16, 2018 12:44 PM > >> To: Stokes, Ian <ian.sto...@intel.com>; 'Yuanhan Liu' > >> <y...@fridaylinux.org>; d...@openvswitch.org; Shahaf Shuler > >> <shah...@mellanox.com> > >> Cc: Darrell Ball <db...@vmware.com>; Chandran, Sugesh > >> <sugesh.chand...@intel.com>; Simon Horman > ><simon.hor...@netronome.com> > >> Subject: RE: [PATCH v7 3/6] netdev-dpdk: implement flow offload with > >> rte flow > >> > >> Hi Ian, > >> > >> Thanks for the review. > >> I'll fix the compilation issues and the comments you have below. > >> Please see my answers below. > >> > >> Thanks > >> Finn > >> > >> >-----Original Message----- > >> >From: Stokes, Ian <ian.sto...@intel.com> > >> >Sent: 12. marts 2018 16:00 > >> >To: 'Yuanhan Liu' <y...@fridaylinux.org>; d...@openvswitch.org > >> >Cc: Finn Christensen <f...@napatech.com>; Darrell Ball > >> ><db...@vmware.com>; Chandran, Sugesh <sugesh.chand...@intel.com>; > >> >Simon Horman <simon.hor...@netronome.com> > >> >Subject: RE: [PATCH v7 3/6] netdev-dpdk: implement flow offload with > >> >rte flow > >> > > >> >> -----Original Message----- > >> >> From: Yuanhan Liu [mailto:y...@fridaylinux.org] > >> >> Sent: Monday, January 29, 2018 7:00 AM > >> >> To: d...@openvswitch.org > >> >> Cc: Stokes, Ian <ian.sto...@intel.com>; Finn Christensen > >> >> <f...@napatech.com>; Darrell Ball <db...@vmware.com>; Chandran, > >> >> Sugesh <sugesh.chand...@intel.com>; Simon Horman > >> >> <simon.hor...@netronome.com>; Yuanhan Liu <y...@fridaylinux.org> > >> >> Subject: [PATCH v7 3/6] netdev-dpdk: implement flow offload with > >> >> rte flow > >> >> > >> >> From: Finn Christensen <f...@napatech.com> > >> >> > >> >> The basic yet the major part of this patch is to translate the > "match" > >> >> to rte flow patterns. And then, we create a rte flow with MARK + > >> >> RSS actions. Afterwards, all packets match the flow will have the > >> >> mark id in the mbuf. > >> >> > >> >> The reason RSS is needed is, for most NICs, a MARK only action is > >> >> not allowed. It has to be used together with some other actions, > >> >> such as QUEUE, RSS, etc. However, QUEUE action can specify one > >> >> queue only, which may break the rss. Likely, RSS action is > >> >> currently the best we could > >> >now. > >> >> Thus, RSS action is choosen. > >> >> > >> >> For any unsupported flows, such as MPLS, -1 is returned, meaning > >> >> the flow offload is failed and then skipped. > >> >> > >> >> Co-authored-by: Yuanhan Liu <y...@fridaylinux.org> > >> >> Signed-off-by: Finn Christensen <f...@napatech.com> > >> >> Signed-off-by: Yuanhan Liu <y...@fridaylinux.org> > >> > > >> >Hi Finn, > >> > > >> >Thanks for working on this. There are compilation errors introduced > >> >by this commit, details below along with other comments. > >> > > >> >For completeness here is the link to the OVS Travis build with > >> >associated failures > >> > > >> >https://travis-ci.org/istokes/ovs/builds/352283122 > >> > > >> > > >> >Thanks > >> >Ian > >> > > >> >> --- > >> >> > >> >> v7: - set the rss_conf for rss action to NULL, to workaround a > >> >> mlx5 > >> change > >> >> in DPDK v17.11. Note that it will obey the rss settings > >> >> OVS-DPDK > >> has > >> >> set in the beginning. Thus, nothing should be effected. > >> >> > >> >> --- > >> >> lib/netdev-dpdk.c | 559 > >> >> +++++++++++++++++++++++++++++++++++++++++++++++++++++- > >> >> 1 file changed, 558 insertions(+), 1 deletion(-) > >> >> > >> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > >> >> ac2e38e..4bd0503 > >> >> 100644 > >> >> --- a/lib/netdev-dpdk.c > >> >> +++ b/lib/netdev-dpdk.c > >> >> @@ -38,7 +38,9 @@ > >> >> #include <rte_pci.h> > >> >> #include <rte_vhost.h> > >> >> #include <rte_version.h> > >> >> +#include <rte_flow.h> > >> >> > >> >> +#include "cmap.h" > >> >> #include "dirs.h" > >> >> #include "dp-packet.h" > >> >> #include "dpdk.h" > >> >> @@ -60,6 +62,7 @@ > >> >> #include "sset.h" > >> >> #include "unaligned.h" > >> >> #include "timeval.h" > >> >> +#include "uuid.h" > >> >> #include "unixctl.h" > >> >> > >> >> enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM}; @@ -3633,6 > >+3636,560 > >> >@@ > >> >> unlock: > >> >> return err; > >> >> } > >> >> > >> >> +/* > >> >> + * A mapping from ufid to dpdk rte_flow. > >> >> + */ > >> >> +static struct cmap ufid_to_rte_flow = CMAP_INITIALIZER; > >> >> + > >> >> +struct ufid_to_rte_flow_data { > >> >> + struct cmap_node node; > >> >> + ovs_u128 ufid; > >> >> + struct rte_flow *rte_flow; > >> >> +}; > >> > > >> >Might be cleaner to declare above along with the other structs/maps > >> >specific to netdev-dpdk.c towards the beginning of this file. > >> > >> Ok done. > >> > >> > > >> >> + > >> >> +/* Find rte_flow with @ufid */ > >> >> +static struct rte_flow * > >> >> +ufid_to_rte_flow_find(const ovs_u128 *ufid) { > >> >> + size_t hash = hash_bytes(ufid, sizeof(*ufid), 0); > >> >> + struct ufid_to_rte_flow_data *data; > >> >> + > >> >> + CMAP_FOR_EACH_WITH_HASH (data, node, hash, > >&ufid_to_rte_flow) { > >> >> + if (ovs_u128_equals(*ufid, data->ufid)) { > >> >> + return data->rte_flow; > >> >> + } > >> >> + } > >> >> + > >> >> + return NULL; > >> >> +} > >> >> + > >> >> +static inline void > >> >> +ufid_to_rte_flow_associate(const ovs_u128 *ufid, struct rte_flow > >> >> +*rte_flow) { > >> >> + size_t hash = hash_bytes(ufid, sizeof(*ufid), 0); > >> >> + struct ufid_to_rte_flow_data *data = xzalloc(sizeof(*data)); > >> >> + > >> >> + /* > >> >> + * We should not simply overwrite an existing rte flow. > >> >> + * We should have deleted it first before re-adding it. > >> >> + * Thus, if following assert triggers, something is wrong: > >> >> + * the rte_flow is not destroyed. > >> >> + */ > >> >> + ovs_assert(ufid_to_rte_flow_find(ufid) == NULL); > >> >> + > >> >> + data->ufid = *ufid; > >> >> + data->rte_flow = rte_flow; > >> >> + > >> >> + cmap_insert(&ufid_to_rte_flow, > >> >> + CONST_CAST(struct cmap_node *, &data->node), > >> >> + hash); } > >> >> + > >> >> +static inline void > >> >> +ufid_to_rte_flow_disassociate(const ovs_u128 *ufid) { > >> >> + size_t hash = hash_bytes(ufid, sizeof(*ufid), 0); > >> >> + struct ufid_to_rte_flow_data *data; > >> >> + > >> >> + CMAP_FOR_EACH_WITH_HASH (data, node, hash, > >&ufid_to_rte_flow) { > >> >> + if (ovs_u128_equals(*ufid, data->ufid)) { > >> >> + cmap_remove(&ufid_to_rte_flow, > >> >> + CONST_CAST(struct cmap_node *, > >> >> + &data->node), > >> >> hash); > >> >> + free(data); > >> >> + return; > >> >> + } > >> >> + } > >> >> + > >> >> + VLOG_WARN("ufid "UUID_FMT" is not associated with an rte > >flow\n", > >> >> + UUID_ARGS((struct uuid *)ufid)); } > >> >> + > >> >> +struct flow_patterns { > >> >> + struct rte_flow_item *items; > >> >> + int cnt; > >> >> + int max; > >> >> +}; > >> >> + > >> >> +struct flow_actions { > >> >> + struct rte_flow_action *actions; > >> >> + int cnt; > >> >> + int max; > >> >> +}; > >> >> + > >> >> +static void > >> >> +add_flow_pattern(struct flow_patterns *patterns, enum > >> >> +rte_flow_item_type > >> >> type, > >> >> + const void *spec, const void *mask) { > >> >> + int cnt = patterns->cnt; > >> >> + > >> >> + if (cnt == 0) { > >> >> + patterns->max = 8; > >> >> + patterns->items = xcalloc(patterns->max, sizeof(struct > >> >> rte_flow_item)); > >> >> + } else if (cnt == patterns->max) { > >> >> + patterns->max *= 2; > >> >> + patterns->items = xrealloc(patterns->items, patterns->max * > >> >> + sizeof(struct rte_flow_item)); > >> >> + } > >> > > >> >Just a general query about max value above, so if cnt == 0 you set > >> >the max to 8. However if cnt is == max you then double the value of > max. > >> > > >> >What is the purpose of max and what is it's limit? Maybe it becomes > >> >clearer later in the patch but at the moment it seems 8 is the > >> >default max however it can be higher, I just find the behavior a > >> >little > >> misleading. > >> > >> The rte flow pattern item part is allocated in one allocation and > >> then grows with xrealloc if needed. It initializes with 8 elements > >> and then grows by doubling the number of elements. This is done to > >> avoid a maximum on the number of items and also not allocate for each > >> single item element individually, basically to simplify the code. > >> There is no limit other than when xalloc fails and then ovs_abort is > >> called. > >> If the "max" wording is a bit misleading, we can change it. Maybe > >> "current_max" is better? > > > >Ah ok, that makes a bit more sense. 'current_max' would be fine with a > >small comment explaining that we want to avoid individual allocations. > > OK, I'll do that. > > > > >In the case where the xrealoc fails I can see why allocating on an > >individual babsis would be poor here. > > > >In the case where no more items can be allocated can we fall back to > >the previous number allocated and flag an error? > > It uses xrealloc. And when it fails, it calls out_of_memory and finally > abort. > These allocations are really small, and we will probably have a limited > number of items, even usually below 8. Furthermore, if it fails that small > an allocation I think it is OK to have xrealloc flag an out_of_memory. It > makes the code simpler. > If you still think it is worth implementing code for failing a flow > offload caused by an xrealloc failure, let me know. >
And once we've exhausted memory there's not much can be done at that point form a user's perspective. OK this approach as is is fine. We can always revisit down the road if it does become an issue. Ian _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev