Hi Antonio,

Thank you for the patches.
Please find my comments below.

Regards
_Sugesh

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Fischetti, Antonio
> Sent: Tuesday, June 6, 2017 5:15 PM
> To: Darrell Ball <[email protected]>; [email protected]
> Subject: Re: [ovs-dev] [PATCH RFC v2] Conntrack: Avoid recirculation for established connections.
>
> Thanks Darrell for your useful comments. I've tried to replicate the use cases
> you mentioned; please find some details inline.
>
> Regards,
> Antonio
>
> > -----Original Message-----
> > From: Darrell Ball [mailto:[email protected]]
> > Sent: Thursday, June 1, 2017 6:20 PM
> > To: Fischetti, Antonio <[email protected]>; [email protected]
> > Subject: Re: [ovs-dev] [PATCH RFC v2] Conntrack: Avoid recirculation for established connections.
> >
> > Comments inline
> >
> > On 5/29/17, 8:24 AM, "[email protected] on behalf of
> > Fischetti, Antonio" <[email protected] on behalf of
> > [email protected]> wrote:
> >
> >     Thanks Joe for your feedback and the interesting insights into
> >     conntrack in your earlier communication.
> >     We have added all the details that we considered for this first
> >     implementation. Also, some answers are inline.
> >
> >     The purpose of this implementation is to avoid recirculation just
> >     for those packets that are part of established connections.
> >
> >     This shouldn't affect packet recirculation for actions other than
> >     conntrack. For example with MPLS, after a pop_mpls action the packet
> >     will still be recirculated to follow the usual datapath.
> >
> >     Most importantly, the CT module isn't bypassed in this
> >     implementation.
> >
> >     Besides the recirculation, all other actions shall be executed as-is
> >     on each packet.
> >     Any new CT change or action set by the controller will be managed as
> >     usual.
> >
> >     For our testing we set up a simple firewall; below are the flows.
> >
> >     Flow Setup
> >     ----------
> >     table=0, priority=100,ct_state=-trk,ip actions=ct(table=1)
> >     table=0, priority=10,arp actions=NORMAL
> >     table=0, priority=1 actions=drop
> >     table=1, ct_state=+new+trk,ip,in_port=1 actions=ct(commit),output:2
> >     table=1, ct_state=+new+trk,ip,in_port=2 actions=drop
> >     table=1, ct_state=+est+trk,ip,in_port=1 actions=output:2
> >     table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1
> >
> >     Basic idea
> >     ----------
> >     With the above rules, all packets belonging to an established
> >     connection will first hit the flow
> >         "ct_state=-trk,ip actions=ct(table=1)"
> >
> >     then, on recirculation, they will hit
> >         "ct_state=+est+trk,ip,in_port=.. actions=output:X".
> >
> >     The basic idea is to do the following 2 steps.
> >     1. Combine the two sets of actions by removing the recirculation.
> >        a) Original actions:
> >           - "ct(zone=N), recirc(id)"   [i.e. ct(table=1)]
> >           - "output:X"
> >        b) Combined actions after removing the recirculation:
> >           - "ct(zone=N), output:X".
> >
> >     2. All the subsequent packets shall hit a flow with the combined
> >        actions.
> >
> > [Darrell]
> >
> > 1) This would be constraining on how rules are written, such that we
> >    can't go back to the beginning of the pipeline just because a packet
> >    is established.
> >    Meaning we can't run stateless or stateful rules earlier in the
> >    pipeline.
> >    There are lots of possible combinations.
> >    One case is:
> >    We may run a bunch of firewall checks in table X, later on (table X+1)
> >    run the non-filtered packets through conntrack and then want to go
> >    back to the firewall table.
> [Antonio]
> I've tried to replicate the case you described, using the following set of
> rules. They may have no practical meaning; the purpose is that - at least
> for packets from port 1 - they traverse tables 0 => 1 => 2 and are then
> sent back to table 0.
>
> NXST_FLOW reply (xid=0x4):
>  table=0, priority=100,ct_state=-trk,ip actions=ct(table=1)
>  table=1, ct_state=+new+trk,ip,in_port=1,nw_src=10.10.10.0/24 actions=ct(commit),output:2
>  table=1, ct_state=+est+trk,ip,in_port=1 actions=mod_nw_src:3.3.3.3,resubmit(,2)
>  table=2, ct_state=+est+trk,tcp,in_port=1 actions=drop
>  table=2, ct_state=+est+trk,udp,in_port=1 actions=resubmit(1,0)
>  table=0, ct_state=+est+trk,udp,in_port=1,nw_src=3.3.3.3 actions=output:2
>  table=1, ct_state=+new+trk,ip,in_port=2 actions=drop
>  table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1
>  table=0, priority=10,arp actions=NORMAL
>
> For packets from port 1:
>  - table 0: untracked packets go through CT, then are recirculated, i.e. go
>    to table 1.
>  - table 1: new connections are committed.
>  - table 1: packets of established connections get their IP source changed
>    to 3.3.3.3, then are sent to table 2.
>  - table 2: TCP packets are dropped - but I'm sending just UDP traffic.
>  - table 2: UDP packets go back to table 0.
>  - table 0: packets of established connections with IPsrc=3.3.3.3 are sent
>    to port 2.
>
> In this case we still see that the packets of established connections will
> skip one recirculation.
>
> In fact, what happens is the following.
> For packets belonging to established connections - and coming from port 1 -
> the actions
>     mod_nw_src:3.3.3.3,resubmit(,2)
> become
>     ct(),mod_nw_src:3.3.3.3,resubmit(,2)
>
> and the EMC entry that was hit by the original packet will point to the
> megaflow
>     table=1, ct_state=+est+trk,ip,in_port=1
>         actions=ct(),mod_nw_src:3.3.3.3,resubmit(,2)
> so new packets of the same connection will skip one recirculation.
>
> For packets from port 2, the megaflow
>     table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1
> becomes
>     table=1, ct_state=+est+trk,ip,in_port=2 actions=ct(),output:1
> and will be pointed to by the EMC entry hit by the original packet.
> So new packets of the same connection will skip one recirculation.
>
> Is this use case similar to the one you were mentioning?
> Or can you perhaps suggest some other rule setup?
>
> > 2) This allows the dataplane to override what the ofproto layer specifies.
> >    This looks like a layering violation.
> >
> >    I guess this also speaks to the datapath (userspace) specific design here.
> >
> >    I don't think it matters what the performance advantage is, although
> >    it is small in this case.
> >
> >     Considerations
> >     --------------
> >     On EMC entries:
> >     All packets hitting a connection in the same direction will have the
> >     same 5-tuple.
> >     They shall all hit the same EMC entry - referred to here as
> >     'EMC_ORIG' - and point to the flow
> >         "ct_state=-trk,ip actions=ct(table=1)",
> >     referred to from here on as 'FLOW_ORIG'.
> >
> >     On recirculation the packets shall hit another EMC entry - referred
> >     to here as 'EMC_RECIRC' - and point to the flow
> >         "ct_state=+est+trk,ip,in_port=.. actions=output:X",
> >     referred to from here on as 'FLOW_RECIRC'.
> >
> >     FLOW_ORIG is the flow hit by the ingress packet, while FLOW_RECIRC is
> >     the flow hit by the recirculated packet.
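To make the EMC_ORIG/EMC_RECIRC picture above concrete, the repointing the RFC
aims for can be sketched roughly as follows. This is an illustrative,
self-contained C model only - the structures and action strings are stand-ins,
not the actual OVS data structures:

    /* Minimal stand-ins for the flows and EMC entries discussed above. */
    #include <stdio.h>

    struct flow {
        const char *actions;    /* e.g. "ct(zone=1),recirc(id)" */
    };

    struct emc_entry {
        struct flow *flow;      /* flow returned on an EMC hit */
    };

    int main(void)
    {
        struct flow flow_orig   = { "ct(zone=1),recirc(id)" };  /* FLOW_ORIG   */
        struct flow flow_recirc = { "output:2" };                /* FLOW_RECIRC */

        struct emc_entry emc_orig   = { &flow_orig };   /* hit by the ingress packet      */
        struct emc_entry emc_recirc = { &flow_recirc }; /* hit by the recirculated packet */

        /* Steady state after the proposed patching: the recirculated flow's
         * actions gain the ct(zone=N) action, and the ingress EMC entry is
         * repointed at that flow, so the next ingress packet of this
         * connection executes both without a recirculation. */
        flow_recirc.actions = "ct(zone=1),output:2";
        emc_orig.flow = &flow_recirc;

        (void) emc_recirc;  /* unchanged; still hit if a packet does recirculate */
        printf("ingress EMC hit now executes: %s\n", emc_orig.flow->actions);
        return 0;
    }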
> >     Implementation Details
> >     ----------------------
> >     We thought about different implementations, but for the purpose of
> >     this RFC we stick to the following approach.
> >
> >     On packet recirculation, a new flow has to be created with a list of
> >     appropriate actions. At this stage we prepend a new action called
> >     'ct(Skip)' to the original list of actions. This new action is used
> >     as a temporary place-holder that will be overwritten later on.
> >
> >         Original actions: "output:2"  ==>  New actions: "ct(Skip), output:2"
> >
> >     For all the packets of an established connection (i.e. marked with
> >     the +trk,+est bits), do the following 3 steps:
> >     1. Retrieve the flows from the EMC:
> >        - FLOW_ORIG, whose actions include: ct(zone=N), recirc(id)
> >        - FLOW_RECIRC, whose actions include: ct(Skip), output:X
> >
> >     2. Update the actions of FLOW_RECIRC by overwriting ct(Skip) with the
> >        ct action from FLOW_ORIG, i.e. ct(zone=N):
> >            ct(Skip), output:X  =>  ct(zone=N), output:X
> >
> >     3. Update the flow pointer in the EMC_ORIG entry to point to the
> >        FLOW_RECIRC flow.
> >
> >     In this way we avoid recirculating the packets belonging to
> >     established connections, and this in no way skips sending the packets
> >     to the CT module.
> >
> >     Antonio
> >
> > > -----Original Message-----
> > > From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
> > > Sent: Thursday, May 25, 2017 5:03 PM
> > > To: [email protected]
> > > Subject: [ovs-dev] [PATCH RFC v2] Conntrack: Avoid recirculation for established connections.
> > >
> > > From: Antonio Fischetti <[email protected]>
> > >
> > > With conntrack enabled, packets get recirculated and this impacts
> > > performance with thousands of active concurrent connections.
> > >
> > > This patch is aimed at avoiding recirculation for packets belonging to
> > > established connections in steady state. This is achieved by
> > > manipulating the megaflows and actions. In this case the actions of the
> > > megaflow of the recirculated packet are initially updated with the new
> > > 'CT_SKIP' action and later updated with the actions of the megaflow of
> > > the original packet (to handle zones). Thereafter the EMC entry
> > > belonging to the original packet is updated to point to the megaflow of
> > > the recirculated packet, thereby avoiding recirculation.
> > >
> > > Signed-off-by: Antonio Fischetti <[email protected]>
> > > Signed-off-by: Bhanuprakash Bodireddy <[email protected]>
> > > Co-authored-by: Bhanuprakash Bodireddy <[email protected]>
> > > ---
> > > - ~10% performance improvement is observed in our testing with UDP
> > >   streams.
> > > - Limited testing is done with the RFC patch; the tests include hundreds
> > >   of concurrent active TCP connections.
> > > - This is a very early implementation and we are posting it here to get
> > >   initial feedback.
> > > - This RFC patch is implemented leveraging the EMC, but the optimization
> > >   could be extended to dpcls as well to handle higher flow counts.
> > > - No VXLAN testing is done with this patch.
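The diff below introduces that place-holder as a hard-coded 12-byte array
(tmp_action in handle_packet_upcall). As a rough aid to reading it, the bytes
appear to encode a nested netlink attribute: an outer CT action header
followed by an empty "skip" sub-attribute. The sketch below rebuilds the same
header on a little-endian host; the struct and the numeric values are local
stand-ins inferred from the patch itself, not taken from OVS headers:

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Same layout as a netlink attribute header. */
    struct nlattr_hdr {
        uint16_t nla_len;
        uint16_t nla_type;
    };

    int main(void)
    {
        uint8_t placeholder[12] = { 0 };

        /* Outer attribute: 12 bytes long, type 0x0C, apparently the CT
         * action that the patch comment refers to. */
        struct nlattr_hdr outer = { .nla_len = 12, .nla_type = 0x0C };

        /* Nested attribute with no payload, type 0x09, which matches the
         * position of the OVS_CT_ATTR_SKIP value the patch adds to
         * enum ovs_ct_attr; the length value 6 is copied verbatim from the
         * patch's array, and the attribute is padded out to 8 bytes inside
         * the outer attribute. */
        struct nlattr_hdr inner = { .nla_len = 6, .nla_type = 0x09 };

        memcpy(placeholder, &outer, sizeof outer);
        memcpy(placeholder + sizeof outer, &inner, sizeof inner);

        /* handle_packet_upcall() then memcpy()s the flow's original actions
         * right after these CT_PREPEND_ACTION_LEN (12) bytes. */
        for (size_t i = 0; i < sizeof placeholder; i++) {
            printf("0x%02X%s", placeholder[i],
                   i + 1 < sizeof placeholder ? ", " : "\n");
        }
        return 0;
    }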
> > >
> > >  datapath/linux/compat/include/linux/openvswitch.h |   1 +
> > >  lib/dpif-netdev.c                                  | 148 ++++++++++++++++++++--
> > >  lib/packets.h                                      |   2 +
> > >  3 files changed, 143 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
> > > index d22102e..6dc5674 100644
> > > --- a/datapath/linux/compat/include/linux/openvswitch.h
> > > +++ b/datapath/linux/compat/include/linux/openvswitch.h
> > > @@ -762,6 +762,7 @@ enum ovs_ct_attr {
> > >      OVS_CT_ATTR_NAT,          /* Nested OVS_NAT_ATTR_* */
> > >      OVS_CT_ATTR_FORCE_COMMIT, /* No argument */
> > >      OVS_CT_ATTR_EVENTMASK,    /* u32 mask of IPCT_* events. */
> > > +    OVS_CT_ATTR_SKIP,         /* No argument, does not invoke CT. */
> > >      __OVS_CT_ATTR_MAX
> > >  };
> > >
> > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > index b50164b..fbbb42e 100644
> > > --- a/lib/dpif-netdev.c
> > > +++ b/lib/dpif-netdev.c
> > > @@ -2101,7 +2101,8 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
> > >  }
> > >
> > >  static inline struct dp_netdev_flow *
> > > -emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key)
> > > +emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key,
> > > +           void **pfound_entry)
> > >  {
> > >      struct emc_entry *current_entry;
> > >
> > > @@ -2110,11 +2111,13 @@ emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key)
> > >              && emc_entry_alive(current_entry)
> > >              && netdev_flow_key_equal_mf(&current_entry->key, &key->mf)) {
> > >
> > > +            *pfound_entry = current_entry;
> > >              /* We found the entry with the 'key->mf' miniflow */
> > >              return current_entry->flow;
> > >          }
> > >      }
> > >
> > > +    *pfound_entry = NULL;
> > >      return NULL;
> > >  }
> > >
> > > @@ -4583,10 +4586,12 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
> > >          key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf);
> > >
> > >          /* If EMC is disabled skip emc_lookup */
> > > -        flow = (cur_min == 0) ? NULL: emc_lookup(flow_cache, key);
> > > +        flow = (cur_min == 0) ? NULL:
> > > +               emc_lookup(flow_cache, key, &packet->md.prev_emc_entry);
> > >          if (OVS_LIKELY(flow)) {
> > >              dp_netdev_queue_batches(packet, flow, &key->mf, batches,
> > >                                      n_batches);
> > > +            packet->md.prev_flow = flow;
> > >          } else {
> > >              /* Exact match cache missed. Group missed packets together at
> > >               * the beginning of the 'packets' array. */
> > > @@ -4595,6 +4600,7 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
> > >               * must be returned to the caller. The next key should be
> > >               * extracted to 'keys[n_missed + 1]'. */
> > >              key = &keys[++n_missed];
> > > +            packet->md.prev_flow = NULL;
> > >          }
> > >      }
> > >
> > > @@ -4604,6 +4610,9 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
> > >      return dp_packet_batch_size(packets_);
> > >  }
> > >
> > > +#define CT_PREPEND_ACTION_LEN 0x0C
> > > +#define CT_PREPEND_ACTION_SPARE_LEN 32
> > > +#define CT_TOT_ACTIONS_LEN (CT_PREPEND_ACTION_LEN + CT_PREPEND_ACTION_SPARE_LEN)
> > >  static inline void
> > >  handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
> > >                       struct dp_packet *packet,
> > > @@ -4663,12 +4672,44 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
> > >      ovs_mutex_lock(&pmd->flow_mutex);
> > >      netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
> > >      if (OVS_LIKELY(!netdev_flow)) {
> > > -        netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,
> > > -                                         add_actions->data,
> > > -                                         add_actions->size);
> > > +        /* Recirculated packets that belong to established connections
> > > +         * are treated differently. A place-holder is prepended to the
> > > +         * list of actions. */
> > > +        if (!(packet->md.ct_state & CS_INVALID) &&
> > > +            (packet->md.ct_state & CS_TRACKED) &&
> > > +            (packet->md.ct_state & CS_ESTABLISHED) &&
> > > +            (packet->md.recirc_id)) {
> > > +
> >
> > [Joe] "Just because the value of ct_state in the packet metadata of this
> > particular packet happens to have those particular state flags doesn't
> > mean that the OpenFlow pipeline is enforcing this constraint for all
> > packets hitting this flow. You would have to ensure that the mask in
> > the flow is also ensuring that all future packets hitting the new flow
> > will also have that particular ct_state."
> >
> > This part of the code should prepend a ct(Skip) action to the
> > FLOW_RECIRC only and just for packets that belong to an established conn.
> > The FLOW_ORIG shouldn't be altered, because it will be hit by all
> > ingress packets.
> >
> > > +            /* Prepend an OVS_CT_ATTR_SKIP to the list of actions inside
> > > +             * add_actions. It does not really invoke the ConnTrack module,
> > > +             * it's a Ct action that works as a place-holder. It will be
> > > +             * overwritten inside emc_patch_established_conn() with
> > > +             * the proper ct(zone=X) action. */
> > > +            struct dp_netdev_actions *new_actions;
> > > +            uint8_t tmp_action[CT_TOT_ACTIONS_LEN] = {
> > > +                0x0C, 0x00, 0x0C, 0x00,
> > > +                0x06, 0x00, 0x09, 0x00,
> > > +                0x00, 0x00, 0x00, 0x00,
> > > +                /* will append here add_actions. */
> > > +            };
> > > +            memcpy(&tmp_action[CT_PREPEND_ACTION_LEN],
> > > +                   add_actions->data,
> > > +                   add_actions->size);
> > > +            new_actions = dp_netdev_actions_create(
> > > +                (const struct nlattr *)tmp_action,
> > > +                CT_PREPEND_ACTION_LEN + add_actions->size);
> >
> > [Joe] "Hmm. In some ways I think this would be cleaner to be implemented in
> > the xlate code, around compose_conntrack_action().. but that code is
> > currently pretty much dataplane-oblivious and this suggestion is a
> > userspace-specific optimization."
> >
> > > +            netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,
> > > +                                             new_actions->actions,
> > > +                                             new_actions->size);
> > > +        } else {
> > > +            netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,
> > > +                                             add_actions->data,
> > > +                                             add_actions->size);
> > > +        }
> > >      }
> > >      ovs_mutex_unlock(&pmd->flow_mutex);
> > > -    emc_probabilistic_insert(pmd, key, netdev_flow);
> > > +    /* For this RFC, probabilistic emc insertion is disabled. */
> > > +    emc_insert(&pmd->flow_cache, key, netdev_flow);
> > >      }
> > >  }
> > >
> > > @@ -4761,7 +4802,8 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
> > >
> > >          flow = dp_netdev_flow_cast(rules[i]);
> > >
> > > -        emc_probabilistic_insert(pmd, &keys[i], flow);
> > > +        /* For testing purposes the emc_insert is restored here. */
> > > +        emc_insert(&pmd->flow_cache, &keys[i], flow);
> > >          dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches,
> > >                                  n_batches);
> > >      }
> > >
> > > @@ -4998,6 +5040,78 @@ dp_execute_userspace_action(struct dp_netdev_pmd_thread *pmd,
> > >      }
> > >  }
> > >
> > > +/* Search the EMC to retrieve the entry related to the original packet
> > > + * and the entry related to the recirculated packet.
> > > + * If both EMC entries are alive, then:
> > > + * - The flow actions of the recirculated packet are updated with
> > > + *   'ct(zone=N)' as retrieved from the actions of the original flow.
> > > + * - The EMC orig entry flow is updated with the flow pointer of the
> > > + *   recirc entry.
> > > + */
> > > +static inline void
> > > +emc_patch_established_conn(struct dp_netdev_pmd_thread *pmd,
> > > +        struct dp_packet *packet, long long now OVS_UNUSED)
> > > +{
> > > +    struct emc_cache *cache = &pmd->flow_cache;
> > > +    struct netdev_flow_key key;
> > > +    struct emc_entry *orig_entry;   /* EMC entry hit by the original packet. */
> > > +    struct emc_entry *recirc_entry; /* EMC entry for the recirculated packet. */
> > > +    uint32_t old_hash;
> > > +
> > > +    if (!packet->md.prev_emc_entry) {
> > > +        return;
> > > +    }
> > > +
> > > +    orig_entry = packet->md.prev_emc_entry;
> > > +    if (!emc_entry_alive(orig_entry)) {
> > > +        return;
> > > +    }
> > > +
> > > +    /* Check that the original EMC entry was not overwritten. */
> > > +    struct dp_netdev_flow *orig_flow = orig_entry->flow;
> > > +    if (packet->md.prev_flow && (packet->md.prev_flow != orig_flow)) {
> > > +        return;
> > > +    }
> > > +
> > > +    /* TODO: as we are calling miniflow_extract now, can we avoid invoking
> > > +     * it again when we classify this recirculated packet? */
> > > +    miniflow_extract(packet, &key.mf);
> > > +    key.len = 0; /* Not computed yet. */
> > > +    old_hash = dp_packet_get_rss_hash(packet);
> > > +    key.hash = dpif_netdev_packet_get_rss_hash(packet, NULL);
> > > +    dp_packet_set_rss_hash(packet, old_hash);
> > > +
> > > +    EMC_FOR_EACH_POS_WITH_HASH(cache, recirc_entry, key.hash) {
> > > +        if (recirc_entry->key.hash == key.hash
> > > +            && emc_entry_alive(recirc_entry)
> > > +            && netdev_flow_key_equal_mf(&recirc_entry->key, &key.mf)) {
> > > +
> > > +            if (orig_entry->flow == recirc_entry->flow) {
> > > +                /* Nothing to do, already patched by a packet of this
> > > +                 * same batch. */
> > > +                return;
> > > +            }
> > > +            /* We found the entry related to the recirculated packet.
> > > +             * Retrieve the actions for the orig and recirc entries. */
> > > +            struct dp_netdev_actions *orig_actions =
> > > +                dp_netdev_flow_get_actions(orig_entry->flow);
> > > +            struct dp_netdev_actions *recirc_actions =
> > > +                dp_netdev_flow_get_actions(recirc_entry->flow);
> > > +
> > > +            /* The orig entry action contains a CT action with the zone
> > > +             * info. */

[Sugesh] Is it safe to assume that the first action would be ct?

> > > +            struct nlattr *p = &orig_actions->actions[0];
> > > +
> > > +            /* Overwrite the CT Skip action of the recirc entry with
> > > +             * ct(zone=N). */
> > > +            memcpy(recirc_actions->actions, p, p->nla_len);

[Sugesh] What if the recirc action's size is smaller than p->nla_len? Shouldn't
we make sure the size of the skip action matches the size of the actions
present in ct_table?

> > > +
> > > +            /* Update orig EMC entry. */
> > > +            orig_entry->flow = recirc_entry->flow;
> > > +            dp_netdev_flow_ref(orig_entry->flow);
> > > +
> > > +            return;
> > > +        }
> > > +    }
> > > +}
> > > +
> > >  static void
> > >  dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
> > >                const struct nlattr *a, bool may_steal)
> > > @@ -5024,7 +5138,6 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
> > >          } else {
> > >              tx_qid = pmd->static_tx_qid;
> > >          }
> > > -
> > >          netdev_send(p->port->netdev, tx_qid, packets_, may_steal,
> > >                      dynamic_txqs);
> > >          return;
> >
> > [Joe] "Unrelated style change."
> >
> > Will do that.
> >
> > > @@ -5145,10 +5258,21 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
> > >          struct dp_packet *packet;
> > >          DP_PACKET_BATCH_FOR_EACH (packet, packets_) {
> > >              packet->md.recirc_id = nl_attr_get_u32(a);
> > > +
> > > +            if (!(packet->md.ct_state & CS_INVALID) &&
> > > +                (packet->md.ct_state & CS_TRACKED) &&
> > > +                (packet->md.ct_state & CS_ESTABLISHED)) {
> > > +                (*depth)++;
> > > +                emc_patch_established_conn(pmd, packet, now);
> > > +                (*depth)--;
> > > +            }
> > >          }
> > >
> > >          (*depth)++;
> > >          dp_netdev_recirculate(pmd, packets_);
> > > +        DP_PACKET_BATCH_FOR_EACH (packet, packets_) {
> > > +            packet->md.recirc_id = 0;
> > > +        }
> > >          (*depth)--;
> > >
> > >          return;
> >
> > [Joe] "Can you give some more detail about why resetting the recirc_id is
> > correct here? Could/should this be applied as an independent change
> > regardless of the rest of this series?"
> >
> > This addition is needed so that in handle_packet_upcall we add the ct(Skip)
> > just to the FLOW_RECIRC and not to the FLOW_ORIG.
> > I guess this could also be applied as an independent change?
> >
> > > @@ -5166,6 +5290,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
> > >          const char *helper = NULL;
> > >          const uint32_t *setmark = NULL;
> > >          const struct ovs_key_ct_labels *setlabel = NULL;
> > > +        bool skip_ct = false;
> > >
> > >          NL_ATTR_FOR_EACH_UNSAFE (b, left, nl_attr_get(a),
> > >                                   nl_attr_get_size(a)) {
> > > @@ -5194,6 +5319,9 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
> > >                  /* Silently ignored, as userspace datapath does not generate
> > >                   * netlink events. */
> > >                  break;
> > > +            case OVS_CT_ATTR_SKIP:
> > > +                skip_ct = true;
> > > +                break;
> > >              case OVS_CT_ATTR_NAT:
> > >              case OVS_CT_ATTR_UNSPEC:
> > >              case __OVS_CT_ATTR_MAX:
> > > @@ -5201,6 +5329,10 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
> > >              }
> > >          }
> > >
> > > +        if (OVS_UNLIKELY(skip_ct)) {
> > > +            break;
> > > +        }
> > > +
> > >          conntrack_execute(&dp->conntrack, packets_, aux->flow->dl_type,
> > >                            force, commit, zone, setmark, setlabel, helper);
> > >          break;
> > > diff --git a/lib/packets.h b/lib/packets.h
> > > index 7dbf6dd..2c46a15 100644
> > > --- a/lib/packets.h
> > > +++ b/lib/packets.h
> > > @@ -112,6 +112,8 @@ struct pkt_metadata {
> > >      struct flow_tnl tunnel; /* Encapsulating tunnel parameters. Note that
> > >                               * if 'ip_dst' == 0, the rest of the fields may
> > >                               * be uninitialized. */

[Sugesh] I don't see the initialization of these fields. Have you verified the
cost of these fields in pkt_metadata for normal traffic? I mean the
performance impact?
> > > +    void *prev_emc_entry;   /* EMC entry that this packet matched. */
> > > +    void *prev_flow;        /* Flow pointed to by the matching EMC entry. */
> > >  };
> > >
> > >  static inline void
> > > --
> > > 2.4.11

[Sugesh] I cannot apply the patch on the latest master due to some conflicts.

I am not sure how the stats get updated when you are combining the flows in
the datapath. Earlier there were two UUIDs for the two flows and now only one
UUID is being used.

I don't have expertise in conntrack; however, what would be the result when we
have actions like

  table=0, priority=100,ct_state=-trk,ip actions=clone(push_vlan(4), trunc(200), output:2), ct(table=1)
  table=1, ct_state=+est+trk,ip,in_port=1 actions=output:2

Another example would be

  table=0, priority=100,ct_state=-trk,ip actions=push_vlan(4), ct(table=1), dec_ttl
  table=1, ct_state=+est+trk,ip,in_port=1 actions=output:2

As Joe suggested before, I think the combining logic would be better placed in
the ofproto-xlate layer than here. Though it is userspace specific, it could be
handled the same way the userspace tunnel is handled there.

Another concern is about actions that do field modifications along with
connection tracking; that is probably a valid use case with NAT. In this
scenario, if the flow fields are changed by an action, how is it possible to
track back to the original flow? Or am I missing anything?
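Regarding the two questions above about the memcpy() in
emc_patch_established_conn() - whether the first action is really a CT action
and whether the sizes match - one possible shape for such a guard is sketched
below. The types and the attribute value are local stand-ins (not OVS code),
and the exact policy is only illustrative:

    #include <stdbool.h>
    #include <stdint.h>

    /* Stand-in for a netlink attribute header. */
    struct nlattr_hdr {
        uint16_t nla_len;
        uint16_t nla_type;
    };

    /* Assumed to be the CT action type; the value mirrors the 0x0C used in
     * the patch's hard-coded place-holder. */
    #define ILLUSTRATIVE_ACTION_ATTR_CT 0x0C

    /* Return true only if the original flow's first action is a CT action
     * and it occupies exactly the bytes reserved by the place-holder, so
     * copying it over the place-holder can neither overrun the recirculated
     * flow's action buffer nor clobber the actions that follow it. */
    static bool
    ct_action_can_patch(const struct nlattr_hdr *orig_first_action,
                        const struct nlattr_hdr *placeholder)
    {
        return orig_first_action->nla_type == ILLUSTRATIVE_ACTION_ATTR_CT
               && orig_first_action->nla_len == placeholder->nla_len;
    }

If such a check fails, the safe fallback would simply be to leave both EMC
entries untouched and let the packet recirculate as it does today.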
