On 3 Jun 2023, at 2:01, Peng He wrote:
> push_dp_ops only handles delete ops errors but ignores the modify > ops results. It's better to handle all the dp operation errors in > a consistent way. > > This patch prevents the inconsistency by considering modify failure > in revalidators. > > To note, we cannot perform two state transitions and change ukey_state > into UKEY_EVICTED directly here, because, if we do so, the > sweep will remove the ukey alone and leave dp flow alive. Later, the > dump will retrieve the dp flow and might even recover it. This will > contribute the stats of this dp flow twice. > > v8->v9: add testsuite and delete INCONSISTENT ukey at revalidate_sweep. > v9->v10: change the commit message and refine the test case. Hi Peng, Thanks for keeping this patch updated :) Some more comments below, but I think the next rev should be all fine. Cheers, Eelco > Signed-off-by: Peng He <[email protected]> > --- > ofproto/ofproto-dpif-upcall.c | 48 ++++++++++++++++++++++++---------- > tests/dpif-netdev.at | 49 +++++++++++++++++++++++++++++++++++ > 2 files changed, 84 insertions(+), 13 deletions(-) > > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c > index cd57fdbd9..5a75d9444 100644 > --- a/ofproto/ofproto-dpif-upcall.c > +++ b/ofproto/ofproto-dpif-upcall.c > @@ -258,6 +258,7 @@ enum ukey_state { > UKEY_CREATED = 0, > UKEY_VISIBLE, /* Ukey is in umap, datapath flow install is queued. > */ > UKEY_OPERATIONAL, /* Ukey is in umap, datapath flow is installed. */ > + UKEY_INCONSISTENT, /* Ukey is in umap, datapath flow is inconsistent. */ > UKEY_EVICTING, /* Ukey is in umap, datapath flow delete is queued. > */ > UKEY_EVICTED, /* Ukey is in umap, datapath flow is deleted. */ > UKEY_DELETED, /* Ukey removed from umap, ukey free is deferred. */ > @@ -1999,6 +2000,10 @@ transition_ukey_at(struct udpif_key *ukey, enum > ukey_state dst, > * UKEY_VISIBLE -> UKEY_EVICTED > * A handler attempts to install the flow, but the datapath rejects it. > * Consider that the datapath has already destroyed it. > + * UKEY_OPERATIONAL -> UKEY_INCONSISTENT > + * A revalidator modifies the flow with error returns. > + * UKEY_INCONSISTENT -> UKEY_EVICTING > + * A revalidator decides to evict the datapath flow. > * UKEY_OPERATIONAL -> UKEY_EVICTING > * A revalidator decides to evict the datapath flow. > * UKEY_EVICTING -> UKEY_EVICTED > @@ -2007,7 +2012,9 @@ transition_ukey_at(struct udpif_key *ukey, enum > ukey_state dst, > * A revalidator has removed the ukey from the umap and is deleting it. > */ > if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE && > - dst < UKEY_DELETED)) { > + dst < UKEY_DELETED) || > + (ukey->state == UKEY_OPERATIONAL && Indentation is wrong here, but maybe we should fix both lines, so it’s easier to read: if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE && dst < UKEY_DELETED) || (ukey->state == UKEY_OPERATIONAL && dst == UKEY_EVICTING)) { > + dst == UKEY_EVICTING)) { > ukey->state = dst; > } else { > struct ds ds = DS_EMPTY_INITIALIZER; > @@ -2472,26 +2479,31 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, > size_t n_ops) > > for (i = 0; i < n_ops; i++) { > struct ukey_op *op = &ops[i]; > - struct dpif_flow_stats *push, *stats, push_buf; > - > - stats = op->dop.flow_del.stats; > - push = &push_buf; > - > - if (op->dop.type != DPIF_OP_FLOW_DEL) { > - /* Only deleted flows need their stats pushed. */ > - continue; > - } > > if (op->dop.error) { > - /* flow_del error, 'stats' is unusable. */ > if (op->ukey) { > ovs_mutex_lock(&op->ukey->mutex); > - transition_ukey(op->ukey, UKEY_EVICTED); > + if (op->ukey->state == UKEY_EVICTING) { How about not checking the UKEY_STATE’s as they are internal, but using the operation type instead? This will still result in errors in transition_ukey() for any corner cases we might have missed (if some new operation gets added and fails). ovs_mutex_lock(&op->ukey->mutex); - if (op->ukey->state == UKEY_EVICTING) { + if (op->dop.type == DPIF_OP_FLOW_DEL) { transition_ukey(op->ukey, UKEY_EVICTED); - } else if (op->ukey->state == UKEY_OPERATIONAL) { - /* Modify failed, ukey's state was UKEY_OPERATIONAL */ + } else { + /* Modification of the flow failed. */ transition_ukey(op->ukey, UKEY_INCONSISTENT); } ovs_mutex_unlock(&op->ukey->mutex); > + transition_ukey(op->ukey, UKEY_EVICTED); > + } else if (op->ukey->state == UKEY_OPERATIONAL) { > + /* Modify failed, ukey's state was UKEY_OPERATIONAL */ > + transition_ukey(op->ukey, UKEY_INCONSISTENT); > + } > ovs_mutex_unlock(&op->ukey->mutex); > } > continue; > } > > + if (op->dop.type != DPIF_OP_FLOW_DEL) { > + /* Only deleted flows need their stats pushed. */ > + continue; > + } > + > + struct dpif_flow_stats *push, *stats, push_buf; > + > + stats = op->dop.flow_del.stats; > + push = &push_buf; > + > if (op->ukey) { > ovs_mutex_lock(&op->ukey->mutex); > transition_ukey(op->ukey, UKEY_EVICTED); > @@ -2839,6 +2851,15 @@ revalidate(struct revalidator *revalidator) > continue; > } > > + if (ukey->state == UKEY_INCONSISTENT) { > + ukey->dump_seq = dump_seq; > + reval_op_init(&ops[n_ops++], UKEY_DELETE, udpif, ukey, > + &recircs, &odp_actions); > + ovs_mutex_unlock(&ukey->mutex); I think it would be nice to have a coverage counter here for this situation. COVERAGE_INC(dumped_inconsisten_flow); > + continue; > + } > + > + > if (ukey->state <= UKEY_OPERATIONAL) { > /* The flow is now confirmed to be in the datapath. */ > transition_ukey(ukey, UKEY_OPERATIONAL); > @@ -2927,13 +2948,14 @@ revalidator_sweep__(struct revalidator *revalidator, > bool purge) > } > ukey_state = ukey->state; > if (ukey_state == UKEY_OPERATIONAL > + || (ukey_state == UKEY_INCONSISTENT) > || (ukey_state == UKEY_VISIBLE && purge)) { > struct recirc_refs recircs = RECIRC_REFS_EMPTY_INITIALIZER; > bool seq_mismatch = (ukey->dump_seq != dump_seq > && ukey->reval_seq != reval_seq); > enum reval_result result; > > - if (purge) { > + if (purge || ukey_state == UKEY_INCONSISTENT) { > result = UKEY_DELETE; > } else if (!seq_mismatch) { > result = UKEY_KEEP; > diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at > index baab60a22..5ee8e6f7e 100644 > --- a/tests/dpif-netdev.at > +++ b/tests/dpif-netdev.at > @@ -716,3 +716,52 @@ AT_CHECK([test `ovs-vsctl get Interface p2 > statistics:tx_q0_packets` -gt 0 -a dn > > OVS_VSWITCHD_STOP > AT_CLEANUP > + > +AT_SETUP([dpif-netdev - revalidators handle dp modification fail correctly]) > +OVS_VSWITCHD_START( > + [add-port br0 p1 \ > + -- set interface p1 type=dummy \ > + -- set bridge br0 datapath-type=dummy \ > + -- add-port br0 p2 \ > + -- set interface p2 type=dummy -- > + ]) > + > +AT_DATA([flows.txt], [dnl > +table=0,in_port=p1,actions=p2 > +]) > + > +ovs-ofctl add-flows br0 flows.txt As this is only a single flow, I would change it to something like: -AT_DATA([flows.txt], [dnl -table=0,in_port=p1,actions=p2 -]) - -ovs-ofctl add-flows br0 flows.txt +ovs-ofctl add-flow br0 table=0,in_port=p1,actions=p2 > +ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.0.0.2),tcp(src=1,dst=2)' > +ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.0.0.2),tcp(src=1,dst=2)' > + > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*thread://' | > strip_xout_keep_actions ], [0], [ > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:0, bytes:0, used:0.0s, actions:2 > +]) > + > +dnl wait the dp flow enter OPERATIONAL Start with capital W for wait, and end with a dot. Some more re-writing: -dnl wait the dp flow enter OPERATIONAL +dnl Wait for the dp flow to enter OPERATIONAL state. > +ovs-appctl revalidator/wait > + > +AT_CHECK([ovs-appctl revalidator/pause]) > + > +dnl delete all dp flows, so flow modify will fail. dnl Delete all dp flows, so flow modification will fail. > +AT_CHECK([ovs-appctl dpctl/del-flows]) > + > +dnl replace openflow rules, let revalidator work +dnl Replace OpenFlow rules and let the revalidator work. > +AT_DATA([flows2.txt], [dnl > +table=0,in_port=p1,ip actions=ct(commit) > +]) Do this in one command > +AT_CHECK([ovs-appctl revalidator/resume]) > + > +dnl trigger revalidating and wait Trigger revalidation and wait for it to complete. > +AT_CHECK([ovs-ofctl --bundle replace-flows br0 flows2.txt]) -AT_CHECK([ovs-ofctl --bundle replace-flows br0 flows2.txt]) > +ovs-appctl revalidator/wait > + > +dnl inconsistent ukey should be deleted. > +AT_CHECK([ovs-appctl upcall/show | grep keys | grep -q -v 0], [1]) If we add the new coverage counter we should add the check here also. > +dnl delete revalidator warnings log to let testsuite pass. > +sed -i 's/.*failed to put.*$//' ovs-vswitchd.log > +sed -i 's/.*failed to flow_del.*$//' ovs-vswitchd.log Should we not explicitly check for the flow(_del) error, as this will tell us the test introduced the right problem to begin with? The above we can do through the OVS_VSWITCHD_STOP(["......"] command. > + > +OVS_VSWITCHD_STOP > +AT_CLEANUP > -- > 2.25.1 _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
