Patch v5 has statistics issues. In order to solve this issue, we had a discussion.
below is the quote of the email. ” After a second thought, I think maybe keeping INCONSISTENT just for the modify error is a better option. With current patch: 1. the modify error case: OPERATIONAL -> INCONSISTENT -> EVICTING -> EVICTED 2. the delete error case: EVICTING -> EVICTED Change both to INCONSISTENT: the modify error case: did not change. the delete error case: EVICTING -> INCONSISTENT -> EVICTED? “ And we agree to take the second solution. Eelco Chaudron <[email protected]> 于2022年12月8日周四 18:54写道: > > > On 27 Nov 2022, at 8:28, Peng He wrote: > > > push_dp_ops only handles delete ops errors but ignores the modify > > ops results. It's better to handle all the dp operation errors in > > a consistent way. > > > > We observe in the production environment that sometimes a megaflow > > with wrong actions keep staying in datapath. The coverage command shows > > revalidators have dumped several times, however the correct > > actions are not set. This implies that the ukey's action does not > > equal to the meagaflow's, i.e. revalidators think the underlying > > megaflow's actions are correct however they are not. > > > > We also check the megaflow using the ofproto/trace command, and the > > actions are not matched with the ones in the actual magaflow. By > > performing a revalidator/purge command, the right actions are set. > > > > This patch prevents the inconsistency by considering modify failure > > in revalidators. > > > > To note, we cannot perform two state transitions and change ukey_state > > into UKEY_EVICTED directly here, because, if we do so, the > > sweep will remove the ukey alone and leave dp flow alive. Later, the > > dump will retrieve the dp flow and might even recover it. This will > > contribute the stats of this dp flow twice. > > > > Signed-off-by: Peng He <[email protected]> > > --- > > ofproto/ofproto-dpif-upcall.c | 34 +++++++++++++++++++++++----------- > > 1 file changed, 23 insertions(+), 11 deletions(-) > > > > diff --git a/ofproto/ofproto-dpif-upcall.c > b/ofproto/ofproto-dpif-upcall.c > > index 7ad728adf..c2cefbeb8 100644 > > --- a/ofproto/ofproto-dpif-upcall.c > > +++ b/ofproto/ofproto-dpif-upcall.c > > @@ -2416,26 +2416,30 @@ push_dp_ops(struct udpif *udpif, struct ukey_op > *ops, size_t n_ops) > > > > for (i = 0; i < n_ops; i++) { > > struct ukey_op *op = &ops[i]; > > - struct dpif_flow_stats *push, *stats, push_buf; > > - > > - stats = op->dop.flow_del.stats; > > - push = &push_buf; > > - > > - if (op->dop.type != DPIF_OP_FLOW_DEL) { > > - /* Only deleted flows need their stats pushed. */ > > - continue; > > - } > > > > if (op->dop.error) { > > - /* flow_del error, 'stats' is unusable. */ > > if (op->ukey) { > > ovs_mutex_lock(&op->ukey->mutex); > > - transition_ukey(op->ukey, UKEY_EVICTED); > > + if (op->dop.type == DPIF_OP_FLOW_DEL) { > > + transition_ukey(op->ukey, UKEY_EVICTED); > > + } else { > > + transition_ukey(op->ukey, UKEY_EVICTING); > > + } > > ovs_mutex_unlock(&op->ukey->mutex); > > } > > continue; > > } > > > > + if (op->dop.type != DPIF_OP_FLOW_DEL) { > > + /* Only deleted flows need their stats pushed. */ > > + continue; > > + } > > + > > + struct dpif_flow_stats *push, *stats, push_buf; > > + > > + stats = op->dop.flow_del.stats; > > + push = &push_buf; > > + > > if (op->ukey) { > > ovs_mutex_lock(&op->ukey->mutex); > > transition_ukey(op->ukey, UKEY_EVICTED); > > @@ -2848,6 +2852,14 @@ revalidator_sweep__(struct revalidator > *revalidator, bool purge) > > continue; > > } > > ukey_state = ukey->state; > > + > > + if (ukey_state == UKEY_EVICTING) { > > + /* previous modify operation fails on this ukey and > ukey_state > > + * is set to UKEY_EVICTING, issue a delete operation on > this > > + * ukey. > > + */ > > + delete_op_init(udpif, &ops[n_ops++], ukey); > > How can we be sure this state here is only for failing to modify > operations? > This state is set in other places in the code also. > because revalidate_sweep is called after revaldiate. all the datapath ops have already been pushed, and if there are ukeys at UKEY_EVICTING, it's because previously datapath ops have errors. and if del ops fail, ukey is deleted. so only ukeys at UKEY_EVICTING must come from modify fail. > Maybe we should keep the INCONSISTENT state to avoid this (see v5)? > > > + } > > if (ukey_state == UKEY_OPERATIONAL > > || (ukey_state == UKEY_VISIBLE && purge)) { > > struct recirc_refs recircs = > RECIRC_REFS_EMPTY_INITIALIZER; > > -- > > 2.25.1 > > -- hepeng _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
