Eelco Chaudron <[email protected]> 于2022年11月18日周五 15:35写道:
> > > On 18 Nov 2022, at 2:53, Peng He wrote: > > > Eelco Chaudron <[email protected]> 于2022年11月16日周三 18:14写道: > > > >> > >> > >> On 6 Nov 2022, at 8:12, Peng He wrote: > >> > >>> push_dp_ops only handles delete ops errors but ignores (忽略) the modify > >> (修改) > >>> ops results. It's better to handle all the dp operation errors in > >>> a consistent way. > >>> > >>> We observe in the production environment that sometimes a megaflow > >>> with wrong actions keep staying in datapath. The coverage command shows > >>> revalidators have dumped several times, however the correct > >>> actions are not set. This implies (暗示) that the ukey's action does not > >>> equal to the meagaflow's, i.e. revalidators think the underlying (基础) > >>> megaflow's actions are correct however they are not. > >>> > >>> We also check the megaflow using the ofproto/trace command, and the > >>> actions are not matched with the ones in the actual magaflow. By > >>> performing a revalidator/purge command, the right actions are set. > >>> > >>> This patch (补丁) prevents the inconsistency by considering modify (修改) > >> failure > >>> in revalidators. > >>> > >>> To note, we cannot perform two state transitions and change ukey_state > >>> into UKEY_EVICTED directly here, because, if we do so, the > >>> sweep (扫) will remove the ukey alone and leave dp flow alive. Later, > the > >>> dump will retrieve (检索) the dp flow and might even recover it. This > will > >>> contribute the stats of this dp flow twice. > >>> > >>> Signed-off-by: Peng He <[email protected]> > >> > >> Hi Peng, > >> > >> Thanks for looking at the statistics part, see some comments inline! > >> > >> In addition, I already acked patch (补丁) 2 out of this series, but it > >> mentions patch x/3, but I do not see patch 3 in this series. Is this > >> missing? Or are there only two patches (补丁) left? > > > > > > there are only two patches. the third one is about the race comments, > which > > is not in this patchset. > > I guess I made some mistake. > > > > > >> > >> > >> Cheers, > >> > >> Eelco > >> > >> > >>> --- > >>> ofproto/ofproto-dpif-upcall.c | 39 ++++++++++++++++++++++------------- > >>> 1 file (文件) changed, 25 insertions(+), 14 deletions(-) > >>> > >>> diff --git a/ofproto/ofproto-dpif-upcall.c > >> b/ofproto/ofproto-dpif-upcall.c > >>> index 7ad728adf..a7970fa9b 100644 > >>> --- a/ofproto/ofproto-dpif-upcall.c > >>> +++ b/ofproto/ofproto-dpif-upcall.c > >>> @@ -254,6 +254,7 @@ enum ukey_state { > >>> UKEY_CREATED = 0, > >>> UKEY_VISIBLE, /* Ukey is in umap, datapath flow install (安装) > >> is queued (队列) . */ > >>> UKEY_OPERATIONAL, /* Ukey is in umap, datapath flow is installed > >> (安装) . */ > >>> + UKEY_INCONSISTENT, /* Ukey is in umap, datapath flow is modified > >> (修改) but failed */ > >>> UKEY_EVICTING, /* Ukey is in umap, datapath flow delete is > >> queued (队列) . */ > >>> UKEY_EVICTED, /* Ukey is in umap, datapath flow is deleted. > */ > >>> UKEY_DELETED, /* Ukey removed from umap, ukey free is > >> deferred (推迟) . */ > >>> @@ -1966,6 +1967,10 @@ transition_ukey_at(struct udpif_key *ukey, enum > >> ukey_state dst, > >>> * UKEY_VISIBLE -> UKEY_EVICTED > >>> * A handler attempts to install (安装) the flow, but the datapath > >> rejects it. > >>> * Consider that the datapath has already destroyed it. > >>> + * UKEY_OPERATIONAL -> UKEY_INCONSISTENT > >>> + * A revalidator modifies (修改) the flow with error returns (返回) . > >>> + * UKEY_INCONSISTENT -> UKEY_EVICTING > >>> + * A revalidator decides to evict (驱逐) the datapath flow. > >>> * UKEY_OPERATIONAL -> UKEY_EVICTING > >>> * A revalidator decides to evict (驱逐) the datapath flow. > >>> * UKEY_EVICTING -> UKEY_EVICTED > >>> @@ -1974,7 +1979,8 @@ transition_ukey_at(struct udpif_key *ukey, enum > >> ukey_state dst, > >>> * A revalidator has removed the ukey from the umap and is > >> deleting it. > >>> */ > >>> if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE && > >>> - dst < UKEY_DELETED)) { > >>> + dst < UKEY_DELETED) || > >>> + (ukey->state == UKEY_OPERATIONAL && dst == > UKEY_EVICTING)) { > >>> ukey->state = dst; > >>> } else { > >>> struct ds ds = DS_EMPTY_INITIALIZER; > >>> @@ -2416,26 +2422,31 @@ push_dp_ops(struct udpif *udpif, struct ukey_op > >> *ops, size_t n_ops) > >>> > >>> for (i = 0; i < n_ops; i++) { > >>> struct ukey_op *op = &ops[i]; > >>> - struct dpif_flow_stats *push, *stats, push_buf; > >>> - > >>> - stats = op->dop.flow_del.stats; > >>> - push = &push_buf; > >>> - > >>> - if (op->dop.type != DPIF_OP_FLOW_DEL) { > >>> - /* Only deleted flows need their stats pushed. */ > >>> - continue; > >>> - } > >>> > >>> if (op->dop.error) { > >>> - /* flow_del error, 'stats' is unusable. */ > >>> if (op->ukey) { > >>> ovs_mutex_lock(&op->ukey->mutex); > >>> - transition_ukey(op->ukey, UKEY_EVICTED); > >>> + if (op->dop.type == DPIF_OP_FLOW_DEL) > >> > >> We should {} to the if branch. > >> > > > > OK > > > > > >> > >>> + transition_ukey(op->ukey, UKEY_EVICTED); > >>> + else { > >>> + transition_ukey(op->ukey, UKEY_INCONSISTENT); > >> > >> Is there a specific (特定) reason to have delete go through a different > >> state on failure? If not, it might be more (更多) consistent especially > when > >> debugging. > > > > > > As is stated in the commit comments, > > " > >> To note, we cannot perform two state transitions and change ukey_state > >> into UKEY_EVICTED directly here, because, if we do so, the > >> sweep will remove the ukey alone and leave dp flow alive. Later, the > >> dump will retrieve the dp flow and might even recover it. This will > >> contribute the stats of this dp flow twice. > > " > > So I have to introduce another state here. > > I meant chaining the UKEY_EVICTED also to UKEY_INCONSISTENT. > By chaining, you mean changing, correct? I got it. So if error, we should all go to the UKEY_INCONSISTENT. yes, it makes sense, will do that in the next version. > > > > >> > >>> + } > >>> ovs_mutex_unlock(&op->ukey->mutex); > >>> } > >>> + /* if modify (修改) or delete fails, there is no need to > push > >> stats */ > >> > >> Captial for “If” and ending with a dot. > >> > > > > Sure, will do that. > > > > > >> > >>> continue; > >>> } > >>> > >>> + if (op->dop.type != DPIF_OP_FLOW_DEL) { > >>> + /* Only deleted flows need their stats pushed. */ > >>> + continue; > >>> + } > >>> + > >>> + struct dpif_flow_stats *push, *stats, push_buf; > >>> + > >>> + stats = op->dop.flow_del.stats; > >>> + push = &push_buf; > >>> + > >>> if (op->ukey) { > >>> ovs_mutex_lock(&op->ukey->mutex); > >>> transition_ukey(op->ukey, UKEY_EVICTED); > >>> @@ -2848,14 +2859,14 @@ revalidator_sweep__(struct revalidator > >> *revalidator, bool purge (吹扫) ) > >>> continue; > >>> } > >>> ukey_state = ukey->state; > >>> - if (ukey_state == UKEY_OPERATIONAL > >>> + if (ukey_state == UKEY_OPERATIONAL || ukey_state == > >> UKEY_INCONSISTENT > >>> || (ukey_state == UKEY_VISIBLE && purge (吹扫) )) { > >>> struct recirc_refs recircs = > >> RECIRC_REFS_EMPTY_INITIALIZER; > >>> bool seq_mismatch = (ukey->dump_seq != dump_seq > >>> && ukey->reval_seq != reval_seq); > >>> enum reval_result result; > >>> > >>> - if ( purge (吹扫) ) { > >>> + if ( purge (吹扫) || ukey_state == UKEY_INCONSISTENT) { > >>> result = UKEY_DELETE; > >>> } else if (!seq_mismatch) { > >>> result = UKEY_KEEP; > >>> -- > >>> 2.25.1 > >>> > >>> _______________________________________________ > >>> dev mailing list (列表) > >>> [email protected] > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > >> > > > > -- > > hepeng > > -- hepeng _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
