Hi, Zhike, After receiving your email, I was becoming curious about this code and did more investigation on it.
and I found some problems with the code and now I believe this inconsistent processing is NOT the root cause for the inconsistent actions between ukey and datapath. and I found a new cause for that, but due to this complex race between PMD and revalidator, I wish this time I am right. But before that, why are you interested in this patch? Have you found the same issue in your environment? On Thu, Sep 22, 2022 at 6:54 PM .贺鹏 <hepeng.0...@bytedance.com> wrote: > Hi, zhike, > > It's difficult to give a very clear sequences about how this inconsistency > happens, but I can give you more details. > > This is observed in our production environment. The correct megaflow > should encap packets with vxlan header and send out, but the action is drop. > This is usually because the neigh info is not available at the moment when > the upcall happens. > > Normally, the drop action is ephemeral, and reavalidator will later modify > the megaflow's action into the tnl_push. > > But there are a few of cases, only happened 1~2 times in a year, where the > drop actions will never be replaced by tnl_push. > > just like in the commits mentioned, > > "The coverage command shows revalidators have dumped several times, > however the correct actions are not set. This implies that the ukey's > action does not equal to the meagaflow's, i.e. revalidators think the > underlying > > megaflow's actions are correct however they are not." > > > I do not know how this happened, but I do think this inconsistent processing > could be one of the reasons. > > Even there is no such bug, I think keeping processing inconsistent is > necessary. > > > > > > On Wed, Sep 21, 2022 at 5:57 PM 王志克 <wangzh...@jd.com> wrote: > >> Hi Hepeng, >> >> >> >> Can you please explain the sequence that how this inconsistence could >> happen? Why you believe the current actions in existing netdev_flow is old? >> >> >> >> Thanks. >> >> >> >> Br, >> >> wangzhike >> >> >> >> >> >> >> >> >> >> >> ***************************************************************************************************************************** >> >> [ovs-dev,ovs-dev,v2,4/4] dpif-netdev: fix inconsistent processing between >> ukey and megaflow >> >> Message ID >> >> 20220604151857.66550-4-hepeng.0...@bytedance.com >> >> State >> >> New >> >> Headers >> >> show >> >> Series >> >> [ovs-dev,ovs-dev,v2,1/4] ofproto-dpif-upcall: fix push_dp_ops >> <http://patchwork.ozlabs.org/project/openvswitch/list/?series=303324>| >> expand >> Checks >> >> Context >> >> Check >> >> Description >> >> ovsrobot/apply-robot >> >> *warning* >> >> apply and check: warning >> <https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022431.html> >> >> ovsrobot/github-robot-_Build_and_Test >> >> *success* >> >> github build: passed >> <https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022436.html> >> >> ovsrobot/intel-ovs-compilation >> >> *success* >> >> test: success >> <https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022439.html> >> Commit Message >> >> Peng He >> <http://patchwork.ozlabs.org/project/openvswitch/list/?submitter=78087>June >> 4, 2022, 3:18 p.m. UTC >> >> When PMDs perform upcalls, the newly generated ukey will replace >> >> the old, however, the newly generated mageflow will be discard >> >> to reuse the old one without checking if the actions of new and >> >> old are equal. >> >> >> >> We observe in the production environment that sometimes a megaflow >> >> with wrong actions keep staying in datapath. The coverage command shows >> >> revalidators have dumped serveral times, however the correct >> >> actions are not set. This implies that the ukey's action does not >> >> equal to the meagaflow's, i.e. revalidators think the underlying >> >> megaflow's actions are correct however they are not. >> >> >> >> We also check the megaflow using the ofproto/trace command, and the >> >> actions are not matched with the ones in the actual magaflow. By >> >> performing a revalidator/purge command, the right actions are set. >> >> >> >> *Signed-off-by: Peng He <hepeng.0...@bytedance.com >> <hepeng.0...@bytedance.com>>* >> >> --- >> >> lib/dpif-netdev.c | 17 ++++++++++++++++- >> >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> Comments >> >> 0-day Robot >> <http://patchwork.ozlabs.org/project/openvswitch/list/?submitter=74326>June >> 4, 2022, 3:44 p.m. UTC | #1 >> <http://patchwork.ozlabs.org/comment/2907057/> >> >> Bleep bloop. Greetings Peng He, I am a robot and I have tried out your >> patch. >> >> Thanks for your contribution. >> >> >> >> I encountered some error that I wasn't expecting. See the details below. >> >> >> >> >> >> checkpatch: >> >> ERROR: Author Peng He <xnhp0...@gmail.com> needs to sign off. >> >> WARNING: Unexpected sign-offs from developers who are not authors or >> co-authors or committers: Peng He <hepeng.0...@bytedance.com> >> >> Lines checked: 58, Warnings: 1, Errors: 1 >> >> >> >> >> >> Please check this out. If you feel there has been an error, please email >> acon...@redhat.com >> >> >> >> Thanks, >> >> 0-day Robot >> >> 1638948diff >> <http://patchwork.ozlabs.org/project/openvswitch/patch/20220604151857.66550-4-hepeng.0...@bytedance.com/raw/> >> mbox >> <http://patchwork.ozlabs.org/project/openvswitch/patch/20220604151857.66550-4-hepeng.0...@bytedance.com/mbox/> >> series <http://patchwork.ozlabs.org/series/303324/mbox/> >> Patch >> >> *diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c* >> >> *index ff57b3961..985c25c58 100644* >> >> *--- a/lib/dpif-netdev.c* >> >> *+++ b/lib/dpif-netdev.c* >> >> *@@ -8305,7 +8305,22 @@* handle_packet_upcall(struct dp_netdev_pmd_thread >> *pmd, >> >> * to be locking revalidators out of making flow modifications. */ >> >> ovs_mutex_lock(&pmd->flow_mutex); >> >> netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); >> >> - if (OVS_LIKELY(!netdev_flow)) { >> >> + if (OVS_UNLIKELY(netdev_flow)) { >> >> + struct dp_netdev_actions *old_act = >> >> + dp_netdev_flow_get_actions(netdev_flow); >> >> + >> >> + if ((add_actions->size != old_act->size) || >> >> + memcmp(old_act->actions, add_actions->data, >> >> + add_actions->size)) { >> >> + >> >> + struct dp_netdev_actions *new_act = >> >> + dp_netdev_actions_create(add_actions->data, >> >> + add_actions->size); >> >> + >> >> + ovsrcu_set(&netdev_flow->actions, new_act); >> >> + ovsrcu_postpone(dp_netdev_actions_free, old_act); >> >> + } >> >> + } else { >> >> netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid, >> >> add_actions->data, >> >> add_actions->size, >> orig_in_port); >> >> >> >> >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev