Hi Peng,

Right, I also met this issue, and wondering the sequence for this 
inconsistence, and would like to hear your “new cause”.

Anyway I believe your below patch should fix this issue.
[ovs-dev,ovs-dev,v2,1/4] ofproto-dpif-upcall: fix push_dp_ops

Br,
Zhike

From: ".贺鹏" <[email protected]>
Date: Friday, September 23, 2022 at 8:59 PM
To: 王志克 <[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: [来自外部的邮件]Re: [External] Re:[ovs-dev,ovs-dev,v2,4/4] dpif-netdev: fix 
inconsistent processing between ukey and megaflow

京东安全提示:此封邮件来自公司外部,除非您能判断发件人和知道邮件内容安全,否则请勿打开链接或者附件。
JD Security Tips: Please do not click on links or open attachments unless you 
trust the sender and know the content is safe.


Hi, Zhike,

After receiving your email, I was becoming curious about this code and did more 
investigation on it.

and I found some problems with the code and now I believe this inconsistent 
processing is NOT the root cause for the inconsistent actions between ukey and 
datapath.
and I found a new cause for that, but due to this complex race between PMD and 
revalidator, I wish this time I am right.

But before that, why are you interested in this patch? Have you found the same 
issue in your environment?




On Thu, Sep 22, 2022 at 6:54 PM .贺鹏 
<[email protected]<mailto:[email protected]>> wrote:
Hi, zhike,

It's difficult to give a very clear sequences about how this inconsistency 
happens, but I can give you more details.

This is observed in our production environment. The correct megaflow should 
encap packets with vxlan header and send out, but the action is drop.
This is usually because the neigh info is not available at the moment when the 
upcall happens.

Normally, the drop action is ephemeral, and reavalidator will later modify the 
megaflow's action into the tnl_push.

But there are a few of cases, only happened 1~2 times in a year, where the drop 
actions will never be replaced by tnl_push.

just like in the commits mentioned,

"The coverage command shows revalidators have dumped several times,
however the correct actions are not set. This implies that the ukey's
action does not equal to the meagaflow's, i.e. revalidators think the underlying

megaflow's actions are correct however they are not."



I do not know how this happened, but I do think this inconsistent processing 
could be one of the reasons.

Even there is no such bug, I think keeping processing inconsistent is necessary.





On Wed, Sep 21, 2022 at 5:57 PM 王志克 <[email protected]<mailto:[email protected]>> 
wrote:
Hi Hepeng,

Can you please explain the sequence that how this inconsistence could happen? 
Why you believe the current actions in existing netdev_flow is old?

Thanks.

Br,
wangzhike




*****************************************************************************************************************************
[ovs-dev,ovs-dev,v2,4/4] dpif-netdev: fix inconsistent processing between ukey 
and megaflow
Message ID

[email protected]<mailto:[email protected]>

State

New

Headers

show

Series

[ovs-dev,ovs-dev,v2,1/4] ofproto-dpif-upcall: fix push_dp_ops 
<http://patchwork.ozlabs.org/project/openvswitch/list/?series=303324> | expand

Checks
Context

Check

Description

ovsrobot/apply-robot

warning

apply and check: 
warning<https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022431.html>

ovsrobot/github-robot-_Build_and_Test

success

github build: 
passed<https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022436.html>

ovsrobot/intel-ovs-compilation

success

test: 
success<https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022439.html>

Commit Message
Peng 
He<http://patchwork.ozlabs.org/project/openvswitch/list/?submitter=78087>June 
4, 2022, 3:18 p.m. UTC

When PMDs perform upcalls, the newly generated ukey will replace

the old, however, the newly generated mageflow will be discard

to reuse the old one without checking if the actions of new and

old are equal.



We observe in the production environment that sometimes a megaflow

with wrong actions keep staying in datapath. The coverage command shows

revalidators have dumped serveral times, however the correct

actions are not set. This implies that the ukey's action does not

equal to the meagaflow's, i.e. revalidators think the underlying

megaflow's actions are correct however they are not.



We also check the megaflow using the ofproto/trace command, and the

actions are not matched with the ones in the actual magaflow. By

performing a revalidator/purge command, the right actions are set.



Signed-off-by: Peng He 
<[email protected]<mailto:[email protected]>>

---

 lib/dpif-netdev.c | 17 ++++++++++++++++-

 1 file changed, 16 insertions(+), 1 deletion(-)

Comments
0-day 
Robot<http://patchwork.ozlabs.org/project/openvswitch/list/?submitter=74326>June
 4, 2022, 3:44 p.m. UTC | #1<http://patchwork.ozlabs.org/comment/2907057/>

Bleep bloop.  Greetings Peng He, I am a robot and I have tried out your patch.

Thanks for your contribution.



I encountered some error that I wasn't expecting.  See the details below.





checkpatch:

ERROR: Author Peng He <[email protected]<mailto:[email protected]>> needs to 
sign off.

WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Peng He 
<[email protected]<mailto:[email protected]>>

Lines checked: 58, Warnings: 1, Errors: 1





Please check this out.  If you feel there has been an error, please email 
[email protected]<mailto:[email protected]>



Thanks,

0-day Robot
1638948diff<http://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/raw/>mbox<http://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/mbox/>series<http://patchwork.ozlabs.org/series/303324/mbox/>
Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c

index ff57b3961..985c25c58 100644

--- a/lib/dpif-netdev.c

+++ b/lib/dpif-netdev.c

@@ -8305,7 +8305,22 @@  handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,

          * to be locking revalidators out of making flow modifications. */

         ovs_mutex_lock(&pmd->flow_mutex);

         netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);

-        if (OVS_LIKELY(!netdev_flow)) {

+        if (OVS_UNLIKELY(netdev_flow)) {

+            struct dp_netdev_actions *old_act =

+                dp_netdev_flow_get_actions(netdev_flow);

+

+            if ((add_actions->size != old_act->size) ||

+                    memcmp(old_act->actions, add_actions->data,

+                                             add_actions->size)) {

+

+               struct dp_netdev_actions *new_act =

+                   dp_netdev_actions_create(add_actions->data,

+                                            add_actions->size);

+

+               ovsrcu_set(&netdev_flow->actions, new_act);

+               ovsrcu_postpone(dp_netdev_actions_free, old_act);

+            }

+        } else {

             netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,

                                              add_actions->data,

                                              add_actions->size, orig_in_port);


_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to