On 7/27/23 04:31, Peng He wrote: > > > Eelco Chaudron <echau...@redhat.com <mailto:echau...@redhat.com>> > 于2023年7月6日周四 15:52写道: > > > > On 6 Jul 2023, at 4:32, Peng He wrote: > > > Eelco Chaudron <echau...@redhat.com <mailto:echau...@redhat.com>> > 于2023年7月5日周三 22:16写道: > > > >> > >> > >> On 1 Jul 2023, at 4:43, Peng He wrote: > >> > >>> OVS allows overlapping megaflows, as long as the actions of these > >>> megaflows are equal. However, the current implementation of action > >>> modification relies on flow_lookup instead of ufid, this could result > >>> in looking up a wrong megaflow and make the ukeys and megaflows > >> inconsistent > >>> > >>> Just like the test case in the patch, at first we have a rule with the > >>> prefix: > >>> > >>> 10.1.2.0/24 <http://10.1.2.0/24> > >>> > >>> and we will get a megaflow with prefixes 10.1.2.2/24 > <http://10.1.2.2/24> when a packet with > >> IP > >>> 10.1.2.2 is received. > >>> > >>> Then suppose we change the rule into 10.1.0.0/16 > <http://10.1.0.0/16>. OVS prefers to keep > >> the > >>> 10.1.2.2/24 <http://10.1.2.2/24> megaflow and just changes its action > instead of extending > >>> the prefix into 10.1.2.2/16 <http://10.1.2.2/16>. > >>> > >>> then suppose we have a 10.1.0.2 packet, since it misses the megaflow, > >>> this time, we will have an overlapping megaflow with the right prefix: > >>> 10.1.0.2/16 <http://10.1.0.2/16> > >>> > >>> now we have two megaflows: > >>> 10.1.2.2/24 <http://10.1.2.2/24> > >>> 10.1.0.2/16 <http://10.1.0.2/16> > >>> > >>> last, suppose we have changed the ruleset again. The revalidator this > >>> time still decides to change the actions of both megaflows instead of > >>> deleting them. > >>> > >>> The dpif_netdev_flow_put will search the megaflow to modify with > unmasked > >>> keys, however it might lookup the wrong megaflow as the key 10.1.2.2 > >> matches > >>> both 10.1.2.2/24 <http://10.1.2.2/24> and 10.1.0.2/16 > <http://10.1.0.2/16>! > >>> > >>> This patch changes the megaflow lookup code in modification path into > >>> relying the ufid to find the correct megaflow instead of key lookup. > >> > >> Thanks for fixing Ilya’s comments! I’ve also copied in some of the v3 > >> discussion, so we can wrap it up here. > >> > >> Acked-by: Eelco Chaudron <echau...@redhat.com > <mailto:echau...@redhat.com>> > >> > >> //Eelco > >> > >>> Signed-off-by: Peng He <hepeng.0...@bytedance.com > <mailto:hepeng.0...@bytedance.com>> > >>> --- > >>> lib/dpif-netdev.c | 62 > ++++++++++++++++++++++++++--------------------- > >>> tests/pmd.at <http://pmd.at> | 48 > ++++++++++++++++++++++++++++++++++++ > >>> 2 files changed, 82 insertions(+), 28 deletions(-) > >>> > >>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > >>> index 70b953ae6..b90ed1542 100644 > >>> --- a/lib/dpif-netdev.c > >>> +++ b/lib/dpif-netdev.c > >>> @@ -4198,36 +4198,43 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread > *pmd, > >>> memset(stats, 0, sizeof *stats); > >>> } > >>> > >>> - ovs_mutex_lock(&pmd->flow_mutex); > >>> - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > >>> - if (!netdev_flow) { > >>> - if (put->flags & DPIF_FP_CREATE) { > >>> + if (put->flags & DPIF_FP_CREATE) { > >> > >> > >> EC> Should this not be: > >> EC> > >> EC> if (put->flags & DPIF_FP_CREATE && !(put->flags & DPIF_FP_MODIFY)) > >> EC> > >> EC> I know this will break the fix, but I’m wondering what the possible > >> EC> combinations are. > >> EC> Quickly looking at the code the following ones seem to exist: > >> EC> > >> EC> DPIF_FP_CREATE -> Create a flow, if one exists return EEXIST > >> EC> DPIF_FP_MODIFY -> Modify a flow, if it does not exist return ENONT > >> EC> DPIF_FP_CREATE | DPIF_FP_MODIFY -> If it exists modify it, if it > does > >> EC> not exists create it. > >> EC> > >> EC> Could the last combination not potentially fail the lookup? > >> EC> > >> EC> Or are we assuming only standalone DPIF_FP_MODIFY’s are the > problem? In > >> EC> theory, it could also be the combination. > >> EC> > >> > >> PH> sorry, I was wrong. Such a combination does exist in > >> PH>the dpif_probe_feature, and it probes the datapath by trying to put > >> flows: > >> PH> > >> PH> error = dpif_flow_put(dpif, DPIF_FP_CREATE | DPIF_FP_MODIFY | > >> PH>DPIF_FP_PROBE, > >> PH> key->data, key->size, NULL, 0, > >> PH> nl_actions, nl_actions_size, > >> PH> ufid, NON_PMD_CORE_ID, NULL); > >> > >> PH> so we CANNOT change the code into: > >> > >> PH> if (put->flags & DPIF_FP_CREATE && !(put->flags & DPIF_FP_MODIFY)) > >> > >> PH> as the issue the patch tries to fix only exist in MODIFY alone > path. > >> PH> If CREATE bit is set, we need to go through the CREATE path no > matter > >> if > >> PH> MODIFY exist or not. > >> > >> Ok, if this is only used by the probe function we should be fine. I did > >> quickly search the code and it seems to be the way. However, if it’s > ever > >> used by any part of ovs with both flags set, it might fail the lookup > and > >> we run into the same problem. > >> > > > > By "the same problem", you mean the problem listed in this patch or this > > fix might lead to a potential failure in other part of OVS? > > If you have a request with both DPIF_FP_MODIFY and DPIF_FP_CREATE set, > the lookup could still return the wrong entry. But if this is not used it > should be fine for now. > > > So do we need to check such combination here, and just return -ENOSUPP?
I'm not sure that's a good solution for a problem in the current version of the fix. In general, this patch seems to not comply with the documented behavior of the flow_put callback. The intended behavior is described in the comment for a struct dpif_flow_put in lib/dpif.h. Both flags being present is a valid case that should be handled. Empty set of flags can also be a valid case of some sort. It will always fail, but it should provide different error codes depending on the flow existence in the datapath. Is there anything major that prevents us from using UFID always? Userspace datapath seems to always have it and it will generate one if not provided. The internal generation might be a problem, I guess, due to a risk of it being non-unique. So, we probably should not use a locally generated ufid for a lookup, only for dp_netdev_flow_add. So, the logic might be: if put->ufid: /* Actually provided UFID from upper layers. */ Lookup by UFID else: Lookup by KEY Not Found: DPIF_FP_CREATE is SET: Add the flow DPIF_FP_CREATE is not SET: ENOENT Found: DPIF_FP_MODIFY is SET: Modify the flow DPIF_FP_MODIFY is not SET: EEXIST This way, if flows are added manually it will be user's responsibility to specify or not specify UFIDs, the dpctl/add-flow is a debug interface after all, so should not be a problem. And the ovs-vswitchd itself will always provide UFID and it checks that ukeys have different UFIDs, unless manually disabled, again, for testing or debugging purposes. Such logic is not that different from what we have today and should cover the issue and all the combinations of flags. Or am I missing something? A few other comments regarding the implementation: - The change reduces the critical section of the flow mutex down to the DPIF_FP_CREATE case. We can't actually do that. The main reason will be a possible race condition on the ovsrcu_set() during the actions update and a subsequent potential actions leak. RCU set operations require external synchronization. There could be other race conditions as well, because multiple threads can work on the same flow at the same time. - The '/* Overlapping flow. */' comment is actually not correct. It's not really the problem of this particular patch as it wasn't correct before either, but we should probably just remove it as you're touching this code anyways. Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev