On 9/25/21 12:22 AM, Han Zhou wrote: > On Fri, Sep 24, 2021 at 7:27 AM Numan Siddique <[email protected]> wrote: >> >> On Fri, Sep 24, 2021 at 7:22 AM Dumitru Ceara <[email protected]> wrote: >>> >>> On 9/24/21 12:06 PM, Xavier Simonart wrote: >>>> Hi >>>> >>> >>> Hi Xavier, >>> >>>> I have the following question on when ovn is reporting a port to be > up if >>>> conditional monitoring is enabled. >>>> >>>> In the current ovn master, if conditional monitoring is enabled, a > port is >>>> reported up too early i.e. before all related flows are properly > installed >>>> in ovs. >>>> Packets sent through ovs immediately after a port has been reported up >>>> might be lost. This is visible in some ovn tests failing > intermittently. >>>> >>> >>> Thanks for investigating this issue! I'm cc-ing Han explicitly too >>> because he mentioned the same bug during yesterday's IRC meeting. >>> > Thanks for bringing this to my attention. Please see my comments below. > >>>> This is the flow of interaction between ovs/ovn-controller and ovn-sb > after >>>> a Logical switch has been created, a port added to it and the related >>>> interface has been added to ovs. >>>> >>>> [A] OVS=>OVN new interface notification >>>> [B] OVN=>SB monitor_cond_change(Port_Binding) >>>> [C] OVN<= SB notification Port_Binding (mac, uuid, ...) >>>> [D] OVS<= OVN Adding flows for tables [1] (seqno 1) >>>> [E] OVN=>SB monitor_cond_change(Logical_Flow, MAC_Binding, ...) >>>> [F] OVS=>OVN Flows seqno 1 installed >>>> [G] OVN=>SB Port up >>>> [H] OVS<=OVN <= ovn_installed >>>> [I] OVN<=SB <= notification (Logical_Flows) >>>> [J] OVS<=OVN <= Adding flows for all tables (seqno 2) >>>> [K] (flows installed, port only up now...) >>>> >>>> [1] 0, 37, 38, 39, 64, 65 >>>> >>>> Potential solution/workarounds/... in ovn controller >>>> >>>> - (1) check that there is no conditional monitoring in flight > before >>>> reporting a port to be up >>>> >>>> CON: if adding many ports, we might have conditional monitorings in > flight >>>> for a long time, resulting in a long delay in reporting port up >>> >>> I agree, this seems too risky. >>> >>>> >>>> - (2) Disable conditional monitoring of Port_Binding. >>>> >>>> CON: as such, it does not help - only steps [B,C] above are skipped >>>> >>>> - (3) "Proper" fix: only report a port up when there are no > monitoring >>>> conditions related to this port, its datapath, ... in flight >>>> >>>> CON: Must track changes in monitor conditions; overkill for this small >>>> issue? >>> >>> To me it seems like this would be quite complex, with too many chances >>> of introducing bugs for a small issue that shouldn't have high impact in >>> production deployments. >>> >>>> >>>> - (4) Combine 1 & 2: i.e. disable conditional monitoring of > Port_Binding >>>> and check that there is no conditional monitoring in flight before >>>> reporting a port to be UP >>>> >>>> CON: port up might be delayed if many local datapaths are added > (adding a >>>> port to an existing local datapath will not delay port up as in 1) >>>> >>> >>> This sounds like a good compromise to me. > > It seems to me monitoring all Port_Binding doesn't solve the problem. If a > node keeps claiming and releasing the first/last LSP of a LS on the node, > the monitoring condition change to the Datapath table would never stop (in > theory).
I think this continuous claiming/releasing of the first/last LSP is an indication of something else that's wrong in the cluster; but I agree this might be hit in very specific cases. > > Besides, enabling conditional monitoring is to avoid downloading > unnecessary data by every ovn-controller. Port_Binding is one of the > biggest part of the data (although less than Logical_Flow). When > conditional monitoring is enabled (ovn-monitor-all = false), users would > expect port bindings of other nodes are not downloaded. > >>> >>>> - (5) Ignore the problem in ovn, only fix the unit test (e.g. > checking >>>> from unit tests that relevant flows are installed)... >>> >>> This "works" too but I think I'd prefer (4). >> >> >> Given that the issue would be seen only for the first port claim of a >> logical switch 'S' on a given >> chassis, I'd say (5) should suffice. >> > > I think the severity of the problem depends on how we use the "up" in the > LSP and "ovn-installed" in the OVS interface:external_ids. > > For "up", if it is used in test cases to ensure port is claimed by a > chassis, it is fine. If it is used to ensure are flows are installed, it is > risky. In many cases an extra "ovn-nbctl --wait=hv sync" following the "up" > == true check would help. > > For "ovn-installed", we are relying on it in ovn-k8s for port readiness > check. When conditional monitoring is enabled, the required flows may not > be downloaded for the pod to work. This is not acceptable even if it is the > problem only for the first pod on the LS. Now ovn-k8s disables conditional > monitoring, so it is ok for now. If ovn-k8s enables conditional monitoring > (maybe through an option), we can fallback to the old way to check for > certain OVS flows in those cases. Other CMS may handle it the same way. The problem is that the old ovn-k8s way of checking for certain OVS flows is also an heuristic that might be inaccurate. There's no guarantee that if the flows the CMS checks for are present, all other flows corresponding to the logical port were actually installed (e.g., due to a similar monitor condition change). Furthermore, such checks rely on the CMS "knowing" what OpenFlow tables to dump, and OVN offers no guarantee that these tables stay the same across releases (or even within the same release). Moreover, AFAIK, at least in the ovn-k8s case, the CNI way of checking for specific openflows was creating a performance issue at scale, therefore it got removed: https://github.com/ovn-org/ovn-kubernetes/pull/2455 I'm not extremely familiar with the ovn-k8s code base but AFAICT in most cases there's always at least one logical switch port bound for every node switch, the management port. Which lowers significantly the chances of hitting this issue. > > I have no better suggestion to improve the behavior of ovn-installed. But > it seems the benefit of it is not strong enough to justify complex > solutions to it. I think we can keep it as is with the above ways to avoid > the problems. > > In addition, the ovn-installed mechanism introduced another update to the > SB DB (for the "up" status). It could be a significant overhead to the SB > and the whole control plane if the churn rate is high in a large-scale > environment. The major write operation from ovn-controller to SB DB was > port-binding's chassis column and mac-binding entries. Now we are doubling > the amount of port-binding updates. Sure, this additional SB DB transaction can be removed (we could go back to a single write, for example, only when the flows have been installed); however, until now, we didn't identify this additional update as incurring any performance hits in the large scale deployments we have downstream. Regards, Dumitru _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
