On Fri, Sep 24, 2021 at 7:22 AM Dumitru Ceara <[email protected]> wrote:
>
> On 9/24/21 12:06 PM, Xavier Simonart wrote:
> > Hi
> >
>
> Hi Xavier,
>
> > I have the following question on when ovn is reporting a port to be up if
> > conditional monitoring is enabled.
> >
> > In the current ovn master, if conditional monitoring is enabled, a port is
> > reported up too early i.e. before all related flows are properly installed
> > in ovs.
> > Packets sent through ovs immediately after a port has been reported up
> > might be lost. This is visible in some ovn tests failing intermittently.
> >
>
> Thanks for investigating this issue!  I'm cc-ing Han explicitly too
> because he mentioned the same bug during yesterday's IRC meeting.
>
> > This is the flow of interaction between ovs/ovn-controller and ovn-sb after
> > a Logical switch has been created, a port added to it and the related
> > interface has been added to ovs.
> >
> > [A] OVS=>OVN new interface notification
> > [B] OVN=>SB monitor_cond_change(Port_Binding)
> > [C] OVN<= SB notification Port_Binding (mac, uuid, ...)
> > [D] OVS<= OVN Adding flows for tables [1] (seqno 1)
> > [E] OVN=>SB monitor_cond_change(Logical_Flow, MAC_Binding, ...)
> > [F] OVS=>OVN Flows seqno 1 installed
> > [G] OVN=>SB Port up
> > [H] OVS<=OVN <= ovn_installed
> > [I] OVN<=SB <= notification (Logical_Flows)
> > [J] OVS<=OVN <= Adding flows for all tables (seqno 2)
> > [K] (flows installed, port only up now...)
> >
> > [1] 0, 37, 38, 39, 64, 65
> >
> > Potential solution/workarounds/... in ovn controller
> >
> > - (1) check that there is no conditional monitoring in flight before
> > reporting a port to be up
> >
> > CON: if adding many ports, we might have conditional monitorings in flight
> > for a long time, resulting in a long delay in reporting port up
>
> I agree, this seems too risky.
>
> >
> > - (2) Disable conditional monitoring of Port_Binding.
> >
> > CON: as such, it does not help - only steps [B,C] above are skipped
> >
> > - (3) "Proper" fix: only report a port up when there are no monitoring
> > conditions related to this port, its datapath, ... in flight
> >
> > CON: Must track changes in monitor conditions; overkill for this small
> > issue?
>
> To me it seems like this would be quite complex, with too many chances
> of introducing bugs for a small issue that shouldn't have high impact in
> production deployments.
>
> >
> > - (4) Combine 1 & 2: i.e. disable conditional monitoring of Port_Binding
> > and check that there is no conditional monitoring in flight before
> > reporting a port to be UP
> >
> > CON: port up might be delayed if many local datapaths are added (adding a
> > port to an existing local datapath will not delay port up as in 1)
> >
>
> This sounds like a good compromise to me.
>
> > - (5) Ignore the problem in ovn, only fix the unit test (e.g. checking
> > from unit tests that relevant flows are installed)...
>
> This "works" too but I think I'd prefer (4).

Given that the issue would be seen only for the first port claim of a
logical switch 'S' on a given chassis, I'd say (5) should suffice.

Thanks
Numan

> >
> >
> > - (6) Any other solution?
> >
> >
> > What's your views on this?
> >
> > Thanks
> > Xavier
> >
>
> Regards,
> Dumitru
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
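
For readers following the thread, the race in steps [D]-[K] and the extra
check proposed in option (4) can be illustrated with a small toy model. This
is only a sketch under stated assumptions: the class, field and function
names below are hypothetical and are not taken from ovn-controller, and the
model tracks nothing beyond two counters for monitor condition changes and
two for flow batches.

```python
from dataclasses import dataclass


@dataclass
class ControllerModel:
    """Toy model; every name here is hypothetical, not ovn-controller's."""
    cond_requested: int = 0   # monitor_cond_change requests sent to SB
    cond_acked: int = 0       # requests SB has answered with a notification
    flows_sent: int = 0       # seqno of the last flow batch pushed to OVS
    flows_installed: int = 0  # seqno of the last batch OVS confirmed

    def cond_in_flight(self) -> bool:
        return self.cond_requested != self.cond_acked

    def flows_settled(self) -> bool:
        return self.flows_sent > 0 and self.flows_sent == self.flows_installed

    def up_today(self) -> bool:
        # Behaviour described in the thread: only flow installation counts.
        return self.flows_settled()

    def up_option4(self) -> bool:
        # Option (4): additionally require no outstanding condition change.
        return self.flows_settled() and not self.cond_in_flight()


def replay() -> dict:
    """Replay steps [D]..[K]; record (up_today, up_option4) at F, J and K."""
    m = ControllerModel()
    seen = {}
    m.flows_sent = 1                  # [D] flows for the first tables, seqno 1
    m.cond_requested += 1             # [E] monitor_cond_change(Logical_Flow, ...)
    m.flows_installed = 1             # [F] OVS confirms seqno 1
    seen["F"] = (m.up_today(), m.up_option4())
    m.cond_acked = m.cond_requested   # [I] SB notification (Logical_Flows)
    m.flows_sent = 2                  # [J] flows for all tables, seqno 2
    seen["J"] = (m.up_today(), m.up_option4())
    m.flows_installed = 2             # [K] OVS confirms seqno 2
    seen["K"] = (m.up_today(), m.up_option4())
    return seen


if __name__ == "__main__":
    for step, (today, opt4) in replay().items():
        print(f"[{step}] up_today={today} up_option4={opt4}")
```

In this model the flow-only check already reports the port up at [F], which
matches the early "Port up" at [G], while the option (4) check holds off
until [K]. The model does not capture the delay discussed as the CON of
options (1) and (4), where unrelated condition changes keep the counter
"in flight" for a long time.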
