Hi

I have a question about when OVN reports a port as up when conditional
monitoring is enabled.

In current OVN master, if conditional monitoring is enabled, a port is
reported up too early, i.e. before all related flows are installed in OVS.
Packets sent through OVS immediately after the port has been reported up
might therefore be lost. This shows up as intermittent failures in some OVN
tests.

This is the flow of interaction between OVS, ovn-controller (OVN below) and
ovn-sb (SB below) after a logical switch has been created, a port has been
added to it, and the related interface has been added to OVS:

[A] OVS=>OVN  new interface notification
[B] OVN=>SB   monitor_cond_change(Port_Binding)
[C] OVN<=SB   notification Port_Binding (mac, uuid, ...)
[D] OVS<=OVN  Adding flows for tables [1] (seqno 1)
[E] OVN=>SB   monitor_cond_change(Logical_Flow, MAC_Binding, ...)
[F] OVS=>OVN  Flows seqno 1 installed
[G] OVN=>SB   Port up
[H] OVS<=OVN  ovn_installed
[I] OVN<=SB   notification (Logical_Flows)
[J] OVS<=OVN  Adding flows for all tables (seqno 2)
[K] (flows installed, port only up now...)

[1] Tables 0, 37, 38, 39, 64, 65
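
To make the race concrete, the sequence above can be replayed in a small
Python model. This is only a sketch: the event names are made up for
illustration and are not ovn-controller identifiers. It counts unanswered
monitor_cond_change requests and records how many are still outstanding at
the moment port up is reported.

```python
# Toy replay of steps [A]..[K]. Event names are hypothetical, chosen only
# for this illustration; this is not ovn-controller code.
EVENTS = [
    ("A", "iface_added"),
    ("B", "cond_change_sent"),     # monitor_cond_change(Port_Binding)
    ("C", "update_received"),      # answer to B
    ("D", "flows_pushed", 1),
    ("E", "cond_change_sent"),     # monitor_cond_change(Logical_Flow, ...)
    ("F", "flows_installed", 1),
    ("G", "port_up_reported"),
    ("H", "ovn_installed_set"),
    ("I", "update_received"),      # answer to E
    ("J", "flows_pushed", 2),
    ("K", "flows_installed", 2),
]


def replay(events):
    """Walk the events and report the state at the time of 'port up'."""
    in_flight = 0  # monitor_cond_change requests not yet answered
    report = {}
    for step, kind, *_args in events:
        if kind == "cond_change_sent":
            in_flight += 1
        elif kind == "update_received":
            in_flight -= 1
        elif kind == "port_up_reported":
            report["up_step"] = step
            report["in_flight_at_up"] = in_flight
        elif kind == "flows_installed":
            report["last_installed_step"] = step
    return report


if __name__ == "__main__":
    print(replay(EVENTS))
```

In this model, port up is reported at step G while the request from step E
is still unanswered, and the last flow installation only happens at step K.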

Potential solutions/workarounds in ovn-controller:

   - (1) Check that no conditional monitoring request is in flight before
   reporting a port as up.

CON: if many ports are being added, conditional monitoring requests might
be in flight for a long time, resulting in a long delay before port up is
reported.
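
A minimal Python sketch of what (1) amounts to, assuming a single global
counter of unanswered monitor_cond_change requests. The class and method
names are hypothetical and do not exist in ovn-controller. The sketch also
ignores that flows derived from the answer still have to be installed
before the port is usable.

```python
class GlobalGate:
    """Hold 'port up' while any monitor_cond_change is unanswered."""

    def __init__(self):
        self.in_flight = 0     # unanswered monitor_cond_change requests
        self.waiting = []      # ports with flows installed, not yet reported
        self.reported_up = []  # ports reported up, in order

    def cond_change_sent(self):
        self.in_flight += 1

    def cond_change_answered(self):
        self.in_flight -= 1
        self._flush()

    def flows_installed(self, port):
        self.waiting.append(port)
        self._flush()

    def _flush(self):
        # Report every waiting port once nothing is in flight any more.
        if self.in_flight == 0:
            self.reported_up.extend(self.waiting)
            self.waiting.clear()


if __name__ == "__main__":
    gate = GlobalGate()
    gate.cond_change_sent()        # request caused by lsp1
    gate.flows_installed("lsp1")
    gate.cond_change_sent()        # unrelated request, e.g. for another port
    gate.cond_change_answered()    # lsp1's request is answered ...
    print(gate.reported_up)        # ... but lsp1 is still held back
    gate.cond_change_answered()
    print(gate.reported_up)
```

The usage part shows the CON: an unrelated request keeps the counter above
zero, so the first port stays held back until that one is answered too.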

   - (2) Disable conditional monitoring of Port_Binding.

CON: on its own, this does not help; only steps [B] and [C] above are skipped

   - (3) "Proper" fix: only report a port up when there are no monitoring
   conditions related to this port, its datapath, ... in flight

CON: Must track changes in monitor conditions; overkill for this small
issue?
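
For comparison, a rough Python sketch of the bookkeeping (3) would need,
assuming each monitor condition change can be tagged with the ports it
concerns. How to derive that set (port, datapath, ...) is exactly the hard
part and is not shown; all names here are hypothetical.

```python
class PerPortGate:
    """Hold 'port up' only while a request related to that port is pending."""

    def __init__(self):
        self.pending = {}       # request id -> set of related port names
        self.installed = set()  # ports with flows installed, not yet reported
        self.reported_up = []   # ports reported up, in order

    def cond_change_sent(self, req_id, ports):
        self.pending[req_id] = set(ports)

    def cond_change_answered(self, req_id):
        self.pending.pop(req_id, None)
        self._flush()

    def flows_installed(self, port):
        self.installed.add(port)
        self._flush()

    def _blocked(self, port):
        return any(port in ports for ports in self.pending.values())

    def _flush(self):
        # sorted() returns a new list, so the set can be modified in the loop.
        for port in sorted(self.installed):
            if not self._blocked(port):
                self.installed.discard(port)
                self.reported_up.append(port)


if __name__ == "__main__":
    gate = PerPortGate()
    gate.cond_change_sent(1, ["lsp1"])
    gate.cond_change_sent(2, ["lsp2"])
    gate.flows_installed("lsp1")
    gate.flows_installed("lsp2")
    gate.cond_change_answered(2)   # only lsp2 becomes reportable
    print(gate.reported_up)
```

Compared with a single global counter, a port is delayed only by requests
tagged with it, at the cost of maintaining the request-to-port mapping.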

   - (4) Combine (1) and (2): i.e. disable conditional monitoring of
   Port_Binding and check that no conditional monitoring request is in
   flight before reporting a port as up.

CON: port up might be delayed if many local datapaths are added (but,
unlike in (1), adding a port to an existing local datapath will not delay
port up)

   - (5) Ignore the problem in OVN and only fix the unit tests (e.g. have
   the tests check that the relevant flows are installed).


   - (6) Any other solution?


What are your views on this?

Thanks
Xavier
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev