On 11/6/20 5:59 PM, Ben Pfaff wrote:
> On Fri, Nov 06, 2020 at 05:25:36PM +0100, Dumitru Ceara wrote:
>> On 11/6/20 4:18 AM, Ben Pfaff wrote:
>>> Some of these are from ovn-northd, not ovn-northd-ddlog, and so I don't
>>> think it's likely that my patch series causes them, since it doesn't
>>> really touch ovn-northd.  The OVN testsuite has a regrettable number of
>>> race conditions in it.
>>
>> I agree, there are probably races in the testsuite but I think there's
>> also a bug with "ovn-nbctl --wait=hv sync".
>>
>> For example, adding a "sleep 1" here:
>> https://github.com/ovn-org/ovn/blob/c108f23e1c10910031f9409b79001d001aae0c8f/tests/ovn.at#L21478
>>
>> makes this test pass on my machine with both ovn-northd and
>> ovn-northd-ddlog.
>
> This hints toward a bug in ovn-controller.  It suggests that
> ovn-controller reports that it is caught up before it has pushed all the
> flows to ovs-vswitchd.
>

Right, I tracked it down to ovn-controller setting nb_cfg in the SB
chassis record while a monitor_cond_change request is still awaiting a
reply from the SB.

Suppose some logical_flows were added to the SB at nb_cfg == X.  If, at
nb_cfg == X+1, some changes happen in the SB that also make
ovn-controller request a monitor condition change that includes the
flows added at X, then ovn-controller "acks" nb_cfg == X too early.

I think it might be enough to just delay reporting that ovn-controller
has caught up while there are still in-flight monitor_cond_change
requests.  I'll see if I can come up with a fix, and check whether there
are other corner cases.

Regards,
Dumitru
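
To make the proposed ordering concrete, here is a rough sketch of the idea as a toy model.  It is not ovn-controller code: the class, its fields, and its method names are all hypothetical, and it only illustrates "do not report nb_cfg while a monitor_cond_change request is still unanswered".

```python
from dataclasses import dataclass


@dataclass
class ControllerModel:
    """Toy model of the reporting decision; every name here is hypothetical."""

    sb_nb_cfg: int = 0             # nb_cfg value most recently seen in the SB
    reported_nb_cfg: int = 0       # nb_cfg value written to our chassis record
    pending_cond_changes: int = 0  # monitor_cond_change requests without a reply

    def request_cond_change(self) -> None:
        # A new monitor condition was sent to the SB; rows matching it
        # (for example flows added at an earlier nb_cfg) may still be missing.
        self.pending_cond_changes += 1

    def cond_change_replied(self) -> None:
        # The SB answered one outstanding request.
        if self.pending_cond_changes == 0:
            raise RuntimeError("reply received with no request in flight")
        self.pending_cond_changes -= 1

    def maybe_report_caught_up(self) -> bool:
        # Proposed rule: hold back the "ack" while any request is in flight.
        if self.pending_cond_changes > 0:
            return False
        self.reported_nb_cfg = self.sb_nb_cfg
        return True


model = ControllerModel()
model.sb_nb_cfg = 1
model.request_cond_change()
first_attempt = model.maybe_report_caught_up()   # held back, request in flight
model.cond_change_replied()
second_attempt = model.maybe_report_caught_up()  # allowed, nothing in flight
```

In this model the first attempt leaves reported_nb_cfg untouched and only the second one advances it.  A real fix would additionally have to account for the flows being installed in ovs-vswitchd, which this sketch ignores.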
