On 11/6/20 6:10 PM, Dumitru Ceara wrote:
> On 11/6/20 5:59 PM, Ben Pfaff wrote:
>> On Fri, Nov 06, 2020 at 05:25:36PM +0100, Dumitru Ceara wrote:
>>> On 11/6/20 4:18 AM, Ben Pfaff wrote:
>>>> Some of these are from ovn-northd, not ovn-northd-ddlog, and so I don't
>>>> think it's likely that my patch series causes them, since it doesn't
>>>> really touch ovn-northd.  The OVN testsuite has a regrettable number of
>>>> race conditions in it.
>>>
>>> I agree, there are probably races in the testsuite but I think there's
>>> also a bug with "ovn-nbctl --wait=hv sync".
>>>
>>> For example, adding a "sleep 1" here:
>>> https://github.com/ovn-org/ovn/blob/c108f23e1c10910031f9409b79001d001aae0c8f/tests/ovn.at#L21478
>>>
>>> makes this test pass on my machine with both ovn-northd and
>>> ovn-northd-ddlog.
>>
>> This hints toward a bug in ovn-controller.  It suggests that
>> ovn-controller reports that it is caught up before it has pushed all the
>> flows to ovs-vswitchd.
>>
> 
> Right, I tracked it down to ovn-controller setting nb_cfg in the SB
> chassis record while there's still an unreplied monitor_cond_change from SB.
> 
> Suppose some logical_flows were added to the SB at nb_cfg == X, and at
> nb_cfg == X+1 some changes happen to the SB that also make
> ovn-controller request a monitor condition change that includes the
> flows added at X.  Then ovn-controller "acks" nb_cfg == X too early.
> 
> I think it might be enough to just delay reporting that ovn-controller
> has caught up while there are still in-flight monitor_cond_change
> requests.  I'll see if I can come up with a fix and check whether there
> are other corner cases.

I sent a fix for this (needs a new ovsdb-idl API):
- OVS patch:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=213074
- OVN patch: http://patchwork.ozlabs.org/project/ovn/list/?series=213075
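
To illustrate the idea only, here is a toy Python model of the "delay the
ack while a monitor_cond_change is unreplied" rule. It is not the code in
the patches above and it does not use the new ovsdb-idl API; every class,
field, and method name below is hypothetical and exists just for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ToyController:
    """Toy stand-in for ovn-controller; all names here are hypothetical."""
    seen_nb_cfg: int = 0            # latest nb_cfg observed from the SB
    acked_nb_cfg: int = 0           # value written back to the chassis record
    inflight_cond_changes: int = 0  # unreplied monitor_cond_change requests

    def send_cond_change(self):
        # A monitor condition change was requested and not yet answered.
        self.inflight_cond_changes += 1

    def recv_cond_change_reply(self):
        # The SB replied; rows matching the new condition have arrived.
        if self.inflight_cond_changes > 0:
            self.inflight_cond_changes -= 1

    def observe_nb_cfg(self, value):
        self.seen_nb_cfg = max(self.seen_nb_cfg, value)

    def maybe_ack(self):
        # Only report "caught up" when no condition change is pending,
        # because its reply may still deliver flows added earlier.
        if self.inflight_cond_changes == 0:
            self.acked_nb_cfg = self.seen_nb_cfg
        return self.acked_nb_cfg


def demo():
    c = ToyController()
    acks = []
    c.observe_nb_cfg(1)
    acks.append(c.maybe_ack())    # nothing pending, ack goes out
    c.send_cond_change()
    c.observe_nb_cfg(2)
    acks.append(c.maybe_ack())    # request in flight, ack is held back
    c.recv_cond_change_reply()
    acks.append(c.maybe_ack())    # reply received, ack can advance
    return acks
```

In this model the reported value stays at the last acked nb_cfg until the
outstanding request is answered, which is the behavior "--wait=hv" would
need in order to avoid the race described above.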

Thanks,
Dumitru
