The test incorrectly assumed that "ovn-nbctl --wait=hv sync" will always send an OpenFlow barrier to ovs-vswitchd. That doesn't happen unless there are other OpenFlow (rule or group) changes that need to be programmed in the datapath.
An initial attempt to fix this [0] uncovered the fact that there's no easy way to cover all possible scenarios [1]. It seems that the effort to fix the test is not justified for the case it tests for (that inactivity probes are handled by the features module - that is quite obviously handled in the code). It's therefore more reasonable to just skip the test (to avoid noise in CI). Spotted in CI, mostly on oversubscribed systems. A log sample that shows that ovn-controller didn't generate any barrier request for the nbctl sync request: 2023-11-15T12:12:22.937Z|00084|vconn|DBG|unix#4: received: OFPT_BARRIER_REQUEST (OF1.5) (xid=0x13): 2023-11-15T12:12:22.937Z|00085|vconn|DBG|unix#4: sent (Success): OFPT_BARRIER_REPLY (OF1.5) (xid=0x13): ... 2023-11-15T12:12:23.032Z|00090|unixctl|DBG|received request time/warp["60000"], id=0 2023-11-15T12:12:23.032Z|00091|unixctl|DBG|replying with success, id=0: "warped" 2023-11-15T12:12:23.042Z|00094|vconn|DBG|unix#3: sent (Success): OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload 2023-11-15T12:12:23.042Z|00095|vconn|DBG|unix#4: sent (Success): OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload 2023-11-15T12:12:23.042Z|00097|vconn|DBG|unix#5: sent (Success): OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload 2023-11-15T12:12:23.042Z|00098|unixctl|DBG|received request time/warp["60000"], id=0 2023-11-15T12:12:23.042Z|00099|unixctl|DBG|replying with success, id=0: "warped" 2023-11-15T12:12:23.052Z|00100|rconn|ERR|br-int<->unix#3: no response to inactivity probe after 60 seconds, disconnecting 2023-11-15T12:12:23.052Z|00101|rconn|ERR|br-int<->unix#4: no response to inactivity probe after 60 seconds, disconnecting 2023-11-15T12:12:23.052Z|00102|rconn|ERR|br-int<->unix#5: no response to inactivity probe after 60 seconds, disconnecting 2023-11-15T12:12:23.052Z|00103|unixctl|DBG|received request time/warp["60000"], id=0 [0] https://mail.openvswitch.org/pipermail/ovs-dev/2023-November/409473.html [1] https://mail.openvswitch.org/pipermail/ovs-dev/2023-November/409530.html Fixes: ff00142808dc ("controller: Fixed ovs/ovn(features) connection lost when running more than 120 seconds") Signed-off-by: Dumitru Ceara <[email protected]> --- V2: - removed the test instead of trying to fix it (after discussion with Xavier). --- tests/ovn.at | 31 ------------------------------- 1 file changed, 31 deletions(-) diff --git a/tests/ovn.at b/tests/ovn.at index b8c61f87fb..198387d93e 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -35342,37 +35342,6 @@ OVN_CLEANUP([hv1],[hv2]) AT_CLEANUP ]) -OVN_FOR_EACH_NORTHD([ -AT_SETUP([feature inactivity probe]) -ovn_start -net_add n1 - -sim_add hv1 -as hv1 -check ovs-vsctl add-br br-phys -ovn_attach n1 br-phys 192.168.0.1 - -dnl Ensure that there are 4 openflow connections. -OVS_WAIT_UNTIL([test "$(grep -c 'negotiated OpenFlow version' hv1/ovs-vswitchd.log)" -eq "4"]) - -dnl "Wait" 3 times 60 seconds and ensure ovn-controller writes to the -dnl openflow connections in the meantime. This should allow ovs-vswitchd -dnl to probe the openflow connections at least twice. - -as hv1 ovs-appctl time/warp 60000 -check ovn-nbctl --wait=hv sync - -as hv1 ovs-appctl time/warp 60000 -check ovn-nbctl --wait=hv sync - -as hv1 ovs-appctl time/warp 60000 -check ovn-nbctl --wait=hv sync - -AT_CHECK([test -z "`grep disconnecting hv1/ovs-vswitchd.log`"]) -OVN_CLEANUP([hv1]) -AT_CLEANUP -]) - OVN_FOR_EACH_NORTHD([ AT_SETUP([Logical flows with Chassis_Template_Var references]) AT_KEYWORDS([templates]) -- 2.39.3 _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
