On Wed, Nov 15, 2023 at 2:40 PM Dumitru Ceara <[email protected]> wrote:
> The test incorrectly assumed that "ovn-nbctl --wait=hv sync" will always
> send an OpenFlow barrier to ovs-vswitchd. That doesn't happen unless
> there are other OpenFlow (rule or group) changes that need to be
> programmed in the datapath.
>
> Fix the test by actually installing new rules in between warp
> invocations.
>
> Spotted in CI, mostly on oversubscribed systems. A log sample that
> shows that ovn-controller didn't generate any barrier request for the
> nbctl sync request:
>
> 2023-11-15T12:12:22.937Z|00084|vconn|DBG|unix#4: received:
> OFPT_BARRIER_REQUEST (OF1.5) (xid=0x13):
> 2023-11-15T12:12:22.937Z|00085|vconn|DBG|unix#4: sent (Success):
> OFPT_BARRIER_REPLY (OF1.5) (xid=0x13):
> ...
> 2023-11-15T12:12:23.032Z|00090|unixctl|DBG|received request
> time/warp["60000"], id=0
> 2023-11-15T12:12:23.032Z|00091|unixctl|DBG|replying with success, id=0:
> "warped"
> 2023-11-15T12:12:23.042Z|00094|vconn|DBG|unix#3: sent (Success):
> OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload
> 2023-11-15T12:12:23.042Z|00095|vconn|DBG|unix#4: sent (Success):
> OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload
> 2023-11-15T12:12:23.042Z|00097|vconn|DBG|unix#5: sent (Success):
> OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload
> 2023-11-15T12:12:23.042Z|00098|unixctl|DBG|received request
> time/warp["60000"], id=0
> 2023-11-15T12:12:23.042Z|00099|unixctl|DBG|replying with success, id=0:
> "warped"
> 2023-11-15T12:12:23.052Z|00100|rconn|ERR|br-int<->unix#3: no response to
> inactivity probe after 60 seconds, disconnecting
> 2023-11-15T12:12:23.052Z|00101|rconn|ERR|br-int<->unix#4: no response to
> inactivity probe after 60 seconds, disconnecting
> 2023-11-15T12:12:23.052Z|00102|rconn|ERR|br-int<->unix#5: no response to
> inactivity probe after 60 seconds, disconnecting
> 2023-11-15T12:12:23.052Z|00103|unixctl|DBG|received request
> time/warp["60000"], id=0
>
> Fixes: ff00142808dc ("controller: Fixed ovs/ovn(features) connection lost
> when running more than 120 seconds")
> Signed-off-by: Dumitru Ceara <[email protected]>
> ---
> tests/ovn.at | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index b8c61f87fb..face39e5c2 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -35352,20 +35352,35 @@ as hv1
> check ovs-vsctl add-br br-phys
> ovn_attach n1 br-phys 192.168.0.1
>
> +check ovn-nbctl ls-add ls
> +
> dnl Ensure that there are 4 openflow connections.
> OVS_WAIT_UNTIL([test "$(grep -c 'negotiated OpenFlow version'
> hv1/ovs-vswitchd.log)" -eq "4"])
>
> +as hv1 check ovs-vsctl -- add-port br-int lsp0 \
> + -- set Interface lsp0 external-ids:iface-id=lsp0 \
> + -- add-port br-int lsp1 \
> + -- set Interface lsp1 external-ids:iface-id=lsp1 \
> + -- add-port br-int lsp2 \
> + -- set Interface lsp2 external-ids:iface-id=lsp2
> +
> dnl "Wait" 3 times 60 seconds and ensure ovn-controller writes to the
> dnl openflow connections in the meantime. This should allow ovs-vswitchd
> dnl to probe the openflow connections at least twice.
>
> as hv1 ovs-appctl time/warp 60000
> +check ovn-nbctl lsp-add ls lsp0
> +wait_for_ports_up
> check ovn-nbctl --wait=hv sync
>
> as hv1 ovs-appctl time/warp 60000
> +check ovn-nbctl lsp-add ls lsp1
> +wait_for_ports_up
> check ovn-nbctl --wait=hv sync
>
> as hv1 ovs-appctl time/warp 60000
> +check ovn-nbctl lsp-add ls lsp2
> +wait_for_ports_up
> check ovn-nbctl --wait=hv sync
>
> AT_CHECK([test -z "`grep disconnecting hv1/ovs-vswitchd.log`"])
> --
> 2.39.3
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.
Acked-by: Ales Musil <[email protected]>
--
Ales Musil
Senior Software Engineer - OVN Core
Red Hat EMEA <https://www.redhat.com>
[email protected]
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev