On Mon, Mar 18, 2024 at 11:27 AM Mark Michelson <[email protected]> wrote: > > Hi Han, > > I have a comment below > > On 3/5/24 01:27, Han Zhou wrote: > > The ovn-ofctrl-wait-before-clear setting is designed to minimize > > downtime during the initial start-up of the ovn-controller. For this > > purpose, the ovn-controller should wait only once upon entering the > > S_WAIT_BEFORE_CLEAR state for the first time. Subsequent reconnections > > to the OVS, such as those occurring during an OVS restart/upgrade, > > should not trigger this wait. However, the current implemention always > > waits for the configured time in the S_WAIT_BEFORE_CLEAR state, which > > can inadvertently delay flow installations during OVS restart/upgrade, > > potentially causing more harm than good. (The extent of the impact > > varies based on the method used to restart OVS, including whether flow > > save/restore tools and the flow-restore-wait feature are employed.) > > > > This patch avoids the unnecessary wait after the initial one. > > > > Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear to reduce down time during upgrade.") > > Signed-off-by: Han Zhou <[email protected]> > > --- > > controller/ofctrl.c | 1 - > > tests/ovn-controller.at | 9 +++++++-- > > 2 files changed, 7 insertions(+), 3 deletions(-) > > > > diff --git a/controller/ofctrl.c b/controller/ofctrl.c > > index f14cd79a8dbb..0d72ecbaa167 100644 > > --- a/controller/ofctrl.c > > +++ b/controller/ofctrl.c > > @@ -634,7 +634,6 @@ run_S_WAIT_BEFORE_CLEAR(void) > > if (!wait_before_clear_time || > > (wait_before_clear_expire && > > time_msec() >= wait_before_clear_expire)) { > > - wait_before_clear_expire = 0; > > state = S_CLEAR_FLOWS; > > return; > > } > > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at > > index 37f1ded1bd26..b65e11722cbb 100644 > > --- a/tests/ovn-controller.at > > +++ b/tests/ovn-controller.at > > @@ -2284,10 +2284,15 @@ lflow_run_2=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) > > AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2 > > ]) > > > > -# Restart OVS this time, and wait until flows are reinstalled > > +# Restart OVS this time. Flows should be reinstalled without waiting. > > OVS_APP_EXIT_AND_WAIT([ovs-vswitchd]) > > start_daemon ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl > > -OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2]) > > + > > +# Sleep for 3s, which is long enough for the flows to be installed, but > > +# shorter than the wait-before-clear (5s), to make sure the flows are installed > > +# without waiting. > > +sleep 3 > > This change makes me nervous. The comment makes sense. However, I worry > that on slow or loaded systems, relying on the flows to be written > within 3 seconds may not always work out. > > If there were a way to peek into the ofctrl state machine and check that > we have moved off of S_WAIT_BEFORE_CLEAR by this point, that might work > better. But that is something that is hard to justify exposing. > > I came up with this possible idea: > * set wait-before-clear to a time longer than OVS_CTL_TIMEOUT (e.g. 60 > seconds) > * Restart ovs > * Use OVS_WAIT_UNTIL(...), just like the test used to do. > > This way, we get plenty of opportunities to ensure the flows were > written. In most cases, this probably will actually be quicker than the > 3 second sleep added in this patch. However, if it takes longer than 3 > seconds, then the test can still pass. If the flows get written > properly, then we know ovn-controller did not wait for the > wait-before-clear time. >
Hi Mark, thanks for your comment! I agree with you that sleep for 3s is not very reliable. Your suggestion looks better, but I think there is still a potential problem. The approach assumes that ovn-controller will always apply the new settings of ofctrl-wait-before-clear. It is true for the current implementation, but there is nothing preventing us from removing this logic, so that ovn-controller ignores any ofctrl-wait-before-clear setting changes after startup. In fact, it is more reasonable to ignore it. So, let's assume ovn-controller doesn't take care of the changes of the settings. In this test case, the setting is initially 5s when ovn-controller starts, and later changing it to 60s doesn't take effect and ovn-controller still uses the 5s value. And then let's assume ovn-controller still waits for the 5s after OVS is restarted, and the test case will pass because OVS_WAIT_UNTIL can wait for much longer than 5s. So the test will be incorrect. To avoid this potential problem, I tried another approach. I figured that simply adding a "ovn-nbctl --wait=hv sync" is sufficient to ensure ovn-controller had the chance to install flows, without the need to sleep. So please see if the below diff looks good: --------------------------------------------------------------------------------------------------------------------------- diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at index 0da6570c5882..3202f0beff46 100644 --- a/tests/ovn-controller.at +++ b/tests/ovn-controller.at @@ -2328,10 +2328,10 @@ AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2 OVS_APP_EXIT_AND_WAIT([ovs-vswitchd]) start_daemon ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl -# Sleep for 3s, which is long enough for the flows to be installed, but -# shorter than the wait-before-clear (5s), to make sure the flows are installed -# without waiting. -sleep 3 +# Sync to make sure ovn-controller is given enough time to install the flows. +check ovn-nbctl --wait=hv sync + +# Flow should be installed without any extra waiting. AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2], [0], [ignore]) check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \ --------------------------------------------------------------------------------------------------------------------------- Thanks, Han > > +AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2], [0], [ignore]) > > > > check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \ > > -- ls-lb-add ls1 lb3 > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
