On 6/16/26 5:46 PM, Xavier Simonart via dev wrote:
> As in [0], multiple load balancing system tests are randomly failing from
> time to time as they check that, after 10 or 20 requests sent to load
> balancer, all backends are at least reached once. Statistically, this is
> failing from time to time.
> [1] fixed such issues, but there are new occurrences.
> If after 10 requests we did not get the expected distribution, we
> send 10 more requests. We do that up to 30 times.

Hi, Xavier.  Are you sure this is what is happening here?  The chance that
all 20 requests are sent to the same backend is supposed to be 1 in 2^20,
which is very small, so it should not really happen in practice.
Maybe there is a different reason here after all?  How frequently do you
see the test failures?

>
> [0] https://github.com/ovsrobot/ovn/actions/runs/27547031217/job/81423590350
> [1] c906da4f1dea: tests: Fixed load balancing system-tests
>
> Fixes: 40a686e8e70f ("Add IPv6 support for lb health-check")
> Fixes: 33cfa4655fd7 ("tests: Move SCTP test from kernel only to general OVN
> system tests.")
> Fixes: da5529438342 ("northd: Do not drop ip traffic with destination vip
> expressed via template vars.")
> Signed-off-by: Xavier Simonart <[email protected]>
> ---
>  tests/system-ovn.at | 84 +++++++++++++++++++++------------------------
>  1 file changed, 39 insertions(+), 45 deletions(-)
>
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 35df0ec2f..2cadbc6a7 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -5143,15 +5143,15 @@ OVS_WAIT_UNTIL(
>  )
>
>  # From sw0-p2 send traffic to vip - 2001::a
> -for i in `seq 1 20`; do
> -    echo Request $i
> -    ovn-sbctl list service_monitor
> -    NS_CHECK_EXEC([sw0-p2], [wget http://[[2001::a]] -t 5 -T 1
> --retry-connrefused -v -o wget$i.log])
> -done
> +OVS_WAIT_FOR_OUTPUT([
> +    for i in `seq 1 20`; do
> +        ovn-sbctl list service_monitor >> service_monitor.log
> +        NS_EXEC([sw0-p2], [wget http://[[2001::a]] -t 5 -T 1
> --retry-connrefused -v -o wget$i.log])

I don't think replacing NS_CHECK_EXEC with a simple NS_EXEC is a good
change.  As explained in commit:

  b087f2556514 ("tests: system-ovn: Fix force SNAT IP in load-balancer template test.")

the test will take forever to fail if there is an actual issue in the
pipeline and the packets are not delivered / conntrack entries are not
created.  It will take about 2.5 hours for the test to actually fail, IIUC.
We should not have that.
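
For illustration, here is one way a figure of that order can be reached.
This is only a rough sketch under my own assumptions, not the arithmetic
from the referenced commit: every wget attempt times out after 1 second
(-T 1), wget makes 5 attempts (-t 5) with its default linear backoff of
1, 2, 3 and 4 seconds between them, the loop sends 20 requests, and the
whole loop is retried up to 30 times as the commit message describes.
The function name estimate_seconds is hypothetical.

```sh
#!/bin/sh
# Rough worst-case duration when every wget fails.
# All inputs are assumptions for illustration, not measured values.

# estimate_seconds RETRIES REQUESTS TRIES TIMEOUT
# Prints the total seconds spent if every attempt times out.
estimate_seconds() {
    retries=$1 requests=$2 tries=$3 timeout=$4
    # Assumed linear backoff between attempts: 1 + 2 + ... + (tries - 1).
    backoff=$(( tries * (tries - 1) / 2 ))
    # One failing wget: all timeouts plus the waits between attempts.
    per_wget=$(( tries * timeout + backoff ))
    echo $(( retries * requests * per_wget ))
}

total=$(estimate_seconds 30 20 5 1)
echo "worst case: ${total}s (~$(( total / 60 )) min)"
```

With these inputs the estimate is 9000 seconds, i.e. 150 minutes, which is
in line with the 2.5 hours above.  With NS_CHECK_EXEC the first failing
wget aborts the test, so the cost would be a single failing request
instead.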

Best regards, Ilya Maximets.
