On 6/17/26 6:03 PM, Xavier Simonart wrote:
> Hi Ilya,
>
> Thanks for the review.
>
> On Tue, Jun 16, 2026 at 6:30 PM Ilya Maximets <[email protected]> wrote:
>
>     On 6/16/26 5:46 PM, Xavier Simonart via dev wrote:
>     > As in [0], multiple load balancing system tests are randomly failing from
>     > time to time as they check that, after 10 or 20 requests sent to load
>     > balancer, all backends are at least reached once. Statistically, this is
>     > failing from time to time.
>     > [1] fixed such issues, but there are new occurrences.
>     > If after 10 requests we did not get the expected distribution, we
>     > send 10 more requests. We do that up to 30 times.
>
>     Hi, Xavier.  Are you sure this is what is happening here?
>     The chance that all 20 requests are sent to the same backend
>     supposed to be 1 to 2^20, which is a very small chance and
>     so it should not really happen in practice.  Maybe there is
>     a different reason here after all?  How frequently you see
>     the test failures?
>
> It did not happen when we send 20 requests but it occurred during a test
> where we "only" send 10 requests (see the fix around line 17733), and we
> can see what's happening in the tcpdumps.
> I then changed all occurrences of that same pattern.
> However I agree that with 20 requests the probability becomes really low.
> With 10 requests it happens more often than we might think: we have roughly
> two patches a day, ovs-robot runs 4x system-tests (gcc, clang, userspace,
> dpdk), and we have roughly 40 occurrences of this pattern in system tests.
> So we run through this ~300 times per day...

Yeah, I agree that 10 is indeed too low.

>
>     > [0] https://github.com/ovsrobot/ovn/actions/runs/27547031217/job/81423590350
>     > [1] c906da4f1dea: tests: Fixed load balancing system-tests
>     >
>     > Fixes: 40a686e8e70f ("Add IPv6 support for lb health-check")
>     > Fixes: 33cfa4655fd7 ("tests: Move SCTP test from kernel only to general OVN system tests.")
>     > Fixes: da5529438342 ("northd: Do not drop ip traffic with destination vip expressed via template vars.")
>     > Signed-off-by: Xavier Simonart <[email protected]>
>     > ---
>     >  tests/system-ovn.at | 84 +++++++++++++++++++++------------------------
>     >  1 file changed, 39 insertions(+), 45 deletions(-)
>     >
>     > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>     > index 35df0ec2f..2cadbc6a7 100644
>     > --- a/tests/system-ovn.at
>     > +++ b/tests/system-ovn.at
>     > @@ -5143,15 +5143,15 @@ OVS_WAIT_UNTIL(
>     >  )
>     >
>     >  # From sw0-p2 send traffic to vip - 2001::a
>     > -for i in `seq 1 20`; do
>     > -    echo Request $i
>     > -    ovn-sbctl list service_monitor
>     > -    NS_CHECK_EXEC([sw0-p2], [wget http://[[2001::a]] -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
>     > -done
>     > +OVS_WAIT_FOR_OUTPUT([
>     > +    for i in `seq 1 20`; do
>     > +        ovn-sbctl list service_monitor >> service_monitor.log
>     > +        NS_EXEC([sw0-p2], [wget http://[[2001::a]] -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
>
>     I don't think this is a good change to replace NS_CHECK_EXEC
>     with a simple NS_EXEC.  As explained in commit:
>       b087f2556514 ("tests: system-ovn: Fix force SNAT IP in load-balancer template test.")
>     It will take forever for this test to fail if there is an actual
>     issue in the pipeline and the packets are not delivered / conntrack
>     entries are not created.
>     It will take about 2.5 hours for the test
>     to actually fail, IIUC.  We should not have that.
>
> I do not think that we can run NS_EXEC within OVS_WAIT_FOR_OUTPUT.
> So, instead I could simply ensure that we send 20 requests (i.e. only change
> the test which sends 10 for now). This should be enough to reduce the number
> of failures to less than one per year, and we can keep NS_CHECK_EXEC.
> I'll send v2.

With the NS_CHECK_EXEC we could do 30 even, I guess, or 25, if it's not
too slow on a happy path.  But we need to CHECK.

>
>     Best regards, Ilya Maximets.
>
> Thanks
> Xavier

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
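
For reference, the odds discussed in this thread can be estimated with a
small script. This is only a rough sketch, not part of the patch. It assumes
that each request is sent to a backend chosen uniformly and independently,
which may not match how the load balancer actually selects backends, and the
backend counts tried below (2 and 3) are illustrative guesses, since the
thread does not say how many backends each test uses. The function name
p_some_backend_missed is made up for this example. The only figure taken
from the thread is the ~300 runs per day.

```python
from math import comb


def p_some_backend_missed(backends: int, requests: int) -> float:
    """Probability that at least one backend receives no request.

    Assumes uniform, independent backend selection per request and
    uses inclusion-exclusion over the sets of backends that are missed.
    """
    return sum(
        (-1) ** (j + 1) * comb(backends, j)
        * ((backends - j) / backends) ** requests
        for j in range(1, backends)
    )


RUNS_PER_DAY = 300  # rough figure quoted in the thread

if __name__ == "__main__":
    # Backend counts are assumptions for illustration only.
    for backends in (2, 3):
        for requests in (10, 20, 25, 30):
            p = p_some_backend_missed(backends, requests)
            per_year = p * RUNS_PER_DAY * 365
            print(f"{backends} backends, {requests} requests: "
                  f"p = {p:.2e}, ~{per_year:.2f} spurious failures/year")
```

Under these assumptions, two backends and 10 requests give a per-run failure
probability of 2 / 2^10 = 1/512. The result is quite sensitive to the number
of backends, so the choice between 20, 25 and 30 requests is worth checking
against the backend count each test really uses.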
