note: Please, don't top-post. Reply to the comments inline. It's hard for the reader to follow the conversation otherwise.
On 2/18/26 11:50 AM, Matteo Perin wrote: > Hi Ilya, > > Thank you very much for the comments. > Regarding the system-route test reworking I will reshape the changes > taking into account your advice. > I cannot hide that the changes were made in quite a rush, and I was > not totally sure how to correctly handle the state capture (i.e. if > capturing it to a file was an acceptable pattern), but now it is > clearer, so thank you for that. > > Regarding the other unit tests, you are right that they should not > cause a problem with the intended setup, but I still see the routing > cache being "polluted" by system rules. > For example, for tests 834, 847, 848 and 850 I can see that the test > fails because of the exact match: > Cached: 0: from all lookup local > +Cached: 5270: from all lookup 52 > User: 32764: from 1.1.2.81 lookup 11 > User: 32765: from 1.1.2.80 lookup 10 > Cached: 32766: from all lookup main > > I looked into it a little bit and it seems there might be a bug or > unintended behavior with --disable-system-route as it does not fully > prevent system routing rules from being cached at startup. > In route-table.c, when route_table_reset() is called, each rule found > querying all kernel routing rules via RTM_GETRULE gets cached with > user=false and this happens regardless of the --disable-system-route > flag. > The use_system_routing_table value is only checked in ovs_router_insert() > and ovs_router_lookup_fallback() in ovs-router.c, so it will only prevent > ongoing system routing rules updates and fallback lookups from being cached. > > If the intended behavior of the --disable-system-route=false option is to > also skip system rules during the initial dump, maybe we could expose a > function in the ovs_router module like ovs_router_system_routing_enabled() > and check it inside the route_table rule_handle_msg() call before caching > rules during initialization. Yeah, that's an actual bug that we need to fix. I'd suggest following the steps of ovs_router_insert() function. We export it for other modules like route-table to use, but th eovs-router itself only uses the internal version of it - ovs_router_insert__(). This way we can still add all the rules we want from the ovs-router module, but any rules that are coming from the system are filtered by checking the 'use_system_routing_table' configuration first. > As a side note, I also think this will help fix the issues with the system > test without needing to change it too much (if we make sure to set the flag > properly). What do you think? The point of a system test is to check the chancing of the rules from the system. Turing system routes off will defeat the purpose of the test. Best regards, Ilya Maximets. > > Best Regards, > Matteo > > > On Tue, 17 Feb 2026 at 18:44, Ilya Maximets <[email protected] > <mailto:[email protected]>> wrote: > > On 2/11/26 4:08 PM, Matteo Perin via dev wrote: > > Tests including routing rule checks were failing on systems with > > non-standard routing rules, causing issues when trying to run them on > > some deployments. > > These failures occurred because tests performed exact output matching > > on the full output of 'ovs-appctl ovs/route/rule/show', which includes > > both user-added and system-cached rules. > > > > When the system has additional routing rules that meet certain criteria > > (FR_ACT_TO_TBL action without unsupported selectors like fwmark, dport, > > sport, iif, ipproto, or tun_id), OVS caches them, as expected, causing > > them to appear in the "Cached:" section of the output. > > Tests that expect exact output fail when encountering these legitimate, > > albeit unexpected, cached rules. > > Hi, Matteo. Thanks for the patch! > > > > > The following affected tests were modified to verify only the expected > > behavior rather than requiring exact output matches: > > > > - system-route.at <http://system-route.at> (ovs-route - unsupported > rules): > > Captures full initial cache state before adding test rules, verifies > that > > the cache state remains unchanged after adding unsupported rules > (keeping > > the intent of the test). > > This one makes sense to me... > > > > > - ovs-router.at <http://ovs-router.at>: > > For standard rule existence checks, changed from exact match to > verifying > > each expected rule exists independently. > > For user rule verification, added filtering to show only "User:" rules, > > ignoring "Cached:" system rules entirely. > > > > 3. tunnel-push-pop.at <http://tunnel-push-pop.at> (tunnel configuration > with routing rules): > > Added filter to output to show only "User:" rules when verifying routing > > rule configuration. > > ...but these two do not. These are not system tests, they are running > with the dummy datapath and with system routes disabled. So, it should > be perfectly fine to do an exact match comparison. And we should keep > doing that. Or do you see these tests actually fail in your setup? > If so, we need to find the reason and fix it. But the tests should stay > as they are. > > > > > Signed-off-by: Matteo Perin <[email protected] > <mailto:[email protected]>> > > --- > > tests/ovs-router.at <http://ovs-router.at> | 114 > ++++++++------------------------------- > > tests/system-route.at <http://system-route.at> | 51 > +++++++----------- > > tests/tunnel-push-pop.at <http://tunnel-push-pop.at> | 5 +- > > 3 files changed, 43 insertions(+), 127 deletions(-) > > > > <snip> > > > --- a/tests/system-route.at <http://system-route.at> > > +++ b/tests/system-route.at <http://system-route.at> > > @@ -343,45 +343,34 @@ on_exit 'ip link del p1-route' > > AT_CHECK([ip tuntap add name p1-route mode tap]) > > AT_CHECK([ip link set p1-route up]) > > > > -dnl Check there are no non-standard rules cached in OVS. > > -AT_CHECK([ovs-appctl ovs/route/rule/show], [0], [dnl > > -Cached: 0: from all lookup local > > -Cached: 32766: from all lookup main > > -Cached: 32767: from all lookup default > > -]) > > -AT_CHECK([ovs-appctl ovs/route/rule/show -6], [0], [dnl > > -Cached: 0: from all lookup local > > -Cached: 32766: from all lookup main > > -]) > > - > > -dnl Add unsupported rules to kernel. > > +dnl Register cleanup for all rules we are going to add. > > on_exit 'ip rule del priority 100 fwmark 0x16 lookup 42' > > -AT_CHECK([ip rule add priority 100 fwmark 0x16 lookup 42]) > > on_exit 'ip rule del priority 101 from 10.0.0.1 dport 22 lookup 42' > > -AT_CHECK([ip rule add priority 101 from 10.0.0.1 dport 22 lookup 42]) > > on_exit 'ip rule del priority 102 from 10.0.0.1 sport 22 lookup 42' > > -AT_CHECK([ip rule add priority 102 from 10.0.0.1 sport 22 lookup 42]) > > on_exit 'ip rule del priority 103 from 10.0.0.1 tun_id 22 lookup 42' > > -AT_CHECK([ip rule add priority 103 from 10.0.0.1 tun_id 22 lookup 42]) > > on_exit 'ip rule del priority 104 iif p1-route lookup 42' > > -AT_CHECK([ip rule add priority 104 iif p1-route lookup 42]) > > on_exit 'ip rule del priority 105 ipproto udp lookup 42' > > -AT_CHECK([ip rule add priority 105 ipproto udp lookup 42]) > > on_exit 'ip rule del priority 106 from all tun_id 22 lookup 42' > > -AT_CHECK([ip rule add priority 106 from all tun_id 22 lookup 42]) > > > > -dnl Give the main thread a chance to act. > > -AT_CHECK([ovs-appctl revalidator/wait]) > > -dnl Check OVS rules cache hasn't changed. > > -AT_CHECK([ovs-appctl ovs/route/rule/show], [0], [dnl > > -Cached: 0: from all lookup local > > -Cached: 32766: from all lookup main > > -Cached: 32767: from all lookup default > > -]) > > -AT_CHECK([ovs-appctl ovs/route/rule/show -6], [0], [dnl > > -Cached: 0: from all lookup local > > -Cached: 32766: from all lookup main > > -]) > > +dnl Capture initial cache state, add unsupported rules, and verify OVS > cache > > +dnl hasn't changed. This test verifies that unsupported routing rules > (with > > +dnl selectors like fwmark, dport, sport, iif, ipproto, tun_id) are not > cached > > +dnl by OVS, even though the kernel accepts them. > > +AT_CHECK([ > > + initial_v4=$(ovs-appctl ovs/route/rule/show) > > + initial_v6=$(ovs-appctl ovs/route/rule/show -6) > > + ip rule add priority 100 fwmark 0x16 lookup 42 > > + ip rule add priority 101 from 10.0.0.1 dport 22 lookup 42 > > + ip rule add priority 102 from 10.0.0.1 sport 22 lookup 42 > > + ip rule add priority 103 from 10.0.0.1 tun_id 22 lookup 42 > > + ip rule add priority 104 iif p1-route lookup 42 > > + ip rule add priority 105 ipproto udp lookup 42 > > + ip rule add priority 106 from all tun_id 22 lookup 42 > > + ovs-appctl revalidator/wait > > + final_v4=$(ovs-appctl ovs/route/rule/show) > > + final_v6=$(ovs-appctl ovs/route/rule/show -6) > > + test "$initial_v4" = "$final_v4" && test "$initial_v6" = "$final_v6" > > +], [0]) > > We should not write tests this way. The block above, IIRC, will only fail > if the last command fails. If any of the 'ip rule add' will fail, the > test > will still pass, even though it didn't test want it supposed to test. > Please, keep the commands where they were originally. Just capture the > state before and after and then compare. You also removed some comments > on > that we're explaining why certain calls are made. We should keep them. > > I'd also suggest to not use plain shell comparisons. Instead, capture the > output with AT_CHECK([...], [0], [stdout]), then rename the stdout file > into > something more meaningfull (under AT_CHECK as well), then do the same at > the end and do a diif -u comparison of two files (also under AT_CHECK). > This way all the commands and outputs will end up in the log and it will > be clear what exactly is different. > > Also, it may be good to sort the outputs. > > Best regards, Ilya Maximets. > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
