On 2025-06-23 10:53 PM, Ilya Maximets wrote: > External email: Use caution opening links or attachments > > > On 6/19/25 2:50 PM, Dima Chumak via dev wrote: >> This patch series introduces infrastructure and user-facing improvements >> for multi-table routing in OVS. The main motivation is to enable more >> advanced routing scenarios, such as policy-based routing with source >> address selectors. For example, this can be used to support >> OVN-Kubernetes multi-VTEP topology where nodes may have multiple SR-IOV >> network adapters and to facilitate selection of which VTEP to use to >> send/receive the packets to/from the wire. >> >> The core of this series adds support for multiple routing tables within >> OVS. This is a prerequisite for importing non-default routing tables >> from the kernel and enables advanced routing lookups that consider >> parameters beyond just the destination address (e.g., source address). >> >> Additional routing tables are now created by reading the Routing Policy >> Database (RPDB) from the kernel. Only tables referenced by RPDB rules >> with a table lookup action are imported, and rule priorities and table >> IDs are preserved. The current implementation supports RPDB rules with a >> source address selector (`[not] from IP`). >> >> User interface changes: >> >> - The `ovs-appctl ovs/route/show` command now accepts an optional >> `table=ID` or `table=all` parameter, allowing users to display routes >> from specific or all tables. >> >> - The `ovs-appctl ovs/route/add` and `ovs/route/del` commands accept a >> `table=ID` parameter for adding or deleting user routes in non-default >> tables. >> >> - A new `ovs-appctl ovs/rule/show` command is introduced to display the >> internal routing rules database, sorted by priority. >> >> - The `ovs-appctl ovs/route/lookup` command now supports an optional >> `src=IP` parameter for lookups that match on source IP address. >> >> Performance optimization: >> >> A back-off mechanism is added to the periodic router reset logic. This >> reduces CPU usage under high update rates and large numbers of >> routes/rules by dynamically increasing the delay between full router >> table dumps, based on the duration of the last reset. >> >> Example usage: >> >> - Show all routes, including those from non-default tables: >> >> ovs-appctl ovs/route/show table=all >> >> - Add a route to a specific table: >> >> ovs-appctl ovs/route/add 10.7.7.0/24 br-phy0 table=10 >> >> - Show routing rules: >> >> ovs-appctl ovs/rule/show >> >> - Lookup a route with a source IP: >> >> ovs-appctl ovs/route/lookup 10.0.0.5 src=10.0.0.2 >> >> Dima Chumak (8): >> route-table: Export is_standard_table_id(). >> ovs-router: Add infrastructure for multi-table routing. >> route-table: Introduce multi-table route lookup. >> ovs-router: Add 'table=ID' parameter in ovs/route/show. >> ovs-router: Introduce ovs/rule/show command. >> ovs-router: Add 'table=ID' parameter in ovs/route/{add,del}. >> ovs-router: Add 'src=IP' parameter in ovs/route/lookup. >> router-table: Add back-off to periodic router reset. >> >> NEWS | 7 + >> lib/netdev-dummy.c | 6 +- >> lib/ovs-router.c | 535 ++++++++++++++++++++++++++++++----- >> lib/ovs-router.h | 38 ++- >> lib/packets.c | 20 ++ >> lib/packets.h | 2 + >> lib/route-table.c | 287 +++++++++++++++++-- >> lib/route-table.h | 18 +- >> tests/ovs-router.at | 4 + >> tests/system-route.at | 107 +++++++ >> tests/test-lib-route-table.c | 5 +- >> 11 files changed, 934 insertions(+), 95 deletions(-) >> > > Hi. Thanks for the change! It looks promising. I won't go through > individual patches, but I'll list a few comments here instead, as some of > them are more architectural and will require some re-structuring of the > patches. > > 1. is_standard_table_id() belongs to route-table module as it is specific > to Linux netlink interface. If we for some reason need to export it, this > should be done via route-table.h and there is also no reason for it to be > an inline function. > > 2. I think, netlink-specific code should not be allowed in the ovs-router > module. i.e. ifdef HAVE_NETLINK parts should not be there. We need to > define generic representation of the table numbers (if we even need them). > > 3. The code is special-casing default tables, but this is awkward and often > incorrect. We should just treat them as separate tables the same as any > other routing table. This special casing leads to local table to be > considered after the non-standard ones, which is probably wrong, as default > Linux routing looks at the local table first. So, I'd say let's split the > default classifier into 3 and have rules based routing the only way of > routing, where the highest priority rule is to check the local table, then > all the non-standard, then main and default. We can have the 3 default > rules always present in case we don't have them from the kernel, so our > user-API doesn't change on non-Linux systems. > > 4. We should not route packets with non-local source addresses, i.e. the > src classifier lookup should be performed first and not skipped. > > 5. 'ovs/rule/' is a confusing name for commands, because we normally think > about OF rules in OVS, not routing rules. So, I'd suggest moving these > commands under ovs/route/ or something like ovs/routing-rule/. > > 6. I'm not sure if the last patch from this set is needed. I'd not expect > a large amount of routes in the tables actually used for tunnel routing. > Do you have a legitimate use case in mind for a large number of routes? > If someone is using a BGP full routing table for OVS tunnel routing, I'd > say they're doing something wrong and should re-consider their setup. > In other cases, current logic where we only dump tables that we use > should be sufficient. > > 7. We need a few tunnel-push-pop unit tests for source-based routing. > We'll need a command to add rules from userspace. Otherwise, this > functionality is incomplete and source-based routing could not actually > be used outside of Linux. We'll need a way to distinguish rules that > are cached form the kernel and ones added manually, similarly how we > do for routes. > > 8. pvector can be used instead of rculist to avoid relying on kernel to > dump rules in a specific order, as this behavior is likely not guaranteed. > > 9. Some coverage for rule-related events in the test-lib-route-table might > be good to have. >
Hi Ilya, thanks for the feedback, the requested changes look clear, I'll start working on V2 then. Thanks, Dima > Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev