On 7/17/25 11:56 AM, Felix Huettner wrote:
> On Thu, Jul 17, 2025 at 11:29:24AM +0200, Ilya Maximets wrote:
>> On 7/16/25 9:05 AM, Smirnov Aleksandr (K2 Cloud) wrote:
>>> Hello,
>>>
>>> I noticed a big difference in the flows generated by northd between
>>> releases 24.09 and 25.03.
>>>
>>> In 25.03 northd fails to find similar routes and form an ecmp group.
>>>
>>> I append the following information:
>>>
>>> 1. Testcase scenario that can be easily copy-pasted to ovn-ic.at
>>>
>>> 2. Test output when run in 24.09
>>>
>>> 3. Test output when run in 25.03
>>>
>>> Could you please clarify whether this is a real issue?
>>
>> It looks like Felix made a change to never group "connected" routes,
>> i.e. the learned routes, in commit:
>>   f8924740f26e ("northd: Move connected routes to route engine.")
>>
>> The code that makes all such routes never be considered for grouping is
>> the following:
>>
>> northd/en-group-ecmp-route.c:
>>   static void
>>   add_route(struct group_ecmp_datapath *gn, const struct parsed_route *pr)
>>   {
>>       if (pr->source == ROUTE_SOURCE_CONNECTED) {
>>           unique_routes_add(gn, pr);
>>           return;
>>       }
>>       ...
>>
>> All the routes learned from the other router through the transit switch
>> have ROUTE_SOURCE_CONNECTED as their source and are not considered for
>> ecmp grouping.  There is also a comment in the removal part:
>>
>>       if (pr->source == ROUTE_SOURCE_CONNECTED) {
>>           /* Connected routes are never part of an ecmp group.
>>            * We should recompute. */
>>           return false;
>>       }
>>
>> This makes me think that the change was intentional.
>
> Hi Ilya, Hi Smirnov,
>
> so i implemented it this way because i assumed that
> ROUTE_SOURCE_CONNECTED means that this route is directly connected to
> the local LR. So that the LR has an interface that really has IPs out of
> that network. In that case i never saw a way how one LR would have
> multiple LRPs with the same network range. That just seemed like an
> unrealistic case.
> So i decided to skip the ecmp grouping checks because
> i thought this will just never happen.
>
> However i just now saw that ROUTE_SOURCE_CONNECTED is actually also set
> for the ic routes. Since there it seems to be used more for route
> prioritization, it no longer holds the guarantee that there can be no
> duplicate IPs.
>
> Would it make sense to create ROUTE_SOURCE_ORIGIN_CONNECTED and
> ROUTE_SOURCE_ORIGIN_STATIC and map the "origin" values to that? Then
> grouping should work as expected. Then the ROUTE_SOURCE_ORIGIN_* could
> also be covered by route_source_to_offset to prioritize them correctly.
>
>>
>> But also, I'm not sure what is the end goal of this kind of setup.
>> The underlying traffic through both transit switches will go through
>> the same tunnels in the end, with just a slightly different metadata,
>> so there is no real high-availability in this setup.  Or am I missing
>> some other use case here?
>
> You could also do ecmp to different destinations if you have 3 ovn
> clusters. But i honestly see the point even less :)
>
>>
>> At the same time it seems a little arbitrary that learned routes can't
>> form ecmp groups though.  Not sure why we have this seemingly artificial
>> restriction.
>
> For me it was just that i thought there is never a reason to group them,
> so i just wanted to skip unnecessary further processing. But it seems
> like that assumption no longer holds.
>
> I hope that helps clarify it.

Ack, thanks!  It seems like the issue only appears when ovn-ic copies
"connected" routes from the other zone.  And unless we have multiple ports
with the same subnet on the same router, we can only get these multiple
routes when we learn the same route through multiple transit switches,
which is a questionable topology.  So, I'm not sure if we actually need to
fix that or not.

Aleksandr, do you have a practical use case for this kind of topology?

>
> Thanks a lot,
> Felix
>
>>
>> What happens if we learn an actual ecmp route from the other router?
>> i.e. if we have a real ecmp route to something external configured on
>> one of the routers connected through a transit switch, will it be
>> learned properly?  It sounds like it wouldn't...

This is not really the case: if it's a real statically configured ecmp
route, then it will not be "connected" in the first place and will be
properly grouped after learning it in the other zone, because ovn-ic just
copies the "origin".  So, this is not a problem and the only questionable
case is the actual learning of "connected" routes through different
interconnects.

>>
>> Felix, do you have some comments on this one?
>>
>> Best regards, Ilya Maximets.
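
For illustration only, here is a minimal, self-contained sketch of the behaviour discussed in this thread. It is not the northd code: `struct route`, `enum route_source`, `count_ecmp_groups()` and `example_routes` are hypothetical names, and grouping purely by prefix string is a simplifying assumption. The `skip_connected` flag imitates the early return quoted from add_route() above, where connected routes are always stored as unique routes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real OVN definitions. */
enum route_source {
    SRC_CONNECTED,
    SRC_STATIC,
};

struct route {
    const char *prefix;         /* For example "192.168.1.0/24". */
    const char *nexthop;
    enum route_source source;
};

#define MAX_ROUTES 16

/* Same "connected" prefix learned through two transit switches, plus one
 * unrelated static route.  Addresses are made up for the example. */
static const struct route example_routes[] = {
    { "192.168.1.0/24", "169.254.100.2", SRC_CONNECTED },
    { "192.168.1.0/24", "169.254.101.2", SRC_CONNECTED },
    { "10.0.0.0/8",     "169.254.100.2", SRC_STATIC },
};

/* Returns the number of prefixes that end up with more than one route,
 * i.e. the number of ECMP groups.  With 'skip_connected' set, connected
 * routes are never considered for grouping (assumed 25.03-like behaviour);
 * without it, every route is a grouping candidate. */
static size_t
count_ecmp_groups(const struct route *routes, size_t n, bool skip_connected)
{
    const char *seen[MAX_ROUTES];
    size_t members[MAX_ROUTES];
    size_t n_seen = 0, n_groups = 0;

    assert(n <= MAX_ROUTES);

    for (size_t i = 0; i < n; i++) {
        if (skip_connected && routes[i].source == SRC_CONNECTED) {
            continue;           /* Treated as a unique route. */
        }

        /* Look for an earlier route with the same prefix. */
        size_t j;
        for (j = 0; j < n_seen; j++) {
            if (!strcmp(seen[j], routes[i].prefix)) {
                break;
            }
        }
        if (j == n_seen) {      /* First route for this prefix. */
            seen[n_seen] = routes[i].prefix;
            members[n_seen] = 0;
            n_seen++;
        }
        members[j]++;
    }

    for (size_t j = 0; j < n_seen; j++) {
        if (members[j] > 1) {
            n_groups++;
        }
    }
    return n_groups;
}
```

Calling count_ecmp_groups(example_routes, 3, true) yields no group for the duplicated prefix, while passing false yields one group, which matches the difference reported between the two releases under the assumptions stated above.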