On Thu, Jul 17, 2025 at 01:50:56PM +0000, Smirnov Aleksandr (K2 Cloud) wrote:
> In case you decide the fix in not desired, ovn-nbctl must be fixed
> because current report is confusing saying routes are ecmp while in fact
> they are not.

Hi everyone,

I just took a further look and discovered more chaos in the
"origin=connected" topic.

Treating it as ROUTE_SOURCE_CONNECTED currently also means that we
prioritize these routes like normal connected routes, and therefore higher
than ROUTE_SOURCE_STATIC.

So if we assume the following routes:
* 192.168.0.0/24 via 10.0.10.10 as a normal Logical_Router_Static_Route
* 192.168.0.0/24 via 10.0.11.11 as a Logical_Router_Static_Route via ovn-ic
  with origin=connected
* 192.168.0.0/24 via 10.0.12.12 as a Logical_Router_Static_Route via ovn-ic
  with origin=static

This currently results in the following effective routes (if I understood
it correctly):
* High priority: 192.168.0.0/24 via 10.0.11.11
* Low priority: 192.168.0.0/24 via ecmp of 10.0.10.10 and 10.0.12.12

This is honestly quite confusing from my perspective:
1. Why should an ovn-ic route be of higher priority than a local route?
2. Why should ecmp work between ovn-ic and non-ic routes?

From my view (of never using ovn-ic) I would have expected that the ovn-ic
routes are always of lower priority than non-ic routes.

What are your opinions on that?

Thanks a lot,
Felix

>
> On 7/17/25 4:16 PM, Ilya Maximets wrote:
> > Hrm, adding Felix back.
> >
> > On 7/17/25 3:14 PM, Ilya Maximets wrote:
> >> On 7/17/25 11:56 AM, Felix Huettner wrote:
> >>> On Thu, Jul 17, 2025 at 11:29:24AM +0200, Ilya Maximets wrote:
> >>>> On 7/16/25 9:05 AM, Smirnov Aleksandr (K2 Cloud) wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I noticed a big difference in the flow generated by northd between
> >>>>> releases 24.09 and 25.03
> >>>>>
> >>>>> In the 25.03 northd fail to find similar routes and form ecmp group.
> >>>>>
> >>>>> I append following information:
> >>>>>
> >>>>> 1. Testcase scenario that can be easily copy-pasted to ovn-ic.at
> >>>>>
> >>>>> 2. Test output if ran in 24.09
> >>>>>
> >>>>> 3. Test output if ran in 25.03
> >>>>>
> >>>>> Could you please clarify is this real issue?
> >>>> It looks like Felix made a change to never group "connected" routes,
> >>>> i.e. the learned routes, in commit:
> >>>>   f8924740f26e ("northd: Move connected routes to route engine.")
> >>>>
> >>>> The code that makes all such routes to never consider groupping is
> >>>> the following:
> >>>>
> >>>> northd/en-group-ecmp-route.c:
> >>>>   static void
> >>>>   add_route(struct group_ecmp_datapath *gn, const struct parsed_route *pr)
> >>>>   {
> >>>>       if (pr->source == ROUTE_SOURCE_CONNECTED) {
> >>>>           unique_routes_add(gn, pr);
> >>>>           return;
> >>>>       }
> >>>>       ...
> >>>>
> >>>> All the routes learned from the other router through the transit switch
> >>>> have ROUTE_SOURCE_CONNECTED as their source and not being considered for
> >>>> ecmp groupping. There is also a comment in the removal part:
> >>>>
> >>>>       if (pr->source == ROUTE_SOURCE_CONNECTED) {
> >>>>           /* Connected routes are never part of an ecmp group.
> >>>>            * We should recompute. */
> >>>>           return false;
> >>>>       }
> >>>>
> >>>> This makes me think that the change was intentional.
> >>> Hi Ilya, Hi Smirnov,
> >>>
> >>> so i implemented it this way because i assumed that
> >>> ROUTE_SOURCE_CONNECTED means that this route is directly connected to
> >>> the local LR. So that the LR has an interface that really has IPs out of
> >>> that network. In that case i never saw a way how one LR would have
> >>> multiple LRPs with the same network range. That just seemed like a
> >>> unrealistic case. So i decided to skip the ecmp grouping checks because
> >>> i thought this will just never happen.
> >>>
> >>> However i just now saw that ROUTE_SOURCE_CONNECTED is actually also set
> >>> for the ic routes. Since there it seems to be more used for route
> >>> prioritization. It no longer holds that guarantee that there can be no
> >>> duplicate IPs.
> >>>
> >>> Would it make sense to create ROUTE_SOURCE_ORIGIN_CONNECTED and
> >>> ROUTE_SOURCE_ORIGIN_STATIC and map the "origin" values to that. Then
> >>> grouping should work as expected. Then the ROUTE_SOURCE_ORIGIN_* could
> >>> also be covered route_source_to_offset to prioritize them correctly.
> >>>
> >>>> But also, I'm not sure what is the end goal of this kind of setup.
> >>>> The underlying traffic through both transit switches will go through
> >>>> the same tunnels in the end, with just a slightly different metadata,
> >>>> so there is no real high-availability in this setup. Or am I missing
> >>>> some other use case here?
> >>> You could also do ecmp to different destinations if you have 3 ovn
> >>> clusters. But i honestly see the point even less :)
> >>>
> >>>> At the same time it seems a little arbitrary that learned routes can't
> >>>> form ecmp groups though. Not sure why we have this seemingly artificial
> >>>> restriction.
> >>> For me it was just that i thought there is never a reason to group them,
> >>> so i just wanted to skip unnecessary further processing. But it seems
> >>> like that assumption no longer holds.
> >>>
> >>> I hope that helps clarifying it.
> >> Ack, thanks! It seems like the issue only appears when ovn-ic copies
> >> "connected" routes from the other zone. And unless we have multiple
> >> ports with the same subnet on the same router, we can only get these
> >> multiple routes when we learn the same route through multiple transit
> >> switches. Which is a questionable topology. So, I'm not sure if we
> >> actually need to fix that or not.
> >>
> >> Aleksandr, do you have a practical use case for this kind of topology?
> >>
> >>> Thanks a lot,
> >>> Felix
> >>>
> >>>> What happens if learn an actual ecmp route from the other router? i.e.
> >>>> if we have a real ecmp route to something external configured on one of
> >>>> the routers connected through a transit switch, will it be learned
> >>>> properly? It sounds like it wouldn't...
> >> This is not really a case, if it's a real statically configured ecmp route,
> >> then it will not be "connected" in the first place and will be properly
> >> grouped after learning it in the other zone, because ovn-ic just copies
> >> the "origin". So, this is not a problem and the only questionable case is
> >> the actual learning of "connected" routes through different interconnects.
> >>
> >>>> Felix, do you have some comments on this one?
> >>>>
> >>>> Best regards, Ilya Maximets.
> >
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
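
For reference, the prioritization described in the top-level reply of this
message can be modelled with a small C sketch. This is only an illustration
under stated assumptions, not northd code: every sketch_* and SKETCH_* name
is hypothetical, the model has just two priority levels, and it encodes the
behaviour as described in the reply ("if I understood it correctly") rather
than anything verified against the sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-level model; a larger value means higher priority.
 * These are NOT the real northd enum values. */
enum sketch_source {
    SKETCH_SOURCE_STATIC = 0,
    SKETCH_SOURCE_CONNECTED = 1,
};

struct sketch_route {
    const char *prefix;
    const char *nexthop;
    enum sketch_source source;
};

/* Assumed mapping of the ovn-ic "origin" value: origin=connected is
 * treated like a connected route, anything else like a static one. */
enum sketch_source
sketch_source_from_origin(const char *origin)
{
    if (origin && !strcmp(origin, "connected")) {
        return SKETCH_SOURCE_CONNECTED;
    }
    return SKETCH_SOURCE_STATIC;
}

/* Mirrors the rule quoted in the thread: connected routes are never
 * considered for ecmp grouping. */
bool
sketch_may_group(enum sketch_source source)
{
    return source != SKETCH_SOURCE_CONNECTED;
}

/* Collects the next hops of all routes that match 'prefix' at the given
 * 'source' level.  Stores at most 'max_nexthops' entries in 'nexthops'
 * and returns the number stored. */
size_t
sketch_collect(const struct sketch_route *routes, size_t n_routes,
               const char *prefix, enum sketch_source source,
               const char **nexthops, size_t max_nexthops)
{
    size_t n = 0;

    for (size_t i = 0; i < n_routes; i++) {
        if (routes[i].source == source
            && !strcmp(routes[i].prefix, prefix)
            && n < max_nexthops) {
            nexthops[n++] = routes[i].nexthop;
        }
    }
    return n;
}
```

Feeding the three 192.168.0.0/24 routes from the reply into sketch_collect()
puts 10.0.11.11 alone at the connected level, while 10.0.10.10 and
10.0.12.12 share the static level and are the only ones eligible for
grouping according to sketch_may_group().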