On 6/12/26 2:20 PM, Dmitrii Shcherbakov wrote:
> OVN 26.03 introduced distributed load balancers [1]: when a
> Load_Balancer has options:distributed=true and ip_port_mappings is
> configured, each chassis delivers traffic only to the local backend.
> Deployments that use BGP to advertise tenant routes into the fabric
> [2] could previously advertise LB VIPs only as a single prefix per
> VIP, with no per-chassis health awareness. The fabric could ECMP
> across chassis but could not prefer those hosting healthy backends
> or withdraw routes from chassis with unhealthy ones.
>
> This series splits the Advertised_Route emission for LB VIPs from
> one-per-VIP to one-per-(VIP, backend LSP) and adds a controller-side
> gate that skips route installation when the Service_Monitor for the
> corresponding backend is offline. This allows the local dynamic
> routing speaker to advertise the VIP prefix only from chassis that
> host a healthy backend, so fabric ECMP converges onto
> healthy chassis.
>
> The series also installs local forwarding routes on the advertising
> logical router for peer-LR LB VIPs and NAT external IPs enumerated for
> redistribution, so that the advertising LR can forward traffic to
> those addresses through the peer.
>
Hi Dmitrii,
Thanks a lot for the series! Judging by the cover letter, it looks like
it includes a lot of good stuff.
I do plan to review it closely in the near future, but I was wondering
whether others (e.g., Martin, Frode, Felix) might have some time to
review it too.
Regards,
Dumitru
> Patch 1 fixes an existing bug where attaching a distributed LB to a
> router with a chassis-redirect port can leave stale lr_in_admission
> guards, dropping ingress LB traffic on non-gateway chassis.
>
> Patch 2 fixes a pre-existing bug in parsed_route_lookup() where the
> IPv6 nexthop comparison skipped matching routes instead of
> non-matching ones, causing false lookups for IPv6 routes with
> non-zero nexthops. The forwarding-route tests in patch 3 exercise
> this code path: the nexthop mutation test checks that a route whose
> nexthop changes is not treated as unchanged.
>
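
To make the inverted check described for patch 2 concrete, here is a
minimal sketch of a lookup that skips candidates whose IPv6 nexthop
differs from the key. It is not the OVN code: struct demo_route,
demo_nexthop_equal() and demo_route_lookup() are hypothetical names
made up for illustration, and the real parsed_route_lookup() compares
more fields than this.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified route record; not OVN's struct parsed_route. */
struct demo_route {
    struct in6_addr prefix;
    unsigned int plen;
    struct in6_addr nexthop;
};

static bool
demo_nexthop_equal(const struct in6_addr *a, const struct in6_addr *b)
{
    return !memcmp(a, b, sizeof *a);
}

/* Returns the first route whose prefix, prefix length and nexthop all
 * match 'key', or NULL if there is none. */
static const struct demo_route *
demo_route_lookup(const struct demo_route *routes, size_t n,
                  const struct demo_route *key)
{
    for (size_t i = 0; i < n; i++) {
        const struct demo_route *r = &routes[i];

        if (r->plen != key->plen
            || memcmp(&r->prefix, &key->prefix, sizeof r->prefix)) {
            continue;
        }
        /* Skip candidates whose nexthop differs.  The inverted form,
         * "if (demo_nexthop_equal(...)) { continue; }", would skip
         * exactly the route being looked for. */
        if (!demo_nexthop_equal(&r->nexthop, &key->nexthop)) {
            continue;
        }
        return r;
    }
    return NULL;
}
```

With two routes that share a prefix but differ in nexthop, a lookup
keyed on the second nexthop should return the second route, and a
lookup keyed on an unknown nexthop should return NULL.
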
> Patch 3 installs local forwarding routes for peer-LR LB VIPs and NAT
> external IPs that an advertising LRP has enumerated for redistribution.
>
> Patch 4 splits Advertised_Route emission from one-per-VIP to
> one-per-(VIP, backend LSP), populating service selector columns
> that let ovn-controller match the Service_Monitor row for each
> backend.
>
> Patch 5 adds the controller-side gate: kernel-route installation is
> skipped when the Service_Monitor for a backend is offline.
>
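
For readers following along, the gate described for patch 5 can be
pictured roughly as below. This is only a sketch under assumptions:
the structs are hypothetical stand-ins for the SB rows, the field
names merely mirror the tracked_service_{ip,port,protocol} columns
mentioned in the cover letter, and treating "no matching monitor" as
"install" is my guess, not something the cover letter states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the OVN SB IDL types. */
struct demo_service_monitor {
    const char *ip;
    int port;
    const char *protocol;
    const char *status;      /* e.g. "online" or "offline"; may be NULL. */
};

struct demo_advertised_route {
    const char *ip_prefix;
    const char *tracked_service_ip;       /* NULL when no selector is set. */
    int tracked_service_port;
    const char *tracked_service_protocol;
};

/* Returns false only when a monitor matching the route's service
 * selector exists and reports "offline". */
static bool
demo_should_install(const struct demo_advertised_route *ar,
                    const struct demo_service_monitor *mons, size_t n)
{
    if (!ar->tracked_service_ip) {
        return true;         /* No selector, so nothing to gate on. */
    }
    for (size_t i = 0; i < n; i++) {
        const struct demo_service_monitor *m = &mons[i];

        if (m->port == ar->tracked_service_port
            && !strcmp(m->ip, ar->tracked_service_ip)
            && !strcmp(m->protocol, ar->tracked_service_protocol)) {
            return !(m->status && !strcmp(m->status, "offline"));
        }
    }
    return true;             /* Assumption: no monitor means install. */
}
```
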
> Changes since v3 (patch 1):
> - Fix -Wshadow violation in build_lb_datapaths: rename inner
>   loop variable od to grp_od to avoid shadowing the outer-scope
>   od declared earlier in the same function.
>
> Changes since v2 (all in patch 1):
> - Track has_distributed_lb on LB-group expansion and deletion paths
>   so that the northd handler's distributed-LB guard covers all
>   incremental code paths that involve distributed LBs.
> - Instead of unconditionally forcing a full northd recompute when
>   any distributed LB change is tracked, collect a snapshot of each
>   affected LR's is_distributed state before incremental processing,
>   then after processing recompute the flag per LR using the per-LR
>   LB list (lr_lb_map). If any LR's flag changed, fall back to full
>   recompute. Otherwise the incremental path is safe to use.
> - Expose find_od_lb_data() so the post-check can walk only the LBs
>   attached to each affected LR, not every LB in the system.
> - Add regression tests covering direct LB attach/detach, LB-group
>   attach/detach, options:distributed toggle, LB-group membership
>   changes, multiple distributed LBs on the same LR, the incremental
>   path when a second distributed LB is attached (no is_distributed
>   transition) and full LB deletion while still attached to an LR.
>
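
The snapshot-then-compare approach from the v2 changelog could be
sketched as follows. This is an illustration, not the northd code:
struct demo_lr and both helpers are hypothetical, and the sketch
assumes an LR counts as distributed when at least one attached LB has
options:distributed=true, which is how I read the "second distributed
LB, no transition" test case.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_LBS 8

/* Hypothetical, simplified logical router; not OVN's ovn_datapath. */
struct demo_lr {
    bool lb_distributed[DEMO_MAX_LBS];  /* Per attached LB. */
    size_t n_lbs;
    bool is_distributed;                /* Cached per-LR flag. */
};

/* Assumption: distributed if any attached LB is distributed. */
static bool
demo_compute_is_distributed(const struct demo_lr *lr)
{
    for (size_t i = 0; i < lr->n_lbs; i++) {
        if (lr->lb_distributed[i]) {
            return true;
        }
    }
    return false;
}

/* 'snapshot[i]' holds lrs[i].is_distributed as it was before the
 * incremental change was applied.  Recomputes each flag and returns
 * true if any of them flipped, meaning a full recompute is needed. */
static bool
demo_needs_full_recompute(struct demo_lr *lrs, const bool *snapshot,
                          size_t n)
{
    bool changed = false;

    for (size_t i = 0; i < n; i++) {
        lrs[i].is_distributed = demo_compute_is_distributed(&lrs[i]);
        if (lrs[i].is_distributed != snapshot[i]) {
            changed = true;
        }
    }
    return changed;
}
```

Attaching a second distributed LB to an already distributed LR leaves
the flag unchanged, so the incremental path is kept. Detaching the
last one flips the flag and triggers the fallback.
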
> Changes since v1:
> - Patch 3: the nexthop mutation test provides regression coverage
>   for the parsed_route_lookup() fix in patch 2 (verifies that a route
>   whose nexthop changes is not treated as unchanged).
> - Patch 3: compare tracked_port in parsed_route_lookup() so that
>   forwarding routes with different tracked_port values are not
>   treated as identical.
> - Patch 3: fix pre-existing typos in en-advertised-route-sync.c
>   comments.
> - Patch 3: add regression test for nexthop mutation (change peer LRP
>   address and verify forwarding flow updates).
> - Patch 3: add negative test for LB-level dynamic-routing-advertise=
>   false (verify both Advertised_Route and forwarding flow are
>   not installed).
>
> Changes since v0:
> - New patch 2: fix nexthop comparison in parsed_route_lookup().
> - Patch 3 (was 2): swap in pre-existing parsed_route objects on
>   match so that pointers held by group_ecmp_route remain valid.
>   Detach from old_parsed_routes first to avoid corrupting the
>   source hmap.
> - Share-advertise system tests moved to the deferred
>   share-advertise-routes follow-up, since they depend on
>   the dynamic-routing-share-advertise-routes option.
> - Inline diagrams switched to ASCII-only characters.
> - Line lengths rewrapped per checkpatch feedback.
>
> A separate LRP option for VRF route sharing (share-advertise-routes)
> is deferred to a follow-up series. Deployments that need cross-VRF
> route sharing before the option lands can achieve the same effect
> by configuring the dynamic routing speaker to import routes between
> the OVN-managed VRFs on each chassis.
>
> Prior work: RFC for LB BGP advertisement [3] in which a review [4]
> suggested a separate incremental processing node for dynamic routes,
> which this series adopts. For tracked_port, patch 3 uses the peer LRP
> for forwarding routes and patch 4 uses the backend LSP (from
> ip_port_mappings) for per-backend locality rows. Patch 5 adds
> Service_Monitor gating on the per-backend rows.
>
> The SB schema (version 21.9.0) gains tracked_service_{ip,port,protocol}
> columns on Advertised_Route for the service selector. All new columns
> are optional with safe defaults, so rolling upgrades are supported.
> Builds on the distributed LB feature (OVN 26.03+).
>
> Tests cover LB and NAT route redistribution (IPv4/IPv6), forwarding
> route installation, per-backend selector emission, SCTP exclusion,
> ip_port_mappings fallback (per-backend rows when mappings exist,
> one peer-LRP row when none do), and Service_Monitor gating.
>
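
The ip_port_mappings fallback listed in the test coverage (per-backend
rows when mappings exist, one peer-LRP row when none do) might look
roughly like this. It is a sketch with hypothetical types and names,
and it assumes that unmapped backends are simply skipped when at least
one mapping exists, which the cover letter does not spell out.

```c
#include <stddef.h>

/* Hypothetical backend description; 'lsp' is the port name taken from
 * ip_port_mappings, or NULL when the backend has no mapping. */
struct demo_backend {
    const char *ip;
    const char *lsp;
};

/* Hypothetical, reduced view of an Advertised_Route row. */
struct demo_row {
    const char *vip;
    const char *tracked_port;
    const char *service_ip;   /* NULL for the fallback row. */
};

/* Writes up to 'cap' rows for 'vip' into 'rows' and returns how many
 * were written: one row per mapped backend, or a single row tracking
 * the peer LRP when no backend is mapped. */
static size_t
demo_emit_rows(const char *vip, const struct demo_backend *bes,
               size_t n_bes, const char *peer_lrp,
               struct demo_row *rows, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < n_bes && n < cap; i++) {
        if (bes[i].lsp) {
            rows[n++] = (struct demo_row) { vip, bes[i].lsp, bes[i].ip };
        }
    }
    if (!n && cap) {
        /* Fallback: no mappings at all, so track the peer LRP. */
        rows[n++] = (struct demo_row) { vip, peer_lrp, NULL };
    }
    return n;
}
```
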
> [1] commit 7b0eb4d9ed ("northd: Add distributed load balancer support.")
> [2] https://opendev.org/openstack/neutron-specs/src/branch/master/specs/2025.2/ovn-bgp-integration.rst
> [3] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420552.html
> [4] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420789.html
>
> Dmitrii Shcherbakov (5):
>   northd: Recompute only when LR is_distributed flips on LB.
>   northd: Fix nexthop comparison in parsed_route_lookup().
>   northd: Install forwarding routes for redistribute={lb,nat}.
>   northd: Emit per-backend Advertised_Route for LB redistribution.
>   controller: Skip LB route install when Service_Monitor is offline.
>
> controller/ovn-controller.c | 25 +-
> controller/route.c | 101 +++
> controller/route.h | 1 +
> northd/en-advertised-route-sync.c | 540 ++++++++++++++--
> northd/en-advertised-route-sync.h | 25 +-
> northd/en-group-ecmp-route.c | 76 +-
> northd/en-group-ecmp-route.h | 4 +
> northd/en-lb-data.c | 22 +-
> northd/en-lb-data.h | 3 +
> northd/en-lr-stateful.c | 2 +
> northd/en-northd.c | 1 +
> northd/inc-proc-northd.c | 5 +
> northd/lb.c | 2 +
> northd/lb.h | 15 +
> northd/northd.c | 217 +++++-
> northd/northd.h | 3 +
> ovn-sb.ovsschema | 18 +-
> ovn-sb.xml | 56 +-
> tests/ovn-inc-proc-graph-dump.at | 9 +-
> tests/ovn-northd.at | 1413 ++++++++++++++++++++++++++++++-
> tests/system-ovn.at | 521 ++++++++++++
> 21 files changed, 2946 insertions(+), 113 deletions(-)
>