OVN 26.03 introduced distributed load balancers [1]: when a
Load_Balancer has options:distributed=true and ip_port_mappings is
configured, each chassis delivers traffic only to the local backend.
Deployments that use BGP to advertise tenant routes into the fabric
[2] could previously advertise LB VIPs only as a single prefix per
VIP, with no per-chassis health awareness. The fabric could ECMP
across chassis but could not prefer those hosting healthy backends
or withdraw routes from chassis with unhealthy ones.
This series splits the Advertised_Route emission for LB VIPs from
one-per-VIP to one-per-(VIP, backend LSP) and adds a controller-side
gate that skips route installation when the Service_Monitor for the
corresponding backend is offline. This allows the local dynamic
routing speaker to advertise the VIP prefix only from chassis that
host a healthy backend, so fabric ECMP converges onto
healthy chassis.
The series also installs local forwarding routes on the advertising
logical router for peer-LR LB VIPs and NAT external IPs enumerated for
redistribution, so that the advertising LR can forward traffic to
those addresses through the peer.
Patch 1 fixes an existing bug where attaching a distributed LB to a
router with a chassis-redirect port can leave stale lr_in_admission
guards, dropping ingress LB traffic on non-gateway chassis.
Patch 2 fixes a pre-existing bug in parsed_route_lookup() where the
IPv6 nexthop comparison skipped matching routes instead of
non-matching ones, causing false lookups for IPv6 routes with
non-zero nexthops. The forwarding-route tests in patch 3 exercise
this code path: the nexthop mutation test checks that a route whose
nexthop changes is not treated as unchanged.
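
The polarity bug described above can be illustrated with a minimal sketch.
It is not the code from northd: struct route_sketch, in6_equal() and
route_sketch_lookup() are hypothetical names introduced here, and the real
parsed_route_lookup() compares more fields. The sketch only shows that the
loop must skip a candidate when the nexthops differ, not when they match.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical, simplified stand-in for a parsed route (not the OVN type). */
struct route_sketch {
    struct in6_addr prefix;
    unsigned int plen;
    struct in6_addr nexthop;
};

static bool
in6_equal(const struct in6_addr *a, const struct in6_addr *b)
{
    return !memcmp(a, b, sizeof *a);
}

/* Returns the first candidate equal to 'key', or NULL if there is none. */
static const struct route_sketch *
route_sketch_lookup(const struct route_sketch *candidates, size_t n,
                    const struct route_sketch *key)
{
    assert(key);
    for (size_t i = 0; i < n; i++) {
        const struct route_sketch *r = &candidates[i];

        if (r->plen != key->plen || !in6_equal(&r->prefix, &key->prefix)) {
            continue;
        }
        /* Correct polarity: skip the candidate when the nexthops differ.
         * The inverted test (skip when they are equal) would return a
         * route whose nexthop does not match the key. */
        if (!in6_equal(&r->nexthop, &key->nexthop)) {
            continue;
        }
        return r;
    }
    return NULL;
}
```

With this polarity, a key whose nexthop has changed finds no match, so the
caller treats the route as new rather than unchanged.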
Patch 3 installs local forwarding routes for peer-LR LB VIPs and NAT
external IPs that an advertising LRP has enumerated for redistribution.
Patch 4 splits Advertised_Route emission from one-per-VIP to
one-per-(VIP, backend LSP), populating service selector columns
that let ovn-controller match the Service_Monitor row for each
backend.
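
As a rough illustration of the per-(VIP, backend LSP) split, the sketch
below emits one row per backend that has a port mapping and falls back to a
single row tracking the peer LRP when no backend has one. All type, field
and function names are hypothetical, and the exact fallback condition is an
assumption based on the test description later in this letter; the real
emission code lives in northd/en-advertised-route-sync.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of one LB backend. */
struct backend_sketch {
    const char *ip;
    const char *lsp;    /* NULL when ip_port_mappings has no entry. */
};

/* Hypothetical view of one emitted Advertised_Route row. */
struct adv_row_sketch {
    const char *vip;
    const char *tracked_port;
    const char *service_ip;    /* NULL on the fallback row. */
};

/* Fills 'out' with at most 'out_cap' rows and returns the row count. */
static size_t
emit_rows_sketch(const char *vip, const char *peer_lrp,
                 const struct backend_sketch *backends, size_t n_backends,
                 struct adv_row_sketch *out, size_t out_cap)
{
    size_t n = 0;

    assert(out || !out_cap);
    for (size_t i = 0; i < n_backends && n < out_cap; i++) {
        if (!backends[i].lsp) {
            continue;    /* No mapping: no per-backend row. */
        }
        out[n].vip = vip;
        out[n].tracked_port = backends[i].lsp;
        out[n].service_ip = backends[i].ip;
        n++;
    }
    if (!n && out_cap) {
        /* Assumed fallback: one row that tracks the peer LRP. */
        out[n].vip = vip;
        out[n].tracked_port = peer_lrp;
        out[n].service_ip = NULL;
        n++;
    }
    return n;
}
```

The per-backend tracked port is what lets a chassis decide locality, while
the service selector lets it find the matching Service_Monitor row.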
Patch 5 adds the controller-side gate: kernel-route installation is
skipped when the Service_Monitor for a backend is offline.
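
The gate can be thought of as a small predicate. The following is only a
sketch: lb_route_should_install() is a hypothetical helper, and treating a
missing monitor or any status other than "offline" as installable is an
assumption made here, not something stated by the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Decides whether a route row should be installed into the kernel.
 * 'has_service_selector' is true for per-backend LB rows; 'monitor_status'
 * is the status string of the matching Service_Monitor row, or NULL when
 * no such row exists. */
static bool
lb_route_should_install(bool has_service_selector,
                        const char *monitor_status)
{
    if (!has_service_selector) {
        return true;    /* Not a per-backend LB row: never gated. */
    }
    if (!monitor_status) {
        return true;    /* Assumption: no monitor means no gating. */
    }
    /* Only an explicit "offline" status suppresses installation. */
    return strcmp(monitor_status, "offline") != 0;
}
```

Keeping the decision in one predicate makes it easy to reevaluate whenever
the monitor status changes.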
Changes since v2 (all in patch 1):
- Track has_distributed_lb on LB-group expansion and deletion paths
  so that the northd handler's distributed-LB guard covers all
  incremental code paths that involve distributed LBs.
- Instead of unconditionally forcing a full northd recompute when
  any distributed LB change is tracked, collect a snapshot of each
  affected LR's is_distributed state before incremental processing,
  then after processing recompute the flag per LR using the per-LR
  LB list (lr_lb_map). If any LR's flag changed, fall back to full
  recompute. Otherwise the incremental path is safe to use.
- Expose find_od_lb_data() so the post-check can walk only the LBs
  attached to each affected LR, not every LB in the system.
- Add regression tests covering direct LB attach/detach, LB-group
  attach/detach, options:distributed toggle, LB-group membership
  changes, multiple distributed LBs on the same LR, the incremental
  path when a second distributed LB is attached (no is_distributed
  transition) and full LB deletion while still attached to an LR.
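
The snapshot-and-compare check from the v2 changes can be sketched as
follows. The names are hypothetical, and the sketch assumes that an LR
counts as distributed when at least one attached LB is distributed; the
real code derives the flag from lr_lb_map and find_od_lb_data().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a logical router and its attached LBs. */
struct lr_sketch {
    const bool *lb_distributed;    /* One entry per attached LB. */
    size_t n_lbs;
};

/* Assumption: the flag is set if any attached LB is distributed. */
static bool
lr_is_distributed_sketch(const struct lr_sketch *lr)
{
    for (size_t i = 0; i < lr->n_lbs; i++) {
        if (lr->lb_distributed[i]) {
            return true;
        }
    }
    return false;
}

/* Records each LR's flag before incremental processing. */
static void
snapshot_flags(const struct lr_sketch *lrs, size_t n, bool *snap)
{
    assert(snap || !n);
    for (size_t i = 0; i < n; i++) {
        snap[i] = lr_is_distributed_sketch(&lrs[i]);
    }
}

/* After processing: true means fall back to a full recompute. */
static bool
any_flag_flipped(const struct lr_sketch *lrs, size_t n, const bool *snap)
{
    assert(snap || !n);
    for (size_t i = 0; i < n; i++) {
        if (lr_is_distributed_sketch(&lrs[i]) != snap[i]) {
            return true;
        }
    }
    return false;
}
```

Attaching a second distributed LB leaves the flag unchanged, so the
incremental path can continue; detaching the last one flips it.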
Changes since v1:
- Patch 3: the nexthop mutation test provides regression coverage
  for the parsed_route_lookup() fix in patch 2 (verifies that a route
  whose nexthop changes is not treated as unchanged).
- Patch 3: compare tracked_port in parsed_route_lookup() so that
  forwarding routes with different tracked_port values are not
  treated as identical.
- Patch 3: fix pre-existing typos in en-advertised-route-sync.c
  comments.
- Patch 3: add regression test for nexthop mutation (change peer LRP
  address and verify forwarding flow updates).
- Patch 3: add negative test for LB-level dynamic-routing-advertise=
  false (verify both Advertised_Route and forwarding flow are
  not installed).
Changes since v0:
- New patch 2: fix nexthop comparison in parsed_route_lookup().
- Patch 3 (was 2): swap in pre-existing parsed_route objects on
  match so that pointers held by group_ecmp_route remain valid.
  Detach from old_parsed_routes first to avoid corrupting the
  source hmap.
- Share-advertise system tests moved to the deferred
  share-advertise-routes follow-up, since they depend on
  the dynamic-routing-share-advertise-routes option.
- Inline diagrams switched to ASCII-only characters.
- Line lengths rewrapped per checkpatch feedback.
A separate LRP option for VRF route sharing (share-advertise-routes)
is deferred to a follow-up series. Deployments that need cross-VRF
route sharing before the option lands can achieve the same effect
by configuring the dynamic routing speaker to import routes between
the OVN-managed VRFs on each chassis.
Prior work: an RFC for LB BGP advertisement [3]. A review of that RFC
[4] suggested a separate incremental processing node for dynamic
routes, which this series adopts. For tracked_port, patch 3 uses the
peer LRP for forwarding routes and patch 4 uses the backend LSP (from
ip_port_mappings) for per-backend locality rows. Patch 5 adds
Service_Monitor gating on the per-backend rows.
The SB schema (version 21.9.0) gains tracked_service_{ip,port,protocol}
columns on Advertised_Route for the service selector. All new columns
are optional with safe defaults, so rolling upgrades are supported.
Builds on the distributed LB feature (OVN 26.03+).
Tests cover LB and NAT route redistribution (IPv4/IPv6), forwarding
route installation, per-backend selector emission, SCTP exclusion,
ip_port_mappings fallback (per-backend rows when mappings exist,
one peer-LRP row when none do), and Service_Monitor gating.
[1] commit 7b0eb4d9ed ("northd: Add distributed load balancer support.")
[2] https://opendev.org/openstack/neutron-specs/src/branch/master/specs/2025.2/ovn-bgp-integration.rst
[3] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420552.html
[4] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420789.html
Dmitrii Shcherbakov (5):
  northd: Recompute only when LR is_distributed flips on LB.
  northd: Fix nexthop comparison in parsed_route_lookup().
  northd: Install forwarding routes for redistribute={lb,nat}.
  northd: Emit per-backend Advertised_Route for LB redistribution.
  controller: Skip LB route install when Service_Monitor is offline.

 controller/ovn-controller.c       |   25 +-
 controller/route.c                |  101 +++
 controller/route.h                |    1 +
 northd/en-advertised-route-sync.c |  540 ++++++++++++++--
 northd/en-advertised-route-sync.h |   25 +-
 northd/en-group-ecmp-route.c      |   76 +-
 northd/en-group-ecmp-route.h      |    4 +
 northd/en-lb-data.c               |   22 +-
 northd/en-lb-data.h               |    3 +
 northd/en-lr-stateful.c           |    2 +
 northd/en-northd.c                |    1 +
 northd/inc-proc-northd.c          |    5 +
 northd/lb.c                       |    2 +
 northd/lb.h                       |   15 +
 northd/northd.c                   |  217 +++++-
 northd/northd.h                   |    3 +
 ovn-sb.ovsschema                  |   18 +-
 ovn-sb.xml                        |   56 +-
 tests/ovn-inc-proc-graph-dump.at  |    9 +-
 tests/ovn-northd.at               | 1413 ++++++++++++++++++++++++++++++-
 tests/system-ovn.at               |  521 ++++++++++++
 21 files changed, 2946 insertions(+), 113 deletions(-)
--
2.53.0