On Mon, May 11, 2026 at 11:40 PM JayGue Lee <[email protected]> wrote:
>
>
> build_lb_health_check_response_lflows() builds a per-backend
> service_monitor reply punt lflow with the match:
>
>     inport == "<backend_lsp>" && ip4.dst == <svc_mon_src_ip> &&
>     ip4.src == <backend_ip>   && eth.dst == <svc_mon_lrp_mac> &&
>     tcp.src == <backend_port>
>
> When the backend's LSP is type=external on a Logical_Switch that has
> a localnet port (the typical configuration produced by Neutron /
> ovn-octavia-provider for baremetal pool members on a provider VLAN),
> the probe reply re-enters br-int through the localnet port.  At the
> time the lflow is evaluated MFF_LOG_INPORT therefore carries the
> localnet LSP's tunnel_key, not the backend LSP's, and
> 'inport == "<backend_lsp>"' never matches.
>
> Even if the match is rewritten to 'inport == "<localnet_lsp>"',
> 'handle_svc_check(inport);' alone is not sufficient: pinctrl indexes
> its svc_monitors_map by (dp_key, port_key) where port_key equals the
> backend LSP's tunnel_key (sync_svc_monitors() in controller/pinctrl.c
> sets svc_mon->port_key = pb->tunnel_key from
> sb_svc_mon->logical_port).  pinctrl_find_svc_monitor() therefore
> requires MFF_LOG_INPORT to hold the backend LSP's tunnel_key when
> the controller op fires, otherwise the lookup fails silently and
> Service_Monitor.status stays 'offline' indefinitely even though the
> backend is fully reachable on the wire.
>
> Detect type=external backends on a Logical_Switch with a localnet
> port and substitute a localnet-aware reply lflow:
>
>   match  : flags.localnet == 1 && eth.src == <bm_mac> &&
>            ip{4,6}.dst == <svc_mon_src_ip> &&
>            ip{4,6}.src == <backend_ip> &&
>            eth.dst == <svc_mon_lrp_mac> && <proto-tail>
>   action : outport = "<backend_lsp>";
>            handle_svc_check(outport);
>            outport = "";
>
> flags.localnet is set in S_SWITCH_IN_LOOKUP_FDB on packets arriving
> from any localnet port (commit e6ffc4919c388, "northd: Enable ARP/ND
> responder for localnet-sourced requests."), so it identifies the
> re-entry path independently of which localnet LSP the packet came
> through (a switch may have several).  eth.src disambiguates which
> baremetal backend on that localnet.
>
> handle_svc_check(<field>) saves MFF_LOG_INPORT, moves <field> into
> it, runs the controller op, and restores MFF_LOG_INPORT
> (encode_HANDLE_SVC_CHECK() in lib/actions.c).  Borrowing
> MFF_LOG_OUTPORT to carry the backend LSP's tunnel_key into
> MFF_LOG_INPORT for the punt is exactly what pinctrl needs and is the
> same semantics as the manual workaround flow operators install today
> ('actions=set_field:<bm_tunnel_key>->reg14, controller(...)').
> 'outport = "";' clears MFF_LOG_OUTPORT afterwards so the cleared
> value does not leak into egress and reflect the reply back to the
> backend.
>
> Regular (non-external) backends are unchanged.
>
> Suggested-by: Numan Siddique <[email protected]>
> Signed-off-by: JayGue Lee <[email protected]>

Thanks for the patch.

Looks to me your solution is incomplete.

With LB health checks,  CMS/user can configure to use a private IP
from the subnet
or the router port IP as the source IP for the health checks generated
by ovn-controller.

The code you've added only handles the scenario when the router port
IP is configured
to use as the source ip.  The test case you added confirms that.

It is also possible that the user can configure some other private IP
from the same subnet in which case
ovn-controller will use the $svc_monitor_mac (which is generated and
stored in nb_global)
as the source mac when it sends out the health check packet.  For this
scenario the existing logical flow
(which is shown below) will be matched and ovn-controller ignore it as
the inport is localnet port and not external port.

There is one more inline comment below.  Please see.


Instead of the approach you have taken in this patch which is to add
logical flows which checks on
"flags.localnet == 1" how about we do the following ?

   - In the logical switch pipeline just after port security checks
we can add another stage (or in the apply port security stage itself)
     which matches on localnet port and eth.src == <external_port_mac>
and rewrites the inport to the actual external port.

Eg.
   if there is an external port "foo_external_port" with mac
"00:00:01:00:00:50" then add the flow like below

  table=0 (ls_in_check_port_sec), priority=70   , match=(inport ==
"localnet-port" && eth.src == 00:00:01:00:00:50), action=(inport =
"foo_external_port"; next;)

I think if we do this, then we don't have to match on localnet port
and eth.src down the line in other pipeline stages of the logical
switch.
And the existing logical flows for the handle_svc_check() would work
without any issues.

The downside of this approach is that it will break DHCP for external
ports (and maybe a few others).  I think those can be fixed.

I am not forcing this approach, but checking if it makes sense to do
this way.  If not we can continue with your patch and address the
comments which I provided above.

@Dumitru Ceara @Ales Musil @Mark Michelson @Han Zhou and others,  do
you have any thoughts on this ?

@JayGue Lee   Either you can wait for comments from others or address
the review comments and submit v6 meanwhile.


Thanks
Numan

> ---
> v5 (this revision):
>   - Test-only fix.  In v3/v4 the new ovn-northd.at unit test
>     used 'check ovn-nbctl ... -- --id=@hc create
>     Load_Balancer_Health_Check ...' which caused the testsuite
>     to fail because the create command prints the new row's
>     UUID to stdout while 'check' expects an empty stdout (see
>     Documentation/topics/test-development.rst).  Replaced with
>     'check_uuid', matching the convention used elsewhere in
>     the same file (e.g. tests/ovn-northd.at:1506).  No
>     functional change.  Verified locally on a Linux VM:
>     227/227 ovn-northd.at unit tests PASS, including the new
>     687/688 (parallelization=yes/no).
> v4:
>   - Address 0-day Robot checkpatch warnings on v3: two
>     ds_put_format() lines in build_lb_health_check_response_lflows()
>     were 80 chars after the new 'else' indentation pushed them over
>     the 79-char limit.  Fixed by breaking the format string onto a
>     new line, matching the style already used in the new
>     'external_via_localnet' branch.  No functional change.
> v3:
>   - Address review feedback from Numan Siddique on the original
>     bug report thread (ovs-discuss).  Numan suggested fixing the
>     existing reply lflow rather than adding a new one, by matching
>     on inport == <localnet_port> when the backend is type=external.
>   - Restructure the patch accordingly: instead of emitting a second
>     lflow at the same priority, the existing per-backend reply lflow
>     is now a single ovn_lflow_add() whose match/action are branched
>     when the backend is type=external on a Logical_Switch with a
>     localnet port.  This is a smaller and conceptually cleaner change
>     than v2 (no duplicate lflows on the same datapath, no priority
>     overlap to reason about).
>   - Match uses 'flags.localnet == 1 && eth.src == <bm_mac>' instead
>     of 'inport == "<localnet_lsp>"'.  flags.localnet is now set in
>     S_SWITCH_IN_LOOKUP_FDB by commit e6ffc4919c388 ("northd: Enable
>     ARP/ND responder for localnet-sourced requests."), so a single
>     predicate covers Logical_Switches that have multiple localnet
>     LSPs (which is common in Neutron provider deployments).  eth.src
>     disambiguates which baremetal backend on that localnet.
>   - Action piece preserved from v2 ('outport = "<backend_lsp>";
>     handle_svc_check(outport); outport = "";').  Match-only fixes do
>     not work because pinctrl_find_svc_monitor() (controller/pinctrl.c)
>     indexes svc_monitors_map by (dp_key, port_key) where port_key is
>     the backend LSP's tunnel_key, so MFF_LOG_INPORT must hold that
>     tunnel_key when the controller op fires.  This is also what the
>     manual workaround flow operators install today
>     ('actions=set_field:<bm_tunnel_key>->reg14, controller(...)') does
>     in production, confirmed against a live br-int dump.
>   - Use ls_has_localnet_port() helper added by commit 9699f8922c02747
>     ("northd: Add ls_has_localnet_port() helper.") instead of the
>     open-coded vector_is_empty(&od->localnet_ports) check.
>   - Updated commit subject to "Fix HM reply lflow for type=external
>     backends on localnet LS." to reflect that v3 modifies the
>     existing lflow rather than generating a new one.
>   - Test rewritten: a single Logical_Router peers with two switches
>     (a regular tenant LS with a VM backend, and a provider LS with a
>     localnet port and a type=external backend) and a single LB binds
>     both backends.  The test asserts the regular backend keeps the
>     original 'inport == "<vm-port>" / handle_svc_check(inport);'
>     behavior and the type=external backend gets the localnet-aware
>     match and the outport-borrowing action.
> v2:
>   - Replaced the ovn_port_find() call (which crashed on
>     peer_switch_od->ports because that hmap is keyed on dp_node, not
>     key_node) with an inline HMAP_FOR_EACH (dp_node, ...) loop.
>
>  NEWS                |  6 +++
>  northd/northd.c     | 96 ++++++++++++++++++++++++++++++++++++++------
>  tests/ovn-northd.at | 98 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 187 insertions(+), 13 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 68cdbff..0fa8a07 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -6,6 +6,12 @@ Post v26.03.0
>       * Add ECMP/multi-homing support for EVPN FDB entries. FDB entries
>         backed by a kernel nexthop group are load-balanced via OpenFlow
>         select groups with weighted buckets.
> +   - Fixed Load_Balancer health check replies failing silently for
> +     baremetal pool members whose backend LSP is type=external on a
> +     Logical_Switch that has a localnet port.  ovn-northd now emits a
> +     localnet-aware reply lflow (matched on flags.localnet + eth.src)
> +     and loads the backend LSP's tunnel_key into MFF_LOG_INPORT for
> +     the controller dispatch so pinctrl_find_svc_monitor() succeeds.
>
>  OVN v26.03.0 - xxx xx xxxx
>  --------------------------
> diff --git a/northd/northd.c b/northd/northd.c
> index 8305e04..cc1c2ef 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -8922,30 +8922,82 @@ build_lb_health_check_response_lflows(
>                  continue;
>              }
>
> +            /* For backends whose LSP is type=external on a switch with a
> +             * localnet port (typical for Neutron / ovn-octavia-provider
> +             * baremetal pool members on a provider VLAN), the reply
> +             * re-enters br-int via the localnet port.  At that point
> +             * MFF_LOG_INPORT carries the localnet LSP's tunnel_key, not
> +             * the backend's, so the original 'inport == <backend>' match
> +             * never fires and pinctrl_handle_svc_check() never runs.
> +             *
> +             * Detect that case here so we can substitute a localnet-aware
> +             * match (keyed on flags.localnet + eth.src) and an action
> +             * that loads the backend LSP's tunnel_key into MFF_LOG_INPORT
> +             * before dispatching to ovn-controller, which is what
> +             * pinctrl_find_svc_monitor() keys on. */
> +            struct ovn_port *backend_op = NULL;
> +            struct ovn_port *op;
> +            HMAP_FOR_EACH (op, dp_node, &peer_switch_od->ports) {
> +                if (op->nbsp && op->key
> +                    && !strcmp(op->key, backend_nb->logical_port)) {
> +                    backend_op = op;
> +                    break;
> +                }
> +            }
> +            bool external_via_localnet =
> +                backend_op && backend_op->nbsp && backend_op->nbsp->type
> +                && !strcmp(backend_op->nbsp->type, "external")
> +                && backend_op->n_lsp_addrs
> +                && ls_has_localnet_port(peer_switch_od);
> +
>              ds_clear(match);
>              ds_clear(action);
>
>              /* icmp6 type 1 and icmp4 type 3 are included in the match, 
> because
>               * the controller is using them to detect unreachable ports. */
>              if (addr_is_ipv6(backend_nb->svc_mon_src_ip)) {
> -                ds_put_format(match, "inport == \"%s\" && ip6.dst == %s && "
> -                              "ip6.src == %s && eth.dst == %s && ",
> -                              backend_nb->logical_port,
> -                              backend_nb->svc_mon_src_ip,
> -                              backend->ip_str,
> -                              backend_nb->svc_mon_lrp->lrp_networks.ea_s);
> +                if (external_via_localnet) {
> +                    ds_put_format(match,
> +                                  "flags.localnet == 1 && eth.src == %s && "
> +                                  "ip6.dst == %s && ip6.src == %s && "
> +                                  "eth.dst == %s && ",
> +                                  backend_op->lsp_addrs[0].ea_s,
> +                                  backend_nb->svc_mon_src_ip,
> +                                  backend->ip_str,
> +                                  
> backend_nb->svc_mon_lrp->lrp_networks.ea_s);
> +                } else {
> +                    ds_put_format(match,
> +                                  "inport == \"%s\" && ip6.dst == %s && "
> +                                  "ip6.src == %s && eth.dst == %s && ",
> +                                  backend_nb->logical_port,
> +                                  backend_nb->svc_mon_src_ip,
> +                                  backend->ip_str,
> +                                  
> backend_nb->svc_mon_lrp->lrp_networks.ea_s);
> +                }
>                  if (!strcmp(protocol, "tcp")) {
>                      ds_put_format(match, "tcp.src == %s", backend->port_str);
>                  } else {
>                      ds_put_cstr(match, "icmp6.type == 1");
>                  }
>              } else {
> -                ds_put_format(match, "inport == \"%s\" && ip4.dst == %s && "
> -                              "ip4.src == %s && eth.dst == %s && ",
> -                              backend_nb->logical_port,
> -                              backend_nb->svc_mon_src_ip,
> -                              backend->ip_str,
> -                              backend_nb->svc_mon_lrp->lrp_networks.ea_s);
> +                if (external_via_localnet) {
> +                    ds_put_format(match,
> +                                  "flags.localnet == 1 && eth.src == %s && "
> +                                  "ip4.dst == %s && ip4.src == %s && "
> +                                  "eth.dst == %s && ",
> +                                  backend_op->lsp_addrs[0].ea_s,
> +                                  backend_nb->svc_mon_src_ip,
> +                                  backend->ip_str,
> +                                  
> backend_nb->svc_mon_lrp->lrp_networks.ea_s);
> +                } else {
> +                    ds_put_format(match,
> +                                  "inport == \"%s\" && ip4.dst == %s && "
> +                                  "ip4.src == %s && eth.dst == %s && ",
> +                                  backend_nb->logical_port,
> +                                  backend_nb->svc_mon_src_ip,
> +                                  backend->ip_str,
> +                                  
> backend_nb->svc_mon_lrp->lrp_networks.ea_s);
> +                }
>                  if (!strcmp(protocol, "tcp")) {
>                      ds_put_format(match, "tcp.src == %s", backend->port_str);
>                  } else {
> @@ -8953,6 +9005,24 @@ build_lb_health_check_response_lflows(
>                  }
>              }
>
> +            /* For the type=external/localnet case, MFF_LOG_INPORT at this
> +             * point holds the localnet LSP's tunnel_key but pinctrl indexes
> +             * Service_Monitor by the backend LSP's tunnel_key.  Borrow
> +             * MFF_LOG_OUTPORT to carry the backend LSP's tunnel_key into
> +             * MFF_LOG_INPORT for the controller dispatch (handle_svc_check()
> +             * saves/moves/restores MFF_LOG_INPORT around the punt), then
> +             * clear outport so the cleared MFF_LOG_OUTPORT does not leak
> +             * into egress. */
> +            if (external_via_localnet) {
> +                ds_put_format(action,
> +                              "outport = \"%s\"; "
> +                              "handle_svc_check(outport); "
> +                              "outport = \"\";",
> +                              backend_nb->logical_port);

Why do you set outport to the backend logical port name ?
ovn-controller expects the inport to be set right ?
The action should set inport right ?

i.e
uint32_t port_key = md->flow.regs[MFF_LOG_INPORT - MFF_REG0];


Thanks
Numan

> +            } else {
> +                ds_put_cstr(action, "handle_svc_check(inport);");
> +            }
> +
>              /* ovn-controller expects health check responses from the LS
>               * datapath in which the backend is located. That's why we
>               * install the response lflow into the peer's datapath. */
> @@ -8960,7 +9030,7 @@ build_lb_health_check_response_lflows(
>                                                 peer_switch_od->nbs->copp,
>                                                 meter_groups);
>              ovn_lflow_add(lflows, peer_switch_od, S_SWITCH_IN_L2_LKUP, 110,
> -                          ds_cstr(match), "handle_svc_check(inport);",
> +                          ds_cstr(match), ds_cstr(action),
>                            lb_dps->lflow_ref, WITH_CTRL_METER(meter));
>          }
>      }
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 074b152..708586c 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -1687,6 +1687,104 @@ OVN_CLEANUP_NORTHD
>  AT_CLEANUP
>  ])
>
> +OVN_FOR_EACH_NORTHD_NO_HV_PARALLELIZATION([
> +AT_SETUP([Load balancer health check reply lflow for type=external backend 
> on localnet LS])
> +ovn_start
> +
> +# Topology:
> +#
> +#   lr0 --(lr0-sw0)-- sw0 (regular tenant LS, 10.0.0.0/24)
> +#                      `-- vm-port (type="", regular VIF backend)
> +#
> +#   lr0 --(lr0-prov)-- prov (provider LS with localnet)
> +#                       |-- prov-localnet (type=localnet)
> +#                       `-- bm-port (type=external, baremetal pool member)
> +#
> +# A baremetal LB pool member's LSP is type=external; replies to HM probes
> +# re-enter br-int via the localnet port, so MFF_LOG_INPORT carries the
> +# localnet LSP's tunnel_key and the original
> +#   "inport == <bm-port> && ... ; handle_svc_check(inport);"
> +# reply lflow never matches.  pinctrl_find_svc_monitor() is keyed on
> +# (dp_key, port_key) where port_key = backend LSP's tunnel_key, so
> +# MFF_LOG_INPORT must hold that tunnel_key when the controller op fires.
> +#
> +# The fix replaces the per-backend reply lflow's match/action when the
> +# backend is type=external on a switch with a localnet port:
> +#   match  : "flags.localnet == 1 && eth.src == <bm_mac> && ..."
> +#   action : "outport = "<bm-port>"; handle_svc_check(outport); outport = 
> "";"
> +# handle_svc_check() saves/moves/restores MFF_LOG_INPORT around the punt,
> +# so dispatch sees REG14 = bm-port's tunnel_key as required by pinctrl.
> +# Regular (non-external) backends keep the original behavior.
> +
> +check ovn-nbctl lr-add lr0
> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:01:01 10.0.0.1/24
> +check ovn-nbctl lrp-add lr0 lr0-prov 00:00:00:00:02:01 10.0.50.1/24
> +
> +check ovn-nbctl ls-add sw0
> +check ovn-nbctl --wait=sb lsp-add sw0 sw0-lr0 \
> +  -- lsp-set-type sw0-lr0 router \
> +  -- lsp-set-options sw0-lr0 router-port=lr0-sw0 \
> +  -- lsp-set-addresses sw0-lr0 router
> +check ovn-nbctl --wait=sb lsp-add sw0 vm-port \
> +  -- lsp-set-addresses vm-port "00:00:00:00:01:02 10.0.0.10"
> +
> +check ovn-nbctl ls-add prov
> +check ovn-nbctl --wait=sb lsp-add prov prov-lr0 \
> +  -- lsp-set-type prov-lr0 router \
> +  -- lsp-set-options prov-lr0 router-port=lr0-prov \
> +  -- lsp-set-addresses prov-lr0 router
> +check ovn-nbctl --wait=sb lsp-add prov prov-localnet \
> +  -- lsp-set-type prov-localnet localnet \
> +  -- lsp-set-options prov-localnet network_name=physnet1 \
> +  -- lsp-set-addresses prov-localnet unknown
> +check ovn-nbctl --wait=sb lsp-add prov bm-port \
> +  -- lsp-set-type bm-port external \
> +  -- lsp-set-addresses bm-port "00:00:00:00:02:0a 10.0.50.10"
> +
> +check ovn-sbctl chassis-add hv1 geneve 127.0.0.1
> +check ovn-sbctl lsp-bind vm-port hv1
> +check ovn-sbctl lsp-bind bm-port hv1
> +
> +# LB has both a regular-VIF backend on sw0 and a type=external backend on 
> prov.
> +check ovn-nbctl lb-add lb1 192.168.0.10:80 10.0.0.10:80,10.0.50.10:80 tcp
> +check ovn-nbctl --wait=sb set load_balancer lb1 \
> +  ip_port_mappings:10.0.0.10=vm-port:10.0.0.1
> +check ovn-nbctl --wait=sb set load_balancer lb1 \
> +  ip_port_mappings:10.0.50.10=bm-port:10.0.50.1
> +
> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create Load_Balancer_Health_Check 
> \
> +  vip="192.168.0.10\:80" -- add Load_Balancer lb1 health_check @hc
> +
> +check ovn-nbctl lr-lb-add lr0 lb1
> +check ovn-nbctl ls-lb-add sw0 lb1
> +check ovn-nbctl ls-lb-add prov lb1
> +check ovn-nbctl --wait=sb sync
> +
> +# Regular backend on sw0: original "inport == <vm-port>" / 
> "handle_svc_check(inport);"
> +# behavior unchanged.
> +AT_CAPTURE_FILE([sw0_lflows])
> +ovn-sbctl dump-flows sw0 | grep ls_in_l2_lkup | grep handle_svc_check \
> +  > sw0_lflows
> +AT_CHECK([cat sw0_lflows | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_l2_lkup      ), priority=110  , match=(eth.dst == 
> $svc_monitor_mac && (tcp || icmp || icmp6)), 
> action=(handle_svc_check(inport);)
> +  table=??(ls_in_l2_lkup      ), priority=110  , match=(inport == "vm-port" 
> && ip4.dst == 10.0.0.1 && ip4.src == 10.0.0.10 && eth.dst == 
> 00:00:00:00:01:01 && tcp.src == 80), action=(handle_svc_check(inport);)
> +])
> +
> +# type=external backend on prov (localnet LS): match keyed on flags.localnet
> +# + eth.src, action borrows outport to load the backend tunnel_key into
> +# MFF_LOG_INPORT for the controller dispatch.
> +AT_CAPTURE_FILE([prov_lflows])
> +ovn-sbctl dump-flows prov | grep ls_in_l2_lkup | grep handle_svc_check \
> +  > prov_lflows
> +AT_CHECK([cat prov_lflows | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_l2_lkup      ), priority=110  , match=(eth.dst == 
> $svc_monitor_mac && (tcp || icmp || icmp6)), 
> action=(handle_svc_check(inport);)
> +  table=??(ls_in_l2_lkup      ), priority=110  , match=(flags.localnet == 1 
> && eth.src == 00:00:00:00:02:0a && ip4.dst == 10.0.50.1 && ip4.src == 
> 10.0.50.10 && eth.dst == 00:00:00:00:02:01 && tcp.src == 80), action=(outport 
> = "bm-port"; handle_svc_check(outport); outport = "";)
> +])
> +
> +OVN_CLEANUP_NORTHD
> +AT_CLEANUP
> +])
> +
>  OVN_FOR_EACH_NORTHD_NO_HV([
>  AT_SETUP([Load balancer VIP in NAT entries])
>  AT_SKIP_IF([test $HAVE_PYTHON = no])
> --
> 2.49.0
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to