build_lb_health_check_response_lflows() builds a per-backend
service_monitor reply punt lflow with the match:

    inport == "<backend_lsp>" && ip4.dst == <svc_mon_src_ip> &&
    ip4.src == <backend_ip>   && eth.dst == <svc_mon_lrp_mac> &&
    tcp.src == <backend_port>

When the backend's LSP is type=external on a Logical_Switch that has
a localnet port (the typical configuration produced by Neutron /
ovn-octavia-provider for baremetal pool members on a provider VLAN),
the probe reply re-enters br-int through the localnet port.  At the
time the lflow is evaluated MFF_LOG_INPORT therefore carries the
localnet LSP's tunnel_key, not the backend LSP's, and
'inport == "<backend_lsp>"' never matches.

Even if the match is rewritten to 'inport == "<localnet_lsp>"',
'handle_svc_check(inport);' alone is not sufficient: pinctrl indexes
its svc_monitors_map by (dp_key, port_key) where port_key equals the
backend LSP's tunnel_key (sync_svc_monitors() in controller/pinctrl.c
sets svc_mon->port_key = pb->tunnel_key from
sb_svc_mon->logical_port).  pinctrl_find_svc_monitor() therefore
requires MFF_LOG_INPORT to hold the backend LSP's tunnel_key when
the controller op fires, otherwise the lookup fails silently and
Service_Monitor.status stays 'offline' indefinitely even though the
backend is fully reachable on the wire.

Detect type=external backends on a Logical_Switch with a localnet
port and substitute a localnet-aware reply lflow:

  match  : flags.localnet == 1 && eth.src == <bm_mac> &&
           ip{4,6}.dst == <svc_mon_src_ip> &&
           ip{4,6}.src == <backend_ip> &&
           eth.dst == <svc_mon_lrp_mac> && <proto-tail>
  action : outport = "<backend_lsp>";
           handle_svc_check(outport);
           outport = "";

flags.localnet is set in S_SWITCH_IN_LOOKUP_FDB on packets arriving
from any localnet port (commit e6ffc4919c388, "northd: Enable ARP/ND
responder for localnet-sourced requests."), so it identifies the
re-entry path independently of which localnet LSP the packet came
through (a switch may have several).  eth.src disambiguates which
baremetal backend on that localnet.

handle_svc_check(<field>) saves MFF_LOG_INPORT, moves <field> into
it, runs the controller op, and restores MFF_LOG_INPORT
(encode_HANDLE_SVC_CHECK() in lib/actions.c).  Borrowing
MFF_LOG_OUTPORT to carry the backend LSP's tunnel_key into
MFF_LOG_INPORT for the punt is exactly what pinctrl needs and is the
same semantics as the manual workaround flow operators install today
('actions=set_field:<bm_tunnel_key>->reg14, controller(...)').
'outport = "";' clears MFF_LOG_OUTPORT afterwards so the cleared
value does not leak into egress and reflect the reply back to the
backend.

Regular (non-external) backends are unchanged.

Suggested-by: Numan Siddique <[email protected]>
Signed-off-by: JayGue Lee <[email protected]>
---
v3 (this revision):
  - Address review feedback from Numan Siddique on the original
    bug report thread (ovs-discuss):
      https://mail.openvswitch.org/pipermail/ovs-discuss/
    Numan suggested fixing the existing reply lflow rather than
    adding a new one, by matching on inport == <localnet_port> when
    the backend is type=external.
  - Restructure the patch accordingly: instead of emitting a second
    lflow at the same priority, the existing per-backend reply lflow
    is now a single ovn_lflow_add() whose match/action are branched
    when the backend is type=external on a Logical_Switch with a
    localnet port.  This is a smaller and conceptually cleaner change
    than v2 (no duplicate lflows on the same datapath, no priority
    overlap to reason about).
  - Match uses 'flags.localnet == 1 && eth.src == <bm_mac>' instead
    of 'inport == "<localnet_lsp>"'.  flags.localnet is now set in
    S_SWITCH_IN_LOOKUP_FDB by commit e6ffc4919c388 ("northd: Enable
    ARP/ND responder for localnet-sourced requests."), so a single
    predicate covers Logical_Switches that have multiple localnet
    LSPs (which is common in Neutron provider deployments).  eth.src
    disambiguates which baremetal backend on that localnet.
  - Action piece preserved from v2 ('outport = "<backend_lsp>";
    handle_svc_check(outport); outport = "";').  Match-only fixes do
    not work because pinctrl_find_svc_monitor() (controller/pinctrl.c)
    indexes svc_monitors_map by (dp_key, port_key) where port_key is
    the backend LSP's tunnel_key, so MFF_LOG_INPORT must hold that
    tunnel_key when the controller op fires.  This is also what the
    manual workaround flow operators install today
    ('actions=set_field:<bm_tunnel_key>->reg14, controller(...)') does
    in production, confirmed against a live br-int dump.
  - Use ls_has_localnet_port() helper added by commit 9699f8922c02747
    ("northd: Add ls_has_localnet_port() helper.") instead of the
    open-coded vector_is_empty(&od->localnet_ports) check.
  - Updated commit subject to "Fix HM reply lflow for type=external
    backends on localnet LS." to reflect that v3 modifies the
    existing lflow rather than generating a new one.
  - Test rewritten: a single Logical_Router peers with two switches
    (a regular tenant LS with a VM backend, and a provider LS with a
    localnet port and a type=external backend) and a single LB binds
    both backends.  The test asserts the regular backend keeps the
    original 'inport == "<vm-port>" / handle_svc_check(inport);'
    behavior and the type=external backend gets the localnet-aware
    match and the outport-borrowing action.
v2:
  - Replaced the ovn_port_find() call (which crashed on
    peer_switch_od->ports because that hmap is keyed on dp_node, not
    key_node) with an inline HMAP_FOR_EACH (dp_node, ...) loop.

 NEWS                |  6 +++
 northd/northd.c     | 94 +++++++++++++++++++++++++++++++++++++------
 tests/ovn-northd.at | 98 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 185 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index 68cdbff..0fa8a07 100644
--- a/NEWS
+++ b/NEWS
@@ -6,6 +6,12 @@ Post v26.03.0
      * Add ECMP/multi-homing support for EVPN FDB entries. FDB entries
        backed by a kernel nexthop group are load-balanced via OpenFlow
        select groups with weighted buckets.
+   - Fixed Load_Balancer health check replies failing silently for
+     baremetal pool members whose backend LSP is type=external on a
+     Logical_Switch that has a localnet port.  ovn-northd now emits a
+     localnet-aware reply lflow (matched on flags.localnet + eth.src)
+     and loads the backend LSP's tunnel_key into MFF_LOG_INPORT for
+     the controller dispatch so pinctrl_find_svc_monitor() succeeds.
 
 OVN v26.03.0 - xxx xx xxxx
 --------------------------
diff --git a/northd/northd.c b/northd/northd.c
index 8305e04..5754abf 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -8922,30 +8922,80 @@ build_lb_health_check_response_lflows(
                 continue;
             }
 
+            /* For backends whose LSP is type=external on a switch with a
+             * localnet port (typical for Neutron / ovn-octavia-provider
+             * baremetal pool members on a provider VLAN), the reply
+             * re-enters br-int via the localnet port.  At that point
+             * MFF_LOG_INPORT carries the localnet LSP's tunnel_key, not
+             * the backend's, so the original 'inport == <backend>' match
+             * never fires and pinctrl_handle_svc_check() never runs.
+             *
+             * Detect that case here so we can substitute a localnet-aware
+             * match (keyed on flags.localnet + eth.src) and an action
+             * that loads the backend LSP's tunnel_key into MFF_LOG_INPORT
+             * before dispatching to ovn-controller, which is what
+             * pinctrl_find_svc_monitor() keys on. */
+            struct ovn_port *backend_op = NULL;
+            struct ovn_port *op;
+            HMAP_FOR_EACH (op, dp_node, &peer_switch_od->ports) {
+                if (op->nbsp && op->key
+                    && !strcmp(op->key, backend_nb->logical_port)) {
+                    backend_op = op;
+                    break;
+                }
+            }
+            bool external_via_localnet =
+                backend_op && backend_op->nbsp && backend_op->nbsp->type
+                && !strcmp(backend_op->nbsp->type, "external")
+                && backend_op->n_lsp_addrs
+                && ls_has_localnet_port(peer_switch_od);
+
             ds_clear(match);
             ds_clear(action);
 
             /* icmp6 type 1 and icmp4 type 3 are included in the match, because
              * the controller is using them to detect unreachable ports. */
             if (addr_is_ipv6(backend_nb->svc_mon_src_ip)) {
-                ds_put_format(match, "inport == \"%s\" && ip6.dst == %s && "
-                              "ip6.src == %s && eth.dst == %s && ",
-                              backend_nb->logical_port,
-                              backend_nb->svc_mon_src_ip,
-                              backend->ip_str,
-                              backend_nb->svc_mon_lrp->lrp_networks.ea_s);
+                if (external_via_localnet) {
+                    ds_put_format(match,
+                                  "flags.localnet == 1 && eth.src == %s && "
+                                  "ip6.dst == %s && ip6.src == %s && "
+                                  "eth.dst == %s && ",
+                                  backend_op->lsp_addrs[0].ea_s,
+                                  backend_nb->svc_mon_src_ip,
+                                  backend->ip_str,
+                                  backend_nb->svc_mon_lrp->lrp_networks.ea_s);
+                } else {
+                    ds_put_format(match, "inport == \"%s\" && ip6.dst == %s && 
"
+                                  "ip6.src == %s && eth.dst == %s && ",
+                                  backend_nb->logical_port,
+                                  backend_nb->svc_mon_src_ip,
+                                  backend->ip_str,
+                                  backend_nb->svc_mon_lrp->lrp_networks.ea_s);
+                }
                 if (!strcmp(protocol, "tcp")) {
                     ds_put_format(match, "tcp.src == %s", backend->port_str);
                 } else {
                     ds_put_cstr(match, "icmp6.type == 1");
                 }
             } else {
-                ds_put_format(match, "inport == \"%s\" && ip4.dst == %s && "
-                              "ip4.src == %s && eth.dst == %s && ",
-                              backend_nb->logical_port,
-                              backend_nb->svc_mon_src_ip,
-                              backend->ip_str,
-                              backend_nb->svc_mon_lrp->lrp_networks.ea_s);
+                if (external_via_localnet) {
+                    ds_put_format(match,
+                                  "flags.localnet == 1 && eth.src == %s && "
+                                  "ip4.dst == %s && ip4.src == %s && "
+                                  "eth.dst == %s && ",
+                                  backend_op->lsp_addrs[0].ea_s,
+                                  backend_nb->svc_mon_src_ip,
+                                  backend->ip_str,
+                                  backend_nb->svc_mon_lrp->lrp_networks.ea_s);
+                } else {
+                    ds_put_format(match, "inport == \"%s\" && ip4.dst == %s && 
"
+                                  "ip4.src == %s && eth.dst == %s && ",
+                                  backend_nb->logical_port,
+                                  backend_nb->svc_mon_src_ip,
+                                  backend->ip_str,
+                                  backend_nb->svc_mon_lrp->lrp_networks.ea_s);
+                }
                 if (!strcmp(protocol, "tcp")) {
                     ds_put_format(match, "tcp.src == %s", backend->port_str);
                 } else {
@@ -8953,6 +9003,24 @@ build_lb_health_check_response_lflows(
                 }
             }
 
+            /* For the type=external/localnet case, MFF_LOG_INPORT at this
+             * point holds the localnet LSP's tunnel_key but pinctrl indexes
+             * Service_Monitor by the backend LSP's tunnel_key.  Borrow
+             * MFF_LOG_OUTPORT to carry the backend LSP's tunnel_key into
+             * MFF_LOG_INPORT for the controller dispatch (handle_svc_check()
+             * saves/moves/restores MFF_LOG_INPORT around the punt), then
+             * clear outport so the cleared MFF_LOG_OUTPORT does not leak
+             * into egress. */
+            if (external_via_localnet) {
+                ds_put_format(action,
+                              "outport = \"%s\"; "
+                              "handle_svc_check(outport); "
+                              "outport = \"\";",
+                              backend_nb->logical_port);
+            } else {
+                ds_put_cstr(action, "handle_svc_check(inport);");
+            }
+
             /* ovn-controller expects health check responses from the LS
              * datapath in which the backend is located. That's why we
              * install the response lflow into the peer's datapath. */
@@ -8960,7 +9028,7 @@ build_lb_health_check_response_lflows(
                                                peer_switch_od->nbs->copp,
                                                meter_groups);
             ovn_lflow_add(lflows, peer_switch_od, S_SWITCH_IN_L2_LKUP, 110,
-                          ds_cstr(match), "handle_svc_check(inport);",
+                          ds_cstr(match), ds_cstr(action),
                           lb_dps->lflow_ref, WITH_CTRL_METER(meter));
         }
     }
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 074b152..defa985 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1687,6 +1687,104 @@ OVN_CLEANUP_NORTHD
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV_PARALLELIZATION([
+AT_SETUP([Load balancer health check reply lflow for type=external backend on 
localnet LS])
+ovn_start
+
+# Topology:
+#
+#   lr0 --(lr0-sw0)-- sw0 (regular tenant LS, 10.0.0.0/24)
+#                      `-- vm-port (type="", regular VIF backend)
+#
+#   lr0 --(lr0-prov)-- prov (provider LS with localnet)
+#                       |-- prov-localnet (type=localnet)
+#                       `-- bm-port (type=external, baremetal pool member)
+#
+# A baremetal LB pool member's LSP is type=external; replies to HM probes
+# re-enter br-int via the localnet port, so MFF_LOG_INPORT carries the
+# localnet LSP's tunnel_key and the original
+#   "inport == <bm-port> && ... ; handle_svc_check(inport);"
+# reply lflow never matches.  pinctrl_find_svc_monitor() is keyed on
+# (dp_key, port_key) where port_key = backend LSP's tunnel_key, so
+# MFF_LOG_INPORT must hold that tunnel_key when the controller op fires.
+#
+# The fix replaces the per-backend reply lflow's match/action when the
+# backend is type=external on a switch with a localnet port:
+#   match  : "flags.localnet == 1 && eth.src == <bm_mac> && ..."
+#   action : "outport = "<bm-port>"; handle_svc_check(outport); outport = "";"
+# handle_svc_check() saves/moves/restores MFF_LOG_INPORT around the punt,
+# so dispatch sees REG14 = bm-port's tunnel_key as required by pinctrl.
+# Regular (non-external) backends keep the original behavior.
+
+check ovn-nbctl lr-add lr0
+check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:01:01 10.0.0.1/24
+check ovn-nbctl lrp-add lr0 lr0-prov 00:00:00:00:02:01 10.0.50.1/24
+
+check ovn-nbctl ls-add sw0
+check ovn-nbctl --wait=sb lsp-add sw0 sw0-lr0 \
+  -- lsp-set-type sw0-lr0 router \
+  -- lsp-set-options sw0-lr0 router-port=lr0-sw0 \
+  -- lsp-set-addresses sw0-lr0 router
+check ovn-nbctl --wait=sb lsp-add sw0 vm-port \
+  -- lsp-set-addresses vm-port "00:00:00:00:01:02 10.0.0.10"
+
+check ovn-nbctl ls-add prov
+check ovn-nbctl --wait=sb lsp-add prov prov-lr0 \
+  -- lsp-set-type prov-lr0 router \
+  -- lsp-set-options prov-lr0 router-port=lr0-prov \
+  -- lsp-set-addresses prov-lr0 router
+check ovn-nbctl --wait=sb lsp-add prov prov-localnet \
+  -- lsp-set-type prov-localnet localnet \
+  -- lsp-set-options prov-localnet network_name=physnet1 \
+  -- lsp-set-addresses prov-localnet unknown
+check ovn-nbctl --wait=sb lsp-add prov bm-port \
+  -- lsp-set-type bm-port external \
+  -- lsp-set-addresses bm-port "00:00:00:00:02:0a 10.0.50.10"
+
+check ovn-sbctl chassis-add hv1 geneve 127.0.0.1
+check ovn-sbctl lsp-bind vm-port hv1
+check ovn-sbctl lsp-bind bm-port hv1
+
+# LB has both a regular-VIF backend on sw0 and a type=external backend on prov.
+check ovn-nbctl lb-add lb1 192.168.0.10:80 10.0.0.10:80,10.0.50.10:80 tcp
+check ovn-nbctl --wait=sb set load_balancer lb1 \
+  ip_port_mappings:10.0.0.10=vm-port:10.0.0.1
+check ovn-nbctl --wait=sb set load_balancer lb1 \
+  ip_port_mappings:10.0.50.10=bm-port:10.0.50.1
+
+check ovn-nbctl --wait=sb -- --id=@hc create Load_Balancer_Health_Check \
+  vip="192.168.0.10\:80" -- add Load_Balancer lb1 health_check @hc
+
+check ovn-nbctl lr-lb-add lr0 lb1
+check ovn-nbctl ls-lb-add sw0 lb1
+check ovn-nbctl ls-lb-add prov lb1
+check ovn-nbctl --wait=sb sync
+
+# Regular backend on sw0: original "inport == <vm-port>" / 
"handle_svc_check(inport);"
+# behavior unchanged.
+AT_CAPTURE_FILE([sw0_lflows])
+ovn-sbctl dump-flows sw0 | grep ls_in_l2_lkup | grep handle_svc_check \
+  > sw0_lflows
+AT_CHECK([cat sw0_lflows | ovn_strip_lflows], [0], [dnl
+  table=??(ls_in_l2_lkup      ), priority=110  , match=(eth.dst == 
$svc_monitor_mac && (tcp || icmp || icmp6)), action=(handle_svc_check(inport);)
+  table=??(ls_in_l2_lkup      ), priority=110  , match=(inport == "vm-port" && 
ip4.dst == 10.0.0.1 && ip4.src == 10.0.0.10 && eth.dst == 00:00:00:00:01:01 && 
tcp.src == 80), action=(handle_svc_check(inport);)
+])
+
+# type=external backend on prov (localnet LS): match keyed on flags.localnet
+# + eth.src, action borrows outport to load the backend tunnel_key into
+# MFF_LOG_INPORT for the controller dispatch.
+AT_CAPTURE_FILE([prov_lflows])
+ovn-sbctl dump-flows prov | grep ls_in_l2_lkup | grep handle_svc_check \
+  > prov_lflows
+AT_CHECK([cat prov_lflows | ovn_strip_lflows], [0], [dnl
+  table=??(ls_in_l2_lkup      ), priority=110  , match=(eth.dst == 
$svc_monitor_mac && (tcp || icmp || icmp6)), action=(handle_svc_check(inport);)
+  table=??(ls_in_l2_lkup      ), priority=110  , match=(flags.localnet == 1 && 
eth.src == 00:00:00:00:02:0a && ip4.dst == 10.0.50.1 && ip4.src == 10.0.50.10 
&& eth.dst == 00:00:00:00:02:01 && tcp.src == 80), action=(outport = "bm-port"; 
handle_svc_check(outport); outport = "";)
+])
+
+OVN_CLEANUP_NORTHD
+AT_CLEANUP
+])
+
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Load balancer VIP in NAT entries])
 AT_SKIP_IF([test $HAVE_PYTHON = no])
-- 
2.49.0

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to