On Thu, 2025-12-11 at 16:31 -0500, Mark Michelson wrote:
> Hi Martin,
> 

Thank you for the review and feedback Mark.

> The gist of the change is that in the IP-port mappings of a load
> balancer, if you specify a peered logical router's IP address, then
> the service monitor message we send out will be from that router's
> IP/MAC, instead of using the global service monitor MAC address. You
> then have added a flow to match on the replies of the service monitor
> requests, and you've suppressed the ARP responder flows from being
> installed for the service monitor IP as well.
> 
> I think my main concerns with this patch are:
> * Finding the logical router MAC from the ovn_northd_lb_backend
> struct
> seems inefficient. This adds a three-deep nested loop to an already
> two-deep nested loop that is in a function called from within a loop.
> If the router MAC could be determined more efficiently, that would be
> a welcome change.

Ack, I'll look for ways to reduce these nested loops.

> * I have some concerns about the flow you've installed for matching
> the service monitor replies. As you've mentioned, the match in this
> RFC is a bit loose. I think even adding the ip.src to the match
> leaves
> us open to DOS possibilities since the router IP and MAC are easy to
> learn. The flow should either have a more rigid mechanism for
> determining that the incoming packet is a reply to a service monitor
> request, or the flow should have a meter applied to it.

Metered connection sounds like a good idea. Additionally, would you say
that including `inport == <lb_backend_lsp>` match would help? To me it
sounds like this would bring us on par with current implementation (WRT
DOS risk). It would ensure that potential source of the DOS is only the
LB backend host, which is the same situation as today. I.e. in the
current implementation, if the LB backend host is compromised it can
suss out IP/MAC of the service monitor from ARP table and launch the
DOS. 

> * This is probably just because this is an RFC, but the patch does
> not
> cover IPv6, and it also appears not to cover ICMP-based service
> monitoring.

Re: IPv6
yes I fully plan to include IPv6 in the final proposal.

Re: ICMP
So I was originally thinking only about Load Balancer health checks. If
I understand correctly, LB health checks use only tcp/udp/sctp. ICMP on
the other hand is used only by "Network Function" service checks.
I'll heave to read up on the "Network Function"s as I haven't run into
them yet, but I take it that you think this feature, that I'm
proposing, should apply to them as well?
 
> 
> I put a couple of small notes below
> 
> On Wed, Dec 3, 2025 at 6:15 AM Martin Kalcok via dev
> <[email protected]> wrote:
> > 
> > Load Balancer Health Checks require specifying "source IP". This IP
> > has to be from the subnet of the monitored LB backend and should
> > not
> > be used by any other port in the LSP. If the "source IP" is also
> > used
> > by some other port, the host with the LB backend won't be able to
> > communicate
> > with this port due to the ARP responder rule installed by the
> > controller.
> > 
> > This limitation forces CMSes to reserve an IP per LSP with LB
> > backend, which
> > is not always practical or possible.
> > 
> > This proposal would allow usage of LRP IPs as the source of LB
> > health check
> > probes. It introduces following changes for such scenario:
> > 
> >   * service_monitor (SBDB) will use MAC address of the LRP
> >   * ARP responder rule is not necessary in this case
> >   * Destination IP and Source Port will be used to determine that a
> > packet
> >     is a response to a health check probe. These packets will be
> > redirected
> >     to the controller.
> > 
> > Behavior is unchanged for the LB health checks that use
> > reserved/unused
> > "source IPs".
> > 
> > Signed-off-by: Martin Kalcok <[email protected]>
> > ---
> >  northd/northd.c | 95
> > +++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 93 insertions(+), 2 deletions(-)
> > 
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 011f449ec..d713ccc60 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -3223,6 +3223,25 @@ ovn_lb_svc_create(struct ovsdb_idl_txn
> > *ovnsb_txn,
> >              struct ovn_port *op = ovn_port_find(ls_ports,
> >                                                  backend_nb-
> > >logical_port);
> > 
> > +
> > +            /* If the service monitor is backed by a real port,
> > use its MAC
> > +               address instead of the default service check MAC.*/
> > +            const char *source_mac = svc_monitor_mac;
> > +            const struct eth_addr *source_mac_ea =
> > svc_monitor_mac_ea;
> > +            struct ovn_port *op2;
> > +            VECTOR_FOR_EACH (&op->od->router_ports, op2) {
> > +                for (size_t k = 0; k < op2->n_lsp_addrs; k++) {
> > +                    struct lport_addresses *op_addrs = &op2-
> > >lsp_addrs[k];
> 
> Instead of using op2->lsp_addrs, you should be using
> op2->peer->lrp_addrs. Logical switch ports that are peered with
> logical router ports can specify "router" as their IP address. So
> reaching across to the peered logical router port will work if that
> is
> the case.

Thanks for the hint. Interestingly, the current approach seems to work
for LSPs with address "router" as well. However I do agree with you
that checking for `op2->peer` (this variable name is WIP btw :D) will
save us from unnecessarily looking into address sets of LSPs that
aren't peered with a LRP.

> 
> > +                    for (size_t l =0; l < op_addrs->n_ipv4_addrs;
> > l++) {
> > +                        const char *ip_s = op_addrs-
> > >ipv4_addrs[l].addr_s;
> > +                        if (!strcmp(ip_s, backend_nb-
> > >svc_mon_src_ip)) {
> > +                            source_mac = op_addrs->ea_s;
> > +                            source_mac_ea = &op_addrs->ea;
> > +                        }
> > +                    }
> > +                }
> > +            }
> > +
> >              if (!backend_nb->remote_backend &&
> >                  (!op || !lsp_is_enabled(op->nbsp))) {
> >                  continue;
> > @@ -3257,9 +3276,9 @@ ovn_lb_svc_create(struct ovsdb_idl_txn
> > *ovnsb_txn,
> >              struct eth_addr ea;
> >              if (!mon_info->sbrec_mon->src_mac ||
> >                  !eth_addr_from_string(mon_info->sbrec_mon-
> > >src_mac, &ea) ||
> > -                !eth_addr_equals(ea, *svc_monitor_mac_ea)) {
> > +                !eth_addr_equals(ea, *source_mac_ea)) {
> >                  sbrec_service_monitor_set_src_mac(mon_info-
> > >sbrec_mon,
> > -                                                 
> > svc_monitor_mac);
> > +                                                  source_mac);
> >              }
> > 
> >              if (!mon_info->sbrec_mon->src_ip ||
> > @@ -8474,6 +8493,61 @@ build_lb_rules(struct lflow_table *lflows,
> > struct ovn_lb_datapaths *lb_dps,
> >          const char *ip_match =
> >              lb_vip->address_family == AF_INET ? "ip4" : "ip6";
> > 
> > +        /* For each LB backend that is monitored by a source_ip
> > belonging
> > +           to a real port, install rule that punts service check
> > replies to the
> > +           controller.*/
> > +        size_t j = 0;
> > +        const struct ovn_lb_backend *backend;
> > +        VECTOR_FOR_EACH_PTR (&lb_vip->backends, backend) {
> > +            struct ovn_northd_lb_backend *backend_nb =
> > +                &lb->vips_nb->backends_nb[j++];
> > +
> > +            if (!backend_nb->health_check) {
> > +                continue;
> > +            }
> > +
> > +            const char *protocol = lb->nlb->protocol;
> > +            if (!protocol || !protocol[0]) {
> > +                protocol = "tcp";
> > +            }
> > +
> > +            size_t index;
> > +            DYNAMIC_BITMAP_FOR_EACH_1 (index, &lb_dps->nb_ls_map)
> > {
> > +                struct ovn_datapath *od =
> > vector_get(&ls_datapaths->dps, index,
> > +                                                     struct
> > ovn_datapath *);
> > +                /* Only install the rule if the datapath has a
> > port with
> > +                   monitor source IP.*/
> > +                struct ovn_port *svc_mon_op;
> > +                bool port_found = false;
> > +                VECTOR_FOR_EACH (&od->router_ports, svc_mon_op) {
> > +                    if (find_lport_address(
> > +                                svc_mon_op->lsp_addrs,
> > +                                backend_nb->svc_mon_src_ip)) {
> > +                        port_found = true;
> > +                        break;
> > +                    }
> > +                }
> > +                if (!port_found) {
> > +                    continue;
> > +                }
> > +
> > +                ds_clear(match);
> > +                ds_clear(action);
> > +                ds_put_format(
> > +                        match,
> > +                        "ip4.dst == %s && %s.src == %s",
> > +                        backend_nb->svc_mon_src_ip,
> > +                        protocol,
> > +                        backend->port_str);
> > +                ovn_lflow_add(lflows,
> > +                              od,
> > +                              S_SWITCH_IN_LB, 324,
> > +                              ds_cstr(match),
> > +                              "handle_svc_check(inport);",
> > +                              lb_dps->lflow_ref);
> > +            }
> > +        }
> > +
> >          ds_clear(action);
> >          ds_clear(match);
> > 
> > @@ -10273,6 +10347,23 @@ build_lswitch_arp_nd_local_svc_mon(const
> > struct ovn_lb_datapaths *lb_dps,
> >                  continue;
> >              }
> > 
> > +            /* ARP responder is necessary only if the service
> > check is not
> > +               backed by a real port and an IP. */
> > +            struct ovn_port *svc_mon_src_op;
> > +            bool found = false;
> > +            VECTOR_FOR_EACH (&op->od->router_ports,
> > svc_mon_src_op) {
> > +                if (find_lport_address(
> > +                            svc_mon_src_op->lsp_addrs,
> > +                            backend_nb->svc_mon_src_ip)
> > +                        ) {
> 
> You should probably call lrp_find_member_ip(svc_mon_src_op->peer,
> backend_nb->svc_mon_src_ip) instead of find_lport_address(). This is
> for the same reason as I specified earlier with regards to logical
> switch ports having "router" as their configured address.

Ack, thanks.

And thanks again for the feedback.
Martin.

> 
> > +                    found = true;
> > +                    break;
> > +                }
> > +            }
> > +            if (found) {
> > +                continue;
> > +            }
> > +
> >              ds_clear(match);
> >              ds_clear(actions);
> > 
> > --
> > 2.51.0
> > 
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > 
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to