On 1/30/26 11:27 AM, [email protected] wrote:
> On Wed, 2026-01-28 at 15:01 +0100, Dumitru Ceara wrote:
>> On 1/26/26 10:39 AM, Martin Kalcok via dev wrote:
>>> Load Balancer Health Checks require specifying a "source IP".  This
>>> IP has to be from the subnet of the monitored LB backend and should
>>> not be used by any other port in the LS.  If the "source IP" is also
>>> used by some other port, the host with the LB backend won't be able
>>> to communicate with this port due to the ARP responder rule installed
>>> by the controller.
>>>
>>> This limitation forces CMSes to reserve an IP per LSP with an LB
>>> backend, which is not always practical or possible.
>>>
>>> This change allows usage of LRP IPs as the source of LB health check
>>> probes.  It introduces the following changes for such a scenario:
>>>
>>>   * service_monitor (SBDB) will use the MAC address of the LRP
>>>   * ARP responder rule is not necessary in this case
>>>   * Destination IP, "inport" and source port will be used to
>>>     determine that a packet is a response to a health check probe.
>>>     These packets will be redirected to the controller.
>>>
>>> Behavior is unchanged for LB health checks that use reserved/unused
>>> "source IPs".
>>>
>>> Signed-off-by: Martin Kalcok <[email protected]>
>>> ---
>>>
>>
>> Hi Martin,
>>
>> Thanks for this new revision!
> 
> Thanks for the review and sorry for the formatting issues. I usually
> just rely on vim to do the right thing and it works, but somehow things
> got mixed up in this patch. Is there a linter I could use to avoid this
> in the future (aside from checkpatch.py)?
> 
> I have v4 pretty much ready, but I have a few follow-up questions.
> 

Hi Martin,

>>
>>> RFC -> v2
>>>   * Reduce number of iterations when searching for monitor source IP
>>>     in ports by searching only in lrp_networks of LRPs, using
>>>     `lrp_find_member_ip`.
>>>   * Narrowed down match for health check replies by matching
>>>     "inport == <monitored_lb_backend_port>"
>>>   * ICMP unreachable messages are also forwarded to the controller
>>>     to allow for monitoring UDP ports and detecting when they are
>>>     down.
>>>   * Flows that send monitor replies to the controller are still
>>>     inserted into the LS datapath, but the logic is tied to the LR,
>>>     i.e. whether to add the flows is evaluated when the LB is
>>>     attached to the LR.  This removes the necessity to add the LB to
>>>     the LS as well; it is enough to add it to the LR.
>>>   * IPv6 support added
>>>   * System and unit tests added
>>>     * Existing LB and Service Monitor checks were already quite
>>>       involved, so I opted to add a new set of tests that focus
>>>       specifically on a setup with LRP IPs as a source for service
>>>       checks.
>>>   * Documentation of the "ip_port_mappings" field in "Load_Balancer"
>>>     was extended to explicitly state requirements for the source_ip.
>>>
>>> v2 -> v3
>>>   * Rebased on `main`
>>>     * `vector_get` replaced with `sparse_array_get`
>>>     * `ovn_lflow_metered` replaced with the WITH_CTRL_METER macro
>>>   * Addressed review comments
>>>     * NEWS entry added
>>>     * Formatting and comments fixed
>>>     * Formatting of "match" rules in
>>>       `build_lb_health_check_response_lflows` split for IPv4 and
>>>       IPv6 cases.
>>>     * [tests] added `check` to ovs-vsctl calls
>>>     * [tests] dropped unnecessary --wait=sb
>>>   * Bug found: Function `build_lb_health_check_response_lflows` was,
>>>     on each call, iterating over backends of the same
>>>     `ovn_northd_lb_vip` instance (see in v2:
>>>     `&lb->vips_nb->backends_nb[j++];`).  This caused occasional
>>>     segfaults and was fixed by passing the "current"
>>>     `ovn_northd_lb_vip` instance as an argument to the function.
>>>
>>>  NEWS                |   5 +
>>>  northd/northd.c     | 143 +++++++++++++++-
>>>  ovn-nb.xml          |   5 +-
>>>  tests/ovn.at        | 325 +++++++++++++++++++++++++++++++++++
>>>  tests/system-ovn.at | 400 ++++++++++++++++++++++++++++++++++++++++++++
>>>  5 files changed, 875 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/NEWS b/NEWS
>>> index b7268c835..2a2b5e12d 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -93,6 +93,11 @@ Post v25.09.0
>>>       other_config column.
>>>     - Introduce the capability to specify multiple ips for
>>>       ovn-evpn-local-ip option.
>>> +   - Load balancer health checks can now use Logical Router Port IPs
>>> +     as the source IP for health check probes.  Previously, health
>>> +     checks required reserving an unused IP from the backend's
>>> +     subnet.  This change allows using LRP IPs directly, eliminating
>>> +     the need to reserve additional IPs per backend port.
>>>  
>>>  OVN v25.09.0 - xxx xx xxxx
>>>  --------------------------
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index adaa94e85..1e4e24216 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -3260,6 +3260,28 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn,
>>>                  continue;
>>>              }
>>>  
>>> +            /* If the service monitor is backed by a real port, use
>>> +             * its MAC address instead of the default service check
>>> +             * MAC. */
>>> +            const char *source_mac = svc_monitor_mac;
>>> +            const struct eth_addr *source_mac_ea = svc_monitor_mac_ea;
>>
>> Nit: I'd swap these two around.
> 
> ack.
> 
>>
>>> +            if (op) {
>>> +                struct ovn_port *svc_mon_op;
>>> +                VECTOR_FOR_EACH (&op->od->router_ports, svc_mon_op) {
>>> +                    if (!svc_mon_op->peer) {
>>> +                        continue;
>>> +                    }
>>> +                    const char *lrp_ip = lrp_find_member_ip(
>>> +                            svc_mon_op->peer,
>>> +                            backend_nb->svc_mon_src_ip);
>>
>> Nit: the indentation is still a bit weird.
> 
> ack.
> 
>>
>>> +                    if (lrp_ip && !strcmp(lrp_ip,
>>> +                                          backend_nb->svc_mon_src_ip)) {
>>> +                        source_mac = svc_mon_op->peer->lrp_networks.ea_s;
>>> +                        source_mac_ea = &svc_mon_op->peer->lrp_networks.ea;
>>
>> Nit: I'd swap these two around.
>>
> ack.
> 
>>> +                        break;
>>> +                    }
>>> +                }
>>> +            }
>>> +
>>>              const char *protocol = lb->nlb->protocol;
>>>              if (!protocol || !protocol[0]) {
>>>                  protocol = "tcp";
>>> @@ -3289,9 +3311,9 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn,
>>>              struct eth_addr ea;
>>>              if (!mon_info->sbrec_mon->src_mac ||
>>>                  !eth_addr_from_string(mon_info->sbrec_mon->src_mac, &ea) ||
>>> -                !eth_addr_equals(ea, *svc_monitor_mac_ea)) {
>>> +                !eth_addr_equals(ea, *source_mac_ea)) {
>>>                  sbrec_service_monitor_set_src_mac(mon_info->sbrec_mon,
>>> -                                                  svc_monitor_mac);
>>> +                                                  source_mac);
>>>              }
>>>  
>>>              if (!mon_info->sbrec_mon->src_ip ||
>>> @@ -8499,6 +8521,99 @@ build_lb_rules_for_stateless_acl(struct lflow_table *lflows,
>>>      }
>>>  }
>>>  
>>> +static void
>>> +build_lb_health_check_response_lflows(
>>> +    struct lflow_table *lflows,
>>> +    const struct ovn_northd_lb *lb,
>>> +    const struct ovn_lb_vip *lb_vip,
>>> +    const struct ovn_northd_lb_vip *lb_vip_nb,
>>> +    const struct ovn_lb_datapaths *lb_dps,
>>> +    const struct ovn_datapaths *lr_datapaths,
>>> +    const struct shash *meter_groups,
>>> +    struct ds *match,
>>> +    struct ds *action)
>>> +{
>>> +    /* For each LB backend that is monitored by a source_ip
>>> +     * belonging to a real LRP, install a rule that punts service
>>> +     * check replies to the controller. */
>>> +    size_t j = 0;
>>> +    const struct ovn_lb_backend *backend;
>>
>> Nit: reverse xmas tree and I'd add an empty line here.
> 
> ack.
> 
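(For reference, "reverse xmas tree" just means ordering the local
variable declarations from the longest line to the shortest, e.g.:

    const struct ovn_lb_backend *backend;
    size_t j = 0;

followed by an empty line before the rest of the code.)
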
>>
>>> +    VECTOR_FOR_EACH_PTR (&lb_vip->backends, backend) {
>>> +        struct ovn_northd_lb_backend *backend_nb =
>>> +            &lb_vip_nb->backends_nb[j++];
>>> +
>>> +        if (!backend_nb->health_check) {
>>> +            continue;
>>> +        }
>>> +
>>> +        const char *protocol = lb->nlb->protocol;
>>> +        if (!protocol || !protocol[0]) {
>>> +            protocol = "tcp";
>>> +        }
>>> +
>>> +        size_t index;
>>> +        DYNAMIC_BITMAP_FOR_EACH_1 (index, &lb_dps->nb_lr_map) {
>>> +            struct ovn_datapath *od = sparse_array_get(&lr_datapaths->dps,
>>> +                                                        index);
>>
>> Nit: indentation is one space off.
> 
> ack.
> 
>>
>>> +            /* Only install the rule if the datapath has a port with
>>> +             * the monitor source IP. */
>>> +            struct ovn_datapath *peer_switch_od = NULL;
>>> +            struct ovn_port *svc_mon_op;
>>> +
>>> +            HMAP_FOR_EACH (svc_mon_op, dp_node, &od->ports) {
>>> +                if (!svc_mon_op->peer) {
>>> +                    continue;
>>> +                }
>>> +                const char *lrp_ip = lrp_find_member_ip(
>>> +                    svc_mon_op,
>>> +                    backend_nb->svc_mon_src_ip);
>>> +                if (lrp_ip &&
>>> +                        !strcmp(lrp_ip, backend_nb->svc_mon_src_ip)) {
>>> +                    peer_switch_od = svc_mon_op->peer->od;
>>
>> What if svc_mon_op->peer->od is not a switch datapath?  That is, what
>> if svc_mon_op is a router port connecting this router to another
>> router?
>>
>> I know the topology would be a bit weird, but I think we should check
>> that and maybe just skip those ports.  Otherwise the lflows we add
>> below might cause issues.
> 
> ack. I added a check for the switch datapath, as you suggested, in
> patch `2/2`.
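
Thanks!  For future readers, a minimal sketch (untested) of the kind of
guard I had in mind; in northd.c, od->nbs is only set for switch
datapaths, so router-to-router peers would be skipped:

    /* Skip peers that are not switch datapaths (e.g., a router port
     * connecting this router to another router). */
    if (!svc_mon_op->peer->od->nbs) {
        continue;
    }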
>>
>>> +                    break;
>>> +                }
>>> +            }
>>> +            if (!peer_switch_od) {
>>> +                continue;
>>> +            }
>>> +
>>> +            ds_clear(match);
>>> +            ds_clear(action);
>>> +
>>> +            /* icmp6 type 1 and icmp4 type 3 are included as well,
>>> +             * because the controller is using them to detect
>>> +             * unreachable ports. */
>>> +            if (addr_is_ipv6(backend_nb->svc_mon_src_ip)) {
>>> +                ds_put_format(match,
>>> +                              "inport == \"%s\" && ip6.dst == %s && "
>>> +                              "(%s.src == %s || icmp6.type == 1)",
>>> +                              backend_nb->logical_port,
>>> +                              backend_nb->svc_mon_src_ip,
>>> +                              protocol,
>>> +                              backend->port_str);
>>> +            } else {
>>> +                ds_put_format(match,
>>> +                              "inport == \"%s\" && ip4.dst == %s && "
>>> +                              "(%s.src == %s || icmp4.type == 3)",
>>> +                              backend_nb->logical_port,
>>> +                              backend_nb->svc_mon_src_ip,
>>> +                              protocol,
>>> +                              backend->port_str);
>>> +            }
>>> +
>>
>> Should we be a little more strict here and match on eth.dst ==
>> $svc_monitor_mac (i.e., svc_mon_op->peer->lport_addresses.ea)?
> 
> I don't mind, but out of curiosity, what is the potential benefit? Is
> it to make the health check detect potential misconfiguration on the
> backend side? I.e., if the traffic for the LRP (likely a gateway) comes
> from the backend with a bad MAC, then there are possibly more serious
> connectivity issues and we want the check to fail?
> 

Right, but OTOH does it cost us anything if we're stricter in our check?

I'm OK either way in the end, I'll leave it up to you.  Actually, after
reading your other reply below, I prefer that we add the check. :)
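
Something along these lines is what I had in mind (only a sketch for
the IPv4 side, assuming the matching LRP found above is still at hand;
the IPv6 case would be analogous):

    ds_put_format(match,
                  "inport == \"%s\" && eth.dst == %s && ip4.dst == %s && "
                  "(%s.src == %s || icmp4.type == 3)",
                  backend_nb->logical_port,
                  svc_mon_op->peer->lrp_networks.ea_s,
                  backend_nb->svc_mon_src_ip,
                  protocol, backend->port_str);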

>>
>>> +            /* ovn-controller expects health check responses from
>>> +             * the LS datapath in which the backend is located.
>>> +             * That's why we install the response lflow into the
>>> +             * peer's datapath. */
>>> +            const char *meter = copp_meter_get(COPP_SVC_MONITOR,
>>> +                                                peer_switch_od->nbs->copp,
>>> +                                                meter_groups);
>>
>> Nit: indentation, the two lines above need to be one space to the
>> left.
> 
> ack.
> 
>>
>>> +            ovn_lflow_add(lflows, peer_switch_od, S_SWITCH_IN_LB, 160,
>>
>> I'm not sure if this was discussed in previous versions, but for
>> regular LB health checks (when we don't use the router IP) we
>> handle_svc_check() in S_SWITCH_IN_L2_LKUP.  Why do we do it in
>> S_SWITCH_IN_LB for router IPs?
> 
> I actually don't remember if there was a deeper reason for this, but
> the regular health checks are strictly MAC based (if the MAC equals the
> magic value, then send to controller), so they fit well into the
> L2_LKUP stage. However, for the LRP checks we don't (as of now) do any
> MAC-based matching, so it felt out of place in the L2_LKUP table. I
> chose IN_LB because it felt like a good fit, semantically.
> If we add your suggestion to also match
> `eth.dst == svc_mon_op->peer->lport_addresses.ea`, then it could fit in
> L2_LKUP again. I'll defer to your suggestion here.

Ah, right, yes, that would be my preference.  Then the feature is
implemented in the same way for both router-owned and non-router-owned IPs.
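
Roughly something like this (just a sketch; the priority is picked
arbitrarily for illustration), with the eth.dst match from above folded
into "match":

    ovn_lflow_add(lflows, peer_switch_od, S_SWITCH_IN_L2_LKUP, 110,
                  ds_cstr(match), "handle_svc_check(inport);",
                  lb_dps->lflow_ref, WITH_CTRL_METER(meter));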

>>
>>> +                          ds_cstr(match), "handle_svc_check(inport);",
>>> +                          lb_dps->lflow_ref,
>>> +                          WITH_CTRL_METER(meter));
>>
>> Nit: this fits on a single line.
> 
> ack.
> 
>>
>>> +        }
>>> +    }
>>> +}
>>> +
>>>  static void
>>>  build_lb_rules(struct lflow_table *lflows, struct ovn_lb_datapaths *lb_dps,
>>>                 const struct ovn_datapaths *ls_datapaths,
>>> @@ -10349,6 +10464,26 @@ build_lswitch_arp_nd_local_svc_mon(const struct ovn_lb_datapaths *lb_dps,
>>>                  continue;
>>>              }
>>>  
>>> +            /* ARP responder is necessary only if the service check
>>> +             * is not backed by a real port and an IP. */
>>> +            struct ovn_port *svc_mon_op;
>>> +            bool port_found = false;
>>> +            VECTOR_FOR_EACH (&op->od->router_ports, svc_mon_op) {
>>> +                if (!svc_mon_op->peer) {
>>> +                    continue;
>>> +                }
>>> +                const char *lrp_ip = lrp_find_member_ip(
>>> +                        svc_mon_op->peer,
>>> +                        backend_nb->svc_mon_src_ip);
>>> +                if (lrp_ip && !strcmp(lrp_ip, backend_nb->svc_mon_src_ip)) {
>>> +                    port_found = true;
>>> +                    break;
>>> +                }
>>> +            }
>>> +            if (port_found) {
>>> +                continue;
>>> +            }
>>> +
>>>              ds_clear(match);
>>>              ds_clear(actions);
>>>  
>>> @@ -13136,6 +13271,10 @@ build_lrouter_flows_for_lb(struct ovn_lb_datapaths *lb_dps,
>>>          build_lrouter_allow_vip_traffic_template(lflows, lb_dps, lb_vip, lb,
>>>                                                   lr_datapaths);
>>>  
>>> +        build_lb_health_check_response_lflows(
>>> +            lflows, lb, lb_vip, &lb->vips_nb[i], lb_dps, lr_datapaths,
>>> +            meter_groups, match, action);
>>> +
>>>          if (!build_empty_lb_event_flow(lb_vip, lb, match, action)) {
>>>              continue;
>>>          }
>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>> index e9ca27413..2ddc3bc20 100644
>>> --- a/ovn-nb.xml
>>> +++ b/ovn-nb.xml
>>> @@ -2402,7 +2402,10 @@
>>>            e.g. <code><var>port_name</var>:<var>sourc_ip</var></code> for IPv4.
>>>            Health checks are sent to this port with the specified source IP.
>>>            For IPv6 square brackets must be used around IP address, e.g:
>>> -          <code><var>port_name</var>:<var>[sourc_ip]</var></code>
>>> +          <code><var>port_name</var>:<var>[sourc_ip]</var></code>. The source
>>> +          IP must be from the subnet of the monitored endpoint. It can be
>>
>> Nit: we don't always enforce it but our coding style says that
>> sentences in comments should be separated by two spaces.  We try to do
>> that for docs too.
> 
> ack.
> 
>>
>>> +          either an unused IP from the subnet, or an IP of one of the
>>> +          Logical Router Ports connected to the same switch.
>>>            Remote endpoint:
>>>            Specify :target_zone_name at the end of the above syntax
>>>            to create remote health checks in a specific zone.
>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>> index 4d15d4a62..782fd0256 100644
>>> --- a/tests/ovn.at
>>> +++ b/tests/ovn.at
>>> @@ -26972,6 +26972,331 @@ ignored_tables=OFTABLE_CHK_LB_HAIRPIN,OFTABLE_CT_SNAT_HAIRPIN])
>>>  AT_CLEANUP
>>>  ])
>>>  
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks - LRP source IP - IPv4])
>>> +AT_KEYWORDS([lb])
>>> +ovn_start
>>> +
>>> +net_add n1
>>> +
>>> +sim_add hv1
>>> +as hv1
>>> +ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.1
>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>> +    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
>>> +    options:tx_pcap=hv1/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +check ovs-vsctl -- add-port br-int hv1-vif2 -- \
>>> +    set interface hv1-vif2 external-ids:iface-id=sw0-p2 \
>>> +    options:tx_pcap=hv1/vif2-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif2-rx.pcap \
>>> +    ofport-request=2
>>> +
>>> +sim_add hv2
>>> +as hv2
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.2
>>> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \
>>> +    set interface hv2-vif1 external-ids:iface-id=sw1-p1 \
>>> +    options:tx_pcap=hv2/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv2/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +
>>> +check ovn-nbctl ls-add sw0
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>>> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>>> +check ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>>> +
>>> +check ovn-nbctl ls-add sw1
>>> +check ovn-nbctl lsp-add sw1 sw1-p1
>>> +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3"
>>> +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3"
>>> +
>>> +# Create a logical router and attach both logical switches.
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +
>>> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24
>>> +check ovn-nbctl lsp-add-router-port sw1 sw1-lr0 lr0-sw1
>>> +
>>> +# Create TCP LB and configure health checks using router port IPs as source.
>>> +check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.1
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:20.0.0.3=sw1-p1:20.0.0.1
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="10.0.0.10\:80" \
>>> +          -- add Load_Balancer lb1 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +# Create UDP LB with health check using router port IP.
>>> +check ovn-nbctl lb-add lb2 10.0.0.11:53 10.0.0.4:53 udp
>>> +check ovn-nbctl set load_balancer lb2 ip_port_mappings:10.0.0.4=sw0-p2:10.0.0.1
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="10.0.0.11\:53" \
>>> +          -- add Load_Balancer lb2 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb1
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb2
>>> +
>>> +OVN_POPULATE_ARP
>>> +wait_for_ports_up
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +wait_row_count Service_Monitor 3
>>> +
>>> +# Verify that Service_Monitor uses LRP MAC address, not svc_monitor_mac.
>>> +lrp_sw0_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw0 mac | sed 's/"//g')
>>> +lrp_sw1_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw1 mac | sed 's/"//g')
>>> +
>>> +svc_mon_sw0_p1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p1)
>>> +svc_mon_sw0_p2_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p2)
>>> +svc_mon_sw1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw1-p1)
>>> +
>>> +AT_CHECK([test "$svc_mon_sw0_p1_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw0_p2_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw1_mac" = "$lrp_sw1_mac"])
>>> +
>>> +# Verify priority 160 flows exist for punting health check replies
>>> +# to controller.
>>> +AT_CAPTURE_FILE([sbflows1])
>>> +ovn-sbctl dump-flows sw0 > sbflows1
>>> +AT_CAPTURE_FILE([sbflows2])
>>> +ovn-sbctl dump-flows sw1 > sbflows2
>>> +
>>> +## TCP Load balancer health check flows.
>>> +# TCP backend - verify flow matches on inport, destination IP, and
>>> +# TCP source port.
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip4.dst == 10.0.0.1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" |
>>> \
>>> +          grep 'inport == "sw1-p1"' | grep "ip4.dst == 20.0.0.1" |
>>> \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +
>>
>> Is this extra newline on purpose?
> 
> nope, removed.
> 
>>
>>> +# Verify flow also matches ICMP unreachable (type 3 for IPv4).
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip4.dst == 10.0.0.1" | \
>>> +          grep -q "icmp4.type == 3"])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip4.dst == 20.0.0.1" | \
>>> +          grep -q "icmp4.type == 3"])
>>> +
>>> +## UDP Load balancer health check flows.
>>> +# UDP backend - verify flow uses UDP instead of TCP.
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p2"' | grep "ip4.dst == 10.0.0.1" |\
>>
>> Nit: missing space before \
> 
> ack.
> 
>>
>>> +          grep -q "udp.src == 53"])
>>
>> Can all the AT_CHECKs above be replaced with a plain "check"?
> 
> For the v4 I replaced all "simple" conditions with `check`. However,
> if the check:
> * expected a non-zero exit code,
> * expected specific output, or
> * piped multiple commands together,
> 
> I stuck with AT_CHECK because it seems easier to read (to me). If
> you disagree, please let me know and I can try to replace all of them.
> 

No, that's fine, thanks!
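
(For the record, the split you describe matches my understanding: as
far as I remember, "check" from ovs-macros.at only asserts a zero exit
status, e.g.:

    check ovn-nbctl ls-add sw0

while AT_CHECK is still needed for expected non-zero statuses or
specific output, e.g. asserting that no matching flow exists:

    AT_CHECK([grep "tcp.src" sbflows1], [1])
)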

>>
>>> +
>>> +# Verify NO TCP match in the UDP backend flow.
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" |
>>> \
>>> +          grep 'inport == "sw0-p2"' | grep "tcp.src"], [1])
>>> +
>>> +# Verify NO ARP responder flows using svc_monitor_mac for the
>>> router port IPs.
>>> +# The router port IPs should use the LRP MAC, not the global
>>> svc_monitor_mac.
>>> +svc_monitor_mac=$(ovn-nbctl get NB_Global .
>>> options:svc_monitor_mac | sed 's/"//g')
>>> +AT_CHECK([test -n "$svc_monitor_mac"])
>>> +
>>> +AT_CHECK([grep "arp.tpa == 10.0.0.1" sbflows1 | \
>>> +          grep "arp.sha = $svc_monitor_mac"], [1])
>>> +AT_CHECK([grep "arp.tpa == 20.0.0.1" sbflows2 | \
>>> +          grep "arp.sha = $svc_monitor_mac"], [1])
>>> +
>>> +# Verify health check packets are sent with LRP MAC as source.
>>> +lrp_sw0_mac_hex=$(echo $lrp_sw0_mac | sed 's/://g')
>>> +lrp_sw1_mac_hex=$(echo $lrp_sw1_mac | sed 's/://g')
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap | \
>>> +    grep "505400000003${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif2-tx.pcap | \
>>> +    grep "505400000004${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap | \
>>> +    grep "405400000003${lrp_sw1_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVN_CLEANUP([hv1], [hv2])
>>> +AT_CLEANUP
>>> +])
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks - LRP source IP - IPv6])
>>> +AT_KEYWORDS([lb])
>>> +ovn_start
>>> +
>>> +net_add n1
>>> +
>>> +sim_add hv1
>>> +as hv1
>>> +ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.1
>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>> +    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
>>> +    options:tx_pcap=hv1/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +check ovs-vsctl -- add-port br-int hv1-vif2 -- \
>>> +    set interface hv1-vif2 external-ids:iface-id=sw0-p2 \
>>> +    options:tx_pcap=hv1/vif2-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif2-rx.pcap \
>>> +    ofport-request=2
>>> +
>>> +sim_add hv2
>>> +as hv2
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.2
>>> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \
>>> +    set interface hv2-vif1 external-ids:iface-id=sw1-p1 \
>>> +    options:tx_pcap=hv2/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv2/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +
>>> +check ovn-nbctl ls-add sw0
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 2001::3"
>>> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 2001::3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 2001::4"
>>> +check ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 2001::4"
>>> +
>>> +check ovn-nbctl ls-add sw1
>>> +check ovn-nbctl lsp-add sw1 sw1-p1
>>> +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 2002::3"
>>> +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 2002::3"
>>> +
>>> +# Create a logical router and attach both logical switches.
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 2001::1/64
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +
>>> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 2002::1/64
>>> +check ovn-nbctl lsp-add-router-port sw1 sw1-lr0 lr0-sw1
>>> +
>>> +# Create TCP LB and configure health checks using router port IPs as source.
>>> +check ovn-nbctl lb-add lb1 [[2001::a]]:80 [[2001::3]]:80,[[2002::3]]:80
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:\"[[2001::3]]\"=\"sw0-p1:[[2001::1]]\"
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:\"[[2002::3]]\"=\"sw1-p1:[[2002::1]]\"
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" \
>>> +          -- add Load_Balancer lb1 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +# Create UDP LB with health check using router port IP.
>>> +check ovn-nbctl lb-add lb2 [[2001::b]]:53 [[2001::4]]:53 udp
>>> +check ovn-nbctl set load_balancer lb2 ip_port_mappings:\"[[2001::4]]\"=\"sw0-p2:[[2001::1]]\"
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="\[\[2001\:\:b\]\]\:53" \
>>> +          -- add Load_Balancer lb2 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb1
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb2
>>> +
>>> +OVN_POPULATE_ARP
>>> +wait_for_ports_up
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +wait_row_count Service_Monitor 3
>>> +
>>> +# Verify that Service_Monitor uses LRP MAC address.
>>> +lrp_sw0_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw0 mac | sed 's/"//g')
>>> +lrp_sw1_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw1 mac | sed 's/"//g')
>>> +
>>> +svc_mon_sw0_p1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p1)
>>> +svc_mon_sw0_p2_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p2)
>>> +svc_mon_sw1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw1-p1)
>>> +
>>> +AT_CHECK([test "$svc_mon_sw0_p1_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw0_p2_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw1_mac" = "$lrp_sw1_mac"])
>>> +
>>> +# Verify priority 160 flows exist for punting health check replies
>>> +# to controller.
>>> +AT_CAPTURE_FILE([sbflows1])
>>> +ovn-sbctl dump-flows sw0 > sbflows1
>>> +AT_CAPTURE_FILE([sbflows2])
>>> +ovn-sbctl dump-flows sw1 > sbflows2
>>> +
>>> +## TCP Load balancer health check flows.
>>> +# TCP backend - verify flow matches on inport, destination IP, and
>>> +# TCP source port.
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip6.dst == 2001::1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip6.dst == 2002::1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +# Verify flow also matches ICMP unreachable (type 1 for IPv6).
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip6.dst == 2001::1" | \
>>> +          grep -q "icmp6.type == 1"])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip6.dst == 2002::1" | \
>>> +          grep -q "icmp6.type == 1"])
>>> +
>>> +## UDP Load balancer health check flows.
>>> +# UDP backend - verify flow uses UDP instead of TCP.
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p2"' | grep "ip6.dst == 2001::1" | \
>>> +          grep -q "udp.src == 53"])
>>
>> Similar to the IPv4 case, can we use "check" instead of AT_CHECK for
>> all of these?
> 
> see above response.
> 
>>
>>> +
>>> +# Verify NO TCP match in the UDP backend flow.
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" |
>>> \
>>> +          grep 'inport == "sw0-p2"' | grep "tcp.src"], [1])
>>> +
>>> +# Verify NO IPv6 ND responder flows using svc_monitor_mac for the
>>> router port IPs.
>>> +# The router port IPs should use the LRP MAC, not the global
>>> svc_monitor_mac.
>>> +svc_monitor_mac=$(ovn-nbctl get NB_Global .
>>> options:svc_monitor_mac | sed 's/"//g')
>>> +AT_CHECK([test -n "$svc_monitor_mac"])
>>> +
>>> +AT_CHECK([ovn-sbctl dump-flows sw0 | grep "nd.target == 2001::1" |
>>> \
>>> +          grep "nd.tll = $svc_monitor_mac"], [1])
>>> +AT_CHECK([ovn-sbctl dump-flows sw1 | grep "nd.target == 2002::1" |
>>> \
>>> +          grep "nd.tll = $svc_monitor_mac"], [1])
>>> +
>>> +# Verify health check packets are sent with LRP MAC as source.
>>> +lrp_sw0_mac_hex=$(echo $lrp_sw0_mac | sed 's/://g')
>>> +lrp_sw1_mac_hex=$(echo $lrp_sw1_mac | sed 's/://g')
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap | \
>>> +    grep "505400000003${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif2-tx.pcap | \
>>> +    grep "505400000004${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap | \
>>> +    grep "405400000003${lrp_sw1_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVN_CLEANUP([hv1], [hv2])
>>> +AT_CLEANUP
>>> +])
>>> +
>>>  OVN_FOR_EACH_NORTHD([
>>>  AT_SETUP([SCTP Load balancer health checks])
>>>  AT_KEYWORDS([lb sctp])
>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>> index 6eaf38a4c..a372d3676 100644
>>> --- a/tests/system-ovn.at
>>> +++ b/tests/system-ovn.at
>>> @@ -4422,6 +4422,406 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>>  AT_CLEANUP
>>>  ])
>>>  
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks with LRP IP - IPv4])
>>> +AT_KEYWORDS([lb])
>>> +
>>> +# This is a variant of the 'Load balancer health checks - IPv4' test
>>> +# that uses LRP IPs as a source IP for the health checks. As such, it
>>> +# doesn't go as deep into testing and focuses only on "health checks"
>>> +# having proper status and the backend services being reachable.
>>> +#
>>> +# Topology:
>>> +#
>>> +#           LB VIP: 172.16.0.42
>>> +#         Backends: 10.0.0.3 10.0.0.4
>>> +#
>>> +#                                                                        .3 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p1    |
>>> +#                                      .42 |            |           |       +--------------+
>>> +#  +--------------+ .2                 .1  |            | .1        |
>>> +#  |   sw-ext-p1  | ------- sw-ext ------- |    lr0     | ------- sw0 10.0.0.0/24
>>> +#  +--------------+                        |            |           |
>>> +#                        172.16.0.0/24     |            |           |    .4 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p2    |
>>> +#                                                                           +--------------+
>>> +#
>>
>> Nice!
> 
> Thanks :)
> 
>>
>>> +ovn_start
>>> +
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +ADD_BR([br-int])
>>> +
>>> +# Set external-ids in br-int needed for ovn-controller.
>>> +check ovs-vsctl \
>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>>> +
>>> +# Start ovn-controller.
>>> +start_daemon ovn-controller
>>> +
>>> +# Create internal network with LB backends.
>>> +check ovn-nbctl ls-add sw0
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>>> +
>>> +# Create external network with LB VIP.
>>> +check ovn-nbctl ls-add sw-ext
>>> +
>>> +check ovn-nbctl lsp-add sw-ext sw-ext-p1
>>> +check ovn-nbctl lsp-set-addresses sw-ext-p1 "60:64:00:00:00:02 172.16.0.2"
>>> +
>>> +# Create a logical router and attach both logical switches.
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl set logical_router lr0 options:chassis=hv1
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
>>> +check ovn-nbctl lrp-add lr0 lr0-sw-ext 00:00:00:00:ee:01 172.16.0.1/24
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +check ovn-nbctl lsp-add-router-port sw-ext sw-ext-lr0 lr0-sw-ext
>>> +
>>> +check ovn-nbctl lb-add lb1 172.16.0.42:80 10.0.0.3:80,10.0.0.4:80
>>> +
>>> +# Load balancer health check will be using the LR IP as a source of
>>> +# the health check.
>>> +check ovn-nbctl set load_balancer . ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.1
>>> +check ovn-nbctl set load_balancer . ip_port_mappings:10.0.0.4=sw0-p2:10.0.0.1
>>> +
>>> +check_uuid ovn-nbctl --id=@hc create \
>>> +Load_Balancer_Health_Check vip="172.16.0.42\:80" -- add Load_Balancer . \
>>> +health_check @hc
>>> +
>>> +# LB is bound only to the LR. It shouldn't be necessary to bind it to
>>> +# the LS as well.
>>> +check ovn-nbctl lr-lb-add lr0 lb1
>>> +
>>> +OVN_POPULATE_ARP
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +ADD_NAMESPACES(sw0-p1)
>>> +ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \
>>> +         "10.0.0.1")
>>> +
>>> +ADD_NAMESPACES(sw0-p2)
>>> +ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \
>>> +         "10.0.0.1")
>>> +
>>> +ADD_NAMESPACES(sw-ext-p1)
>>> +ADD_VETH(sw-ext-p1, sw-ext-p1, br-int, "172.16.0.2/24", "60:64:00:00:00:02", \
>>> +         "172.16.0.1")
>>> +
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +# Start webservers in 'sw0-p1' and 'sw0-p2'.
>>> +OVS_START_L7([sw0-p1], [http])
>>> +sw0_p1_pid_file=`cat l7_pid_file`
>>> +OVS_START_L7([sw0-p2], [http])
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +for i in `seq 1 10`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://172.16.0.42 > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# Stop webserver in sw0-p1.
>>> +kill `cat $sw0_p1_pid_file`
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +for i in `seq 11 20`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://172.16.0.42 > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# Trigger port binding release and check if status changed to offline.
>>> +check ovs-vsctl remove interface ovs-sw0-p2 external_ids iface-id
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +check ovs-vsctl set interface ovs-sw0-p2 external_ids:iface-id=sw0-p2
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 1 status=online
>>> +
>>> +# Remove TCP Load Balancer.
>>> +check ovn-nbctl lb-del lb1
>>> +
>>> +# Create UDP load balancer.
>>> +check ovn-nbctl lb-add lb2 172.16.0.42:53 10.0.0.3:53,10.0.0.4:53 udp
>>> +lb_udp=`ovn-nbctl lb-list | grep udp | awk '{print $1}'`
>>> +
>>> +check ovn-nbctl set load_balancer $lb_udp ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.1
>>> +check ovn-nbctl set load_balancer $lb_udp ip_port_mappings:10.0.0.4=sw0-p2:10.0.0.1
>>> +
>>> +check_uuid ovn-nbctl --id=@hc create \
>>> +Load_Balancer_Health_Check vip="172.16.0.42\:53" -- add Load_Balancer $lb_udp \
>>> +health_check @hc
>>> +
>>> +check ovn-nbctl --wait=hv lr-lb-add lr0 lb2
>>> +
>>> +# Wait until the udp service_monitor records are set to offline.
>>> +wait_row_count Service_Monitor 2 status=offline protocol=udp
>>> +
>>> +# We need to use the "-k" argument because we keep receiving health
>>> +# check probes.
>>> +NETNS_DAEMONIZE([sw0-p1], [nc -4 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp1.pid])
>>> +NETNS_DAEMONIZE([sw0-p2], [nc -4 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp2.pid])
>>> +
>>> +# UDP nc with "-k" is bit tricky to cleanup, especially on test
>>> failures, due to its
>>> +# child processes keeping the listening port open. If they are not
>>> cleaned up properly,
>>> +# the whole test cleanup hangs.
>>
>> Is this something we need to fix in other tests too?  Outside of this
>> patch I mean.
>>
>>> +udp1_pid=$(cat udp1.pid)
>>> +udp2_pid=$(cat udp2.pid)
>>> +echo "pkill -P $udp1_pid && kill $udp1_pid" >> cleanup
>>> +echo "pkill -P $udp2_pid && kill $udp2_pid" >> cleanup
>>> +:> udp1.pid
>>> +:> udp2.pid
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -4 -u 172.16.0.42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_1; done'], [udp1_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_1],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +# Stop UDP service in sw0-p1.
>>> +# Due to the way UDP nc with "-k" forks, we need to kill all child
>>> +# processes and then the parent.
>>> +NS_CHECK_EXEC([sw0-p1], [pkill -P $udp1_pid && kill $udp1_pid])
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -4 -u 172.16.0.42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_2; done'], [udp2_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_2],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +OVN_CLEANUP_CONTROLLER([hv1])
>>> +
>>> +OVN_CLEANUP_NORTHD
>>> +
>>> +as
>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>> +/connection dropped.*/d
>>> +/Service monitor not found.*/d
>>> +/handle service check: Unsupported protocol*/d"])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks with LRP IP - IPv6])
>>> +AT_KEYWORDS([lb])
>>> +
>>> +# This is a variant of the 'Load balancer health checks - IPv6' test
>>> +# that uses LRP IPs as a source IP for the health checks. As such, it
>>> +# doesn't go as deep into testing and focuses only on "health checks"
>>> +# having proper status and the backend services being reachable.
>>> +#
>>> +# Topology:
>>> +#
>>> +#           LB VIP: fd20::42
>>> +#         Backends: fd11::3 fd11::4
>>> +#
>>> +#                                                                       ::3 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p1    |
>>> +#                                     ::42 |            |           |       +--------------+
>>> +#  +--------------+ ::2               ::1  |            | ::1       |
>>> +#  |   sw-ext-p1  | ------- sw-ext ------- |    lr0     | ------- sw0 fd11::/64
>>> +#  +--------------+                        |            |           |
>>> +#                        fd20::/64         |            |           |   ::4 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p2    |
>>> +#                                                                           +--------------+
>>> +#
>>> +ovn_start
>>> +
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +ADD_BR([br-int])
>>> +
>>> +# Set external-ids in br-int needed for ovn-controller.
>>> +check ovs-vsctl \
>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>>> +
>>> +# Start ovn-controller.
>>> +start_daemon ovn-controller
>>> +
>>> +# Create internal network with LB backends.
>>> +check ovn-nbctl ls-add sw0
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 fd11::3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 fd11::4"
>>> +
>>> +# Create external network with LB VIP.
>>> +check ovn-nbctl ls-add sw-ext
>>> +
>>> +check ovn-nbctl lsp-add sw-ext sw-ext-p1
>>> +check ovn-nbctl lsp-set-addresses sw-ext-p1 "60:64:00:00:00:02 fd20::2"
>>> +
>>> +# Create a logical router and attach both logical switches.
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl set logical_router lr0 options:chassis=hv1
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 fd11::1/64
>>> +check ovn-nbctl lrp-add lr0 lr0-sw-ext 00:00:00:00:ee:01 fd20::1/64
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +check ovn-nbctl lsp-add-router-port sw-ext sw-ext-lr0 lr0-sw-ext
>>> +
>>> +check ovn-nbctl lb-add lb1 [[fd20::42]]:80 [[fd11::3]]:80,[[fd11::4]]:80
>>> +
>>> +# Load balancer health check will be using the LR IP as a source of
>>> +# the health check.
>>> +check ovn-nbctl set load_balancer . ip_port_mappings:\"[[fd11::3]]\"=\"sw0-p1:[[fd11::1]]\"
>>> +check ovn-nbctl set load_balancer . ip_port_mappings:\"[[fd11::4]]\"=\"sw0-p2:[[fd11::1]]\"
>>> +
>>> +check_uuid ovn-nbctl --id=@hc create \
>>> +Load_Balancer_Health_Check vip="\[\[fd20\:\:42\]\]\:80" -- add Load_Balancer . \
>>> +health_check @hc
>>> +
>>> +# LB is bound only to the LR. It shouldn't be necessary to bind it to
>>> +# the LS as well.
>>> +check ovn-nbctl lr-lb-add lr0 lb1
>>> +
>>> +OVN_POPULATE_ARP
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +ADD_NAMESPACES(sw0-p1)
>>> +ADD_VETH(sw0-p1, sw0-p1, br-int, "fd11::3/64", "50:54:00:00:00:03", \
>>> +         "fd11::1")
>>> +
>>> +ADD_NAMESPACES(sw0-p2)
>>> +ADD_VETH(sw0-p2, sw0-p2, br-int, "fd11::4/64", "50:54:00:00:00:04", \
>>> +         "fd11::1")
>>> +
>>> +ADD_NAMESPACES(sw-ext-p1)
>>> +ADD_VETH(sw-ext-p1, sw-ext-p1, br-int, "fd20::2/64", "60:64:00:00:00:02", \
>>> +         "fd20::1")
>>> +
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +# Start webservers in 'sw0-p1' and 'sw0-p2'.
>>> +OVS_START_L7([sw0-p1], [http6])
>>> +sw0_p1_pid_file=`cat l7_pid_file`
>>> +OVS_START_L7([sw0-p2], [http6])
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +for i in `seq 1 10`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://[[fd20::42]] > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# Stop webserver in sw0-p1.
>>> +kill `cat $sw0_p1_pid_file`
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +for i in `seq 11 20`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://[[fd20::42]] > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# Trigger port binding release and check if status changed to offline.
>>> +check ovs-vsctl remove interface ovs-sw0-p2 external_ids iface-id
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +check ovs-vsctl set interface ovs-sw0-p2 external_ids:iface-id=sw0-p2
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 1 status=online
>>> +
>>> +# Remove TCP Load Balancer.
>>> +check ovn-nbctl lb-del lb1
>>> +
>>> +# Create udp load balancer.
>>> +check ovn-nbctl lb-add lb2 [[fd20::42]]:53 [[fd11::3]]:53,[[fd11::4]]:53 udp
>>> +lb_udp=`ovn-nbctl lb-list | grep udp | awk '{print $1}'`
>>> +
>>> +check ovn-nbctl set load_balancer $lb_udp ip_port_mappings:\"[[fd11::3]]\"=\"sw0-p1:[[fd11::1]]\"
>>> +check ovn-nbctl set load_balancer $lb_udp ip_port_mappings:\"[[fd11::4]]\"=\"sw0-p2:[[fd11::1]]\"
>>> +
>>> +check_uuid ovn-nbctl --id=@hc create \
>>> +Load_Balancer_Health_Check vip="\[\[fd20\:\:42\]\]\:53" -- add Load_Balancer $lb_udp \
>>> +health_check @hc
>>> +
>>> +check ovn-nbctl --wait=hv lr-lb-add lr0 lb2
>>> +
>>> +# Wait until the udp service_monitor records are set to offline.
>>> +wait_row_count Service_Monitor 2 status=offline protocol=udp
>>> +
>>> +# We need to use the "-k" argument because we keep receiving health
>>> +# check probes.
>>> +NETNS_DAEMONIZE([sw0-p1], [nc -6 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp1.pid])
>>> +NETNS_DAEMONIZE([sw0-p2], [nc -6 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp2.pid])
>>> +
>>> +# UDP nc with "-k" is bit tricky to cleanup, especially on test
>>> failures, due to its
>>> +# child processes keeping the listening port open. If they are not
>>> cleaned up properly,
>>> +# the whole test cleanup hangs.
>>> +udp1_pid=$(cat udp1.pid)
>>> +udp2_pid=$(cat udp2.pid)
>>> +echo "pkill -P $udp1_pid && kill $udp1_pid" >> cleanup
>>> +echo "pkill -P $udp2_pid && kill $udp2_pid" >> cleanup
>>> +:> udp1.pid
>>> +:> udp2.pid
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -6 -u fd20::42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_1; done'], [udp1_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_1],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +# Stop UDP service in sw0-p1.
>>> +# Due to the way UDP nc with "-k" forks, we need to kill all child
>>> +# processes and then the parent.
>>> +NS_CHECK_EXEC([sw0-p1], [pkill -P $udp1_pid && kill $udp1_pid])
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -6 -u fd20::42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_2; done'], [udp2_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_2],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +OVN_CLEANUP_CONTROLLER([hv1])
>>> +
>>> +OVN_CLEANUP_NORTHD
>>> +
>>> +as
>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>> +/connection dropped.*/d
>>> +/Service monitor not found.*/d
>>> +/handle service check: Unsupported protocol*/d"])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> +
>>>  OVN_FOR_EACH_NORTHD([
>>>  AT_SETUP([Load balancer health checks - IPv4])
>>>  AT_KEYWORDS([lb])
>>
>> Regards,
>> Dumitru
> 

Thanks,
Dumitru

