On 1/26/26 10:30 AM, [email protected] wrote:
> On Thu, 2026-01-22 at 15:13 +0100, Dumitru Ceara wrote:
>> On 12/19/25 12:17 PM, Martin Kalcok via dev wrote:
>>> Load Balancer Health Checks require specifying a "source IP". This IP
>>> has to be from the subnet of the monitored LB backend and should not
>>> be used by any other port in the LS. If the "source IP" is also used
>>> by some other port, the host with the LB backend won't be able to
>>> communicate with this port due to the ARP responder rule installed by
>>> the controller.
>>>
>>> This limitation forces CMSes to reserve an IP per LSP with an LB
>>> backend, which is not always practical or possible.
>>>
>>> This proposal would allow usage of LRP IPs as the source of LB health
>>> check probes. It introduces the following changes for such a scenario:
>>>
>>>   * service_monitor (SBDB) will use the MAC address of the LRP
>>>   * ARP responder rule is not necessary in this case
>>>   * Destination IP, "inport" and source port will be used to determine
>>>     that a packet is a response to a health check probe. These packets
>>>     will be redirected to the controller.
>>>
>>> Behavior is unchanged for the LB health checks that use reserved/unused
>>> "source IPs".
>>>
>>> Signed-off-by: Martin Kalcok <[email protected]>
>>> ---
>>
>> Hi Martin,
>>
>> Thanks for this version!  It needs a rebase though; quite a few patches
>> went in since it was posted.
> 
> Thanks for the review Dumitru. I prepared v3 and I'll submit it in a
> few moments.

Hi Martin,

>>
>> In general, it makes sense to me.  I am a bit worried about all the
>> lookups we do so it would be great if we could improve that part but I
>> guess that can also be a follow up.
> 
> 
> I'm attaching a second commit in the v3 that attempts to solve some of
> the expensive lookups (more info in the commit message). If you decide
> that it's acceptable, it can be squashed into the first commit.
> 

Ack, I'll have a look at v3 and we can continue the discussion there.
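
Storing the result sounds good.  Just so we're on the same page, I'd
expect the consumers to then look roughly like this (untested sketch,
assuming the v3 member really is a 'struct ovn_port *' called
'svc_mon_lrp' as you describe):

    /* Populated once in ovn_lb_svc_create(): NULL means the source IP
     * is a reserved/unused one; non-NULL means it belongs to this LRP. */
    if (backend_nb->svc_mon_lrp) {
        source_mac = backend_nb->svc_mon_lrp->lrp_networks.ea_s;
        source_mac_ea = &backend_nb->svc_mon_lrp->lrp_networks.ea;
    }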

>> Please see below for some more comments.
>>
>>>
>>> RFC -> v2
>>>   * Reduce number of iterations when searching for monitor source IP
>>>     in ports by searching only in lrp_networks of LRPs, using
>>>     `lrp_find_member_ip`.
>>>   * Narrowed down match for health check replies by matching
>>>     "inport == <monitored_lb_backend_port>"
>>>   * ICMP unreachable messages are also forwarded to the controller to
>>>     allow for monitoring UDP ports and detecting when they are down.
>>>   * Flows that send monitor replies to the controller are still
>>>     inserted into the LS datapath, but the logic is tied to the LR,
>>>     i.e. whether to add the flows is evaluated when the LB is attached
>>>     to the LR. This removes the necessity to add the LB to the LS as
>>>     well; it is enough to add it to the LR.
>>>   * IPv6 support added
>>>   * System and Unit tests added
>>>     * Existing LB and Service Monitor checks were already quite
>>>       involved, so I opted to add a new set of tests that focus
>>>       specifically on a setup with LRP IPs as a source for service
>>>       checks.
>>>   * Documentation of the "ip_port_mappings" field in "Load_Balancer"
>>>     was extended to explicitly state requirements for the source_ip.
>>>
>>>
>>>  northd/northd.c     | 143 +++++++++++++++-
>>>  ovn-nb.xml          |   5 +-
>>>  tests/ovn.at        | 325 +++++++++++++++++++++++++++++++++++
>>>  tests/system-ovn.at | 400 ++++++++++++++++++++++++++++++++++++++++++++
>>>  4 files changed, 870 insertions(+), 3 deletions(-)
>>>
>>
>> We should probably mention this change in the NEWS file as it's a
>> relevant behavior change.
> 
> ack
> 
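
For the NEWS file something short should do, e.g. (wording entirely up
to you):

   - Load balancer health checks can now use a logical router port IP as
     the probe source IP.  In that case the service monitor uses the LRP
     MAC and no ARP/ND responder flows are installed for the source IP.
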
>>
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 011f449ec..46b07a277 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -3228,6 +3228,28 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn,
>>>                  continue;
>>>              }
>>>  
>>> +            /* If the service monitor is backed by a real port, use its MAC
>>> +               address instead of the default service check MAC.*/
>>
>> Nit: multiline comments should contain '*' on each line.
>> Nit2: missing space after period.
> 
> ack
> 
>>
>>> +            const char *source_mac = svc_monitor_mac;
>>> +            const struct eth_addr *source_mac_ea = svc_monitor_mac_ea;
>>> +            if (op) {
>>
>> 'op' can't be NULL here.  We check just above.
> 
> The above check for `op` is done only if remote_backend is false. I
> had failing tests without this additional check, because remote_backend
> was `true` and op was NULL.
> 

Oh right, my bad.  Please ignore this.
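
For the archive, a simplified sketch of the control flow I overlooked
(names approximate, not the literal code):

    /* Remote backends have no local ovn_port, so 'op' is only looked
     * up and validated for local backends; it stays NULL otherwise. */
    struct ovn_port *op = NULL;
    if (!remote_backend) {
        op = ovn_port_find(ls_ports, backend_nb->logical_port);
        if (!op || !lsp_is_enabled(op->nbsp)) {
            continue;
        }
    }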

>>
>>> +                struct ovn_port *svc_mon_op;
>>> +                VECTOR_FOR_EACH (&op->od->router_ports, svc_mon_op) {
>>> +                    if (!svc_mon_op->peer) {
>>> +                        continue;
>>> +                    }
>>> +                    const char *lrp_ip = lrp_find_member_ip(
>>> +                            svc_mon_op->peer,
>>> +                            backend_nb->svc_mon_src_ip);
>>
>> These nested lookups seem costly.  It would be nice if we could avoid
>> some of that but I couldn't think of an easy way of doing that.  Any
>> ideas?
> 
> I wasn't able to simplify this particular block, but in the v3 I'm
> attaching a patch that stores the result of this lookup in
> `ovn_northd_lb_backend->svc_mon_lrp`, so other parts of the code don't
> have to do it again. They can just look at the `svc_mon_lrp` member and
> infer whether a real port is used for service check or not.
> 
> For my own curiosity, is this VECTOR_FOR_EACH really adding that much
> complexity? We should be iterating only through the routers connected
> to the LB backend's LS, right?
>  

I guess you're right, it's only one loop for each backend with a port
backing it.  It's probably fine, although we're 6 indentation levels deep
at this point in the function, which makes it quite hard to read.
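
If the nesting ever becomes a real problem we could pull the loop into
a small helper, something like (untested sketch, helper name made up):

    /* Return the peer LRP that owns 'svc_mon_src_ip' on the backend's
     * logical switch, or NULL if no LRP uses that IP. */
    static const struct ovn_port *
    svc_mon_src_lrp(const struct ovn_datapath *ls_od,
                    const char *svc_mon_src_ip)
    {
        struct ovn_port *rp;

        VECTOR_FOR_EACH (&ls_od->router_ports, rp) {
            if (!rp->peer) {
                continue;
            }
            const char *lrp_ip = lrp_find_member_ip(rp->peer,
                                                    svc_mon_src_ip);
            if (lrp_ip && !strcmp(lrp_ip, svc_mon_src_ip)) {
                return rp->peer;
            }
        }
        return NULL;
    }

That would also let the ARP responder and lflow-building paths share
the same code instead of open coding the loop in several places.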

>>
>>> +                    if (lrp_ip && !strcmp(lrp_ip,
>>> +                                          backend_nb->svc_mon_src_ip)) {
>>> +                        source_mac = svc_mon_op->peer->lrp_networks.ea_s;
>>> +                        source_mac_ea = &svc_mon_op->peer->lrp_networks.ea;
>>> +                        break;
>>> +                    }
>>> +                }
>>> +            }
>>> +
>>>              const char *protocol = lb->nlb->protocol;
>>>              if (!protocol || !protocol[0]) {
>>>                  protocol = "tcp";
>>> @@ -3257,9 +3279,9 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn,
>>>              struct eth_addr ea;
>>>              if (!mon_info->sbrec_mon->src_mac ||
>>>                  !eth_addr_from_string(mon_info->sbrec_mon->src_mac, &ea) ||
>>> -                !eth_addr_equals(ea, *svc_monitor_mac_ea)) {
>>> +                !eth_addr_equals(ea, *source_mac_ea)) {
>>>                  sbrec_service_monitor_set_src_mac(mon_info->sbrec_mon,
>>> -                                                  svc_monitor_mac);
>>> +                                                  source_mac);
>>>              }
>>>  
>>>              if (!mon_info->sbrec_mon->src_ip ||
>>> @@ -8460,6 +8482,99 @@ build_lb_rules_for_stateless_acl(struct lflow_table *lflows,
>>>      }
>>>  }
>>>  
>>> +static void
>>> +build_lb_health_check_response_lflows(
>>> +    struct lflow_table *lflows,
>>> +    const struct ovn_northd_lb *lb,
>>> +    const struct ovn_lb_vip *lb_vip,
>>> +    const struct ovn_lb_datapaths *lb_dps,
>>> +    const struct ovn_datapaths *lr_datapaths,
>>> +    const struct shash *meter_groups,
>>> +    struct ds *match,
>>> +    struct ds *action)
>>> +{
>>
>> Nit: we could indent all args under the `(` they just fit in under 80
>> columns. :)
> 
> Unfortunately there was a bug that necessitated another argument and it
> won't fit <80 columns anymore :( (more info on the bug in the changelog
> of v3)
> 

Ack.

>>
>>> +    /* For each LB backend that is monitored by a source_ip belonging
>>> +       to a real LRP, install rule that punts service check replies to the
>>> +       controller.*/
>>
>> Nit: comment style, missing '*'.
> 
> ack
> 
>>
>>> +    size_t j = 0;
>>> +    const struct ovn_lb_backend *backend;
>>> +    VECTOR_FOR_EACH_PTR (&lb_vip->backends, backend) {
>>> +        struct ovn_northd_lb_backend *backend_nb =
>>> +            &lb->vips_nb->backends_nb[j++];
>>> +
>>> +        if (!backend_nb->health_check) {
>>> +            continue;
>>> +        }
>>> +
>>> +        const char *protocol = lb->nlb->protocol;
>>> +        if (!protocol || !protocol[0]) {
>>> +            protocol = "tcp";
>>> +        }
>>> +
>>> +        size_t index;
>>> +        DYNAMIC_BITMAP_FOR_EACH_1 (index, &lb_dps->nb_lr_map) {
>>> +            struct ovn_datapath *od = vector_get(&lr_datapaths->dps, index,
>>> +                                                 struct ovn_datapath *);
>>> +            /* Only install the rule if the datapath has a port with
>>> +               monitor source IP.*/
>>
>> Nit: comment style.
> 
> ack
> 
>>
>>> +            struct ovn_port *svc_mon_op;
>>> +            struct ovn_datapath *peer_switch_od = NULL;
>>
>> Nit: reverse xmas tree please.
> 
> ack
> 
>>
>>> +
>>> +            HMAP_FOR_EACH (svc_mon_op, dp_node, &od->ports) {
>>> +                if (!svc_mon_op->peer) {
>>> +                    continue;
>>> +                }
>>> +                const char *lrp_ip = lrp_find_member_ip(
>>> +                    svc_mon_op,
>>> +                    backend_nb->svc_mon_src_ip);
>>> +                if (lrp_ip &&
>>> +                        !strcmp(lrp_ip, backend_nb->svc_mon_src_ip)) {
>>> +                    peer_switch_od = svc_mon_op->peer->od;
>>> +                    break;
>>> +                }
>>> +            }
>>> +            if (!peer_switch_od) {
>>> +                continue;
>>> +            }
>>> +
>>> +            char *ip_v, *icmp_unreach;
>>> +            if (addr_is_ipv6(backend_nb->svc_mon_src_ip)) {
>>> +                ip_v = "6";
>>> +                icmp_unreach = "1";
>>> +            } else {
>>> +                ip_v = "4";
>>> +                icmp_unreach = "3";
>>> +            }
>>> +
>>> +            ds_clear(match);
>>> +            ds_clear(action);
>>> +            ds_put_format(
>>> +                    match,
>>> +                    "inport == \"%s\" && ip%s.dst == %s && "
>>> +                    "(%s.src == %s || icmp%s.type == %s)",
>>> +                    backend_nb->logical_port,
>>> +                    ip_v,
>>> +                    backend_nb->svc_mon_src_ip,
>>> +                    protocol,
>>> +                    backend->port_str,
>>> +                    ip_v,
>>> +                    icmp_unreach);
>>
>> TBH, this is hard to read, maybe separate code paths for v4 and v6 make
>> it easier to read?
>>
>> Also the indentation is quite confusing now.
> 
> ack
> 
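
Thanks.  For reference, something along these lines is what I had in
mind (untested sketch, v4/v6 constants taken from your patch):

    ds_clear(match);
    if (addr_is_ipv6(backend_nb->svc_mon_src_ip)) {
        ds_put_format(match,
                      "inport == \"%s\" && ip6.dst == %s && "
                      "(%s.src == %s || icmp6.type == 1)",
                      backend_nb->logical_port,
                      backend_nb->svc_mon_src_ip,
                      protocol, backend->port_str);
    } else {
        ds_put_format(match,
                      "inport == \"%s\" && ip4.dst == %s && "
                      "(%s.src == %s || icmp4.type == 3)",
                      backend_nb->logical_port,
                      backend_nb->svc_mon_src_ip,
                      protocol, backend->port_str);
    }
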
>>
>>> +            /* ovn-controller expects health check responses from the LS
>>> +             * datapath in which the backend is located. That's why we
>>> +             * install the response lflow into the peer's datapath. */
>>> +            ovn_lflow_metered(lflows,
>>> +                              peer_switch_od,
>>> +                              S_SWITCH_IN_LB, 160,
>>> +                              ds_cstr(match),
>>> +                              "handle_svc_check(inport);",
>>> +                              copp_meter_get(COPP_SVC_MONITOR,
>>> +                                             peer_switch_od->nbs->copp,
>>> +                                             meter_groups),
>>> +                              lb_dps->lflow_ref);
>>> +        }
>>> +    }
>>> +}
>>> +
>>>  static void
>>>  build_lb_rules(struct lflow_table *lflows, struct ovn_lb_datapaths *lb_dps,
>>>                 const struct ovn_datapaths *ls_datapaths,
>>> @@ -10273,6 +10388,26 @@ build_lswitch_arp_nd_local_svc_mon(const struct ovn_lb_datapaths *lb_dps,
>>>                  continue;
>>>              }
>>>  
>>> +            /* ARP responder is necessary only if the service check is not
>>> +               backed by a real port and an IP. */
>>
>> Nit: comment style.
> 
> ack
> 
>>
>>> +            struct ovn_port *svc_mon_op;
>>> +            bool port_found = false;
>>> +            VECTOR_FOR_EACH (&op->od->router_ports, svc_mon_op) {
>>> +                if (!svc_mon_op->peer) {
>>> +                    continue;
>>> +                }
>>> +                const char *lrp_ip = lrp_find_member_ip(
>>> +                        svc_mon_op->peer,
>>> +                        backend_nb->svc_mon_src_ip);
>>> +                if (lrp_ip && !strcmp(lrp_ip, backend_nb->svc_mon_src_ip)) {
>>> +                    port_found = true;
>>> +                    break;
>>> +                }
>>> +            }
>>> +            if (port_found) {
>>> +                continue;
>>> +            }
>>> +
>>>              ds_clear(match);
>>>              ds_clear(actions);
>>>  
>>> @@ -12974,6 +13109,10 @@ build_lrouter_flows_for_lb(struct ovn_lb_datapaths *lb_dps,
>>>          build_lrouter_allow_vip_traffic_template(lflows, lb_dps, lb_vip, lb,
>>>                                                   lr_datapaths);
>>>  
>>> +        build_lb_health_check_response_lflows(
>>> +            lflows, lb, lb_vip, lb_dps, lr_datapaths, meter_groups,
>>> +            match, action);
>>> +
>>>          if (!build_empty_lb_event_flow(lb_vip, lb, match, action)) {
>>>              continue;
>>>          }
>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>> index b5fe44e53..ff383ac42 100644
>>> --- a/ovn-nb.xml
>>> +++ b/ovn-nb.xml
>>> @@ -2381,7 +2381,10 @@
>>>            e.g. <code><var>port_name</var>:<var>sourc_ip</var></code> for IPv4.
>>>            Health checks are sent to this port with the specified source IP.
>>>            For IPv6 square brackets must be used around IP address, e.g:
>>> -          <code><var>port_name</var>:<var>[sourc_ip]</var></code>
>>> +          <code><var>port_name</var>:<var>[sourc_ip]</var></code>. The source
>>> +          IP must be from the subnet of the monitored endpoint. It can be
>>> +          either an unused IP from the subnet, or an IP of one of the Logical
>>> +          Router Ports connected to the same switch.
>>>            Remote endpoint:
>>>            Specify :target_zone_name at the end of the above syntax to create
>>>            remote health checks in a specific zone.
>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>> index 30560f883..396cfa286 100644
>>> --- a/tests/ovn.at
>>> +++ b/tests/ovn.at
>>> @@ -26915,6 +26915,331 @@ ignored_tables=OFTABLE_CHK_LB_HAIRPIN,OFTABLE_CT_SNAT_HAIRPIN])
>>>  AT_CLEANUP
>>>  ])
>>>  
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks - LRP source IP - IPv4])
>>> +AT_KEYWORDS([lb])
>>> +ovn_start
>>> +
>>> +net_add n1
>>> +
>>> +sim_add hv1
>>> +as hv1
>>> +ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.1
>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>> +    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
>>> +    options:tx_pcap=hv1/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +check ovs-vsctl -- add-port br-int hv1-vif2 -- \
>>> +    set interface hv1-vif2 external-ids:iface-id=sw0-p2 \
>>> +    options:tx_pcap=hv1/vif2-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif2-rx.pcap \
>>> +    ofport-request=2
>>> +
>>> +sim_add hv2
>>> +as hv2
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.2
>>> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \
>>> +    set interface hv2-vif1 external-ids:iface-id=sw1-p1 \
>>> +    options:tx_pcap=hv2/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv2/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +
>>> +check ovn-nbctl ls-add sw0
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>>> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>>> +check ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>>> +
>>> +check ovn-nbctl ls-add sw1
>>> +check ovn-nbctl lsp-add sw1 sw1-p1
>>> +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3"
>>> +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3"
>>> +
>>> +# Create a logical router and attach both logical switches
>> Nit: comments should be sentences and end with period.  Applies to all
>> new comments added by this patch.
> 
> ack
> 
>>
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +
>>> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24
>>> +check ovn-nbctl lsp-add-router-port sw1 sw1-lr0 lr0-sw1
>>> +
>>> +# Create TCP LB and configure health checks using router port IPs as source
>>> +check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.1
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:20.0.0.3=sw1-p1:20.0.0.1
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="10.0.0.10\:80" \
>>> +          -- add Load_Balancer lb1 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +# Create UDP LB with health check using router port IP
>>> +check ovn-nbctl lb-add lb2 10.0.0.11:53 10.0.0.4:53 udp
>>> +check ovn-nbctl set load_balancer lb2 ip_port_mappings:10.0.0.4=sw0-p2:10.0.0.1
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="10.0.0.11\:53" \
>>> +          -- add Load_Balancer lb2 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb1
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb2
>>> +
>>> +OVN_POPULATE_ARP
>>> +wait_for_ports_up
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +wait_row_count Service_Monitor 3
>>> +
>>> +# Verify that Service_Monitor uses LRP MAC address, not svc_monitor_mac
>>> +lrp_sw0_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw0 mac | sed 's/"//g')
>>> +lrp_sw1_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw1 mac | sed 's/"//g')
>>> +
>>> +svc_mon_sw0_p1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p1)
>>> +svc_mon_sw0_p2_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p2)
>>> +svc_mon_sw1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw1-p1)
>>> +
>>> +AT_CHECK([test "$svc_mon_sw0_p1_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw0_p2_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw1_mac" = "$lrp_sw1_mac"])
>>> +
>>> +# Verify priority 160 flows exist for punting health check replies to controller
>>> +AT_CAPTURE_FILE([sbflows1])
>>> +ovn-sbctl dump-flows sw0 > sbflows1
>>> +AT_CAPTURE_FILE([sbflows2])
>>> +ovn-sbctl dump-flows sw1 > sbflows2
>>> +
>>> +## TCP Load balancer health check flows
>>> +# TCP backend - verify flow matches on inport, destination IP, and TCP source port
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip4.dst == 10.0.0.1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip4.dst == 20.0.0.1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +
>>> +# Verify flow also matches ICMP unreachable (type 3 for IPv4)
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip4.dst == 10.0.0.1" | \
>>> +          grep -q "icmp4.type == 3"])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip4.dst == 20.0.0.1" | \
>>> +          grep -q "icmp4.type == 3"])
>>> +
>>> +## UDP Load balancer health check flows
>>> +# UDP backend - verify flow uses UDP instead of TCP
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p2"' | grep "ip4.dst == 10.0.0.1" | \
>>> +          grep -q "udp.src == 53"])
>>> +
>>> +# Verify NO TCP match in the UDP backend flow
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p2"' | grep "tcp.src"], [1])
>>> +
>>> +# Verify NO ARP responder flows using svc_monitor_mac for the router port IPs
>>> +# The router port IPs should use the LRP MAC, not the global svc_monitor_mac
>>> +svc_monitor_mac=$(ovn-nbctl get NB_Global . options:svc_monitor_mac | sed 's/"//g')
>>> +AT_CHECK([test -n "$svc_monitor_mac"])
>>> +
>>> +AT_CHECK([grep "arp.tpa == 10.0.0.1" sbflows1 | \
>>> +          grep "arp.sha = $svc_monitor_mac"], [1])
>>> +AT_CHECK([grep "arp.tpa == 20.0.0.1" sbflows2 | \
>>> +          grep "arp.sha = $svc_monitor_mac"], [1])
>>> +
>>> +# Verify health check packets are sent with LRP MAC as source
>>> +lrp_sw0_mac_hex=$(echo $lrp_sw0_mac | sed 's/://g')
>>> +lrp_sw1_mac_hex=$(echo $lrp_sw1_mac | sed 's/://g')
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap | \
>>> +    grep "505400000003${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif2-tx.pcap | \
>>> +    grep "505400000004${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap | \
>>> +    grep "405400000003${lrp_sw1_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVN_CLEANUP([hv1], [hv2])
>>> +AT_CLEANUP
>>> +])
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks - LRP source IP - IPv6])
>>> +AT_KEYWORDS([lb])
>>> +ovn_start
>>> +
>>> +net_add n1
>>> +
>>> +sim_add hv1
>>> +as hv1
>>> +ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.1
>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>> +    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
>>> +    options:tx_pcap=hv1/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +check ovs-vsctl -- add-port br-int hv1-vif2 -- \
>>> +    set interface hv1-vif2 external-ids:iface-id=sw0-p2 \
>>> +    options:tx_pcap=hv1/vif2-tx.pcap \
>>> +    options:rxq_pcap=hv1/vif2-rx.pcap \
>>> +    ofport-request=2
>>> +
>>> +sim_add hv2
>>> +as hv2
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.2
>>> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \
>>> +    set interface hv2-vif1 external-ids:iface-id=sw1-p1 \
>>> +    options:tx_pcap=hv2/vif1-tx.pcap \
>>> +    options:rxq_pcap=hv2/vif1-rx.pcap \
>>> +    ofport-request=1
>>> +
>>> +check ovn-nbctl ls-add sw0
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 2001::3"
>>> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 2001::3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 2001::4"
>>> +check ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 2001::4"
>>> +
>>> +check ovn-nbctl ls-add sw1
>>> +check ovn-nbctl lsp-add sw1 sw1-p1
>>> +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 2002::3"
>>> +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 2002::3"
>>> +
>>> +# Create a logical router and attach both logical switches
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 2001::1/64
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +
>>> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 2002::1/64
>>> +check ovn-nbctl lsp-add-router-port sw1 sw1-lr0 lr0-sw1
>>> +
>>> +# Create TCP LB and configure health checks using router port IPs as source
>>> +check ovn-nbctl lb-add lb1 [[2001::a]]:80 [[2001::3]]:80,[[2002::3]]:80
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:\"[[2001::3]]\"=\"sw0-p1:[[2001::1]]\"
>>> +check ovn-nbctl set load_balancer lb1 ip_port_mappings:\"[[2002::3]]\"=\"sw1-p1:[[2002::1]]\"
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="\[\[2001\:\:a\]\]\:80" \
>>> +          -- add Load_Balancer lb1 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +# Create UDP LB with health check using router port IP
>>> +check ovn-nbctl lb-add lb2 [[2001::b]]:53 [[2001::4]]:53 udp
>>> +check ovn-nbctl set load_balancer lb2 ip_port_mappings:\"[[2001::4]]\"=\"sw0-p2:[[2001::1]]\"
>>> +
>>> +AT_CHECK([ovn-nbctl --wait=sb \
>>> +          -- --id=@hc create Load_Balancer_Health_Check vip="\[\[2001\:\:b\]\]\:53" \
>>> +          -- add Load_Balancer lb2 health_check @hc | uuidfilt], [0], [<0>
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb1
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb2
>>> +
>>> +OVN_POPULATE_ARP
>>> +wait_for_ports_up
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +wait_row_count Service_Monitor 3
>>> +
>>> +# Verify that Service_Monitor uses LRP MAC address
>>> +lrp_sw0_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw0 mac | sed 's/"//g')
>>> +lrp_sw1_mac=$(ovn-nbctl get Logical_Router_Port lr0-sw1 mac | sed 's/"//g')
>>> +
>>> +svc_mon_sw0_p1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p1)
>>> +svc_mon_sw0_p2_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw0-p2)
>>> +svc_mon_sw1_mac=$(ovn-sbctl --bare --columns src_mac find Service_Monitor logical_port=sw1-p1)
>>> +
>>> +AT_CHECK([test "$svc_mon_sw0_p1_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw0_p2_mac" = "$lrp_sw0_mac"])
>>> +AT_CHECK([test "$svc_mon_sw1_mac" = "$lrp_sw1_mac"])
>>> +
>>> +# Verify priority 160 flows exist for punting health check replies to controller
>>> +AT_CAPTURE_FILE([sbflows1])
>>> +ovn-sbctl dump-flows sw0 > sbflows1
>>> +AT_CAPTURE_FILE([sbflows2])
>>> +ovn-sbctl dump-flows sw1 > sbflows2
>>> +
>>> +## TCP Load balancer health check flows
>>> +# TCP backend - verify flow matches on inport, destination IP, and TCP source port
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip6.dst == 2001::1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip6.dst == 2002::1" | \
>>> +          grep -q "tcp.src == 80" ])
>>> +
>>> +# Verify flow also matches ICMP unreachable (type 1 for IPv6)
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p1"' | grep "ip6.dst == 2001::1" | \
>>> +          grep -q "icmp6.type == 1"])
>>> +
>>> +AT_CHECK([grep "priority=160" sbflows2 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw1-p1"' | grep "ip6.dst == 2002::1" | \
>>> +          grep -q "icmp6.type == 1"])
>>> +
>>> +## UDP Load balancer health check flows
>>> +# UDP backend - verify flow uses UDP instead of TCP
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p2"' | grep "ip6.dst == 2001::1" | \
>>> +          grep -q "udp.src == 53"])
>>> +
>>> +# Verify NO TCP match in the UDP backend flow
>>> +AT_CHECK([grep "priority=160" sbflows1 | grep "handle_svc_check" | \
>>> +          grep 'inport == "sw0-p2"' | grep "tcp.src"], [1])
>>> +
>>> +# Verify NO IPv6 ND responder flows using svc_monitor_mac for the router port IPs
>>> +# The router port IPs should use the LRP MAC, not the global svc_monitor_mac
>>> +svc_monitor_mac=$(ovn-nbctl get NB_Global . options:svc_monitor_mac | sed 's/"//g')
>>> +AT_CHECK([test -n "$svc_monitor_mac"])
>>> +
>>> +AT_CHECK([ovn-sbctl dump-flows sw0 | grep "nd.target == 2001::1" | \
>>> +          grep "nd.tll = $svc_monitor_mac"], [1])
>>> +AT_CHECK([ovn-sbctl dump-flows sw1 | grep "nd.target == 2002::1" | \
>>> +          grep "nd.tll = $svc_monitor_mac"], [1])
>>> +
>>> +# Verify health check packets are sent with LRP MAC as source
>>> +lrp_sw0_mac_hex=$(echo $lrp_sw0_mac | sed 's/://g')
>>> +lrp_sw1_mac_hex=$(echo $lrp_sw1_mac | sed 's/://g')
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap | \
>>> +    grep "505400000003${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif2-tx.pcap | \
>>> +    grep "505400000004${lrp_sw0_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVS_WAIT_UNTIL(
>>> +    [test 1 -le `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap | \
>>> +    grep "405400000003${lrp_sw1_mac_hex}" | wc -l`]
>>> +)
>>> +
>>> +OVN_CLEANUP([hv1], [hv2])
>>> +AT_CLEANUP
>>> +])
>>> +
>>>  OVN_FOR_EACH_NORTHD([
>>>  AT_SETUP([SCTP Load balancer health checks])
>>>  AT_KEYWORDS([lb sctp])
>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>> index 8af7cb058..7d9401805 100644
>>> --- a/tests/system-ovn.at
>>> +++ b/tests/system-ovn.at
>>> @@ -4422,6 +4422,406 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>>  AT_CLEANUP
>>>  ])
>>>  
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks with LRP IP - IPv4])
>>> +AT_KEYWORDS([lb])
>>> +
>>> +# This is a variant of the 'Load balancer health checks - IPv4' test
>>> +# that uses LRP IPs as a source IP for the health checks. As such, it
>>> +# doesn't go as deep into testing and focuses only on the health checks
>>> +# having proper status and the backend services being reachable.
>>> +#
>>> +# Topology:
>>> +#
>>> +#           LB VIP: 172.16.0.42
>>> +#         Backends: 10.0.0.3 10.0.0.4
>>> +#
>>> +#                                                                        .3 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p1    |
>>> +#                                      .42 |            |           |       +--------------+
>>> +#  +--------------+ .2                 .1  |            | .1        |
>>> +#  |   sw-ext-p1  | ------- sw-ext ------- |    lr0     | -------  sw0 10.0.0.0/24
>>> +#  +--------------+                        |            |           |
>>> +#                        172.16.0.0/24     |            |           |    .4 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p2    |
>>> +#                                                                           +--------------+
>>> +#
>>> +ovn_start
>>> +
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +ADD_BR([br-int])
>>> +
>>> +# Set external-ids in br-int needed for ovn-controller
>>
>> Nit: missing period at the end of the sentence.
> 
> ack
> 
>>
>>> +ovs-vsctl \
>>
>> Nit: missing check.
> 
> ack
> 
>>
>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>>> +
>>> +# Start ovn-controller
>>> +start_daemon ovn-controller
>>> +
>>> +# Create internal network with LB backends
>>> +check ovn-nbctl ls-add sw0
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>>> +
>>> +# Create external network with LB VIP
>>> +check ovn-nbctl ls-add sw-ext
>>> +
>>> +check ovn-nbctl lsp-add sw-ext sw-ext-p1
>>> +check ovn-nbctl lsp-set-addresses sw-ext-p1 "60:64:00:00:00:02 172.16.0.2"
>>> +
>>> +# Create a logical router and attach both logical switches
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl set logical_router lr0 options:chassis=hv1
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
>>> +check ovn-nbctl lrp-add lr0 lr0-sw-ext 00:00:00:00:ee:01 172.16.0.1/24
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +check ovn-nbctl lsp-add-router-port sw-ext sw-ext-lr0 lr0-sw-ext
>>> +
>>> +check ovn-nbctl lb-add lb1 172.16.0.42:80 10.0.0.3:80,10.0.0.4:80
>>> +
>>> +# Load balancer health check will be using LR IP as a source of the health check
>>> +check ovn-nbctl --wait=sb set load_balancer . ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.1
>>> +check ovn-nbctl --wait=sb set load_balancer . ip_port_mappings:10.0.0.4=sw0-p2:10.0.0.1
>>> +
>>> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
>>> +Load_Balancer_Health_Check vip="172.16.0.42\:80" -- add Load_Balancer . \
>>> +health_check @hc
>>> +
>>> +# LB is bound only to the LR. It shouldn't be necessary to bind it to the
>>> +# LS as well.
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb1
>>> +
>>
>> All three '--wait=sb' instances above can be removed.  We don't check
>> anything in the SB and anyway here we call 'ovn-nbctl --wait=hv sync'.
>>
>>> +OVN_POPULATE_ARP
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +ADD_NAMESPACES(sw0-p1)
>>> +ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \
>>> +         "10.0.0.1")
>>> +
>>> +ADD_NAMESPACES(sw0-p2)
>>> +ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \
>>> +         "10.0.0.1")
>>> +
>>> +ADD_NAMESPACES(sw-ext-p1)
>>> +ADD_VETH(sw-ext-p1, sw-ext-p1, br-int, "172.16.0.2/24", "60:64:00:00:00:02", \
>>> +         "172.16.0.1")
>>> +
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +# Start webservers in 'sw0-p1' and 'sw0-p2'.
>>> +OVS_START_L7([sw0-p1], [http])
>>> +sw0_p1_pid_file=`cat l7_pid_file`
>>> +OVS_START_L7([sw0-p2], [http])
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +for i in `seq 1 10`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://172.16.0.42 > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# Stop webserver in sw0-p1
>>> +kill `cat $sw0_p1_pid_file`
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +for i in `seq 11 20`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://172.16.0.42 > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# trigger port binding release and check if status changed to offline
>>> +check ovs-vsctl remove interface ovs-sw0-p2 external_ids iface-id
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +check ovs-vsctl set interface ovs-sw0-p2 external_ids:iface-id=sw0-p2
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 1 status=online
>>> +
>>> +# Remove TCP Load Balancer
>>> +check ovn-nbctl lb-del lb1
>>> +
>>> +# Create udp load balancer.
>>> +check ovn-nbctl lb-add lb2 172.16.0.42:53 10.0.0.3:53,10.0.0.4:53 udp
>>> +lb_udp=`ovn-nbctl lb-list | grep udp | awk '{print $1}'`
>>> +
>>> +check ovn-nbctl --wait=sb set load_balancer $lb_udp ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.1
>>> +check ovn-nbctl --wait=sb set load_balancer $lb_udp ip_port_mappings:10.0.0.4=sw0-p2:10.0.0.1
>>> +
>>> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
>>> +Load_Balancer_Health_Check vip="172.16.0.42\:53" -- add Load_Balancer $lb_udp \
>>> +health_check @hc
>>> +
>>
>> All three '--wait=sb' instances above can be removed.  We don't check
>> anything in the SB until here.  Maybe we should change the lr-lb-add
>> below to use --wait=hv?
> 
> ack. --wait=sb removed and only --wait=hv left at the end.
> 
>>
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb2
>>> +
>>> +# Wait until udp service_monitor rows are set to offline
>>> +wait_row_count Service_Monitor 2 status=offline protocol=udp
>>> +
>>> +# We need to use the "-k" argument because we keep receiving health check probes
>>> +NETNS_DAEMONIZE([sw0-p1], [nc -4 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp1.pid])
>>> +NETNS_DAEMONIZE([sw0-p2], [nc -4 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp2.pid])
>>> +
>>> +# UDP nc with "-k" is a bit tricky to clean up, especially on test
>>> +# failures, due to its child processes keeping the listening port open.
>>> +# If they are not cleaned up properly, the whole test cleanup hangs.
>>> +udp1_pid=$(cat udp1.pid)
>>> +udp2_pid=$(cat udp2.pid)
>>> +echo "pkill -P $udp1_pid && kill $udp1_pid" >> cleanup
>>> +echo "pkill -P $udp2_pid && kill $udp2_pid" >> cleanup
>>> +:> udp1.pid
>>> +:> udp2.pid
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -4 -u 172.16.0.42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_1; done'], [udp1_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_1],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +# Stop UDP service in sw0-p1
>>> +# Due to the way UDP nc with "-k" forks, we need to kill all
>>> +# child processes and then the parent.
>>> +NS_CHECK_EXEC([sw0-p1], [pkill -P $udp1_pid && kill $udp1_pid])
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -4 -u 172.16.0.42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_2; done'], [udp2_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_2],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +OVN_CLEANUP_CONTROLLER([hv1])
>>> +
>>> +OVN_CLEANUP_NORTHD
>>> +
>>> +as
>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>> +/connection dropped.*/d
>>> +/Service monitor not found.*/d
>>> +/handle service check: Unsupported protocol*/d"])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([Load balancer health checks with LRP IP - IPv6])
>>> +AT_KEYWORDS([lb])
>>> +
>>> +# This is a variant of the 'Load balancer health checks - IPv6' test
>>> +# that uses LRP IPs as a source IP for the health checks. As such, it
>>> +# doesn't go as deep into testing and focuses only on the health checks
>>> +# having proper status and the backend services being reachable.
>>> +#
>>> +# Topology:
>>> +#
>>> +#           LB VIP: fd20::42
>>> +#         Backends: fd11::3 fd11::4
>>> +#
>>> +#                                                                       ::3 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p1    |
>>> +#                                     ::42 |            |           |       +--------------+
>>> +#  +--------------+ ::2               ::1  |            | ::1       |
>>> +#  |   sw-ext-p1  | ------- sw-ext ------- |    lr0     | -------  sw0 fd11::/64
>>> +#  +--------------+                        |            |           |
>>> +#                        fd20::/64         |            |           |   ::4 +--------------+
>>> +#                                          +------------+           +------ |    sw0-p2    |
>>> +#                                                                           +--------------+
>>> +#
>>> +ovn_start
>>> +
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +ADD_BR([br-int])
>>> +
>>> +# Set external-ids in br-int needed for ovn-controller
>>
>> Nit: missing period at the end of the sentence.  This applies to more
>> comments in the new tests.
> 
> ack
> 
>>
>>> +ovs-vsctl \
>>
>> Nit: missing check.
> 
> ack
> 
>>
>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>>> +
>>> +# Start ovn-controller
>>> +start_daemon ovn-controller
>>> +
>>> +# Create internal network with LB backends
>>> +check ovn-nbctl ls-add sw0
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p1
>>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 fd11::3"
>>> +
>>> +check ovn-nbctl lsp-add sw0 sw0-p2
>>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 fd11::4"
>>> +
>>> +# Create external network with LB VIP
>>> +check ovn-nbctl ls-add sw-ext
>>> +
>>> +check ovn-nbctl lsp-add sw-ext sw-ext-p1
>>> +check ovn-nbctl lsp-set-addresses sw-ext-p1 "60:64:00:00:00:02 fd20::2"
>>> +
>>> +# Create a logical router and attach both logical switches
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl set logical_router lr0 options:chassis=hv1
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 fd11::1/64
>>> +check ovn-nbctl lrp-add lr0 lr0-sw-ext 00:00:00:00:ee:01 fd20::1/64
>>> +check ovn-nbctl lsp-add-router-port sw0 sw0-lr0 lr0-sw0
>>> +check ovn-nbctl lsp-add-router-port sw-ext sw-ext-lr0 lr0-sw-ext
>>> +
>>> +check ovn-nbctl lb-add lb1 [[fd20::42]]:80 [[fd11::3]]:80,[[fd11::4]]:80
>>> +
>>> +# Load balancer health check will be using LR IP as a source of the health check
>>> +check ovn-nbctl --wait=sb set load_balancer . ip_port_mappings:\"[[fd11::3]]\"=\"sw0-p1:[[fd11::1]]\"
>>> +check ovn-nbctl --wait=sb set load_balancer . ip_port_mappings:\"[[fd11::4]]\"=\"sw0-p2:[[fd11::1]]\"
>>> +
>>> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
>>> +Load_Balancer_Health_Check vip="\[\[fd20\:\:42\]\]\:80" -- add Load_Balancer . \
>>> +health_check @hc
>>> +
>>> +# LB is bound only to the LR. It shouldn't be necessary to bind it to the
>>> +# LS as well.
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb1
>>> +
>>
>> All three '--wait=sb' instances above can be removed.  We don't check
>> anything in the SB and anyway here we call 'ovn-nbctl --wait=hv sync'.
> 
> ack. --wait=sb removed and only --wait=hv left at the end.
> 
>>
>>> +OVN_POPULATE_ARP
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +ADD_NAMESPACES(sw0-p1)
>>> +ADD_VETH(sw0-p1, sw0-p1, br-int, "fd11::3/64", "50:54:00:00:00:03", \
>>> +         "fd11::1")
>>> +
>>> +ADD_NAMESPACES(sw0-p2)
>>> +ADD_VETH(sw0-p2, sw0-p2, br-int, "fd11::4/64", "50:54:00:00:00:04", \
>>> +         "fd11::1")
>>> +
>>> +ADD_NAMESPACES(sw-ext-p1)
>>> +ADD_VETH(sw-ext-p1, sw-ext-p1, br-int, "fd20::2/64", "60:64:00:00:00:02", \
>>> +         "fd20::1")
>>> +
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +# Start webservers in 'sw0-p1' and 'sw0-p2'.
>>> +OVS_START_L7([sw0-p1], [http6])
>>> +sw0_p1_pid_file=`cat l7_pid_file`
>>> +OVS_START_L7([sw0-p2], [http6])
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +for i in `seq 1 10`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://[[fd20::42]] > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# Stop webserver in sw0-p1
>>> +kill `cat $sw0_p1_pid_file`
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +for i in `seq 11 20`; do
>>> +    NS_CHECK_EXEC([sw-ext-p1], [curl --connect-timeout 5 http://[[fd20::42]] > curl$i.log 2>&1 ])
>>> +done
>>> +
>>> +# trigger port binding release and check if status changed to offline
>>> +check ovs-vsctl remove interface ovs-sw0-p2 external_ids iface-id
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 2 status=offline
>>> +
>>> +check ovs-vsctl set interface ovs-sw0-p2 external_ids:iface-id=sw0-p2
>>> +wait_row_count Service_Monitor 2
>>> +wait_row_count Service_Monitor 1 status=online
>>> +
>>> +# Remove TCP Load Balancer
>>> +check ovn-nbctl lb-del lb1
>>> +
>>> +# Create udp load balancer.
>>> +check ovn-nbctl lb-add lb2 [[fd20::42]]:53 [[fd11::3]]:53,[[fd11::4]]:53 udp
>>> +lb_udp=`ovn-nbctl lb-list | grep udp | awk '{print $1}'`
>>> +
>>> +check ovn-nbctl --wait=sb set load_balancer $lb_udp ip_port_mappings:\"[[fd11::3]]\"=\"sw0-p1:[[fd11::1]]\"
>>> +check ovn-nbctl --wait=sb set load_balancer $lb_udp ip_port_mappings:\"[[fd11::4]]\"=\"sw0-p2:[[fd11::1]]\"
>>> +
>>> +check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
>>> +Load_Balancer_Health_Check vip="\[\[fd20\:\:42\]\]\:53" -- add Load_Balancer $lb_udp \
>>> +health_check @hc
>>> +
>>
>> All three '--wait=sb' instances above can be removed.  As in the IPv4
>> test comment, maybe the following one should be a --wait=hv?
> 
> ack. --wait=sb removed and only --wait=hv left at the end.
> 
>>
>>> +check ovn-nbctl --wait=sb lr-lb-add lr0 lb2
>>> +
>>> +# Wait until udp service_monitor rows are set to offline
>>> +wait_row_count Service_Monitor 2 status=offline protocol=udp
>>> +
>>> +# We need to use the "-k" argument because we keep receiving health check probes
>>> +NETNS_DAEMONIZE([sw0-p1], [nc -6 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp1.pid])
>>> +NETNS_DAEMONIZE([sw0-p2], [nc -6 -k -l -u -p 53 -d 0.1 -c '/bin/cat'], [udp2.pid])
>>> +
>>> +# UDP nc with "-k" is a bit tricky to clean up, especially on test
>>> +# failures, due to its child processes keeping the listening port open.
>>> +# If they are not cleaned up properly, the whole test cleanup hangs.
>>> +udp1_pid=$(cat udp1.pid)
>>> +udp2_pid=$(cat udp2.pid)
>>> +echo "pkill -P $udp1_pid && kill $udp1_pid" >> cleanup
>>> +echo "pkill -P $udp2_pid && kill $udp2_pid" >> cleanup
>>> +:> udp1.pid
>>> +:> udp2.pid
>>> +
>>> +wait_row_count Service_Monitor 2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -6 -u fd20::42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_1; done'], [udp1_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_1],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +# Stop UDP service in sw0-p1
>>> +# Due to the way UDP nc with "-k" forks, we need to kill all
>>> +# child processes and then the parent.
>>> +NS_CHECK_EXEC([sw0-p1], [pkill -P $udp1_pid && kill $udp1_pid])
>>> +
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=offline
>>> +wait_row_count Service_Monitor 1 logical_port=sw0-p2 status=online
>>> +
>>> +NETNS_DAEMONIZE(sw-ext-p1, [nc -6 -u fd20::42 53 -d 0.1 -c 'for i in $(seq 1 5); do echo "udp_data $i"; read msg; echo "$msg" >> ./udp_data_2; done'], [udp2_client.pid])
>>> +
>>> +OVS_WAIT_FOR_OUTPUT([cat ./udp_data_2],  [0], [dnl
>>> +udp_data 1
>>> +udp_data 2
>>> +udp_data 3
>>> +udp_data 4
>>> +udp_data 5
>>> +])
>>> +
>>> +OVN_CLEANUP_CONTROLLER([hv1])
>>> +
>>> +OVN_CLEANUP_NORTHD
>>> +
>>> +as
>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>> +/connection dropped.*/d
>>> +/Service monitor not found.*/d
>>> +/handle service check: Unsupported protocol*/d"])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> +
>>>  OVN_FOR_EACH_NORTHD([
>>>  AT_SETUP([Load balancer health checks - IPv4])
>>>  AT_KEYWORDS([lb])
>>
>> Regards,
>> Dumitru
> 
> Thanks again for the review,
> Martin.

Thanks for following up!  I'll review v3 as soon as possible.

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
