On 7/15/21 8:40 PM, Mark Gray wrote:
> When adding two SB flows with the same vip but different protocols, only
> the most recent flow will be added due to the `if` statement:
> 
>             if (!sset_contains(&all_ips, lb_vip->vip_str)) {
>                 sset_add(&all_ips, lb_vip->vip_str);
> 
> This can cause unexpected behaviour when two load balancers with
> the same VIP (and different protocols) are added to a logical router.
> 
> This is due to the addition of "protocol" to the match in
> defrag table flows in a previous commit.
> 
> Add flow to defrag table for every load-balancer in order to resolve this. 
> Flows
> for Load Balancers without a port specified are added with priority 100. Flows
> for Load Balancers with a port specified are added with priority 110.
> 
> Add a test to check behaviour of Logical Flows when two load balancers
> of the same VIP are added.
> 
> This bug was discovered through the OVN CI (ovn-kubernetes.yml).
> 
> Fixes: 384a7c6237da ("northd: Refactor Logical Flows for routers with 
> DNAT/Load Balancers")
> Signed-off-by: Mark Gray <[email protected]>
> ---

Hi Mark,

Trying to test this in a realistic deployment I took a NB database from
a recent OpenShift (with ovn-kubernetes) scale test from a:
- ~300 nodes cluster (1 cluster router, 1 LS per node, 1 GW router per
  node, 1 external LS per node).
- ~1800 load balancers with a total of number of ~13000 VIPs distributed
  among them (with various numbers of backends), for different
  protocols.
- logical datapath groups enabled.

I got:

Baseline (before fix):
----------------------
  northd loop time:      10 s
  SB size:              261 MB
  SB size compacted:    255 MB
  number of lflows:  362390
After fix:
----------
  northd loop time:     9.5 s
  SB size:              261 MB
  SB size compacted:    255 MB
  number of lflows:  362391

So looks like the removal of the sset has another positive effect. :)

I left a few minor comments below.  With those addressed, feel free to
add my ack to the next revision:

Acked-by: Dumitru Ceara <[email protected]>

Regards,
Dumitru

> 
> Notes:
>     v2 - Address Mark M.'s comments
>          * Add flows to defrag table for every LB VIP and every LB VIP + proto
>            rather than just every unique LB VIP
>          * Change priority of flows in LB VIP + proto case in order to not 
> clash
>            with flows in LB VIP case.
>          * Add additional tests.
> 
>  northd/ovn-northd.8.xml | 35 +++++++++++-----
>  northd/ovn-northd.c     | 64 ++++++++++++++--------------
>  northd/ovn_northd.dl    |  8 ++--
>  tests/ovn-northd.at     | 93 +++++++++++++++++++++++++++++++----------
>  4 files changed, 130 insertions(+), 70 deletions(-)
> 
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index a20c5b90dc66..6599ba194fe5 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -2773,17 +2773,32 @@ icmp6 {
>      </p>
>  
>      <p>
> -      If load balancing rules with virtual IP addresses (and ports) are
> -      configured in <code>OVN_Northbound</code> database for a Gateway 
> router,
> +      If load balancing rules with only virtual IP addresses are configured 
> in
> +      <code>OVN_Northbound</code> database for a Gateway router,
>        a priority-100 flow is added for each configured virtual IP address
> -      <var>VIP</var>. For IPv4 <var>VIPs</var> the flow matches <code>ip
> -      &amp;&amp; ip4.dst == <var>VIP</var></code>.  For IPv6 <var>VIPs</var>,
> -      the flow matches <code>ip &amp;&amp; ip6.dst == <var>VIP</var></code>.
> -      The flow applies the action <code>reg0 = <var>VIP</var>
> -      &amp;&amp; ct_dnat;</code> to send IP packets to the
> -      connection tracker for packet de-fragmentation and to dnat the
> -      destination IP for the committed connection before sending it to the
> -      next table.
> +      <var>VIP</var>. For IPv4 <var>VIPs</var> the flow matches
> +      <code>ip &amp;&amp; ip4.dst == <var>VIP</var></code>.  For IPv6
> +      <var>VIPs</var>, the flow matches <code>ip &amp;&amp; ip6.dst ==
> +      <var>VIP</var></code>. The flow applies the action <code>reg0 =
> +      <var>VIP</var>; ct_dnat;</code>  (or <code>xxreg0</code> for IPv6) to
> +      send IP packets to the connection tracker for packet de-fragmentation 
> and
> +      to dnat the destination IP for the committed connection before sending 
> it
> +      to the next table.
> +    </p>
> +
> +    <p>
> +      If load balancing rules with virtual IP addresses and ports are
> +      configured in <code>OVN_Northbound</code> database for a Gateway 
> router,
> +      a priority-110 flow is added for each configured virtual IP address
> +      <var>VIP</var> and protocol <var>PROTO</var>. For IPv4 <var>VIPs</var>
> +      the flow matches <code>ip &amp;&amp; ip4.dst == <var>VIP</var> 
> &amp;&amp;
> +      <var>PROTO</var></code>. For IPv6 <var>VIPs</var>, the flow matches
> +      <code>ip &amp;&amp; ip6.dst == <var>VIP</var> &amp;&amp;
> +      <var>PROTO</var></code>. The flow applies the action <code>reg0 =
> +      <var>VIP</var>; ct_dnat;</code> (or <code>xxreg0</code> for IPv6) to 
> send
> +      IP packets to the connection tracker for packet de-fragmentation and to
> +      dnat the destination IP for the committed connection before sending it 
> to
> +      the next table.
>      </p>
>  
>      <p>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 999c3f482c29..2a11172d94b6 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -9207,8 +9207,6 @@ static void
>  build_lrouter_lb_flows(struct hmap *lflows, struct ovn_datapath *od,
>                         struct hmap *lbs, struct ds *match)
>  {
> -    /* A set to hold all ips that need defragmentation and tracking. */
> -    struct sset all_ips = SSET_INITIALIZER(&all_ips);
>  
>      for (int i = 0; i < od->nbr->n_load_balancer; i++) {
>          struct nbrec_load_balancer *nb_lb = od->nbr->load_balancer[i];
> @@ -9217,6 +9215,7 @@ build_lrouter_lb_flows(struct hmap *lflows, struct 
> ovn_datapath *od,
>          ovs_assert(lb);
>  
>          for (size_t j = 0; j < lb->n_vips; j++) {
> +            int prio = 100;
>              struct ovn_lb_vip *lb_vip = &lb->vips[j];
>  
>              bool is_udp = nullable_string_is_equal(nb_lb->protocol, "udp");
> @@ -9225,42 +9224,41 @@ build_lrouter_lb_flows(struct hmap *lflows, struct 
> ovn_datapath *od,
>              const char *proto = is_udp ? "udp" : is_sctp ? "sctp" : "tcp";
>  
>              struct ds defrag_actions = DS_EMPTY_INITIALIZER;
> -            if (!sset_contains(&all_ips, lb_vip->vip_str)) {
> -                sset_add(&all_ips, lb_vip->vip_str);
> -                /* If there are any load balancing rules, we should send
> -                 * the packet to conntrack for defragmentation and
> -                 * tracking.  This helps with two things.
> -                 *
> -                 * 1. With tracking, we can send only new connections to
> -                 *    pick a DNAT ip address from a group.
> -                 * 2. If there are L4 ports in load balancing rules, we
> -                 *    need the defragmentation to match on L4 ports. */
> -                ds_clear(match);
> -                ds_clear(&defrag_actions);
> -                if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) {
> -                    ds_put_format(match, "ip && ip4.dst == %s",
> -                                  lb_vip->vip_str);
> -                    ds_put_format(&defrag_actions, "reg0 = %s; ct_dnat;",
> -                                  lb_vip->vip_str);
> -                } else {
> -                    ds_put_format(match, "ip && ip6.dst == %s",
> -                                  lb_vip->vip_str);
> -                    ds_put_format(&defrag_actions, "xxreg0 = %s; ct_dnat;",
> -                                  lb_vip->vip_str);
> -                }
>  
> -                if (lb_vip->vip_port) {
> -                    ds_put_format(match, " && %s", proto);
> -                }
> -                ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_DEFRAG,
> -                                        100, ds_cstr(match),
> -                                        ds_cstr(&defrag_actions),
> -                                        &nb_lb->header_);
> +            /* If there are any load balancing rules, we should send
> +                * the packet to conntrack for defragmentation and
> +                * tracking.  This helps with two things.
> +                *
> +                * 1. With tracking, we can send only new connections to
> +                *    pick a DNAT ip address from a group.
> +                * 2. If there are L4 ports in load balancing rules, we
> +                *    need the defragmentation to match on L4 ports. */

Indentation needs to be fixed here.

> +            ds_clear(match);
> +            ds_clear(&defrag_actions);
> +            if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) {
> +                ds_put_format(match, "ip && ip4.dst == %s",
> +                                lb_vip->vip_str);
> +                ds_put_format(&defrag_actions, "reg0 = %s; ct_dnat;",
> +                                lb_vip->vip_str);
> +            } else {
> +                ds_put_format(match, "ip && ip6.dst == %s",
> +                                lb_vip->vip_str);
> +                ds_put_format(&defrag_actions, "xxreg0 = %s; ct_dnat;",
> +                                lb_vip->vip_str);

Here too (all the ds_put_format calls above).

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to