On Wed, May 10, 2023 at 4:05 AM Ales Musil <[email protected]> wrote:
>
> On Mon, May 8, 2023 at 5:19 PM Mark Michelson <[email protected]> wrote:
>
> > Current code always skips conntrack for traffic that ingresses or
> > egresses on a localnet port. However, this makes it impossible for
> > traffic to be load-balanced when it arrives on a localnet port.
> >
> > This patch allows for traffic to be load balanced on localnet ports by
> > making two changes:
> > * Localnet ports now have a conntrack zone assigned.
> > * When a load balancer is configured on a logical switch containing a
> >   localnet port, then conntrack is no longer skipped for traffic
> >   involving the localnet port.
> >
> > Co-authored by: Dumitru Ceara <[email protected]>
> > Signed-off-by: Dumitru Ceara <[email protected]>
> > Signed-off-by: Mark Michelson <[email protected]>
> > ---
> > v2 -> v3:
> >  * We now unconditionally skip localnet ports for ACL-related conntrack.
> >  * There is a CT dump check in the system test.
> > ---
> >  controller/ovn-controller.c | 13 +++---
> >  northd/northd.c             | 18 +++++---
> >  tests/ovn-northd.at         | 60 +++++++++++++++++++++++++
> >  tests/system-ovn.at         | 90 +++++++++++++++++++++++++++++++++++++
> >  4 files changed, 170 insertions(+), 11 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index de90025f0..e22cd1eb9 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -708,7 +708,7 @@ get_snat_ct_zone(const struct sbrec_datapath_binding
> > *dp)
> >  }
> >
> >  static void
> > -update_ct_zones(const struct shash *binding_lports,
> > +update_ct_zones(const struct sset *local_lports,
> >                  const struct hmap *local_datapaths,
> >                  struct simap *ct_zones, unsigned long *ct_zone_bitmap,
> >                  struct shash *pending_ct_zones)
> > @@ -721,9 +721,9 @@ update_ct_zones(const struct shash *binding_lports,
> >      unsigned long unreq_snat_zones_map[BITMAP_N_LONGS(MAX_CT_ZONES)];
> >      struct simap unreq_snat_zones = SIMAP_INITIALIZER(&unreq_snat_zones);
> >
> > -    struct shash_node *shash_node;
> > -    SHASH_FOR_EACH (shash_node, binding_lports) {
> > -        sset_add(&all_users, shash_node->name);
> > +    const char *local_lport;
> > +    SSET_FOR_EACH (local_lport, local_lports) {
> > +        sset_add(&all_users, local_lport);
> >      }
> >
> >      /* Local patched datapath (gateway routers) need zones assigned. */
> > @@ -2377,7 +2377,7 @@ en_ct_zones_run(struct engine_node *node, void *data)
> >          EN_OVSDB_GET(engine_get_input("OVS_bridge", node));
> >
> >      restore_ct_zones(bridge_table, ovs_table, ct_zones_data);
> > -    update_ct_zones(&rt_data->lbinding_data.lports,
> > &rt_data->local_datapaths,
> > +    update_ct_zones(&rt_data->local_lports, &rt_data->local_datapaths,
> >                      &ct_zones_data->current, ct_zones_data->bitmap,
> >                      &ct_zones_data->pending);
> >
> > @@ -2467,7 +2467,8 @@ ct_zones_runtime_data_handler(struct engine_node
> > *node, void *data)
> >          SHASH_FOR_EACH (shash_node, &tdp->lports) {
> >              struct tracked_lport *t_lport = shash_node->data;
> >              if (strcmp(t_lport->pb->type, "")
> > -                && strcmp(t_lport->pb->type, "localport")) {
> > +                && strcmp(t_lport->pb->type, "localport")
> > +                && strcmp(t_lport->pb->type, "localnet")) {
> >                  /* We allocate zone-id's only to VIF and localport
> > lports. */

I suppose you can update the comment mentioning localnet ports too.


> >                  continue;
> >              }
> > diff --git a/northd/northd.c b/northd/northd.c
> > index b69fcf321..41d0f5994 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -5968,7 +5968,8 @@ build_pre_acls(struct ovn_datapath *od, const struct
> > hmap *port_groups,
> >          }
> >          for (size_t i = 0; i < od->n_localnet_ports; i++) {
> >              skip_port_from_conntrack(od, od->localnet_ports[i],
> > -                                     S_SWITCH_IN_PRE_ACL,
> > S_SWITCH_OUT_PRE_ACL,
> > +                                     S_SWITCH_IN_PRE_ACL,
> > +                                     S_SWITCH_OUT_PRE_ACL,
> >                                       110, lflows);
> >          }
> >
> > @@ -6137,10 +6138,17 @@ build_pre_lb(struct ovn_datapath *od, const struct
> > shash *meter_groups,
> >                                   S_SWITCH_IN_PRE_LB, S_SWITCH_OUT_PRE_LB,
> >                                   110, lflows);
> >      }
> > -    for (size_t i = 0; i < od->n_localnet_ports; i++) {
> > -        skip_port_from_conntrack(od, od->localnet_ports[i],
> > -                                 S_SWITCH_IN_PRE_LB, S_SWITCH_OUT_PRE_LB,
> > -                                 110, lflows);
> > +    /* Localnet ports have no need for going through conntrack, unless
> > +     * the logical switch has a load balancer. Then, conntrack is
> > necessary
> > +     * so that traffic arriving via the localnet port can be load
> > +     * balanced.
> > +     */
> > +    if (!od->has_lb_vip) {
> > +        for (size_t i = 0; i < od->n_localnet_ports; i++) {
> > +            skip_port_from_conntrack(od, od->localnet_ports[i],
> > +                                     S_SWITCH_IN_PRE_LB,
> > S_SWITCH_OUT_PRE_LB,
> > +                                     110, lflows);
> > +        }
> >      }
> >
> >      /* Do not sent statless flows via conntrack */
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 047b8b6ad..850bc25a4 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -8975,3 +8975,63 @@ mac_binding_timestamp: true
> >
> >  AT_CLEANUP
> >  ])
> > +
> > +OVN_FOR_EACH_NORTHD_NO_HV([
> > +AT_SETUP([Localnet ports on LS with LB])
> > +ovn_start
> > +# In the past, traffic arriving on localnet ports has skipped conntrack.
> > +# This test ensures that we still skip conntrack for localnet ports,
> > +# *except* for the case where the logical switch has a load balancer
> > +# configured. In this case, the localnet port will not skip conntrack,
> > +# allowing for traffic to be load balanced on the localnet port.
> > +
> > +check ovn-nbctl ls-add sw
> > +check ovn-nbctl lsp-add sw sw-ln
> > +check ovn-nbctl lsp-set-type sw-ln localnet
> > +check ovn-nbctl lsp-set-addresses sw-ln unknown
> > +check ovn-nbctl --wait=sb sync
> > +
> > +# Since this test is only concerned with logical flows, we don't need to
> > +# configure anything else that we normally would with regards to localnet
> > +# ports
> > +
> > +
> > +# First, ensure that conntrack is skipped for the localnet port since
> > there
> > +# isn't a load balancer configured.
> > +
> > +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_pre_lb | grep priority=110
> > | grep sw-ln | sed 's/table=../table=??/'], [0], [dnl
> > +  table=??(ls_in_pre_lb       ), priority=110  , match=(ip && inport ==
> > "sw-ln"), action=(next;)
> > +])
> > +
> > +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | grep
> > priority=110 | grep sw-ln | sed 's/table=../table=??/'], [0], [dnl
> > +  table=??(ls_out_pre_lb      ), priority=110  , match=(ip && outport ==
> > "sw-ln"), action=(ct_clear; next;)
> > +])
> > +
> > +# Now add a load balancer and ensure that we no longer are skipping
> > conntrack
> > +# for the localnet port
> > +
> > +check ovn-nbctl lb-add lb 10.0.0.1:80 10.0.0.100:8080 tcp
> > +check ovn-nbctl ls-lb-add sw lb
> > +check ovn-nbctl --wait=sb sync
> > +
> > +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_pre_lb | grep priority=110
> > | grep sw-ln | sed 's/table=../table=??/'], [0], [dnl
> > +])
> > +
> > +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | grep
> > priority=110 | grep sw-ln | sed 's/table=../table=??/'], [0], [dnl
> > +])
> > +
> > +# And ensure that removing the load balancer from the switch results in
> > skipping
> > +# conntrack again
> > +check ovn-nbctl ls-lb-del sw lb
> > +check ovn-nbctl --wait=sb sync
> > +
> > +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_pre_lb | grep priority=110
> > | grep sw-ln | sed 's/table=../table=??/'], [0], [dnl
> > +  table=??(ls_in_pre_lb       ), priority=110  , match=(ip && inport ==
> > "sw-ln"), action=(next;)
> > +])
> > +
> > +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_out_pre_lb | grep
> > priority=110 | grep sw-ln | sed 's/table=../table=??/'], [0], [dnl
> > +  table=??(ls_out_pre_lb      ), priority=110  , match=(ip && outport ==
> > "sw-ln"), action=(ct_clear; next;)
> > +])
> > +
> > +AT_CLEANUP
> > +])
> > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > index df0dd99fb..0c5882055 100644
> > --- a/tests/system-ovn.at
> > +++ b/tests/system-ovn.at
> > @@ -10693,6 +10693,7 @@ ADD_BR([br-int])
> >
> >  # Set external-ids in br-int needed for ovn-controller
> >  check ovs-vsctl \
> > +ovs-vsctl \

Looks like this extra - "ovs-vsctl" is a typo and because of which
this system test is failing.


> >          -- set Open_vSwitch . external-ids:system-id=hv1 \
> >          -- set Open_vSwitch .
> > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> >          -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> > @@ -11009,3 +11010,92 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port
> > patch-.*/d
> >  AT_CLEANUP
> >  ])
> >
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([load balancer with localnet port])
> > +CHECK_CONNTRACK()
> > +CHECK_CONNTRACK_NAT()
> > +ovn_start
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +ADD_BR([br-int])
> > +ADD_BR([br-phys], [set Bridge br-phys fail-mode=standalone])
> > +
> > +# Set external-ids in br-int needed for ovn-controller
> > +ovs-vsctl \
> > +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> > +        -- set Open_vSwitch .
> > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> > +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> > +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> > +        -- set bridge br-int fail-mode=secure
> > other-config:disable-in-band=true
> > +
> > +start_daemon ovn-controller
> > +
> > +check ovn-nbctl lr-add ro
> > +check ovn-nbctl lrp-add ro ro-sw 00:00:00:00:00:01 192.168.0.1/24
> > +check ovn-nbctl lrp-add ro ro-pub 00:00:00:00:01:01 10.0.0.1/24
> > +
> > +check ovn-nbctl ls-add sw
> > +check ovn-nbctl lsp-add sw sw-vm1 \
> > +    -- lsp-set-addresses sw-vm1 "00:00:00:00:00:02 192.168.0.2"
> > +check ovn-nbctl lsp-add sw sw-ro \
> > +    -- lsp-set-type sw-ro router \
> > +    -- lsp-set-addresses sw-ro router \
> > +    -- lsp-set-options sw-ro router-port=ro-sw
> > +
> > +check ovn-nbctl ls-add pub
> > +check ovn-nbctl lsp-add pub sw-ln \
> > +    -- lsp-set-type sw-ln localnet \
> > +    -- lsp-set-addresses sw-ln unknown \
> > +    -- lsp-set-options sw-ln network_name=phys
> > +check ovn-nbctl lsp-add pub pub-ro \
> > +    -- lsp-set-type pub-ro router \
> > +    -- lsp-set-addresses pub-ro router \
> > +    -- lsp-set-options pub-ro router-port=ro-pub
> > +
> > +check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> > +
> > +ADD_NAMESPACES(sw-vm1)
> > +ADD_VETH(sw-vm1, sw-vm1, br-int, "192.168.0.2/24", "00:00:00:00:00:02", \
> > +         "192.168.0.1")
> > +
> > +ADD_NAMESPACES(ln)
> > +ADD_VETH(ln, ln, br-phys, "10.0.0.2/24", "00:00:00:00:01:02", \
> > +         "10.0.0.1")
> > +
> > +# We have the basic network set up. Now let's add a load balancer
> > +# on the "pub" logical switch.
> > +
> > +check ovn-nbctl lb-add ln-lb 172.16.0.1:80 192.168.0.2:80 tcp
> > +check ovn-nbctl ls-lb-add pub ln-lb
> > +check ovn-nbctl --wait=hv sync
> > +
> > +# Add a route so that the localnet port can reach the load balancer
> > +# VIP.
> > +NS_CHECK_EXEC([ln], [ip route add 172.16.0.1 via 10.0.0.1])
> > +NS_CHECK_EXEC([ln], [ip route add 192.168.0.0/24 via 10.0.0.1])
> > +
> > +OVS_START_L7([sw-vm1], [http])
> > +
> > +NS_CHECK_EXEC([ln], [wget 172.16.0.1 -t 5 -T 1 --retry-connrefused -v -o
> > wget.log])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(172.16.0.1) | \
> > +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> >
> > +tcp,orig=(src=10.0.0.2,dst=172.16.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.0.2,dst=10.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
> > +])
> > +
> > +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> > +
> > +as ovn-sb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as ovn-nb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as northd
> > +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> > +
> > +as
> > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> > +/connection dropped.*/d"])
> > +
> > +AT_CLEANUP
> > +])

I did these small changes myself and triggered a CI run [1].  Looks
like the added system test is failing with an extra conntrack entry

[1] - 
https://github.com/numansiddique/ovn/actions/runs/4997805358/jobs/8952547838


---
...
10.0.0.2 - - [17/May/2023 00:24:42] "GET / HTTP/1.1" 200 -
./system-ovn.at:11180: ovs-appctl dpctl/dump-conntrack | grep -F
"dst=172.16.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g'
| sort | uniq | \
sed -e 's/zone=[0-9]*/zone=<cleared>/'
--- - 2023-05-17 00:24:42.878622448 +0000
+++ /workspace/ovn-tmp/tests/system-kmod-testsuite.dir/at-groups/297/stdout
2023-05-17 00:24:42.873207946 +0000
@@ -1,2 +1,3 @@
 
tcp,orig=(src=10.0.0.2,dst=172.16.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.0.2,dst=10.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.0.101,dst=172.16.0.102,sport=<cleared>,dport=<cleared>),reply=(src=173.0.2.2,dst=172.16.0.101,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
---

If the tests passed  I'd have applied this patch myself.  Can you
please take a look at the test failure ?

Locally the system test passed.  So it's possible it's a flake or
maybe in CI run some external traffic packet entered the localnet
port.


With these comments addressed :

Acked-by: Numan Siddique <[email protected]>

Numan

> > --
> > 2.39.2
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
> Looks good to me, thanks.
>
> Acked-by: Ales Musil <[email protected]>
>
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA <https://www.redhat.com>
>
> [email protected]    IM: amusil
> <https://red.ht/sig>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to