On Fri, Apr 1, 2022 at 1:55 PM Mark Michelson <[email protected]> wrote:
>
> Commit 4deac4509abbedd6ffaecf27eed01ddefccea40a introduced functionality
> in ovn-northd to use a single zone for SNAT and DNAT when possible. It
> accounts for certain situations, such as hairpinned traffic, where we
> still need a separate SNAT and DNAT zone.
>
> However, one situation it did not account for was when traffic traverses
> a logical router and is DNATted as a result of a load balancer, then
> when the traffic egresses the router, it needs to be SNATted. In this
> situation, we try to use the same CT zone for the SNAT as for the load
> balancer DNAT, which does not work.
>
> This commit fixes the issue by setting the DNAT_LOCAL bit in the initial
> stage of the egress pipeline if the packet was DNATted during the
> ingress pipeline. This ensures that when the SNAT stage is reached, a
> separate CT zone is used for SNAT.
>
> Signed-off-by: Mark Michelson <[email protected]>
Hi Mark,

Thanks for addressing the comments in v2.  The patch LGTM.

However, the system tests added in this patch are failing, both locally
and in CI:
https://github.com/ovsrobot/ovn/runs/5792688717?check_suite_focus=true

Also, there are a couple of checkpatch line-length warnings.

Numan

> ---
> v1 -> v2
>  * Switched from setting flows in SNAT stage to using CHK_DNAT_LOCAL
>    stage, as suggested by Numan. Updated the commit message to indicate
>    this.
>  * Removed ovn-northd test in favor of moving the checks to the existing
>    NAT configuration check test. This also resulted in fixing these
>    tests, since they were failing with my first patch applied.
>  * Added new flow documentation to ovn-northd.8.xml
>  * Added notes to both northd.c and ovn-northd.8.xml that this relies on
>    the ct_label.natted bit to carry its status from the ingress to egress
>    pipelines.
> ---
>  northd/northd.c         |  17 ++++++
>  northd/ovn-northd.8.xml |  16 ++++++
>  tests/ovn-northd.at     |   7 ++-
>  tests/system-ovn.at     | 117 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 156 insertions(+), 1 deletion(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index b264fb850..637cb8534 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -13388,6 +13388,23 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
>          }
>      }
>
> +    if (od->nbr->n_nat) {
> +        ds_clear(match);
> +        ds_put_cstr(match, "ip && ct_label.natted == 1");
> +        /* This flow is unique since it is in the egress pipeline but checks the
> +         * value of ct_label.natted, which would have been set in the ingress
> +         * pipeline. If a change is ever introduced that clears or otherwise
> +         * invalidates the ct_label between the ingress and egress pipelines, then
> +         * an alternative will need to be devised.
> +         */
> +        ds_clear(actions);
> +        ds_put_cstr(actions, REGBIT_DST_NAT_IP_LOCAL" = 1; next;");
> +        ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_CHECK_DNAT_LOCAL,
> +                                50, ds_cstr(match), ds_cstr(actions),
> +                                &od->nbr->header_);
> +
> +    }
> +
>      /* Handle force SNAT options set in the gateway router. */
>      if (od->is_gw_router) {
>          if (dnat_force_snat_ip) {
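To make the mechanism concrete: with this hunk, every router that has
NATs configured gets one extra flow in lr_out_chk_dnat_local.  Using the
addresses from the system test below, the intended packet walk is
roughly the following (a sketch only; the stage names and reg9[4] come
from the patch, while the zone labels A and B are just illustrative):

    # vm1 (173.0.1.2) pings the load balancer VIP 30.0.0.1:
    #
    # ingress, lr_in_dnat:            30.0.0.1 -> 172.16.0.102 (LB DNAT,
    #                                 zone A; ct_label.natted becomes 1)
    # egress, lr_out_chk_dnat_local:  ct_label.natted == 1, so reg9[4] = 1
    # egress, SNAT stage:             173.0.1.2 -> 172.16.0.101, committed
    #                                 to zone B because reg9[4] is set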
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 4784bff04..06ef74dcc 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -4310,6 +4310,22 @@ nd_ns {
>        </li>
>      </ul>
>
> +    <p>
> +      This table also installs a priority-50 logical flow for each logical
> +      router that has NATs configured on it. The flow has match
> +      <code>ip &amp;&amp; ct_label.natted == 1</code> and action
> +      <code>REGBIT_DST_NAT_IP_LOCAL = 1; next;</code>. This is intended
> +      to ensure that traffic that was DNATted locally will use a separate
> +      conntrack zone for SNAT if SNAT is required later in the egress
> +      pipeline. Note that this flow checks the value of
> +      <code>ct_label.natted</code>, which is set in the ingress pipeline.
> +      This means that ovn-northd assumes that this value is carried over
> +      from the ingress pipeline to the egress pipeline and is not altered
> +      or cleared. If conntrack label values are ever changed to be cleared
> +      between the ingress and egress pipelines, then the match conditions
> +      of this flow will be updated accordingly.
> +    </p>
> +
>      <h3>Egress Table 1: UNDNAT</h3>
>
>      <p>
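For readers who want the scenario without wading through the full test,
the trigger condenses to a few nbctl commands (same names and addresses
as the system test at the end of this patch; shown here purely as a
summary):

    # A load balancer VIP on router r1, backed by an address behind r2:
    ovn-nbctl lb-add r1_lb 30.0.0.1 172.16.0.102
    ovn-nbctl lr-lb-add r1 r1_lb
    # A dnat_and_snat entry on r1, so LB-DNATted traffic leaving r1 must
    # also be SNATted:
    ovn-nbctl lr-nat-add r1 dnat_and_snat 172.16.0.101 173.0.1.2 vm1 00:00:00:01:02:03

With that configuration, a ping from 173.0.1.2 to 30.0.0.1 is DNATted
and then SNATted on the same router, which is exactly the case the new
flow is meant to handle.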
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 60d91a771..3b3866fa2 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -4983,6 +4983,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
>
>  AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
>   table=? (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[[4]] = 0; next;)
> + table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ct_label.natted == 1), action=(reg9[[4]] = 1; next;)
>   table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ip4.dst == 172.168.0.10 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
>   table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ip4.dst == 172.168.0.20 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
>   table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ip4.dst == 172.168.0.30 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
> @@ -5057,6 +5058,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
>
>  AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
>   table=? (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[[4]] = 0; next;)
> + table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ct_label.natted == 1), action=(reg9[[4]] = 1; next;)
>   table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ip4.dst == 172.168.0.10 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
>   table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ip4.dst == 172.168.0.20 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
>   table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ip4.dst == 172.168.0.30 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
> @@ -5125,6 +5127,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
>
>  AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
>   table=? (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[[4]] = 0; next;)
> + table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ct_label.natted == 1), action=(reg9[[4]] = 1; next;)
>  ])
>
>  AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
> @@ -5184,6 +5187,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
>
>  AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
>   table=? (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[[4]] = 0; next;)
> + table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ct_label.natted == 1), action=(reg9[[4]] = 1; next;)
>  ])
>
>  AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
> @@ -5249,6 +5253,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
>
>  AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
>   table=? (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[[4]] = 0; next;)
> + table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ct_label.natted == 1), action=(reg9[[4]] = 1; next;)
>  ])
>
>  AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
> @@ -5327,6 +5332,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
>
>  AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
>   table=? (lr_out_chk_dnat_local), priority=0    , match=(1), action=(reg9[[4]] = 0; next;)
> + table=? (lr_out_chk_dnat_local), priority=50   , match=(ip && ct_label.natted == 1), action=(reg9[[4]] = 1; next;)
>  ])
>
>  AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
> @@ -5975,7 +5981,6 @@ AT_CHECK([grep -e "(lr_in_ip_routing   ).*outport" lr0flows | sed 's/table=../table=??/'], [0], [dnl
>  ])
>
>  AT_CLEANUP
> -])
>
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([check exclude-lb-vips-from-garp option])
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index d4c22e7e9..0b29becc8 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -7975,3 +7975,120 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([SNAT in separate zone from DNAT])
> +
> +CHECK_CONNTRACK()
> +CHECK_CONNTRACK_NAT()
> +ovn_start
> +OVS_TRAFFIC_VSWITCHD_START()
> +ADD_BR([br-int])
> +
> +# Set external-ids in br-int needed for ovn-controller
> +ovs-vsctl \
> +    -- set Open_vSwitch . external-ids:system-id=hv1 \
> +    -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +    -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +    -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +    -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
> +
> +# The goal of this test is to ensure that when traffic is first DNATted
> +# (by way of a load balancer), and then SNATted, the SNAT happens in a
> +# separate conntrack zone from the DNAT.
> +
> +start_daemon ovn-controller
> +
> +check ovn-nbctl ls-add public
> +
> +check ovn-nbctl lr-add r1
> +check ovn-nbctl lrp-add r1 r1_public 00:de:ad:ff:00:01 172.16.0.1/16
> +check ovn-nbctl lrp-add r1 r1_s1 00:de:ad:fe:00:01 173.0.1.1/24
> +check ovn-nbctl lrp-set-gateway-chassis r1_public hv1
> +
> +check ovn-nbctl lb-add r1_lb 30.0.0.1 172.16.0.102
> +check ovn-nbctl lr-lb-add r1 r1_lb
> +
> +check ovn-nbctl ls-add s1
> +check ovn-nbctl lsp-add s1 s1_r1
> +check ovn-nbctl lsp-set-type s1_r1 router
> +check ovn-nbctl lsp-set-addresses s1_r1 router
> +check ovn-nbctl lsp-set-options s1_r1 router-port=r1_s1
> +
> +check ovn-nbctl lsp-add s1 vm1
> +check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 173.0.1.2"
> +
> +check ovn-nbctl lsp-add public public_r1
> +check ovn-nbctl lsp-set-type public_r1 router
> +check ovn-nbctl lsp-set-addresses public_r1 router
> +check ovn-nbctl lsp-set-options public_r1 router-port=r1_public nat-addresses=router
> +
> +check ovn-nbctl lr-add r2
> +check ovn-nbctl lrp-add r2 r2_public 00:de:ad:ff:00:02 172.16.0.2/16
> +check ovn-nbctl lrp-add r2 r2_s2 00:de:ad:fe:00:02 173.0.2.1/24
> +check ovn-nbctl lr-nat-add r2 dnat_and_snat 172.16.0.102 173.0.2.2
> +check ovn-nbctl lrp-set-gateway-chassis r2_public hv1
> +
> +check ovn-nbctl ls-add s2
> +check ovn-nbctl lsp-add s2 s2_r2
> +check ovn-nbctl lsp-set-type s2_r2 router
> +check ovn-nbctl lsp-set-addresses s2_r2 router
> +check ovn-nbctl lsp-set-options s2_r2 router-port=r2_s2
> +
> +check ovn-nbctl lsp-add s2 vm2
> +check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 173.0.2.2"
> +
> +check ovn-nbctl lsp-add public public_r2
> +check ovn-nbctl lsp-set-type public_r2 router
> +check ovn-nbctl lsp-set-addresses public_r2 router
> +check ovn-nbctl lsp-set-options public_r2 router-port=r2_public nat-addresses=router
> +
> +ADD_NAMESPACES(vm1)
> +ADD_VETH(vm1, vm1, br-int, "173.0.1.2/24", "00:de:ad:01:00:01", \
> +         "173.0.1.1")
> +ADD_NAMESPACES(vm2)
> +ADD_VETH(vm2, vm2, br-int, "173.0.2.2/24", "00:de:ad:01:00:02", \
> +         "173.0.2.1")
> +
> +check ovn-nbctl lr-nat-add r1 dnat_and_snat 172.16.0.101 173.0.1.2 vm1 00:00:00:01:02:03
> +check ovn-nbctl --wait=hv sync
> +
> +# Next, make sure that a ping works as expected
> +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 30.0.0.1 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +# Finally, make sure that conntrack shows two separate zones being used for
> +# DNAT and SNAT
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | \
> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> +icmp,orig=(src=173.0.1.2,dst=30.0.0.1,id=<cleared>,type=8,code=0),reply=(src=172.16.0.102,dst=173.0.1.2,id=<cleared>,type=0,code=0),zone=<cleared>,labels=0x2
> +])
> +
> +# The final two entries appear identical here. That is because FORMAT_CT
> +# scrubs the zone numbers. In actuality, the zone numbers are different,
> +# which is why there are two entries.
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(172.16.0.102) | \
> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> +icmp,orig=(src=172.16.0.101,dst=172.16.0.102,id=<cleared>,type=8,code=0),reply=(src=173.0.2.2,dst=172.16.0.101,id=<cleared>,type=0,code=0),zone=<cleared>
> +icmp,orig=(src=173.0.1.2,dst=172.16.0.102,id=<cleared>,type=8,code=0),reply=(src=172.16.0.102,dst=172.16.0.101,id=<cleared>,type=0,code=0),zone=<cleared>
> +icmp,orig=(src=173.0.1.2,dst=172.16.0.102,id=<cleared>,type=8,code=0),reply=(src=172.16.0.102,dst=172.16.0.101,id=<cleared>,type=0,code=0),zone=<cleared>
> +])
> +
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +AT_CLEANUP
> +])
> --
> 2.31.1
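In case it helps with debugging the failures: the new test should be
runnable on its own with something like the following (assuming the
usual OVS-style system-test targets and an OVN tree with the ovs
submodule checked out; adjust the -k keyword as needed, since it may
match other NAT tests too):

    # Run the new system test against the kernel datapath
    # (check-system-userspace is the userspace equivalent):
    make check-kernel TESTSUITEFLAGS='-k snat'

    # Re-check the line-length warnings on the most recent commit:
    ./ovs/utilities/checkpatch.py -1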
