On Mon, Sep 22, 2025 at 4:31 PM <martin.kal...@canonical.com> wrote:
> On Mon, 2025-09-22 at 11:03 +0200, Ales Musil wrote: > > > > > > On Mon, Sep 22, 2025 at 10:51 AM <martin.kal...@canonical.com> wrote: > > > On Mon, 2025-09-22 at 08:59 +0200, Ales Musil wrote: > > > > > > > > > > > > On Fri, Sep 19, 2025 at 5:04 PM Martin Kalcok > > > > <martin.kal...@canonical.com> wrote: > > > > > Commit 40136a2f2c84 introduced the ability to directly reach > > > > > networks behind > > > > > SNAT and DNAT_AND_SNAT on Distributed Routers, matching the > > > > > behavior > > > > > of Gateway Routers. This was achieved by committing the > > > > > incoming > > > > > traffic into > > > > > the SNAT TC zone, which allowed the router to SNAT only traffic > > > > > that originated > > > > > from inside the SNATed network. > > > > > An unintended consequence was that incoming traffic destined > > > > > for > > > > > the external > > > > > IP of a DNAT_AND_SNAT rule would also be committed to the SNAT > > > > > CT > > > > > zone, > > > > > resulting in an unnecessary entry, lingering in the table until > > > > > it > > > > > expired. > > > > > > > > > > This change attempts to fix that by handling traffic matching > > > > > DNAT_AND_SNAT > > > > > rule with slightly higher priority than the simple SNAT, and > > > > > committing it to > > > > > DNAT CT zone instead. > > > > > > > > > > Tests for the connectivity are also improved. Previously the > > > > > system > > > > > tests > > > > > 'DNAT and SNAT on distributed router - N/S' used: > > > > > * 'ping' for ICMP traffic > > > > > * 'nc -u -z' for UDP traffic > > > > > * 'nc --send-only' for TCP traffic > > > > > > > > > > It was especially the TCP traffic test that was problematic > > > > > because > > > > > it did not > > > > > detect errors that would occur after the initial TCP handshake. > > > > > This patch proposes a new macro NS_CHECK_CONNECTIVITY that > > > > > validates the > > > > > connectivity more thoroughly by exchanging multiple messages > > > > > over > > > > > the TCP > > > > > and UDP to make sure that the traffic can flow. > > > > > > > > > > Fixes: 40136a2f2c84 ("northd: Fix direct access to SNAT > > > > > network.") > > > > > Signed-off-by: Martin Kalcok <martin.kal...@canonical.com> > > > > > --- > > > > > > > > > > > > > > > > > Hi Martin, > > > > > > > > thank you for the patch, I have a few comments down below. > > > > > > > > > > Thank your for the review Ales. > > > > > > > > northd/northd.c | 64 +++++++++++----- > > > > > tests/ovn-northd.at | 9 +++ > > > > > tests/system-common-macros.at | 42 +++++++++++ > > > > > tests/system-ovn.at | 138 +++++++++----------------- > > > > > ---- > > > > > ---- > > > > > 4 files changed, 129 insertions(+), 124 deletions(-) > > > > > > > > > > diff --git a/northd/northd.c b/northd/northd.c > > > > > index 8b5413ef3..fcd07f442 100644 > > > > > --- a/northd/northd.c > > > > > +++ b/northd/northd.c > > > > > @@ -16611,28 +16611,52 @@ build_lrouter_out_snat_flow(struct > > > > > lflow_table *lflows, > > > > > * properly tracked so we can decide whether to perform > > > > > SNAT > > > > > on traffic > > > > > * exiting the network. */ > > > > > if (features->ct_commit_to_zone && features->ct_next_zone > > > > > && > > > > > - nat_entry->type == SNAT && !od->is_gw_router && > > > > > !commit_all) { > > > > > - /* For traffic that comes from SNAT network, initiate > > > > > CT > > > > > state before > > > > > - * entering S_ROUTER_OUT_SNAT to allow matching on > > > > > various > > > > > CT states. > > > > > - */ > > > > > - ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_UNDNAT, > > > > > 70, > > > > > - ds_cstr(match), "ct_next(snat);", > > > > > - lflow_ref); > > > > > - > > > > > - build_lrouter_out_snat_match(lflows, od, nat, match, > > > > > - distributed_nat, > > > > > cidr_bits, > > > > > is_v6, > > > > > - l3dgw_port, lflow_ref, > > > > > true); > > > > > + !od->is_gw_router && !commit_all) { > > > > > + /* Traffic to/from hosts behind SNAT is tracked > > > > > through > > > > > the > > > > > + * SNAT CT zone.*/ > > > > > + if (nat_entry->type == SNAT) { > > > > > + /* For traffic that comes from the SNAT network, > > > > > initiate CT state > > > > > + * from the SNAT zone, before entering > > > > > S_ROUTER_OUT_SNAT to allow > > > > > + * matching on various CT states.*/ > > > > > + ovn_lflow_add(lflows, od, > > > > > S_ROUTER_OUT_POST_UNDNAT, > > > > > 70, > > > > > + ds_cstr(match), "ct_next(snat);", > > > > > + lflow_ref); > > > > > > > > > > - /* New traffic that goes into SNAT network is > > > > > committed to > > > > > CT to avoid > > > > > - * SNAT-ing replies.*/ > > > > > - ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, priority, > > > > > - ds_cstr(match), "ct_snat;", > > > > > - lflow_ref); > > > > > + /* New traffic that goes into the SNAT network is > > > > > committed to SNAT > > > > > + * CT zone to avoid SNAT-ing replies.*/ > > > > > + build_lrouter_out_snat_match(lflows, od, nat, > > > > > match, > > > > > + distributed_nat, > > > > > cidr_bits, is_v6, > > > > > + l3dgw_port, > > > > > lflow_ref, > > > > > true); > > > > > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, > > > > > priority, > > > > > + ds_cstr(match), "ct_snat;", > > > > > + lflow_ref); > > > > > + ds_put_cstr(match, " && ct.new"); > > > > > > > > > > > > > > > > > Shouldn't we also commit to the DNAT if it's new only? I don't > > > > see > > > > the match > > > > down below. > > > > > > Yes, thank you. I guess potential issues were paritally sidestepped > > > by > > > the match already containing `(!ct.trk || !ct.rpl)` from the > > > ct_dnat > > > flow above. I'll update it. > > > > > > > > > > > > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, > > > > > priority, > > > > > + ds_cstr(match), > > > > > "ct_commit_to_zone(snat);", > > > > > + lflow_ref); > > > > > + /* Traffic to/from hosts behind DNAT_AND_SNAT is > > > > > tracked > > > > > through the > > > > > + * DNAT CT zone with slightly higher priority flows.*/ > > > > > + } else { > > > > > + /* For traffic that comes from DNAT_AND_SNAT > > > > > hosts, > > > > > initiate CT > > > > > + * state from the DNAT zone, before entering > > > > > S_ROUTER_OUT_SNAT to > > > > > + * allow matching on various CT states.*/ > > > > > + ovn_lflow_add(lflows, od, > > > > > S_ROUTER_OUT_POST_UNDNAT, > > > > > 75, > > > > > + ds_cstr(match), "ct_next(dnat);", > > > > > + lflow_ref); > > > > > > > > > > - ds_put_cstr(match, " && ct.new"); > > > > > - ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, > > > > > priority, > > > > > - ds_cstr(match), > > > > > "ct_commit_to_zone(snat);", > > > > > - lflow_ref); > > > > > + /* New traffic addressed to the logical IP of the > > > > > DNAT_AND_SNAT > > > > > + * rule is committed to DNAT CT zone to avoid > > > > > SNAT-ing > > > > > replies.*/ > > > > > + build_lrouter_out_snat_match(lflows, od, nat, > > > > > match, > > > > > + distributed_nat, > > > > > cidr_bits, is_v6, > > > > > + l3dgw_port, > > > > > lflow_ref, > > > > > true); > > > > > + ds_put_cstr(match, " && (!ct.trk || !ct.rpl)"); > > > > > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, > > > > > priority > > > > > + 5, > > > > > + ds_cstr(match), "ct_dnat;", > > > > > + lflow_ref); > > > > > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, > > > > > priority + 5, > > > > > + ds_cstr(match), > > > > > "ct_commit_to_zone(dnat);", > > > > > + lflow_ref); > > > > > > > > > > > > > > > > > Those two block seem to be identical except for the priority and > > > > the zone, we could just have a single block instead of the > > > > duplication. > > > > > > Yeah, I was debating myself whether it can be streamlined more but > > > with > > > the small discrepancy in the match rule for the `ct_dnat` action as > > > well, I went with the more verbose approach. I'll give this another > > > go > > > and try to flatten it. > > > > > > > Also note that the else covers also pure DNAT nats which should > > > > be > > > > skipped there. > > > > > > There's an early return at the start of this function if the nat- > > > >type > > > isn't (SNAT || SNAT_AND_DNAT). But perhaps there's a value in > > > making it > > > more explicit in this if/else block as well. > > > > > > > > > Ah you are right, in that case there is no need to add anything > > additional. > > ack. > > > > > > > > > Thanks again, > > > Martin. > > > > > > > > > > > > + } > > > > > } > > > > > } > > > > > > > > > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > > > > > index c9e998129..2836daafe 100644 > > > > > --- a/tests/ovn-northd.at > > > > > +++ b/tests/ovn-northd.at > > > > > @@ -1321,10 +1321,12 @@ AT_CHECK([grep -e "lr_out_snat" > > > > > drflows3 | > > > > > ovn_strip_lflows], [0], [dnl > > > > > table=??(lr_out_snat ), priority=0 , match=(1), > > > > > action=(next;) > > > > > table=??(lr_out_snat ), priority=120 , > > > > > match=(nd_ns), > > > > > action=(next;) > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.src == 50.0.0.11 && outport == "DR-S1" && > > > > > is_chassis_resident("cr-DR-S1") && ip4.dst == $allowed_range && > > > > > (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.2);) > > > > > + table=??(lr_out_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 50.0.0.11 && inport == "DR-S1" && > > > > > is_chassis_resident("cr-DR-S1") && ip4.src == $allowed_range && > > > > > (!ct.trk || !ct.rpl)), action=(ct_dnat;) > > > > > ]) > > > > > > > > > > AT_CHECK([grep -e "lr_out_post_snat" drflows3 | > > > > > ovn_strip_lflows], > > > > > [0], [dnl > > > > > table=??(lr_out_post_snat ), priority=0 , match=(1), > > > > > action=(next;) > > > > > + table=??(lr_out_post_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 50.0.0.11 && inport == "DR-S1" && > > > > > is_chassis_resident("cr-DR-S1") && ip4.src == $allowed_range && > > > > > (!ct.trk || !ct.rpl)), action=(ct_commit_to_zone(dnat);) > > > > > ]) > > > > > > > > > > AT_CHECK([grep -e "lr_out_snat" crflows3 | ovn_strip_lflows], > > > > > [0], > > > > > [dnl > > > > > @@ -1357,6 +1359,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows4 > > > > > | > > > > > ovn_strip_lflows], [0], [dnl > > > > > table=??(lr_out_snat ), priority=120 , > > > > > match=(nd_ns), > > > > > action=(next;) > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.src == 50.0.0.11 && outport == "DR-S1" && > > > > > is_chassis_resident("cr-DR-S1") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_snat(172.16.1.2);) > > > > > table=??(lr_out_snat ), priority=163 , match=(ip && > > > > > ip4.src == 50.0.0.11 && outport == "DR-S1" && > > > > > is_chassis_resident("cr-DR-S1") && ip4.dst == > > > > > $disallowed_range), > > > > > action=(next;) > > > > > + table=??(lr_out_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 50.0.0.11 && inport == "DR-S1" && > > > > > is_chassis_resident("cr-DR-S1") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_dnat;) > > > > > ]) > > > > > > > > > > AT_CHECK([grep -e "lr_out_snat" crflows4 | ovn_strip_lflows], > > > > > [0], > > > > > [dnl > > > > > @@ -5999,6 +6002,7 @@ AT_CHECK([grep "lr_out_post_undnat" > > > > > lr0flows > > > > > | ovn_strip_lflows], [0], [dnl > > > > > table=??(lr_out_post_undnat ), priority=0 , match=(1), > > > > > action=(next;) > > > > > table=??(lr_out_post_undnat ), priority=70 , match=(ip && > > > > > ip4.src == 10.0.0.0/24 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_next(snat);) > > > > > table=??(lr_out_post_undnat ), priority=70 , match=(ip && > > > > > ip4.src == 10.0.0.10 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_next(snat);) > > > > > + table=??(lr_out_post_undnat ), priority=75 , match=(ip && > > > > > ip4.src == 10.0.0.3 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_next(dnat);) > > > > > ]) > > > > > > > > > > AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], > > > > > [0], > > > > > [dnl > > > > > @@ -6009,12 +6013,14 @@ AT_CHECK([grep "lr_out_snat" lr0flows | > > > > > ovn_strip_lflows], [0], [dnl > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.dst == 10.0.0.10 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public")), action=(ct_snat;) > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.src == 10.0.0.10 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_snat(172.168.0.30);) > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.src == 10.0.0.3 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_snat(172.168.0.20);) > > > > > + table=??(lr_out_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 10.0.0.3 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_dnat;) > > > > > ]) > > > > > > > > > > AT_CHECK([grep "lr_out_post_snat" lr0flows | > > > > > ovn_strip_lflows], > > > > > [0], [dnl > > > > > table=??(lr_out_post_snat ), priority=0 , match=(1), > > > > > action=(next;) > > > > > table=??(lr_out_post_snat ), priority=153 , match=(ip && > > > > > ip4.dst == 10.0.0.0/24 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && ct.new), > > > > > action=(ct_commit_to_zone(snat);) > > > > > table=??(lr_out_post_snat ), priority=161 , match=(ip && > > > > > ip4.dst == 10.0.0.10 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && ct.new), > > > > > action=(ct_commit_to_zone(snat);) > > > > > + table=??(lr_out_post_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 10.0.0.3 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_commit_to_zone(dnat);) > > > > > ]) > > > > > > > > > > # Associate load balancer to lr0 > > > > > @@ -6157,6 +6163,7 @@ AT_CHECK([grep "lr_out_post_undnat" > > > > > lr0flows > > > > > | ovn_strip_lflows], [0], [dnl > > > > > table=??(lr_out_post_undnat ), priority=0 , match=(1), > > > > > action=(next;) > > > > > table=??(lr_out_post_undnat ), priority=70 , match=(ip && > > > > > ip4.src == 10.0.0.0/24 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_next(snat);) > > > > > table=??(lr_out_post_undnat ), priority=70 , match=(ip && > > > > > ip4.src == 10.0.0.10 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_next(snat);) > > > > > + table=??(lr_out_post_undnat ), priority=75 , match=(ip && > > > > > ip4.src == 10.0.0.3 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_next(dnat);) > > > > > ]) > > > > > > > > > > AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], > > > > > [0], > > > > > [dnl > > > > > @@ -6167,12 +6174,14 @@ AT_CHECK([grep "lr_out_snat" lr0flows | > > > > > ovn_strip_lflows], [0], [dnl > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.dst == 10.0.0.10 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public")), action=(ct_snat;) > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.src == 10.0.0.10 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_snat(172.168.0.30);) > > > > > table=??(lr_out_snat ), priority=161 , match=(ip && > > > > > ip4.src == 10.0.0.3 && outport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_snat(172.168.0.20);) > > > > > + table=??(lr_out_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 10.0.0.3 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_dnat;) > > > > > ]) > > > > > > > > > > AT_CHECK([grep "lr_out_post_snat" lr0flows | > > > > > ovn_strip_lflows], > > > > > [0], [dnl > > > > > table=??(lr_out_post_snat ), priority=0 , match=(1), > > > > > action=(next;) > > > > > table=??(lr_out_post_snat ), priority=153 , match=(ip && > > > > > ip4.dst == 10.0.0.0/24 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && ct.new), > > > > > action=(ct_commit_to_zone(snat);) > > > > > table=??(lr_out_post_snat ), priority=161 , match=(ip && > > > > > ip4.dst == 10.0.0.10 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && ct.new), > > > > > action=(ct_commit_to_zone(snat);) > > > > > + table=??(lr_out_post_snat ), priority=166 , match=(ip && > > > > > ip4.dst == 10.0.0.3 && inport == "lr0-public" && > > > > > is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)), > > > > > action=(ct_commit_to_zone(dnat);) > > > > > ]) > > > > > > > > > > # Make the logical router as Gateway router > > > > > diff --git a/tests/system-common-macros.at b/tests/system- > > > > > common- > > > > > macros.at > > > > > index 251a4c0b8..a3de97491 100644 > > > > > --- a/tests/system-common-macros.at > > > > > +++ b/tests/system-common-macros.at > > > > > @@ -614,3 +614,45 @@ m4_define([OVN_NEIGH_EQUAL], > > > > > m4_define([OVN_NEIGH_V6_EQUAL], > > > > > [OVS_WAIT_UNTIL_EQUAL([ip -6 neigh show dev $1 $2 | sed -e > > > > > 's|[[[[:space:]]]]*$||g' | grep $3 | sort], [$4]) > > > > > ]) > > > > > + > > > > > +# NS_CHECK_CONNECTIVITY([NS_SRC], [NS_DST], [IP_ADDR]) > > > > > +# > > > > > +# Ensures that network namespace NS_SRC can reach NS_DST on > > > > > IP_ADDR. The check performs: > > > > > +# * Ping > > > > > +# * Exchange of multiple messages over TCP > > > > > +# * Exchange of multiple messages over UDP > > > > > +m4_define([NS_CHECK_CONNECTIVITY], > > > > > +[ > > > > > + ns_src="$1" > > > > > + ns_dst="$2" > > > > > + ip="$3" > > > > > + # Start a simple TCP echo server that replies with "ack > > > > > <received_msg>". > > > > > + NETNS_DAEMONIZE($ns_dst, [nc -l -k -p 1235 -c 'while read > > > > > msg; > > > > > do echo "ack $msg"; done'], [nc-$ns_dst-$ip-tcp.pid]) > > > > > + # Start a simple UDP echo server that replies with "ack > > > > > <received_msg>". > > > > > + NETNS_DAEMONIZE($ns_dst, [nc -l -u -k -p 1234 -c 'read > > > > > msg; > > > > > echo "ack $msg"'], [nc-$ns_dst-$ip-udp.pid]) > > > > > + > > > > > + # Ensure that the destination NS can be pinged on the > > > > > specified IP > > > > > + NS_CHECK_EXEC([$ns_src], [ping -q -c 3 -i 0.3 -w 2 $ip | > > > > > FORMAT_PING], \ > > > > > + [0], [dnl > > > > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > +]) > > > > > + > > > > > + # Exchange multiple messages over TCP and UDP to verify > > > > > connectivity > > > > > + # Note(mkalcok): Server replies are printed to STDERR > > > > > because > > > > > STDOUT is captured by the nc. > > > > > + # Note2(mkalcok): The variables inside the 'nc -c' > > > > > argument > > > > > have to be escaped, otherwise they > > > > > + # get interpreted by the outer shell (from NS_CHECK_EXEC). > > > > > + NS_CHECK_EXEC([$ns_src], [timeout 2 nc $ip 1235 -c 'for i > > > > > in > > > > > \$(seq 1 3); do echo "tcp_data \$i"; read msg; echo "\$msg" > > > > > 1>&2; > > > > > done'], [0], [], [dnl > > > > > +ack tcp_data 1 > > > > > +ack tcp_data 2 > > > > > +ack tcp_data 3 > > > > > +]) > > > > > + NS_CHECK_EXEC([$ns_src], [timeout 2 nc -u $ip 1234 -c 'for > > > > > i > > > > > in \$(seq 1 3); do echo "udp_data \$i"; read msg; echo "\$msg" > > > > > 1>&2; done'], [0], [], [dnl > > > > > +ack udp_data 1 > > > > > +ack udp_data 2 > > > > > +ack udp_data 3 > > > > > +]) > > > > > > > > > > > > > > > > > I seems that the this loop is not properly waiting > > > > for all data to arrive, see the CI failure. > > > > > > The CI shows exit code of 124, which indicates that the 2 second > > > timeout was not enough. I made up that 2 second value because it > > > seemed > > > like a plenty and tests passed locally for me even with 0.1 > > > timeout. > > > Is it possible that the load on the testbed is making things slow? > > > Would it be reasonable to increase the timeout value? > > > > > > > > > > > I think we should utilize the "OVS_WAIT_FOR_OUTPUT" > > to avoid guesses about the timeout. > > > > Sorry, I'm not sure how to apply OVS_WAIT_FOR_OUTPUT in this scenario. > If I read the macro correctly, you give it a command and it retries it > until it produces the expected output. And that is slightly different > from the current implementation where a single 'nc' "session" is used > to exchange couple of messages. The single session ensures that the NAT > happens correctly at the beginning and during the session. (We saw > previously issues where first message went through fine, but then the > NAT would fall apart for subsequent messages). > You could be outputting to a file in the background. The command would just check the file periodically. > > Btw, the OVS_WAIT_FOR_OUTPUT ultimately just waits for the amount > specified in OVS_CTL_TIMEOUT. Couldn't I just use that as a value for > `timeout`? > That might be another option instead of some hardcoded value. > > Thanks, > Martin. > > > > > > > > > > > > > + > > > > > + # cleanup echo servers > > > > > + kill $(cat nc-$ns_dst-$ip-tcp.pid) > > > > > + kill $(cat nc-$ns_dst-$ip-udp.pid) > > > > > +]) > > > > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > > > > index 8e356df6f..0d01582a4 100644 > > > > > --- a/tests/system-ovn.at > > > > > +++ b/tests/system-ovn.at > > > > > @@ -3777,46 +3777,49 @@ AT_CHECK([ovn-nbctl lr-route-add R1 > > > > > 10.0.0.0/24 172.16.1.2]) > > > > > check ovn-nbctl --wait=hv sync > > > > > OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep > > > > > 'nat(src=172.16.1.1)']) > > > > > > > > > > -# North-South DNAT: 'alice1' pings 'foo1' using 172.16.1.3. > > > > > -NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 172.16.1.3 | > > > > > FORMAT_PING], \ > > > > > -[0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > +# North-South DNAT: 'alice1' reaches 'foo1' on 172.16.1.3. > > > > > +NS_CHECK_CONNECTIVITY([alice1], [foo1], [172.16.1.3]) > > > > > > > > > > # We verify that DNAT indeed happened via 'dump-conntrack' > > > > > command. > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | > > > > > FORMAT_CT(172.16.1.3) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmp,orig=(src=172.16.1.2,dst=172.16.1.3,id=<cleared>,type=8,c > > > > > ode= > > > > > 0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,co > > > > > de=0 > > > > > ),zone=<cleared> > > > > > +tcp,orig=(src=172.16.1.2,dst=172.16.1.3,sport=<cleared>,dport= > > > > > <cle > > > > > ared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dp > > > > > ort= > > > > > <cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > > > +udp,orig=(src=172.16.1.2,dst=172.16.1.3,sport=<cleared>,dport= > > > > > <cle > > > > > ared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dp > > > > > ort= > > > > > <cleared>),zone=<cleared> > > > > > ]) > > > > > > > > > > -# South-North SNAT: 'foo2' pings 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > +# South-North SNAT: 'foo2' reaches 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > # from 172.16.1.4 > > > > > -NS_CHECK_EXEC([foo2], [ping -q -c 3 -i 0.3 -w 2 172.16.1.2 | > > > > > FORMAT_PING], \ > > > > > -[0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > +NS_CHECK_CONNECTIVITY([foo2], [alice1], 172.16.1.2) > > > > > > > > > > # We verify that SNAT indeed happened via 'dump-conntrack' > > > > > command. > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | > > > > > FORMAT_CT(172.16.1.4) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmp,orig=(src=192.168.1.3,dst=172.16.1.2,id=<cleared>,type=8, > > > > > code > > > > > =0),reply=(src=172.16.1.2,dst=172.16.1.4,id=<cleared>,type=0,co > > > > > de=0 > > > > > ),zone=<cleared> > > > > > +tcp,orig=(src=192.168.1.3,dst=172.16.1.2,sport=<cleared>,dport > > > > > =<cl > > > > > eared>),reply=(src=172.16.1.2,dst=172.16.1.4,sport=<cleared>,dp > > > > > ort= > > > > > <cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > > > +udp,orig=(src=192.168.1.3,dst=172.16.1.2,sport=<cleared>,dport > > > > > =<cl > > > > > eared>),reply=(src=172.16.1.2,dst=172.16.1.4,sport=<cleared>,dp > > > > > ort= > > > > > <cleared>),zone=<cleared> > > > > > ]) > > > > > > > > > > AT_CHECK([ovs-appctl dpctl/flush-conntrack]) > > > > > > > > > > -# South-North SNAT: 'bar1' pings 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > +# South-North SNAT: 'bar1' reaches 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > # from 172.16.1.1 > > > > > -NS_CHECK_EXEC([bar1], [ping -q -c 3 -i 0.3 -w 2 172.16.1.2 | > > > > > FORMAT_PING], \ > > > > > -[0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > +NS_CHECK_CONNECTIVITY([bar1], [alice1], 172.16.1.2) > > > > > > > > > > # We verify that SNAT indeed happened via 'dump-conntrack' > > > > > command. > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | > > > > > FORMAT_CT(172.16.1.1) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmp,orig=(src=192.168.2.2,dst=172.16.1.2,id=<cleared>,type=8, > > > > > code > > > > > =0),reply=(src=172.16.1.2,dst=172.16.1.1,id=<cleared>,type=0,co > > > > > de=0 > > > > > ),zone=<cleared> > > > > > +tcp,orig=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport > > > > > =<cl > > > > > eared>),reply=(src=172.16.1.2,dst=172.16.1.1,sport=<cleared>,dp > > > > > ort= > > > > > <cleared>),zone=<cleared>,protoinfo=(state=<cleared>) > > > > > +udp,orig=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport > > > > > =<cl > > > > > eared>),reply=(src=172.16.1.2,dst=172.16.1.1,sport=<cleared>,dp > > > > > ort= > > > > > <cleared>),zone=<cleared> > > > > > ]) > > > > > > > > > > +# North-South Direct (Bypassing DNAT): 'alice1' reaches 'foo1' > > > > > using 192.168.1.2 > > > > > +NS_CHECK_CONNECTIVITY(alice1, foo1, 192.168.1.2) > > > > > + > > > > > +# North-South Direct (Bypassing SNAT): 'alice1' reaches 'bar1' > > > > > using 192.168.2.2 > > > > > +NS_CHECK_CONNECTIVITY(alice1, bar1, 192.168.2.2) > > > > > + > > > > > # Try to ping external network > > > > > NETNS_START_TCPDUMP([ext-net], [-n -c 3 -i ext-veth dst > > > > > 172.16.1.3 > > > > > and icmp], [ext-net]) > > > > > AT_CHECK([ovn-nbctl lr-nat-del R1 snat]) > > > > > @@ -3825,43 +3828,6 @@ NS_CHECK_EXEC([foo1], [ping -q -c 3 -i > > > > > 0.3 - > > > > > w 2 10.0.0.1 | FORMAT_PING], \ > > > > > 3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > ]) > > > > > > > > > > -# test_connectivity_from_ext takes parameters 'vm' and 'ip'. > > > > > It > > > > > tests > > > > > -# icmp, udp and tcp connectivity from external network to the > > > > > 'vm' > > > > > on > > > > > -# the specified 'ip'. > > > > > -test_connectivity_from_ext() { > > > > > - local vm=$1; shift > > > > > - local ip=$1; shift > > > > > - > > > > > - # Start listening daemon TCP connections. > > > > > - NETNS_DAEMONIZE($vm, [nc -l -k 1235], [nc-$vm-$ip- > > > > > tcp.pid]) > > > > > - > > > > > - # Ensure that vm can be pinged on the specified IP > > > > > - NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 $ip | > > > > > FORMAT_PING], \ > > > > > - [0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > - > > > > > - # Perform two consecutive UDP connections to the specified > > > > > IP > > > > > - NETNS_DAEMONIZE($vm, [nc -l -u 1234], [nc-$vm-$ip- > > > > > udp.pid]) > > > > > - NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z]) > > > > > - kill $(cat nc-$vm-$ip-udp.pid) > > > > > - > > > > > - NETNS_DAEMONIZE($vm, [nc -l -u 1234], [nc-$vm-$ip- > > > > > udp.pid]) > > > > > - NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z]) > > > > > - kill $(cat nc-$vm-$ip-udp.pid) > > > > > - > > > > > - # Send data over TCP connection to the specified IP > > > > > - NS_CHECK_EXEC([alice1], [echo "TCP test" | nc --send-only > > > > > $ip > > > > > 1235]) > > > > > -} > > > > > - > > > > > -# Test access from external network to the internal IP of a VM > > > > > that > > > > > -# has also configured DNAT > > > > > -test_connectivity_from_ext foo1 192.168.1.2 > > > > > - > > > > > -# Test access from external network to the internal IP of a VM > > > > > that > > > > > -# does not have DNAT > > > > > -test_connectivity_from_ext bar1 192.168.2.2 > > > > > - > > > > > OVS_WAIT_UNTIL([ > > > > > total_pkts=$(cat ext-net.tcpdump | wc -l) > > > > > test "${total_pkts}" = "3" > > > > > @@ -3980,26 +3946,22 @@ AT_CHECK([ovn-nbctl lr-nat-add R1 snat > > > > > fd20::1 ::/0]) > > > > > check ovn-nbctl --wait=hv sync > > > > > OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep > > > > > 'nat(src=fd20::1)']) > > > > > > > > > > -# North-South DNAT: 'alice1' pings 'foo1' using fd20::3 > > > > > -NS_CHECK_EXEC([alice1], [ping6 -q -c 3 -i 0.3 -w 2 fd20::3 | > > > > > FORMAT_PING], \ > > > > > -[0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > +# North-South DNAT: 'alice1' reaches 'foo1' using fd20::3 > > > > > +NS_CHECK_CONNECTIVITY([alice1], [foo1], [fd20::3]) > > > > > > > > > > # We verify that DNAT indeed happened via 'dump-conntrack' > > > > > command. > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd20::3) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmpv6,orig=(src=fd20::2,dst=fd20::3,id=<cleared>,type=128,cod > > > > > e=0) > > > > > ,reply=(src=fd11::2,dst=fd20::2,id=<cleared>,type=129,code=0),z > > > > > one= > > > > > <cleared> > > > > > +tcp,orig=(src=fd20::2,dst=fd20::3,sport=<cleared>,dport=<clear > > > > > ed>) > > > > > ,reply=(src=fd11::2,dst=fd20::2,sport=<cleared>,dport=<cleared> > > > > > ),zo > > > > > ne=<cleared>,protoinfo=(state=<cleared>) > > > > > +udp,orig=(src=fd20::2,dst=fd20::3,sport=<cleared>,dport=<clear > > > > > ed>) > > > > > ,reply=(src=fd11::2,dst=fd20::2,sport=<cleared>,dport=<cleared> > > > > > ),zo > > > > > ne=<cleared> > > > > > ]) > > > > > > > > > > AT_CHECK([ovs-appctl dpctl/flush-conntrack]) > > > > > > > > > > -# South-North SNAT: 'foo2' pings 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > +# South-North SNAT: 'foo2' reaches 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > # from fd20::4 > > > > > -NS_CHECK_EXEC([foo2], [ping6 -q -c 3 -i 0.3 -w 2 fd20::2 | > > > > > FORMAT_PING], \ > > > > > -[0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > +NS_CHECK_CONNECTIVITY([foo2], [alice1], fd20::2) > > > > > > > > > > ovs-appctl dpctl/dump-conntrack | grep icmpv6 > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd11::3) > > > > > | \ > > > > > @@ -4010,59 +3972,29 @@ sed -e 's/zone=[[0- > > > > > 9]]*/zone=<cleared>/'], > > > > > [0], [dnl > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd20::4) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmpv6,orig=(src=fd11::3,dst=fd20::2,id=<cleared>,type=128,cod > > > > > e=0) > > > > > ,reply=(src=fd20::2,dst=fd20::4,id=<cleared>,type=129,code=0),z > > > > > one= > > > > > <cleared> > > > > > +tcp,orig=(src=fd11::3,dst=fd20::2,sport=<cleared>,dport=<clear > > > > > ed>) > > > > > ,reply=(src=fd20::2,dst=fd20::4,sport=<cleared>,dport=<cleared> > > > > > ),zo > > > > > ne=<cleared>,protoinfo=(state=<cleared>) > > > > > +udp,orig=(src=fd11::3,dst=fd20::2,sport=<cleared>,dport=<clear > > > > > ed>) > > > > > ,reply=(src=fd20::2,dst=fd20::4,sport=<cleared>,dport=<cleared> > > > > > ),zo > > > > > ne=<cleared> > > > > > ]) > > > > > > > > > > AT_CHECK([ovs-appctl dpctl/flush-conntrack]) > > > > > > > > > > -# South-North SNAT: 'bar1' pings 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > +# South-North SNAT: 'bar1' reaches 'alice1'. But 'alice1' > > > > > receives > > > > > traffic > > > > > # from fd20::1 > > > > > -NS_CHECK_EXEC([bar1], [ping6 -q -c 3 -i 0.3 -w 2 fd20::2 | > > > > > FORMAT_PING], \ > > > > > -[0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > +NS_CHECK_CONNECTIVITY([bar1], [alice1], fd20::2) > > > > > > > > > > # We verify that SNAT indeed happened via 'dump-conntrack' > > > > > command. > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd20::1) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmpv6,orig=(src=fd12::2,dst=fd20::2,id=<cleared>,type=128,cod > > > > > e=0) > > > > > ,reply=(src=fd20::2,dst=fd20::1,id=<cleared>,type=129,code=0),z > > > > > one= > > > > > <cleared> > > > > > +tcp,orig=(src=fd12::2,dst=fd20::2,sport=<cleared>,dport=<clear > > > > > ed>) > > > > > ,reply=(src=fd20::2,dst=fd20::1,sport=<cleared>,dport=<cleared> > > > > > ),zo > > > > > ne=<cleared>,protoinfo=(state=<cleared>) > > > > > +udp,orig=(src=fd12::2,dst=fd20::2,sport=<cleared>,dport=<clear > > > > > ed>) > > > > > ,reply=(src=fd20::2,dst=fd20::1,sport=<cleared>,dport=<cleared> > > > > > ),zo > > > > > ne=<cleared> > > > > > ]) > > > > > > > > > > -# test_connectivity_from_ext takes parameters 'vm' and 'ip'. > > > > > It > > > > > tests > > > > > -# icmp, udp and tcp connectivity from external network to the > > > > > 'vm' > > > > > on > > > > > -# the specified 'ip'. > > > > > -test_connectivity_from_ext() { > > > > > - local vm=$1; shift > > > > > - local ip=$1; shift > > > > > - > > > > > - # Start listening daemon for TCP connections. > > > > > - NETNS_DAEMONIZE($vm, [nc -6 -l -k 1235], [nc-$vm-$ip- > > > > > tcp.pid]) > > > > > - > > > > > - # Ensure that vm can be pinged on the specified IP > > > > > - NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 $ip | > > > > > FORMAT_PING], \ > > > > > - [0], [dnl > > > > > -3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > > -]) > > > > > - > > > > > - # Perform two consecutive UDP connections to the specified > > > > > IP > > > > > - NETNS_DAEMONIZE($vm, [nc -6 -l -u 1234], [nc-$vm-$ip- > > > > > udp.pid]) > > > > > - NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z]) > > > > > - kill $(cat nc-$vm-$ip-udp.pid) > > > > > - > > > > > - NETNS_DAEMONIZE($vm, [nc -6 -l -u 1234], [nc-$vm-$ip- > > > > > udp.pid]) > > > > > - NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z]) > > > > > - kill $(cat nc-$vm-$ip-udp.pid) > > > > > - > > > > > - # Send data over TCP connection to the specified IP > > > > > - NS_CHECK_EXEC([alice1], [echo "TCP test" | nc --send-only > > > > > $ip > > > > > 1235]) > > > > > -} > > > > > - > > > > > -# Test access from external network to the internal IP of a VM > > > > > that > > > > > -# has also configured DNAT > > > > > -test_connectivity_from_ext foo1 fd11::2 > > > > > +# North-South Direct (Bypassing DNAT): 'alice1' reaches 'foo1' > > > > > using fd11::2 > > > > > +NS_CHECK_CONNECTIVITY([alice1], [foo1], [fd11::2]) > > > > > > > > > > -# Test access from external network to the internal IP of a VM > > > > > that > > > > > -# does not have DNAT > > > > > -test_connectivity_from_ext bar1 fd12::2 > > > > > +# North-South Direct (Bypassing SNAT): 'alice1' reaches 'bar1' > > > > > using fd12::2 > > > > > +NS_CHECK_CONNECTIVITY([alice1], [bar1], [fd12::2]) > > > > > > > > > > OVN_CLEANUP_CONTROLLER([hv1]) > > > > > > > > > > @@ -4244,7 +4176,6 @@ NS_CHECK_EXEC([foo2], [ping -q -c 3 -i > > > > > 0.3 -w > > > > > 2 172.16.1.4 | FORMAT_PING], \ > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep icmp | > > > > > FORMAT_CT(172.16.1.1) | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmp,orig=(src=172.16.1.1,dst=172.16.1.4,id=<cleared>,type=8,c > > > > > ode= > > > > > 0),reply=(src=192.168.2.2,dst=172.16.1.1,id=<cleared>,type=0,co > > > > > de=0 > > > > > ),zone=<cleared> > > > > > - > > > > > icmp,orig=(src=172.16.1.1,dst=192.168.2.2,id=<cleared>,type=8,c > > > > > ode=0 > > > > > ),reply=(src=192.168.2.2,dst=172.16.1.1,id=<cleared>,type=0,cod > > > > > e=0) > > > > > ,zone=<cleared> > > > > > icmp,orig=(src=192.168.1.3,dst=172.16.1.4,id=<cleared>,type=8, > > > > > code > > > > > =0),reply=(src=172.16.1.4,dst=172.16.1.1,id=<cleared>,type=0,co > > > > > de=0 > > > > > ),zone=<cleared> > > > > > ]) > > > > > > > > > > @@ -4412,7 +4343,6 @@ NS_CHECK_EXEC([foo2], [ping -q -c 3 -i > > > > > 0.3 -w > > > > > 2 fd20::4 | FORMAT_PING], \ > > > > > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd20::1) > > > > > | \ > > > > > sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > > > > > icmpv6,orig=(src=fd11::3,dst=fd20::4,id=<cleared>,type=128,cod > > > > > e=0) > > > > > ,reply=(src=fd20::4,dst=fd20::1,id=<cleared>,type=129,code=0),z > > > > > one= > > > > > <cleared> > > > > > - > > > > > icmpv6,orig=(src=fd20::1,dst=fd12::2,id=<cleared>,type=128,code > > > > > =0),r > > > > > eply=(src=fd12::2,dst=fd20::1,id=<cleared>,type=129,code=0),zon > > > > > e=<c > > > > > leared> > > > > > icmpv6,orig=(src=fd20::1,dst=fd20::4,id=<cleared>,type=128,cod > > > > > e=0) > > > > > ,reply=(src=fd12::2,dst=fd20::1,id=<cleared>,type=129,code=0),z > > > > > one= > > > > > <cleared> > > > > > ]) > > > > > > > > > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev