> Hi Lorenzo, I have a few comments below
> 
> On 6/10/22 13:49, Lorenzo Bianconi wrote:
> > Match the drop flow for stateless dnat_and_snat flow in 
> > S_ROUTER_IN_GW_REDIRECT
> > stage just if allowed_ext_ips or exempted_ext_ips conditions do not
> > match since it breaks the hairping scenario with stateless nat.
> 
> s/hairping/hairpin

ops, I will fix it, thx

> 
> I do like the idea of calling a ping that is expected to hairpin a
> "hairping" though :)
> 
> > Fix the northd documentation.
> > 
> > Fixes: c0224a14f ("northd: fix stateless nat with allowed_ext_ips")
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2094980
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> >   northd/northd.c         | 43 ++++++++++++++++++++++++++++-------------
> >   northd/ovn-northd.8.xml | 20 ++++++++++++++-----
> >   tests/system-ovn.at     | 15 ++++++++++++++
> >   3 files changed, 60 insertions(+), 18 deletions(-)
> > 
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 0207f6ce1..5264dbd71 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -12035,6 +12035,7 @@ build_gateway_redirect_flows_for_lrouter(
> >       }
> >       for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
> >           const struct ovsdb_idl_row *stage_hint = NULL;
> > +        bool add_def_flow = true;
> >           if (od->l3dgw_ports[i]->nbrp) {
> >               stage_hint = &od->l3dgw_ports[i]->nbrp->header_;
> > @@ -12053,22 +12054,38 @@ build_gateway_redirect_flows_for_lrouter(
> >           ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
> >                                   ds_cstr(match), ds_cstr(actions),
> >                                   stage_hint);
> > -    }
> > +        for (int j = 0; j < od->n_nat_entries; j++) {
> > +            const struct ovn_nat *nat = &od->nat_entries[j];
> > -    for (int i = 0; i < od->n_nat_entries; i++) {
> > -        const struct ovn_nat *nat = &od->nat_entries[i];
> > +            if (!lrouter_nat_is_stateless(nat->nb) ||
> > +                strcmp(nat->nb->type, "dnat_and_snat")) {
> > +                continue;
> > +            }
> > -        if (!lrouter_nat_is_stateless(nat->nb) ||
> > -            strcmp(nat->nb->type, "dnat_and_snat")) {
> > -           continue;
> > -        }
> > +            if (nat->nb->allowed_ext_ips || nat->nb->exempted_ext_ips) {
> > +                struct nbrec_address_set  *as = nat->nb->allowed_ext_ips
> > +                    ? nat->nb->allowed_ext_ips : nat->nb->exempted_ext_ips;
> > +                struct ds match_ext = DS_EMPTY_INITIALIZER;
> > -        ds_clear(match);
> > -        ds_put_format(match, "ip && ip%s.dst == %s",
> > -                      nat_entry_is_v6(nat) ? "6" : "4",
> > -                      nat->nb->external_ip);
> > -        ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 100,
> > -                      ds_cstr(match), "drop;");
> > +                ds_put_format(&match_ext, "%s && ip%s.src == $%s",
> > +                              ds_cstr(match), nat_entry_is_v6(nat) ? "6" : 
> > "4",
> > +                              as->name);
> > +                ovn_lflow_add_with_hint(lflows, od, 
> > S_ROUTER_IN_GW_REDIRECT,
> > +                                        75, ds_cstr(&match_ext),
> > +                                        ds_cstr(actions), stage_hint);
> > +
> > +                if (add_def_flow) {
> > +                    ds_clear(&match_ext);
> > +                    ds_put_format(&match_ext, "ip && ip%s.dst == %s",
> > +                                  nat_entry_is_v6(nat) ? "6" : "4",
> > +                                  nat->nb->external_ip);
> > +                    ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 70,
> > +                                  ds_cstr(&match_ext), "drop;");
> > +                    add_def_flow = false;
> > +                }
> > +                ds_destroy(&match_ext);
> > +            }
> > +        }
> 
> I think the correct thing to do is to remove the original loop from the
> code, and allow all traffic to be output to the gateway port. The allowed
> and exempted IP evaluation should be done by the gateway router after the
> traffic is hairpinned. Even if you wanted to try to add an optimization here
> (drop traffic early), the exempted_ips logic is not correct. The new code
> allows traffic from exempted IPs to be output to the redirect port. The
> problem is that once that traffic is hairpinned, the traffic will be dropped
> since the IP is in the exempted_ext_ips set. If anything, the code should be
> allowing all IPs that do not match the extempted_ext_ips set. However, that
> would translate to a lot of flows. It's easier to just allow all traffic to
> be redirected and filter the traffic after redirecting.

(as we discussed offline) the point here is for N/S traffic the does not match
the the allowed_ext_ips set and so it is recirculated on the gw node. I added
the drop flow to take care of it.
Regarding the exempted_ext_ips I agree, we can do someting different here,
something like:

            struct ds match_ext = DS_EMPTY_INITIALIZER;
            struct nbrec_address_set  *as = nat->nb->allowed_ext_ips
                ? nat->nb->allowed_ext_ips : nat->nb->exempted_ext_ips;
            ds_put_format(&match_ext, "%s && ip%s.src == $%s",
                          ds_cstr(match), nat_entry_is_v6(nat) ? "6" : "4",
                          as->name);

            if (nat->nb->allowed_ext_ips) {
               ....
            } else if (nat->nb->exempted_ext_ips) {
                ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_GW_REDIRECT,
                                        75, ds_cstr(&match_ext), "drop;",
                                        stage_hint);
            }

what do you think?

> 
> >       }
> >       /* Packets are allowed by default. */
> > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > index 1f7022490..ad1a66cd4 100644
> > --- a/northd/ovn-northd.8.xml
> > +++ b/northd/ovn-northd.8.xml
> > @@ -3018,8 +3018,7 @@ icmp6 {
> >             <code>ip &amp;&amp; ip6.dst == <var>B</var></code>
> >             with an action <code>ct_snat; </code>. If the NAT rule is of 
> > type
> >             dnat_and_snat and has <code>stateless=true</code> in the
> > -          options, then the action would be <code>ip4/6.dst=
> > -          (<var>B</var>)</code>.
> > +          options, then the action would be <code>next;</code>.
> >           </p>
> >           <p>
> > @@ -3059,7 +3058,7 @@ icmp6 {
> >                 action <code>ct_snat_in_czone;</code> to unSNAT in the 
> > common
> >                 zone.  If the NAT rule is of type dnat_and_snat and has
> >                 <code>stateless=true</code> in the options, then the action
> > -              would be <code>ip4/6.dst=(<var>B</var>)</code>.
> > +              would be <code>next;</code>.
> >               </p>
> >               <p>
> > @@ -4235,6 +4234,18 @@ icmp6 {
> >           gateway chassis.
> >         </li>
> > +      <li>
> > +        For each <code>dnat_and_snat</code> NAT rule with
> > +        <code>stateless=true</code> and <code>allowed_ext_ips</code>
> > +        or <code>exempted_ext_ips</code> configured, a similar priority-75
> > +        flow is programmed with an additional match
> > +        <code>ip4.src == <var>allowed_ext_ips</var></code> or
> > +        <code>ip4.src == <var>exempted_ext_ips</var></code>.
> > +        Moreover a priority-70 drop flow is added with match
> > +        <code>ip4.dst == <var>B</var></code> where <var>B</var> is the NAT
> > +        rule external IP. A specular flow is added for IPv6 traffic.
> 
> Did you mean to use the word specular here? It's so uncommon that my email
> client flags it as a misspelling. I looked it up, and it is a word, and it
> *kind of* fits here too (it means behaving like a mirror). But I have a
> feeling you meant to use "similar", right?

yes, I mean "similar" :)

Regards,
Lorenzo

> 
> > +      </li>
> > +
> >         <li>
> >           A priority-0 logical flow with match <code>1</code> has actions
> >           <code>next;</code>.
> > @@ -4415,8 +4426,7 @@ nd_ns {
> >             is the logical router gateway port, with an action
> >             <code>ct_dnat_in_czone;</code>. If the NAT rule is of type
> >             dnat_and_snat and has <code>stateless=true</code> in the
> > -          options, then the action would be <code>ip4/6.src=
> > -          (<var>B</var>)</code>.
> > +          options, then the action would be <code>next;</code>.
> >           </p>
> >           <p>
> > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > index 4bf22593a..df2da3408 100644
> > --- a/tests/system-ovn.at
> > +++ b/tests/system-ovn.at
> > @@ -6741,6 +6741,21 @@ NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 
> > 172.18.2.10 | FORMAT_PING], \
> >   [0], [dnl
> >   3 packets transmitted, 3 received, 0% packet loss, time 0ms
> >   ])
> > +
> > +dnat_and_snat_uuid=$(fetch_column nb:NAT _uuid external_ip=172.18.2.10)
> > +ovn-nbctl set NAT $dnat_and_snat_uuid options:stateless=true
> > +
> > +# A ping from vm1 should hairpin in lr1 and successfully DNAT to vm2
> > +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 172.18.2.10 | FORMAT_PING], 
> > \
> > +[0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +# A ping from vm2 should hairpin in lr1 and successfully DNAT to vm2
> > +NS_CHECK_EXEC([vm2], [ping -q -c 3 -i 0.3 -w 2 172.18.2.10 | FORMAT_PING], 
> > \
> > +[0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> >   kill $(pidof ovn-controller)
> >   as ovn-sb
> > 
> 
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to