Hi Ales, thanks for another round of review.

On Wed, Mar 20, 2024 at 12:34 PM Ales Musil <[email protected]> wrote:

>
>
> On Thu, Mar 14, 2024 at 2:13 PM Martin Kalcok <[email protected]>
> wrote:
>
>> Hello all,
>> I have one more follow-up regarding the comments in v1. @amusil, you were
>> concerned about the impact this change would have on the performance so I
>> ran some tests to try to gauge it. I used following setup with two physical
>> hosts connected over switch:
>>
>>        |    Physical ext. | Hypervisor1
>>        |       network    |
>> alice1-|----- switch -----|-- DLR --- foo1
>>        |                  |
>>
>> * alice1 acts as external device.
>>   * It doesn't run OVN
>>   * It's connected to regular physical network.
>> * Hypervisor1 runs OVN chassis
>> * foo1 is a container behind Distributed LR with SNAT enabled.
>>
>> Goal of this test was to measure whether the proposed patch affects
>> throughput of the traffic that flows from "foo1" to external network. I
>> used iperf3 and ran 3x 5 minute measurements with 10 streams ("iperf3 -i 0
>> -t 300 -P 10 -c alice1"). I used 5 minute for each measurement to smooth
>> out possible jitter and I ran each measurement 3 times to get feel for the
>> variance in the network throughput over longer period. I also did some
>> trial and error testing that showed that 10 parallel streams reached
>> highest throughput.
>>
>> Following are the three (summary) results with this patch applied:
>>
>> # run 1 (with patch)
>> [SUM]   0.00-300.00 sec   296 GBytes  8.48 Gbits/sec  sender
>> [SUM]   0.00-300.00 sec   296 GBytes  8.48 Gbits/sec  receiver
>>
>> # run 2 (with patch)
>> [SUM]   0.00-300.00 sec   288 GBytes  8.24 Gbits/sec sender
>> [SUM]   0.00-300.00 sec   288 GBytes  8.23 Gbits/sec receiver
>>
>> # run 3 (with patch)
>> [SUM]   0.00-300.00 sec   299 GBytes  8.55 Gbits/sec sender
>> [SUM]   0.00-300.00 sec   299 GBytes  8.55 Gbits/sec receiver
>>
>> Next thing I did was to rebuild ovn-controller without these patches,
>> replaced the ovn-controller binary on the Hypervisor1 and restarted
>> ovn-controller daemon. I verified that the feature flag "ct-commit-to-zone"
>> was removed from the chassis (in the SB database), which forced the
>> recompute, and then I reran the tests:
>>
>> # run 1 (clean)
>> [SUM]   0.00-300.00 sec   300 GBytes  8.59 Gbits/sec sender
>> [SUM]   0.00-300.00 sec   300 GBytes  8.59 Gbits/sec receiver
>>
>> # run 2 (clean)
>> [SUM]   0.00-300.00 sec   285 GBytes  8.15 Gbits/sec sender
>> [SUM]   0.00-300.00 sec   285 GBytes  8.15 Gbits/sec receiver
>>
>> # run 3 (clean)
>> [SUM]   0.00-300.00 sec   295 GBytes  8.46 Gbits/sec sender
>> [SUM]   0.00-300.00 sec   295 GBytes  8.46 Gbits/sec receiver
>>
>> These tests show that while there was some fluctuation between each
>> individual test, when comparing patched and clean version, they come out
>> about the same. I'm happy to run more tests/scenarios if you have something
>> else in mind that should be tested.
>>
>>
>>
> Hi Martin,
>
> thank you for the performance numbers, it looks fine to me. I have some
> comments, see down below.
>
>
>> On Wed, Mar 13, 2024 at 10:17 AM Martin Kalcok <
>> [email protected]> wrote:
>>
>>> Regarding the failed unstable test in the CI, I suspect that this is not
>>> related to the code change, I've had couple successful CI runs in my branch
>>> [0].
>>>
>>> Martin.
>>>
>>> [0] https://github.com/mkalcok/ovn/actions/runs/8256539983
>>>
>>> On Tue, Mar 12, 2024 at 8:45 PM Martin Kalcok <
>>> [email protected]> wrote:
>>>
>>>> This change fixes bug that breaks ability of machines from external
>>>> networks to communicate with machines in SNATed networks (specifically
>>>> when using a Distributed router).
>>>>
>>>> Currently when a machine (S1) on an external network tries to talk
>>>> over TCP with a machine (A1) in a network that has enabled SNAT, the
>>>> connection is established successfully. However after the three-way
>>>> handshake, any packets that come from the A1 machine will have their
>>>> source address translated by the Distributed router, breaking the
>>>> communication.
>>>>
>>>> Existing rule in `build_lrouter_out_snat_flow` that decides which
>>>> packets should be SNATed already tries to avoid SNATing packets in
>>>> reply direction with `(!ct.trk || !ct.rpl)`. However, previous stages
>>>> in the distributed LR egress pipeline do not initiate the CT state.
>>>>
>>>> Additionally we need to commit new connections that originate from
>>>> external networks into CT, so that the packets in the reply direction
>>>> can be properly identified.
>>>>
>>>> Rationale:
>>>>
>>>> In my original RFC [0], there were questions about the motivation for
>>>> fixing this issue. I'll try to summarize why I think this is a bug
>>>> that should be fixed.
>>>>
>>>> 1. Current implementation for Distributed router already tries to
>>>>    avoid SNATing packets in the reply direction, it's just missing
>>>>    initialized CT states to make proper decisions.
>>>>
>>>> 2. This same scenario works with Gateway Router. I tested with
>>>>    following setup:
>>>>
>>>>     foo -- R1 -- join - R3 -- alice
>>>>                   |
>>>>     bar ----------R2
>>>>
>>>>     R1 is a Distributed router with SNAT for foo. R2 is a Gateway
>>>>     router with SNAT for bar. R3 is a Gateway router with no SNAT.
>>>>     Using 'alice1' as a client I was able to talk over TCP with
>>>>     'bar1' but connection with 'foo1' failed.
>>>>
>>>> 3. Regarding security and "leaking" of internal IPs. Reading through
>>>>    RFC 4787 [1], 5382 [2] and their update in 7857 [3], the
>>>>    specifications do not seem to mandate that SNAT implementations
>>>>    must filter incoming traffic destined directly to the internal
>>>>    network. Sections like "5. Filtering Behavior" in 4787 and
>>>>    "4.3 Externally Initiated Connections" in 5382 describe only
>>>>    behavior for traffic destined to external IP/Port associated
>>>>    with NAT on the device that performs NAT.
>>>>
>>>>    Besides, with the current implementation, it's already possible
>>>>    to scan the internal network with pings and TCP syn scanning.
>>>>
>>>> 4. We do have customers/clouds that depend on this functionality.
>>>>    This is a scenario that used to work in Openstack with ML2/OVS
>>>>    and migrating those clouds to ML2/OVN would break it.
>>>>
>>>> [0]
>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2024-February/411670.html
>>>> [1]https://datatracker.ietf.org/doc/html/rfc4787
>>>> [2]https://datatracker.ietf.org/doc/html/rfc5382
>>>> [3]https://datatracker.ietf.org/doc/html/rfc7857
>>>>
>>>> Signed-off-by: Martin Kalcok <[email protected]>
>>>> ---
>>>>  northd/northd.c         | 68 ++++++++++++++++++++++++++++++++--------
>>>>  northd/ovn-northd.8.xml | 29 +++++++++++++++++
>>>>  tests/ovn-northd.at     | 33 ++++++++++++++++----
>>>>  tests/system-ovn.at     | 69 +++++++++++++++++++++++++++++++++++++++++
>>>>  4 files changed, 180 insertions(+), 19 deletions(-)
>>>>
>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>> index 2c3560ce2..25af52d5a 100644
>>>> --- a/northd/northd.c
>>>> +++ b/northd/northd.c
>>>> @@ -14438,20 +14438,27 @@ build_lrouter_out_is_dnat_local(struct
>>>> lflow_table *lflows,
>>>>
>>>>  static void
>>>>  build_lrouter_out_snat_match(struct lflow_table *lflows,
>>>> -                             const struct ovn_datapath *od,
>>>> -                             const struct nbrec_nat *nat, struct ds
>>>> *match,
>>>> -                             bool distributed_nat, int cidr_bits, bool
>>>> is_v6,
>>>> -                             struct ovn_port *l3dgw_port,
>>>> -                             struct lflow_ref *lflow_ref)
>>>> +                                         const struct ovn_datapath *od,
>>>> +                                         const struct nbrec_nat *nat,
>>>> +                                         struct ds *match,
>>>> +                                         bool distributed_nat, int
>>>> cidr_bits,
>>>> +                                         bool is_v6,
>>>> +                                         struct ovn_port *l3dgw_port,
>>>> +                                         struct lflow_ref *lflow_ref,
>>>> +                                         bool is_reverse)
>>>>  {
>>>>      ds_clear(match);
>>>>
>>>> -    ds_put_format(match, "ip && ip%c.src == %s", is_v6 ? '6' : '4',
>>>> +    ds_put_format(match, "ip && ip%c.%s == %s",
>>>> +                  is_v6 ? '6' : '4',
>>>> +                  is_reverse ? "dst" : "src",
>>>>                    nat->logical_ip);
>>>>
>>>>      if (!od->is_gw_router) {
>>>>          /* Distributed router. */
>>>> -        ds_put_format(match, " && outport == %s",
>>>> l3dgw_port->json_key);
>>>> +        ds_put_format(match, " && %s == %s",
>>>> +                      is_reverse ? "inport" : "outport",
>>>> +                      l3dgw_port->json_key);
>>>>          if (od->n_l3dgw_ports) {
>>>>              ds_put_format(match, " && is_chassis_resident(\"%s\")",
>>>>                            distributed_nat
>>>> @@ -14462,7 +14469,7 @@ build_lrouter_out_snat_match(struct lflow_table
>>>> *lflows,
>>>>
>>>>      if (nat->allowed_ext_ips || nat->exempted_ext_ips) {
>>>>          lrouter_nat_add_ext_ip_match(od, lflows, match, nat,
>>>> -                                     is_v6, false, cidr_bits,
>>>> +                                     is_v6, is_reverse, cidr_bits,
>>>>                                       lflow_ref);
>>>>      }
>>>>  }
>>>> @@ -14489,7 +14496,8 @@ build_lrouter_out_snat_stateless_flow(struct
>>>> lflow_table *lflows,
>>>>      uint16_t priority = cidr_bits + 1;
>>>>
>>>>      build_lrouter_out_snat_match(lflows, od, nat, match,
>>>> distributed_nat,
>>>> -                                 cidr_bits, is_v6, l3dgw_port,
>>>> lflow_ref);
>>>> +                                 cidr_bits, is_v6, l3dgw_port,
>>>> lflow_ref,
>>>> +                                 false);
>>>>
>>>>      if (!od->is_gw_router) {
>>>>          /* Distributed router. */
>>>> @@ -14536,7 +14544,7 @@ build_lrouter_out_snat_in_czone_flow(struct
>>>> lflow_table *lflows,
>>>>
>>>>      build_lrouter_out_snat_match(lflows, od, nat, match,
>>>> distributed_nat,
>>>>                                   cidr_bits, is_v6, l3dgw_port,
>>>> -                                 lflow_ref);
>>>> +                                 lflow_ref, false);
>>>>
>>>>      if (od->n_l3dgw_ports) {
>>>>          priority += 128;
>>>> @@ -14585,12 +14593,14 @@ build_lrouter_out_snat_flow(struct
>>>> lflow_table *lflows,
>>>>                              struct ds *actions, bool distributed_nat,
>>>>                              struct eth_addr mac, int cidr_bits, bool
>>>> is_v6,
>>>>                              struct ovn_port *l3dgw_port,
>>>> -                            struct lflow_ref *lflow_ref)
>>>> +                            struct lflow_ref *lflow_ref,
>>>> +                            const struct chassis_features *features)
>>>>  {
>>>>      if (strcmp(nat->type, "snat") && strcmp(nat->type,
>>>> "dnat_and_snat")) {
>>>>          return;
>>>>      }
>>>>
>>>> +    struct ds match_all_from_snat = DS_EMPTY_INITIALIZER;
>>>>      ds_clear(actions);
>>>>
>>>>      /* The priority here is calculated such that the
>>>> @@ -14599,7 +14609,9 @@ build_lrouter_out_snat_flow(struct lflow_table
>>>> *lflows,
>>>>      uint16_t priority = cidr_bits + 1;
>>>>
>>>>      build_lrouter_out_snat_match(lflows, od, nat, match,
>>>> distributed_nat,
>>>> -                                 cidr_bits, is_v6, l3dgw_port,
>>>> lflow_ref);
>>>> +                                 cidr_bits, is_v6, l3dgw_port,
>>>> lflow_ref,
>>>> +                                 false);
>>>> +    ds_clone(&match_all_from_snat, match);
>>>>
>>>
> We don't have to clone the match, you can just store the match length e.g.
> size_t match_len = match->len;
>

Oh, cool approach. I didn't think about it this way, thanks.


>
>
>>
>>>>      if (!od->is_gw_router) {
>>>>          /* Distributed router. */
>>>> @@ -14624,6 +14636,36 @@ build_lrouter_out_snat_flow(struct lflow_table
>>>> *lflows,
>>>>                              priority, ds_cstr(match),
>>>>                              ds_cstr(actions), &nat->header_,
>>>>                              lflow_ref);
>>>> +
>>>> +    /* For the SNAT networks, we need to make sure that connections are
>>>> +     * properly tracked so we can decide whether to perform SNAT on
>>>> traffic
>>>> +     * exiting the network. */
>>>> +    if (features->ct_commit_to_zone && !strcmp(nat->type, "snat") &&
>>>> +        !od->is_gw_router) {
>>>>
>>>
> Here you can just truncate the ds to the state before the previous if:
> ds_truncate(match, match_len);
>
>
>> +        /* For traffic that comes from SNAT network, initiate CT state
>>>> before
>>>> +         * entering S_ROUTER_OUT_SNAT to allow matching on various CT
>>>> states.
>>>> +         */
>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_UNDNAT, 70,
>>>> +                      ds_cstr(&match_all_from_snat), "ct_snat; ",
>>>> +                      lflow_ref);
>>>> +
>>>> +        build_lrouter_out_snat_match(lflows, od, nat, match,
>>>> +                                     distributed_nat, cidr_bits, is_v6,
>>>> +                                     l3dgw_port, lflow_ref, true);
>>>> +
>>>> +        /* New traffic that goes into SNAT network is committed to CT
>>>> to avoid
>>>> +         * SNAT-ing replies.*/
>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, priority,
>>>> +                      ds_cstr(match), "ct_snat;",
>>>> +                      lflow_ref);
>>>> +
>>>> +        ds_put_cstr(match, "&& ct.new");
>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, priority,
>>>> +                      ds_cstr(match), "ct_commit(snat); next; ",
>>>> +                      lflow_ref);
>>>> +    }
>>>> +
>>>> +    ds_destroy(&match_all_from_snat);
>>>>  }
>>>>
>>>>  static void
>>>> @@ -15167,7 +15209,7 @@ build_lrouter_nat_defrag_and_lb(
>>>>          } else {
>>>>              build_lrouter_out_snat_flow(lflows, od, nat, match,
>>>> actions,
>>>>                                          distributed_nat, mac,
>>>> cidr_bits, is_v6,
>>>> -                                        l3dgw_port, lflow_ref);
>>>> +                                        l3dgw_port, lflow_ref,
>>>> features);
>>>>          }
>>>>
>>>>          /* S_ROUTER_IN_ADMISSION - S_ROUTER_IN_IP_INPUT */
>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>>>> index 17b414144..ce6fd1270 100644
>>>> --- a/northd/ovn-northd.8.xml
>>>> +++ b/northd/ovn-northd.8.xml
>>>> @@ -4869,6 +4869,13 @@ nd_ns {
>>>>
>>>>      <p>
>>>>        <ul>
>>>> +        <li>
>>>> +          A priority-70 logical flow is added that initiates CT state
>>>> for
>>>> +          traffic that is configured to be SNATed on Distributed
>>>> routers.
>>>> +          This allows the next table, <code>lr_out_snat</code>, to
>>>> +          effectively match on various CT states.
>>>> +        </li>
>>>> +
>>>>          <li>
>>>>            A priority-50 logical flow is added that commits any
>>>> untracked flows
>>>>            from the previous table <code>lr_out_undnat</code> for
>>>> Gateway
>>>> @@ -5061,6 +5068,18 @@ nd_ns {
>>>>
>>>>        </li>
>>>>
>>>> +      <li>
>>>> +        An additional flow is added for traffic that goes in opposite
>>>> +        direction (i.e. it enters a network with configured SNAT).
>>>> Where the
>>>> +        flow above matched on <code>ip4.src == <var>A</var> &amp;&amp;
>>>> outport
>>>> +        == <var>GW</var></code>, this flow matches on <code> ip4.dst ==
>>>> +        <var>A</var> &amp;&amp; inport == <var>GW</var></code>. A CT
>>>> state is
>>>> +        initiated for this traffic so that the following table, <code>
>>>> +        lr_out_post_snat</code>, can identify whether the traffic flow
>>>> was
>>>> +        initiated from the internal or external network.
>>>> +
>>>> +      </li>
>>>> +
>>>>        <li>
>>>>          A priority-0 logical flow with match <code>1</code> has actions
>>>>          <code>next;</code>.
>>>> @@ -5072,6 +5091,16 @@ nd_ns {
>>>>      <p>
>>>>        Packets reaching this table are processed according to the flows
>>>> below:
>>>>        <ul>
>>>> +        <li>
>>>> +          Traffic that goes directly into a network configured with
>>>> SNAT on
>>>> +          Distributed routers, and was initiated from an external
>>>> network (i.e.
>>>> +          it matches <code>ct.new</code>), is committed to the SNAT CT
>>>> zone.
>>>> +          This ensures that replies returning from the SNATed network
>>>> do not
>>>> +          have their source address translated. For details about
>>>> match rules
>>>> +          and priority see section "Egress Table 3: SNAT on Distributed
>>>> +          Routers".
>>>> +        </li>
>>>> +
>>>>          <li>
>>>>            A priority-0 logical flow that matches all packets not
>>>> already
>>>>            handled (match <code>1</code>) and action <code>next;</code>.
>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>>> index 0732486f3..d4a1212d0 100644
>>>> --- a/tests/ovn-northd.at
>>>> +++ b/tests/ovn-northd.at
>>>> @@ -1107,7 +1107,8 @@ ovn_start
>>>>  #
>>>>  # DR is connected to S1 and CR is connected to S2
>>>>
>>>> -check ovn-sbctl chassis-add gw1 geneve 127.0.0.1
>>>> +check ovn-sbctl chassis-add gw1 geneve 127.0.0.1 \
>>>> +  -- set chassis gw1 other_config:ct-commit-to-zone="true"
>>>>
>>>>  check ovn-nbctl lr-add DR
>>>>  check ovn-nbctl lrp-add DR DR-S1 02:ac:10:01:00:01 172.16.1.1/24
>>>> @@ -1135,6 +1136,7 @@ echo "CR-LRP UUID is: " $uuid
>>>>  check ovn-nbctl set Logical_Router $cr_uuid options:chassis=gw1
>>>>  check ovn-nbctl --wait=sb sync
>>>>
>>>> +
>>>>
>>>
> nit: Unrelated
>

You mean the unnecessary extra line, right? ack.


>
>  ovn-nbctl create Address_Set name=allowed_range addresses=\"1.1.1.1\"
>>>>  ovn-nbctl create Address_Set name=disallowed_range
>>>> addresses=\"2.2.2.2\"
>>>>
>>>> @@ -1155,6 +1157,7 @@ AT_CAPTURE_FILE([crflows])
>>>>  AT_CHECK([grep -e "lr_out_snat" drflows | ovn_strip_lflows], [0], [dnl
>>>>    table=??(lr_out_snat        ), priority=0    , match=(1),
>>>> action=(next;)
>>>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns),
>>>> action=(next;)
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 50.0.0.11 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>>>> ip4.src == $allowed_range), action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 50.0.0.11 && outport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>>>> ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.16.1.1);)
>>>>  ])
>>>>
>>>> @@ -1185,6 +1188,7 @@ AT_CAPTURE_FILE([crflows2])
>>>>  AT_CHECK([grep -e "lr_out_snat" drflows2 | ovn_strip_lflows], [0], [dnl
>>>>    table=??(lr_out_snat        ), priority=0    , match=(1),
>>>> action=(next;)
>>>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns),
>>>> action=(next;)
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 50.0.0.11 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1")),
>>>> action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 50.0.0.11 && outport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>>>> (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.1);)
>>>>    table=??(lr_out_snat        ), priority=163  , match=(ip && ip4.src
>>>> == 50.0.0.11 && outport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>>>> ip4.dst == $disallowed_range), action=(next;)
>>>>  ])
>>>> @@ -1279,7 +1283,7 @@ AT_CHECK([grep -e "lr_out_snat" crflows5 |
>>>> ovn_strip_lflows], [0], [dnl
>>>>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src
>>>> == 50.0.0.11 && ip4.dst == $allowed_range), action=(ip4.src=172.16.1.2;
>>>> next;)
>>>>  ])
>>>>
>>>> -# Stateful FIP with DISALLOWED_IPs
>>>> +# Stateless FIP with DISALLOWED_IPs
>>>>  ovn-nbctl lr-nat-del DR dnat_and_snat  172.16.1.2
>>>>  ovn-nbctl lr-nat-del CR dnat_and_snat  172.16.1.2
>>>>
>>>> @@ -5489,7 +5493,8 @@ AT_CHECK([grep "lr_out_snat" lr0flows |
>>>> ovn_strip_lflows], [0], [dnl
>>>>
>>>>  check ovn-sbctl chassis-add gw1 geneve 127.0.0.1 \
>>>>    -- set chassis gw1 other_config:ct-no-masked-label="true" \
>>>> -  -- set chassis gw1 other_config:ovn-ct-lb-related="true"
>>>> +  -- set chassis gw1 other_config:ovn-ct-lb-related="true" \
>>>> +  -- set chassis gw1 other_config:ct-commit-to-zone="true"
>>>>
>>>>  # Create a distributed gw port on lr0
>>>>  check ovn-nbctl ls-add public
>>>> @@ -5590,12 +5595,16 @@ AT_CHECK([grep "lr_out_undnat" lr0flows |
>>>> ovn_strip_lflows], [0], [dnl
>>>>
>>>>  AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0],
>>>> [dnl
>>>>    table=??(lr_out_post_undnat ), priority=0    , match=(1),
>>>> action=(next;)
>>>> +  table=??(lr_out_post_undnat ), priority=70   , match=(ip && ip4.src
>>>> == 10.0.0.0/24 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat; )
>>>> +  table=??(lr_out_post_undnat ), priority=70   , match=(ip && ip4.src
>>>> == 10.0.0.10 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat; )
>>>>  ])
>>>>
>>>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>>>    table=??(lr_out_snat        ), priority=0    , match=(1),
>>>> action=(next;)
>>>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns),
>>>> action=(next;)
>>>> +  table=??(lr_out_snat        ), priority=153  , match=(ip && ip4.dst
>>>> == 10.0.0.0/24 && inport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=153  , match=(ip && ip4.src
>>>> == 10.0.0.0/24 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.168.0.10);)
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 10.0.0.10 && inport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 10.0.0.10 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.168.0.30);)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 10.0.0.3 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.168.0.20);)
>>>>  ])
>>>> @@ -5738,12 +5747,16 @@ AT_CHECK([grep "lr_out_undnat" lr0flows |
>>>> ovn_strip_lflows], [0], [dnl
>>>>
>>>>  AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0],
>>>> [dnl
>>>>    table=??(lr_out_post_undnat ), priority=0    , match=(1),
>>>> action=(next;)
>>>> +  table=??(lr_out_post_undnat ), priority=70   , match=(ip && ip4.src
>>>> == 10.0.0.0/24 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat; )
>>>> +  table=??(lr_out_post_undnat ), priority=70   , match=(ip && ip4.src
>>>> == 10.0.0.10 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat; )
>>>>  ])
>>>>
>>>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>>>    table=??(lr_out_snat        ), priority=0    , match=(1),
>>>> action=(next;)
>>>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns),
>>>> action=(next;)
>>>> +  table=??(lr_out_snat        ), priority=153  , match=(ip && ip4.dst
>>>> == 10.0.0.0/24 && inport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=153  , match=(ip && ip4.src
>>>> == 10.0.0.0/24 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.168.0.10);)
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 10.0.0.10 && inport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public")), action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 10.0.0.10 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.168.0.30);)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 10.0.0.3 && outport == "lr0-public" &&
>>>> is_chassis_resident("cr-lr0-public") && (!ct.trk || !ct.rpl)),
>>>> action=(ct_snat(172.168.0.20);)
>>>>  ])
>>>>
>>> @@ -7576,9 +7589,14 @@ ovn_start
>>>>  # Validate SNAT, DNAT and DNAT_AND_SNAT behavior with multiple
>>>>  # distributed gateway LRPs.
>>>>
>>>> -check ovn-sbctl chassis-add gw1 geneve 127.0.0.1
>>>> -check ovn-sbctl chassis-add gw2 geneve 128.0.0.1
>>>> -check ovn-sbctl chassis-add gw3 geneve 129.0.0.1
>>>> +check ovn-sbctl chassis-add gw1 geneve 127.0.0.1 \
>>>> +  -- set chassis gw1 other_config:ct-commit-to-zone="true"
>>>> +
>>>> +check ovn-sbctl chassis-add gw2 geneve 128.0.0.1 \
>>>> +  -- set chassis gw2 other_config:ct-commit-to-zone="true"
>>>> +
>>>> +check ovn-sbctl chassis-add gw3 geneve 129.0.0.1 \
>>>> +  -- set chassis gw3 other_config:ct-commit-to-zone="true"
>>>>
>>>>  check ovn-nbctl lr-add DR
>>>>  check ovn-nbctl lrp-add DR DR-S1 02:ac:10:01:00:01 172.16.1.1/24
>>>> @@ -7643,6 +7661,9 @@ AT_CHECK([grep lr_in_unsnat lrflows | grep
>>>> ct_snat | ovn_strip_lflows], [0], [dn
>>>>  ])
>>>>
>>>>  AT_CHECK([grep lr_out_snat lrflows | grep ct_snat | ovn_strip_lflows],
>>>> [0], [dnl
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 20.0.0.10 && inport == "DR-S1" && is_chassis_resident("cr-DR-S1")),
>>>> action=(ct_snat;)
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 20.0.0.10 && inport == "DR-S2" && is_chassis_resident("cr-DR-S2")),
>>>> action=(ct_snat;)
>>>> +  table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.dst
>>>> == 20.0.0.10 && inport == "DR-S3" && is_chassis_resident("cr-DR-S3")),
>>>> action=(ct_snat;)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 20.0.0.10 && outport == "DR-S1" && is_chassis_resident("cr-DR-S1") &&
>>>> (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.10);)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 20.0.0.10 && outport == "DR-S2" && is_chassis_resident("cr-DR-S2") &&
>>>> (!ct.trk || !ct.rpl)), action=(ct_snat(10.0.0.10);)
>>>>    table=??(lr_out_snat        ), priority=161  , match=(ip && ip4.src
>>>> == 20.0.0.10 && outport == "DR-S3" && is_chassis_resident("cr-DR-S3") &&
>>>> (!ct.trk || !ct.rpl)), action=(ct_snat(192.168.0.10);)
>>>>
>>>
> I would actually expect one of the tests in ovn-northd.at to also check
> the flow that does the ct_commit(snat).
>

ack.


>
>
>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>> index c22c7882f..80e9b7d0b 100644
>>>> --- a/tests/system-ovn.at
>>>> +++ b/tests/system-ovn.at
>>>> @@ -3572,6 +3572,40 @@ NS_CHECK_EXEC([foo1], [ping -q -c 3 -i 0.3 -w 2
>>>> 10.0.0.1 | FORMAT_PING], \
>>>>  3 packets transmitted, 3 received, 0% packet loss, time 0ms
>>>>  ])
>>>>
>>>> +
>>>> +# test_connectivity_from_ext takes parameters 'vm' and 'ip'. It tests
>>>> +# icmp, udp and tcp connectivity from external network to the 'vm' on
>>>> +# the specified 'ip'.
>>>> +test_connectivity_from_ext() {
>>>> +    local vm=$1; shift
>>>> +    local ip=$1; shift
>>>> +
>>>> +    # Start listening daemons for UDP and TCP connections
>>>> +    NETNS_DAEMONIZE($vm, [nc -l -u 1234], [nc-$vm-$ip-udp.pid])
>>>> +    NETNS_DAEMONIZE($vm, [nc -l -k 1235], [nc-$vm-$ip-tcp.pid])
>>>> +
>>>> +    # Ensure that vm can be pinged on the specified IP
>>>> +    NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 $ip |
>>>> FORMAT_PING], \
>>>> +    [0], [dnl
>>>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>>>> +])
>>>> +
>>>> +    # Perform two consecutive UDP connections to the specified IP
>>>> +    NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z])
>>>> +    NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z])
>>>> +
>>>> +    # Send data over TCP connection to the specified IP
>>>> +    NS_CHECK_EXEC([alice1], [echo "TCP test" | nc --send-only $ip
>>>> 1235])
>>>>
>>>
> Do we actually need to send data? I think -z is also enough for TCP to
> prove that the connection works.
>

Ordinarily nc with -z would be enough. However in this case even the
original code would pass the test without sending any payload. As mentioned
in the my ovs-discuss thread about this issue [0], TCP handshake worked
fine. The problem manifests only after the actual data was sent.

[0]
https://mail.openvswitch.org/pipermail/ovs-discuss/2024-January/052901.html


>
>
>> +}
>>>> +
>>>> +# Test access from external network to the internal IP of a VM that
>>>> +# has also configured DNAT
>>>> +test_connectivity_from_ext foo1 192.168.1.2
>>>> +
>>>> +# Test access from external network to the internal IP of a VM that
>>>> +# does not have DNAT
>>>> +test_connectivity_from_ext bar1 192.168.2.2
>>>> +
>>>>  OVS_WAIT_UNTIL([
>>>>      total_pkts=$(cat ext-net.pcap | wc -l)
>>>>      test "${total_pkts}" = "3"
>>>> @@ -3738,6 +3772,39 @@ sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0],
>>>> [dnl
>>>>
>>>>  
>>>> icmpv6,orig=(src=fd12::2,dst=fd20::2,id=<cleared>,type=128,code=0),reply=(src=fd20::2,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>>>  ])
>>>>
>>>> +# test_connectivity_from_ext takes parameters 'vm' and 'ip'. It tests
>>>> +# icmp, udp and tcp connectivity from external network to the 'vm' on
>>>> +# the specified 'ip'.
>>>> +test_connectivity_from_ext() {
>>>> +    local vm=$1; shift
>>>> +    local ip=$1; shift
>>>> +
>>>> +    # Start listening daemons for UDP and TCP connections
>>>> +    NETNS_DAEMONIZE($vm, [nc -l -u 1234], [nc-$vm-$ip-udp.pid])
>>>> +    NETNS_DAEMONIZE($vm, [nc -l -k 1235], [nc-$vm-$ip-tcp.pid])
>>>> +
>>>> +    # Ensure that vm can be pinged on the specified IP
>>>> +    NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 $ip |
>>>> FORMAT_PING], \
>>>> +    [0], [dnl
>>>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>>>> +])
>>>> +
>>>> +    # Perform two consecutive UDP connections to the specified IP
>>>> +    NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z])
>>>> +    NS_CHECK_EXEC([alice1], [nc -u $ip 1234 -p 2000 -z])
>>>> +
>>>> +    # Send data over TCP connection to the specified IP
>>>> +    NS_CHECK_EXEC([alice1], [echo "TCP test" | nc --send-only $ip
>>>> 1235])
>>>>
>>>
>
> Same as above.
>
> +}
>>>> +
>>>> +# Test access from external network to the internal IP of a VM that
>>>> +# has also configured DNAT
>>>> +test_connectivity_from_ext foo1 fd11::2
>>>> +
>>>> +# Test access from external network to the internal IP of a VM that
>>>> +# does not have DNAT
>>>> +test_connectivity_from_ext bar1 fd12::2
>>>> +
>>>>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>
>>>>  as ovn-sb
>>>> @@ -3918,6 +3985,7 @@ NS_CHECK_EXEC([foo2], [ping -q -c 3 -i 0.3 -w 2
>>>> 172.16.1.4 | FORMAT_PING], \
>>>>  AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep icmp |
>>>> FORMAT_CT(172.16.1.1) | \
>>>>  sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
>>>>
>>>>  
>>>> icmp,orig=(src=172.16.1.1,dst=172.16.1.4,id=<cleared>,type=8,code=0),reply=(src=192.168.2.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>>>>
>>>> +icmp,orig=(src=172.16.1.1,dst=192.168.2.2,id=<cleared>,type=8,code=0),reply=(src=192.168.2.2,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>>>>
>>>>  
>>>> icmp,orig=(src=192.168.1.3,dst=172.16.1.4,id=<cleared>,type=8,code=0),reply=(src=172.16.1.4,dst=172.16.1.1,id=<cleared>,type=0,code=0),zone=<cleared>
>>>>  ])
>>>>
>>>> @@ -4086,6 +4154,7 @@ NS_CHECK_EXEC([foo2], [ping -q -c 3 -i 0.3 -w 2
>>>> fd20::4 | FORMAT_PING], \
>>>>  AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fd20::1) | \
>>>>  sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
>>>>
>>>>  
>>>> icmpv6,orig=(src=fd11::3,dst=fd20::4,id=<cleared>,type=128,code=0),reply=(src=fd20::4,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>>>
>>>> +icmpv6,orig=(src=fd20::1,dst=fd12::2,id=<cleared>,type=128,code=0),reply=(src=fd12::2,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>>>
>>>>  
>>>> icmpv6,orig=(src=fd20::1,dst=fd20::4,id=<cleared>,type=128,code=0),reply=(src=fd12::2,dst=fd20::1,id=<cleared>,type=129,code=0),zone=<cleared>
>>>>  ])
>>>>
>>>> --
>>>> 2.40.1
>>>>
>>>>
>>>
>>> --
>>> Best Regards,
>>> Martin Kalcok.
>>>
>>
>>
>> --
>> Best Regards,
>> Martin Kalcok.
>>
>
> Thanks,
> Ales
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA <https://www.redhat.com>
>
> [email protected]
> <https://red.ht/sig>
>


-- 
Best Regards,
Martin Kalcok.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to