On Wed, Dec 14, 2022 at 2:52 PM Lorenzo Bianconi <
[email protected]> wrote:
>
> In the current codebase ct_commit {} action clears ct_state metadata of
> the incoming packet. This behaviour introduces an issue if we need to
> check the connection tracking state in the subsequent pipeline stages,
> e.g. for hairpin traffic:
>
> table=14(ls_in_pre_hairpin  ), priority=100  , match=(ip && ct.trk),
action=(reg0[6] = chk_lb_hairpin(); reg0[12] = chk_lb_hairpin_reply();
next;)
>
> Fix the issue moving PRE_HAIRPIN,NAT_HAIRPIN and HAIRPIN stages before
> ACL_AFTER_LB and STATEFUL ones.
>
> Suggested-by: Han Zhou <[email protected]>
> Suggested-by: Dumitru Ceara <[email protected]>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2103086
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> ---
> Changes since v3:
> - swap hairpin stages with acl_after_lb and stateful ones
> Changes since v2:
> - add ovn system-tests for ct_commit_continue
> Changes since v1:
> - introduce new nested action ct_commit_continue instead of modifying
>   ct_commit_v2
> ---
>  northd/northd.c         |  10 +--
>  northd/ovn-northd.8.xml | 176 ++++++++++++++++++++--------------------
>  tests/ovn-northd.at     |   6 +-
>  tests/system-ovn.at     |  24 +++++-
>  4 files changed, 118 insertions(+), 98 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 7c48bb3b4..8fb78b99a 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -125,11 +125,11 @@ enum ovn_stage {
>      PIPELINE_STAGE(SWITCH, IN,  LB_AFF_CHECK,  11, "ls_in_lb_aff_check")
 \
>      PIPELINE_STAGE(SWITCH, IN,  LB,            12, "ls_in_lb")
 \
>      PIPELINE_STAGE(SWITCH, IN,  LB_AFF_LEARN,  13, "ls_in_lb_aff_learn")
 \
> -    PIPELINE_STAGE(SWITCH, IN,  ACL_AFTER_LB,  14, "ls_in_acl_after_lb")
 \
> -    PIPELINE_STAGE(SWITCH, IN,  STATEFUL,      15, "ls_in_stateful")
 \
> -    PIPELINE_STAGE(SWITCH, IN,  PRE_HAIRPIN,   16, "ls_in_pre_hairpin")
  \
> -    PIPELINE_STAGE(SWITCH, IN,  NAT_HAIRPIN,   17, "ls_in_nat_hairpin")
  \
> -    PIPELINE_STAGE(SWITCH, IN,  HAIRPIN,       18, "ls_in_hairpin")
  \
> +    PIPELINE_STAGE(SWITCH, IN,  PRE_HAIRPIN,   14, "ls_in_pre_hairpin")
  \
> +    PIPELINE_STAGE(SWITCH, IN,  NAT_HAIRPIN,   15, "ls_in_nat_hairpin")
  \
> +    PIPELINE_STAGE(SWITCH, IN,  HAIRPIN,       16, "ls_in_hairpin")
  \
> +    PIPELINE_STAGE(SWITCH, IN,  ACL_AFTER_LB,  17, "ls_in_acl_after_lb")
 \
> +    PIPELINE_STAGE(SWITCH, IN,  STATEFUL,      18, "ls_in_stateful")
 \
>      PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,    19, "ls_in_arp_rsp")
  \
>      PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,  20, "ls_in_dhcp_options")
 \
>      PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 21,
"ls_in_dhcp_response") \
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index dffbba96d..073e40106 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -1028,92 +1028,7 @@
>        </li>
>      </ul>
>
> -    <h3>Ingress table 14: <code>from-lport</code> ACLs after LB</h3>
> -
> -    <p>
> -      Logical flows in this table closely reproduce those in the
> -      <code>ACL</code> table in the <code>OVN_Northbound</code> database
> -      for the <code>from-lport</code> direction with the option
> -      <code>apply-after-lb</code> set to <code>true</code>.
> -      The <code>priority</code> values from the <code>ACL</code> table
have a
> -      limited range and have 1000 added to them to leave room for OVN
default
> -      flows at both higher and lower priorities.
> -    </p>
> -
> -    <ul>
> -      <li>
> -        <code>allow</code> apply-after-lb ACLs translate into logical
flows
> -        with the <code>next;</code> action.  If there are any stateful
ACLs
> -        (including both before-lb and after-lb ACLs)
> -        on this datapath, then <code>allow</code> ACLs translate to
> -        <code>ct_commit; next;</code> (which acts as a hint for the next
tables
> -        to commit the connection to conntrack). In case the
<code>ACL</code>
> -        has a label then <code>reg3</code> is loaded with the label
value and
> -        <code>reg0[13]</code> bit is set to 1 (which acts as a hint for
the
> -        next tables to commit the label to conntrack).
> -      </li>
> -      <li>
> -        <code>allow-related</code> apply-after-lb ACLs translate into
logical
> -        flows with the <code>ct_commit(ct_label=0/1); next;</code>
actions
> -        for new connections and <code>reg0[1] = 1; next;</code> for
existing
> -        connections.  In case the <code>ACL</code> has a label then
> -        <code>reg3</code> is loaded with the label value and
> -        <code>reg0[13]</code> bit is set to 1 (which acts as a hint for
the
> -        next tables to commit the label to conntrack).
> -      </li>
> -      <li>
> -        <code>allow-stateless</code> apply-after-lb ACLs translate into
logical
> -        flows with the <code>next;</code> action.
> -      </li>
> -      <li>
> -        <code>reject</code> apply-after-lb ACLs translate into logical
> -        flows with the
> -        <code>tcp_reset { output &lt;-&gt; inport;
> -        next(pipeline=egress,table=5);}</code>
> -        action for TCP connections,<code>icmp4/icmp6</code> action
> -        for UDP connections, and <code>sctp_abort {output &lt;-%gt;
inport;
> -        next(pipeline=egress,table=5);}</code> action for SCTP
associations.
> -      </li>
> -      <li>
> -        Other apply-after-lb ACLs translate to <code>drop;</code> for new
> -        or untracked connections and
<code>ct_commit(ct_label=1/1);</code> for
> -        known connections.  Setting <code>ct_label</code> marks a
connection
> -        as one that was previously allowed, but should no longer be
> -        allowed due to a policy change.
> -      </li>
> -    </ul>
> -
> -    <ul>
> -      <li>
> -        One priority-0 fallback flow that matches all packets and
advances to
> -        the next table.
> -      </li>
> -    </ul>
> -
> -    <h3>Ingress Table 15: Stateful</h3>
> -
> -    <ul>
> -      <li>
> -        A priority 100 flow is added which commits the packet to the
conntrack
> -        and sets the most significant 32-bits of <code>ct_label</code>
with the
> -        <code>reg3</code> value based on the hint provided by previous
tables
> -        (with a match for <code>reg0[1] == 1 &amp;&amp; reg0[13] ==
1</code>).
> -        This is used by the <code>ACLs</code> with label to commit the
label
> -        value to conntrack.
> -      </li>
> -
> -      <li>
> -        For <code>ACLs</code> without label, a second priority-100 flow
commits
> -        packets to connection tracker using <code>ct_commit; next;</code>
> -        action based on a hint provided by the previous tables (with a
match
> -        for <code>reg0[1] == 1 &amp;&amp; reg0[13] == 0</code>).
> -      </li>
> -      <li>
> -        A priority-0 flow that simply moves traffic to the next table.
> -      </li>
> -    </ul>
> -
> -    <h3>Ingress Table 16: Pre-Hairpin</h3>
> +    <h3>Ingress Table 14: Pre-Hairpin</h3>
>      <ul>
>        <li>
>          If the logical switch has load balancer(s) configured, then a
> @@ -1131,7 +1046,7 @@
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 17: Nat-Hairpin</h3>
> +    <h3>Ingress Table 15: Nat-Hairpin</h3>
>      <ul>
>        <li>
>           If the logical switch has load balancer(s) configured, then a
> @@ -1166,7 +1081,7 @@
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 18: Hairpin</h3>
> +    <h3>Ingress Table 16: Hairpin</h3>
>      <ul>
>        <li>
>          <p>
> @@ -1200,6 +1115,91 @@
>        </li>
>      </ul>
>
> +    <h3>Ingress table 17: <code>from-lport</code> ACLs after LB</h3>
> +
> +    <p>
> +      Logical flows in this table closely reproduce those in the
> +      <code>ACL</code> table in the <code>OVN_Northbound</code> database
> +      for the <code>from-lport</code> direction with the option
> +      <code>apply-after-lb</code> set to <code>true</code>.
> +      The <code>priority</code> values from the <code>ACL</code> table
have a
> +      limited range and have 1000 added to them to leave room for OVN
default
> +      flows at both higher and lower priorities.
> +    </p>
> +
> +    <ul>
> +      <li>
> +        <code>allow</code> apply-after-lb ACLs translate into logical
flows
> +        with the <code>next;</code> action.  If there are any stateful
ACLs
> +        (including both before-lb and after-lb ACLs)
> +        on this datapath, then <code>allow</code> ACLs translate to
> +        <code>ct_commit; next;</code> (which acts as a hint for the next
tables
> +        to commit the connection to conntrack). In case the
<code>ACL</code>
> +        has a label then <code>reg3</code> is loaded with the label
value and
> +        <code>reg0[13]</code> bit is set to 1 (which acts as a hint for
the
> +        next tables to commit the label to conntrack).
> +      </li>
> +      <li>
> +        <code>allow-related</code> apply-after-lb ACLs translate into
logical
> +        flows with the <code>ct_commit(ct_label=0/1); next;</code>
actions
> +        for new connections and <code>reg0[1] = 1; next;</code> for
existing
> +        connections.  In case the <code>ACL</code> has a label then
> +        <code>reg3</code> is loaded with the label value and
> +        <code>reg0[13]</code> bit is set to 1 (which acts as a hint for
the
> +        next tables to commit the label to conntrack).
> +      </li>
> +      <li>
> +        <code>allow-stateless</code> apply-after-lb ACLs translate into
logical
> +        flows with the <code>next;</code> action.
> +      </li>
> +      <li>
> +        <code>reject</code> apply-after-lb ACLs translate into logical
> +        flows with the
> +        <code>tcp_reset { output &lt;-&gt; inport;
> +        next(pipeline=egress,table=5);}</code>
> +        action for TCP connections,<code>icmp4/icmp6</code> action
> +        for UDP connections, and <code>sctp_abort {output &lt;-%gt;
inport;
> +        next(pipeline=egress,table=5);}</code> action for SCTP
associations.
> +      </li>
> +      <li>
> +        Other apply-after-lb ACLs translate to <code>drop;</code> for new
> +        or untracked connections and
<code>ct_commit(ct_label=1/1);</code> for
> +        known connections.  Setting <code>ct_label</code> marks a
connection
> +        as one that was previously allowed, but should no longer be
> +        allowed due to a policy change.
> +      </li>
> +    </ul>
> +
> +    <ul>
> +      <li>
> +        One priority-0 fallback flow that matches all packets and
advances to
> +        the next table.
> +      </li>
> +    </ul>
> +
> +    <h3>Ingress Table 18: Stateful</h3>
> +
> +    <ul>
> +      <li>
> +        A priority 100 flow is added which commits the packet to the
conntrack
> +        and sets the most significant 32-bits of <code>ct_label</code>
with the
> +        <code>reg3</code> value based on the hint provided by previous
tables
> +        (with a match for <code>reg0[1] == 1 &amp;&amp; reg0[13] ==
1</code>).
> +        This is used by the <code>ACLs</code> with label to commit the
label
> +        value to conntrack.
> +      </li>
> +
> +      <li>
> +        For <code>ACLs</code> without label, a second priority-100 flow
commits
> +        packets to connection tracker using <code>ct_commit; next;</code>
> +        action based on a hint provided by the previous tables (with a
match
> +        for <code>reg0[1] == 1 &amp;&amp; reg0[13] == 0</code>).
> +      </li>
> +      <li>
> +        A priority-0 flow that simply moves traffic to the next table.
> +      </li>
> +    </ul>
> +
>      <h3>Ingress Table 19: ARP/ND responder</h3>
>
>      <p>
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index ca4263eac..6ca1b6ad4 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2380,7 +2380,7 @@ check ovn-nbctl --wait=sb \
>      -- ls-lb-add ls lb
>
>  AT_CHECK([ovn-sbctl lflow-list ls | grep -e ls_in_acl_hint -e
ls_out_acl_hint -e ls_in_acl -e ls_out_acl | sort], [0], [dnl
> -  table=14(ls_in_acl_after_lb ), priority=0    , match=(1),
action=(next;)
> +  table=17(ls_in_acl_after_lb ), priority=0    , match=(1),
action=(next;)
>    table=3 (ls_out_acl_hint    ), priority=0    , match=(1),
action=(next;)
>    table=3 (ls_out_acl_hint    ), priority=1    , match=(ct.est &&
ct_mark.blocked == 0), action=(reg0[[10]] = 1; next;)
>    table=3 (ls_out_acl_hint    ), priority=2    , match=(ct.est &&
ct_mark.blocked == 1), action=(reg0[[9]] = 1; next;)
> @@ -2423,7 +2423,7 @@ ovn-nbctl --wait=sb clear logical_switch ls acls
>  ovn-nbctl --wait=sb clear logical_switch ls load_balancer
>
>  AT_CHECK([ovn-sbctl lflow-list ls | grep -e ls_in_acl_hint -e
ls_out_acl_hint -e ls_in_acl -e ls_out_acl | sort], [0], [dnl
> -  table=14(ls_in_acl_after_lb ), priority=0    , match=(1),
action=(next;)
> +  table=17(ls_in_acl_after_lb ), priority=0    , match=(1),
action=(next;)
>    table=3 (ls_out_acl_hint    ), priority=65535, match=(1),
action=(next;)
>    table=4 (ls_out_acl         ), priority=65535, match=(1),
action=(next;)
>    table=7 (ls_in_acl_hint     ), priority=65535, match=(1),
action=(next;)
> @@ -7638,7 +7638,7 @@ sort | sed 's/table=../table=??/' ], [0], [dnl
>    table=??(ls_in_check_port_sec), priority=100  , match=(vlan.present),
action=(drop;)
>    table=??(ls_in_check_port_sec), priority=50   , match=(1),
action=(reg0[[15]] = check_in_port_sec(); next;)
>    table=??(ls_in_check_port_sec), priority=70   , match=(inport ==
"localnetport"), action=(set_queue(10); reg0[[15]] = check_in_port_sec();
next;)
> -  table=??(ls_in_check_port_sec), priority=70   , match=(inport ==
"sw0p1"), action=(reg0[[14]] = 1; next(pipeline=ingress, table=18);)
> +  table=??(ls_in_check_port_sec), priority=70   , match=(inport ==
"sw0p1"), action=(reg0[[14]] = 1; next(pipeline=ingress, table=16);)
>    table=??(ls_in_check_port_sec), priority=70   , match=(inport ==
"sw0p2"), action=(set_queue(10); reg0[[15]] = check_in_port_sec(); next;)
>    table=??(ls_in_apply_port_sec), priority=0    , match=(1),
action=(next;)
>    table=??(ls_in_apply_port_sec), priority=50   , match=(reg0[[15]] ==
1), action=(drop;)
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 42c130f2f..e999cca4d 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -4664,7 +4664,7 @@ ADD_VETH(lsp, lsp, br-int, "42.42.42.1/24",
"00:00:00:00:00:01", \
>  ovn-nbctl --wait=hv -t 3 sync
>
>  # Start IPv4 TCP server on lsp.
> -NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l 42.42.42.1 4041 &], [0])
> +NETNS_DAEMONIZE([lsp], [nc -l -k 42.42.42.1 4041], [lsp0.pid])
>
>  # Check that IPv4 TCP hairpin connection succeeds on both VIPs.
>  NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore])
> @@ -4686,6 +4686,16 @@ OVS_WAIT_UNTIL([
>      test "${total_pkts}" = "2"
>  ])
>
> +ovn-nbctl pg-add pg0 lsp
> +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1004 "ip4 && ip4.dst
== 10.0.0.2" drop
> +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip4 && tcp"
allow-related
> +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip4 && udp" allow
> +ovn-nbctl --wait=hv sync
> +
> +## Check that IPv4 TCP hairpin connection succeeds on both VIPs.
> +NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore])
> +NS_CHECK_EXEC([lsp], [nc 88.88.88.89 8080 -z], [0], [ignore], [ignore])
> +
>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
> @@ -4750,7 +4760,7 @@ OVS_WAIT_UNTIL([test "$(ip netns exec lsp ip a |
grep 4200::1 | grep tentative)"
>  ovn-nbctl --wait=hv -t 3 sync
>
>  # Start IPv6 TCP server on lsp.
> -NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l 4200::1 4041 &], [0])
> +NETNS_DAEMONIZE([lsp], [nc -l -k 4200::1 4041], [lsp0.pid])
>
>  # Check that IPv6 TCP hairpin connection succeeds on both VIPs.
>  NS_CHECK_EXEC([lsp], [nc 8800::0088 8080 -z], [0], [ignore], [ignore])
> @@ -4772,6 +4782,16 @@ OVS_WAIT_UNTIL([
>      test "${total_pkts}" = "2"
>  ])
>
> +ovn-nbctl pg-add pg0 lsp
> +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip6 && tcp"
allow-related
> +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip6 && udp" allow
> +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1000 "ip6" drop
> +ovn-nbctl --wait=hv sync
> +
> +# Check that IPv6 TCP hairpin connection succeeds on both VIPs.
> +NS_CHECK_EXEC([lsp], [nc 8800::0088 8080 -z], [0], [ignore], [ignore])
> +NS_CHECK_EXEC([lsp], [nc 8800::0089 8080 -z], [0], [ignore], [ignore])
> +
>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
> --
> 2.38.1
>

Thanks Lorenzo.
Acked-by: Han Zhou <[email protected]>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to