On Wed, Dec 14, 2022 at 2:52 PM Lorenzo Bianconi < [email protected]> wrote: > > In the current codebase ct_commit {} action clears ct_state metadata of > the incoming packet. This behaviour introduces an issue if we need to > check the connection tracking state in the subsequent pipeline stages, > e.g. for hairpin traffic: > > table=14(ls_in_pre_hairpin ), priority=100 , match=(ip && ct.trk), action=(reg0[6] = chk_lb_hairpin(); reg0[12] = chk_lb_hairpin_reply(); next;) > > Fix the issue moving PRE_HAIRPIN,NAT_HAIRPIN and HAIRPIN stages before > ACL_AFTER_LB and STATEFUL ones. > > Suggested-by: Han Zhou <[email protected]> > Suggested-by: Dumitru Ceara <[email protected]> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2103086 > Signed-off-by: Lorenzo Bianconi <[email protected]> > --- > Changes since v3: > - swap hairpin stages with acl_after_lb and stateful ones > Changes since v2: > - add ovn system-tests for ct_commit_continue > Changes since v1: > - introduce new nested action ct_commit_continue instead of modifying > ct_commit_v2 > --- > northd/northd.c | 10 +-- > northd/ovn-northd.8.xml | 176 ++++++++++++++++++++-------------------- > tests/ovn-northd.at | 6 +- > tests/system-ovn.at | 24 +++++- > 4 files changed, 118 insertions(+), 98 deletions(-) > > diff --git a/northd/northd.c b/northd/northd.c > index 7c48bb3b4..8fb78b99a 100644 > --- a/northd/northd.c > +++ b/northd/northd.c > @@ -125,11 +125,11 @@ enum ovn_stage { > PIPELINE_STAGE(SWITCH, IN, LB_AFF_CHECK, 11, "ls_in_lb_aff_check") \ > PIPELINE_STAGE(SWITCH, IN, LB, 12, "ls_in_lb") \ > PIPELINE_STAGE(SWITCH, IN, LB_AFF_LEARN, 13, "ls_in_lb_aff_learn") \ > - PIPELINE_STAGE(SWITCH, IN, ACL_AFTER_LB, 14, "ls_in_acl_after_lb") \ > - PIPELINE_STAGE(SWITCH, IN, STATEFUL, 15, "ls_in_stateful") \ > - PIPELINE_STAGE(SWITCH, IN, PRE_HAIRPIN, 16, "ls_in_pre_hairpin") \ > - PIPELINE_STAGE(SWITCH, IN, NAT_HAIRPIN, 17, "ls_in_nat_hairpin") \ > - PIPELINE_STAGE(SWITCH, IN, HAIRPIN, 18, "ls_in_hairpin") \ > + PIPELINE_STAGE(SWITCH, IN, PRE_HAIRPIN, 14, "ls_in_pre_hairpin") \ > + PIPELINE_STAGE(SWITCH, IN, NAT_HAIRPIN, 15, "ls_in_nat_hairpin") \ > + PIPELINE_STAGE(SWITCH, IN, HAIRPIN, 16, "ls_in_hairpin") \ > + PIPELINE_STAGE(SWITCH, IN, ACL_AFTER_LB, 17, "ls_in_acl_after_lb") \ > + PIPELINE_STAGE(SWITCH, IN, STATEFUL, 18, "ls_in_stateful") \ > PIPELINE_STAGE(SWITCH, IN, ARP_ND_RSP, 19, "ls_in_arp_rsp") \ > PIPELINE_STAGE(SWITCH, IN, DHCP_OPTIONS, 20, "ls_in_dhcp_options") \ > PIPELINE_STAGE(SWITCH, IN, DHCP_RESPONSE, 21, "ls_in_dhcp_response") \ > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > index dffbba96d..073e40106 100644 > --- a/northd/ovn-northd.8.xml > +++ b/northd/ovn-northd.8.xml > @@ -1028,92 +1028,7 @@ > </li> > </ul> > > - <h3>Ingress table 14: <code>from-lport</code> ACLs after LB</h3> > - > - <p> > - Logical flows in this table closely reproduce those in the > - <code>ACL</code> table in the <code>OVN_Northbound</code> database > - for the <code>from-lport</code> direction with the option > - <code>apply-after-lb</code> set to <code>true</code>. > - The <code>priority</code> values from the <code>ACL</code> table have a > - limited range and have 1000 added to them to leave room for OVN default > - flows at both higher and lower priorities. > - </p> > - > - <ul> > - <li> > - <code>allow</code> apply-after-lb ACLs translate into logical flows > - with the <code>next;</code> action. If there are any stateful ACLs > - (including both before-lb and after-lb ACLs) > - on this datapath, then <code>allow</code> ACLs translate to > - <code>ct_commit; next;</code> (which acts as a hint for the next tables > - to commit the connection to conntrack). In case the <code>ACL</code> > - has a label then <code>reg3</code> is loaded with the label value and > - <code>reg0[13]</code> bit is set to 1 (which acts as a hint for the > - next tables to commit the label to conntrack). > - </li> > - <li> > - <code>allow-related</code> apply-after-lb ACLs translate into logical > - flows with the <code>ct_commit(ct_label=0/1); next;</code> actions > - for new connections and <code>reg0[1] = 1; next;</code> for existing > - connections. In case the <code>ACL</code> has a label then > - <code>reg3</code> is loaded with the label value and > - <code>reg0[13]</code> bit is set to 1 (which acts as a hint for the > - next tables to commit the label to conntrack). > - </li> > - <li> > - <code>allow-stateless</code> apply-after-lb ACLs translate into logical > - flows with the <code>next;</code> action. > - </li> > - <li> > - <code>reject</code> apply-after-lb ACLs translate into logical > - flows with the > - <code>tcp_reset { output <-> inport; > - next(pipeline=egress,table=5);}</code> > - action for TCP connections,<code>icmp4/icmp6</code> action > - for UDP connections, and <code>sctp_abort {output <-%gt; inport; > - next(pipeline=egress,table=5);}</code> action for SCTP associations. > - </li> > - <li> > - Other apply-after-lb ACLs translate to <code>drop;</code> for new > - or untracked connections and <code>ct_commit(ct_label=1/1);</code> for > - known connections. Setting <code>ct_label</code> marks a connection > - as one that was previously allowed, but should no longer be > - allowed due to a policy change. > - </li> > - </ul> > - > - <ul> > - <li> > - One priority-0 fallback flow that matches all packets and advances to > - the next table. > - </li> > - </ul> > - > - <h3>Ingress Table 15: Stateful</h3> > - > - <ul> > - <li> > - A priority 100 flow is added which commits the packet to the conntrack > - and sets the most significant 32-bits of <code>ct_label</code> with the > - <code>reg3</code> value based on the hint provided by previous tables > - (with a match for <code>reg0[1] == 1 && reg0[13] == 1</code>). > - This is used by the <code>ACLs</code> with label to commit the label > - value to conntrack. > - </li> > - > - <li> > - For <code>ACLs</code> without label, a second priority-100 flow commits > - packets to connection tracker using <code>ct_commit; next;</code> > - action based on a hint provided by the previous tables (with a match > - for <code>reg0[1] == 1 && reg0[13] == 0</code>). > - </li> > - <li> > - A priority-0 flow that simply moves traffic to the next table. > - </li> > - </ul> > - > - <h3>Ingress Table 16: Pre-Hairpin</h3> > + <h3>Ingress Table 14: Pre-Hairpin</h3> > <ul> > <li> > If the logical switch has load balancer(s) configured, then a > @@ -1131,7 +1046,7 @@ > </li> > </ul> > > - <h3>Ingress Table 17: Nat-Hairpin</h3> > + <h3>Ingress Table 15: Nat-Hairpin</h3> > <ul> > <li> > If the logical switch has load balancer(s) configured, then a > @@ -1166,7 +1081,7 @@ > </li> > </ul> > > - <h3>Ingress Table 18: Hairpin</h3> > + <h3>Ingress Table 16: Hairpin</h3> > <ul> > <li> > <p> > @@ -1200,6 +1115,91 @@ > </li> > </ul> > > + <h3>Ingress table 17: <code>from-lport</code> ACLs after LB</h3> > + > + <p> > + Logical flows in this table closely reproduce those in the > + <code>ACL</code> table in the <code>OVN_Northbound</code> database > + for the <code>from-lport</code> direction with the option > + <code>apply-after-lb</code> set to <code>true</code>. > + The <code>priority</code> values from the <code>ACL</code> table have a > + limited range and have 1000 added to them to leave room for OVN default > + flows at both higher and lower priorities. > + </p> > + > + <ul> > + <li> > + <code>allow</code> apply-after-lb ACLs translate into logical flows > + with the <code>next;</code> action. If there are any stateful ACLs > + (including both before-lb and after-lb ACLs) > + on this datapath, then <code>allow</code> ACLs translate to > + <code>ct_commit; next;</code> (which acts as a hint for the next tables > + to commit the connection to conntrack). In case the <code>ACL</code> > + has a label then <code>reg3</code> is loaded with the label value and > + <code>reg0[13]</code> bit is set to 1 (which acts as a hint for the > + next tables to commit the label to conntrack). > + </li> > + <li> > + <code>allow-related</code> apply-after-lb ACLs translate into logical > + flows with the <code>ct_commit(ct_label=0/1); next;</code> actions > + for new connections and <code>reg0[1] = 1; next;</code> for existing > + connections. In case the <code>ACL</code> has a label then > + <code>reg3</code> is loaded with the label value and > + <code>reg0[13]</code> bit is set to 1 (which acts as a hint for the > + next tables to commit the label to conntrack). > + </li> > + <li> > + <code>allow-stateless</code> apply-after-lb ACLs translate into logical > + flows with the <code>next;</code> action. > + </li> > + <li> > + <code>reject</code> apply-after-lb ACLs translate into logical > + flows with the > + <code>tcp_reset { output <-> inport; > + next(pipeline=egress,table=5);}</code> > + action for TCP connections,<code>icmp4/icmp6</code> action > + for UDP connections, and <code>sctp_abort {output <-%gt; inport; > + next(pipeline=egress,table=5);}</code> action for SCTP associations. > + </li> > + <li> > + Other apply-after-lb ACLs translate to <code>drop;</code> for new > + or untracked connections and <code>ct_commit(ct_label=1/1);</code> for > + known connections. Setting <code>ct_label</code> marks a connection > + as one that was previously allowed, but should no longer be > + allowed due to a policy change. > + </li> > + </ul> > + > + <ul> > + <li> > + One priority-0 fallback flow that matches all packets and advances to > + the next table. > + </li> > + </ul> > + > + <h3>Ingress Table 18: Stateful</h3> > + > + <ul> > + <li> > + A priority 100 flow is added which commits the packet to the conntrack > + and sets the most significant 32-bits of <code>ct_label</code> with the > + <code>reg3</code> value based on the hint provided by previous tables > + (with a match for <code>reg0[1] == 1 && reg0[13] == 1</code>). > + This is used by the <code>ACLs</code> with label to commit the label > + value to conntrack. > + </li> > + > + <li> > + For <code>ACLs</code> without label, a second priority-100 flow commits > + packets to connection tracker using <code>ct_commit; next;</code> > + action based on a hint provided by the previous tables (with a match > + for <code>reg0[1] == 1 && reg0[13] == 0</code>). > + </li> > + <li> > + A priority-0 flow that simply moves traffic to the next table. > + </li> > + </ul> > + > <h3>Ingress Table 19: ARP/ND responder</h3> > > <p> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > index ca4263eac..6ca1b6ad4 100644 > --- a/tests/ovn-northd.at > +++ b/tests/ovn-northd.at > @@ -2380,7 +2380,7 @@ check ovn-nbctl --wait=sb \ > -- ls-lb-add ls lb > > AT_CHECK([ovn-sbctl lflow-list ls | grep -e ls_in_acl_hint -e ls_out_acl_hint -e ls_in_acl -e ls_out_acl | sort], [0], [dnl > - table=14(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;) > + table=17(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;) > table=3 (ls_out_acl_hint ), priority=0 , match=(1), action=(next;) > table=3 (ls_out_acl_hint ), priority=1 , match=(ct.est && ct_mark.blocked == 0), action=(reg0[[10]] = 1; next;) > table=3 (ls_out_acl_hint ), priority=2 , match=(ct.est && ct_mark.blocked == 1), action=(reg0[[9]] = 1; next;) > @@ -2423,7 +2423,7 @@ ovn-nbctl --wait=sb clear logical_switch ls acls > ovn-nbctl --wait=sb clear logical_switch ls load_balancer > > AT_CHECK([ovn-sbctl lflow-list ls | grep -e ls_in_acl_hint -e ls_out_acl_hint -e ls_in_acl -e ls_out_acl | sort], [0], [dnl > - table=14(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;) > + table=17(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;) > table=3 (ls_out_acl_hint ), priority=65535, match=(1), action=(next;) > table=4 (ls_out_acl ), priority=65535, match=(1), action=(next;) > table=7 (ls_in_acl_hint ), priority=65535, match=(1), action=(next;) > @@ -7638,7 +7638,7 @@ sort | sed 's/table=../table=??/' ], [0], [dnl > table=??(ls_in_check_port_sec), priority=100 , match=(vlan.present), action=(drop;) > table=??(ls_in_check_port_sec), priority=50 , match=(1), action=(reg0[[15]] = check_in_port_sec(); next;) > table=??(ls_in_check_port_sec), priority=70 , match=(inport == "localnetport"), action=(set_queue(10); reg0[[15]] = check_in_port_sec(); next;) > - table=??(ls_in_check_port_sec), priority=70 , match=(inport == "sw0p1"), action=(reg0[[14]] = 1; next(pipeline=ingress, table=18);) > + table=??(ls_in_check_port_sec), priority=70 , match=(inport == "sw0p1"), action=(reg0[[14]] = 1; next(pipeline=ingress, table=16);) > table=??(ls_in_check_port_sec), priority=70 , match=(inport == "sw0p2"), action=(set_queue(10); reg0[[15]] = check_in_port_sec(); next;) > table=??(ls_in_apply_port_sec), priority=0 , match=(1), action=(next;) > table=??(ls_in_apply_port_sec), priority=50 , match=(reg0[[15]] == 1), action=(drop;) > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > index 42c130f2f..e999cca4d 100644 > --- a/tests/system-ovn.at > +++ b/tests/system-ovn.at > @@ -4664,7 +4664,7 @@ ADD_VETH(lsp, lsp, br-int, "42.42.42.1/24", "00:00:00:00:00:01", \ > ovn-nbctl --wait=hv -t 3 sync > > # Start IPv4 TCP server on lsp. > -NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l 42.42.42.1 4041 &], [0]) > +NETNS_DAEMONIZE([lsp], [nc -l -k 42.42.42.1 4041], [lsp0.pid]) > > # Check that IPv4 TCP hairpin connection succeeds on both VIPs. > NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore]) > @@ -4686,6 +4686,16 @@ OVS_WAIT_UNTIL([ > test "${total_pkts}" = "2" > ]) > > +ovn-nbctl pg-add pg0 lsp > +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1004 "ip4 && ip4.dst == 10.0.0.2" drop > +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip4 && tcp" allow-related > +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip4 && udp" allow > +ovn-nbctl --wait=hv sync > + > +## Check that IPv4 TCP hairpin connection succeeds on both VIPs. > +NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore]) > +NS_CHECK_EXEC([lsp], [nc 88.88.88.89 8080 -z], [0], [ignore], [ignore]) > + > OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > as ovn-sb > @@ -4750,7 +4760,7 @@ OVS_WAIT_UNTIL([test "$(ip netns exec lsp ip a | grep 4200::1 | grep tentative)" > ovn-nbctl --wait=hv -t 3 sync > > # Start IPv6 TCP server on lsp. > -NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l 4200::1 4041 &], [0]) > +NETNS_DAEMONIZE([lsp], [nc -l -k 4200::1 4041], [lsp0.pid]) > > # Check that IPv6 TCP hairpin connection succeeds on both VIPs. > NS_CHECK_EXEC([lsp], [nc 8800::0088 8080 -z], [0], [ignore], [ignore]) > @@ -4772,6 +4782,16 @@ OVS_WAIT_UNTIL([ > test "${total_pkts}" = "2" > ]) > > +ovn-nbctl pg-add pg0 lsp > +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip6 && tcp" allow-related > +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip6 && udp" allow > +ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1000 "ip6" drop > +ovn-nbctl --wait=hv sync > + > +# Check that IPv6 TCP hairpin connection succeeds on both VIPs. > +NS_CHECK_EXEC([lsp], [nc 8800::0088 8080 -z], [0], [ignore], [ignore]) > +NS_CHECK_EXEC([lsp], [nc 8800::0089 8080 -z], [0], [ignore], [ignore]) > + > OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > as ovn-sb > -- > 2.38.1 >
Thanks Lorenzo. Acked-by: Han Zhou <[email protected]> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
