On Tue, Apr 13, 2021 at 4:02 AM Dumitru Ceara <[email protected]> wrote: > > On 4/13/21 2:26 AM, Ihar Hrachyshka wrote: > > On Mon, Apr 12, 2021 at 5:00 AM Dumitru Ceara <[email protected]> wrote: > >> > >> On 4/6/21 6:35 PM, Ihar Hrachyshka wrote: > >>> On Fri, Apr 2, 2021 at 7:23 AM Dumitru Ceara <[email protected]> wrote: > >>>> > >>>> On 3/24/21 3:40 PM, Ihar Hrachyshka wrote: > >>>>> For allow ACLs, bypass connection tracking by avoiding setting ct > >>>>> hints for matching traffic. Avoid sending all traffic to ct when a > >>>>> stateful ACL is present. Before the patch, this unnecessarily hit > >>>>> performance when mixed ACL action types were used for the same > >>>>> datapath. > >>>>> > >>>>> === > >>>>> > >>>>> For performance measurements, ovn-fake-multinode environment and qperf > >>>>> were used. Performance measured between two virtual nodes, two ports > >>>>> that belong to different LSs connected via router. Using qperf, > >>>>> performance was measured for UDP, TCP, SCTP protocols (using > >>>>> <proto>_lat and <proto>_bw tests). The qperf version used: > >>>>> 0.4.9-16.fc31.x86_64. Each test scenario was executed five times and > >>>>> averages compared. > >>>>> > >>>>> Tests were executed with `allow` rules for the tested protocol and > >>>>> `allow-related` for another protocol set for both ports, both > >>>>> directions, e.g. for TCP scenario, the following ACLs were defined: > >>>>> > >>>>> ovn-nbctl acl-add sw0 to-lport 100 tcp allow > >>>>> ovn-nbctl acl-add sw0 from-lport 100 tcp allow > >>>>> ovn-nbctl acl-add sw1 to-lport 100 tcp allow > >>>>> ovn-nbctl acl-add sw1 from-lport 100 tcp allow > >>>>> > >>>>> ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related > >>>>> ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related > >>>>> ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related > >>>>> ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related > >>>>> > >>>>> In this particular environment, improvement was seen in send_bw, > >>>>> latency, and msg_rate measurements, where applicable, for all three > >>>>> protocols under test. > >>>>> > >>>>> for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%) > >>>>> latency: 16 us => 14.08 us (-12%) > >>>>> msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%) > >>>>> > >>>>> for TCP, latency: 18.6 us => 14.88 us (-20%) > >>>>> msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%) > >>>>> > >>>>> for SCTP, latency: 21.98 us => 19.42 us (-11.65%) > >>>>> msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%) > >>>>> > >>>>> Interestingly, some performance improvement was also seen for the same > >>>>> scenarios with no ACLs set at all, albeit significantly more > >>>>> negligible. > >>>>> > >>>>> for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%) > >>>>> latency: 13.74 us => 12.88 us (-6.68%) > >>>>> msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%) > >>>>> > >>>>> for TCP, latency: 15.62 us => 14.26 us (-9.54%) > >>>>> msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%) > >>>>> > >>>>> for SCTP, latency: 19.56 us => 18.16 us (-7.16%) > >>>>> msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%) > >>>>> > >>>>> Comparable numbers can be captured with iperf. It may be useful to run > >>>>> more tests in a more elaborate (bare metal) environment. > >>>>> > >>>>> === > >>>>> > >>>>> The patch takes inspiration from a now abandoned patch: > >>>>> > >>>>> "ovn-northd: Support mixing stateless/stateful ACLs with > >>>>> Stateless_Filter." by Dumitru Ceara. > >>>>> > >>>>> Signed-off-by: Ihar Hrachyshka <[email protected]> > >>>>> > >>>>> --- > >>>> > >>>> Hi Ihar, > >>>> > >>>> Thanks for working on this and sorry for the delay in reviewing your > >>>> patch. > >>>> > >>>>> > >>>>> v1: initial version. > >>>>> v2: rebased after conflict. > >>>>> --- > >>>>> NEWS | 1 + > >>>>> northd/ovn-northd.8.xml | 9 +- > >>>>> northd/ovn-northd.c | 166 +++++++++++++++-------- > >>>>> tests/ovn-northd.at | 287 ++++++++++++++++++++++++++++++++++++++++ > >>>>> 4 files changed, 400 insertions(+), 63 deletions(-) > >>>>> > >>>>> diff --git a/NEWS b/NEWS > >>>>> index 530c5d42f..548a45fb7 100644 > >>>>> --- a/NEWS > >>>>> +++ b/NEWS > >>>>> @@ -7,6 +7,7 @@ Post-v21.03.0 > >>>>> (This may take testing and tuning to be effective.) This version > >>>>> of OVN > >>>>> requires DDLog 0.36. > >>>>> - Introduce ovn-controller incremetal processing engine statistics > >>>>> + - Bypass connection tracking for ACL "allow" action processing. > >>>>> > >>>>> OVN v21.03.0 - 12 Mar 2021 > >>>>> ------------------------- > >>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > >>>>> index a62f5c057..f38d71682 100644 > >>>>> --- a/northd/ovn-northd.8.xml > >>>>> +++ b/northd/ovn-northd.8.xml > >>>>> @@ -419,7 +419,9 @@ > >>>>> before eventually advancing to ingress table <code>ACLs</code>. > >>>>> If > >>>>> special ports such as route ports or localnet ports can't use > >>>>> ct(), a > >>>>> priority-110 flow is added to skip over stateful ACLs. IPv6 > >>>>> Neighbor > >>>>> - Discovery and MLD traffic also skips stateful ACLs. > >>>>> + Discovery and MLD traffic also skips stateful ACLs. For > >>>>> stateless "allow" > >>>>> + ACLs, a flow is added to bypass setting the hint for connection > >>>>> tracker > >>>>> + processing. > >>>>> </p> > >>>>> > >>>>> <p> > >>>>> @@ -603,10 +605,7 @@ > >>>>> <ul> > >>>>> <li> > >>>>> <code>allow</code> ACLs translate into logical flows with > >>>>> - the <code>next;</code> action. If there are any stateful ACLs > >>>>> - on this datapath, then <code>allow</code> ACLs translate to > >>>>> - <code>ct_commit; next;</code> (which acts as a hint for the > >>>>> next tables > >>>>> - to commit the connection to conntrack), > >>>>> + the <code>next;</code> action. > >>>>> </li> > >>>>> <li> > >>>>> <code>allow-related</code> ACLs translate into logical > >>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > >>>>> index 4783e43d7..cd343a3e3 100644 > >>>>> --- a/northd/ovn-northd.c > >>>>> +++ b/northd/ovn-northd.c > >>>>> @@ -4943,7 +4943,58 @@ skip_port_from_conntrack(struct ovn_datapath > >>>>> *od, struct ovn_port *op, > >>>>> } > >>>>> > >>>>> static void > >>>>> -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > >>>>> +build_stateless_filter(struct ovn_datapath *od, > >>>>> + const struct nbrec_acl *acl, > >>>>> + struct hmap *lflows) > >>>>> +{ > >>>>> + /* Stateless filters must be applied in both directions so that > >>>>> reply > >>>>> + * traffic bypasses conntrack too. > >>>>> + */ > >>>> > >>>> In the "Stateless_Filter" approach that I was proposing initially, > >>>> filters didn't have a direction. But ACLs have an explicit direction > >>>> (to-lport/from-lport). > >>>> > >>>> So I think it would be enough to add the flow either in IN_PRE_ACL or in > >>>> OUT_PRE_ACL, according to the ACL's direction. > >>>> > >>>> On the other hand now, if we have the following configuration: > >>>> > >>>> vm1 (10.0.0.1) --- LS --- vm2 (10.0.0.2) > >>>> > >>>> ACLs: > >>>> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow > >>>> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2", > >>>> allow-related > >>>> priority 0, from-lport, "inport==vm1 && tcp", drop > >>>> > >>>> Notice that there's no DROP rule for non-tcp traffic so I'd assume that > >>>> ICMP traffic should be always allowed between vm1 and vm2. However, > >>>> ping from vm1 to vm2 fails. > >>>> > >>>> The problem is that replies from vm2 are sent to conntrack in the > >>>> IN/OUT_PRE_STATEFUL stage because there's no high priority rule to match > >>>> them. > >>>> > >>>> In this case the user must add an explicit ACL to allow returning > >>>> stateless traffic: > >>>> priority 1, from-lport, "inport==vm1 && ip4.src==10.0.0.2", allow > >>>> > >>>> But this feels very error prone and quite counter intuitive. > >>>> > >>>> The rest of the patch looks OK to me and it still needs the ddlog > >>>> implementation but I think we need to find a way to fix the use cases > >>>> like the one above before moving forward. > >>>> > >>>> What do you think? > >>>> > >>> > >>> Dumitru, > >>> thanks for reviewing the patch. > >>> > >>> On your concern, this sounds like a usability issue and not > >>> necessarily an implementation issue, correct? While for simple cases > >>> (like allowing *all* ICMP traffic both ways, stateless) we could > >>> introduce a new CLI command to inject the same match into both > >>> pipelines, it cannot be easily done for any rule that e.g. refers to > >>> any "directional" fields, like IP src/dst in your example. > >>> > >>> How I see it, the whole idea of stateless rules is offloading some > >>> magic from ct "related" machinery to explicit user configuration, if > >>> returning traffic is expected (it is not necessarily the case and > >>> depends on the app). > >>> > >>> While I appreciate that user interface becomes more quirky when > >>> switching from allow-related to allow, I wonder if there is anything > >>> one could do to mitigate it in general due to the very nature of > >>> "directed" matching rules. (AFAIU Stateless_Filter implementation that > >>> you mentioned won't simplify anything in this regard, but let me know > >>> if you are onto something here that I miss.) > >> > >> With Stateless_Filter, my goal was to allow the CMS describe the type of > >> traffic that should skip conntrack in the ACL stage in a very general > >> way, e.g., "all TCP traffic should be firewalled in a 'stateless' way". > > > > In OpenStack, CMS allows for the same level of configurability for > > both stateful and stateless security groups. Perhaps you have a > > different CMS in mind that would have different scenarios for > > stateless and stateful rules. > > > > Yes, I was thinking of ovn-kubernetes where ALCs are used to implement > network policies which AFAIU are mainly stateful. Dan Winship suggested > a while back an alternative implementation for TCP network policies that > wouldn't require conntrack. IIRC it was something like this: > - only allow packets with tcp-flags=+syn-ack from pods that are allowed > to initiate connections > - drop all other packets with tcp-flags=+syn-ack > - allow all other TCP packets > > This would rely on the pod's TCP stack to implement the firewalling by > dropping unexpected TCP packets and would also avoid sending TCP traffic > to conntrack for ACL in OVN. > > But for UDP traffic we'd still have to send traffic to conntrack. > > This was where Stateless_Filter would come into picture. > > >> > >> Also, with this in mind, the Stateless_Filter rules would've been > >> generic enough to be applied directly in both directions. > >> > >> Back to the scenario above: > >> > >> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow > >> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2", > >> allow-related > >> priority 0, from-lport, "inport==vm1 && tcp", drop > >> > >> As user I think I'd be very confused by the fact that ICMP replies from > >> vm2 are dropped. I agree that this rule set can be changed but I wonder > >> how often such ACL combinations already exist in production deployments. > >> Because, with the current changes, they will break. > >> > > > > If it's just a matter of backwards compatibility and we believe that > > it's unsafe to touch existing verbs, this can either be solved by a > > new resource type, like in your Stateless_Filter approach, or by > > introducing a new verb, f.e. 'allow-stateless' [plus as clearly > > describing the difference between all the verbs in documentation]. > > That might be the safest way indeed. >
The latest (v4) up for review includes the switch from allow to new allow-stateless. > > > > If it's also a matter of UX, this is largely a function of > > configurability of the mechanism (which is a function of CMS > > considered). For that, I don't have an easy answer on how to keep it > > both simple / magical AND support all possible matching scenarios that > > regular rules would support. > > > > We can continue discussing here or in the new version of the patch > > that I'm about to send, incl. ddlog and other comments of yours > > addressed. I will put some background as to the issue at stake in its > > commit message for reference while we contemplate alternatives. > > Sounds good. > > > > > Ihar > > > > Thanks, > Dumitru > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
