On 5/7/21 7:35 AM, Han Zhou wrote:
> Hi Ihar,
> 
> Thanks for the patch, and it's great to see the significant performance
> improvement!
> Please see my comments below.
> 
> On Thu, May 6, 2021 at 7:35 AM Ihar Hrachyshka <[email protected]> wrote:
>>
>> For allow-stateless ACLs, bypass connection tracking by avoiding
>> setting ct hints for matching traffic. Avoid sending all traffic to ct
>> when a stateful ACL is present.
>>
>> ===
>>
>> Reusing an existing 'allow' verb for stateless matching would have its
>> drawbacks, specifically, when it comes to backwards incompatibility of
>> the new behavior with existing environments. When using "allow" ACLs
>> in mixed allow/allow-related environment, we still commit "allow"
>> traffic to conntrack. This unnecessarily hits performance when mixed
>> ACL action types were used for the same datapath. This is why we
>> introduce a new action verb to describe stateless behavior.
>>
>> Another complexity to consider is the fact that with stateless
>> matching, one would not be able to rely on 'related' magic that
>> guarantees that reply traffic is passed through. Instead, the user
>> would have to accurately define matching rules both for request and
>> reply directions of a protocol session. Specifically, when allowing
>> ICMP for a specific peer host, one has to define 'allow-stateless'
>> rules that would match against ip.dst for request direction and ip.src
>> for reply direction. Other protocols and scenarios will require their
>> own fine grained matching approaches implemented by the user.
>>
>> ===
>>
>> For performance measurements, qperf was used. Tests were executed on two
>> setups:
>>
>> 1) ovn-fake-multinode virtual environment;
>> 2) Supermicro SYS-5039MS-H8TRF 3-node RH-OSP 16.x cluster with 2 compute
>>    nodes, NIC: "Intel Corporation Ethernet Controller XXV710 for 25GbE
>>    SFP28 (rev 02)", using fake_nova_driver to avoid qemu overhead.
>>
>> 1) ovn-fake-multinode:
>>
>> Performance measured between two virtual nodes, two ports
>> that belong to different LSs connected via router. Using qperf,
>> performance was measured for UDP, TCP, SCTP protocols (using
>> <proto>_lat and <proto>_bw tests). The qperf version used:
>> 0.4.9-16.fc31.x86_64.  Each test scenario was executed five times and
>> averages compared.
>>
>> Tests were executed with `allow-stateless` rules for the tested
>> protocol and `allow-related` for another protocol set for both ports,
>> both directions, e.g. for TCP scenario, the following ACLs were
>> defined:
>>
>> ovn-nbctl acl-add sw0 to-lport 100 tcp allow-stateless
>> ovn-nbctl acl-add sw0 from-lport 100 tcp allow-stateless
>> ovn-nbctl acl-add sw1 to-lport 100 tcp allow-stateless
>> ovn-nbctl acl-add sw1 from-lport 100 tcp allow-stateless
>>
>> ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related
>> ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related
>> ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related
>> ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related
>>
>> In this particular environment, improvement was seen in send_bw,
>> latency, and msg_rate measurements, where applicable, for all three
>> protocols under test.
>>
>> for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%)
>>          latency: 16 us => 14.08 us (-12%)
>>          msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%)
>>
>> for TCP, latency: 18.6 us => 14.88 us (-20%)
>>          msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%)
>>
>> for SCTP, latency: 21.98 us => 19.42 us (-11.65%)
>>           msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%)
>>
>> Interestingly, some performance improvement was also seen for the same
>> scenarios with no ACLs set at all, albeit significantly more
>> negligible.
>>
>> for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%)
>>          latency: 13.74 us => 12.88 us (-6.68%)
>>          msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%)
>>
>> for TCP, latency: 15.62 us => 14.26 us (-9.54%)
>>          msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%)
>>
>> for SCTP, latency: 19.56 us => 18.16 us (-7.16%)
>>           msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%)
>>
>> 2) Supermicro RH-OSP cluster:
>>
>> Two scenarios executed:
>> - connectivity between two ports on the same switch
>> - connectivity between two ports on different switches connected via
>>   router
>>
>> For the same switch, improvements are as follows:
>> - TCP latency: -5.3%, UDP latency: -13.5%, SCTP latency: -10.8%
>> - TCP bw: +0.8%, UDP bw, +12.25%, SCTP bw: +21.29%
>>
>> For different switches, improvements are as follows:
>> - TCP latency: -6%, UDP: -11.2%, SCTP: -0.2%
>> - TCP bw: -0.7%, UDP bw: +2%, SCTP bw: +9.9%
>>
>> The effect is more noticeable in same switch scenario.
> 
> This is interesting. Any idea why with different switches the result is so
> different? I'd expect similar improvement just like in same switch, because
> in this kind of test the flow should be cached in kernel datapath
> (regardless of same/different logical switches) and in both cases the cost
> of traversing conntrack is saved.
> 
>>
>> Comparable numbers can be captured with iperf. It may be useful to run
>> more tests in a more elaborate (bare metal) environment.
>>
>> ===
>>
>> The patch takes inspiration from a now abandoned patch:
>>
>> "ovn-northd: Support mixing stateless/stateful ACLs with
>> Stateless_Filter." by Dumitru Ceara.
>>
>> The original patch assumed CMS doesn't require full flexibility of
>> matching rules for stateless matching (for example, to be used by
>> OpenShift). But other CMS interfaces may require the same
>> customizability for stateless as well as stateful matching, like in
>> OpenStack Neutron API. Which is why this patch reuses existing ACL
>> object type to describe stateless rules.
>>
>> Signed-off-by: Ihar Hrachyshka <[email protected]>
>>
>> ---
>>
>> v1: initial version.
>> v2: rebased after conflict.
>> v3: added ddlog implementation.
>> v3: apply stateless skip-hint-set rules to appropriate direction only.
>> v3: added more background as to implementation in commit message.
>> v3: test stateless scenarios with ddlog too.
>> v3: rebased after conflict.
>> v4: introduce a separate allow-stateless ACL match verb.
>> v5: rebased.
>> v6: updated docs for new allow-stateless approach.
>> v6: removed no longer valid comments.
>> v6: removed acl_is_stateless.
>> v7: bump north db schema version to 5.31.0 -> 5.32.0.
>> v8: fixed checkpatch failure on a too long line in dbschema.
>> v8: added perf data on baremetal lab testing.
>> v9: fixed ovsdb checksum.
>> ---
>>  NEWS                    |   2 +
>>  northd/ovn-northd.8.xml |   8 +-
>>  northd/ovn-northd.c     |  65 ++++++++-
>>  northd/ovn_northd.dl    |  31 ++++
>>  ovn-nb.ovsschema        |   9 +-
>>  ovn-nb.xml              |   9 +-
>>  tests/ovn-northd.at     | 309 ++++++++++++++++++++++++++++++++++++++++
>>  utilities/ovn-nbctl.c   |   6 +-
>>  8 files changed, 428 insertions(+), 11 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 1ddde15f8..d72139e76 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -11,6 +11,8 @@ Post-v21.03.0
>>      'use_parallel_build' to enable it.  It is disabled by default.
>>    - Support vlan-passthru mode for tag=0 localnet ports.
>>    - Support custom 802.11ad EthType for localnet ports.
>> +  - Introduce a new "allow-stateless" ACL verb to always bypass
> connection
>> +    tracking. The existing "allow" verb behavior is left intact.
>>
>>  OVN v21.03.0 - 12 Mar 2021
>>  -------------------------
>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>> index 54e88d3fa..08c4ea852 100644
>> --- a/northd/ovn-northd.8.xml
>> +++ b/northd/ovn-northd.8.xml
>> @@ -419,7 +419,9 @@
>>        before eventually advancing to ingress table <code>ACLs</code>. If
>>        special ports such as route ports or localnet ports can't use
> ct(), a
>>        priority-110 flow is added to skip over stateful ACLs. IPv6
> Neighbor
>> -      Discovery and MLD traffic also skips stateful ACLs.
>> +      Discovery and MLD traffic also skips stateful ACLs. For
> "allow-stateless"
>> +      ACLs, a flow is added to bypass setting the hint for connection
> tracker
>> +      processing.
> 
> This approach clearly achieves the goal of bypassing conntrack and there is
> great dataplane performance gain, but it also doubles the ACL logical flows
> which in turn could significantly increase the control plane cost,
> especially in ovn-controller. In large scale environments where lots of
> ACLs referencing large port-group/address-sets are used, it is already a
> bottleneck of ovn-controller flow computing. It would be better if we could
> avoid doubling this cost. Instead of directly add the ACL match here, can
> we set a new hint (e.g. REGBIT_BYPASS_CT = 1), and in the later stage
> pre-stateful add a higher priority flow, if this hint is matched then
> directly "next" without going through CT? I hope this can achieve the same
> goal but not increasing control plane cost. What do you think?
> 

Wouldn't this break ACL priority?  E.g., with the following ACLs:

ACL1: from-lport, prio=100, ip.dst=1.1.1.1, allow-stateless
ACL2: from-lport, prio=50, ip, drop

If we don't add the lflow for ACL1 in the ls_in_acl stage then traffic
will be dropped by the lflow added for ACL2.

On the other hand, IIUC, this feature is designed to be used in
combination with "allow-related" ACLs defined on the same switch,
replacing "allow" ACLs in that case.

Before this change an "allow" ACL would've been treated as
"allow-related" if there was at least one other "allow-related" ACL
applied on the logical switch.  Essentially generating 2 logical flows
per "allow" ACL:

lflow1: "REGBIT_ACL_HINT_ALLOW_NEW==1 && acl.match" then
"REGBIT_CONNTRACK_COMMIT=1, next"
lflow2: "REGBIT_ACL_HINT_ALLOW==1 && acl.match" then "next"

If I'm not wrong, changing this ACL to "allow-stateless" would also
generate 2 logical flows, so at least from a number of logical flows
perspective, the result is the same.

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to