On 2/9/24 00:28, Numan Siddique wrote:
> On Thu, Feb 8, 2024, 4:49 PM <[email protected]> wrote:
>
>> From: Numan Siddique <[email protected]>
>>
>> This patch series fixes the memory and CPU usage increase seen
>> in ovn-northd after the lflow I-P patches were merged.
>>
>> The first 2 patches in the series address duplicate flows
>> added by northd into the lflow table for the same datapath.
>>
>> The 3rd patch fixes a bug in lflow-mgr and the 4th patch
>> actually addresses the northd memory and CPU increase issue.
>>
>> We considered 2 approaches to solve this.
>>
>> Approach 1 (which this series adopts) solves this by maintaining a dp
>> refcnt for an lflow only when it is required.
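>>
>> A minimal sketch of the idea (not the actual lflow-mgr code; the names
>> and structures below are hypothetical and heavily simplified): a
>> per-datapath refcount is only allocated once the same lflow is added a
>> second time for a datapath, so the common single-reference case stays
>> as cheap as before.
>>
>>   /* Hypothetical, simplified sketch of Approach 1. */
>>   #include <stdbool.h>
>>   #include <stdlib.h>
>>
>>   struct dp_refcnt {
>>       size_t dp_index;              /* Datapath being tracked. */
>>       unsigned int refcnt;          /* References from that datapath. */
>>       struct dp_refcnt *next;
>>   };
>>
>>   struct lflow {
>>       bool *dp_bitmap;              /* dp_bitmap[i]: lflow is on dp i. */
>>       struct dp_refcnt *dp_refcnts; /* Allocated lazily, on duplicates. */
>>   };
>>
>>   /* Called each time northd emits 'lflow' for datapath 'dp_index'. */
>>   static void
>>   lflow_add_to_dp(struct lflow *lflow, size_t dp_index)
>>   {
>>       if (!lflow->dp_bitmap[dp_index]) {
>>           /* Common case: first reference.  Just flip the bit; no
>>            * refcount and no extra allocation. */
>>           lflow->dp_bitmap[dp_index] = true;
>>           return;
>>       }
>>
>>       /* Duplicate add: only now track an explicit refcount so that
>>        * later deletions know when the last reference goes away. */
>>       for (struct dp_refcnt *r = lflow->dp_refcnts; r; r = r->next) {
>>           if (r->dp_index == dp_index) {
>>               r->refcnt++;
>>               return;
>>           }
>>       }
>>       struct dp_refcnt *r = malloc(sizeof *r);
>>       r->dp_index = dp_index;
>>       r->refcnt = 2;                /* Original reference plus this one. */
>>       r->next = lflow->dp_refcnts;
>>       lflow->dp_refcnts = r;
>>   }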
>>
>> Approach 2 (which can be found at [1]) solves this by falling back to
>> a full recompute when an lflow is added more than once for a datapath.
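>>
>> For contrast, a sketch of the Approach 2 idea (again with hypothetical
>> names, reusing the dp_bitmap from the sketch above): there is no
>> refcounting at all, and a duplicate add simply tells the caller to
>> schedule a full recompute.
>>
>>   /* Hypothetical, simplified sketch of Approach 2. */
>>   #include <stdbool.h>
>>   #include <stddef.h>
>>
>>   struct lflow {
>>       bool *dp_bitmap;              /* dp_bitmap[i]: lflow is on dp i. */
>>   };
>>
>>   /* Returns false when the same lflow is added twice for one datapath;
>>    * the caller should then fall back to a full lflow recompute, since
>>    * without a refcount we cannot tell when the last reference to the
>>    * (lflow, datapath) pair goes away. */
>>   static bool
>>   lflow_add_to_dp_incremental(struct lflow *lflow, size_t dp_index)
>>   {
>>       if (lflow->dp_bitmap[dp_index]) {
>>           return false;             /* Duplicate: request recompute. */
>>       }
>>       lflow->dp_bitmap[dp_index] = true;
>>       return true;
>>   }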
>>
>> Below are the test results with ovn-heater for both approaches.
>>
>> Cluster density 500 node test
>> -----------------------------
>>
>>                  | Avg. Poll Intervals | Total test time | northd RSS
>> -----------------+---------------------+-----------------+-----------
>> Before lflow I-P | 1.5 seconds         | 1005 seconds    | 2.5 GB
>> lflow I-P merged | 6 seconds           | 2246 seconds    | 8.5 GB
>> Approach 1       | 2.1 seconds         | 1142 seconds    | 2.67 GB
>> Approach 2       | 1.8 seconds         | 1046 seconds    | 2.41 GB
>> -----------------------------------------------------------------------------
>>
>> Node density heavy 500 node test
>> --------------------------------
>>
>>                  | Avg. Poll Intervals | Total test time | northd RSS
>> -----------------+---------------------+-----------------+-----------
>> Before lflow I-P | 1.3 seconds         | 192 seconds     | 1.49 GB
>> lflow I-P merged | 4.5 seconds         | 87 seconds      | 7.3 GB
>> Approach 1       | 2.4 seconds         | 83 seconds      | 2.2 GB
>> Approach 2       | 1.36 seconds        | 193 seconds     | 2.2 GB
>> -------------------------------------------------------------------------
>
> A clarification here: although the 2.4 second average poll interval for
> Approach 1 looks high, that average is based on only 3 poll intervals,
> i.e. ovn-northd didn't recompute for most of the test duration.
>
> Thanks
> Numan
>>
>> Both approaches have advantages and disadvantages.  The pros and cons
>> below were outlined by Ilya.
>>
>> Approach 1
>> ----------
>> Pros:
>> * Doesn't fall back to recompute more often than current main.
>> * Fairly simple.
>> * Can be optimized by getting rid of duplicated lflows - we'll allocate
>>   fewer refcounts.
>> Cons:
>> * Higher CPU and memory usage in ovn-heater tests due to the actual
>>   refcount and hash map allocations.
>>
>> Approach 2
>> ----------
>> Pros:
>> * Lower memory usage due to no refcounting.
>> * Lower CPU usage in cases where we do not fall back to recompute.
>> * End code is simpler.
>> * Can be optimized by getting rid of duplicated lflows - we won't fall
>>   back to recompute as often.
>>
>> Cons:
>> * Falls back to recompute more frequently - higher CPU usage in some
>>   cases (whenever users create the same LBs for different protocols).
>> * Emits a concerning log message in legitimate configurations.
>>
>>
>> We chose Approach 1 based on the above test results.
>>
>> [1] - https://github.com/numansiddique/ovn/commits/dp_refcnt_fix_v1
>>
To add to the ovn-heater results, we also ran some OpenShift scale tests
with this patch set applied. The tests were run on a 120-node cluster
deployed with the OVN-Kubernetes "interconnect" architecture (northd and
DBs running on each node). The results are listed below. CPU and memory
usage are aggregated across the whole cluster:
+----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+
| Version        | Test Case       | Pod-ready latency (avg) | Pod-ready latency (max) | northd CPU (avg) | northd CPU (max) | northd RSS (avg) | northd RSS (max) |
+----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+
| branch-23.09   | cluster-density | 5.72s                   | 13s                     | 4620%            | 6363%            | 18.3GB           | 23.6GB           |
| branch-24.09   | cluster-density | 5.74s                   | 13s                     | 3246%            | 5921%            | 18.8GB           | 24.3GB           |
| with patch set | cluster-density | 5.75s                   | 13s                     | 3190%            | 5772%            | 18.6GB           | 24.2GB           |
+----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+
| branch-23.09   | node-density    | 1.98s                   | 4s                      | 5054%            | 12893%           | 10.8GB           | 27.3GB           |
| branch-24.09   | node-density    | 1.92s                   | 4s                      | 1348%            | 5655%            | 10.8GB           | 28.1GB           |
| with patch set | node-density    | 1.92s                   | 4s                      | 1433%            | 5698%            | 10.6GB           | 14.1GB           |
+----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+

This shows that in the OpenShift use case, the patch set performs at
least as well as the (buggy) branch-24.09 version, which includes
a623606052ea ("northd: Refactor lflow management into a separate module.").
I think it's fine to merge this series (once the few small comments Han
and I had are addressed).
Regards,
Dumitru