On Wed, Feb 14, 2024 at 9:51 AM Dumitru Ceara <[email protected]> wrote:
>
> On 2/9/24 00:28, Numan Siddique wrote:
> > On Thu, Feb 8, 2024, 4:49 PM <[email protected]> wrote:
> >
> >> From: Numan Siddique <[email protected]>
> >>
> >> This patch series fixes the memory and CPU usage increase seen
> >> in ovn-northd after the lflow I-P patches were merged.
> >>
> >> The first 2 patches in the series address duplicate flows
> >> added by northd into the lflow table for the same datapath.
> >>
> >> The 3rd patch fixes a bug in lflow-mgr and the 4th patch
> >> addresses the northd memory and CPU increase issue.
> >>
> >> We considered 2 approaches to solve this.
> >>
> >> Approach 1 (which this series adopts) solves it by maintaining a dp
> >> refcnt for an lflow only if required.
> >>
> >> Approach 2 (which can be found at [1]) solves it by falling back to
> >> a full recompute when an lflow is added more than once for a datapath.
> >>
> >> Below are the test results with ovn-heater for both approaches.
> >>
> >> Cluster density 500 node test
> >> -----------------------------
> >>
> >>                  | Avg. poll interval | Total test time | northd RSS
> >> -----------------+--------------------+-----------------+-----------
> >> Before lflow I-P | 1.5 seconds        | 1005 seconds    | 2.5 GB
> >> lflow I-P merged | 6 seconds          | 2246 seconds    | 8.5 GB
> >> Approach 1       | 2.1 seconds        | 1142 seconds    | 2.67 GB
> >> Approach 2       | 1.8 seconds        | 1046 seconds    | 2.41 GB
> >> --------------------------------------------------------------------
> >>
> >> Node density heavy 500 node test
> >> --------------------------------
> >>
> >>                  | Avg. poll interval | Total test time | northd RSS
> >> -----------------+--------------------+-----------------+-----------
> >> Before lflow I-P | 1.3 seconds        | 192 seconds     | 1.49 GB
> >> lflow I-P merged | 4.5 seconds        | 87 seconds      | 7.3 GB
> >> Approach 1       | 2.4 seconds        | 83 seconds      | 2.2 GB
> >
> > A clarification here: with Approach 1, even though the 2.4 second poll
> > interval seems high, it is based on only 3 poll intervals, i.e.
> > ovn-northd didn't recompute for most of the test duration.
> >
> > Thanks
> > Numan
> >
> >> Approach 2       | 1.36 seconds       | 193 seconds     | 2.2 GB
> >> --------------------------------------------------------------------
> >>
> >> Both approaches have advantages and disadvantages
> >> (pros and cons as outlined by Ilya below).
> >>
> >> Approach 1
> >> ----------
> >> Pros:
> >>   * Doesn't fall back to recompute more often than current main.
> >>   * Fairly simple.
> >>   * Can be optimized by getting rid of duplicated lflows - we'll
> >>     allocate fewer refcounts.
> >> Cons:
> >>   * Higher CPU and memory usage in ovn-heater tests due to the actual
> >>     refcount and hash map allocations.
> >>
> >> Approach 2
> >> ----------
> >> Pros:
> >>   * Lower memory usage due to no refcounting.
> >>   * Lower CPU usage in cases where we do not fall into recompute.
> >>   * End code is simpler.
> >>   * Can be optimized by getting rid of duplicated lflows - we won't
> >>     fall back to recompute that often.
> >> Cons:
> >>   * Falls back to recompute more frequently - higher CPU usage in some
> >>     cases (whenever users create the same LBs for different protos).
> >>   * Concerning log message in legitimate configurations.
> >>
> >> We chose Approach 1 based on the above test results.
> >>
> >> [1] - https://github.com/numansiddique/ovn/commits/dp_refcnt_fix_v1
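To make the trade-off between the two approaches concrete, here is a minimal,
self-contained C sketch of the Approach 1 idea: per-datapath reference counts
for an lflow that are allocated lazily, only once the same lflow is added a
second time for a datapath.  All names (lflow_dp_refs, lflow_dp_ref,
lflow_dp_unref) and the fixed-size layout are hypothetical and do not reflect
the actual ovn-northd lflow-mgr code.

/* Sketch only: NOT the ovn-northd implementation.  Illustrates keeping a
 * per-datapath reference count for a logical flow only when the same flow
 * is added more than once for that datapath. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_DATAPATHS 8              /* Hypothetical fixed bound, for brevity. */

struct lflow_dp_refs {
    bool dp_present[MAX_DATAPATHS];  /* Cheap flag: is the lflow used by dp? */
    unsigned int *extra_refs;        /* Allocated lazily, only if some dp
                                      * adds the same lflow twice. */
};

/* Record that datapath 'dp' references the lflow.  The common case (first
 * reference) just sets a flag; refcount storage is allocated only when a
 * duplicate add happens. */
static void
lflow_dp_ref(struct lflow_dp_refs *refs, size_t dp)
{
    if (!refs->dp_present[dp]) {
        refs->dp_present[dp] = true;
        return;
    }
    if (!refs->extra_refs) {
        refs->extra_refs = calloc(MAX_DATAPATHS, sizeof *refs->extra_refs);
        if (!refs->extra_refs) {
            abort();                 /* Out of memory; keep the sketch simple. */
        }
    }
    refs->extra_refs[dp]++;          /* References beyond the first one. */
}

/* Drop one reference; returns true when 'dp' no longer uses the lflow. */
static bool
lflow_dp_unref(struct lflow_dp_refs *refs, size_t dp)
{
    if (refs->extra_refs && refs->extra_refs[dp]) {
        refs->extra_refs[dp]--;
        return false;
    }
    refs->dp_present[dp] = false;
    return true;
}

int
main(void)
{
    struct lflow_dp_refs refs = { 0 };

    lflow_dp_ref(&refs, 3);          /* First add: no refcount allocation. */
    lflow_dp_ref(&refs, 3);          /* Duplicate add: refcounting kicks in. */
    printf("%d\n", lflow_dp_unref(&refs, 3));   /* 0: still referenced. */
    printf("%d\n", lflow_dp_unref(&refs, 3));   /* 1: last reference gone. */

    free(refs.extra_refs);
    return 0;
}

The point the sketch illustrates is that the common case, an lflow referenced
once per datapath, only pays for a presence flag; the extra refcount storage
exists only for duplicate adds, which Approach 2 would instead handle by
falling back to a full recompute.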
>
> To add to the ovn-heater results, we also ran some OpenShift scale tests
> with this patch set applied.  The tests were run on a 120 node cluster
> deployed with the OVN-Kubernetes "interconnect" architecture (northd and
> DBs running on each node).  The results are listed below; CPU and memory
> usage are aggregated across the whole cluster:
>
> +----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+
> | Version        | Test Case       | Pod-ready latency (avg) | Pod-ready latency (max) | northd CPU (avg) | northd CPU (max) | northd RSS (avg) | northd RSS (max) |
> +----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+
> | branch-23.09   | cluster-density | 5.72s                   | 13s                     | 4620%            | 6363%            | 18.3GB           | 23.6GB           |
> | branch-24.09   | cluster-density | 5.74s                   | 13s                     | 3246%            | 5921%            | 18.8GB           | 24.3GB           |
> | with patch set | cluster-density | 5.75s                   | 13s                     | 3190%            | 5772%            | 18.6GB           | 24.2GB           |
> |                |                 |                         |                         |                  |                  |                  |                  |
> | branch-23.09   | node-density    | 1.98s                   | 4s                      | 5054%            | 12893%           | 10.8GB           | 27.3GB           |
> | branch-24.09   | node-density    | 1.92s                   | 4s                      | 1348%            | 5655%            | 10.8GB           | 28.1GB           |
> | with patch set | node-density    | 1.92s                   | 4s                      | 1433%            | 5698%            | 10.6GB           | 14.1GB           |
> +----------------+-----------------+-------------------------+-------------------------+------------------+------------------+------------------+------------------+
>
> This shows that in the OpenShift use case, the patch set performs at
> least as well as the (buggy) branch-24.09 version, which includes
> a623606052ea ("northd: Refactor lflow management into a separate module.").
>
> I think it's fine to merge this series (once the few small comments Han
> and I had are addressed).
>
Thanks Han, Ilya and Dumitru (and also for the scale tests).

I addressed the review comments and applied the patches to main and
branch-24.03.

Numan

> Regards,
> Dumitru
