On Thu, Sep 30, 2021 at 11:04 PM Anton Ivanov <[email protected]> wrote:
> On 01/10/2021 01:32, Han Zhou wrote:
> >
> > On Thu, Sep 30, 2021 at 2:03 PM Anton Ivanov <[email protected]> wrote:
> >> On 30/09/2021 20:48, Han Zhou wrote:
>>
>> On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov <[email protected]> wrote:
>>> Summary of findings.
>>>
>>> 1. The numbers on the perf test do not align with ovn-heater, which is
>>> much closer to a realistic load. On some tests where heater gives a
>>> 5-10% end-to-end improvement with parallelization, we get worse results
>>> with the perf-test. You spotted this one correctly.
>>>
>>> Example of the northd averages pulled out of the test report via grep
>>> and sed:
>>>
>>> 127.489353
>>> 131.509458
>>> 116.088205
>>> 94.721911
>>> 119.629756
>>> 114.896258
>>> 124.811069
>>> 129.679160
>>> 106.699905
>>> 134.490338
>>> 112.106713
>>> 135.957658
>>> 132.471111
>>> 94.106849
>>> 117.431450
>>> 115.861592
>>> 106.830657
>>> 132.396905
>>> 107.092542
>>> 128.945760
>>> 94.298464
>>> 120.455510
>>> 136.910426
>>> 134.311765
>>> 115.881292
>>> 116.918458
>>>
>>> These values are all over the place - this is not a reproducible test.
>>>
>>> 2. In its present state you need to re-run it 30+ times and take an
>>> average. The standard deviation of the values for the northd loop is
>>> over 10%. Compared to that, the reproducibility of ovn-heater is
>>> significantly better: I usually get less than 0.5% difference between
>>> runs if there were no iteration failures. I would suggest using that
>>> instead if you want to do performance comparisons, until we have
>>> figured out what affects the perf-test.
>>>
>>> 3. It uses the short-term running average value in reports, which is
>>> probably wrong because you get a very significant skew from the last
>>> several values.
>>>
>>> I will look into all of these.
>>>
>> Thanks for the summary!
However, I think there is a bigger problem
>> (probably related to my environment) than the stability of the test (make
>> check-perf TESTSUITEFLAGS="--rebuild") itself. As I mentioned in an
>> earlier email, I observed even worse results with a large-scale topology
>> closer to a real-world deployment of ovn-k8s, just testing with the
>> command:
>>
>>     ovn-nbctl --print-wait-time --wait=sb sync
>>
>> This command simply triggers a change in the NB_Global table and waits
>> for northd to complete all the recompute and update the SB. It doesn't
>> have to use the "sync" command; any change to the NB DB produces a
>> similar result (e.g.: ovn-nbctl --print-wait-time --wait=sb ls-add ls1).
>>
>> Without parallel:
>> ovn-northd completion: 7807ms
>>
>> With parallel:
>> ovn-northd completion: 41267ms
>>
>> Is this with current master or prior to these patches?
>>
>> 1. There was an issue prior to these where the hash on the first
>> iteration with an existing database, when loading a large database for
>> the first time, was not sized correctly. These numbers sound about right
>> when this bug was around.
>>
> The patches are included. The commit id is 9242f27f63, as mentioned in my
> first email.
>
>> 2. There should be NO DIFFERENCE in a single compute cycle with an
>> existing database between a run with parallel and without, with dp
>> groups, at present. This is because the first cycle does not use
>> parallel compute. It is disabled in order to achieve the correct hash
>> sizings for future cycles by auto-scaling the hash.
>>
> Yes, I understand this, and I did enable dp-group for the above
> "ovn-nbctl sync" test, so the number I showed above for "with parallel"
> was for the 2nd run and onwards. For the first round the result is
> exactly the same as without parallel.
>
> I just tried disabling dp groups for the large-scale "ovn-nbctl sync"
> test (after taking some effort squeezing out memory space on my desktop),
> and the result shows that parallel build performs slightly better
> (although it is 3x slower than with dp-group & without parallel, which is
> expected). To summarize the results together:
>
> Without parallel, with dp-group:
> ovn-northd completion: 7807ms
>
> With parallel, with dp-group:
> ovn-northd completion: 41267ms
>
> Without parallel, without dp-group:
> ovn-northd completion: 27996ms
>
> With parallel, without dp-group:
> ovn-northd completion: 26584ms
>
> Now the interesting part:
> I implemented a POC of a hash-based mutex array that replaces the rwlock
> in the function do_ovn_lflow_add_pd(), and the performance is greatly
> improved for the dp-group test:
>
> With parallel, with dp-group (hash-based mutex):
> ovn-northd completion: 5081ms
>
> This is 8x faster than the current parallel one and 30% faster than
> without parallel. This result looks much more reasonable to me. My theory
> is that when using parallel with dp-group, the rwlock contention is
> causing the low CPU utilization of the threads and the overall slowness
> on my machine. I will refine the POC into a formal patch and send it for
> review, hopefully by tomorrow.
>
> Cool. The older implementation, prior to going to the rwlock, was based
> on that.
>
> I found a couple of issues with it, which is why I switched to the
> rwlock.
>
> Namely - the access to the lflow hash size is not controlled, and the
> hash size ends up corrupt because different threads modify it without a
> lock. In a worst-case scenario you end up with a dog's breakfast in this
> entire cache line.
>
> So you need a couple of extra macros to insert fast without touching the
> hash size.
>
> This in turn leaves you with a hash you cannot resize to the correct
> size for searching, post-processing lflows, and reconciliation.
You will probably
> need the post-processing optimization patch which I submitted a couple of
> weeks back. Instead of using an HMAPX to hold the single flows and
> modifying the lflow hash in place, it rebuilds it completely and replaces
> the original one. At that point you have the right size and the hash is
> resized to the optimum size.
>
> By the way, there is one more option - you may want to try switching to a
> fat rwlock - that is supposed to decrease contention and make things
> faster. Though that probably will not be enough.
>
> I still do not get why your results are so different from the ovn-heater
> tests, but that is something I will look into separately.
>
> Brgds,

Hi Anton, thanks for bringing up the tricky points. I ended up with a
relatively simple solution:
https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/

The considerations are described in the comments. Please take a look and
see if it solves the concerns.

Thanks,
Han

> > Thanks,
> > Han
>
> --
> Anton R. Ivanov
> Cambridgegreys Limited. Registered in England. Company Number 10273661
> https://www.cambridgegreys.com/

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
