On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov <
[email protected]> wrote:
> Summary of findings.
>
> 1. The numbers from the perf test do not align with ovn-heater, which is
> much closer to a realistic load. On some tests where heater gives a 5-10%
> end-to-end improvement with parallelization, we get worse results with the
> perf-test. You spotted this one correctly.
>
> Example of the northd averages pulled out of the test report via grep and
> sed:
>
> 127.489353
> 131.509458
> 116.088205
> 94.721911
> 119.629756
> 114.896258
> 124.811069
> 129.679160
> 106.699905
> 134.490338
> 112.106713
> 135.957658
> 132.471111
> 94.106849
> 117.431450
> 115.861592
> 106.830657
> 132.396905
> 107.092542
> 128.945760
> 94.298464
> 120.455510
> 136.910426
> 134.311765
> 115.881292
> 116.918458
>
> These values are all over the place - this is not a reproducible test.
>
> 2. In its present state you need to re-run it 30+ times and take an
> average. The standard deviation of the northd loop values is > 10% (see
> the awk summary below). Compared to that, the reproducibility of
> ovn-heater is significantly better: I usually get less than 0.5%
> difference between runs if there were no iteration failures. I would
> suggest using that instead for performance comparisons until we have
> figured out what affects the perf-test.
>
> 3. It uses the short-term running average in reports, which is probably
> wrong because that value is very significantly skewed by the last several
> samples.
>
> I will look into all of these.
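>
> For reference, a quick way to get the mean and standard deviation out of
> that list (a sketch: it assumes the grepped values sit in a file, one per
> line; the file name is just an example):
>
>   awk '{ s += $1; ss += $1 * $1; n++ }
>        END { m = s / n; sd = sqrt(ss / n - m * m);
>              printf "mean %.2f stddev %.2f (%.1f%%)\n",
>                     m, sd, 100 * sd / m }' northd-averages.txt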
>
Thanks for the summary! However, I think there is a bigger problem
(probably related to my environment) than the stability of the test (make
check-perf TESTSUITEFLAGS="--rebuild") itself. As I mentioned in an
earlier email, I observed even worse results with a large-scale topology
closer to a real-world deployment of ovn-k8s, just testing with the
command:
ovn-nbctl --print-wait-time --wait=sb sync
This command simply triggers a change in the NB_Global table and waits for
northd to complete the recompute and update the SB. It doesn't have to be
the "sync" command; any change to the NB DB produces a similar result
(e.g.: ovn-nbctl --print-wait-time --wait=sb ls-add ls1)
Without parallel:
ovn-northd completion: 7807ms
With parallel:
ovn-northd completion: 41267ms
This result is stable and consistent when I repeat the command on my
machine. Would you try it on your machine as well? I understand that only
the lflow generation part is parallelized and that it doesn't remove all
the bottlenecks, but I did expect it to be faster rather than slower. If
your result always shows that parallel is better, then I will have to dig
into it myself on my test machine.
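
In case it helps reproduce, this is essentially all I run (a sketch; the
use_parallel_build knob is the same one my perf-northd.at change quoted
further down enables, and --print-wait-time makes ovn-nbctl print the
completion time itself):

  # baseline
  ovn-nbctl set nb_global . options:use_parallel_build=false
  ovn-nbctl --print-wait-time --wait=sb sync

  # with parallelization
  ovn-nbctl set nb_global . options:use_parallel_build=true
  ovn-nbctl --print-wait-time --wait=sb sync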
Thanks,
Han
> Brgds,
> On 30/09/2021 08:26, Han Zhou wrote:
>
>
>
> On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov <
> [email protected]> wrote:
>
>> After quickly adding some more prints into the testsuite.
>>
>> Test 1:
>>
>> Without
>>
>> 1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>> ---
>> Maximum (NB in msec): 1130
>> Average (NB in msec): 620.375000
>> Maximum (SB in msec): 23
>> Average (SB in msec): 21.468759
>> Maximum (northd-loop in msec): 6002
>> Minimum (northd-loop in msec): 0
>> Average (northd-loop in msec): 914.760417
>> Long term average (northd-loop in msec): 104.799340
>>
>> With
>>
>> 1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>> ---
>> Maximum (NB in msec): 1148
>> Average (NB in msec): 630.250000
>> Maximum (SB in msec): 24
>> Average (SB in msec): 21.468744
>> Maximum (northd-loop in msec): 6090
>> Minimum (northd-loop in msec): 0
>> Average (northd-loop in msec): 762.101565
>> Long term average (northd-loop in msec): 80.735192
>>
>> The metric which actually matters and which SHOULD be measured - the
>> long term average - is better by 20%. Using the short term average
>> instead of the long term one in the test suite is actually a BUG.
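>>
>> A standalone illustration of the skew (made-up numbers, not the actual
>> northd stats code; assume the short-term figure averages only the last
>> 8 samples):
>>
>>   #include <stdio.h>
>>
>>   #define WINDOW 8   /* short-term window, for illustration only */
>>
>>   int main(void)
>>   {
>>       /* Made-up loop times: one recompute spike, then settling. */
>>       double ms[] = { 100, 105, 98, 102, 6000, 250, 240, 230, 220, 210 };
>>       int n = sizeof ms / sizeof *ms;
>>       int start = n > WINDOW ? n - WINDOW : 0;
>>       double total = 0, recent = 0;
>>
>>       for (int i = 0; i < n; i++) {
>>           total += ms[i];
>>       }
>>       for (int i = start; i < n; i++) {
>>           recent += ms[i];
>>       }
>>       /* The long-term mean weights every iteration equally; the
>>        * short-term one is dominated by whatever happened last. */
>>       printf("long term:  %.1f msec\n", total / n);
>>       printf("short term: %.1f msec\n", recent / (n - start));
>>       return 0;
>>   }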
>>
> Good catch!
>
>
>> Are you running yours under some sort of virtualization?
>>
>
> No, I am testing on a bare-metal.
>
>
>> A.
>> On 30/09/2021 07:52, Han Zhou wrote:
>>
>> Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @
>> 2.90GHz, 24 cores.
>> It is weird that my result is so different. I also verified with a scale
>> test script that creates a large-scale NB/SB with 800 nodes of a
>> simulated k8s setup, and then just ran:
>> ovn-nbctl --print-wait-time --wait=sb sync
>>
>> Without parallel:
>> ovn-northd completion: 7807ms
>>
>> With parallel:
>> ovn-northd completion: 41267ms
>>
>> I suspected the hmap size problem, but I tried changing the initial size
>> to 64k buckets and it didn't help. I will find some time to check the
>> "perf" reports.
>>
>> Thanks,
>> Han
>>
>> On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <
>> [email protected]> wrote:
>>
>>> On 30/09/2021 07:16, Anton Ivanov wrote:
>>>
>>> Results on a Ryzen 5 3600 - 6 cores, 12 threads.
>>>
>>> I will also have a look into the "maximum" measurement for the
>>> multi-threaded case.
>>>
>>> It does not tie up with the drop in averages across the board.
>>>
>>> A.
>>>
>>>
>>> Without
>>>
>>>
>>> 1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>> ---
>>> Maximum (NB in msec): 1256
>>> Average (NB in msec): 679.463785
>>> Maximum (SB in msec): 25
>>> Average (SB in msec): 22.489798
>>> Maximum (northd-loop in msec): 1347
>>> Average (northd-loop in msec): 799.944878
>>>
>>> 2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>> ---
>>> Maximum (NB in msec): 1956
>>> Average (NB in msec): 809.387285
>>> Maximum (SB in msec): 24
>>> Average (SB in msec): 21.649258
>>> Maximum (northd-loop in msec): 2011
>>> Average (northd-loop in msec): 961.718686
>>>
>>> 5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>> ---
>>> Maximum (NB in msec): 557
>>> Average (NB in msec): 474.010337
>>> Maximum (SB in msec): 15
>>> Average (SB in msec): 13.927192
>>> Maximum (northd-loop in msec): 1261
>>> Average (northd-loop in msec): 580.999122
>>>
>>> 6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>> ---
>>> Maximum (NB in msec): 756
>>> Average (NB in msec): 625.614724
>>> Maximum (SB in msec): 15
>>> Average (SB in msec): 14.181048
>>> Maximum (northd-loop in msec): 1649
>>> Average (northd-loop in msec): 746.208332
>>>
>>>
>>> With
>>>
>>> 1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>> ---
>>> Maximum (NB in msec): 1140
>>> Average (NB in msec): 631.125000
>>> Maximum (SB in msec): 24
>>> Average (SB in msec): 21.453609
>>> Maximum (northd-loop in msec): 6080
>>> Average (northd-loop in msec): 759.718815
>>>
>>> 2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>> ---
>>> Maximum (NB in msec): 1210
>>> Average (NB in msec): 673.000000
>>> Maximum (SB in msec): 27
>>> Average (SB in msec): 22.453125
>>> Maximum (northd-loop in msec): 6514
>>> Average (northd-loop in msec): 808.596842
>>>
>>> 5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>> ---
>>> Maximum (NB in msec): 798
>>> Average (NB in msec): 429.750000
>>> Maximum (SB in msec): 15
>>> Average (SB in msec): 12.998533
>>> Maximum (northd-loop in msec): 3835
>>> Average (northd-loop in msec): 564.875986
>>>
>>> 6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>> ---
>>> Maximum (NB in msec): 1074
>>> Average (NB in msec): 593.875000
>>> Maximum (SB in msec): 14
>>> Average (SB in msec): 13.655273
>>> Maximum (northd-loop in msec): 4973
>>> Average (northd-loop in msec): 771.102605
>>>
>>> The only one slower is test 6, which I will look into.
>>>
>>> The rest are > 5% faster.
>>>
>>> A.
>>>
>>> On 30/09/2021 00:56, Han Zhou wrote:
>>>
>>>
>>>
>>> On Wed, Sep 15, 2021 at 5:45 AM <[email protected]> wrote:
>>> >
>>> > From: Anton Ivanov <[email protected]>
>>> >
>>> > Restore parallel build with dp groups using rwlock instead
>>> > of per row locking as an underlying mechanism.
>>> >
>>> > This provides improvement ~ 10% end-to-end on ovn-heater
>>> > under virtualization despite awakening some qemu gremlin
>>> > which makes qemu climb to silly CPU usage. The gain on
>>> > bare metal is likely to be higher.
>>> >
>>> Hi Anton,
>>>
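>>> For my own understanding: the primitive here is the classic
>>> readers-writer lock - concurrent readers, exclusive writer - which the
>>> patch applies to the shared lflow/dp-group state in place of a lock per
>>> hash row. A minimal plain-pthreads sketch of the pattern (illustration
>>> only, not the patch code; build with -pthread):
>>>
>>>   #include <pthread.h>
>>>   #include <stdio.h>
>>>
>>>   static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
>>>   static int table[4];               /* stand-in for shared state */
>>>
>>>   static void *worker(void *arg)
>>>   {
>>>       int id = *(int *) arg;
>>>
>>>       /* Readers may proceed in parallel... */
>>>       pthread_rwlock_rdlock(&lock);
>>>       int seen = table[id];
>>>       pthread_rwlock_unlock(&lock);
>>>
>>>       /* ...while mutation takes the lock exclusively. */
>>>       pthread_rwlock_wrlock(&lock);
>>>       table[id] = seen + id + 1;
>>>       pthread_rwlock_unlock(&lock);
>>>       return NULL;
>>>   }
>>>
>>>   int main(void)
>>>   {
>>>       pthread_t t[4];
>>>       int ids[] = { 0, 1, 2, 3 };
>>>       for (int i = 0; i < 4; i++) {
>>>           pthread_create(&t[i], NULL, worker, &ids[i]);
>>>       }
>>>       for (int i = 0; i < 4; i++) {
>>>           pthread_join(t[i], NULL);
>>>       }
>>>       printf("table[3] = %d\n", table[3]);   /* prints 4 */
>>>       return 0;
>>>   }
>>>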
>>> I am trying to see the benefit of parallel_build, but I encountered
>>> unexpected performance results when running the perf tests with the
>>> command:
>>> make check-perf TESTSUITEFLAGS="--rebuild"
>>>
>>> It shows significantly worse performance than without parallel_build.
>>> In the dp_group=no cases the regression is smaller, but parallel is
>>> still ~30% slower than without parallel_build. I have 24 cores, but no
>>> thread is consuming much CPU except the main thread. I also tried
>>> hardcoding the number of pool threads to just 4, which ended up with
>>> slightly better results, but still far behind "without parallel_build".
>>>
>>> Columns: no parallel | parallel (24 pool threads) | parallel (4 pool threads)
>>>
>>> 1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>> ---
>>> Maximum (NB in msec):                 1058 |        4269 |        4097
>>> Average (NB in msec):           836.941167 | 3697.253931 | 3498.311525
>>> Maximum (SB in msec):                   30 |          30 |          28
>>> Average (SB in msec):            25.934011 |   26.001840 |   25.685091
>>> Maximum (northd-loop in msec):        1204 |        4379 |        4251
>>> Average (northd-loop in msec): 1005.330078 | 4233.871504 | 4022.774208
>>>
>>> 2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>> ---
>>> Maximum (NB in msec):                 1124 |        1480 |        1331
>>> Average (NB in msec):           892.403405 | 1206.189287 | 1089.378455
>>> Maximum (SB in msec):                   29 |          31 |          30
>>> Average (SB in msec):            26.922632 |   26.636706 |   25.657484
>>> Maximum (northd-loop in msec):        1275 |        1639 |        1495
>>> Average (northd-loop in msec): 1074.917873 | 1458.152327 | 1301.057201
>>>
>>> 5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>> ---
>>> Maximum (NB in msec):                  768 |        3086 |        2876
>>> Average (NB in msec):           614.491938 | 2681.688365 | 2531.255444
>>> Maximum (SB in msec):                   18 |          17 |          18
>>> Average (SB in msec):            16.347526 |   15.955263 |   16.278075
>>> Maximum (northd-loop in msec):         889 |        3247 |        3031
>>> Average (northd-loop in msec):  772.083572 | 3117.504297 | 2833.182361
>>>
>>> 6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>> ---
>>> Maximum (NB in msec):                 1046 |        1371 |        1262
>>> Average (NB in msec):           827.735852 | 1135.514228 |  970.544792
>>> Maximum (SB in msec):                   19 |          18 |          19
>>> Average (SB in msec):            16.828127 |   16.083914 |   15.602525
>>> Maximum (northd-loop in msec):        1163 |        1545 |        1411
>>> Average (northd-loop in msec):  972.567407 | 1328.617583 | 1207.667100
>>>
>>> I haven't debugged yet, but do you have any clue what could be the
>>> reason? I am using the upstream commit 9242f27f63, which already
>>> includes this patch. Below is my change to the perf-northd.at file,
>>> just to enable parallel_build:
>>>
>>> diff --git a/tests/perf-northd.at b/tests/perf-northd.at
>>> index 74b69e9d4..9328c2e21 100644
>>> --- a/tests/perf-northd.at
>>> +++ b/tests/perf-northd.at
>>> @@ -191,6 +191,7 @@ AT_SETUP([ovn-northd basic scale test -- 200
>>> Hypervisors, 200 Logical Ports/Hype
>>> PERF_RECORD_START()
>>>
>>> ovn_start
>>> +ovn-nbctl set nb_global . options:use_parallel_build=true
>>>
>>> BUILD_NBDB(OVN_BASIC_SCALE_CONFIG(200, 200))
>>>
>>> @@ -203,9 +204,10 @@ AT_SETUP([ovn-northd basic scale test -- 500
>>> Hypervisors, 50 Logical Ports/Hyper
>>> PERF_RECORD_START()
>>>
>>> ovn_start
>>> +ovn-nbctl set nb_global . options:use_parallel_build=true
>>>
>>> BUILD_NBDB(OVN_BASIC_SCALE_CONFIG(500, 50))
>>>
>>> Thanks,
>>> Han
>>>
>>>
>>> --
>>> Anton R. Ivanov
>>> Cambridgegreys Limited. Registered in England. Company Number 10273661
>>> https://www.cambridgegreys.com/
>>>
>> --
>> Anton R. Ivanov
>> Cambridgegreys Limited. Registered in England. Company Number 10273661
>> https://www.cambridgegreys.com/
>>
> --
> Anton R. Ivanov
> Cambridgegreys Limited. Registered in England. Company Number 10273661
> https://www.cambridgegreys.com/
>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev